
Sorting a CSV and deleting rows in Python

mKalita • 4 years ago • 809 views

192.168.136.192,2848,100.100.100.212,6667,"other"
100.100.100.212,6667,192.168.136.192,2848,"other"
100.100.100.212,6667,192.168.136.192,2848,"CHAT IRC message"
192.168.61.74,4662,69.192.30.179,80,"other"
192.168.107.87,4662,69.192.30.179,80,"other"
192.168.107.87,4662,69.192.30.179,80,"infection"
192.168.177.85,4662,69.192.30.179,80,"infection"
192.168.177.85,4662,69.192.30.179,80,"other"
192.168.118.168,4662,69.192.30.179,80,"infection"
192.168.118.168,4662,69.192.30.179,80,"other"
192.168.110.111,4662,69.192.30.179,80,"infection"

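So far I have been able to remove exact duplicates. Now I also need to remove rows that match either as src=src and dest=dest, or as src=dest and dest=src, and to delete the rows marked "other" when the matching row is marked "infected". This is what I have so far for removing the duplicates: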

with open('alerts.csv', 'r') as in_file, open('alertsfix.csv', 'w') as out_file:
    seen = set()  # set for fast O(1) amortized lookup
    for line in in_file:
        if line in seen:
            continue  # skip exact duplicate

        seen.add(line)
        out_file.write(line)

Field layout of the two rows being compared (src/prt/dest/prt/msg):
1. a/a1/b/b1/c
2. 2a/2a1/2b/2b1/2c

Conditions:

if a==2b && a1==2b1 && b==2a && b1==2a1 && c==2c
    delete one of them, since they are equal

if a==2b && a1==2b1 && b==2a && b1==2a1 && c=="other" && (2c=="infected" || 2c=="CNC")
    delete the one that has the message "other"

Reply 1 • Ralf • 5 years ago

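First you have to define the conditions for equality. For example, the following code treats two rows as equal only if both of these conditions hold:

  • both participating addresses (IP and port together) are the same, regardless of which one is the source and which the destination;
  • the message is the same.

I use frozenset (a built-in immutable set) to build a key for each row, which is then looked up in seen: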

with open('alerts.csv', 'r') as in_file, open('alertsfix.csv', 'w') as out_file:
    seen = set()
    for line in in_file:
        line = line.strip()
        if len(line) > 0:
            # assumes exactly five fields and no commas inside the message
            src_ip, src_port, dst_ip, dst_port, msg = line.split(',')
            src = '{}:{}'.format(src_ip, src_port)
            dst = '{}:{}'.format(dst_ip, dst_port)
            # the inner frozenset ignores direction: {src, dst} == {dst, src}
            key = frozenset([
                frozenset([src, dst]),
                msg,
            ])

            if key not in seen:
                seen.add(key)                # we add 'key' to the set
                out_file.write(line + '\n')  # strip() removed the newline, so restore it
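
The code above only covers the first condition from the question. For the second one (dropping an "other" row when the same pair of endpoints also has a more specific alert), a minimal sketch could look like the following. The names PRIORITY_MESSAGES, filter_alerts and rewrite_file are made up for illustration, and the label set is an assumption: the sample data uses "infection" while the question text says "infected", so adjust it to whatever your file really contains.

```python
import csv

# Assumption: these labels outrank "other"; change them to match your data.
PRIORITY_MESSAGES = {"infection", "CNC"}


def filter_alerts(lines):
    """Return the rows to keep (as lists of fields), in first-seen order."""
    seen = {}              # (endpoint pair, message) -> first row with that key
    messages_by_pair = {}  # endpoint pair -> set of messages seen for that pair
    for row in csv.reader(lines):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        src_ip, src_port, dst_ip, dst_port, msg = row
        # Direction-independent key for the two endpoints.
        pair = frozenset([(src_ip, src_port), (dst_ip, dst_port)])
        if (pair, msg) not in seen:
            seen[(pair, msg)] = row
            messages_by_pair.setdefault(pair, set()).add(msg)

    kept = []
    for (pair, msg), row in seen.items():
        # Drop "other" if this pair also carries a higher-priority message.
        if msg == "other" and messages_by_pair[pair] & PRIORITY_MESSAGES:
            continue
        kept.append(row)
    return kept


def rewrite_file(in_path, out_path):
    """Read in_path, filter it, and write the surviving rows to out_path."""
    with open(in_path, newline='') as in_file, \
            open(out_path, 'w', newline='') as out_file:
        csv.writer(out_file).writerows(filter_alerts(in_file))
```

This needs two passes over the rows because an "other" line can appear before the "infection" line for the same pair, so the decision can only be made once every message for that pair is known. Note that csv.writer with its default settings writes the message without the surrounding quotes; pass quoting=csv.QUOTE_NONNUMERIC or similar if you need them back.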

Does this help with your task?