In [71]: txt = '''16.37.235.153|119.222.242.130|38673|161|17|62|4646|
...: 16.37.235.153|119.222.242.112|56388|161|17|62|4646|
...: 16.37.235.200|16.37.235.153|59009|514|17|143|21271|
...: '''
The `encoding` warning is annoying, but not important.
With `dtype=None` you should get a structured array, with one `field` per column:
In [74]: data = np.genfromtxt(txt.splitlines(), encoding=None, dtype=None,delimiter='|')
In [75]: data
Out[75]:
array([('16.37.235.153', '119.222.242.130', 38673, 161, 17, 62, 4646, False),
('16.37.235.153', '119.222.242.112', 56388, 161, 17, 62, 4646, False),
('16.37.235.200', '16.37.235.153', 59009, 514, 17, 143, 21271, False)],
dtype=[('f0', '<U13'), ('f1', '<U15'), ('f2', '<i8'), ('f3', '<i8'), ('f4', '<i8'), ('f5', '<i8'), ('f6', '<i8'), ('f7', '?')])
This is a 1d array, with one record per input line.
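Because the result is 1d, a column is selected by field name rather than by a second index. A minimal sketch of that, reusing the same `txt` sample; the field names `f0`, `f2`, ... are the defaults shown in the output above, and the choice of which fields to look at is just for illustration:

```python
import numpy as np

txt = """16.37.235.153|119.222.242.130|38673|161|17|62|4646|
16.37.235.153|119.222.242.112|56388|161|17|62|4646|
16.37.235.200|16.37.235.153|59009|514|17|143|21271|
"""

data = np.genfromtxt(txt.splitlines(), encoding=None, dtype=None, delimiter="|")

print(data.shape)        # one entry per input line
print(data.dtype.names)  # default field names f0, f1, ...
print(data["f2"])        # one whole column, selected by field name
print(data[0]["f0"])     # one field of the first record
```

Indexing with an integer gives a record; indexing with a field name gives a regular 1d array for that column.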
As a list of lists (or tuples):
In [76]: data.tolist()
Out[76]:
[('16.37.235.153', '119.222.242.130', 38673, 161, 17, 62, 4646, False),
('16.37.235.153', '119.222.242.112', 56388, 161, 17, 62, 4646, False),
('16.37.235.200', '16.37.235.153', 59009, 514, 17, 143, 21271, False)]
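It looks like the trailing `|` creates an empty last field, and `genfromtxt` fills it with a boolean `False`. Maybe that could be worked around with one of the `filling` parameters.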
Or restrict `usecols` to omit it:
In [77]: data = np.genfromtxt(txt.splitlines(), encoding=None, dtype=None,delimiter='|',
...: usecols=range(7))
In [78]: data
Out[78]:
array([('16.37.235.153', '119.222.242.130', 38673, 161, 17, 62, 4646),
('16.37.235.153', '119.222.242.112', 56388, 161, 17, 62, 4646),
('16.37.235.200', '16.37.235.153', 59009, 514, 17, 143, 21271)],
dtype=[('f0', '<U13'), ('f1', '<U15'), ('f2', '<i8'), ('f3', '<i8'), ('f4', '<i8'), ('f5', '<i8'), ('f6', '<i8')])
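Note that `tolist()` on a structured array gives a list of tuples. If actual lists of lists are needed, each record can be converted. A small sketch, assuming the same `txt` sample and the `usecols` restriction; the name `rows` is only illustrative:

```python
import numpy as np

txt = """16.37.235.153|119.222.242.130|38673|161|17|62|4646|
16.37.235.153|119.222.242.112|56388|161|17|62|4646|
16.37.235.200|16.37.235.153|59009|514|17|143|21271|
"""

# Read only the first 7 columns, skipping the empty one after the trailing '|'.
data = np.genfromtxt(txt.splitlines(), encoding=None, dtype=None,
                     delimiter="|", usecols=range(7))

# tolist() yields tuples of plain Python values; turn each one into a list.
rows = [list(record) for record in data.tolist()]
print(rows)
```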