For a university assignment, we have the geographic coordinates of some locations around the world in a pandas DataFrame:
import pandas as pd
df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio', 'Airport GRU', 'ORLY'],
'GEO': ['POINT (2.31647 48.85)',
'POINT (-73.993457389558 40.731499671618)',
'POINT (-43.2 -22.9)',
'POINT (-46.47313507388693 -23.429382262746415)',
'GEOMETRYCOLLECTION EMPTY']})
print(df)
NAME GEO
Paris POINT (2.31647 48.85)
New York POINT (-73.993457389558 40.731499671618)
Rio POINT (-43.2 -22.9)
Airport GRU POINT (-46.47313507388693 -23.429382262746415)
ORLY GEOMETRYCOLLECTION EMPTY
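I want to edit the 'GEO' column. First, I want to drop the word 'POINT'; then I want to put the values in (latitude, longitude) format, because the column is currently in POINT (longitude latitude) format. The new format should also have a comma between the two values.
To solve this, I created two separate columns to store LAT and LONG (this part works):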
df2 = df.join(df['GEO'].str.extract(r'(?P<LONG>-?\d+\.\d+) (?P<LAT>-?\d+\.\d+)').astype(float))
print(df2)
NAME GEO LONG LAT
Paris POINT (2.31647 48.85) 2.316470 48.85000
New York POINT (-73.993457389558 40.731499671618) -73.993457 40.731500
Rio POINT (-43.2 -22.9) -43.200000 -22.900000
Airport GRU POINT (-46.47313507388693 -23.429382262746415) -46.473135 -23.429382
ORLY GEOMETRYCOLLECTION EMPTY NaN NaN
However, when I try to create a new column holding the (LAT, LONG) format, the code does not work:
df2['Result'] = "(" + df2['LAT'] + "," + df2['LONG'] + ")"
It raises this error: "UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U1'), dtype('float64')) -> None"
I would like the output to be:
NAME GEO
Paris (48.85, 2.31647)
New York (40.731499671618, -73.993457389558)
Rio (-22.9, -43.2)
Airport GRU (-23.429382262746415, -46.47313507388693)
ORLY GEOMETRYCOLLECTION EMPTY
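
For reference, here is a minimal sketch of one way the desired output might be produced; it is an assumption on my part, not a confirmed solution. The error above appears to come from adding a string to float64 columns, so this sketch skips the astype(float) step and keeps the extracted values as strings, which also preserves the original digits. The name parts is a hypothetical helper introduced only for illustration, and rows that do not match the pattern are assumed to keep their original GEO text.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio', 'Airport GRU', 'ORLY'],
                   'GEO': ['POINT (2.31647 48.85)',
                           'POINT (-73.993457389558 40.731499671618)',
                           'POINT (-43.2 -22.9)',
                           'POINT (-46.47313507388693 -23.429382262746415)',
                           'GEOMETRYCOLLECTION EMPTY']})

# Extract longitude and latitude as strings (no astype(float)),
# so they can be concatenated with other strings directly.
# "parts" is a hypothetical helper name used only in this sketch.
parts = df['GEO'].str.extract(r'(?P<LONG>-?\d+\.\d+) (?P<LAT>-?\d+\.\d+)')

# Build "(LAT, LONG)"; rows where the pattern did not match stay NaN.
formatted = "(" + parts['LAT'] + ", " + parts['LONG'] + ")"

# Assumption: unmatched rows (e.g. 'GEOMETRYCOLLECTION EMPTY') keep their original text.
df['GEO'] = formatted.fillna(df['GEO'])

print(df)
```

If the float columns LAT and LONG are needed as well, they could still be created separately from parts with astype(float); the string concatenation only needs the string versions.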