I have two dataframes, df and events, as shown below:
import pandas as pd
df = pd.DataFrame({'Place':['university','residential','hospital','university','residential','hospital'],
'Date':['2017-01-01','2017-01-01','2017-01-01','2017-01-02','2017-01-02','2017-01-02'],
'Event':['None','None','None','None','None','None']
})
events = pd.DataFrame({'Place':['university','residential','hospital'], 'Start_Date':['2017-01-01','2017-01-01','2017-01-01'],
'End_Date':['2017-02-26','2017-01-02','2017-01-02'],
'Event':['UniHoliday','PublicHoliday','PublicHoliday']})
#Convert to datetime
events.Start_Date = pd.to_datetime(events.Start_Date.astype(str), format='%Y-%m-%d')
events.End_Date = pd.to_datetime(events.End_Date.astype(str), format='%Y-%m-%d')
df.Date = pd.to_datetime(df.Date.astype(str), format='%Y-%m-%d')
df has one record per date in 2017 for each place.
df:
Date Place Event
2017-01-01 university None
2017-01-01 residential None
2017-01-01 hospital None
2017-01-02 university None
2017-01-02 residential None
2017-01-02 hospital None
The second dataframe contains events for these places, but with date ranges:
events:
Place Start_Date End_Date Event
university 2017-01-01 2017-02-26 UniHoliday
residential 2017-01-01 2017-01-02 PublicHoliday
hospital 2017-01-01 2017-01-02 PublicHoliday
The task is to update df using events such that, if df.Place = events.Place and df.Date falls within the range (events.Start_Date, events.End_Date), df.Event is set to the corresponding events.Event.
The expected output is:
Date Place Event
2017-01-01 university UniHoliday
2017-01-01 residential PublicHoliday
2017-01-01 hospital PublicHoliday
2017-01-02 university UniHoliday
2017-01-02 residential PublicHoliday
2017-01-02 hospital PublicHoliday
There are no overlapping events, and each place has a unique event record.
So far I have been looking at:
Populate column in data frame based on a range found in another dataframe
but I can't get my head around it. Any help is appreciated. Thank you!
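
One possible way to express the rule described above is to left-merge df with events on Place and then keep the event name only where Date lies between Start_Date and End_Date. This is a minimal sketch, not a confirmed solution. It assumes each place has exactly one row in events (as stated in the question), that both ends of the date range are inclusive (as the expected output suggests), and that the event is spelled 'UniHoliday' as in the expected output. The names merged, in_range, and the '_new' suffix are only illustrative.

```python
import pandas as pd

# Sample data mirroring the question (dates parsed up front)
df = pd.DataFrame({
    'Place': ['university', 'residential', 'hospital'] * 2,
    'Date': pd.to_datetime(['2017-01-01'] * 3 + ['2017-01-02'] * 3),
    'Event': ['None'] * 6,
})
events = pd.DataFrame({
    'Place': ['university', 'residential', 'hospital'],
    'Start_Date': pd.to_datetime(['2017-01-01'] * 3),
    'End_Date': pd.to_datetime(['2017-02-26', '2017-01-02', '2017-01-02']),
    'Event': ['UniHoliday', 'PublicHoliday', 'PublicHoliday'],
})

# Left merge keeps every row of df in its original order.
# Assumption: one events row per Place, so no rows are duplicated.
merged = df.merge(events, on='Place', how='left', suffixes=('', '_new'))

# True where Date falls inside [Start_Date, End_Date] (inclusive by default)
in_range = merged['Date'].between(merged['Start_Date'], merged['End_Date'])

# Take the event name where the date is in range, else keep the old value
df['Event'] = merged['Event_new'].where(in_range, merged['Event']).to_numpy()

print(df)
```

If a place could have several events, the merge would produce more than one row per df record, so the merged frame would need to be filtered with in_range and reduced back to one row per (Place, Date) before assigning.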