You are looking for set_index (see the pandas documentation):
df.set_index(['Gap','Date'])
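As a minimal sketch of what that call does, here is a small made-up frame. Only the column names Gap and Date come from the question; the tweet column and every value are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: 'Gap' and 'Date' follow the answer,
# the 'tweet' column and all values are invented for illustration.
df = pd.DataFrame({
    "Gap": ["a", "a", "b"],
    "Date": ["2020-01-01", "2020-01-02", "2020-01-01"],
    "tweet": ["first", "second", "third"],
})

# set_index returns a new DataFrame by default, so assign the result.
indexed = df.set_index(["Gap", "Date"])

print(indexed.index.names)  # names of the two index levels
print(indexed.loc["a"])     # every row stored under Gap == 'a'
```

Note that set_index does not modify df unless you pass inplace=True or assign the result back.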
I hadn't noticed the other part of the question.
Here is how to add the number-of-tweets column:
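# Collect the labels in the first index level ('Gap').
level_name = df.index.get_level_values(0).tolist()
# Keep only the part of each label before the first space (treats labels as strings).
level_name = [str(i).split(' ')[0] for i in level_name]
level_name = list(set(level_name))
# Count the rows stored under each label.
num_of_tweets = {}
for i in level_name:
    df1 = df.loc[i]
    num_of_tweets[i] = len(df1)
# Move 'Gap' and 'Date' back to columns so the counts can be assigned by 'Gap'.
df.reset_index(inplace=True)
df['num_of_tweets'] = 0
for key in num_of_tweets.keys():
    df.loc[df['Gap'] == key, 'num_of_tweets'] = num_of_tweets[key]
# Set the index again.
df.set_index(['Gap', 'Date'], inplace=True)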
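The logic is a bit roundabout and may not be the best solution.
However, the same logic can be used to get counts for any combination of columns.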
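For comparison, a shorter route is groupby with transform. This is a different technique from the loop above, not the same code condensed. The sketch assumes the frame is already indexed by Gap and Date; the tweet column and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, indexed the same way as in the answer.
df = pd.DataFrame({
    "Gap": ["a", "a", "b"],
    "Date": ["2020-01-01", "2020-01-02", "2020-01-01"],
    "tweet": ["first", "second", "third"],
}).set_index(["Gap", "Date"])

# Group by the 'Gap' index level, take each group's row count,
# and broadcast that count back onto every row of the group.
df["num_of_tweets"] = df.groupby(level="Gap")["tweet"].transform("size")

print(df)
```

Passing a list such as level=["Gap", "Date"] should give counts per combination of levels, without resetting and restoring the index.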