Python pandas DataFrame: how to group values from different columns

I need help cleaning a DataFrame. It looks like this:

         Gap      Date          Time      Full text   Retweets   Likes
0   3.160003  2018-05-21    03:30:56  @georgechang..  19         462
1   3.160003  2018-05-21    21:15:03  @reveal         141        1610
2   3.160003  2018-05-21    11:25:21  RT @nova_road:  2030       0
3   3.160003  2018-05-21    07:10:01  @MrsYomaddy     48         917
4   3.160003  2018-05-21    07:06:54  @Dani21 @dmatki 40         5367

As you can see, the Gap and Date values are the same for every row.

I would like to get the following DataFrame:

                         num    Time      Full text    Retweets   Likes
    Gap       Date         
0   3.160003  2018-05-21    1     03:30:56  .....        19      462
1                           2     21:15:03  .....        141     1610
2                           3     11:25:21  .....        2030    0 
3                           4     07:10:01  .....        48      917
4                           5     07:06:54  .....        40      5367

where num is an extra column with the number of tweets.

I have already asked a similar question, but the problem is now slightly different. Here is the link: How can I create a multiindex data frame with the following datasets?

This is the code I tried:

StockbyTweets.set_index(['Date','Gap','Time'],inplace=True)
StockbyTweets

But all I get is:

                           Time       Full text    Retweets   Likes
    Gap       Date         
0   3.160003  2018-05-21    03:30:56  .....        19        462
1                           21:15:03  .....        141       1610
2                           11:25:21  .....        2030      0 
3                           07:10:01  .....        48        917
4                           07:06:54  .....        40        5367

How can I get an extra column with the number of tweets?

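# Collect the distinct values of the first index level as strings.
level_name = df.index.get_level_values(0).tolist()
level_name = [str(i).split(' ')[0] for i in level_name]
level_name = list(set(level_name))

# Count the rows (tweets) stored under each of those values.
num_of_tweets = {}
for i in level_name:
    df1 = df.loc[i]
    num_of_tweets[i] = len(df1)

# Turn the index levels back into columns and write the counts on each row.
df.reset_index(inplace=True)
df['num_of_tweets'] = 0
for key in num_of_tweets.keys():
    # The keys come from index level 0, which is Date after the set_index
    # call in the question, so match on 'Date' rather than 'Gap'.
    df.loc[df['Date'] == key, 'num_of_tweets'] = num_of_tweets[key]
# set the index again.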
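
A shorter way to get the same result is probably a groupby on the two shared columns. The following is only a sketch: it rebuilds the sample rows from the question by hand, and it assumes Date is stored as a plain string. The desired output in the question shows num running 1 to 5 within the group, while the loop above stores the group total on every row, so the sketch computes both.

```python
import pandas as pd

# Sample rows copied from the question; Date is assumed to be a string here.
df = pd.DataFrame({
    "Gap": [3.160003] * 5,
    "Date": ["2018-05-21"] * 5,
    "Time": ["03:30:56", "21:15:03", "11:25:21", "07:10:01", "07:06:54"],
    "Full text": ["@georgechang..", "@reveal", "RT @nova_road:",
                  "@MrsYomaddy", "@Dani21 @dmatki"],
    "Retweets": [19, 141, 2030, 48, 40],
    "Likes": [462, 1610, 0, 917, 5367],
})

keys = ["Gap", "Date"]

# Running tweet number inside each (Gap, Date) group: 1, 2, 3, ...
df["num"] = df.groupby(keys).cumcount() + 1

# Total number of tweets in the group, repeated on every row of that group.
df["num_of_tweets"] = df.groupby(keys)["Time"].transform("size")

# Move the grouping columns and the running number into the index.
result = df.set_index(["Gap", "Date", "num"])
print(result)
```

cumcount numbers the rows in their existing order, so sort by Time first if the numbering should follow the time of day.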