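anky_91's community profile

Recent replies by anky_91

6 years ago

Replied to a topic created by anky_91 » Return the column names from a table where a specific value is found in any row, using Python and pandas

IIUC, use: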

df.columns[df.eq(100).any()]

#Index(['A', 'B'], dtype='object')

To get Series output, wrap the result in pd.Series():

pd.Series(df.columns[df.eq(100).any()])
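
For anyone who wants to run this end to end, here is a minimal sketch. The sample frame and its values are made up for illustration (the original question's data is not shown here); only eq, any and the boolean indexing on df.columns come from the reply.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question
df = pd.DataFrame({"A": [100, 2, 3], "B": [4, 100, 6], "C": [7, 8, 9]})

# df.eq(100) is a boolean frame; .any() reduces it to one flag per column
has_value = df.eq(100).any()

cols = df.columns[has_value]     # Index of the matching column names
as_series = pd.Series(cols)      # the same names as a Series

print(list(cols))
```

Pass axis=1 to any() instead if you want the matching rows rather than the matching columns.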

5 years ago

Replied to a topic created by anky_91 » Replace the values in all duplicated rows in Python Pandas

You can use df.mask here together with df.duplicated:

df.mask(df.duplicated(['B','C'],keep=False),'ERR')

     A     B       C    D
0    1  Blue   Green    4
1  ERR   ERR     ERR  ERR
2  ERR   ERR     ERR  ERR
3    4  Blue    Pink    6
4  ERR   ERR     ERR  ERR
5  ERR   ERR     ERR  ERR
6    7  Blue     Red    8
7    8   Red  Orange    9
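
A small runnable sketch of the same idea, on a made-up frame (the column names mirror the output above, the values are assumptions). keep=False flags every member of a duplicate group, not only the later repeats, and mask then overwrites the flagged rows.

```python
import pandas as pd

# Hypothetical data: rows 1 and 2 share the same (B, C) pair
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": ["Blue", "Red", "Red", "Blue"],
    "C": ["Green", "Green", "Green", "Pink"],
})

# True for every row whose (B, C) combination occurs more than once
dup = df.duplicated(["B", "C"], keep=False)

# Replace all cells of the flagged rows with the marker string
out = df.mask(dup, "ERR")
print(out)
```

Note that writing a string into a numeric column turns that column into object dtype, so this is best kept for reporting rather than further arithmetic.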

6 years ago

Replied to a topic created by anky_91 » Pandas - How to extract HH:MM from a datetime column in Python?

Assuming df looks like:

print(df)

             date_col
0 2018-07-25 11:14:00
1 2018-08-26 11:15:00
2 2018-07-29 11:17:00

#convert from string to datetime
df['date_col'] = pd.to_datetime(df['date_col'])

#to get date only
print(df['date_col'].dt.date)

0    2018-07-25
1    2018-08-26
2    2018-07-29

#to get time:
print(df['date_col'].dt.time)

0    11:14:00
1    11:15:00
2    11:17:00

#to get hour and minute
print(df['date_col'].dt.strftime('%H:%M'))

0    11:14
1    11:15
2    11:17
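
One thing that may be worth checking before you pick an accessor: they do not return the same type. The sketch below rebuilds the three sample rows shown above and compares .dt.time with .dt.strftime; the variable names as_time and as_text are just illustrative.

```python
import datetime
import pandas as pd

# Same three timestamps as in the printed df above
df = pd.DataFrame({"date_col": ["2018-07-25 11:14:00",
                                "2018-08-26 11:15:00",
                                "2018-07-29 11:17:00"]})
df["date_col"] = pd.to_datetime(df["date_col"])

as_time = df["date_col"].dt.time               # datetime.time objects
as_text = df["date_col"].dt.strftime("%H:%M")  # plain strings

print(type(as_time.iloc[0]))
print(as_text.tolist())
```

So strftime is the one to use for display, while .dt.time keeps something you can still compare as a time of day.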

6 years ago

Replied to a topic created by anky_91 » python pandas: create a cumulative mean when grouping by another column

You need expanding().mean() with groupby:

df.groupby('name')['value'].expanding().mean().reset_index(0)

For an unsorted df, the following will work:

df.groupby('name')['value'].expanding().mean().reset_index(0).sort_index()

   name  value
0  Jack  0.000
1  Jack  0.500
2  Jack  0.500
3  Jack  0.625
4  Jill  0.000
5  Jill  1.000
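
Here is a self-contained sketch with interleaved rows, to show why the sort_index() variant matters. The input values are an assumption, chosen so that the per-name running means match the numbers in the output above.

```python
import pandas as pd

# Hypothetical input with Jack and Jill rows interleaved
df = pd.DataFrame({
    "name": ["Jack", "Jill", "Jack", "Jack", "Jill", "Jack"],
    "value": [0.0, 0.0, 1.0, 0.5, 2.0, 1.0],
})

result = (
    df.groupby("name")["value"]
      .expanding()        # growing window within each group
      .mean()             # running mean, indexed by (name, original index)
      .reset_index(0)     # move the group key back into a column
      .sort_index()       # restore the original row order
)
print(result)
```

Without sort_index() the rows come back grouped by name, which is fine for a sorted frame but reorders an unsorted one.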

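6 years ago

Replied to a topic created by anky_91 » python: remove special characters from a string

You can use:

import re
a = "Peter North  /  John West"
a = re.sub(r' +/ +', '_', a)

This pattern replaces a slash, together with one or more spaces on each side of it, with a single underscore.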
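
A quick check of what the pattern does and does not match. The second pattern, using \s*, is an alternative I am suggesting for inputs where the spaces may be missing; it is not part of the original answer.

```python
import re

text = "Peter North  /  John West"

# Pattern from the reply: needs at least one space on each side of the slash
strict = re.sub(r" +/ +", "_", text)

# Hypothetical variant: zero or more whitespace characters around the slash
loose = re.sub(r"\s*/\s*", "_", "Peter North/John West")

# The strict pattern leaves a slash without spaces untouched
untouched = re.sub(r" +/ +", "_", "Peter North/John West")

print(strict)
print(loose)
print(untouched)
```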

6 years ago

Replied to a topic created by anky_91 » python groupby, cumsum and max for calculating month-end balances

First convert the Date column to datetime:

df.Date=pd.to_datetime(df.Date,format='%m/%d/%Y')

Then:

m=(df.assign(cum_Amount=df.Amount.cumsum()).
  groupby(df.Date.dt.month)['cum_Amount'].max().reset_index())
print(m)

   Date  cum_Amount
0     9    32029.59
1    10    31063.64
2    11    32596.70
3    12    30630.96
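
A runnable sketch of the cumsum-then-groupby step on made-up amounts (the dates and values below are assumptions, not the question's data). It also computes last() next to max(), because the two differ whenever the running total drops within a month.

```python
import pandas as pd

# Hypothetical transactions
df = pd.DataFrame({
    "Date": ["09/15/2018", "09/30/2018", "10/10/2018", "10/31/2018"],
    "Amount": [100.0, 50.0, 20.0, -40.0],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Running total over the whole frame, then grouped by calendar month
running = df.assign(cum_Amount=df["Amount"].cumsum())
by_month = running.groupby(running["Date"].dt.month)["cum_Amount"]

peak = by_month.max()      # highest running total seen in each month
closing = by_month.last()  # running total on the last row of each month

print(peak)
print(closing)
```

In this sample October peaks at 170 but closes at 130, so last() is the safer choice if what you want is the closing balance. Grouping on dt.month alone also merges the same month from different years, so add the year to the key if the data spans more than one.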

EDIT: It seems you already have the balance and only want to filter the month-end dates. In that case, use:

from pandas.tseries.offsets import MonthEnd
df[df.Date.eq(df.Date+MonthEnd(0))]

        Date   Amount  Month   Balance
1 2018-09-30    29.59      9  32029.59
3 2018-10-31 -1000.00     10  31063.64
5 2018-11-30    33.06     11  32596.70
7 2018-12-31    34.26     12  30630.96
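
To see why the comparison works: adding MonthEnd(0) rolls a date forward to the end of its month and leaves it unchanged if it is already there. A small sketch on assumed dates, with the built-in dt.is_month_end shown as an equivalent check:

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Hypothetical dates: two month ends, two ordinary days
df = pd.DataFrame({"Date": pd.to_datetime(
    ["2018-09-15", "2018-09-30", "2018-10-31", "2018-11-01"])})

rolled = df["Date"] + MonthEnd(0)       # each date moved to its month end
is_month_end = df["Date"].eq(rolled)    # True only where nothing moved

filtered = df[is_month_end]
print(filtered)

# Equivalent flag via the datetime accessor
same_flag = df["Date"].dt.is_month_end
```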

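6 years ago

Replied to a topic created by anky_91 » How to split and duplicate rows based on a string in a column using python/pandas?

Using @Wen Ben's solution here:

s=pd.DataFrame([[x] + [z] for x, y in zip(df.index,df.fruit.str.split(',')) for z in y],
               columns=[0,'Fruit'])
# drop the helper key column 0 (axis must be given by keyword in recent pandas)
df_new=s.merge(df,left_on=0,right_index=True).drop(columns=0)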
print(df_new)

         Fruit                    fruit  colour      sport  wins
0        Apple  Apple, Kiwi, Clementine     NaN    Cycling     5
1         Kiwi  Apple, Kiwi, Clementine     NaN    Cycling     5
2   Clementine  Apple, Kiwi, Clementine     NaN    Cycling     5
3         Kiwi                     Kiwi    Blue        NaN    20
4       Banana       Banana, Clementine     NaN     Hockey    12
5   Clementine       Banana, Clementine     NaN     Hockey    12
6        Apple                    Apple  Purple  Triathlon    15
7         Kiwi                     Kiwi     NaN   Swimming     8

Note: you can optionally drop the fruit column if needed.
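
On newer pandas (0.25 and later) the same reshaping can probably be done with explode. This is a swapped-in alternative, not the approach from the linked solution, and the sample frame below is a cut-down assumption based on the output above.

```python
import pandas as pd

# Hypothetical subset of the data shown above
df = pd.DataFrame({
    "fruit": ["Apple, Kiwi, Clementine", "Kiwi", "Banana, Clementine"],
    "wins": [5, 20, 12],
})

df_new = (
    df.assign(Fruit=df["fruit"].str.split(", "))  # list of fruits per row
      .explode("Fruit")                           # one row per list element
      .reset_index(drop=True)
)
print(df_new)
```

Splitting on ", " rather than "," avoids leading spaces in the new Fruit values; chain .drop(columns="fruit") if the original column is no longer needed.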
