python - Doing calculations on Pandas DataFrame with groupby and then passing it back into a DataFrame?


I have a data frame that I want to group by 2 variables, and then perform a calculation within those groups. Is there an easy way to do this and put the information back into a DataFrame when I'm done, i.e. something like this:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 30, 12, 122, 345],
                       'b': [1, 1, 1, 2, 3, 3, 3, 2, 3, 4],
                       'c': [101, 230, 12, 122, 345, 23, 943, 83, 923, 10]})

    total = []
    avg = []
    aid = []
    bid = []
    # Loop over each (a, b) group and collect the results by hand
    for name, group in df.groupby(['a', 'b']):
        total.append(group.c.sum())
        avg.append(group.c.sum() / group.c.nunique())
        aid.append(name[0])
        bid.append(name[1])

    x = pd.DataFrame({'total': total, 'avg': avg, 'aid': aid, 'bid': bid})

But is there a more efficient way to do it?

You can use the pandas agg (aggregate) function after the groupby:

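    import pandas as pd

    # Named aggregation: each keyword becomes an output column.
    # (The dict-renaming form, agg({'total': np.sum, 'avg': np.mean}),
    # was removed for single columns in pandas 1.0.)
    df.groupby(['a', 'b'])['c'].agg(total='sum', avg='mean').reset_index()

    #      a  b  total         avg
    # 0    1  1    343  114.333333
    # 1    2  2    122  122.000000
    # 2    2  3    368  184.000000
    # 3   12  2     83   83.000000
    # 4   30  3    943  943.000000
    # 5  122  3    923  923.000000
    # 6  345  4     10   10.000000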
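One caveat: the loop in the question computes avg as the group sum divided by the number of distinct values (nunique), which equals the mean only when every value of c within a group is different, as happens to be the case in this sample data. If that exact calculation is what you need, a minimal sketch might look like the following. The intermediate column name n_unique is just an illustrative choice, and the rename to aid/bid simply mirrors the column names used in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 2, 30, 12, 122, 345],
    'b': [1, 1, 1, 2, 3, 3, 3, 2, 3, 4],
    'c': [101, 230, 12, 122, 345, 23, 943, 83, 923, 10],
})

# Aggregate 'c' per (a, b) group: the sum and the count of distinct values.
result = (
    df.groupby(['a', 'b'])['c']
      .agg(total='sum', n_unique='nunique')
      .reset_index()
)

# Reproduce the question's definition: sum / number of distinct values.
result['avg'] = result['total'] / result['n_unique']

# Drop the helper column and use the question's column names.
result = (
    result.drop(columns='n_unique')
          .rename(columns={'a': 'aid', 'b': 'bid'})
)
```

Here result holds one row per (a, b) pair with the columns aid, bid, total and avg, which is the same information the loop collects into x.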
