This article shows how to use pandas groupby to get the minimum of one column and the maximum of another column for each group.
Problem description
I have a dataframe as follows:

    user  num1  num2
    a     1     1
    a     2     2
    a     3     3
    b     4     4
    b     5     5

I want a dataframe which has the minimum of num1 for each user and the maximum of num2 for each user.
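
To follow along, the frame above can be rebuilt with a few lines. This is only a sketch: the variable name `a` is taken from the snippets in the question, and the integer column types are an assumption based on the values shown.

```python
import pandas as pd

# Rebuild the example frame from the question.
# The name `a` matches the snippets in the question; integer dtypes are assumed.
a = pd.DataFrame({
    "user": ["a", "a", "a", "b", "b"],
    "num1": [1, 2, 3, 4, 5],
    "num2": [1, 2, 3, 4, 5],
})
print(a)
```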
The output should look like:

    user  num1  num2
    a     1     3
    b     4     5

I know that if I wanted the max of both columns, I could just do:

    # select several columns with a list; the bare tuple form ['num1', 'num2'] no longer works in recent pandas
    a.groupby('user')[['num1', 'num2']].max()

Is there some equivalent without having to do something like:

    series_1 = a.groupby('user')['num1'].min()
    series_2 = a.groupby('user')['num2'].max()

    # converting from series to df so I can do a join on user
    df_1 = pd.DataFrame(np.array([series_1]).transpose(), index=series_1.index, columns=['num1'])
    df_2 = pd.DataFrame(np.array([series_2]).transpose(), index=series_2.index, columns=['num2'])

    df_1.join(df_2)

Recommended answer
Use groupby + agg with a dict that maps each column to its aggregation. The column order then has to be set explicitly, either by selecting a subset of columns or with reindex (the original answer used reindex_axis, which was removed in pandas 1.0). Finally, add reset_index to convert the index to a column if necessary.

    df = a.groupby('user').agg({'num1': 'min', 'num2': 'max'})[['num1', 'num2']].reset_index()
    print(df)
      user  num1  num2
    0    a     1     3
    1    b     4     5

This is the same as:

    df = (a.groupby('user')
           .agg({'num1': 'min', 'num2': 'max'})
           # reindex replaces reindex_axis, which was removed in pandas 1.0
           .reindex(columns=['num1', 'num2'])
           .reset_index())
    print(df)
      user  num1  num2
    0    a     1     3
    1    b     4     5
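
As a possible alternative that is not part of the original answer, pandas 0.25 and later support named aggregation, where each keyword argument names an output column and gives an (input column, function) pair. The sketch below assumes the example frame from the question, rebuilt here under the name `a`; the output column order follows the keyword order, so no separate reordering step should be needed.

```python
import pandas as pd

# Example frame from the question (rebuilt here so the snippet is self-contained).
a = pd.DataFrame({
    "user": ["a", "a", "a", "b", "b"],
    "num1": [1, 2, 3, 4, 5],
    "num2": [1, 2, 3, 4, 5],
})

# Named aggregation: output_column=(input_column, aggregation).
# Requires pandas >= 0.25.
df = (
    a.groupby("user")
     .agg(num1=("num1", "min"), num2=("num2", "max"))
     .reset_index()  # turn the `user` index back into a regular column
)
print(df)
```

This form also makes it easy to give the results different names, for example `num1_min=("num1", "min")`, which the dict form does not allow directly.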