pandas: sorting by group aggregate

Updated: 2024-10-27 06:32:14

Question

I've already seen this question, but the desired outcome there is slightly different from mine.

Imagine a dataframe grouped like this:

    df.groupby(['product_name', 'usage_type']).total_cost.sum()

    product_name  usage_type
    Lorem         A                30.694665
                  B                 0.000634
                  C                 1.659360
                  D                 0.000031
                  E              3339.140042
                  F                 0.074340
    Ipsum         G                 9.627360
                  A                19.053377
                  D                14.492155
    Dolor         B                 9.698245
                  H              6993.792163
                  C             31947.955679
                  D              2150.400001
                  E                26.337789
    Name: total_cost, dtype: float64

The output I want is the same structure, but with two properties:

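Product names ordered by their sum of cost

Usage types sorted lexicographically (another viable option: have those usage types sorted in descending order)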

That way the highest-cost products show up first, while the per-usage-type breakdown is preserved.

If it is significantly simpler, I'm okay with dropping the secondary sorting by usage type.

Recommended answer

Starting with your grouped DataFrame:

    import pandas as pd

    # 'data' is a whitespace-separated text file with the columns
    # product_name, usage_type and val.
    df2 = pd.read_table('data', sep=r'\s+').set_index(['product_name', 'usage_type'])

    #                                   val
    # product_name usage_type
    # Lorem        A              30.694665
    #              B               0.000634
    #              C               1.659360
    #              D               0.000031
    #              E            3339.140042
    #              F               0.074340
    # Ipsum        G               9.627360
    #              A              19.053377
    #              D              14.492155
    # Dolor        B               9.698245
    #              H            6993.792163
    #              C           31947.955679
    #              D            2150.400001
    #              E              26.337789

You could store the key values in new columns:

    df2['key1'] = df2.groupby(level='product_name')['val'].transform('sum')
    df2['key2'] = df2.index.get_level_values('usage_type')

and then sort by those key columns:

    # DataFrame.sort was removed in pandas 0.20; sort_values is its replacement.
    # >>> df2.sort_values(['key1', 'key2'], ascending=[False, True])
    #                                   val          key1 key2
    # product_name usage_type
    # Dolor        B               9.698245  41128.183877    B
    #              C           31947.955679  41128.183877    C
    #              D            2150.400001  41128.183877    D
    #              E              26.337789  41128.183877    E
    #              H            6993.792163  41128.183877    H
    # Lorem        A              30.694665   3371.569072    A
    #              B               0.000634   3371.569072    B
    #              C               1.659360   3371.569072    C
    #              D               0.000031   3371.569072    D
    #              E            3339.140042   3371.569072    E
    #              F               0.074340   3371.569072    F
    # Ipsum        A              19.053377     43.172892    A
    #              D              14.492155     43.172892    D
    #              G               9.627360     43.172892    G

    result = df2.sort_values(['key1', 'key2'], ascending=[False, True])['val']
    print(result)

yields

    product_name  usage_type
    Dolor         B                 9.698245
                  C             31947.955679
                  D              2150.400001
                  E                26.337789
                  H              6993.792163
    Lorem         A                30.694665
                  B                 0.000634
                  C                 1.659360
                  D                 0.000031
                  E              3339.140042
                  F                 0.074340
    Ipsum         A                19.053377
                  D                14.492155
                  G                 9.627360
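For readers who do not have the 'data' file, the steps above can be sketched end to end as follows. This is a minimal sketch, not the answer's original code: it builds the frame inline from the values shown in the question (the `rows` list and the `by_cost` name are introduced here for illustration only) and assumes a pandas version that provides `sort_values`. It also shows the alternative the question mentions, where usage types are ordered by descending cost instead of lexicographically.

```python
import pandas as pd

# Values copied from the question; `rows` is a hypothetical stand-in for the 'data' file.
rows = [
    ("Lorem", "A", 30.694665), ("Lorem", "B", 0.000634), ("Lorem", "C", 1.659360),
    ("Lorem", "D", 0.000031), ("Lorem", "E", 3339.140042), ("Lorem", "F", 0.074340),
    ("Ipsum", "G", 9.627360), ("Ipsum", "A", 19.053377), ("Ipsum", "D", 14.492155),
    ("Dolor", "B", 9.698245), ("Dolor", "H", 6993.792163), ("Dolor", "C", 31947.955679),
    ("Dolor", "D", 2150.400001), ("Dolor", "E", 26.337789),
]
df2 = pd.DataFrame(rows, columns=["product_name", "usage_type", "val"])
df2 = df2.set_index(["product_name", "usage_type"])

# key1: the per-product total, repeated on every row of that product.
df2["key1"] = df2.groupby(level="product_name")["val"].transform("sum")
# key2: the usage type, copied out of the index so it can serve as a sort key.
df2["key2"] = df2.index.get_level_values("usage_type")

# Products by descending total, usage types alphabetically within each product.
result = df2.sort_values(["key1", "key2"], ascending=[False, True])["val"]

# Alternative: within each product, usage types by descending cost.
by_cost = df2.sort_values(["key1", "val"], ascending=[False, False])["val"]

print(result)
print(by_cost)
```

Because `key1` is identical for every row of a product, sorting on it first keeps each product's rows together, and the second key only decides the order inside each product.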


Published: 2023-11-22 04:14:47

Article link: https://www.elefans.com/category/jswz/34/1615940.html