pandas: sorting by group aggregate

Updated: 2024-10-27 06:32:14

Problem description

I've already seen this question, but the desired outcome there is slightly different from mine.

Imagine a DataFrame grouped like this:

    df.groupby(['product_name', 'usage_type']).total_cost.sum()

    product_name  usage_type
    Lorem         A                30.694665
                  B                 0.000634
                  C                 1.659360
                  D                 0.000031
                  E              3339.140042
                  F                 0.074340
    Ipsum         G                 9.627360
                  A                19.053377
                  D                14.492155
    Dolor         B                 9.698245
                  H              6993.792163
                  C             31947.955679
                  D              2150.400001
                  E                26.337789
    Name: total_cost, dtype: float64

The output I want has the same structure, but with two properties:

  • Product names ordered by their total cost
  • Usage types sorted lexicographically (another acceptable option: usage types sorted in descending order)

The highest-cost products should show up first, while the per-usage-type breakdown is preserved.

If it is significantly simpler, I'm okay with dropping the secondary sorting by usage type.

Recommended answer

Starting with your grouped DataFrame:

    import pandas as pd

    # 'data' is assumed to be a whitespace-delimited text file with the
    # columns product_name, usage_type and val
    df2 = pd.read_table('data', sep=r'\s+').set_index(['product_name', 'usage_type'])
    #                            val
    # product_name usage_type
    # Lorem        A               30.694665
    #              B                0.000634
    #              C                1.659360
    #              D                0.000031
    #              E             3339.140042
    #              F                0.074340
    # Ipsum        G                9.627360
    #              A               19.053377
    #              D               14.492155
    # Dolor        B                9.698245
    #              H             6993.792163
    #              C            31947.955679
    #              D             2150.400001
    #              E               26.337789
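The answer reads its input from a file named 'data' whose exact contents are not shown. As a minimal sketch, assuming that file holds one row per product/usage pair with a header of product_name, usage_type and val, the same DataFrame could be built from an inline string so the example runs without any external file:

```python
import io

import pandas as pd

# Assumed contents of the 'data' file: whitespace-delimited, one header row.
raw = """\
product_name usage_type val
Lorem A 30.694665
Lorem B 0.000634
Lorem C 1.659360
Lorem D 0.000031
Lorem E 3339.140042
Lorem F 0.074340
Ipsum G 9.627360
Ipsum A 19.053377
Ipsum D 14.492155
Dolor B 9.698245
Dolor H 6993.792163
Dolor C 31947.955679
Dolor D 2150.400001
Dolor E 26.337789
"""

# Parse the text and move the two grouping columns into a MultiIndex.
df2 = pd.read_csv(io.StringIO(raw), sep=r'\s+').set_index(
    ['product_name', 'usage_type'])
print(df2)
```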

You could store the key values in new columns:

    df2['key1'] = df2.groupby(level='product_name')['val'].transform('sum')
    df2['key2'] = df2.index.get_level_values('usage_type')
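To see why `transform('sum')` is used here rather than a plain `sum()`, the following sketch runs both on a small made-up frame (the name `small` and its values are illustrative, not from the original post). `sum()` returns one row per product, whereas `transform('sum')` repeats each product's total on every row of that product, which is what makes it usable as a sort key:

```python
import pandas as pd

# Hypothetical four-row frame with the same index layout as df2.
idx = pd.MultiIndex.from_tuples(
    [('Lorem', 'A'), ('Lorem', 'B'), ('Ipsum', 'G'), ('Ipsum', 'A')],
    names=['product_name', 'usage_type'])
small = pd.DataFrame({'val': [30.0, 1.0, 9.0, 19.0]}, index=idx)

# One total per product: the result has only two rows.
totals = small.groupby(level='product_name')['val'].sum()

# Same totals, but aligned to the original rows (four values).
small['key1'] = small.groupby(level='product_name')['val'].transform('sum')

# Copy the second index level into an ordinary column.
small['key2'] = small.index.get_level_values('usage_type')

print(totals)
print(small)
```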

and then sort by those key columns:

    # DataFrame.sort was removed in later pandas releases; sort_values replaces it.
    # >>> df2.sort_values(['key1', 'key2'], ascending=[False, True])
    #                            val          key1 key2
    # product_name usage_type
    # Dolor        B                9.698245  41128.183877    B
    #              C            31947.955679  41128.183877    C
    #              D             2150.400001  41128.183877    D
    #              E               26.337789  41128.183877    E
    #              H             6993.792163  41128.183877    H
    # Lorem        A               30.694665   3371.569072    A
    #              B                0.000634   3371.569072    B
    #              C                1.659360   3371.569072    C
    #              D                0.000031   3371.569072    D
    #              E             3339.140042   3371.569072    E
    #              F                0.074340   3371.569072    F
    # Ipsum        A               19.053377     43.172892    A
    #              D               14.492155     43.172892    D
    #              G                9.627360     43.172892    G

    result = df2.sort_values(['key1', 'key2'], ascending=[False, True])['val']
    print(result)

yields

    product_name  usage_type
    Dolor         B                 9.698245
                  C             31947.955679
                  D              2150.400001
                  E                26.337789
                  H              6993.792163
    Lorem         A                30.694665
                  B                 0.000634
                  C                 1.659360
                  D                 0.000031
                  E              3339.140042
                  F                 0.074340
    Ipsum         A                19.053377
                  D                14.492155
                  G                 9.627360
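The question also mentions sorting the usage types in descending order as an acceptable alternative. A possible sketch of that variant follows, reading "descending" as descending by cost within each product, which is an assumption. It uses `assign` so the helper column `key1` lives only on a temporary copy and `df2` itself is left unchanged; the name `by_cost` is illustrative.

```python
import pandas as pd

# Rebuild the grouped data from the question so the example is self-contained.
products = ['Lorem'] * 6 + ['Ipsum'] * 3 + ['Dolor'] * 5
usages = list('ABCDEF') + list('GAD') + list('BHCDE')
vals = [30.694665, 0.000634, 1.659360, 0.000031, 3339.140042, 0.074340,
        9.627360, 19.053377, 14.492155,
        9.698245, 6993.792163, 31947.955679, 2150.400001, 26.337789]
idx = pd.MultiIndex.from_arrays(
    [products, usages], names=['product_name', 'usage_type'])
df2 = pd.DataFrame({'val': vals}, index=idx)

# key1 = total cost of the product, repeated on each of its rows.
# Sort by product total (descending), then by row cost (descending).
by_cost = (
    df2.assign(key1=df2.groupby(level='product_name')['val'].transform('sum'))
       .sort_values(['key1', 'val'], ascending=[False, False])['val']
)
print(by_cost)
```

If two products could have exactly the same total, their rows would interleave under this ordering; adding the product name as an extra sort key between `key1` and `val` would keep each product's rows together.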

Published: 2023-11-22 04:14:47
Source: https://www.elefans.com/category/jswz/34/1615940.html