pandas: unstack rows into new columns

Updated: 2024-10-12 01:29:23

Problem description

I have a df that looks like this:

         a        date    c
    0  ABC  2020-06-01  0.1
    1  ABC  2020-05-01  0.2
    2  DEF  2020-07-01  0.3
    3  DEF  2020-01-01  0.4
    4  DEF  2020-02-01  0.5
    5  DEF  2020-07-01  0.6

I would like to "unstack" column 'a' so that my new df looks like this:

         a       date1   c1       date2   c2       date3   c3       date4   c4
    0  ABC  2020-06-01  0.1  2020-05-01  0.2         nan  nan         nan  nan
    1  DEF  2020-07-01  0.3  2020-01-01  0.4  2020-02-01  0.5  2020-07-01  0.6

How can I do this?

Recommended answer

Use GroupBy.cumcount to build a helper counter for a MultiIndex and reshape with DataFrame.unstack. Then use DataFrame.sort_index to put the columns in the correct order, and map to flatten the MultiIndex columns:

    df = (df.set_index(['a', df.groupby('a').cumcount().add(1)])
            .unstack()
            .sort_index(axis=1, level=[1, 0], ascending=[True, False]))
    df.columns = df.columns.map(lambda x: f'{x[0]}{x[1]}')
    df = df.reset_index()
    print(df)

         a       date1   c1       date2   c2       date3   c3       date4   c4
    0  ABC  2020-06-01  0.1  2020-05-01  0.2         NaN  NaN         NaN  NaN
    1  DEF  2020-07-01  0.3  2020-01-01  0.4  2020-02-01  0.5  2020-07-01  0.6
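For anyone who wants to run the steps one at a time, here is a self-contained sketch of the same approach. The sample frame is rebuilt from the question with the dates kept as plain strings, and the variable names counter and wide are only illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (dates kept as strings: an assumption).
df = pd.DataFrame({
    "a": ["ABC", "ABC", "DEF", "DEF", "DEF", "DEF"],
    "date": ["2020-06-01", "2020-05-01", "2020-07-01",
             "2020-01-01", "2020-02-01", "2020-07-01"],
    "c": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

# Number the rows inside each group of 'a', starting at 1.
counter = df.groupby("a").cumcount().add(1)
print(counter.tolist())  # [1, 2, 1, 2, 3, 4]

# Index by ('a', counter), then move the counter level into the columns.
wide = df.set_index(["a", counter]).unstack()

# Sort by counter ascending, then by column name descending ('date' before 'c').
wide = wide.sort_index(axis=1, level=[1, 0], ascending=[True, False])

# Flatten ('date', 1) into 'date1', and so on.
wide.columns = [f"{name}{num}" for name, num in wide.columns]
wide = wide.reset_index()
print(wide)
```

The descending sort on the name level only gives the wanted order here because 'date' happens to sort after 'c'. With other column names that order may not hold.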

Or, if sorting cannot produce the right order because the column names differ, one option is to use DataFrame.reindex:

    import pandas as pd

    df1 = df.set_index(['a', df.groupby('a').cumcount().add(1)])
    mux = pd.MultiIndex.from_product([df1.index.levels[1], ['date', 'c']])
    df = df1.unstack().swaplevel(1, 0, axis=1).reindex(mux, axis=1)
    df.columns = df.columns.map(lambda x: f'{x[1]}{x[0]}')
    df = df.reset_index()
    print(df)

         a       date1   c1       date2   c2       date3   c3       date4   c4
    0  ABC  2020-06-01  0.1  2020-05-01  0.2         NaN  NaN         NaN  NaN
    1  DEF  2020-07-01  0.3  2020-01-01  0.4  2020-02-01  0.5  2020-07-01  0.6
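To show a case where reindex is needed, the sketch below wraps the same steps in a helper and adds a third column so that neither ascending nor descending sorting gives the wanted order (date, b, c). The function name unstack_rows, its parameters, and the extra column 'b' are assumptions made for illustration and do not come from the original answer.

```python
import pandas as pd

def unstack_rows(df, key, value_cols):
    """Hypothetical helper: spread each group's rows across numbered columns."""
    # Per-group row number, starting at 1.
    counter = df.groupby(key).cumcount().add(1)
    indexed = df.set_index([key, counter])
    # Explicit column order: counter outermost, value_cols in the order given.
    order = pd.MultiIndex.from_product([indexed.index.levels[1], value_cols])
    wide = (indexed[value_cols]
            .unstack()
            .swaplevel(0, 1, axis=1)
            .reindex(order, axis=1))
    # Flatten (1, 'date') into 'date1', and so on.
    wide.columns = [f"{name}{num}" for num, name in wide.columns]
    return wide.reset_index()

# Small sample with an extra, made-up column 'b'.
sample = pd.DataFrame({
    "a": ["ABC", "ABC", "DEF"],
    "date": ["2020-06-01", "2020-05-01", "2020-07-01"],
    "b": ["x", "y", "z"],
    "c": [0.1, 0.2, 0.3],
})

out = unstack_rows(sample, "a", ["date", "b", "c"])
print(out.columns.tolist())
# ['a', 'date1', 'b1', 'c1', 'date2', 'b2', 'c2']
```

Passing the column order explicitly keeps the result independent of how the names happen to sort.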

Published: 2023-11-30 00:04:16

Article link: https://www.elefans.com/category/jswz/34/1648047.html