How to aggregate only the numerical columns in a mixed dtypes dataframe

Updated: 2024-10-08 12:37:13

Problem description

I have a pd.DataFrame with mixed dtypes:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'A': 1.,
        'B': pd.Timestamp('20130102'),
        'C': pd.Timestamp('20180101'),
        'D': np.random.rand(10),
        'F': 'foo'
    })
    df
    Out[12]:
         A          B          C         D    F
    0  1.0 2013-01-02 2018-01-01  0.592533  foo
    1  1.0 2013-01-02 2018-01-01  0.819248  foo
    2  1.0 2013-01-02 2018-01-01  0.298035  foo
    3  1.0 2013-01-02 2018-01-01  0.330128  foo
    4  1.0 2013-01-02 2018-01-01  0.371705  foo
    5  1.0 2013-01-02 2018-01-01  0.541246  foo
    6  1.0 2013-01-02 2018-01-01  0.976108  foo
    7  1.0 2013-01-02 2018-01-01  0.423069  foo
    8  1.0 2013-01-02 2018-01-01  0.863764  foo
    9  1.0 2013-01-02 2018-01-01  0.037085  foo

I would like to aggregate my numerical columns but also keep the non-numerical ones. If I do a groupby followed by agg, I get:

    df.groupby('B').agg(np.median)
    Out[13]:
                  A         D
    B
    2013-01-02  1.0  0.482157

This is fine, and I know it is the desired behavior, since the other dtypes would probably raise exceptions inside np.median. However, I would also like to keep my original column F with its value foo, as well as column C with its value 2018-01-01.
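The output above comes from a pandas version that silently dropped columns it could not aggregate. To my understanding, pandas 2.x no longer does this and raises on the string column instead, so numeric-only behavior has to be requested explicitly. A minimal sketch, assuming the same column layout as the question but with a deterministic column D (an assumption made so the result is reproducible):

```python
import numpy as np
import pandas as pd

# Same layout as the question; D is deterministic here (assumption for reproducibility).
df = pd.DataFrame({
    'A': 1.,
    'B': pd.Timestamp('20130102'),
    'C': pd.Timestamp('20180101'),
    'D': np.arange(10, dtype=float),
    'F': 'foo',
})

# numeric_only=True restricts the median to the numeric columns (A and D here).
numeric_median = df.groupby('B').median(numeric_only=True)
print(numeric_median)
```

This reproduces the numeric part only; columns C and F are still missing from the result, which is the problem the question is about.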

So far, I have solved this with a custom wrapper around my numerical aggregation functions, e.g. if I wanted to compute a nanmedian over my dataframe:

    def my_nan_median(x):
        if isinstance(x.values[0], np.datetime64):
            return np.min(x)  # let the first datetime pass!
        elif isinstance(x.values[0], str):
            return x.values[0]  # let the strings pass!
        else:
            return np.nanmedian(x)

But it looks awful. What is the right way to do this?

Recommended answer

Use select_dtypes to group by all the non-numeric columns, so that they are carried along as group keys:

    df.groupby(list(df.select_dtypes(exclude=[np.number]))).agg(np.median).reset_index()
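The one-liner above can be spelled out step by step. The following is a minimal sketch of the same idea, not a definitive implementation: the names `key_cols`, `num_cols` and `result` are hypothetical, column D is made deterministic so the result is reproducible, and the string `'median'` is used in place of `np.median`.

```python
import numpy as np
import pandas as pd

# Same layout as the question; D is deterministic here (assumption for reproducibility).
df = pd.DataFrame({
    'A': 1.,
    'B': pd.Timestamp('20130102'),
    'C': pd.Timestamp('20180101'),
    'D': np.arange(10, dtype=float),
    'F': 'foo',
})

# Split the column names by dtype (key_cols / num_cols are illustrative names).
key_cols = list(df.select_dtypes(exclude=[np.number]).columns)
num_cols = list(df.select_dtypes(include=[np.number]).columns)

# Group by every non-numeric column and aggregate only the numeric ones.
result = df.groupby(key_cols, as_index=False)[num_cols].median()
print(result)
```

Note that this changes what a group is: rows are grouped by the combination of B, C and F, not by B alone. That matches the question's data, where C and F are constant, but it would split groups if those columns varied within one value of B. By default groupby also drops rows whose key contains a missing value.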

Or like this:

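    # Median of the numeric columns for each group
    df1 = df.groupby('B', as_index=False).agg(np.median)
    # Re-attach the first row of the remaining columns for each group
    pd.concat([df1, df.drop_duplicates(['B']).drop(columns=list(df1)).reset_index(drop=True)], axis=1)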
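A different technique that also groups by B alone is to pass agg a per-column mapping: the median for numeric columns and the first value for everything else. This is a sketch under assumptions rather than part of the original answer: `agg_map` and `result` are hypothetical names, and column D is made deterministic so the result is reproducible.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same layout as the question; D is deterministic here (assumption for reproducibility).
df = pd.DataFrame({
    'A': 1.,
    'B': pd.Timestamp('20130102'),
    'C': pd.Timestamp('20180101'),
    'D': np.arange(10, dtype=float),
    'F': 'foo',
})

# Per-column aggregation map (illustrative name): median for numeric columns,
# first value for the rest. The grouping key B is left out of the map.
agg_map = {
    col: 'median' if is_numeric_dtype(df[col]) else 'first'
    for col in df.columns
    if col != 'B'
}

result = df.groupby('B', as_index=False).agg(agg_map)
print(result)
```

Because everything happens in a single groupby, each group's values stay on the same row. The concat variant relies on positional alignment, and with several groups the sorted groupby result and the first-appearance order of drop_duplicates may not line up. Also note that `'first'` returns the first non-null entry per group, which may differ from the first row.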


Published: 2023-11-22 08:04:58

Article link: https://www.elefans.com/category/jswz/34/1616609.html