I'm trying to sum across columns of a Pandas dataframe, and when I have NaNs in every column I'm getting sum = zero; I'd expected sum = NaN based on the docs. Here's what I've got:
In [136]: df = pd.DataFrame()

In [137]: df['a'] = [1,2,np.nan,3]

In [138]: df['b'] = [4,5,np.nan,6]

In [139]: df
Out[139]:
     a    b
0    1    4
1    2    5
2  NaN  NaN
3    3    6

In [140]: df['total'] = df.sum(axis=1)

In [141]: df
Out[141]:
     a    b  total
0    1    4      5
1    2    5      7
2  NaN  NaN      0
3    3    6      9
The pandas.DataFrame.sum docs say "If an entire row/column is NA, the result will be NA", so I don't understand why "total" = 0 and not NaN for index 2. What am I missing?
Recommended answer
DataFrame.sum(self, axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)
min_count : int, default 0
The required number of valid values to perform the operation. If fewer than min_count non-NA values are present the result will be NA.
New in version 0.22.0: Added with the default being 0. This means the sum of an all-NA or empty Series is 0, and the product of an all-NA or empty Series is 1.
As the excerpt from the latest pandas docs shows, min_count defaults to 0, so the sum of an all-NA row, column, or Series is 0 rather than NaN.
If you pass min_count=1, the sum of an all-NA row will be NaN.
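As a minimal sketch of that difference, the snippet below rebuilds the DataFrame from the question and sums each row twice, once with the default and once with min_count=1. It assumes pandas 0.22.0 or later, where min_count is available; the variable names default_total and strict_total are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question: row 2 is NaN in every column.
df = pd.DataFrame({"a": [1, 2, np.nan, 3], "b": [4, 5, np.nan, 6]})

# Default (min_count=0): NaNs are skipped, so an all-NaN row sums to 0.
default_total = df.sum(axis=1)

# min_count=1: at least one non-NA value is required, otherwise the result is NaN.
strict_total = df.sum(axis=1, min_count=1)

print(default_total.tolist())  # [5.0, 7.0, 0.0, 9.0]
print(strict_total.tolist())   # [5.0, 7.0, nan, 9.0]
```

Assigning df['total'] = df.sum(axis=1, min_count=1) should therefore give the NaN the question expected at index 2.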