Given a DataFrame, I would like to compute the number of zeros in each row. How can I compute it with Pandas?
This is what I have done so far; it returns a boolean DataFrame marking where the zeros are, not a count per row:
def is_blank(x):
    return x == 0

indexer = train_df.applymap(is_blank)
Best answer
Use a boolean comparison, which produces a boolean DataFrame. We can then cast it to int, so that True becomes 1 and False becomes 0, and call sum with axis=1 to count row-wise:
In [56]: df = pd.DataFrame({'a':[1,0,0,1,3], 'b':[0,0,1,0,1], 'c':[0,0,0,0,0]})
         df
Out[56]:
   a  b  c
0  1  0  0
1  0  0  0
2  0  1  0
3  1  0  0
4  3  1  0

In [64]: (df == 0).astype(int).sum(axis=1)
Out[64]:
0    2
1    3
2    2
3    2
4    1
dtype: int64
Breaking the above down:
In [65]: (df == 0)
Out[65]:
       a      b     c
0  False   True  True
1   True   True  True
2   True  False  True
3  False   True  True
4  False  False  True

In [66]: (df == 0).astype(int)
Out[66]:
   a  b  c
0  0  1  1
1  1  1  1
2  1  0  1
3  0  1  1
4  0  0  1
EDIT
As david pointed out, the astype to int is unnecessary because the Boolean values are upcast to int when sum is called, so this simplifies to:
(df == 0).sum(axis=1)
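As a quick check of the simplification, here is a minimal, self-contained sketch that reuses the sample frame from the answer and compares the version with the cast against the version without it. The variable names (zeros_per_row, with_cast, zeros_per_col) are only illustrative and are not part of the original answer; the column-wise count with axis=0 is added purely for contrast.

```python
import pandas as pd

# Same sample frame as in the answer.
df = pd.DataFrame({'a': [1, 0, 0, 1, 3],
                   'b': [0, 0, 1, 0, 1],
                   'c': [0, 0, 0, 0, 0]})

# Boolean mask summed row-wise: each True counts as 1.
zeros_per_row = (df == 0).sum(axis=1)

# The explicit cast to int yields the same counts, so it can be dropped.
with_cast = (df == 0).astype(int).sum(axis=1)

# For contrast, axis=0 counts zeros per column instead of per row.
zeros_per_col = (df == 0).sum(axis=0)

print(zeros_per_row.tolist())  # [2, 3, 2, 2, 1]
print(zeros_per_col.tolist())  # [2, 3, 5]
```

Note that missing values are not counted by this comparison, since NaN == 0 evaluates to False.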