pandas conditional column count

Updated: 2024-10-17 07:36:30

Problem description

I have a dataframe that looks like this:

a1 | a2 | b3 | b4 | b5 | c | d
1  | 2  | 3  | 4  | 5  | 1 | 1
1  | 4  | 5  | 3  | 2  | 0 | 0
2  | 3  | 1  | 1  | 0  | 0 | 0

I want to create two columns, "a_count", and "b_count".

For each row where the value of "d" is 1 OR "c" is 0:

"a_count" should represent the number of times '1' appears in a1 or a2

"b_count" should represent the number of times '1' appears in b3/b4/b5

If both 'd' and 'c' are 0 it should just be a 0.

So the resulting output would look like...

a1 | a2 | b3 | b4 | b5 | c | d | a_count | b_count
1  | 2  | 3  | 4  | 5  | 0 | 0 | 0       | 0
1  | 4  | 5  | 3  | 2  | 1 | 0 | 1       | 0
1  | 1  | 1  | 1  | 0  | 0 | 1 | 2       | 2

It's fine if I compute a_count and b_count separately. I guess I could use a combination of np.where, etc. but I think what confused me was figuring out how to get a count within either columns a1/a2 or b3/b4/b5 where the respective values were 1 AND the condition for c and d was met.

Maybe it's a straightforward question but my brain is just fried right now :( If it is too trivial can someone just point me in the right direction?

Recommended answer

Yes, np.where is a good choice for this problem.

df['a_count'] = np.where((df['c'] == 0) & (df['d'] == 0), 0, (df[['a1', 'a2']] == 1).sum(1))
df['b_count'] = np.where((df['c'] == 0) & (df['d'] == 0), 0, (df[['b3', 'b4', 'b5']] == 1).sum(1))
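For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same np.where approach. The sample frame is built from the first seven columns of the expected-output table in the question, and the helper name both_zero is just an illustrative choice, not something from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data: the first seven columns of the question's expected-output table.
df = pd.DataFrame({
    "a1": [1, 1, 1],
    "a2": [2, 4, 1],
    "b3": [3, 5, 1],
    "b4": [4, 3, 1],
    "b5": [5, 2, 0],
    "c": [0, 1, 0],
    "d": [0, 0, 1],
})

# Rows where both flags are 0 get a count of 0 (both_zero is a hypothetical name).
both_zero = (df["c"] == 0) & (df["d"] == 0)

# (frame == 1) gives a boolean frame; summing across axis=1 counts the 1s per row.
df["a_count"] = np.where(both_zero, 0, (df[["a1", "a2"]] == 1).sum(axis=1))
df["b_count"] = np.where(both_zero, 0, (df[["b3", "b4", "b5"]] == 1).sum(axis=1))

print(df)
```

On this data a_count comes out as 0, 1, 2 and b_count as 0, 0, 2, which matches the output the question asked for. Computing the mask once and reusing it for both columns only avoids repeating the condition; the result is the same as writing it inline.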

Published: 2023-10-28 05:03:26

Article link: https://www.elefans.com/category/jswz/34/1535602.html