My input looks like the DataFrame below.
I need to group by columns (A, B), compute the length of each run of consecutive zeros within each group, and write it to a new column "Consec_zero_count".
Input:

    A    B  DATE      hour  measure
    A10  1  1/1/2014  0     0
    A10  1  1/1/2014  1     0
    A10  1  1/1/2014  2     0
    A10  1  1/1/2014  3     0
    A10  2  1/1/2014  4     0
    A10  2  1/1/2014  5     1
    A10  2  1/1/2014  6     2
    A10  3  1/1/2014  7     0
    A11  1  1/1/2014  8     0
    A11  1  1/1/2014  9     0
    A11  1  1/1/2014  10    2
    A11  1  1/1/2014  11    0
    A11  1  1/1/2014  12    0
    A12  2  1/1/2014  13    1
    A12  2  1/1/2014  14    3
    A12  2  1/1/2014  15    0
    A12  4  1/1/2014  16    5
    A12  4  1/1/2014  17    0
    A12  6  1/1/2014  18    0
I tried using groupby to get the groups, but what I am after is counting consecutive zeros within each group. I also tried a lambda function, but that counts the total number of zeros in the group, whereas I am interested in runs of consecutive zeros. I want my output to look like this:
Output:

    A    B  DATE      hour  measure  Consec_zero_count
    A10  1  1/1/2014  0     0        4
    A10  1  1/1/2014  1     0        4
    A10  1  1/1/2014  2     0        4
    A10  1  1/1/2014  3     0        4
    A10  2  1/1/2014  4     0        1
    A10  2  1/1/2014  5     1        0
    A10  2  1/1/2014  6     2        0
    A10  3  1/1/2014  7     0        1
    A11  1  1/1/2014  8     0        2
    A11  1  1/1/2014  9     0        2
    A11  1  1/1/2014  10    2        0
    A11  1  1/1/2014  11    0        2
    A11  1  1/1/2014  12    0        2
    A12  2  1/1/2014  13    1        0
    A12  2  1/1/2014  14    3        0
    A12  2  1/1/2014  15    0        1
    A12  4  1/1/2014  16    5        0
    A12  4  1/1/2014  17    0        1
    A12  6  1/1/2014  18    0        1
Any leads would be appreciated. Thanks in advance!
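To make the difference between the two counts concrete, here is a minimal sketch on a single group from the input (A = A11, B = 1). The lambda is only an assumption about what the attempt looked like, not the original code.

```python
import pandas as pd

# One group taken from the input above: A == "A11", B == 1.
group = pd.DataFrame({
    "A": ["A11"] * 5,
    "B": [1] * 5,
    "measure": [0, 0, 2, 0, 0],
})

# Assumed shape of the lambda attempt: it counts every zero in the group,
# regardless of whether the zeros are adjacent.
total_zeros = group.groupby(["A", "B"])["measure"].transform(
    lambda s: (s == 0).sum()
)

print(total_zeros.tolist())  # [4, 4, 4, 4, 4]
# Wanted instead (length of each run of zeros, 0 for non-zero rows):
# [2, 2, 0, 2, 2]
```

The group holds four zeros in total, but they form two separate runs of length 2, which is what the desired column reports.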
Recommended answer
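Create a helper Series that labels each run of consecutive equal values: compare measure with its shifted values using ne (!=), then take the cumsum. Next, groupby on A, B and that helper, and use transform with size to get the length of each run. Finally, keep the counts only for rows where measure is 0 by using numpy.where: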
    import numpy as np

    # Run label: increases by 1 every time measure differs from the previous row
    g = df['measure'].ne(df['measure'].shift()).cumsum()
    # Length of each run, restricted to the (A, B) group it belongs to
    counts = df.groupby(['A','B', g])['measure'].transform('size')
    # Keep the run length for zeros, write 0 for everything else
    df['Consec_zero_count'] = np.where(df['measure'].eq(0), counts, 0)
    print (df)

          A  B      DATE  hour  measure  Consec_zero_count
    0   A10  1  1/1/2014     0        0                  4
    1   A10  1  1/1/2014     1        0                  4
    2   A10  1  1/1/2014     2        0                  4
    3   A10  1  1/1/2014     3        0                  4
    4   A10  2  1/1/2014     4        0                  1
    5   A10  2  1/1/2014     5        1                  0
    6   A10  2  1/1/2014     6        2                  0
    7   A10  3  1/1/2014     7        0                  1
    8   A11  1  1/1/2014     8        0                  2
    9   A11  1  1/1/2014     9        0                  2
    10  A11  1  1/1/2014    10        2                  0
    11  A11  1  1/1/2014    11        0                  2
    12  A11  1  1/1/2014    12        0                  2
    13  A12  2  1/1/2014    13        1                  0
    14  A12  2  1/1/2014    14        3                  0
    15  A12  2  1/1/2014    15        0                  1
    16  A12  4  1/1/2014    16        5                  0
    17  A12  4  1/1/2014    17        0                  1
    18  A12  6  1/1/2014    18        0                  1
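For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the question's input and applies the same shift / cumsum / groupby / numpy.where steps. The function name consecutive_zero_count and its parameters are hypothetical, introduced only for illustration. It assumes the rows of each (A, B) group are contiguous and already in the order you care about, as they are in the question's data.

```python
import numpy as np
import pandas as pd

# Input data copied from the question.
df = pd.DataFrame({
    "A": ["A10"] * 8 + ["A11"] * 5 + ["A12"] * 6,
    "B": [1, 1, 1, 1, 2, 2, 2, 3, 1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 6],
    "DATE": ["1/1/2014"] * 19,
    "hour": list(range(19)),
    "measure": [0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 2, 0, 0, 1, 3, 0, 5, 0, 0],
})


def consecutive_zero_count(frame, keys, col):
    """Hypothetical helper: run length of zeros in `col` within each `keys` group."""
    # New run id whenever the value differs from the previous row.
    run_id = frame[col].ne(frame[col].shift()).cumsum().rename("run_id")
    # Grouping by the keys plus the run id splits runs at group boundaries too.
    run_len = frame.groupby(keys + [run_id])[col].transform("size")
    # Report the run length only on zero rows; non-zero rows get 0.
    return np.where(frame[col].eq(0), run_len, 0)


df["Consec_zero_count"] = consecutive_zero_count(df, ["A", "B"], "measure")
print(df)
```

Note that the run id is computed over the whole column rather than per group. That is fine here because the group keys are part of the groupby, but if rows from different groups were interleaved, sorting by the keys first would likely be needed.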