How to select rows based on a condition for each id

Updated: 2024-10-10 09:18:41

Problem description

I have the following data frame:

    Hotel_id  Month_Year      Chef_Id  Chef_is_masterchef  Transition
    2400188   February-2018   4597566  1                   0
    2400188   March-2018      4597566  1                   0
    2400188   April-2018      4597566  1                   0
    2400188   May-2018        4597566  1                   0
    2400188   June-2018       4597566  1                   0
    2400188   July-2018       4597566  1                   0
    2400188   August-2018     4597566  1                   0
    2400188   September-2018  4597566  0                   1
    2400188   October-2018    4597566  0                   0
    2400188   November-2018   4597566  0                   0
    2400188   December-2018   4597566  0                   0
    2400188   January-2019    4597566  0                   0
    2400188   February-2019   4597566  0                   0
    2400188   March-2019      4597566  0                   0
    2400188   April-2019      4597566  0                   0
    2400188   May-2019        4597566  0                   0
    2400614   May-2015        2297544  0                   0
    2400614   June-2015       2297544  0                   0
    2400614   July-2015       2297544  0                   0
    2400614   August-2015     2297544  0                   0
    2400614   September-2015  2297544  0                   0
    2400614   October-2015    2297544  0                   0
    2400614   November-2015   2297544  0                   0
    2400614   December-2015   2297544  0                   0
    2400614   January-2016    2297544  1                   1
    2400614   February-2016   2297544  1                   0
    2400614   March-2016      2297544  1                   0
    3400624   May-2016        2597531  0                   0
    3400624   June-2016       2597531  0                   0
    3400624   July-2016       2597531  0                   0
    3400624   August-2016     2597531  1                   1
    2400133   February-2016   4597531  0                   0
    2400133   March-2016      4597531  0                   0
    2400133   April-2016      4597531  0                   0
    2400133   May-2016        4597531  0                   0
    2400133   June-2016       4597531  0                   0
    2400133   July-2016       4597531  0                   0
    2400133   August-2016     4597531  1                   1
    2400133   September-2016  4597531  1                   0
    2400133   October-2016    4597531  1                   0
    2400133   November-2016   4597531  1                   0
    2400133   December-2016   4597531  1                   0
    2400133   January-2017    4597531  1                   0
    2400133   February-2017   4597531  1                   0
    2400133   March-2017      4597531  1                   0
    2400133   April-2017      4597531  1                   0
    2400133   May-2017        4597531  1                   0

When the value in the Chef_is_masterchef column changes from 0 to 1 or from 1 to 0, that row is marked with 1 in the Transition column.
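The question does not show how the Transition column was built. As a minimal sketch of one way such a flag could be derived, assuming rows are already in chronological order within each Chef_Id, the snippet below uses a small made-up frame; the data and the name `change` are illustrative only.

```python
import pandas as pd

# Small illustrative frame, not the data from the question.
df = pd.DataFrame({
    "Chef_Id":            [1, 1, 1, 1, 2, 2, 2],
    "Chef_is_masterchef": [1, 1, 0, 0, 0, 1, 1],
})

# Difference to the previous row within the same Chef_Id. The first row of
# each group has no predecessor (NaN) and is treated as "no change".
change = df.groupby("Chef_Id")["Chef_is_masterchef"].diff().fillna(0)

# Any non-zero difference (0 -> 1 or 1 -> 0) is flagged as a transition.
df["Transition"] = change.ne(0).astype(int)
print(df)
```

Grouping before taking the difference keeps the last row of one chef from being compared with the first row of the next.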

I want to create another column (named "Var") whose values are filled in as shown below for the original data frame.

Expected data frame:

    Hotel_id  Month_Year      Chef_Id  Chef_is_masterchef  Transition  Var
    2400188   February-2018   4597566  1                   0           -7
    2400188   March-2018      4597566  1                   0           -6
    2400188   April-2018      4597566  1                   0           -5
    2400188   May-2018        4597566  1                   0           -4
    2400188   June-2018       4597566  1                   0           -3
    2400188   July-2018       4597566  1                   0           -2
    2400188   August-2018     4597566  1                   0           -1
    2400188   September-2018  4597566  0                   1           0
    2400188   October-2018    4597566  0                   0           1
    2400188   November-2018   4597566  0                   0           2
    2400188   December-2018   4597566  0                   0           3
    2400188   January-2019    4597566  0                   0           4
    2400188   February-2019   4597566  0                   0           5
    2400188   March-2019      4597566  0                   0           6
    2400188   April-2019      4597566  0                   0           7
    2400188   May-2019        4597566  0                   0           8
    2400614   May-2015        2297544  0                   0           -8
    2400614   June-2015       2297544  0                   0           -7
    2400614   July-2015       2297544  0                   0           -6
    2400614   August-2015     2297544  0                   0           -5
    2400614   September-2015  2297544  0                   0           -4
    2400614   October-2015    2297544  0                   0           -3
    2400614   November-2015   2297544  0                   0           -2
    2400614   December-2015   2297544  0                   0           -1
    2400614   January-2016    2297544  1                   1           0
    2400614   February-2016   2297544  1                   0           1
    2400614   March-2016      2297544  1                   0           2
    3400624   May-2016        2597531  0                   0           -3
    3400624   June-2016       2597531  0                   0           -2
    3400624   July-2016       2597531  0                   0           -1
    3400624   August-2016     2597531  1                   1           0
    2400133   February-2016   4597531  0                   0           -6
    2400133   March-2016      4597531  0                   0           -5
    2400133   April-2016      4597531  0                   0           -4
    2400133   May-2016        4597531  0                   0           -3
    2400133   June-2016       4597531  0                   0           -2
    2400133   July-2016       4597531  0                   0           -1
    2400133   August-2016     4597531  1                   1           0
    2400133   September-2016  4597531  1                   0           1
    2400133   October-2016    4597531  1                   0           2
    2400133   November-2016   4597531  1                   0           3
    2400133   December-2016   4597531  1                   0           4
    2400133   January-2017    4597531  1                   0           5
    2400133   February-2017   4597531  1                   0           6
    2400133   March-2017      4597531  1                   0           7
    2400133   April-2017      4597531  1                   0           8
    2400133   May-2017        4597531  1                   0           9

As you can see, the Var column is zero at the point of transition, and the rows before and after it carry the corresponding negative and positive integer offsets.

But after using the following code, I ran into a problem in the Var column:

    s = df['Chef_is_masterchef'].eq(0).groupby(df['Chef_Id']).transform('sum')
    df['var'] = df.groupby('Chef_Id').cumcount().sub(s)

Output of the above code:

    Hotel_id  Month_Year      Chef_Id  Chef_is_masterchef  Transition  Var
    2400188   February-2018   4597566  1                   0           -9
    2400188   March-2018      4597566  1                   0           -8
    2400188   April-2018      4597566  1                   0           -7
    2400188   May-2018        4597566  1                   0           -6
    2400188   June-2018       4597566  1                   0           -5
    2400188   July-2018       4597566  1                   0           -4
    2400188   August-2018     4597566  1                   0           -3
    2400188   September-2018  4597566  0                   1           -2
    2400188   October-2018    4597566  0                   0           -1
    2400188   November-2018   4597566  0                   0           0
    2400188   December-2018   4597566  0                   0           1
    2400188   January-2019    4597566  0                   0           2
    2400188   February-2019   4597566  0                   0           3
    2400188   March-2019      4597566  0                   0           4
    2400188   April-2019      4597566  0                   0           5
    2400188   May-2019        4597566  0                   0           6
    2400614   May-2015        2297544  0                   0           -8
    2400614   June-2015       2297544  0                   0           -7
    2400614   July-2015       2297544  0                   0           -6
    2400614   August-2015     2297544  0                   0           -5
    2400614   September-2015  2297544  0                   0           -4
    2400614   October-2015    2297544  0                   0           -3
    2400614   November-2015   2297544  0                   0           -2
    2400614   December-2015   2297544  0                   0           -1
    2400614   January-2016    2297544  1                   1           0
    2400614   February-2016   2297544  1                   0           1
    2400614   March-2016      2297544  1                   0           2
    3400624   May-2016        2597531  0                   0           -3
    3400624   June-2016       2597531  0                   0           -2
    3400624   July-2016       2597531  0                   0           -1
    3400624   August-2016     2597531  1                   1           0
    2400133   February-2016   4597531  0                   0           -6
    2400133   March-2016      4597531  0                   0           -5
    2400133   April-2016      4597531  0                   0           -4
    2400133   May-2016        4597531  0                   0           -3
    2400133   June-2016       4597531  0                   0           -2
    2400133   July-2016       4597531  0                   0           -1
    2400133   August-2016     4597531  1                   1           0
    2400133   September-2016  4597531  1                   0           1
    2400133   October-2016    4597531  1                   0           2
    2400133   November-2016   4597531  1                   0           3
    2400133   December-2016   4597531  1                   0           4
    2400133   January-2017    4597531  1                   0           5
    2400133   February-2017   4597531  1                   0           6
    2400133   March-2017      4597531  1                   0           7
    2400133   April-2017      4597531  1                   0           8
    2400133   May-2017        4597531  1                   0           9

As you can see, for Chef_Id = 4597566 the value in the Var column at the point of transition is -2 instead of zero.

This creates a problem because, relative to the point of transition, I have to select the rows from up to 3 months before to 2 months after for each id. I also have to select the rows from up to 6 months before to 5 months after the transition for each id, using the code below:

    df1 = df[df['var'].between(-3, 2)]
    print(df1)
    df2 = df[df['var'].between(-6, 5)]
    print(df2)

So please let me know the solution.

Thanks in advance!

Recommended answer

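Use GroupBy.cumcount to build a counter within each group, then subtract the number of 0 values in that group, obtained by comparing with 0 (Series.eq) and summing with GroupBy.transform. Note that the number of 0 values equals the position of the transition row only when a group starts with 0s, that is, for a 0 to 1 transition.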

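    # Number of rows with Chef_is_masterchef == 0 in each Chef_Id group
    s = df['Chef_is_masterchef'].eq(0).groupby(df['Chef_Id']).transform('sum')
    # Per-group row counter (0, 1, 2, ...) shifted down by that count
    df['var'] = df.groupby('Chef_Id').cumcount().sub(s)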

    print(df)
        Hotel_id  Month_Year      Chef_Id  Chef_is_masterchef  Transition  var
    0   2400614   May-2015        2297544  0                   0           -8
    1   2400614   June-2015       2297544  0                   0           -7
    2   2400614   July-2015       2297544  0                   0           -6
    3   2400614   August-2015     2297544  0                   0           -5
    4   2400614   September-2015  2297544  0                   0           -4
    5   2400614   October-2015    2297544  0                   0           -3
    6   2400614   November-2015   2297544  0                   0           -2
    7   2400614   December-2015   2297544  0                   0           -1
    8   2400614   January-2016    2297544  1                   1           0
    9   2400614   February-2016   2297544  1                   0           1
    10  2400614   March-2016      2297544  1                   0           2
    11  3400624   May-2016        2597531  0                   0           -3
    12  3400624   June-2016       2597531  0                   0           -2
    13  3400624   July-2016       2597531  0                   0           -1
    14  3400624   August-2016     2597531  1                   1           0
    15  2400133   February-2016   4597531  0                   0           -6
    16  2400133   March-2016      4597531  0                   0           -5
    17  2400133   April-2016      4597531  0                   0           -4
    18  2400133   May-2016        4597531  0                   0           -3
    19  2400133   June-2016       4597531  0                   0           -2
    20  2400133   July-2016       4597531  0                   0           -1
    21  2400133   August-2016     4597531  1                   1           0
    22  2400133   September-2016  4597531  1                   0           1
    23  2400133   October-2016    4597531  1                   0           2
    24  2400133   November-2016   4597531  1                   0           3
    25  2400133   December-2016   4597531  1                   0           4
    26  2400133   January-2017    4597531  1                   0           5
    27  2400133   February-2017   4597531  1                   0           6
    28  2400133   March-2017      4597531  1                   0           7
    29  2400133   April-2017      4597531  1                   0           8
    30  2400133   May-2017        4597531  1                   0           9
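For a group that moves from 1 to 0, such as Chef_Id = 4597566 in the question, the count of 0 values does not match the position of the transition row. One possible alternative, sketched here on a small made-up frame, is to measure the offset from the row flagged in the Transition column instead. This is not part of the original answer. It assumes rows are in chronological order within each Chef_Id and that each Chef_Id has exactly one row with Transition equal to 1; the toy data and the names `pos` and `trans_pos` are illustrative.

```python
import pandas as pd

# Small illustrative frame, not the data from the question:
# chef 1 goes 1 -> 0, chef 2 goes 0 -> 1.
df = pd.DataFrame({
    "Chef_Id":            [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Chef_is_masterchef": [1, 1, 1, 0, 0, 0, 0, 1, 1],
    "Transition":         [0, 0, 0, 1, 0, 0, 0, 1, 0],
})

# Running row counter inside each Chef_Id group: 0, 1, 2, ...
pos = df.groupby("Chef_Id").cumcount()

# Keep the counter only on the flagged row (NaN elsewhere), then broadcast
# that single position to every row of the same group.
# Assumption: exactly one Transition == 1 row per Chef_Id.
trans_pos = (
    pos.where(df["Transition"].eq(1))
       .groupby(df["Chef_Id"])
       .transform("first")
)

# Offset from the transition row: negative before, 0 at, positive after.
# astype(int) would fail for a group without a flagged row (NaN offset).
df["var"] = pos.sub(trans_pos).astype(int)
print(df)
```

Because the offset is taken from the flagged row rather than from a count of 0 values, the direction of the change (0 to 1 or 1 to 0) does not matter.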

Finally, filter with Series.between:

    df1 = df[df['var'].between(-3, 2)]
    print(df1)
    df2 = df[df['var'].between(-6, 5)]
    print(df2)
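Series.between includes both endpoints by default, so between(-3, 2) keeps at most six rows per id: three before the transition, the transition row itself, and two after. A minimal illustration on a made-up single-id frame (the data and the name `window` are hypothetical):

```python
import pandas as pd

# Hypothetical offsets for a single id, already centred on the transition.
df = pd.DataFrame({"Chef_Id": [7] * 9, "var": list(range(-5, 4))})

# Both -3 and 2 are kept because between() is inclusive on both sides.
window = df[df["var"].between(-3, 2)]
print(window)
```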

Published: 2023-11-22 00:28:18
Article link: https://www.elefans.com/category/jswz/34/1615271.html