Conditionally setting pandas DataFrame column values

Updated: 2024-10-24 04:42:32

Question

This question is exactly the same as the following ones, with one more twist:

  • Pandas: Replacing column values in dataframe
  • Conditional Substitution of values in pandas dataframe columns

So, I want to set, or conditionally set, pandas DataFrame column values. The added complexity is that instead of addressing the columns with a string constant (df['data1']), I need to address them with a variable (df[var_for_data1]), because my column names are constructed.

Here is a much simplified example to explain what I want:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'data1': np.random.randn(100), 'data2': np.random.randn(100)})
    print(df.head())

    Col = 'data1'
    print(df[Col].head())

    df.data1 = df.data1 + .1
    print(df[Col].head())

    # so far so good, now how to do the above with the variable column name `Col`?
    # df.Col = df.Col + .1

The first question is in the code comments: attribute access works so far, but how do I do the same thing when the column name is held in the variable Col?

The next question is how to add a condition to that assignment, say to apply it only where df.data1 >= .25 and df.data1 <= .35, again expressed with the variable column name Col.

Recommended answer

You can use square brackets to access a column by its name as a string rather than as an attribute. I also strongly recommend that you drop the habit of accessing columns by attribute, because it can lead to confusing behaviour: if you have a column named sum, then df.sum returns the DataFrame method sum rather than the column 'sum'.
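The name clash described above can be seen in a small sketch. The frame and its column names here are made up for illustration and are not from the original question:

```python
import pandas as pd

# Hypothetical frame with a column whose name collides with a DataFrame method.
df = pd.DataFrame({"sum": [1, 2, 3], "data1": [0.1, 0.2, 0.3]})

by_attribute = df.sum     # resolves to the DataFrame.sum method, not the column
by_brackets = df["sum"]   # always resolves to the column

print(callable(by_attribute))              # the method is callable
print(isinstance(by_brackets, pd.Series))  # the column is a Series

# Attribute access also cannot use a name stored in a variable:
Col = "data1"
print(hasattr(df, "Col"))  # df.Col looks for a column literally named "Col"
print(df[Col].tolist())    # brackets look up the value of Col instead
```

Bracket access behaves the same way for every column name, which is why it is the safer default when names are generated.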

So:

    df[Col] = df[Col] + 1

will work.
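As a minimal sketch of the constructed-name case from the question, the loop below builds each column name at run time and updates it through brackets. The "data" prefix, the numbering scheme, and the seeded generator are assumptions made for the example:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame({"data1": rng.standard_normal(100),
                   "data2": rng.standard_normal(100)})
before = df.copy()

# Column names constructed at run time (prefix and numbering are assumed).
for i in (1, 2):
    Col = f"data{i}"
    df[Col] = df[Col] + 0.1  # for Col == "data1", same effect as df.data1 = df.data1 + .1

print(df.head())
```

The augmented form df[Col] += 0.1 should give the same result here; the explicit assignment is used only to mirror the question.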

Regarding your second question: comparing a column against a scalar value returns an array of boolean values. To combine such arrays, use the bitwise operators &, | and ~ in place of and, or and not respectively. To use more than one condition you need to wrap each condition in parentheses, because & has higher precedence than the comparison operators.
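A short sketch of the precedence point, using a made-up three-element Series. The exact exception raised by the unparenthesised form may vary between pandas versions, so the example catches both TypeError and ValueError rather than assuming one:

```python
import pandas as pd

s = pd.Series([0.20, 0.30, 0.40])  # illustrative values only

# Each comparison is parenthesised, so & combines two boolean Series.
mask = (s >= 0.25) & (s <= 0.35)
print(mask.tolist())

# ~ negates a mask element-wise; | is the element-wise "or".
outside = ~mask
either = (s < 0.25) | (s > 0.35)

# Without parentheses, & binds first: the expression is parsed as
# s >= (0.25 & s) <= 0.35, which fails instead of producing a mask.
try:
    s >= 0.25 & s <= 0.35
    failed = False
except (TypeError, ValueError):
    failed = True
print(failed)
```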

So:

    df[(df[Col] >= .25) & (df[Col] <= .35)]

should work; this masks df down to only the rows where both conditions are met.
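The expression above only selects rows. To actually set values on those rows, one common approach (not spelled out in the answer itself) is to pass the same mask to .loc together with the column name. The sketch below assumes uniform random data so that some rows fall in the 0.25 to 0.35 range:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame({"data1": rng.uniform(0.0, 1.0, 100),
                   "data2": rng.uniform(0.0, 1.0, 100)})
original = df.copy()

Col = "data1"
mask = (df[Col] >= 0.25) & (df[Col] <= 0.35)

# .loc[row_mask, column] assigns only to the selected rows of that column.
df.loc[mask, Col] = df.loc[mask, Col] + 0.1

# Series.between is inclusive on both ends by default, so it builds the same mask.
same_mask = original[Col].between(0.25, 0.35)

print(mask.sum(), "rows updated")
```

Assigning through df.loc[mask, Col] rather than df[mask][Col] avoids writing to a temporary copy, which is the usual reason the .loc form is preferred for conditional updates.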

Published: 2023-10-24 08:44:47
Source: https://www.elefans.com/category/jswz/34/1523476.html