Need to filter on several columns and change value of one in Python Pandas

Updated: 2024-10-28 21:21:19

I have a table with 150,000 rows and 15 columns. The important columns for this example are COUNTRY, COSTCENTER and EXTENSION. I am reading from a CSV file into a pandas DataFrame. All columns are of type object.

What I want to do is:

1. Search for a certain COUNTRY (e.g. "China").
2. Filter for the rows where COSTCENTER is either 1000 or 2000, or where EXTENSION starts with "862".
3. Once all filters have been applied, change the country name in COUNTRY to something new.

I had a solution, but it always raised a chained-assignment warning:

df.COUNTRY[df.COUNTRY.str.match("China") & (df.COSTCENTER.str.match("1000") | df.COSTCENTER.str.match("2000"))] = 'China_new_name'

I can't say I completely understand why this could cause problems, so I looked for an alternative. I tried lambda and apply, but I kept getting all sorts of errors.

My latest approach now was:

filter_China = df.ix[(df["COUNTRY"]=="China") & ((df["COSTCENTER"]=="1000") | (df["COSTCENTER"]=="2000"))]

This seems to filter what I am looking for (I have not included the search on EXTENSION yet, as I first wanted this part to work).

But when I try to change a value based on my search criteria, I run into trouble:

df.ix[(df["COUNTRY"]=="China") & ((df["COSTCENTER"]=="1000") | (df["COSTCENTER"]=="2000")), df["COUNTRY"]] = "China_new_name"

I am getting this error: raise KeyError('%s not in index' % objarr[mask])

What am I missing here? Is this the right approach, or do I need to take a totally different route?

Best answer

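You need to read the section of the documentation on chained indexing and the SettingWithCopy warning. Instead of chaining two indexing operations, pass the row mask and the column label to a single .loc call: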

df.loc[df.COUNTRY.str.match("China") & (df.COSTCENTER.str.match("1000") | df.COSTCENTER.str.match("2000")), "COUNTRY"] = 'China_new_name'
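
For completeness, here is a minimal sketch that also includes the EXTENSION condition the question had not reached yet. The sample data is hypothetical, and it assumes the intended logic is "COUNTRY is China AND (COSTCENTER is 1000 or 2000 OR EXTENSION starts with 862)". It uses exact comparisons (== and isin) rather than str.match, which treats its argument as a regular expression anchored only at the start of the string, so "1000" would also match "10005". Note that the KeyError in the question most likely comes from passing df["COUNTRY"] (the column's values) as the column indexer where the label "COUNTRY" was meant, and that .ix has since been removed from pandas, so .loc is the one to use.

```python
import pandas as pd

# Hypothetical sample data; every column holds strings (object dtype), as in the question.
df = pd.DataFrame({
    "COUNTRY":    ["China", "China", "China", "China", "Germany"],
    "COSTCENTER": ["1000", "3000", "2000", "4000", "1000"],
    "EXTENSION":  ["111", "86255", "222", "333", "86299"],
})

# Build each condition separately so the combined mask stays readable.
is_china = df["COUNTRY"] == "China"
cost_hit = df["COSTCENTER"].isin(["1000", "2000"])
# na=False treats missing EXTENSION values as "no match".
ext_hit = df["EXTENSION"].str.startswith("862", na=False)

# Assumed logic: China AND (cost center match OR extension match).
mask = is_china & (cost_hit | ext_hit)

# One .loc call with (row mask, column label) assigns on df itself,
# so there is no chained indexing.
df.loc[mask, "COUNTRY"] = "China_new_name"

print(df)
```

With this sample, the first three rows are renamed; the fourth row (China, cost center 4000, extension 333) and the Germany row are left unchanged.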

Published: 2023-08-03 15:55:00
Article link: https://www.elefans.com/category/jswz/34/1393284.html