Quickly applying string operations in a pandas DataFrame

Updated: 2024-10-28 08:27:27

Suppose I have a DataFrame with 100k rows and a column 'name'. I would like to split this name into first and last name as efficiently as possible. My current method is:

    def splitName(name):
        return pandas.Series(name.split()[0:2])

    df[['first', 'last']] = df.apply(lambda x: splitName(x['name']), axis=1)

Unfortunately, DataFrame.apply is really, really slow. Is there anything I can do to make this string operation nearly as fast as a numpy operation?

Thanks!

Best answer

Try (requires pandas >= 0.8.1):

    # Use the vectorized .str accessor on the column instead of a row-wise apply
    splits = df['name'].str.split()
    df['first'] = splits.str[0]
    df['last'] = splits.str[1]
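
For a self-contained illustration of the vectorized approach, here is a minimal sketch. The sample names and the DataFrame are made up for demonstration and are not part of the original question. It splits each name on whitespace through the .str accessor and takes the first two tokens, which mirrors the [0:2] slice in the question's code.

```python
import pandas as pd

# Hypothetical sample data, invented for this example.
df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing", "Grace Brewster Hopper"]})

# Split every name on whitespace in one vectorized call.
# The result is a Series whose elements are lists of tokens.
splits = df["name"].str.split()

# .str[i] picks the i-th token from each list.
df["first"] = splits.str[0]
df["last"] = splits.str[1]

print(df)
```

Note that, as in the question's code, a name with more than two tokens keeps only the first two, so the "last" column holds the second token rather than the final one. A name with a single token should get a missing value in the "last" column.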

Published: 2023-07-26 18:56:00
Article link: https://www.elefans.com/category/jswz/34/1279565.html