Quickly applying string operations in a pandas DataFrame

Last updated: 2024-10-28 08:27:27

Suppose I have a DataFrame with 100k rows and a column called name. I would like to split this name into first and last name as efficiently as possible. My current method is:

import pandas

def splitName(name):
    # Keep only the first two whitespace-separated tokens
    return pandas.Series(name.split()[0:2])

df[['first', 'last']] = df.apply(lambda x: splitName(x['name']), axis=1)

Unfortunately, DataFrame.apply is really, really slow. Is there anything I can do to make this string operation nearly as fast as a numpy operation?

Thanks!

Best answer

Try (requires pandas >= 0.8.1):

# Vectorized split via the .str accessor; each element becomes a list of tokens
splits = df['name'].str.split()
df['first'] = splits.str[0]
df['last'] = splits.str[1]
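For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample names are made up for illustration, and it assumes a reasonably recent pandas in which the `.str` accessor supports both `split()` and positional indexing. Like the code in the question, it takes the second token as the last name, so a middle name would end up in `last`.

```python
import pandas as pd

# Hypothetical sample data standing in for the 100k-row DataFrame.
df = pd.DataFrame(
    {"name": ["Ada Lovelace", "Alan Mathison Turing", "Grace Hopper"]}
)

# Split every name on whitespace in one vectorized call.
# The result is a Series whose elements are lists of tokens.
splits = df["name"].str.split()

# .str[i] picks the i-th token from each list (NaN if it is missing).
df["first"] = splits.str[0]
df["last"] = splits.str[1]

print(df)
```

This avoids building a new `pandas.Series` for every row, which is where most of the time in the `DataFrame.apply` version is likely spent.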

Published: 2023-07-26 18:56:00

Article link: https://www.elefans.com/category/jswz/34/1279565.html
