I'm new to pandas and trying to figure out how to convert multiple columns that are formatted as strings to float64. Currently I'm doing the below, but it seems like apply() or applymap() should be able to accomplish this more efficiently. Unfortunately, I'm too much of a rookie to figure out how. The values are currently percentages formatted as strings, like '15.5%'.
    for column in ['field1', 'field2', 'field3']:
        data[column] = data[column].str.rstrip('%').astype('float64') / 100

Recommended answer
Starting in pandas 0.11.1 (coming out this week), replace has a new option to replace using a regex, so this becomes possible:
    In [14]: df = DataFrame('10.0%',index=range(100),columns=range(10))

    In [15]: df.replace('%','',regex=True).astype('float')/100
    Out[15]:
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 100 entries, 0 to 99
    Data columns (total 10 columns):
    0    100  non-null values
    1    100  non-null values
    2    100  non-null values
    3    100  non-null values
    4    100  non-null values
    5    100  non-null values
    6    100  non-null values
    7    100  non-null values
    8    100  non-null values
    9    100  non-null values
    dtypes: float64(10)

It is also a bit faster:
    In [16]: %timeit df.replace('%','',regex=True).astype('float')/100
    1000 loops, best of 3: 1.16 ms per loop

    In [18]: %timeit df.applymap(lambda x: float(x[:-1]))/100
    1000 loops, best of 3: 1.67 ms per loop
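Since the question asks specifically about apply(), here is a minimal sketch of that route, limited to the columns that actually hold percentages. The helper name percent_columns_to_float, the sample frame, and the extra label column are assumptions made up for illustration; only the field1/field2/field3 names and the '15.5%' format come from the question.

```python
import pandas as pd


def percent_columns_to_float(frame, columns):
    """Return a copy of `frame` whose '15.5%'-style columns are float fractions.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    result = frame.copy()
    # DataFrame.apply passes each selected column to the lambda as a Series,
    # so the vectorized .str accessor from the question can be reused as-is.
    result[columns] = result[columns].apply(
        lambda col: col.str.rstrip("%").astype("float64") / 100
    )
    return result


# Made-up sample data in the format described in the question.
data = pd.DataFrame({
    "field1": ["15.5%", "20%"],
    "field2": ["0.5%", "100%"],
    "field3": ["3.25%", "7%"],
    "label": ["a", "b"],  # non-percentage column, left untouched
})

converted = percent_columns_to_float(data, ["field1", "field2", "field3"])
print(converted.dtypes)
```

Passing an explicit column list keeps unrelated string columns out of the conversion, which a whole-frame replace or applymap call would not do on its own.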