Automating fill with a calculation function in Python

Updated: 2024-10-28 14:30:07

What I have so far is the code below. It works and produces the expected result: wherever c is missing, it fills df['c'] with the previous value of c multiplied by b. The problem is that I have to apply this to a much larger data set (len(df.index) is about 10,000), and each call to df['c'] = df.apply(func, axis=1) fills only one more row of each gap, so I would have to write that line a couple of thousand times. A while loop is not an option in pandas for a dataset of this size. Any ideas?

    import pandas as pd
    import numpy as np

    rng = pd.date_range('1/1/2011', periods=10, freq='D')
    df = pd.DataFrame({'a': [None] * 10,
                       'b': [2, 3, 10, 3, 5, 8, 4, 1, 2, 6]}, index=rng)
    df['c'] = np.nan
    # Seed the two known values of c (label-based assignment avoids chained indexing)
    df.loc[rng[0], 'c'] = 1
    df.loc[rng[2], 'c'] = 3

    def func(x):
        # Keep c where it is given; otherwise use the previous row's c times this row's b
        if pd.notnull(x['c']):
            return x['c']
        else:
            return df.iloc[df.index.get_loc(x.name) - 1]['c'] * x['b']

    # Each pass fills only one more row per gap, so the call has to be repeated
    df['c'] = df.apply(func, axis=1)
    df['c'] = df.apply(func, axis=1)
    df['c'] = df.apply(func, axis=1)
    df['c'] = df.apply(func, axis=1)
    df['c'] = df.apply(func, axis=1)
    df['c'] = df.apply(func, axis=1)
    df['c'] = df.apply(func, axis=1)

Best answer

Here is a nice way of solving a recurrence problem like this one. There will be docs on this in pandas v0.16.2 (releasing next week). See the numba documentation.

This will be quite performant, as the real heavy lifting is done in fast JIT-compiled code.
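Spelled out, the recurrence in question can be written as below. The symbols are notation introduced here for illustration and are not taken from the original post: $c_i$ is the given value in row $i$ (possibly missing), $b_i$ is the multiplier in that row, and $d_i$ is the filled result.

```latex
d_i =
\begin{cases}
c_i & \text{if } c_i \text{ is given,} \\
d_{i-1}\, b_i & \text{otherwise.}
\end{cases}
```

Because each $d_i$ depends on the already computed $d_{i-1}$, a single row-wise apply can only fill one more row of a gap per pass, whereas a loop that walks the arrays in order fills everything in one sweep.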

    import pandas as pd
    import numpy as np
    from numba import jit

    rng = pd.date_range('1/1/2011', periods=10, freq='D')
    df = pd.DataFrame({'a': np.nan,
                       'b': [2, 3, 10, 3, 5, 8, 4, 1, 2, 6]}, index=rng)
    # .ix was removed in pandas 1.0; create c explicitly and assign by label
    df['c'] = np.nan
    df.loc[rng[0], 'c'] = 1
    df.loc[rng[2], 'c'] = 3

    @jit(nopython=True)
    def ffill(arr_b, arr_c):
        n = len(arr_b)
        assert len(arr_b) == len(arr_c)
        result = arr_c.copy()
        for i in range(1, n):
            if not np.isnan(arr_c[i]):
                # c is given for this row: keep it
                result[i] = arr_c[i]
            else:
                # c is missing: previous filled value times this row's b
                result[i] = result[i-1] * arr_b[i]
        return result

    df['d'] = ffill(df.b.values, df.c.values)

Output:

                 a   b    c      d
    2011-01-01 NaN   2    1      1
    2011-01-02 NaN   3  NaN      3
    2011-01-03 NaN  10    3      3
    2011-01-04 NaN   3  NaN      9
    2011-01-05 NaN   5  NaN     45
    2011-01-06 NaN   8  NaN    360
    2011-01-07 NaN   4  NaN   1440
    2011-01-08 NaN   1  NaN   1440
    2011-01-09 NaN   2  NaN   2880
    2011-01-10 NaN   6  NaN  17280
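For comparison, the same fill can probably also be expressed with pandas alone, without numba. The sketch below is not part of the original answer; it is a swapped-in technique based on a grouped cumulative product, and the names known, segment and factor are made up for illustration. It assumes b contains no missing values. Each known c starts a new segment, and within a segment the filled value is the known c times the running product of the b values that follow it.

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2011-01-01", periods=10, freq="D")
df = pd.DataFrame({"b": [2, 3, 10, 3, 5, 8, 4, 1, 2, 6]}, index=rng)
df["c"] = np.nan
df.loc[rng[0], "c"] = 1.0
df.loc[rng[2], "c"] = 3.0

# True where c is given; these rows restart the recurrence
known = df["c"].notna()

# Segment id that increases by one at every known c (hypothetical helper name)
segment = known.cumsum()

# Multiplier per row: b where c is missing, 1 where c is given
factor = df["b"].astype(float).where(~known, 1.0)

# Last known c carried forward, times the running product of b within the segment
df["d"] = df["c"].ffill() * factor.groupby(segment).cumprod()

print(df)
```

On the ten-row example this yields the same d column as the numba version. Rows before the first known c stay NaN, which matches the loop's behaviour. For very long gaps the running product is accumulated in floating point, so results may differ slightly from the step-by-step loop in the last digits.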

Published: 2023-08-03 04:18:00
Article link: https://www.elefans.com/category/jswz/34/1382892.html