Rolling mean on a specific column in pandas

Updated: 2024-10-28 18:36:41
Problem description

I have a data frame like this, imported from a CSV:

              stock  pop
Date
2016-01-04  325.316   82
2016-01-11  320.036   83
2016-01-18  299.169   79
2016-01-25  296.579   84
2016-02-01  295.334   82
2016-02-08  309.777   81
2016-02-15  317.397   75
2016-02-22  328.005   80
2016-02-29  315.504   81
2016-03-07  328.802   81
2016-03-14  339.559   86
2016-03-21  352.160   82
2016-03-28  348.773   84
2016-04-04  346.482   83
2016-04-11  346.980   80
2016-04-18  357.140   75
2016-04-25  357.439   77
2016-05-02  356.443   78
2016-05-09  365.158   78
2016-05-16  352.160   72
2016-05-23  344.540   74
2016-05-30  354.998   81
2016-06-06  347.428   77
2016-06-13  341.053   78
2016-06-20  363.515   80
2016-06-27  349.669   80
2016-07-04  371.583   82
2016-07-11  358.335   81
2016-07-18  362.021   79
2016-07-25  368.844   77
...             ...  ...

I want to add a new column MA that holds the rolling mean of the column pop. I tried the following:

df['MA']=data.rolling(5,on='pop').mean()

I got an error:

ValueError: Wrong number of items passed 2, placement implies 1

So I tried running it without assigning the result to a new column, to see whether that works. I used:

data.rolling(5,on='pop').mean()

and got this output:

               stock  pop
Date
2016-01-04       NaN   82
2016-01-11       NaN   83
2016-01-18       NaN   79
2016-01-25       NaN   84
2016-02-01  307.2868   82
2016-02-08  304.1790   81
2016-02-15  303.6512   75
2016-02-22  309.4184   80
2016-02-29  313.2034   81
2016-03-07  319.8970   81
2016-03-14  325.8534   86
2016-03-21  332.8060   82
2016-03-28  336.9596   84
2016-04-04  343.1552   83
2016-04-11  346.7908   80
2016-04-18  350.3070   75
2016-04-25  351.3628   77
2016-05-02  352.8968   78
2016-05-09  356.6320   78
2016-05-16  357.6680   72
2016-05-23  355.1480   74
2016-05-30  354.6598   81
2016-06-06  352.8568   77
2016-06-13  348.0358   78
2016-06-20  350.3068   80
2016-06-27  351.3326   80
2016-07-04  354.6496   82
2016-07-11  356.8310   81
2016-07-18  361.0246   79
2016-07-25  362.0904   77
...              ...  ...

I can't seem to apply the rolling mean to the column pop. What am I doing wrong?

Recommended answer

To assign a column, create the rolling object from the Series you want to average rather than from the whole DataFrame:

df['new_col'] = data['column'].rolling(5).mean()
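As a minimal sketch of why the original attempt fails and how the Series-based call fixes it, the example below rebuilds the first seven rows of the question's table by hand (the construction of `df` is an assumption; the question loads it from a CSV). As far as the pandas documentation describes it, `on` names the column used in place of the index to define the window, not the column to average, which matches the output shown in the question: `stock` is averaged and `pop` is passed through unchanged.

```python
import pandas as pd

# Hypothetical sample: the first seven rows of the table in the question.
df = pd.DataFrame(
    {
        "stock": [325.316, 320.036, 299.169, 296.579, 295.334, 309.777, 317.397],
        "pop": [82, 83, 79, 84, 82, 81, 75],
    },
    index=pd.to_datetime(
        ["2016-01-04", "2016-01-11", "2016-01-18", "2016-01-25",
         "2016-02-01", "2016-02-08", "2016-02-15"]
    ),
)
df.index.name = "Date"

# With on="pop" the result is still a DataFrame: `stock` is averaged and
# `pop` is carried through unchanged.
via_on = df.rolling(5, on="pop").mean()

# Assigning that two-column frame to a single column label is what fails.
try:
    df["MA"] = via_on
    assignment_failed = False
except ValueError:
    assignment_failed = True

# Selecting the Series first gives a one-dimensional result that fits a column.
df["MA"] = df["pop"].rolling(5).mean()
print(df)
```

The first four entries of `MA` are NaN because a window of 5 is not yet full; the fifth is the mean of 82, 83, 79, 84 and 82, which is 82.0. The exact wording of the ValueError differs between pandas versions.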

The answer posted by ac2001 is not the most performant way of doing this. It computes a rolling mean on every column in the dataframe and then assigns the "ma" column from the "pop" column of that result. The first of the two methods below is much more efficient:

%timeit df['ma'] = data['pop'].rolling(5).mean()
%timeit df['ma_2'] = data.rolling(5).mean()['pop']

1000 loops, best of 3: 497 µs per loop
100 loops, best of 3: 2.6 ms per loop

I would not recommend the second method unless you also need to keep the rolling means computed for all the other columns.
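The comparison can be reproduced outside IPython with the standard-library `timeit` module. The sketch below is only illustrative: the synthetic `data` frame, its size, and the helper names `series_first` and `frame_first` are assumptions, not part of the original answer. It also checks that both routes give the same values, so the choice between them is purely about cost.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 1,000 weekly rows using the question's column names.
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {
        "stock": rng.normal(340.0, 20.0, size=1000),
        "pop": rng.integers(70, 90, size=1000),
    },
    index=pd.date_range("2016-01-04", periods=1000, freq="7D"),
)


def series_first():
    # Roll over only the column of interest.
    return data["pop"].rolling(5).mean()


def frame_first():
    # Roll over every column, then keep only one of the results.
    return data.rolling(5).mean()["pop"]


# Both routes produce the same values (NaN in the first four positions).
same = np.allclose(series_first(), frame_first(), equal_nan=True)

# Timings depend on the machine, the pandas version, and the number of columns.
t_series = timeit.timeit(series_first, number=200)
t_frame = timeit.timeit(frame_first, number=200)
print(f"same result: {same}")
print(f"series first: {t_series:.4f} s, frame first: {t_frame:.4f} s")
```

The gap between the two should widen as the frame gains columns, since the second route averages all of them before discarding all but one.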

Published: 2023-10-30 13:43:01
Article link: https://www.elefans.com/category/jswz/34/1543026.html