Rolling mean/standard deviation (with conditions)

Updated: 2024-10-28 08:21:44

Problem description

I have a question about computing a rolling mean/standard deviation based on conditions. To be honest, it is more of a syntax question, but since I think it is slowing down my code considerably, I thought I should ask here to find out what is going on. I have some finance data with columns such as Stock Name, Midquotes, etc., and I would like to compute the rolling mean and rolling standard deviation per stock.

Right now I want to compute the volatility of each stock, which is done by taking the rolling standard deviation of the previous 20 midquotes. To this end, after searching Stack Overflow, I found the following line, which uses the data.table package:

DT[, volatility:=( roll_sd(DT$Midquotes, 20, fill=0, align = "right") ), by = Stock]

Here DT is the data.table that contains all my data.

This is computationally quite slow, especially when I compare it with a typical rolling standard deviation calculation without any conditions, as given here:

DT$volatility <- roll_sd(DT$Midquotes, 20, fill=0, align = "right")

But when I try to do something similar with the rolling standard deviation with a condition, R will not let me:

DT$volatility <- DT[, ( roll_sd(DT$Midquotes, 20, fill=0, align = "right") ), by = Stock]

This line produces the error:

Error: cannot allocate vector of size 10.9 Gb

So I was wondering: why is the line DT[, volatility:=( roll_sd(DT$Midquotes, 20, fill=0, align = "right") ), by = Stock] so slow? Is it perhaps making a copy of the entire data.table each time the rolling standard deviation is computed for a different stock?

Recommended answer

I think your problem is how you use the := function, and in particular that you refer to DT inside the square brackets. I assume your setup is something like:

> library(data.table)
> set.seed(83385668)
> DT <- data.table(
+   x = rnorm(5 * 3),
+   stock = c(sapply(letters[1:3], rep, times = 5)),
+   time = c(replicate(3, 1:5)))
> DT
              x stock time
 1:  0.25073356     a    1
 2: -0.24408170     a    2
 3: -0.87475856     a    3
 4:  0.50843761     a    4
 5: -1.91331773     a    5
 6:  0.07850094     b    1
 7: -0.15922989     b    2
 8:  1.09806870     b    3
 9:  0.27995610     b    4
10:  0.45090842     b    5
11:  0.03400554     c    1
12: -0.34918734     c    2
13:  2.16602740     c    3
14: -0.04758261     c    4
15:  1.24869663     c    5
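The remark about using DT inside the square brackets can be checked on a small table. The sketch below is an illustration only: the table `toy` and its columns are hypothetical and not taken from the question. It shows that, with `by`, a bare column name refers to the current group's rows, while `toy$x` always refers to the whole column.

```r
library(data.table)

# Hypothetical toy data (not from the question): 2 stocks with 3 rows each.
toy <- data.table(
  stock = rep(c("a", "b"), each = 3),
  x = as.numeric(1:6)
)

# Bare `x` is the current group's subset; `toy$x` is always the full column.
lens <- toy[, .(n_bare = length(x), n_dollar = length(toy$x)), by = stock]
print(lens)

# Returning the full column once per group yields
# nrow(toy) * (number of groups) rows in the result.
blown_up <- toy[, .(value = toy$x), by = stock]
print(nrow(blown_up))
```

If the same thing happens with DT$Midquotes in the question, the rolling function runs over every row once per stock, and the unassigned version returns one full-length result per stock. That would be consistent with both the slowdown and the allocation error, although the question does not give enough detail to confirm it.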

I am not sure where the roll_sd function comes from. However, you can compute, for example, a rolling mean with the zoo library as follows:

> library(zoo)
> setkey(DT, stock, time) # make sure data is sorted by time
> DT[, rollmean := rollmean(x, k = 3, fill = 0, align = "right"),
+    by = .(stock)]
> DT
              x stock time   rollmean
 1:  0.25073356     a    1  0.0000000
 2: -0.24408170     a    2  0.0000000
 3: -0.87475856     a    3 -0.2893689
 4:  0.50843761     a    4 -0.2034676
 5: -1.91331773     a    5 -0.7598796
 6:  0.07850094     b    1  0.0000000
 7: -0.15922989     b    2  0.0000000
 8:  1.09806870     b    3  0.3391132
 9:  0.27995610     b    4  0.4062650
10:  0.45090842     b    5  0.6096444
11:  0.03400554     c    1  0.0000000
12: -0.34918734     c    2  0.0000000
13:  2.16602740     c    3  0.6169485
14: -0.04758261     c    4  0.5897525
15:  1.24869663     c    5  1.1223805

Or, equivalently:

> DT[, `:=`(rollmean = rollmean(x, k = 3, fill = 0, align = "right")),
+    by = .(stock)]
> DT
              x stock time   rollmean
 1:  0.25073356     a    1  0.0000000
 2: -0.24408170     a    2  0.0000000
 3: -0.87475856     a    3 -0.2893689
 4:  0.50843761     a    4 -0.2034676
 5: -1.91331773     a    5 -0.7598796
 6:  0.07850094     b    1  0.0000000
 7: -0.15922989     b    2  0.0000000
 8:  1.09806870     b    3  0.3391132
 9:  0.27995610     b    4  0.4062650
10:  0.45090842     b    5  0.6096444
11:  0.03400554     c    1  0.0000000
12: -0.34918734     c    2  0.0000000
13:  2.16602740     c    3  0.6169485
14: -0.04758261     c    4  0.5897525
15:  1.24869663     c    5  1.1223805
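The same pattern should carry over to the rolling standard deviation asked about in the question. The sketch below makes several assumptions: since the origin of roll_sd is unknown here, it substitutes zoo's rollapplyr with sd; the table `quotes` and its values are made up; and the window is 3 rather than 20 only because the toy groups have 5 rows each. The key point is that the column is referred to by its bare name inside the brackets, not as `quotes$Midquotes`.

```r
library(data.table)
library(zoo)

# Hypothetical data (not from the question): 3 stocks, 5 midquotes each.
set.seed(1)
quotes <- data.table(
  Stock = rep(c("a", "b", "c"), each = 5),
  time = rep(1:5, times = 3),
  Midquotes = rnorm(15)
)

# Sort by stock and time so each window covers consecutive quotes.
setkey(quotes, Stock, time)

# Bare `Midquotes` is evaluated per group; rollapplyr is right-aligned,
# and positions without a full window are filled with 0.
quotes[, volatility := rollapplyr(Midquotes, width = 3, FUN = sd, fill = 0),
       by = Stock]

print(quotes)
```

Applying an arbitrary function such as sd window by window is likely to be slower than a dedicated rolling-sd routine, so if roll_sd accepts a plain numeric vector, calling it on the bare column name in the same position should work equally well.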

Published: 2023-10-30 13:41:09
Link: https://www.elefans.com/category/jswz/34/1543021.html