n-dimensional sliding window with Pandas or Numpy

Updated: 2024-10-21 18:36:36
Problem description

How do I do the R (xts) equivalent of rollapply(...., by.column=FALSE) using Numpy or Pandas? When given a dataframe, pandas rolling_apply seems to work only column by column, with no option to pass a full (window-size) x (data-frame-width) matrix to the target function.

import pandas as pd
import numpy as np

xx = pd.DataFrame(np.zeros([10, 10]))
# pd.rolling_apply was removed in pandas 0.23; newer versions spell this xx.rolling(5).apply(...)
pd.rolling_apply(xx, 5, lambda x: np.shape(x)[0])

     0    1    2    3    4    5    6    7    8    9
0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
4    5    5    5    5    5    5    5    5    5    5
5    5    5    5    5    5    5    5    5    5    5
6    5    5    5    5    5    5    5    5    5    5
7    5    5    5    5    5    5    5    5    5    5
8    5    5    5    5    5    5    5    5    5    5
9    5    5    5    5    5    5    5    5    5    5

So what's happening is that rolling_apply goes down each column in turn and applies a sliding 5-length window to each one, whereas what I want is for each sliding window to be a 5x10 array. In that case I would get a single column vector (not a 2D array) as the result.
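The column-by-column behaviour described above can be checked on current pandas with a short sketch. It assumes the `DataFrame.rolling(...).apply(..., raw=True)` spelling that replaced `pd.rolling_apply`; the names `seen_shapes` and `window_rows` are made up for illustration.

```python
import numpy as np
import pandas as pd

xx = pd.DataFrame(np.zeros([10, 10]))

seen_shapes = set()  # shapes of every window handed to the function

def window_rows(x):
    # With raw=True, x is a NumPy array holding the window of ONE column.
    seen_shapes.add(x.shape)
    return float(x.shape[0])

result = xx.rolling(5).apply(window_rows, raw=True)

# Every recorded shape is one-dimensional: the function never sees a 5x10 block.
print(seen_shapes)
print(result.shape)  # still one output column per input column
```

The result keeps the full 10x10 layout, which is exactly what the question wants to avoid.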

Recommended answer

I indeed cannot find a way to compute a "wide" rolling application in the pandas docs, so I'd use numpy to get a "windowing" view on the array and apply a ufunc to it. Here's an example:

In [40]: arr = np.arange(50).reshape(10, 5); arr
Out[40]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49]])

In [41]: win_size = 5

In [42]: isize = arr.itemsize; isize
Out[42]: 8

arr.itemsize is 8 because the default dtype here is np.int64 (the default integer type on most 64-bit platforms). You need it for the following "window" view idiom:

In [43]: windowed = np.lib.stride_tricks.as_strided(
    ...:     arr,
    ...:     shape=(arr.shape[0] - win_size + 1, win_size, arr.shape[1]),
    ...:     strides=(arr.shape[1] * isize, arr.shape[1] * isize, isize)); windowed
Out[43]:
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24]],

       [[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]],

       [[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39]],

       [[20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44]],

       [[25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49]]])
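As a hedged variant of the idiom above, the byte offsets can be read from arr.strides instead of being computed from itemsize by hand. This sketch assumes a C-contiguous 2D array and a NumPy version that supports the writeable argument of as_strided (1.12 or later).

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

arr = np.arange(50).reshape(10, 5)
win_size = 5

# Bytes to step one row and one column, taken directly from the array.
row_stride, col_stride = arr.strides

windowed = as_strided(
    arr,
    shape=(arr.shape[0] - win_size + 1, win_size, arr.shape[1]),
    strides=(row_stride, row_stride, col_stride),
    writeable=False,  # windows share memory, so guard against accidental writes
)
print(windowed.shape)  # (6, 5, 5)
```

Reading the strides from the array keeps the view correct if the dtype changes, and the read-only flag matters because overlapping windows alias the same memory.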

Strides are the number of bytes between two neighbouring elements along a given axis. Thus strides=(arr.shape[1] * isize, arr.shape[1] * isize, isize) means: advance by 5 elements (one row) when going from windowed[0] to windowed[1], advance by 5 elements when going from windowed[0, 0] to windowed[0, 1], and advance by 1 element when going from windowed[0, 0, 0] to windowed[0, 0, 1]. Now you can call any ufunc or reduction on the resulting array, e.g.:

In [44]: windowed.sum(axis=(1,2))
Out[44]: array([300, 425, 550, 675, 800, 925])
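For a DataFrame, the same idea can be wrapped in a small helper that returns one value per window, which is the single column vector the question asks for. This is only a sketch, not a pandas API: the helper name rolling_apply_wide is hypothetical, and it assumes NumPy 1.20 or later, where sliding_window_view is available as a safer alternative to as_strided.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def rolling_apply_wide(df, win_size, func):
    """Apply func to each (win_size x n_columns) block of df (hypothetical helper)."""
    values = df.to_numpy()
    # Windowing along axis 0 yields shape (n_windows, n_columns, win_size);
    # swap the last two axes so each window is (win_size, n_columns).
    windows = sliding_window_view(values, win_size, axis=0).swapaxes(1, 2)
    results = [func(w) for w in windows]
    # Label each result with the index of the last row in its window.
    return pd.Series(results, index=df.index[win_size - 1:])

df = pd.DataFrame(np.arange(50).reshape(10, 5))
sums = rolling_apply_wide(df, 5, np.sum)
print(sums.tolist())  # [300, 425, 550, 675, 800, 925]
```

The Python-level loop is slower than a vectorised reduction such as windowed.sum(axis=(1,2)), but it accepts any function of a 2D block.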

Published: 2023-11-28 20:59:32
Link: https://www.elefans.com/category/jswz/34/1643930.html