Optimizing a rebin of a numpy array to arbitrary binsize

Updated: 2024-10-25 07:17:38

I'm building on this question. I'm re-binning a numpy array using the solution posted there, with a small addition for the leftover cells:

    from numpy import arange, append

    bin = 6  # bin size (note: this name shadows the built-in bin())
    x = arange(20)
    n = (x.shape[0] // bin) * bin  # length of the part that divides evenly
    binned = x[:n].reshape((x.shape[0] // bin, -1)).mean(1)
    binned = append(binned, x[n:].mean())

This handles bin sizes that do not divide x.shape[0] evenly: the append adds the average of the remaining cells. The problem is that I'm creating several temporary arrays here, which costs memory and is probably not efficient at run time either. Is there a better way? I'm even considering converting to lists, re-binning there, and returning array(result) at the end.

To be clear, for bin=6 the reshape-and-mean step yields:

    array([ 2.5,  8.5, 14.5])

and the append step adds:

    18.5

Before the mean is applied, the intermediate arrays are:

    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17]])

and, for the remainder:

    array([18, 19])

The final result is of course:

    array([ 2.5,  8.5, 14.5, 18.5])
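
The worked example above can be checked with a short script. This is only a sketch of the approach described in the question: the function name rebin_with_append and the parameter name binsize are hypothetical (binsize is used instead of bin to avoid shadowing the built-in), and the guard for an empty remainder is an addition that the question's code does not have.

```python
import numpy as np

def rebin_with_append(x, binsize):
    """Hypothetical helper: mean over full bins, then append the mean of the leftover cells."""
    n_full = x.shape[0] // binsize              # number of complete bins
    head = x[:n_full * binsize].reshape(n_full, binsize).mean(axis=1)
    tail = x[n_full * binsize:]                 # cells that do not fill a whole bin
    if tail.size == 0:                          # assumption: skip the append when nothing is left over
        return head
    return np.append(head, tail.mean())

result = rebin_with_append(np.arange(20), 6)
print(result)
```

For np.arange(20) and a bin size of 6 the values should match the final result quoted in the question (2.5, 8.5, 14.5, 18.5).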

Best answer

Approach #1: If you care about memory, it might be better to allocate the output array once and then assign values into it in two steps, just like in the original code but without the append:

    import numpy as np

    n = x.size // bin                          # number of full bins
    out = np.empty((x.size - 1 + bin) // bin)  # ceil(x.size / bin) output cells
    out[:n] = x[:bin*n].reshape(-1, bin).mean(1)
    out[n:] = x[-x.size + n*bin:].mean()       # mean of the leftover cells
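
As a rough illustration, Approach #1 could be wrapped in a function and checked against the example from the question. The name rebin_prealloc and the parameter binsize are hypothetical, and the leftover slice is written as x[n * binsize:] with an explicit guard, which is a small variation on the slicing used in the answer rather than the answer's exact code.

```python
import numpy as np

def rebin_prealloc(x, binsize):
    """Hypothetical wrapper around Approach #1: preallocate the output, then fill it in two steps."""
    n = x.size // binsize                               # number of full bins
    out = np.empty((x.size - 1 + binsize) // binsize)   # ceil(x.size / binsize) cells
    out[:n] = x[:n * binsize].reshape(n, binsize).mean(axis=1)
    if out.size > n:                                    # a partial bin exists
        out[n] = x[n * binsize:].mean()
    return out

x = np.arange(20)
print(rebin_prealloc(x, 6))
```

Compared with the version in the question, this avoids the extra copy that append makes, although the mean over the reshaped view still produces one temporary array of n values.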

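Approach #2: Here's another approach that focuses on memory efficiency, using np.add.reduceat:

    n = x.size // bin  # number of full bins, as in Approach #1
    out = np.add.reduceat(x, bin*np.arange((x.size - 1 + bin)//bin)).astype(float)
    out[:n] /= bin               # full bins: sum -> mean
    out[n:] /= x.size - n*bin    # leftover bin, if any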
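
To make the reduceat step easier to follow: np.add.reduceat(x, starts) sums x[starts[i]:starts[i+1]] for each consecutive pair of start indices, and the last segment runs to the end of the array. The sketch below applies that to the question's example; rebin_reduceat, binsize and starts are hypothetical names, and the explicit guard around the last division is an addition for readability.

```python
import numpy as np

def rebin_reduceat(x, binsize):
    """Hypothetical wrapper around Approach #2: segment sums via reduceat, then divide by bin lengths."""
    n = x.size // binsize                                          # number of full bins
    starts = binsize * np.arange((x.size - 1 + binsize) // binsize)  # first index of every bin
    out = np.add.reduceat(x, starts).astype(float)                 # per-bin sums
    out[:n] /= binsize                                             # full bins: sum -> mean
    if out.size > n:                                               # partial last bin
        out[n] /= x.size - n * binsize
    return out

x = np.arange(20)
sums = np.add.reduceat(x, 6 * np.arange(4))   # per-bin sums for start indices 0, 6, 12, 18
print(sums)
print(rebin_reduceat(x, 6))
```

For the example data the four segment sums are 15, 51, 87 and 37, which divide by 6, 6, 6 and 2 to give the expected means.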

Alternatively, the grouped sums that np.add.reduceat() produces can also be obtained with np.bincount:

    np.bincount(np.arange(x.size)//bin, x)
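
The bincount call returns sums, so one more step is needed to get means. A minimal sketch, assuming the same example data as the question: calling np.bincount a second time without weights gives the number of cells per bin, and dividing the two yields the means. The names labels, sums, counts and means are illustrative only.

```python
import numpy as np

x = np.arange(20)
binsize = 6                                  # called bin in the original post

labels = np.arange(x.size) // binsize        # bin index of every cell: 0,0,0,0,0,0,1,...
sums = np.bincount(labels, weights=x)        # per-bin sums (float array)
counts = np.bincount(labels)                 # per-bin cell counts; the last bin may be shorter
means = sums / counts

print(means)
```

Note that labels has as many elements as x, so this variant allocates an index array of the same length as the input.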

Published: 2023-07-29 15:48:00
Link: https://www.elefans.com/category/jswz/34/1317584.html