I'm building on this question. I'm re-binning a numpy array using the solution posted there, with a small addition to handle the leftover elements:
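
    from numpy import arange, append

    bin = 6  # bin width; note that this name shadows Python's builtin bin()
    x = arange(20)
    # Average the full bins; // keeps the slice bound an integer on Python 3
    y = x[:(x.shape[0]//bin)*bin].reshape((x.shape[0]//bin, -1)).mean(1)
    # Append the mean of the leftover cells, taken from the original x
    y = append(y, x[(x.shape[0]//bin)*bin:].mean())

This handles the case where bin does not divide x.shape[0] evenly: the append adds the average of the remaining cells. The thing is, I'm creating a lot of intermediate arrays here, and memory aside, that can't be efficient at run time. Is there a better way? I'm even considering converting to lists, re-binning, and finally returning array(result).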

To be clear, for bin=6 the first binning statement yields:

    array([ 2.5,  8.5, 14.5])

and the second appends:

    18.5

Before the mean is taken, the first intermediate array is:

    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17]])

and the second:

    array([18, 19])

The final result is of course:

    array([ 2.5,  8.5, 14.5, 18.5])

Best answer

Approach #1: If you care about memory, it might be better to initialize the output array and then assign values into it in two steps, just as in the original code but without appending, like so -
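
    import numpy as np

    n = x.size//bin                               # number of full bins
    out = np.empty((x.size-1 + bin)//bin)         # one slot per bin, rounding up
    out[:n] = x[:bin*n].reshape(-1,bin).mean(1)   # means of the full bins
    out[n:] = x[-x.size+n*bin:].mean()            # leftover cells; out[n:] is empty if bin divides x.size

Approach #2: Here's another approach that focuses on memory efficiency, using np.add.reduceat -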

    # n = x.size//bin, as in Approach #1
    out = np.add.reduceat(x, bin*np.arange((x.size-1+bin)//bin)).astype(float)  # per-bin sums
    out[:n] /= bin                 # full bins: sum -> mean
    out[n:] /= x.size - n*bin      # leftover bin: divide by its actual length

Alternatively, the grouped sums computed with np.add.reduceat() can also be obtained with np.bincount -

    np.bincount(np.arange(x.size)//bin,x)
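
Note that np.bincount used this way returns per-bin sums, not means. One possible way to finish the computation, sketched here, is to divide by the per-bin counts, which a second, unweighted np.bincount call provides. The function name rebin_mean_bincount and the parameter name width are made up for this illustration (width plays the role of bin above and avoids shadowing the builtin).

```python
import numpy as np

def rebin_mean_bincount(x, width):
    """Mean of consecutive groups of `width` elements; the last group may be shorter.

    Hypothetical helper for illustration, not part of NumPy.
    """
    x = np.asarray(x)
    labels = np.arange(x.size) // width      # bin index of every element
    sums = np.bincount(labels, weights=x)    # per-bin sums (float64)
    counts = np.bincount(labels)             # number of elements in each bin
    return sums / counts

print(rebin_mean_bincount(np.arange(20), 6))
```

For np.arange(20) and a width of 6 this gives 2.5, 8.5, 14.5, 18.5, matching the result in the question. The labels array is as long as x itself, so this variant is shorter to write but not the most memory-friendly of the three.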
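
On the memory concern from the question: a basic slice of a contiguous 1-D array followed by reshape gives a view rather than a copy, so in Approach #1 the arrays actually allocated should be just the output and the short result of mean. The sketch below wraps both approaches in functions and checks the view claim with np.shares_memory. The names rebin_mean, rebin_mean_reduceat and width are assumptions made for this example (width stands in for bin), and the explicit check for a leftover bin is a small variation on the answer's code, not the original.

```python
import numpy as np

def rebin_mean(x, width):
    """Approach #1 as a function: preallocate, then fill in two steps."""
    x = np.asarray(x)
    n = x.size // width                          # number of full bins
    out = np.empty((x.size - 1 + width) // width)
    out[:n] = x[:n * width].reshape(-1, width).mean(1)
    if n * width < x.size:                       # leftover partial bin, if any
        out[n] = x[n * width:].mean()
    return out

def rebin_mean_reduceat(x, width):
    """Approach #2 as a function: grouped sums, then divide in place."""
    x = np.asarray(x)
    n = x.size // width
    starts = width * np.arange((x.size - 1 + width) // width)
    out = np.add.reduceat(x, starts).astype(float)
    out[:n] /= width
    out[n:] /= x.size - n * width                # out[n:] is empty when there is no partial bin
    return out

x = np.arange(20)
full = x[:18].reshape(-1, 6)
print(np.shares_memory(full, x))   # checks that the reshaped slice is a view of x
print(rebin_mean(x, 6))
print(rebin_mean_reduceat(x, 6))
```

Both functions should agree with each other and with the expected result from the question, whether or not width divides the array length evenly.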