NumPy: splitting a cube into smaller cubes

Updated: 2024-10-10 21:22:03

Question

There is a function np.split() which can split an array along 1 axis. I was wondering if there was a multi axis version where you can split along axes (0,1,2) for example.

Recommended answer

Suppose the cube has shape (W, H, D) and you wish to break it up into N little cubes of shape (w, h, d). Since NumPy arrays have axes of fixed length, w must evenly divide W, and similarly for h and d.
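
The divisibility requirement can be checked before any reshaping is attempted. The following is only a sketch: the helper name count_cubes is hypothetical (it is not part of NumPy or of the original answer), and it simply returns N or raises if the small shape does not tile the large one.

```python
import numpy as np

def count_cubes(shape, newshape):
    """Return N, the number of blocks of `newshape` that tile `shape`.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    shape = np.asarray(shape)
    newshape = np.asarray(newshape)
    if shape.shape != newshape.shape:
        raise ValueError("shape and newshape must have the same number of axes")
    # Every block length must divide the corresponding axis length exactly.
    if np.any(shape % newshape != 0):
        raise ValueError(
            f"{newshape.tolist()} does not evenly divide {shape.tolist()}"
        )
    return int(np.prod(shape // newshape))

n_cubes = count_cubes((4, 4, 4), (2, 2, 2))  # 2 * 2 * 2 blocks
```

Here N is just the product of W // w, H // h and D // d.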

Then there is a way to reshape the cube of shape (W, H, D) into a new array of shape (N, w, h, d).

For example, if arr = np.arange(4*4*4).reshape(4,4,4) (so (W,H,D) = (4,4,4)) and we wish to break it up into cubes of shape (2,2,2), then we could use

In [283]: arr.reshape(2,2,2,2,2,2).transpose(0,2,4,1,3,5).reshape(-1,2,2,2)
Out[283]:
array([[[[ 0,  1],
         [ 4,  5]],

        [[16, 17],
         [20, 21]]],

       ...

       [[[42, 43],
         [46, 47]],

        [[58, 59],
         [62, 63]]]])
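
One way to convince yourself that this one-liner really produces the 2x2x2 sub-blocks is to compare it against plain slicing. This is a sanity-check sketch, not part of the original answer; it assumes the blocks come out in C (row-major) order over the block grid.

```python
import numpy as np

arr = np.arange(4 * 4 * 4).reshape(4, 4, 4)
cubes = arr.reshape(2, 2, 2, 2, 2, 2).transpose(0, 2, 4, 1, 3, 5).reshape(-1, 2, 2, 2)

# Build the same blocks with plain slicing, visiting block corners in C order
# (first axis varies slowest, last axis varies fastest).
expected = np.array([
    arr[i:i + 2, j:j + 2, k:k + 2]
    for i in range(0, 4, 2)
    for j in range(0, 4, 2)
    for k in range(0, 4, 2)
])

same = np.array_equal(cubes, expected)
```

The slicing version copies every block in a Python loop, while the reshape/transpose version lets NumPy do the rearrangement in one pass.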

The idea here is to add extra axes to the array which sort of act as place markers:

 number of repeats act as placemarkers
 o---o---o
 |   |   |
 v   v   v
(2,2,2,2,2,2)
   ^   ^   ^
   |   |   |
   o---o---o
   newshape
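
To make the role of the placemarker axes concrete, the sketch below checks the index relationship for the 4x4x4 example: axes 0, 2, 4 of the 6-D array select which block you are in, and axes 1, 3, 5 select the position inside that block. The variable names are illustrative only.

```python
import numpy as np

arr = np.arange(4 * 4 * 4).reshape(4, 4, 4)
six = arr.reshape(2, 2, 2, 2, 2, 2)

# Axes 0, 2, 4 pick the block; axes 1, 3, 5 pick the offset inside the block.
# So six[I, a, J, b, K, c] is the same element as arr[2*I + a, 2*J + b, 2*K + c].
mapping_holds = all(
    six[I, a, J, b, K, c] == arr[2 * I + a, 2 * J + b, 2 * K + c]
    for I, a, J, b, K, c in np.ndindex(six.shape)
)
```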

We can then reorder the axes (using transpose) so that the number of repeats comes first, and the newshape comes at the end:

arr.reshape(2,2,2,2,2,2).transpose(0,2,4,1,3,5)

And finally, call reshape(-1, w, h, d) to squash all the placemarking axes into a single axis. This produces an array of shape (N, w, h, d) where N is the number of little cubes.
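
Because the placemarking axes are flattened in C order, the position n along the first axis can be mapped back to block-grid coordinates. A small sketch of that, using np.unravel_index on the 4x4x4 example (the choice n = 5 is arbitrary):

```python
import numpy as np

arr = np.arange(4 * 4 * 4).reshape(4, 4, 4)
w, h, d = 2, 2, 2
cubes = arr.reshape(2, w, 2, h, 2, d).transpose(0, 2, 4, 1, 3, 5).reshape(-1, w, h, d)

# The first axis enumerates blocks in C (row-major) order over the 2x2x2 block
# grid, so np.unravel_index recovers which block a given n refers to.
n = 5
I, J, K = np.unravel_index(n, (2, 2, 2))  # block-grid coordinates of cube n
block = arr[I * w:(I + 1) * w, J * h:(J + 1) * h, K * d:(K + 1) * d]
matches = np.array_equal(cubes[n], block)
```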

The approach above applies the same reshape-and-transpose idea used for splitting 2-dimensional arrays into blocks, extended to 3 dimensions. It can be further generalized to ndarrays of any dimension:

import numpy as np

def cubify(arr, newshape):
    oldshape = np.array(arr.shape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.column_stack([repeats, newshape]).ravel()
    order = np.arange(len(tmpshape))
    order = np.concatenate([order[::2], order[1::2]])
    # newshape must divide oldshape evenly or else ValueError will be raised
    return arr.reshape(tmpshape).transpose(order).reshape(-1, *newshape)

print(cubify(np.arange(4*6*16).reshape(4,6,16), (2,3,4)).shape)
print(cubify(np.arange(8*8*8*8).reshape(8,8,8,8), (2,2,2,2)).shape)

This yields new arrays of shapes:

(16, 2, 3, 4)
(256, 2, 2, 2, 2)
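
The comment inside cubify says a ValueError is raised when newshape does not divide the array's shape evenly. The sketch below restates cubify so the snippet runs on its own and shows that failure case. The reason it fails: truncating W / w to an integer makes the temporary shape hold fewer elements than the array, so the first reshape cannot succeed.

```python
import numpy as np

def cubify(arr, newshape):
    # Same logic as the function above, restated so this snippet runs on its own.
    oldshape = np.array(arr.shape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.column_stack([repeats, newshape]).ravel()
    order = np.arange(len(tmpshape))
    order = np.concatenate([order[::2], order[1::2]])
    return arr.reshape(tmpshape).transpose(order).reshape(-1, *newshape)

arr = np.arange(4 * 4 * 4).reshape(4, 4, 4)
try:
    cubify(arr, (3, 3, 3))  # 3 does not divide 4: tmpshape holds 27 elements, arr has 64
    raised = False
except ValueError:
    raised = True
```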

To "uncubify" the arrays:

def uncubify(arr, oldshape):
    N, newshape = arr.shape[0], arr.shape[1:]
    oldshape = np.array(oldshape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.concatenate([repeats, newshape])
    order = np.arange(len(tmpshape)).reshape(2, -1).ravel(order='F')
    return arr.reshape(tmpshape).transpose(order).reshape(oldshape)

Here is some test code to check that cubify and uncubify are inverses.

import numpy as np

def cubify(arr, newshape):
    oldshape = np.array(arr.shape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.column_stack([repeats, newshape]).ravel()
    order = np.arange(len(tmpshape))
    order = np.concatenate([order[::2], order[1::2]])
    # newshape must divide oldshape evenly or else ValueError will be raised
    return arr.reshape(tmpshape).transpose(order).reshape(-1, *newshape)

def uncubify(arr, oldshape):
    N, newshape = arr.shape[0], arr.shape[1:]
    oldshape = np.array(oldshape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.concatenate([repeats, newshape])
    order = np.arange(len(tmpshape)).reshape(2, -1).ravel(order='F')
    return arr.reshape(tmpshape).transpose(order).reshape(oldshape)

tests = [[np.arange(4*6*16), (4,6,16), (2,3,4)],
         [np.arange(8*8*8*8), (8,8,8,8), (2,2,2,2)]]

for arr, oldshape, newshape in tests:
    arr = arr.reshape(oldshape)
    assert np.allclose(uncubify(cubify(arr, newshape), oldshape), arr)
    # cuber = Cubify(oldshape,newshape)
    # assert np.allclose(cuber.uncubify(cuber.cubify(arr)), arr)
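
For comparison with the np.split() from the question, the same blocks can also be obtained by applying np.split one axis at a time. This is only a cross-check sketch: split_blocks is a hypothetical helper name, and it builds a Python list of pieces, so it is likely slower than the reshape/transpose approach on large arrays.

```python
import numpy as np

def split_blocks(arr, newshape):
    """Split arr into blocks of newshape by applying np.split one axis at a time.

    Hypothetical helper for illustration; returns a list of blocks in C order.
    """
    pieces = [arr]
    for axis, size in enumerate(newshape):
        sections = arr.shape[axis] // size
        # np.split raises ValueError if the axis cannot be divided equally.
        pieces = [part for p in pieces for part in np.split(p, sections, axis=axis)]
    return pieces

arr = np.arange(4 * 6 * 16).reshape(4, 6, 16)
blocks = np.stack(split_blocks(arr, (2, 3, 4)))

# Reshape/transpose version for the same shapes: repeats are (2, 2, 4).
via_reshape = arr.reshape(2, 2, 2, 3, 4, 4).transpose(0, 2, 4, 1, 3, 5).reshape(-1, 2, 3, 4)
agree = np.array_equal(blocks, via_reshape)
```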

Published: 2023-10-31 09:11:53
Link: https://www.elefans.com/category/jswz/34/1545688.html