Passing a list of numpy arrays to C using Cython

Updated: 2024-10-28 05:17:10

Problem description

I have a list, list_of_arrays, of 3D numpy arrays that I want to pass to a C function with the prototype

double my_func_c(double **data, int **shape, int n_arrays)

such that

data[i] : pointer to the numpy array values in list_of_arrays[i]
shape[i] : pointer to the shape of the array in list_of_arrays[i], e.g. [2,3,4]

How can I call my_func_c using a Cython interface function?

My first idea was to do something like the code below (which works), but I feel there is a better way that uses only numpy arrays, without calling malloc and free.

# my_func_c.pyx
import numpy as np
cimport numpy as np
cimport cython
from libc.stdlib cimport malloc, free

cdef extern from "my_func.c":
    double my_func_c(double **data, int **shape, int n_arrays)

def my_func(list list_of_arrays):
    cdef int n_arrays = len(list_of_arrays)
    cdef double **data = <double **> malloc(n_arrays*sizeof(double *))
    cdef int **shape = <int **> malloc(n_arrays*sizeof(int *))
    cdef double x
    cdef np.ndarray[double, ndim=3, mode="c"] temp

    for i in range(n_arrays):
        temp = list_of_arrays[i]
        data[i] = &temp[0,0,0]
        shape[i] = <int *> malloc(3*sizeof(int))
        for j in range(3):
            shape[i][j] = list_of_arrays[i].shape[j]

    x = my_func_c(data, shape, n_arrays)

    # Free memory
    for i in range(n_arrays):
        free(shape[i])
    free(data)
    free(shape)

    return x

N.B.

To see a working example, we can use a very simple function that calculates the product of all elements of all the arrays in our list.

// my_func.c
double my_func_c(double **data, int **shape, int n_arrays) {
    int array_idx, i0, i1, i2;
    double prod = 1.0;

    // Loop over all arrays
    for (array_idx=0; array_idx<n_arrays; array_idx++) {
        for (i0=0; i0<shape[array_idx][0]; i0++) {
            for (i1=0; i1<shape[array_idx][1]; i1++) {
                for (i2=0; i2<shape[array_idx][2]; i2++) {
                    // Flat offset of element (i0, i1, i2) in a C-contiguous array
                    prod = prod*data[array_idx][i0*shape[array_idx][1]*shape[array_idx][2] + i1*shape[array_idx][2] + i2];
                }
            }
        }
    }

    return prod;
}
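The flat offset used in the C loop assumes each array is stored in C-contiguous (row-major) order, which is also what the mode="c" buffer declaration in the Cython wrapper asks for. The sketch below checks that offset formula from plain Python; the array a and the chosen indices are made up for illustration, and np.ascontiguousarray is shown as one possible way to prepare an input that is not already contiguous.

```python
import numpy as np

# Hypothetical 3D array, used only to illustrate the indexing.
a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
s0, s1, s2 = a.shape

# Same offset formula as in the C loop: i0*s1*s2 + i1*s2 + i2
i0, i1, i2 = 1, 2, 3
offset = i0 * s1 * s2 + i1 * s2 + i2

# For a C-contiguous array, ravel() exposes the flat data in memory order.
flat = a.ravel()
same_element = flat[offset] == a[i0, i1, i2]

# A transposed view is not C-contiguous, so the formula would not apply to
# its raw memory; np.ascontiguousarray makes a row-major copy.
t = a.transpose(2, 1, 0)
t_c = np.ascontiguousarray(t)

print(same_element, t.flags.c_contiguous, t_c.flags.c_contiguous)
```

Converting each list element this way before taking its data pointer costs a copy only for inputs that are not already contiguous.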

Create the setup.py file,

# setup.py
# setuptools is used here because distutils was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize
import numpy as np

setup(
    name='my_func',
    ext_modules=cythonize("my_func_c.pyx"),
    include_dirs=[np.get_include()]
)

and compile with

python3 setup.py build_ext --inplace

Finally we can run a simple test

# test.py
import numpy as np
from my_func_c import my_func

a = [1+np.random.rand(3,1,2), 1+np.random.rand(4,5,2), 1+np.random.rand(1,2,3)]

print('Numpy product: {}'.format(np.prod([i.prod() for i in a])))
print('my_func product: {}'.format(my_func(a)))

using

python3 test.py

Recommended answer

One alternative would be to let numpy manage the memory for you. You can do this by using numpy arrays of np.uintp, which is an unsigned integer with the same size as a pointer.

Unfortunately, this does require some type-casting (between "pointer-sized int" and pointers), which is a good way of hiding logic errors, so I'm not 100% happy with it.

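def my_func(list list_of_arrays):
    cdef int n_arrays = len(list_of_arrays)
    # np.empty allocates n_arrays pointer-sized slots
    # (np.array((n_arrays,)) would create a one-element array instead)
    cdef np.uintp_t[::1] data = np.empty((n_arrays,), dtype=np.uintp)
    cdef np.uintp_t[::1] shape = np.empty((n_arrays,), dtype=np.uintp)
    cdef double x
    cdef np.ndarray[double, ndim=3, mode="c"] temp

    for i in range(n_arrays):
        temp = list_of_arrays[i]
        data[i] = <np.uintp_t>&temp[0,0,0]
        shape[i] = <np.uintp_t>&(temp.shape[0])

    # The shape cast is only valid if my_func_c is declared with
    # np.intp_t **shape rather than int **shape
    x = my_func_c(<double**>(&data[0]), <np.intp_t**>&shape[0], n_arrays)
    return x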

(I should point out that I've only confirmed that it compiles and have not tested it further, but the basic idea should be OK.)
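The np.uintp idea can be checked from plain Python without compiling anything. The following is a minimal sketch, not part of the original answer: the example arrays are made up, and it relies on the ndarray.ctypes.data attribute (the address of an array's first element as a Python int) and on the ctypes module to read a value back through a stored address.

```python
import ctypes
import numpy as np

# Hypothetical example data: three C-contiguous float64 arrays.
arrays = [np.ones((3, 1, 2)), np.ones((4, 5, 2)), np.ones((1, 2, 3))]

# np.uintp is an unsigned integer with the same width as a C pointer.
pointer_size = ctypes.sizeof(ctypes.c_void_p)
uintp_size = np.dtype(np.uintp).itemsize

# Store the address of each array's first element in a numpy-owned buffer.
# This buffer plays the role of the double** table in the Cython code.
data_ptrs = np.array([a.ctypes.data for a in arrays], dtype=np.uintp)

# Read the first value of the first array back through its stored address.
first_ptr = ctypes.cast(int(data_ptrs[0]), ctypes.POINTER(ctypes.c_double))
first = first_ptr[0]

print(uintp_size == pointer_size, data_ptrs.shape, first)
```

The stored addresses stay valid only while the original arrays are alive, so the list must outlive any use of the pointer table.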

The way you've done it is probably a pretty sensible way. One slight simplification to your original code that should work is to use

shape[i] = <np.uintp_t>&(temp.shape[0])

instead of the malloc and copy. I'd also recommend putting the calls to free in a finally block to ensure they get run.

@ead has helpfully pointed out that the numpy shape is stored as np.intp_t - i.e. a signed integer big enough to hold a pointer, which is usually 64 bit - while int is usually 32 bit. Therefore, to pass the shape without copying you'd need to change your C API. The casts make that mistake harder to spot ("a good way of hiding logic errors").
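The size mismatch is easy to inspect from Python. The sketch below is an illustration under stated assumptions rather than part of the original answer: it compares the widths of np.intp and a C int (np.intc), then shows one possible way to keep the existing int **shape API by copying all shapes into a single numpy-owned block of C ints and recording the address of each row. The names shapes and shape_ptrs are hypothetical.

```python
import ctypes
import numpy as np

# Hypothetical example data.
arrays = [np.ones((3, 1, 2)), np.ones((4, 5, 2)), np.ones((1, 2, 3))]

# Width numpy uses for shapes versus the width of a C int.
intp_size = np.dtype(np.intp).itemsize
int_size = np.dtype(np.intc).itemsize

# Copy every shape into one (n_arrays, 3) block of C ints. numpy owns the
# memory, so no malloc/free is needed for the shape data itself.
shapes = np.array([a.shape for a in arrays], dtype=np.intc)

# Address of row i, stored as a pointer-sized integer; each entry could be
# cast to int* on the Cython side.
shape_ptrs = np.array(
    [shapes.ctypes.data + i * shapes.strides[0] for i in range(len(arrays))],
    dtype=np.uintp,
)

# Read the second shape back through its stored address.
row1 = ctypes.cast(int(shape_ptrs[1]), ctypes.POINTER(ctypes.c_int))
print(intp_size, int_size, [row1[k] for k in range(3)])
```

This still copies the shapes, as the original malloc version does, but the copy is a small array whose lifetime numpy manages.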


Published: 2023-07-29 19:30:38

Article link: https://www.elefans.com/category/jswz/34/1243436.html