I'm trying to write Cython code that dumps a dense feature matrix and target vector pair to libsvm format faster than sklearn's built-in code. I get a compilation error complaining about a type mismatch when passing the target vector (a NumPy array of ints) to the relevant C function.
The code is as follows:

    import numpy as np
    cimport numpy as np
    cimport cython

    cdef extern from "cdump.h":
        int filedump( double features[], int numexemplars, int numfeats, int target[], char* outfname)

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def fastdumpdense_libsvmformat(np.ndarray[np.double_t,ndim=2] X, y, outfname):
        if X.shape[0] != len(y):
            raise ValueError("X and y need to have the same number of points")

        cdef int numexemplars = X.shape[0]
        cdef int numfeats = X.shape[1]
        cdef bytes py_bytes = outfname.encode()
        cdef char* outfnamestr = py_bytes

        cdef np.ndarray[np.double_t, ndim=2, mode="c"] X_c
        cdef np.ndarray[np.int_t, ndim=1, mode="c"] y_c
        X_c = np.ascontiguousarray(X, dtype=np.double)
        y_c = np.ascontiguousarray(y, dtype=np.int)

        retval = filedump( &X_c[0,0], numexemplars, numfeats, &y_c[0], outfnamestr)

        return retval

When I attempt to compile this code using distutils, I get the error

    cythoning fastdump_svm.pyx to fastdump_svm.cpp

    Error compiling Cython file:
    ------------------------------------------------------------
    ...
        cdef np.ndarray[np.double_t, ndim=2, mode="c"] X_c
        cdef np.ndarray[np.int_t, ndim=1, mode="c"] y_c
        X_c = np.ascontiguousarray(X, dtype=np.double)
        y_c = np.ascontiguousarray(y, dtype=np.int)

        retval = filedump( &X_c[0,0], numexemplars, numfeats, &y_c[0], outfnamestr)
                                                              ^
    ------------------------------------------------------------

    fastdump_svm.pyx:24:58: Cannot assign type 'int_t *' to 'int *'

Any idea how to fix this error? I was originally following the pattern of passing y_c.data, which works, but this is apparently not the recommended way.
Recommended answer
You can also use dtype=np.dtype("i") when creating the NumPy array, so that its element type matches the C int on your machine.

    cdef int [:] y_c
    y_c = np.ascontiguousarray(y, dtype=np.dtype("i"))
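
The compile error comes from np.int_t being a different typedef than C int (it is commonly C long), so a pointer to it cannot be handed to a function expecting int *. As a rough check that dtype "i" really lines up with C int, the plain-Python sketch below uses ctypes instead of Cython. The values in y are made up for illustration and are not from the question.

```python
import ctypes
import numpy as np

# Hypothetical target vector standing in for the question's `y`.
y = [1, -1, 1, 1]

# "i" is NumPy's type code for the platform C int (the same dtype as np.intc).
y_c = np.ascontiguousarray(y, dtype=np.dtype("i"))

# The element size should agree with C int as reported by ctypes.
int_size = ctypes.sizeof(ctypes.c_int)
print(y_c.dtype == np.dtype(np.intc), y_c.itemsize == int_size)

# Viewing the buffer through an `int *` reads back the same values,
# which is what a C function declared with `int target[]` would see.
ptr = y_c.ctypes.data_as(ctypes.POINTER(ctypes.c_int))
as_seen_from_c = [ptr[i] for i in range(y_c.size)]
print(as_seen_from_c)
```

In the Cython function itself, the same array can be assigned to a variable typed as int[:] (or np.ndarray[int, ndim=1, mode="c"]), after which &y_c[0] has type int * and matches the filedump declaration.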