我需要在Cython方法中对scipy稀疏矩阵执行一组操作。
为了有效地应用这些,我需要访问lil_matrix表示。 python中的lil(链表稀疏矩阵)数据表示使用具有不同长度list of lists 。
如何有效地将不同长度的列表列表传递给cython(不复制)? 有没有其他方法可以访问cython中的lil-matrices?
I need to perform a set of operations on a scipy sparse matrix in a Cython method.
To efficiently apply these I need access to lil_matrix representation. The lil (linked-list sparse matrix) data representation in python uses list of lists with different lengths.
How can I efficiently pass a list of list of different length to cython (without copying)? Is there any other way to access lil-matrices in cython?
最满意答案
下面的示例遍历lil_matrix并计算每行的总和。
注意我没有声明,即使它非常快, 因为Cython已经针对内置类型(如列表)进行了优化 。 时间也如下所示......
import time import numpy as np cimport numpy as np from scipy.sparse import lil_matrix cdef iter_over_lil_matrix(m): cdef list sums, data_row sums = [] for data_row in m.data: s = 0 for value in data_row: s += value sums.append(s) return sums def main(): a = np.random.random((1e4*1e4)) a[a>0.1] = 0 a = a.reshape(1e4,1e4) m = lil_matrix(a) t0 = time.clock() sums = iter_over_lil_matrix(m) t1 = time.clock() print 'Cython lil_matrix Time', t1-t0 t0 = time.clock() array_sums = a.sum(axis=1) t1 = time.clock() print 'Numpy ndarray Time', t1-t0 t0 = time.clock() lil_sums = m.sum(axis=1) t1 = time.clock() print 'lil_matrix Time', t1-t0 mcsr = m.tocsr() t0 = time.clock() csr_sums = mcsr.sum(axis=1) t1 = time.clock() print 'csr_matrix Time', t1-t0 assert np.allclose(array_sums, sums) assert np.allclose(array_sums, np.asarray(lil_sums).flatten()) assert np.allclose(array_sums, np.asarray(csr_sums).flatten())以秒为单位的时间 - 仅比超优化的NumPy:D慢约2倍,比lil_matrix.sum()方法快得多,因为它之前转换为csr_matrix() ,正如@hpaulj所阐明并由下面的结果确认。 请注意,列上的csr_matrix.sum()几乎比密集总和快一个数量级。
Cython lil_matrix Time 0.183935034665 Numpy ndarray Time 0.106583238273 lil_matrix Time 2.47158218631 csr_matrix Time 0.0140050888745会减慢代码速度的事情:
for i in range(len(m.data)):使用for i in range(len(m.data)): with data_row = m.data[i] 使用data=m.data声明缓冲区,如np.ndarray[object, ndim=1] data不影响的事情:
boundscheck或wraparoundThe example below iterates over a lil_matrix and calculates the sum for each row.
Note I am doing no declarations and even though it is extremely fast because Cython is already optimized for built-in types such as lists. The timings are also shown below...
import time import numpy as np cimport numpy as np from scipy.sparse import lil_matrix cdef iter_over_lil_matrix(m): cdef list sums, data_row sums = [] for data_row in m.data: s = 0 for value in data_row: s += value sums.append(s) return sums def main(): a = np.random.random((1e4*1e4)) a[a>0.1] = 0 a = a.reshape(1e4,1e4) m = lil_matrix(a) t0 = time.clock() sums = iter_over_lil_matrix(m) t1 = time.clock() print 'Cython lil_matrix Time', t1-t0 t0 = time.clock() array_sums = a.sum(axis=1) t1 = time.clock() print 'Numpy ndarray Time', t1-t0 t0 = time.clock() lil_sums = m.sum(axis=1) t1 = time.clock() print 'lil_matrix Time', t1-t0 mcsr = m.tocsr() t0 = time.clock() csr_sums = mcsr.sum(axis=1) t1 = time.clock() print 'csr_matrix Time', t1-t0 assert np.allclose(array_sums, sums) assert np.allclose(array_sums, np.asarray(lil_sums).flatten()) assert np.allclose(array_sums, np.asarray(csr_sums).flatten())Timings in seconds - only about 2 times slower than the super-optimized NumPy :D, much faster than the lil_matrix.sum() method because it converts to csr_matrix() before, as clarified by @hpaulj and confirmed by the results below. Note that the csr_matrix.sum() over the columns is almost one order of magnitude faster than the dense sum.
Cython lil_matrix Time 0.183935034665 Numpy ndarray Time 0.106583238273 lil_matrix Time 2.47158218631 csr_matrix Time 0.0140050888745Things that will slow down the code:
use of for i in range(len(m.data)): with data_row = m.data[i] declare buffers like np.ndarray[object, ndim=1] data with data=m.dataThings that did not affect:
boundscheck or wraparound更多推荐
发布评论