scipy sparse matrices and cython

Updated: 2024-10-17 20:33:33

I need to perform a set of operations on a scipy sparse matrix in a Cython method.

To apply them efficiently I need access to the lil_matrix representation. The LIL (list of lists) format stores each row as a Python list, so the underlying data is a list of lists with different lengths.

How can I efficiently pass a list of lists of different lengths to Cython (without copying)? Is there any other way to access lil_matrix data in Cython?
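For readers unfamiliar with the layout being asked about, here is a minimal Python sketch of what a lil_matrix exposes. It assumes a small 3 x 3 example matrix chosen for illustration; the variable names are not from the original question. The attributes m.rows and m.data are NumPy object arrays with one Python list per row, and passing them to a function hands over references to those lists rather than copies.

```python
import numpy as np
from scipy.sparse import lil_matrix

# Illustrative 3 x 3 matrix (an assumption, not data from the question).
dense = np.array([[0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 3.0]])
m = lil_matrix(dense)

# m.rows[i] lists the column indices of the nonzeros in row i,
# m.data[i] lists the matching values; the lists have different lengths.
rows = [list(r) for r in m.rows]
data = [list(d) for d in m.data]

print(rows)  # [[1], [], [0, 2]]
print(data)  # [[2.0], [], [1.0, 3.0]]
```

The empty second row shows why the structure cannot be a rectangular array: each row's list is exactly as long as its number of nonzeros.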

Best answer

The example below iterates over a lil_matrix and calculates the sum for each row.

Note that I add almost no type declarations (only the two lists are typed), and it is still extremely fast, because Cython is already optimized for built-in types such as lists. The timings are also shown below.

    import time

    import numpy as np
    cimport numpy as np
    from scipy.sparse import lil_matrix


    cdef iter_over_lil_matrix(m):
        # Only the two lists are typed; s and value stay Python objects.
        cdef list sums, data_row
        sums = []
        for data_row in m.data:          # one Python list of nonzeros per row
            s = 0
            for value in data_row:
                s += value
            sums.append(s)
        return sums


    def main():
        n = 10000                        # integer size; NumPy rejects float shapes such as 1e4
        a = np.random.random(n * n)
        a[a > 0.1] = 0                   # keep roughly 10% of the entries nonzero
        a = a.reshape(n, n)
        m = lil_matrix(a)

        t0 = time.perf_counter()         # time.clock() was removed in Python 3.8
        sums = iter_over_lil_matrix(m)
        t1 = time.perf_counter()
        print('Cython lil_matrix Time', t1 - t0)

        t0 = time.perf_counter()
        array_sums = a.sum(axis=1)
        t1 = time.perf_counter()
        print('Numpy ndarray Time', t1 - t0)

        t0 = time.perf_counter()
        lil_sums = m.sum(axis=1)
        t1 = time.perf_counter()
        print('lil_matrix Time', t1 - t0)

        mcsr = m.tocsr()                 # conversion is done outside the timed region
        t0 = time.perf_counter()
        csr_sums = mcsr.sum(axis=1)
        t1 = time.perf_counter()
        print('csr_matrix Time', t1 - t0)

        # All four approaches must agree on the row sums.
        assert np.allclose(array_sums, sums)
        assert np.allclose(array_sums, np.asarray(lil_sums).flatten())
        assert np.allclose(array_sums, np.asarray(csr_sums).flatten())

Timings are in seconds. The Cython loop is only about 2 times slower than the super-optimized NumPy dense sum :D, and much faster than the lil_matrix.sum() method, which is slow because it first converts to csr_matrix, as clarified by @hpaulj and confirmed by the results below. Note that csr_matrix.sum(axis=1) is almost one order of magnitude faster than the dense sum.

    Cython lil_matrix Time 0.183935034665
    Numpy ndarray Time 0.106583238273
    lil_matrix Time 2.47158218631
    csr_matrix Time 0.0140050888745
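The row-sum loop in the answer uses no Cython-only syntax in its body, so its logic can be checked in plain Python on a small matrix. This is a minimal sketch under stated assumptions: a 200 x 300 matrix with roughly 10% nonzeros (sizes picked for illustration, not the 1e4 x 1e4 benchmark above), and row_sums_from_lil as a hypothetical name for the same loop. It only confirms that the four approaches agree; it says nothing about their relative speed.

```python
import numpy as np
from scipy.sparse import lil_matrix


def row_sums_from_lil(m):
    """Sum each row by walking m.data, one Python list per row."""
    sums = []
    for data_row in m.data:
        s = 0.0
        for value in data_row:
            s += value
        sums.append(s)
    return sums


rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
a = rng.random((200, 300))
a[a > 0.1] = 0.0                 # keep roughly 10% of the entries
m = lil_matrix(a)

loop_sums = np.array(row_sums_from_lil(m))
dense_sums = a.sum(axis=1)
# lil_matrix.sum and csr_matrix.sum return a (200, 1) matrix; flatten it.
lil_sums = np.asarray(m.sum(axis=1)).ravel()
csr_sums = np.asarray(m.tocsr().sum(axis=1)).ravel()

print(np.allclose(loop_sums, dense_sums))
```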

Things that will slow down the code:

- using for i in range(len(m.data)): with data_row = m.data[i]
- declaring buffers like np.ndarray[object, ndim=1] data with data = m.data

Things that did not affect the timings:

- boundscheck or wraparound
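On the second part of the question (other ways to reach the data from Cython), one option the timings hint at is converting to CSR once and working on its flat arrays data, indices and indptr. These are ordinary NumPy arrays, so they could plausibly be typed as memoryviews in Cython; that is an assumption about how one might port this, not something the answer measured. The conversion itself copies the data, so it does not meet the "without copying" requirement. A plain-Python sketch, with row_sums_from_csr as a hypothetical helper:

```python
import numpy as np
from scipy.sparse import lil_matrix


def row_sums_from_csr(data, indptr):
    """Row sums from CSR arrays; row i owns data[indptr[i]:indptr[i + 1]]."""
    n_rows = len(indptr) - 1
    sums = np.zeros(n_rows, dtype=np.float64)
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            sums[i] += data[k]
    return sums


# Illustrative 3 x 3 matrix (an assumption, not data from the answer).
dense = np.array([[0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 3.0]])
csr = lil_matrix(dense).tocsr()      # this conversion copies the data

csr_row_sums = row_sums_from_csr(csr.data, csr.indptr)
print(csr_row_sums.tolist())         # [2.0, 0.0, 4.0]
```

The design trade-off is a one-time conversion cost in exchange for contiguous, typed storage, which only pays off if the matrix is read many times after conversion.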

Published: 2023-08-05 21:46:00
Link: https://www.elefans.com/category/jswz/34/1441014.html