Python: using multiprocessing to further speed up a Cython function

Updated: 2024-10-11 23:24:02

Question

The code shown here is simplified but triggers the same PicklingError. I know there is a lot of discussion about what can and cannot be pickled, but I did not find a solution in it.

I wrote a simple Cython script with the following function:

def pow2(int a):
    return a**2

It compiles fine, and I can call this function from a Python script.

However, I am wondering how to use this function with multiprocessing:

import numpy as np
from multiprocessing import Pool
from fast import pow2

p = Pool(processes=4)
y = p.map(pow2, np.arange(10, dtype=int))

This raises a PicklingError. Here, dtw is the name of the package, and fast is fast.pyx.
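As background, multiprocessing has to pickle the function passed to Pool.map in order to send it to the worker processes, and pickle stores a function as a reference to an importable name, not as code. The sketch below uses math.sqrt and a lambda as stand-ins (not the pow2 from fast.pyx) to show the difference; can_pickle is a hypothetical helper introduced only for this illustration.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A function that pickle can find again by module and name.
print(can_pickle(math.sqrt))        # True

# A lambda has no importable name, so it cannot be pickled by reference.
print(can_pickle(lambda a: a**2))   # False
```

Running the same check on the compiled function is a quick way to see whether Pool.map can accept it in a given environment.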

How can I get around this problem? Thanks in advance.

Recommended answer

Instead of using multiprocessing, which has to pickle the data to pass it between processes, you can use prange, Cython's wrapper around OpenMP. In your case you could use it as shown below.

Note that:

- x*x is used instead of x**2, which avoids the function call pow(x, 2);
- a part of the array is passed to each thread, using double * pointers (pointers to double);
- the last thread takes more values when size % num_threads != 0.
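The chunking described in the last two points can be sketched in plain Python. This is only an illustration: split_chunks is a hypothetical helper that is not part of the answer's code, and it mirrors how sub_sizes and pos are computed there.

```python
from itertools import accumulate


def split_chunks(size, num_threads):
    """Return (sub_sizes, pos): the length and start offset of each chunk."""
    # Every thread gets size // num_threads elements ...
    sub_sizes = [size // num_threads] * num_threads
    # ... and the last thread also takes the remainder.
    sub_sizes[-1] += size % num_threads
    # Start offsets are the running sum of the preceding chunk sizes.
    pos = [0] + list(accumulate(sub_sizes))[:-1]
    return sub_sizes, pos


print(split_chunks(10, 4))  # ([2, 2, 2, 4], [0, 2, 4, 6])
```

Each thread then works on the slice starting at pos[thread] with length sub_sizes[thread], so the chunks never overlap and together cover the whole array.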

Code:

#cython: wraparound=False
#cython: boundscheck=False
#cython: cdivision=True
#cython: nonecheck=False
#cython: profile=False
import numpy as np
cimport numpy as np
from cython.parallel import prange

# Square `size` doubles from inp into out; runs without the GIL.
cdef void cpow2(int size, double *inp, double *out) nogil:
    cdef int i
    for i in range(size):
        out[i] = inp[i]*inp[i]

def pow2(np.ndarray[np.float64_t, ndim=1] inp,
         np.ndarray[np.float64_t, ndim=1] out,
         int num_threads=4):
    cdef int thread
    cdef np.ndarray[np.int32_t, ndim=1] sub_sizes, pos
    size = np.shape(inp)[0]
    # One chunk per thread; the last chunk takes the remainder.
    sub_sizes = np.zeros(num_threads, np.int32) + size//num_threads
    pos = np.zeros(num_threads, np.int32)
    sub_sizes[num_threads-1] += size % num_threads
    pos[1:] = np.cumsum(sub_sizes)[:num_threads-1]
    # Each thread squares its own chunk in parallel.
    for thread in prange(num_threads, nogil=True, chunksize=1,
                         num_threads=num_threads, schedule='static'):
        cpow2(sub_sizes[thread], &inp[pos[thread]], &out[pos[thread]])

def main():
    a = np.arange(642312323).astype(np.float64)
    pow2(a, out=a, num_threads=4)
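prange only runs in parallel when the extension is compiled and linked with OpenMP enabled; otherwise the loop runs serially. A minimal setup.py sketch follows. It assumes the module is built from fast.pyx as in the question and that GCC-style (-fopenmp) or MSVC-style (/openmp) flags apply to your compiler. The names openmp_flags and build are hypothetical helpers, not part of the original answer.

```python
import sys


def openmp_flags(platform=sys.platform):
    """Return (compile_args, link_args) that enable OpenMP (assumed flags)."""
    if platform.startswith("win"):
        # MSVC: a compiler switch only, no extra linker flag.
        return ["/openmp"], []
    # GCC and OpenMP-enabled clang builds.
    return ["-fopenmp"], ["-fopenmp"]


def build():
    # Imported here so the helper above can be used without Cython installed.
    import numpy as np
    from Cython.Build import cythonize
    from setuptools import Extension, setup

    compile_args, link_args = openmp_flags()
    ext = Extension(
        "fast",                          # module name, from fast.pyx
        sources=["fast.pyx"],
        include_dirs=[np.get_include()],  # needed for `cimport numpy`
        extra_compile_args=compile_args,
        extra_link_args=link_args,
    )
    setup(name="fast", ext_modules=cythonize([ext]))


# Only build when invoked with a command, e.g. `build_ext --inplace`.
if __name__ == "__main__" and len(sys.argv) > 1:
    build()
```

It would be run as python setup.py build_ext --inplace. Apple's default clang does not accept -fopenmp out of the box, so macOS may need a different toolchain or extra flags.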

Published: 2023-10-24 20:44:45

Article link: https://www.elefans.com/category/jswz/34/1524954.html