cPickle error when using pathos.multiprocessing?

Updated: 2024-10-27 04:32:46

Problem description

I'm trying to use multiprocessing to speed up reading Excel files with pandas. However, when I use multiprocessing I get the error cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

when I try to run the following:

    import glob

    import dill
    import pandas as pd
    from pathos.multiprocessing import ProcessPool


    class A(object):
        def __init__(self):
            # Collect the files to read.
            self.files = glob.glob("*")

        def read_file(self, filename):
            # Load one Excel file into a DataFrame.
            return pd.read_excel(filename)

        def file_data(self):
            # Read the files in parallel with 9 worker processes.
            pool = ProcessPool(9)
            file_list = [filename for filename in self.files]
            df_list = pool.map(A().read_file, file_list)
            combined_df = pd.concat(df_list, ignore_index=True)

Isn't pathos.multiprocessing designed to fix this issue? Am I overlooking something here?

The full traceback ends with:

    File "c:\users\zky3sse\appdata\local\continuum\anaconda2\lib\site-packages\pathos-0.2.0-py2.7.egg\pathos\multiprocessing.py", line 136, in map
      return _pool.map(star(f), zip(*args)) # chunksize
    File "C:\Users\ZKY3SSE\AppData\Local\Continuum\Anaconda2\lib\multiprocessing\pool.py", line 251, in map
      return self.map_async(func, iterable, chunksize).get()
    File "C:\Users\ZKY3SSE\AppData\Local\Continuum\Anaconda2\lib\multiprocessing\pool.py", line 567, in get
      raise self._value
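As background on the error itself: the standard pickler serializes a function by reference, that is, by the module and name under which it can be imported. A function with no importable name cannot be pickled that way, which is plausibly what happens to the wrapper created by star(f) in the traceback above. The sketch below illustrates this with the standard library only. It is written for Python 3, where cPickle has been merged into pickle, and can_pickle is a hypothetical helper, not part of pathos or dill.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A function that is importable by name is stored as a reference to that name.
print(can_pickle(len))           # prints True

# A lambda has no importable name, so the attribute lookup fails.
print(can_pickle(lambda x: x))   # prints False
```

Under that reading, everything handed to the pool has to be serializable by whichever pickler ends up being used, which is why the choice between dill and pickle matters here.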

Recommended answer

Pandas may be using SWIG as a wrapper for its C code. If that is the case, dill may not work properly, and pathos would then fall back to pickle. There are workarounds, as shown here: How to make my SWIG extension module work with Pickle?
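One general way to make an object with C-level state picklable is to define __getstate__ and __setstate__, so that only plain Python data is serialized and the unpicklable part is rebuilt on load. The following is a minimal sketch of that idea, not the code from the linked answer: Handle is a hypothetical class, and a threading.Lock stands in for the wrapped C object because locks cannot be pickled either. It is written for Python 3.

```python
import pickle
import threading


class Handle(object):
    """Hypothetical stand-in for an object that wraps C-level state."""

    def __init__(self, name):
        self.name = name
        # A lock cannot be pickled; it plays the role of the wrapped C object.
        self._raw = threading.Lock()

    def __getstate__(self):
        # Serialize only the plain data needed to rebuild the object.
        return {"name": self.name}

    def __setstate__(self, state):
        # Recreate the unpicklable part instead of trying to copy it.
        self.__init__(state["name"])


original = Handle("sheet1")
restored = pickle.loads(pickle.dumps(original))
print(restored.name)  # prints "sheet1"
```

Whether this applies to a given library depends on how its extension types are built, so it is best treated as a pattern to adapt rather than a drop-in fix.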