I'm trying to use multiprocessing to speed up reading Excel files with pandas. However, when I use multiprocessing I get the error cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
when I try to run the following:

    import glob

    import dill
    import pandas as pd
    from pathos.multiprocessing import ProcessPool

    class A(object):
        def __init__(self):
            # Collect every file in the current directory.
            self.files = glob.glob('*')

        def read_file(self, filename):
            return pd.read_excel(filename)

        def file_data(self):
            pool = ProcessPool(9)
            file_list = [filename for filename in self.files]
            # Each worker reads one file; the results are concatenated.
            df_list = pool.map(A().read_file, file_list)
            combined_df = pd.concat(df_list, ignore_index=True)

Isn't pathos.multiprocessing designed to fix this issue? Am I overlooking something here?
The full error traces to:

    File "c:\users\zky3sse\appdata\local\continuum\anaconda2\lib\site-packages\pathos-0.2.0-py2.7.egg\pathos\multiprocessing.py", line 136, in map
      return _pool.map(star(f), zip(*args)) # chunksize
    File "C:\Users\ZKY3SSE\AppData\Local\Continuum\Anaconda2\lib\multiprocessing\pool.py", line 251, in map
      return self.map_async(func, iterable, chunksize).get()
    File "C:\Users\ZKY3SSE\AppData\Local\Continuum\Anaconda2\lib\multiprocessing\pool.py", line 567, in get
      raise self._value

Recommended answer
pandas may be using SWIG as a wrapper for its C code. If that is the case, dill may not work properly, and pathos would then switch to pickle. There are workarounds, as described in: How to make my SWIG extension module work with Pickle?
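
To see whether the fallback to pickle is what fails, it can help to test an object against the standard pickle module directly. The snippet below is only a minimal sketch: `can_pickle` is a hypothetical helper, not part of pathos, dill, or pandas, and it assumes Python 3, where the module is `pickle` rather than the `cPickle` that appears in the traceback.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle stores functions by reference (module + qualified name),
        # so anything it cannot look up by name fails here.
        return False


# A built-in function is importable by name, so pickle accepts it.
print(can_pickle(len))
# A lambda has no importable name, so pickle rejects it.
print(can_pickle(lambda x: x))
```

Calling the helper on the callable passed to pool.map, such as A().read_file from the question, may show whether the standard pickle path would fail for it.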