Why do subprocesses import the main module at start on Windows while they don't on Linux?

Updated: 2024-10-24 04:29:44

Example: the following code runs fine on Ubuntu 14.04

    # some imports
    import numpy as np
    import glob
    import sys
    import multiprocessing
    import os

    # creating some temporary data
    tmp_dir = os.path.join('tmp', 'nptest')
    if not os.path.exists(tmp_dir):
        os.makedirs(tmp_dir)
    for i in range(10):
        x = np.random.rand(100, 50)
        y = np.random.rand(200, 20)
        file_path = os.path.join(tmp_dir, '%05d.npz' % i)
        np.savez_compressed(file_path, x=x, y=y)

    def read_npz(path):
        data = dict(np.load(path))
        return (data['x'], data['y'])

    def parallel_read(files):
        pool = multiprocessing.Pool(processes=4)
        data_list = pool.map(read_npz, files)
        return data_list

    files = glob.glob(os.path.join(tmp_dir, '*.npz'))
    x = parallel_read(files)
    print('done')

but fails on Windows 7, with an error message along the lines of:

        cmd = get_command_line() + [rhandle]
        pool = multiprocessing.Pool(processes=4)
      File "C:\Anaconda\lib\multiprocessing\forking.py", line 358, in get_command_line
      File "C:\Anaconda\lib\multiprocessing\__init__.py", line 232, in Pool
        return Pool(processes, initializer, initargs, maxtasksperchild)
      File "C:\Anaconda\lib\multiprocessing\pool.py", line 159, in __init__
        is not going to be frozen to produce a Windows executable.''')
    RuntimeError:
        Attempt to start a new process before the current process
        has finished its bootstrapping phase.

        This probably means that you are on Windows and you have
        forgotten to use the proper idiom in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce a Windows executable.
        self._repopulate_pool()
      File "C:\Anaconda\lib\multiprocessing\pool.py", line 223, in _repopulate_pool
        w.start()
      File "C:\Anaconda\lib\multiprocessing\process.py", line 130, in start
        self._popen = Popen(self)
      File "C:\Anaconda\lib\multiprocessing\forking.py", line 258, in __init__
        cmd = get_command_line() + [rhandle]
      File "C:\Anaconda\lib\multiprocessing\forking.py", line 358, in get_command_line
        is not going to be frozen to produce a Windows executable.''')
    RuntimeError:
        Attempt to start a new process before the current process
        has finished its bootstrapping phase.

        This probably means that you are on Windows and you have
        forgotten to use the proper idiom in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce a Windows executable.

From my understanding, this stems from the fact that subprocesses import the main module at start on Windows while they don't on Linux. The issue on Windows can be prevented by placing the call x = parallel_read(files) under an if __name__ == '__main__': guard, e.g.:

    if __name__ == '__main__':
        x = parallel_read(files)
        print('done')

Why do subprocesses import the main module at start on Windows while they don't on Linux?

Best answer

Windows doesn't have a fork function. Most other operating systems do, and on those platforms multiprocessing uses it to launch new processes with the same state as the parent process. Windows has to set up the state of the child process by other means, including importing the __main__ module.
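One practical consequence is that, on Windows, every module-level statement of the script runs again in each worker, not only the call that creates the pool. The following is a minimal sketch of how the question's script could be restructured so that only definitions live at module level. It assumes Python 3 and swaps the NumPy .npz files for small text files to stay within the standard library; the names make_temp_files, read_length, and main are hypothetical and not part of the original code.

```python
import glob
import multiprocessing
import os
import tempfile


def make_temp_files(directory, count=4):
    """Write a few small text files (a stand-in for the .npz files)."""
    for i in range(count):
        path = os.path.join(directory, '%05d.txt' % i)
        with open(path, 'w') as handle:
            handle.write('x' * (i + 1))


def read_length(path):
    """Worker function, defined at module level so child processes can find it."""
    with open(path) as handle:
        return len(handle.read())


def parallel_read(files, processes=2):
    # The context manager shuts the pool down when the block exits.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(read_length, files)


def main():
    # All side effects (creating files, starting the pool) happen here,
    # so re-importing the module in a child process does nothing but
    # define functions.
    with tempfile.TemporaryDirectory() as tmp_dir:
        make_temp_files(tmp_dir)
        files = sorted(glob.glob(os.path.join(tmp_dir, '*.txt')))
        print(parallel_read(files))


if __name__ == '__main__':
    # A child that re-imports this module sees a different __name__
    # and therefore skips this block.
    multiprocessing.freeze_support()  # only matters for frozen Windows executables
    main()
```

Compared with guarding only the parallel_read call, this layout also keeps the temporary-data loop from being repeated in every worker on Windows.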

Note that Python 3.4 (and later) lets you use the non-forking implementation on all operating systems, if you request it (for example with multiprocessing.set_start_method('spawn')). See issue 8713 on the bug tracker for the discussion of this feature.

Published: 2023-07-24 01:49:00
Article link: https://www.elefans.com/category/jswz/34/1240105.html