Iterating over a single list in parallel in Python

Updated: 2024-10-09 19:15:54
Problem description

The objective is to run several calculations over a single iterator in parallel, using the builtin sum and map functions concurrently. Perhaps something like itertools could replace classic for loops when analyzing (LARGE) data that arrives via an iterator.

In one simple example case I want to calculate ilen, sum_x and sum_x_sq:

ilen, sum_x, sum_x_sq = iterlen(iter), sum(iter), sum(map(lambda x: x*x, iter))

But without converting the (large) iter to a list (as with iter=list(iter)).

N.B. Can this be done using sum and map and without for loops, maybe using the itertools and/or threading modules?

import random

def example_large_data(n=100000000, mean=0, std_dev=1):
    # Lazily yield n Gaussian samples, one at a time
    for i in range(n):
        yield random.gauss(mean, std_dev)

-Edit-

Being VERY specific: I was taking a good look at itertools, hoping that there was a dual function like map that could do it. For example: len_x, sum_x, sum_x_sq = itertools.iterfork(iter_x, iterlen, sum, sum_sq)

If I were to be very, very specific: I am looking for just one answer, the Python source code for the "iterfork" procedure.

Recommended answer

You can use itertools.tee to turn your single iterator into three iterators, which you can then pass to your three functions.

iter0, iter1, iter2 = itertools.tee(input_iter, 3)
ilen, sum_x, sum_x_sq = count(iter0), sum(iter1), sum(map(lambda x: x*x, iter2))

That will work, but the builtin function sum (and map in Python 2) is not implemented in a way that supports parallel iteration. The first function you call will consume its iterator completely, then the second function will consume the second iterator, then the third function will consume the third. Since tee has to store every value that one of its output iterators has seen but the others have not yet consumed, this is essentially the same as creating a list from the iterator and passing it to each function.
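To make the buffering concrete, here is a minimal sketch of the sequential tee approach on a tiny input. The helpers iterlen and small_data are assumptions for illustration (the question never defines iterlen). The results are correct, but by the time the first call returns, tee is holding every value for the two iterators that have not started yet.

```python
import itertools

def iterlen(iterable):
    # Hypothetical iterator-friendly len(): counts items without building a list
    return sum(1 for _ in iterable)

def small_data():
    # Stand-in for a large data source; yields 1, 2, 3, 4
    yield from range(1, 5)

iter0, iter1, iter2 = itertools.tee(small_data(), 3)

# Each call drains its iterator completely before the next call starts,
# so tee must buffer all values for the iterators that lag behind.
ilen = iterlen(iter0)
sum_x = sum(iter1)
sum_x_sq = sum(map(lambda x: x * x, iter2))

print(ilen, sum_x, sum_x_sq)
```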

Now, if you use generator-style functions that consume only a single value from their input for each value they output, you might be able to make parallel iteration work using zip. In Python 3, map and zip both return lazy iterators. The question is how to make sum lazy as well.
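The lockstep behaviour can be observed with a small tracing sketch. The traced helper and the log list are hypothetical names introduced only for this illustration; they record each time a value is pulled from the underlying source. Assuming Python 3, where map and zip are lazy, one step of zip pulls only one value from the source, so tee never has to buffer more than one item here.

```python
import itertools

def traced(source, log):
    # Hypothetical helper: record every value pulled from the underlying source
    for x in source:
        log.append(x)
        yield x

log = []
it0, it1 = itertools.tee(traced(iter([1, 2, 3]), log), 2)

# Both map objects are lazy; zip advances them one step at a time
pairs = zip(map(lambda x: x, it0), map(lambda x: x * x, it1))

first = next(pairs)
pulled_after_first_step = list(log)  # snapshot of what the source has produced so far

rest = list(pairs)  # drain the remaining pairs
```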

I think you can get pretty much what you want by using itertools.accumulate (which was added in Python 3.2). It is an iterator that yields a running sum of its input. Here's how you could make it work for your problem (I'm assuming your count function was supposed to be an iterator-friendly version of len):

import itertools

iter0, iter1, iter2 = itertools.tee(input_iter, 3)

len_gen = itertools.accumulate(map(lambda x: 1, iter0))
sum_gen = itertools.accumulate(iter1)
sum_sq_gen = itertools.accumulate(map(lambda x: x*x, iter2))

parallel_gen = zip(len_gen, sum_gen, sum_sq_gen)  # zip is lazy in Python 3

for ilen, sum_x, sum_x_sq in parallel_gen:
    pass  # the generators do all the work, so there's nothing for us to do here

# ilen, sum_x, sum_x_sq have the right values here (provided input_iter was not empty)!
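The same idea can be packaged as a function that avoids the explicit for loop. This is only a sketch: the name running_stats is hypothetical, and returning zeros for an empty input is an assumption rather than something the answer specifies. It relies on collections.deque with maxlen=1, which consumes an iterable while keeping only its last item.

```python
import collections
import itertools

def running_stats(iterable):
    """Return (count, sum, sum of squares) in a single pass over iterable."""
    iter0, iter1, iter2 = itertools.tee(iterable, 3)
    len_gen = itertools.accumulate(map(lambda x: 1, iter0))
    sum_gen = itertools.accumulate(iter1)
    sum_sq_gen = itertools.accumulate(map(lambda x: x * x, iter2))
    # A deque with maxlen=1 drains the zip and keeps only the final tuple
    last = collections.deque(zip(len_gen, sum_gen, sum_sq_gen), maxlen=1)
    # Assumption: an empty input should give zeros rather than raise
    return last[0] if last else (0, 0, 0)

print(running_stats(x for x in [1, 2, 3, 4]))
```

Because zip advances the three accumulators in lockstep, the tee buffers stay at roughly one item no matter how long the input is.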

If you're using Python 2 rather than 3, you'll have to write your own accumulate generator function (there's a pure Python implementation in the itertools.accumulate documentation), and use itertools.imap and itertools.izip rather than the builtin map and zip functions.
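For reference, a hand-written accumulate could look roughly like the sketch below. It is modeled on the behaviour of itertools.accumulate rather than copied from the documentation, and the func parameter defaulting to operator.add is an assumption made for illustration.

```python
import operator

def accumulate(iterable, func=operator.add):
    # Yield running totals, e.g. accumulate([1, 2, 3]) yields 1, 3, 6
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        return  # empty input: yield nothing
    yield total
    for element in it:
        total = func(total, element)
        yield total

print(list(accumulate([1, 2, 3, 4])))
```

On Python 2 this function would be combined with itertools.imap and itertools.izip in place of map and zip; those two names exist only in Python 2's itertools.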

Published: 2023-10-28 10:44:10
Link: https://www.elefans.com/category/jswz/34/1536406.html