How to limit concurrency with Python asyncio?

Updated: 2024-10-25 16:29:44
Problem description

Let's assume we have a bunch of links to download, and each link may take a different amount of time to download. I'm allowed to use at most 3 connections for downloading. Now, I want to make sure I do this efficiently using asyncio.

Here's what I'm trying to achieve: at any point in time, try to ensure that I have at least 3 downloads running.

Connection 1: 1---------7---9---
Connection 2: 2---4----6-----
Connection 3: 3-----5---8-----

The numbers represent the download links, while the hyphens represent waiting for a download to finish.

Here is the code I'm using right now:

from random import randint
import asyncio

count = 0


async def download(code, permit_download, no_concurrent, downloading_event):
    global count
    downloading_event.set()
    wait_time = randint(1, 3)
    print('downloading {} will take {} second(s)'.format(code, wait_time))
    await asyncio.sleep(wait_time)  # I/O, context will switch to main function
    print('downloaded {}'.format(code))
    count -= 1
    if count < no_concurrent and not permit_download.is_set():
        permit_download.set()


async def main(loop):
    global count
    permit_download = asyncio.Event()
    permit_download.set()
    downloading_event = asyncio.Event()
    no_concurrent = 3
    i = 0
    while i < 9:
        if permit_download.is_set():
            count += 1
            if count >= no_concurrent:
                permit_download.clear()
            loop.create_task(download(i, permit_download, no_concurrent, downloading_event))
            await downloading_event.wait()  # To force context to switch to download function
            downloading_event.clear()
            i += 1
        else:
            await permit_download.wait()
    await asyncio.sleep(9)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main(loop))
    finally:
        loop.close()

And the output is as expected:

downloading 0 will take 2 second(s)
downloading 1 will take 3 second(s)
downloading 2 will take 1 second(s)
downloaded 2
downloading 3 will take 2 second(s)
downloaded 0
downloading 4 will take 3 second(s)
downloaded 1
downloaded 3
downloading 5 will take 2 second(s)
downloading 6 will take 2 second(s)
downloaded 5
downloaded 6
downloaded 4
downloading 7 will take 1 second(s)
downloading 8 will take 1 second(s)
downloaded 7
downloaded 8

But here are my questions:

  • At the moment, I'm simply waiting for 9 seconds to keep the main function running until the downloads are complete. Is there an efficient way of waiting for the last download to complete before exiting the main function? (I know there's asyncio.wait, but I'll need to store all the task references for it to work.)

  • What's a good library that does this kind of task? I know JavaScript has a lot of async libraries, but what about Python?

    2. What's a good library that takes care of common async patterns? (Something like async)

    Recommended answer

    Before reading the rest of this answer, please note that the idiomatic way of limiting the number of parallel tasks with asyncio is to use asyncio.Semaphore, as shown in Mikhail's answer and elegantly abstracted in Andrei's answer. This answer contains working, but somewhat more complicated, ways of achieving the same thing. I am leaving the answer up because in some cases this approach can have advantages over a semaphore, specifically when the work to be done is very large or unbounded and you cannot create all the coroutines in advance. In that case the second (queue-based) solution in this answer is what you want. But in most regular situations, such as parallel download through aiohttp, you should use a semaphore instead.
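The answers by Mikhail and Andrei are not reproduced on this page, so the following is only a minimal sketch of the semaphore approach, not their code. It assumes Python 3.7+ (for asyncio.run), uses asyncio.sleep as a stand-in for real I/O, and the `stats` dictionary is a hypothetical bookkeeping aid added just to make the limit observable.

```python
import asyncio
import random

MAX_CONCURRENT = 3  # limit taken from the question

# Bookkeeping only, so the limit can be observed; not needed in real code.
stats = {"running": 0, "peak": 0}


async def download(code, semaphore):
    # At most MAX_CONCURRENT coroutines can be inside this block at once.
    async with semaphore:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(random.uniform(0.01, 0.03))  # stand-in for real I/O
        stats["running"] -= 1
        return code


async def main():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    # All nine coroutines are created up front; the semaphore gates them.
    return await asyncio.gather(*(download(i, semaphore) for i in range(9)))


results = asyncio.run(main())
print(results, "peak concurrency:", stats["peak"])
```

Because asyncio.gather returns only once every download has finished, this also removes the need for a fixed sleep at the end of main. The trade-off mentioned above still applies: all coroutines exist from the start, which is fine for nine links but not for an unbounded stream of work.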

    You basically need a fixed-size pool of download tasks. asyncio doesn't come with a pre-made task pool, but it is easy to create one: simply keep a set of tasks and don't allow it to grow past the limit. Although the question states your reluctance to go down that route, the code ends up much more elegant:

    import asyncio
    import random

    async def download(code):
        wait_time = random.randint(1, 3)
        print('downloading {} will take {} second(s)'.format(code, wait_time))
        await asyncio.sleep(wait_time)  # I/O, context will switch to main function
        print('downloaded {}'.format(code))

    async def main(loop):
        no_concurrent = 3
        dltasks = set()
        i = 0
        while i < 9:
            if len(dltasks) >= no_concurrent:
                # Wait for some download to finish before adding a new one
                _done, dltasks = await asyncio.wait(
                    dltasks, return_when=asyncio.FIRST_COMPLETED)
            dltasks.add(loop.create_task(download(i)))
            i += 1
        # Wait for the remaining downloads to finish
        await asyncio.wait(dltasks)
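On Python 3.7 and later the explicit loop argument is not required, since asyncio.create_task and asyncio.run are available. The sketch below is one possible variant of the same task-set pool under that assumption. The `codes` and `limit` parameters and the `finished` list are names introduced here for illustration, and the sleep times are shortened stand-ins for real downloads.

```python
import asyncio
import random


async def download(code):
    await asyncio.sleep(random.uniform(0.01, 0.03))  # stand-in for real I/O
    return code


async def main(codes, limit=3):
    pending = set()
    finished = []
    for code in codes:  # codes may be any iterable, including a lazy generator
        if len(pending) >= limit:
            # Block until at least one download completes, then free its slot.
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            finished.extend(task.result() for task in done)
        pending.add(asyncio.create_task(download(code)))
    if pending:
        # Drain whatever is still in flight.
        done, _ = await asyncio.wait(pending)
        finished.extend(task.result() for task in done)
    return finished


finished = asyncio.run(main(range(9)))
print(sorted(finished))
```

Since tasks are created only when a slot is free, the input can be consumed lazily, which is the situation where this approach may be preferable to a semaphore.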

    An alternative is to create a fixed number of coroutines that do the downloading, much like a fixed-size thread pool, and feed them work using an asyncio.Queue. This removes the need to manually limit the number of downloads, which will be automatically limited by the number of coroutines invoking download():

    # download() defined as above

    async def download_worker(q):
        while True:
            code = await q.get()
            await download(code)
            q.task_done()

    async def main(loop):
        q = asyncio.Queue()
        workers = [loop.create_task(download_worker(q)) for _ in range(3)]
        i = 0
        while i < 9:
            await q.put(i)
            i += 1
        await q.join()  # wait for all tasks to be processed
        for worker in workers:
            worker.cancel()
        await asyncio.gather(*workers, return_exceptions=True)
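For the very large or unbounded workloads mentioned at the top of this answer, one possible refinement is to give the queue a maxsize so that the producer is held back while the workers are busy. The sketch below assumes Python 3.7+; the `results` list, the `n_workers` parameter, and the shortened sleep are illustrative additions, not part of the original code.

```python
import asyncio
import random


async def download(code):
    await asyncio.sleep(random.uniform(0.01, 0.03))  # stand-in for real I/O
    return code


async def download_worker(queue, results):
    while True:
        code = await queue.get()
        try:
            results.append(await download(code))
        finally:
            # Mark the item as processed even if download() raised.
            queue.task_done()


async def main(codes, n_workers=3):
    # maxsize makes put() wait while the queue is full, so a very large or
    # endless producer never holds more than a few pending items in memory.
    queue = asyncio.Queue(maxsize=n_workers)
    results = []
    workers = [asyncio.create_task(download_worker(queue, results))
               for _ in range(n_workers)]
    for code in codes:
        await queue.put(code)
    await queue.join()  # every queued item has been processed
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


results = asyncio.run(main(iter(range(9))))
print(sorted(results))
```

The number of concurrent downloads is still bounded by the number of workers; the maxsize only bounds how far the producer can run ahead of them.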

    As for your other question, the obvious choice would be aiohttp.

    Published: 2023-11-06 09:46:36
    Article link: https://www.elefans.com/category/jswz/34/1563354.html