Running more than 32 concurrent tasks in Apache Airflow

Question

I'm running Apache Airflow 1.8.1. I would like to run more than 32 concurrent tasks on my instance, but cannot get any of the configurations to work.

I am using the CeleryExecutor, the Airflow config in the UI shows 64 for parallelism and dag_concurrency, and I've restarted the Airflow scheduler, web server, and workers numerous times (I'm actually testing this locally in a Vagrant machine, but have also tested it on an EC2 instance).

airflow.cfg

# The amount of parallelism as a setting to the executor. This defines
# the max number of task instances that should run simultaneously
# on this airflow installation
parallelism = 64

# The number of task instances allowed to run concurrently by the scheduler
dag_concurrency = 64

Example DAG. I've tried both without and with the concurrency argument directly in the DAG.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    'concurrency_dev',
    default_args={
        'owner': 'airflow',
        'depends_on_past': False,
        'start_date': datetime(2018, 1, 1),
    },
    schedule_interval=None,
    catchup=False
)

for i in range(0, 40):
    BashOperator(
        task_id='concurrency_dev_{i}'.format(i=i),
        bash_command='sleep 60',
        dag=dag
    )
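For reference, the variant with the concurrency argument differs only in the DAG constructor; a minimal sketch (the value 64 simply mirrors parallelism and dag_concurrency from airflow.cfg):

from datetime import datetime

from airflow import DAG

# Same DAG as above, but with an explicit DAG-level concurrency cap.
# 64 is an illustrative value matching the airflow.cfg settings shown earlier.
dag = DAG(
    'concurrency_dev',
    default_args={
        'owner': 'airflow',
        'depends_on_past': False,
        'start_date': datetime(2018, 1, 1),
    },
    schedule_interval=None,
    catchup=False,
    concurrency=64
)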

Regardless, only 32 tasks are ever executed simultaneously.

Answer

If you have 2 workers and celeryd_concurrency = 16, then you're limited to 32 tasks. If non_pooled_task_slot_count = 32, you'd also be limited. Of course, parallelism and dag_concurrency need to be set above 32 not only on the webservers and schedulers, but on the workers too.
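As a sketch of what that means in airflow.cfg (the values are illustrative; the same settings have to be applied on every scheduler, webserver, and worker node, and the services restarted):

[core]
# Upper bound on task instances running at once across the whole installation
parallelism = 64
# Upper bound on task instances running at once within a single DAG
dag_concurrency = 64
# Slots available to tasks not assigned to a pool (Airflow 1.8.x setting)
non_pooled_task_slot_count = 64

[celery]
# Number of task processes each Celery worker runs
celeryd_concurrency = 32

With two workers at celeryd_concurrency = 32, that gives 64 worker slots, which lines up with the parallelism and non_pooled_task_slot_count limits above.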
