I'm looking for a way to decide the DAG structure at the moment the DAG is triggered, because I don't know in advance how many operators I'll have to run.
Please see below for the execution sequence I'm planning to create.
         |-- Task B.1 --|                 |-- Task C.1 --|
         |-- Task B.2 --|                 |-- Task C.2 --|
Task A --|-- Task B.3 --|---> Task B ---> |-- Task C.3 --|
         |     ....     |                 |     ....     |
         |-- Task B.N --|                 |-- Task C.N --|

I'm not sure about the value of N.
Is this possible in Airflow? If so, how do I achieve it?
Thanks in advance.
Recommended answer

I had to do something similar in the past: I wrote a DAG that read a YAML file defining which tasks to create.
In my case, the number of tables I was extracting data from could change every week. Instead of re-deploying the DAG to production every time I needed to add a new table, I pointed the DAG at a YAML file that described which tables to extract. Whenever a new table came along, I simply edited the YAML file to add the new table's details.
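The YAML-driven approach described above can be sketched roughly as follows. This is only an illustration, not the answerer's actual code: the YAML layout (a top-level `tables` list), the task ids, the `dag_id`, and the helper names `plan_tasks` and `build_dag` are all hypothetical, and the Airflow part assumes Airflow 2.4 or later (for the `schedule` argument and `EmptyOperator`) with PyYAML installed. The operators are placeholders for whatever real work each task would do.

```python
from typing import Dict, List, Tuple


def plan_tasks(config: Dict) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return task ids and (upstream, downstream) edges for the
    Task A -> B.1..B.N -> Task B -> C.1..C.N shape from the question.

    Assumes a hypothetical config layout: {"tables": ["name1", "name2", ...]}.
    """
    tables = config["tables"]
    b_tasks = [f"task_b_{name}" for name in tables]
    c_tasks = [f"task_c_{name}" for name in tables]

    task_ids = ["task_a", *b_tasks, "task_b", *c_tasks]
    edges = [("task_a", b) for b in b_tasks]    # fan out from Task A
    edges += [(b, "task_b") for b in b_tasks]   # fan in to Task B
    edges += [("task_b", c) for c in c_tasks]   # fan out again to the C tasks
    return task_ids, edges


def build_dag(config_path: str):
    """Build an Airflow DAG whose width is set by the YAML file at config_path."""
    # Imported here so plan_tasks stays usable without Airflow or PyYAML installed.
    from datetime import datetime

    import yaml
    from airflow import DAG
    from airflow.operators.empty import EmptyOperator

    with open(config_path) as f:
        config = yaml.safe_load(f)

    task_ids, edges = plan_tasks(config)

    with DAG(
        dag_id="yaml_driven_dag",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule=None,             # run only when triggered
        catchup=False,
    ) as dag:
        # EmptyOperator is a stand-in for the real extract/load operators.
        tasks = {task_id: EmptyOperator(task_id=task_id) for task_id in task_ids}
        for upstream, downstream in edges:
            tasks[upstream] >> tasks[downstream]
    return dag
```

In a DAG file you would call something like `dag = build_dag("tables.yaml")` at module level (the path is a placeholder) so that Airflow picks the DAG up. Note that the YAML file is read whenever Airflow parses the DAG file, not at the instant a run is triggered, so an edit to the file affects runs that start after the next parse.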
I think it gets a bit trickier if an upstream task has to run first and its result then determines how many downstream tasks to run, as in the following similar question:
Generating dynamic tasks in Airflow based on the output of an upstream task