With Airflow, is it possible to restart an upstream task if a downstream task fails? This seems to be against the "Acyclic" part of the term DAG. I would think this is a common problem though.
Background
I'm looking into using Airflow to manage a data processing workflow that has been managed manually.
There is a task that will fail if a parameter x is set too high, but increasing the parameter value gives better quality results. We have not found a way to calculate a safe but maximally high parameter x. The process by hand has been to restart the job if failed with a lower parameter until it works.
The workflow is as follows:
Task A - Gather the raw data
Task B - Generate config file for job
Task C - Modify config file parameter x
Task D - Run the data manipulation Job
Task E - Process the job results
Task F - Generate reports
Problem
If task D fails because of parameter x being too high, I want to rerun task C and task D. This doesn't seem to be supported. I would really appreciate some guidance on how to handle this.
Recommended answer
First of all: this is an excellent question, and I wonder why it has not been discussed more widely until now.
I can think of two possible approaches
Fusing Operators: As pointed out by @Kris, combining the operators into one appears to be the most obvious workaround
Separate Top-Level DAGs: Read below
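For the first approach, here is a minimal sketch of what a fused "Task C + Task D" callable might look like in plain Python. It is only an illustration under assumptions: `ParameterTooHigh`, `run_with_decreasing_x`, `write_config`, and `run_job` are hypothetical names that do not come from the question, and the sketch assumes the job can signal a "parameter x too high" failure as a distinct exception.

```python
from typing import Callable, Iterable, Optional


class ParameterTooHigh(Exception):
    """Hypothetical error the job raises when parameter x is too high."""


def run_with_decreasing_x(
    candidates: Iterable[float],
    write_config: Callable[[float], None],
    run_job: Callable[[], None],
) -> float:
    """Try candidate values of x from highest to lowest.

    Returns the first (largest) x for which the job succeeds.
    Raises RuntimeError if the job fails for every candidate.
    """
    last_error: Optional[Exception] = None
    for x in sorted(candidates, reverse=True):
        write_config(x)  # the work previously done by Task C
        try:
            run_job()  # the work previously done by Task D
        except ParameterTooHigh as exc:
            # x was too high: remember the error and try the next lower value
            last_error = exc
            continue
        return x
    raise RuntimeError("job failed for every candidate x") from last_error
```

A function like this could be passed as the `python_callable` of a single `PythonOperator`, so the retry loop lives inside one task and the DAG itself stays acyclic. The trade-off is that Airflow then sees only one task, so the individual attempts are visible in that task's logs rather than as separate task instances.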
Separate Top-Level DAGs approach
Given
- Say you have tasks A & B
- A is upstream of B
- You want execution to resume (retry) from A if B fails
(Possible) Idea: If you're feeling adventurous
- Put tasks A & B in separate top-level DAGs, say DAG-A & DAG-B
- At the end of DAG-A, trigger DAG-B using TriggerDagRunOperator
- In all likelihood, you will also have to use an ExternalTaskSensor after TriggerDagRunOperator
Useful references
- Fusing Operators Together
- Wiring Top-Level DAGs together
EDIT-1
Here's a much simpler way that can achieve similar behaviour
How to rerun an upstream task if a downstream task fails in Airflow (using Sub Dag)