The setting catchup_by_default=False in airflow.cfg does not seem to work. Adding catchup=False to the DAG doesn't work either.
Here's how to reproduce the issue. I always start from a clean slate by running airflow resetdb. As soon as I unpause the DAG, the tasks start to backfill.
Here's the setup for the DAG. I'm just using the tutorial example.
from datetime import datetime, timedelta

from airflow import DAG

default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "start_date": datetime(2018, 9, 16),
    "email": ["airflow@airflow"],
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# catchup=False is expected to stop the scheduler from backfilling past intervals
dag = DAG(
    "tutorial",
    default_args=default_args,
    schedule_interval=timedelta(1),
    catchup=False,
)

Recommended answer
As @dlamblin mentioned, and as the docs also state, Airflow creates a single DagRun for the most recent valid interval. catchup=False instructs the scheduler to create a DAG Run only for the most current instance of the DAG interval series.
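To make "the most recent valid interval" concrete, here is a minimal pure-Python sketch of the idea. It is an illustration only, not Airflow's scheduler code: the helper names latest_interval_start and runs_to_create are hypothetical, and the sketch assumes that an interval becomes eligible to run only once it has fully elapsed.

```python
from datetime import datetime, timedelta


def latest_interval_start(start_date, interval, now):
    """Return the start of the most recent fully elapsed interval, or None."""
    completed = (now - start_date) // interval  # whole intervals that have ended
    if completed < 1:
        return None  # no interval has finished yet
    return start_date + (completed - 1) * interval


def runs_to_create(start_date, interval, now, catchup):
    """List the interval starts that would get a run (hypothetical helper)."""
    latest = latest_interval_start(start_date, interval, now)
    if latest is None:
        return []
    if not catchup:
        return [latest]  # only the most recent interval
    count = (latest - start_date) // interval + 1
    return [start_date + i * interval for i in range(count)]  # full backfill


start = datetime(2018, 9, 16)
day = timedelta(days=1)
now = datetime(2018, 9, 20, 12, 0)

no_catchup = runs_to_create(start, day, now, catchup=False)
with_catchup = runs_to_create(start, day, now, catchup=True)
print(no_catchup)         # one entry: the interval starting 2018-09-19
print(len(with_catchup))  # four entries: 2018-09-16 through 2018-09-19
```

With the question's start date, this shows the difference the asker expected to see: one run with catchup=False versus one run per elapsed day with catchup=True.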
However, there was a bug when using a timedelta for schedule_interval instead of a cron expression or cron preset. This has been fixed in Airflow master with github/apache/airflow/pull/8776. We will release Airflow 1.10.11 with this fix.
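Since the bug is described as affecting only timedelta schedules, a possible workaround on versions before the fix is to express the same daily schedule as a cron preset. The sketch below is the question's DAG with that one change. It assumes that "@daily" is an acceptable substitute for timedelta(1) in this setup; it has not been verified against an affected version.

```python
from datetime import datetime, timedelta

from airflow import DAG

default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "start_date": datetime(2018, 9, 16),
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# Assumption: a cron preset avoids the timedelta-specific bug described above.
dag = DAG(
    "tutorial",
    default_args=default_args,
    schedule_interval="@daily",  # cron preset instead of timedelta(1)
    catchup=False,
)
```

Note that "@daily" fires at midnight, so it matches timedelta(1) only when start_date is itself at midnight, as it is here.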