How to account for daylight saving time when using cron schedules in Airflow

Updated: 2024-10-13 06:21:01
Problem description

In Airflow, I'd like a job to run at a specific time each day in a non-UTC timezone. How can I go about scheduling this?

The problem is that once daylight saving time starts or ends, my job will run either an hour too soon or an hour too late. In the Airflow docs, it seems like this is a known issue:

In case you set a cron schedule, Airflow assumes you will always want to run at the exact same time. It will then ignore day light savings time. Thus, if you have a schedule that says run at end of interval every day at 08:00 GMT+1 it will always run end of interval 08:00 GMT+1, regardless if day light savings time is in place.

Has anyone else run into this issue? Is there a work around? Surely the best practice cannot be to alter all the scheduled times after Daylight Savings Time occurs?

Thanks.

Solution

Starting with Airflow 1.10, time-zone aware DAGs can be defined using time-zone aware datetime objects to specify start_date. For Airflow to schedule DAG runs always at the same time (regardless of a possible daylight-saving-time switch), use cron expressions to specify schedule_interval. To make Airflow schedule DAG runs with fixed intervals (regardless of a possible daylight-saving-time switch), use datetime.timedelta() to specify schedule_interval.

For example, consider the following code that, first, uses a cron expression to schedule two consecutive DAG runs, and then uses a fixed interval to do the same.

    import pendulum
    from airflow import DAG
    from datetime import datetime, timedelta

    # Time-zone aware start date: 2019-10-25 08:00 in Europe/Kiev
    START_DATE = datetime(
        year=2019,
        month=10,
        day=25,
        hour=8,
        minute=0,
        tzinfo=pendulum.timezone('Europe/Kiev'),
    )


    def gen_execution_dates(start_date, schedule_interval):
        dag = DAG(
            dag_id='id', start_date=start_date, schedule_interval=schedule_interval
        )
        execution_date = dag.start_date
        # Print the next two execution dates, converted to the DAG's time zone
        for i in range(1, 3):
            execution_date = dag.following_schedule(execution_date)
            print(
                f'[Run {i}: Execution Date for "{schedule_interval}"]:',
                dag.timezone.convert(execution_date),
            )


    # First with a cron expression, then with a fixed interval
    gen_execution_dates(START_DATE, '0 8 * * *')
    gen_execution_dates(START_DATE, timedelta(days=1))

Running the code produces the following output:

    [Run 1: Execution Date for "0 8 * * *"]: 2019-10-26 08:00:00+03:00
    [Run 2: Execution Date for "0 8 * * *"]: 2019-10-27 08:00:00+02:00
    [Run 1: Execution Date for "1 day, 0:00:00"]: 2019-10-26 08:00:00+03:00
    [Run 2: Execution Date for "1 day, 0:00:00"]: 2019-10-27 07:00:00+02:00

For the zone Europe/Kiev, daylight saving time in 2019 ends on 2019-10-27 at 04:00:00+03:00, when clocks move back to 03:00:00+02:00. That is, between Run 1 and Run 2 in our example.
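To check where the UTC offset changes for a given zone, one option is to sample the zone hour by hour. The following is a minimal sketch using only the Python standard library (zoneinfo, Python 3.9+); the helper name find_offset_change is hypothetical and not part of Airflow or pendulum, and hourly sampling is an assumption that only suits transitions falling on a whole hour.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def find_offset_change(zone_name, start_utc, hours):
    """Return (last sample with the old offset, first sample with the new one).

    Hypothetical helper: samples once per hour starting at start_utc and
    returns None if the UTC offset never changes within the window.
    """
    zone = ZoneInfo(zone_name)
    previous = start_utc.astimezone(zone)
    for step in range(1, hours + 1):
        current = (start_utc + timedelta(hours=step)).astimezone(zone)
        if current.utcoffset() != previous.utcoffset():
            return previous, current
        previous = current
    return None


# Scan 24 hours starting at local midnight on 2019-10-27 (21:00 UTC the day before)
start = datetime(2019, 10, 26, 21, 0, tzinfo=timezone.utc)
before, after = find_offset_change("Europe/Kiev", start, 24)
print("last sample with old offset: ", before.isoformat())
print("first sample with new offset:", after.isoformat())
```

The two returned values bracket the switch: the offset of the first is the summer-time offset, and the offset of the second is the standard-time offset.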

The first two output lines show that for the DAG runs scheduled with a cron expression, the first and second runs are both scheduled for 08:00 local time, although at different UTC offsets: Eastern European Summer Time (EEST, +03:00) and Eastern European Time (EET, +02:00) respectively.

The last two output lines show that for the DAG runs scheduled with a fixed interval the first run is scheduled for 08:00 (EEST), and the second run is scheduled exactly 1 day (24 hours) later, which is at 07:00 (EET) due to the daylight-saving-time switch.
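The same difference can be reproduced without Airflow. The sketch below uses only the Python standard library (zoneinfo, Python 3.9+) and is not how Airflow computes schedules internally; the function names next_wall_clock and next_fixed_interval are hypothetical. It contrasts "same local time on the next day" (what the cron expression gives) with "exactly 24 elapsed hours later" (what the timedelta gives).

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

KIEV = ZoneInfo("Europe/Kiev")


def next_wall_clock(local_dt):
    """Same local clock time on the next calendar day (cron-like)."""
    # Adding a timedelta to an aware datetime shifts the wall-clock fields
    # and keeps the tzinfo, so the UTC offset is re-evaluated for the new date.
    return local_dt + timedelta(days=1)


def next_fixed_interval(local_dt):
    """Exactly 24 elapsed hours later (fixed-interval-like)."""
    # Do the arithmetic in UTC so that 24 real hours pass, then convert back.
    utc_dt = local_dt.astimezone(timezone.utc) + timedelta(days=1)
    return utc_dt.astimezone(local_dt.tzinfo)


run_1 = datetime(2019, 10, 26, 8, 0, tzinfo=KIEV)
print(next_wall_clock(run_1).isoformat())      # 2019-10-27T08:00:00+02:00
print(next_fixed_interval(run_1).isoformat())  # 2019-10-27T07:00:00+02:00
```

Both results match the Run 2 lines in the output above: the wall-clock version stays at 08:00, while the fixed-interval version lands at 07:00 because the day of the switch is 25 hours long.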

The following figure illustrates the example:

Published: 2023-11-24 02:31:50
Article link: https://www.elefans.com/category/jswz/34/1623644.html