
Understanding Airflow’s execution_date and schedule

New to Airflow, coming from cron, and trying to understand how the execution_date macro is applied both by the scheduler and when a DAG is triggered manually. I've read the FAQ and set up a schedule that I expected would execute with the correct execution_date macro filled in.

I would like to run my DAG weekly, on Thursday at 10am UTC. Occasionally I would run it manually. My understanding was that the DAG's start date should be one period behind the actual date I want the DAG to start. So, in order to execute the DAG today, on 4/9/2020, with a 4/9/2020 execution_date, I set up the following defaults:

import datetime as dt

default_args = {
    'owner': 'airflow',
    'start_date': dt.datetime(2020, 4, 2),  # one schedule interval before 4/9
    'concurrency': 4,
    'retries': 0
}

And the dag is defined as:

with DAG('my_dag',
         catchup=False,
         default_args=default_args,
         schedule_interval='0 10 * * 4',
         max_active_runs=1,
         concurrency=4,
         ) as dag:

    opr_exc = BashOperator(task_id='execute_dag',
                           bash_command='/path/to/script.sh --dt ')

While the DAG executed on time today, 4/9, it ran with a ds_nodash of 20200402 instead of 20200409. I'm still confused: catchup was turned off and the start date was one week prior, so I was expecting 20200409.
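This behavior follows from the scheduler's interval rule: a run covers the period [execution_date, execution_date + schedule_interval) and only fires once that period has fully elapsed. A minimal pure-Python sketch of that rule (this is not Airflow code; `scheduled_runs` is a hypothetical helper written to illustrate the timeline):

```python
from datetime import datetime, timedelta

def scheduled_runs(start_date, interval, now):
    """Sketch of the rule: each run covers [execution_date,
    execution_date + interval) and triggers at the end of that period."""
    runs = []
    period_start = start_date
    while period_start + interval <= now:
        runs.append({
            "execution_date": period_start,           # start of the period (ds)
            "triggered_at": period_start + interval,  # when the run actually fires
        })
        period_start += interval
    return runs

runs = scheduled_runs(
    start_date=datetime(2020, 4, 2, 10),  # Thursday 10:00 UTC, one week back
    interval=timedelta(weeks=1),
    now=datetime(2020, 4, 9, 10),         # "today" in the question
)
print(runs[0]["execution_date"].strftime("%Y%m%d"))  # 20200402, not 20200409
```

So the run that fires on 4/9 is the run *for* the week that started 4/2, which is why ds_nodash came out as 20200402.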

Now, I found another answer here that explains that execution_date marks the start of the period and is therefore always one period behind. So going forward, should I be using next_ds_nodash? Wouldn't that create a problem for manually triggered DAGs, since execution_date works as expected when a run is triggered on demand? Or does next_ds_nodash resolve to the same value as ds_nodash when triggered manually?
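To make the question concrete, here is a sketch of how the two macros would resolve for each run type, assuming the data-interval semantics that Airflow 2.x documents (where a manual run's data interval collapses to the trigger moment, so the two macros coincide). `resolve_macros` is a hypothetical illustration, not an Airflow API:

```python
from datetime import datetime, timedelta

def resolve_macros(logical_date, interval, manual):
    """Sketch (assumption, per Airflow 2.x data intervals): a scheduled run's
    interval spans one schedule_interval; a manual run's interval collapses
    to the trigger time, so ds_nodash == next_ds_nodash."""
    data_interval_start = logical_date
    data_interval_end = logical_date if manual else logical_date + interval
    return {
        "ds_nodash": data_interval_start.strftime("%Y%m%d"),
        "next_ds_nodash": data_interval_end.strftime("%Y%m%d"),
    }

week = timedelta(weeks=1)
# Scheduled run that fired on 2020-04-09: logical date is the period start.
print(resolve_macros(datetime(2020, 4, 2, 10), week, manual=False))
# Manual run triggered on 2020-04-09 10:00: both macros give that date.
print(resolve_macros(datetime(2020, 4, 9, 10), week, manual=True))
```

Under that assumption, next_ds_nodash gives 20200409 for the scheduled run and also 20200409 for a manual trigger on that day, which is the behavior the question is after.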

Question: Is there a happy medium that lets me correctly pass the execution_date macro to my weekly DAG both when it runs on schedule AND when it is triggered manually? What's the best practice here?
