AirflowTaskTimeout after setting execution_timeout

My Airflow DAG keeps failing on the only task that I have.
I declared the execution_timeout as 300 seconds, but it keeps crashing after around 37 seconds.
The task scrapes a website without using Chromedriver.
I'm running Linux on a Raspberry Pi.

Here is the code:

from datetime import timedelta
import importlib
import sys

from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator
from airflow.utils.dates import days_ago

from airflow import DAG

from lib.jobs import jobs, linkedin_jobs, glassdoor_jobs
from lib import jobs_and_companies

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'email': ['firstname.lastname@live.fr'],
    'email_on_failure': True,
    'retries': 0,
    'execution_timeout': timedelta(hours=24)
}

dag = DAG(
    dag_id='jobs',
    default_args=default_args,
    description='Collecting jobs from boards.',
    concurrency=10,
    schedule_interval=timedelta(hours=24),
    start_date=days_ago(2),
    dagrun_timeout=timedelta(seconds=300),
)

linkedin_jobs_task = PythonOperator(
    task_id='linkedin_jobs',
    python_callable=linkedin_jobs.scrap_jobs(),
    dag=dag,
    start_date=days_ago(2),
    execution_timeout=timedelta(seconds=300),
)

Can you help me?

Thanks
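
One thing that may be worth checking: in `python_callable=linkedin_jobs.scrap_jobs()`, the parentheses call the function while the DAG file is being parsed, and the operator receives its return value instead of the function itself. If that is what is happening, the scrape would be running under Airflow's DAG import timeout (`dagbag_import_timeout` in the `[core]` section, 30 seconds by default as far as I know) rather than under `execution_timeout`, which could explain a failure after roughly 37 seconds. That is an assumption, since the traceback is not shown. Below is a minimal sketch of the difference that runs without Airflow; `FakeOperator` and this `scrap_jobs` are hypothetical stand-ins, not the real `PythonOperator` or the asker's `lib.jobs` code.

```python
# Records every time the stand-in scraper actually runs.
calls = []


def scrap_jobs():
    """Hypothetical stand-in for linkedin_jobs.scrap_jobs (the real one scrapes a site)."""
    calls.append("scraped")
    return None


class FakeOperator:
    """Hypothetical stand-in for PythonOperator.

    It only stores the callable at construction time and runs it in execute(),
    which is the part of the real operator's behaviour this sketch relies on.
    """

    def __init__(self, task_id, python_callable):
        self.task_id = task_id
        self.python_callable = python_callable

    def execute(self):
        return self.python_callable()


# With parentheses: scrap_jobs() runs right here, at definition (parse) time,
# and the operator is handed its return value (None), not the function.
eager = FakeOperator(task_id="linkedin_jobs", python_callable=scrap_jobs())
calls_after_eager_definition = len(calls)

# Without parentheses: the function object is stored and nothing runs yet.
lazy = FakeOperator(task_id="linkedin_jobs", python_callable=scrap_jobs)
calls_after_lazy_definition = len(calls)

# The work only happens when the task is executed.
lazy.execute()
calls_after_lazy_execute = len(calls)
```

If this applies to the DAG above, passing `python_callable=linkedin_jobs.scrap_jobs` (no parentheses) would defer the scrape to task run time, where `execution_timeout` is the limit in effect.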
