Airflow: How to select BigQuery table data into a DataFrame

I am new to Airflow. I created my first DAG, shown below, which selects data from a Google BigQuery table and saves it to a pandas DataFrame.
I need suggestions on the following:

  1. Where should I provide the connection ID for BigQuery?
  2. Since pd.read_gbq requires authentication, how should that be handled in Airflow DAGs?

import os

import pandas as pd

from airflow.contrib.operators import bigquery_operator  #this will 
from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.hooks.base_hook import BaseHook



def test_gbq():
    sql = "SELECT * FROM `bi-projects-300910.raw_data.production_time_records_cdc`"
    df = pd.read_gbq(sql, dialect='standard')
    print(len(df))


default_args = {'owner': 'Martin james',
                'depends_on_past': False,
                'start_date': datetime(2018, 2, 26, 11, 30),
                'email_on_retry': True,
                'retries': 2
                }

dag = DAG('test_gbq',
          default_args=default_args,
          schedule_interval="0 12 * * *")



testing = PythonOperator(
    task_id="test",
    python_callable=test_gbq,
    dag=dag)

testing
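
One possible direction for both questions, offered as a sketch and not a confirmed answer: instead of calling pd.read_gbq directly, the callable could go through Airflow's BigQueryHook, which takes a connection ID and reads the credentials stored on that connection. This assumes Airflow 1.10-style import paths (matching the DAG above) and a Google Cloud Platform connection created under Admin -> Connections. The connection ID "my_bigquery_conn" and the function names make_hook and load_row_count are hypothetical, introduced only for illustration.

```python
# Sketch only: assumes Airflow 1.10-style contrib import paths.
# "my_bigquery_conn" is a hypothetical connection id that would be created
# in the Airflow UI (Admin -> Connections, type Google Cloud Platform).
BIGQUERY_CONN_ID = "my_bigquery_conn"

SQL = "SELECT * FROM `bi-projects-300910.raw_data.production_time_records_cdc`"


def make_hook(conn_id=BIGQUERY_CONN_ID):
    """Build a BigQueryHook bound to the given Airflow connection id."""
    # Imported inside the function so the import cost is paid only
    # when the task actually runs.
    from airflow.contrib.hooks.bigquery_hook import BigQueryHook

    # use_legacy_sql=False selects standard SQL, like dialect='standard'.
    return BigQueryHook(bigquery_conn_id=conn_id, use_legacy_sql=False)


def load_row_count(hook=None):
    """Run the query through the hook and return the number of rows."""
    # A hook can be passed in for local testing; otherwise one is built
    # from the Airflow connection.
    hook = hook or make_hook()
    df = hook.get_pandas_df(SQL)
    print(len(df))
    return len(df)
```

With this shape, the PythonOperator would use python_callable=load_row_count, and authentication would come from the key file or JSON stored on the connection. On Airflow 2.x the hook lives in the Google provider package and its connection argument is named differently, so the import and keyword would need adjusting.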
