@mtustin-handy
Last active June 13, 2018 03:51
How not to structure subdags (unless you want them to run on their own schedule). Module bad_dags
from airflow.models import DAG
from airflow.operators import PythonOperator, SubDagOperator
from bad_dags.subdag import hive_dag
from datetime import timedelta, datetime

main_dag = DAG(
    dag_id='main_dag',
    schedule_interval=timedelta(hours=1),
    start_date=datetime(2015, 9, 18, 21)
)

# Obviously, this doesn't make sense without some other tasks
transform_hive = SubDagOperator(
    subdag=hive_dag,
    task_id='hive_transform',
    dag=main_dag,
    # 'all_done' is the value of TriggerRule.ALL_DONE:
    # run once all upstream tasks have finished, whatever their state
    trigger_rule='all_done'
)

# subdag.py
from airflow.models import DAG
from airflow.operators import HiveOperator
from datetime import timedelta, datetime

# This will be run on its own schedule as well as via the subdag operator
hive_dag = DAG('main_dag.hive_transform',
               # note the repetition here
               schedule_interval=timedelta(hours=1),
               start_date=datetime(2015, 9, 18, 21))

# send_charge_hql (the HQL to run) is not defined in this snippet
hive_transform = HiveOperator(task_id='flatten_tables',
                              hql=send_charge_hql,
                              dag=hive_dag)
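
The subdag above gets its own schedule because the DAG object sits at module top level, where DAG discovery can pick it up like any other DAG. A commonly suggested alternative is to build the subdag inside a factory function, so no DAG object is bound to a top-level name. The sketch below only models that idea and does not import Airflow: `DAG` is a hypothetical stand-in class, `collect_dags` is a simplified assumption about how discovery scans a module, and `make_hive_subdag` is an illustrative name, not something from the gist.

```python
# Stand-in sketch: none of these classes or functions come from Airflow.
import types


class DAG(object):
    """Hypothetical stand-in for airflow.models.DAG; stores only the dag_id."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


def collect_dags(module):
    """Simplified model of DAG discovery (an assumption, not Airflow's code):
    return the ids of DAG objects bound to top-level names in a module."""
    return sorted(
        value.dag_id
        for value in vars(module).values()
        if isinstance(value, DAG)
    )


# Layout from the gist: the subdag is a top-level object in its own module.
bad_module = types.ModuleType('bad_subdag')
bad_module.hive_dag = DAG('main_dag.hive_transform')


# Alternative layout: the module exposes only a factory function.
def make_hive_subdag(parent_dag_id, task_id):
    """Build the subdag on demand; the id follows the parent.child pattern."""
    return DAG('%s.%s' % (parent_dag_id, task_id))


good_module = types.ModuleType('good_subdag')
good_module.make_hive_subdag = make_hive_subdag

found_bad = collect_dags(bad_module)
found_good = collect_dags(good_module)
subdag = good_module.make_hive_subdag('main_dag', 'hive_transform')

print(found_bad)      # ['main_dag.hive_transform']
print(found_good)     # []
print(subdag.dag_id)  # main_dag.hive_transform
```

With real Airflow, the main DAG file would pass the factory's return value straight to `SubDagOperator(subdag=...)` instead of importing a module-level `hive_dag`.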
@debuggingfuture

Do you refer to hive_dag on L13 of subdag.py, or to main_dag?