
Airflow cheatsheet commands

Help

airflow -h
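
Each subcommand also has its own help, e.g.:

airflow backfill -h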

Setup

~/airflow is the default home directory

You can change this with

export AIRFLOW_HOME=~/airflow
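
To persist this across shell sessions (a sketch assuming a bash shell with ~/.bashrc as the profile), append it to your profile:

echo 'export AIRFLOW_HOME=~/airflow' >> ~/.bashrc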

Installation

Airflow

pip install apache-airflow

Install the postgres extra (subpackage)

pip install apache-airflow[postgres]
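
If your shell expands square brackets (zsh, for example), quote the extras specifier:

pip install 'apache-airflow[postgres]'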

Initialize the airflow database

airflow initdb
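
You can confirm the CLI is installed and see which version you are on with:

airflow version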

Start the web server

airflow webserver -p 8080

Start the scheduler

airflow scheduler
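
Both the webserver and the scheduler can be run in the background with the daemon flag (-D) in Airflow 1.x:

airflow webserver -p 8080 -D
airflow scheduler -D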

Good to know

  • You can inspect airflow.cfg directly, or through the UI via the Admin->Configuration menu.
  • The PID file for the webserver will be stored in $AIRFLOW_HOME/airflow-webserver.pid (see the snippet after this list),
    • or in /run/airflow/webserver.pid if started by systemd.
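
A quick way to stop a webserver that was started outside systemd, assuming the default PID file location:

kill $(cat $AIRFLOW_HOME/airflow-webserver.pid)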

Validating the script & metadata

prints the list of active DAGs

airflow list_dags
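
In Airflow 1.x, list_dags also accepts -r/--report for a DagBag loading report (per airflow list_dags -h):

airflow list_dags -r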

prints the list of tasks for the "my_tutorial" dag_id

airflow list_tasks my_tutorial

prints the hierarchy of tasks in the my_tutorial DAG

airflow list_tasks my_tutorial --tree

Testing

Testing a DAG's task

Command format:

command subcommand dag_id task_id date
airflow test tutorial print_date 2019-12-18
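
To test another task instance the same way (assuming the stock tutorial DAG from the Airflow docs, which also defines a sleep task):

airflow test tutorial sleep 2019-12-18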

Note that the airflow test command:

  • runs task instances locally,
  • outputs their log to stdout (on screen),
  • doesn’t bother with dependencies (e.g. the task run-order graph),
  • doesn’t communicate state (running, success, failed, …) to the database,
  • is meant for testing a single task instance.

Backfill run on a date range

airflow backfill tutorial -s 2019-12-18 -e 2019-12-20
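
Backfills can also be restricted to tasks whose id matches a regex via -t (see airflow backfill -h in 1.x):

airflow backfill tutorial -t print_date -s 2019-12-18 -e 2019-12-20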

