Final Pipeline: Parameterized Redshift ETL with Airflow

Project Overview

This project implements a parameterized ETL pipeline to load song and log data from Amazon S3 into a Redshift data warehouse, using Apache Airflow for orchestration. The project emphasizes:

  • Dynamic DAG execution using parameters (dag_run.conf)
  • Modular code structure with reusable operators and helpers
  • Full ETL flow: staging → dimensions → facts → data quality
  • Logging and auditing of unmatched songplays
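
As a minimal sketch of the parameterization idea, the helper below merges a run-time configuration dictionary over a set of defaults. In Airflow, a task would typically read that dictionary from `context["dag_run"].conf`, which may be empty or `None` when the DAG is triggered without a configuration. The parameter names, default values, and the functions `resolve_params` and `s3_path` are hypothetical illustrations, not the project's actual code.

```python
from typing import Any, Mapping, Optional

# Hypothetical defaults: the project's real bucket, prefixes, and flags
# are not stated in this overview.
DEFAULT_PARAMS = {
    "s3_bucket": "example-bucket",
    "log_prefix": "log_data",
    "song_prefix": "song_data",
    "run_quality_checks": True,
}


def resolve_params(conf: Optional[Mapping[str, Any]]) -> dict:
    """Merge run-time configuration over the defaults.

    `conf` stands in for dag_run.conf. Unknown keys are rejected so that
    a typo in the trigger payload fails early instead of being ignored.
    """
    overrides = dict(conf or {})
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**DEFAULT_PARAMS, **overrides}


def s3_path(params: Mapping[str, Any], prefix_key: str) -> str:
    """Build the S3 location for one of the configured prefixes."""
    return f"s3://{params['s3_bucket']}/{params[prefix_key]}"


if __name__ == "__main__":
    # Triggered without a configuration: defaults apply.
    print(s3_path(resolve_params(None), "song_prefix"))
    # Triggered with an override for a single key.
    print(s3_path(resolve_params({"log_prefix": "log_data/2018/11"}), "log_prefix"))
```

Keeping the merge logic in a plain function, separate from any operator, means it can be unit-tested without a running Airflow instance and reused by each staging task.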
