This project implements a parameterized ETL pipeline to load song and log data from Amazon S3 into a Redshift data warehouse, using Apache Airflow for orchestration. The project emphasizes:
- Dynamic DAG execution using parameters (`dag_run.conf`)
- Modular code structure with reusable operators and helpers
- Full ETL flow: staging → dimensions → facts → data quality
- Logging and auditing of unmatched songplays
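
As a rough illustration of the parameterized execution mentioned above, the sketch below shows one way a task callable could merge run-time overrides from `dag_run.conf` onto defaults. The parameter names (`s3_bucket`, `log_prefix`, `song_prefix`), their values, and the helper `resolve_run_params` are hypothetical and are not taken from this project's code; the example uses a stand-in object instead of a real Airflow context so it runs without Airflow installed.

```python
from types import SimpleNamespace

# Hypothetical defaults for illustration; the project's real parameter
# names and values may differ.
DEFAULT_PARAMS = {
    "s3_bucket": "example-bucket",
    "log_prefix": "log_data",
    "song_prefix": "song_data",
}


def resolve_run_params(context, defaults=DEFAULT_PARAMS):
    """Return defaults overridden by any values found in dag_run.conf."""
    dag_run = context.get("dag_run")
    # conf can be None when the run was started without a payload.
    conf = (dag_run.conf if dag_run is not None else None) or {}
    # Reject misspelled keys early instead of silently ignoring them.
    unknown = set(conf) - set(defaults)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    return {**defaults, **conf}


# Stand-in for the context dict that Airflow passes to a task callable.
fake_context = {
    "dag_run": SimpleNamespace(conf={"log_prefix": "log_data/2018/11"})
}
params = resolve_run_params(fake_context)
print(params["log_prefix"])
```

With the Airflow 2 CLI, a payload like this is typically supplied at trigger time, for example `airflow dags trigger <dag_id> --conf '{"log_prefix": "log_data/2018/11"}'`, where `<dag_id>` is whatever ID the DAG is registered under.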