@schapala-hm
schapala-hm / pg_wal_cdc_pipeline.md
Last active April 14, 2026 20:49

Postgres WAL CDC Pipeline Documentation (Happy Robot → Snowflake)

Executive Summary

This pipeline is a production-grade Change Data Capture (CDC) system that streams PostgreSQL Write-Ahead Log (WAL) changes from Happy Robot's AWS RDS instance to Snowflake. Built with Python, Snowflake stored procedures, dbt, and Dagster, it replicates 215 tables across 6 schemas (~8 GB) and uses the same proven architecture as our MySQL Binlog CDC pipeline.

Key Metrics:

  • Tables: 215 across 6 schemas (cash_reconciliation, fin_approval_log, member, origination, program_access, transactions)
  • Initial Load: 193 tables, 28.7M rows, ~25 minutes
  • Latency: Hourly CDC cycles (configurable)
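
To put the initial-load figures in perspective, the rough calculation below derives an average throughput from the numbers quoted above (28.7M rows in about 25 minutes). This is only a back-of-the-envelope sketch: the function name `initial_load_throughput` is made up for illustration and is not part of the pipeline, and because the duration is approximate, the result should be read as an order-of-magnitude estimate.

```python
def initial_load_throughput(total_rows: int, duration_minutes: float) -> float:
    """Return the average rows per second for a bulk load.

    Hypothetical helper for illustration only; not part of the pipeline code.
    """
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    return total_rows / (duration_minutes * 60)


if __name__ == "__main__":
    # Figures taken from the Key Metrics list; the 25-minute duration is approximate.
    rows_per_second = initial_load_throughput(28_700_000, 25)
    print(f"Average initial-load throughput: ~{rows_per_second:,.0f} rows/s")
```

With the quoted figures this comes out to roughly 19,000 rows per second on average, across all 193 tables in the initial load.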