- Divides large tables into smaller, hierarchical partitions
- Example: Two-level partitioning by year and manufacturer for auto sales data
- Benefits: Improved query performance, efficient data management, parallel query execution, simplified maintenance, and better data organization
- Components: PostgreSQL, S3 storage, `pg_analytics` Foreign Data Wrapper (FDW), DuckDB
- Key steps:
  - Data generation and organization (local and S3)
  - Database setup (partitioned table structure and FDW connections)
  - Test cases (total sales, average price, monthly sales assertions)
- Root partitioned table in PostgreSQL
- Year-level partitions
- Manufacturer-level foreign tables linked to S3 Parquet files
- Leverages PostgreSQL's partition pruning and `pg_analytics` for efficient S3 data querying
- Transparent integration with PostgreSQL
- Efficient query optimization for S3-stored data
- Scalable and cost-effective storage solution
- Improved performance for large, partitioned datasets
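
The hierarchy described above (root table, year-level partitions, manufacturer-level foreign tables) can be sketched as generated DDL. This is a minimal illustration, not the test's actual setup code: the table and column names, the `auto_sales_server` foreign server, the bucket name, and the `year=.../manufacturer=...` S3 path layout are all assumptions made for the example. The `files` option on the foreign table is assumed to follow the `pg_analytics` Parquet FDW convention, and creating the wrapper, server, and S3 credentials is omitted.

```python
import re


def build_partition_ddl(years, manufacturers, bucket="example-bucket"):
    """Return DDL strings for a year -> manufacturer partition hierarchy.

    All object names and the S3 layout are hypothetical placeholders.
    """
    statements = [
        # Level 0: root table, partitioned by year.
        "CREATE TABLE auto_sales (\n"
        "    sale_id BIGINT,\n"
        "    sale_date DATE,\n"
        "    manufacturer TEXT,\n"
        "    price DOUBLE PRECISION,\n"
        "    year INT\n"
        ") PARTITION BY LIST (year);"
    ]
    for year in years:
        year_table = f"auto_sales_y{year}"
        # Level 1: one partition per year, itself partitioned by manufacturer.
        statements.append(
            f"CREATE TABLE {year_table} PARTITION OF auto_sales\n"
            f"    FOR VALUES IN ({year}) PARTITION BY LIST (manufacturer);"
        )
        for maker in manufacturers:
            # Derive an identifier-safe suffix and an escaped SQL literal.
            suffix = re.sub(r"\W+", "_", maker).lower()
            literal = maker.replace("'", "''")
            path = f"s3://{bucket}/year={year}/manufacturer={literal}/data.parquet"
            # Level 2: leaf partition is a foreign table backed by a Parquet file.
            # Server name and the `files` option are assumptions for this sketch.
            statements.append(
                f"CREATE FOREIGN TABLE {year_table}_{suffix}\n"
                f"    PARTITION OF {year_table} FOR VALUES IN ('{literal}')\n"
                f"    SERVER auto_sales_server OPTIONS (files '{path}');"
            )
    return statements


if __name__ == "__main__":
    for stmt in build_partition_ddl([2023, 2024], ["Toyota", "Honda"]):
        print(stmt, end="\n\n")
```

With this layout, a query that filters on both `year` and `manufacturer` lets PostgreSQL prune down to a single leaf, so only the matching Parquet file needs to be read from S3.
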
RUST_LOG=info cargo test --test test_mlp_auto_sales -- --nocapture
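
The test cases assert on total sales, average price, and monthly sales. As a rough sketch of how expected values for such assertions might be derived from the generated data, the snippet below computes the three aggregates over a handful of made-up rows. The row shape, the sample values, and the reading of "monthly sales" as a count of sales per month are assumptions for illustration; the actual checks live in `test_mlp_auto_sales`.

```python
from collections import defaultdict

# Hypothetical sample rows: (year, month, manufacturer, price)
SALES = [
    (2023, 1, "Toyota", 20000.0),
    (2023, 1, "Honda", 22000.0),
    (2023, 2, "Toyota", 24000.0),
    (2024, 1, "Toyota", 28000.0),
]


def total_sales(rows):
    """Sum of prices per (year, manufacturer)."""
    totals = defaultdict(float)
    for year, _month, maker, price in rows:
        totals[(year, maker)] += price
    return dict(totals)


def average_price(rows):
    """Mean price per manufacturer."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _year, _month, maker, price in rows:
        sums[maker] += price
        counts[maker] += 1
    return {maker: sums[maker] / counts[maker] for maker in sums}


def monthly_sales(rows):
    """Number of sales per (year, month)."""
    counts = defaultdict(int)
    for year, month, _maker, _price in rows:
        counts[(year, month)] += 1
    return dict(counts)


if __name__ == "__main__":
    print(total_sales(SALES))
    print(average_price(SALES))
    print(monthly_sales(SALES))
```

In a test, values like these would be compared against the results of the equivalent `SUM`, `AVG`, and `COUNT` queries run on the partitioned table.
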
For more information, contact the ParadeDB team or join the ParadeDB Slack Community.