🧃
Juicin' up these open source contributor muscles in full force!

Sung Won Chung sungchun12
@sungchun12
sungchun12 / xdb_diff_demo.py
Created January 8, 2024 22:55
Demo script to run a Datafold Cloud xdb data diff between Databricks and Snowflake with simple API calls
"""
Demo script to run an xdb data diff between Databricks and Snowflake with simple API calls
"""
import os
import time
from pydantic import BaseModel
from typing import Any, List, Optional
import requests
from tabulate import tabulate
@sungchun12
sungchun12 / Dockerfile
Created July 16, 2023 20:30
Quick setup for Airflow using the Astro CLI
FROM quay.io/astronomer/astro-runtime:7.6.0
# Install python packages
RUN pip install -r requirements.txt
# Switch to root user for installing git
USER root
# Install git
RUN apt-get update && apt-get install -y git
@sungchun12
sungchun12 / Dockerfile
Created May 10, 2023 08:05
Simple quickstart for a dbt Core DAG with BashOperator using BigQuery
FROM quay.io/astronomer/astro-runtime:7.4.2
# Install apache-airflow-providers-google package
RUN pip install apache-airflow-providers-google
# Switch to root user for installing git
USER root
# Install git
RUN apt-get update && apt-get install -y git
@sungchun12
sungchun12 / constraints_example_snowflake_schema.yml
Last active November 4, 2022 21:22
Add column types, constraints, and default values when creating tables in dbt: https://www.loom.com/share/14020499f5f646b6bc80c909716850fd
version: 2
models:
  - name: constraints_example
    config:
      constraints_enabled: true
    columns:
      - name: id
        column_type: integer
        description: I want to describe this one, but I don't want to list all the columns
@sungchun12
sungchun12 / fake_data_example.py
Created September 29, 2022 18:56
Use this to generate fake data in your dbt pipelines as an alternative to dbt seeds with csv files: https://www.loom.com/share/90084f27396746619d4f53f44143faab
from faker import Faker
import pandas as pd
fake = Faker()
def create_rows_faker(num=1):
    # Each row is a dict of fake values; the duplicate "name" key has been dropped
    output = [{"name": fake.name(),
               "address": fake.address(),
               "email": fake.email(),
@sungchun12
sungchun12 / zero_copy_clone_pre_prod.sql
Last active October 4, 2022 22:04
dbt staging-to-production gatekeeper macro that catches problems before they reach production. Demo Video: https://www.loom.com/share/bcfd2cf3b4b5471683bfc5b24587db3d
{% macro clone_from_to(from, to) %}
{% set sql -%}
create schema if not exists {{ target.database }}.{{ to }} clone {{ from }};
{%- endset %}
{{ dbt_utils.log_info("Cloning tables/views from schema [" ~ from ~ "] into target schema [" ~ to ~ "]") }}
{% do run_query(sql) %}
/*
Welcome to your first dbt model!
Did you know that you can also configure models directly within SQL files?
This will override configurations stated in dbt_project.yml
Try changing "table" to "view" below
*/
-- force a dependency
-- depends_on: analytics.dbt_demo_account_sung.fct_orders
select *
from -- Need this here, since the actual ref is nested within loops/conditions:
-- depends on: analytics.dbt_demo_account_sung.dbt_metrics_default_calendar
(with source_query as (
select
@sungchun12
sungchun12 / table_constraints_demo.sql
Created May 10, 2022 21:03
BigQuery: Table with Constraints Custom Materialization
{{
  config(
    materialized = "table_with_constraints"
  )
}}
select
  1 as id,
  'blue' as color,
  cast('2019-01-01' as date) as date_day
{% macro clone_prod_to_target(from) %}
{% set sql -%}
create schema if not exists {{ target.database }}.{{ target.schema }} clone {{ from }};
{%- endset %}
{{ dbt_utils.log_info("Cloning schema " ~ from ~ " into target schema.") }}
{% do run_query(sql) %}
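
The macro preview above is cut off before its closing tag, but the statement it builds is visible. As a rough sketch only, the Python below mirrors how the `set sql` block would render, assuming the Jinja whitespace control (`-%}` and `{%-`) trims the surrounding newlines. The function name `render_clone_sql` and the example values `analytics`, `dbt_sung`, and `analytics.prod` are hypothetical and do not come from the gist.

```python
def render_clone_sql(database: str, target_schema: str, source_schema: str) -> str:
    """Return the statement the macro's `set sql` block would produce.

    `database` and `target_schema` stand in for dbt's `target.database`
    and `target.schema`; `source_schema` stands in for the `from` argument.
    """
    return (
        f"create schema if not exists {database}.{target_schema} "
        f"clone {source_schema};"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render_clone_sql("analytics", "dbt_sung", "analytics.prod"))
```

In dbt itself, `target.database` and `target.schema` are filled in from the active profile, so only the source schema has to be passed to the macro.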