Skip to content

Instantly share code, notes, and snippets.

@jeremyyeo
Last active January 27, 2023 01:36
Show Gist options
  • Save jeremyyeo/4c5fb6db98a0cc3c876e718f80bb3420 to your computer and use it in GitHub Desktop.
Save jeremyyeo/4c5fb6db98a0cc3c876e718f80bb3420 to your computer and use it in GitHub Desktop.
Extracting Snowflake variant keys and mapping them to dbt vars so we can use them as column names #dbt

Extracting Snowflake variant keys and mapping them to dbt vars so we can use them as column names

  1. Create a table with variant/json data:
create or replace table development.dbt_jyeo.tbl_with_json as (
  select object_construct('my_key',object_construct('1', 'foo', '2', 'bar', '3', 'baz')) as variant_col
);

image

We will use this table to create a dbt model.

  1. Add vars to dbt_project.yml:
# dbt_project.yml
vars:
  1: foo
  2: bar

Note that it's probably not the best idea to have integers as variable names here.

  1. Add our dbt model with all the necessary components:
-- models/unpack.sql
{%- set query_keys -%}
select k.key 
  from development.dbt_jyeo.tbl_with_json t, 
       lateral flatten(input => t.variant_col:my_key) k;
{%- endset -%}

{%- set results = run_query(query_keys) -%}

{%- if execute -%}
    {%- set res = results.columns[0].values() -%}
{%- else -%}
    {%- set res = [] -%}
{%- endif -%}

select
{%- for key in res %}
    {%- set column_name = var(key | int, key) %}
    variant_col['my_key']['{{ key }}'] as col_{{ column_name }} {%- if not loop.last -%},{%- endif -%}
{%- endfor %}
 from development.dbt_jyeo.tbl_with_json
  1. Run the dbt model and inspect the logs (or compile the model):
01:25:54  On model.my_dbt_project.unpack: /* {"app": "dbt", "dbt_version": "1.3.2", "profile_name": "user", "target_name": "dev", "node_id": "model.my_dbt_project.unpack"} */
create or replace transient table development.dbt_jyeo.unpack  as
        (select
    variant_col['my_key']['1'] as col_foo,
    variant_col['my_key']['2'] as col_bar,
    variant_col['my_key']['3'] as col_3
 from development.dbt_jyeo.tbl_with_json
        );
01:25:55  SQL status: SUCCESS 1 in 1.32 seconds

The magic is in:

{%- set column_name = var(key | int, key) %}

Where we make key into an integer - because if not, it is a string and if you look at our vars, there is no variable named "1" (the string) but rather there is a variable named 1 (the integer). As you can tell - it's not a super good idea to have integers as variable names. Basically by doing:

vars:
  1: foo
  2: bar

We've constructed a vars dictionary:

vars = {1: "foo", 2: "bar"}  # NOT THE SAME AS vars = {"1": "foo", "2": "bar"}
vars[1]   # 'foo'
vars["1"] # KeyError: '1'

Knowing this, we could stringify our vars declaration:

vars:
  "1": foo
  "2": bar

And then do the following instead:

{%- set column_name = var(key, key) %}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment