Skip to content

Instantly share code, notes, and snippets.

@jeremyyeo
Last active November 10, 2023 21:06
Show Gist options
  • Save jeremyyeo/7866afc4e36c1d6afb5663ca34330a82 to your computer and use it in GitHub Desktop.
Save jeremyyeo/7866afc4e36c1d6afb5663ca34330a82 to your computer and use it in GitHub Desktop.
How to correctly use a macro that returns a value to a hook #dbt

How to correctly use a macro that returns a value to a hook

Demonstrates https://docs.getdbt.com/docs/building-a-dbt-project/dont-nest-your-curlies#an-exception

Assuming we have a model with a hook like so:

-- models/foo.sql
{{
    config(
        post_hook = 'delete from {{ this }} where id = 1'
    )

}}

select 1 as id
 union
select 0 as id

We want to delete rows where id = 1 for whatever reason (it doesn't have to make sense here :P). When we run our model:

$ dbt --debug run
...
20:59:21  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
create or replace transient table development.dbt_jyeo.foo
         as
        (

select 1 as id
 union all
select 0 as id
        );
20:59:21  Opening a new connection, currently in state closed
20:59:23  SQL status: SUCCESS 1 in 2.0 seconds
20:59:23  Using snowflake connection "model.my_dbt_project.foo"
20:59:23  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
delete from development.dbt_jyeo.foo where id = 1
20:59:24  SQL status: SUCCESS 1 in 1.0 seconds
20:59:24  Timing info for model.my_dbt_project.foo (execute): 09:59:21.625271 => 09:59:24.065767
20:59:24  On model.my_dbt_project.foo: Close
...

We can see that things are working as expected and the hook deletes rows where id = 1.

Now, let's assume we want to have that condition come from a SQL query instead:

-- macros/get_from_db.sql
{% macro get_from_db() %}
    {% set res = run_query('select 1 as v') %}
    {% if execute %}
        {% set v = res[0][0] %}
    {% else %}
        {% set v = 0 %}
    {% endif %}
    {{ return(v) }}
{% endmacro %}

Note that my query does execute a SQL query and returns 1 but it can also return some value from a proper table - but that's not necessary for this exercise.

Now let's modify our model and use that macro in the hook:

-- models/foo.sql
{{
    config(
        post_hook = 'delete from {{ this }} where id = {{ get_from_db() }}'
    )

}}

select 1 as id
 union
select 0 as id

And then running it again:

$ dbt --debug run
...
21:01:18  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
create or replace transient table development.dbt_jyeo.foo
         as
        (

select 1 as id
 union all
select 0 as id
        );
21:01:18  Opening a new connection, currently in state closed
21:01:19  SQL status: SUCCESS 1 in 2.0 seconds
21:01:19  Using snowflake connection "model.my_dbt_project.foo"
21:01:19  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
select 1 as v
21:01:20  SQL status: SUCCESS 1 in 0.0 seconds
21:01:20  Using snowflake connection "model.my_dbt_project.foo"
21:01:20  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
delete from development.dbt_jyeo.foo where id = 1
21:01:22  SQL status: SUCCESS 1 in 2.0 seconds
21:01:22  Timing info for model.my_dbt_project.foo (execute): 10:01:18.355574 => 10:01:22.129235
21:01:22  On model.my_dbt_project.foo: Close
...

We can see that it pretty much works exactly as before - just that the result from get_from_db() is injected into the post hook string.

Now, some users may make the mistake of not NESTING the curlies:

-- models/foo.sql
{{
    config(
        post_hook = 'delete from {{ this }} where id = ' ~ get_from_db()
    )

}}

select 1 as id
 union
select 0 as id

Let's see what happens:

$ dbt --debug run
...
21:03:25  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
create or replace transient table development.dbt_jyeo.foo
         as
        (

select 1 as id
 union all
select 0 as id
        );
21:03:27  SQL status: SUCCESS 1 in 1.0 seconds
21:03:27  Using snowflake connection "model.my_dbt_project.foo"
21:03:27  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
delete from development.dbt_jyeo.foo where id = 0
21:03:27  SQL status: SUCCESS 1 in 0.0 seconds
21:03:27  Timing info for model.my_dbt_project.foo (execute): 10:03:25.563286 => 10:03:27.542944
21:03:27  On model.my_dbt_project.foo: Close
...

Instead here - since we're calling get_from_db() directly (instead of the pattern showed above) - at that time, dbt does not yet execute any SQL queries - so it would return 0 instead. And this 0 value being fixed ahead of time is what is concatenated to the post hook and results in a SQL statement that perhaps you did not want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment