jeremyyeo/README.md

## README.md

      
    Raw
  

              README.md
            
          
    Executing stored procedures from dbt

Note that dbt typically does not recommend using stored procedures - have a read of this blog post for more information.

This is written for Snowflake but should apply to your usual suspects too (Redshift, BigQuery, etc).

Create your stored procedure. Here is a rather simple one that inserts the current timestamp whenever the procedure is called.
create or replace procedure development.dbt_jyeo.foo_proc()
returns text
language sql
as
$$
begin
  create table if not exists development.dbt_jyeo.foo_logging(updated_at timestamp);
  insert into development.dbt_jyeo.foo_logging values (current_timestamp());
  return current_timestamp()::text;
end;
$$
;
Worth nothing that the above procedure can be executed by running this statement:
call development.dbt_jyeo.foo_proc();

Now that we have the procedure created, there's a couple of options on how to execute this via dbt.
Using hooks

Hooks are SQL commands that are executed at different times. Here I'm going to have my hook be executed AFTER my model foo is built.
-- models/foo.sql
{{
  config(
    post_hook = "call development.dbt_jyeo.foo_proc();"
  )
}}
select 1 as id
$ dbt run --select foo

22:30:21  Running with dbt=1.2.1
22:30:21  Unable to do partial parsing because profile has changed
22:30:23  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 309 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
22:30:23  
22:30:28  Concurrency: 1 threads (target='dev')
22:30:28  
22:30:28  1 of 1 START table model dbt_jyeo.foo .......................................... [RUN]
22:30:33  1 of 1 OK created table model dbt_jyeo.foo ..................................... [SUCCESS 1 in 4.26s]
22:30:33  
22:30:33  Finished running 1 table model in 0 hours 0 minutes and 10.11 seconds (10.11s).
22:30:33  
22:30:33  Completed successfully
22:30:33  
22:30:33  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Now if we inspect the detailed logs for the above dbt run command:
[22:30:31.266732 [debug] [Thread-1  ]: On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.2.1", "profile_name": "snowflake", "target_name": "dev", "node_id": "model.my_dbt_project.foo"} */
call development.dbt_jyeo.foo_proc();
[22:30:32.433910 [debug] [Thread-1  ]: SQL status: SUCCESS 1 in 1.17 seconds

You'll notice that we did call our stored procedure (after model foo was built). Now, if we select from our logging table:

Indeed we see a new row with a new timestamp has been inserted as expected.
Using operations

Operations are macros that you can run using the run-operation command command. As such, operations aren't actually a separate resource in your dbt project — they are just a convenient way to invoke a macro without needing to run a model. Read more.
Basically if all we want to do is to invoke our procedure WITHOUT building any models, then we can use operations here.
Let's create a new macro that uses the run_query function:
-- macros/run_proc.sql
{% macro run_proc() %}
  {% do run_query("call development.dbt_jyeo.foo_proc();") %}
{% endmacro %}

And now we can execute our macro (which in turn, executes our stored procedure):
$ dbt run-operation run_proc

22:38:30  Running with dbt=1.2.1

^ There isn't much to be printed out in the console but the detailed logs show that we did call our procedure:
[22:38:30.662955 [debug] [MainThread]: Using snowflake connection "macro_run_proc"
[22:38:30.663256 [debug] [MainThread]: On macro_run_proc: /* {"app": "dbt", "dbt_version": "1.2.1", "profile_name": "snowflake", "target_name": "dev", "connection_name": "macro_run_proc"} */
call development.dbt_jyeo.foo_proc();
[22:38:30.663544 [debug] [MainThread]: Opening a new connection, currently in state init

And as before, selecting from our logging table:

Indeed shows that we have inserted yet another row as expected.