-! 🚨 WARNING 🚨 !-
This is an advanced dbt feature and is not recommended for users who are new to dbt.
The default dbt materialization that does inserts into a target table by selecting from another source table is the incremental
materialization. By default, the incremental materialization requires that all columns in the target table are accounted for in the model code - this means that it may be challenging if you want exclude certain columns from being inserted by dbt (perhaps your target table has a self incrementing integer primary key column and you want the database to increment this automatically instead of having dbt do it on your behalf).
We can try to solve this by creating our own custom materialization that essentially inserts rows for us without having to know of each column in the target table. These next steps assume that one is running on Snowflake.
(1) Add the m13n.sql
macro attached to this gist to your macros/
folder.
(2) Create the source table in Snowflake that we are going to select from later on:
create or replace table development.dbt_jyeo.my_source (
user_name varchar,
user_status varchar
);
insert into development.dbt_jyeo.my_source (user_name, user_status)
values ('alice', 'active'),
('bob', 'inactive')
;
(3) Create the target table in Snowflake with the self incrementing user_id
primary key:
create or replace table development.dbt_jyeo.my_model (
user_id int not null identity(1, 1),
user_name varchar,
user_status varchar
);
You can try selecting from this table to confirm that there are 0 rows in it.
Note: the way the custom m13n
materialization is written, it expects your target table to already exist (and will error if not) which is different to the built in dbt materializations that will create your target table for you if they do not exist.
(4) Add a dbt yaml file specifying our source:
# models/sources.yml
version: 2
sources:
- name: dbt_jyeo
tables:
- name: my_source
(5) Add a dbt model that selects from the source using our custom m13n
materialization:
-- models/my_model.sql
{{ config(materialized = 'm13n') }}
select user_name, user_status from {{ source('dbt_jyeo', 'my_source') }}
(6) Build the my_model
model via dbt run
- the logs should read something like the following:
01:17:49 Running with dbt=1.0.1
01:17:49 Found 1 model, 0 tests, 0 snapshots, 0 analyses, 180 macros, 0 operations, 0 seed files, 1 source, 0 exposures, 0 metrics
01:17:49
01:17:55 Concurrency: 4 threads (target='dev')
01:17:55
01:17:55 1 of 1 START m13n model dbt_jyeo.my_model....................................... [RUN]
01:17:59 m13n model development.dbt_jyeo.my_model.....inserting rows
01:18:01 m13n model development.dbt_jyeo.my_model.....inserted rows
01:18:02 1 of 1 OK created m13n model dbt_jyeo.my_model.................................. [SUCCESS 1 in 6.75s]
01:18:02
01:18:02 Finished running 1 m13n model in 12.23s.
01:18:02
01:18:02 Completed successfully
01:18:02
01:18:02 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
(7) Selecting from the table in Snowflake should reveal that the user_id
columns has self populated:
+---------+-----------+-------------+
| user_id | user_name | user_status |
+=========+===========+=============+
| 1 | alice | active |
+---------+-----------+-------------+
| 2 | bob | inactive |
+---------+-----------+-------------+
As shown, being able to write your own custom materialization opens up entire new ways to update tables with dbt - though it is worth mentioning again that custom materializations are meant for advanced dbt users and not so much for those who are just getting started.
For more help on using dbt, be sure to join the community on Slack.