Making dbt use a BigQuery project that is different to our production jobs when developing in the cloud IDE

If you use dbt Cloud with BigQuery, you may notice that when you go to edit your development credentials for the project, the "Project ID" field (a BigQuery "project" is the equivalent of a "database" in Snowflake) cannot be edited:

[Screenshot: development credentials page]

Note: This is different to working with Snowflake, where the database IS editable. In fact, with Snowflake you can use an entirely different credential set (username / password) for the user profile. This is not the case with a BigQuery connection, where there is a single account-level credential and only limited configuration is permitted at the user profile level.

This means that your production jobs and your development runs in the Cloud IDE will essentially be reading from / writing to the same project. Let's see how we can change this behaviour.

The example below is set up so that:

  • Production jobs build into the jeremyyeo-production project.
  • Development runs in the cloud IDE build into the jeremyyeo-sandbox project.

  1. Ensure that the Google Cloud credentials you're using in dbt Cloud actually have access to the two projects you want to use.

  2. Set the target name on your development credentials page to something different from what is set in your production job. In the screenshot above, I have set mine to dev; in my production job, I have set it to default (or rather, I left that default value alone). A quick way to verify which target is active is shown after the next screenshot:

[Screenshot: target name settings]
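If you want to sanity-check which target you're currently running against, one option (not part of the original setup, just a throwaway helper with an assumed file name) is to preview or compile a small snippet that echoes the target details:

```sql
-- analyses/check_target.sql (hypothetical helper)
-- Preview this in the cloud IDE and in a job to confirm which target
-- and default database are active in each context.
select
    '{{ target.name }}'     as target_name,
    '{{ target.database }}' as target_database
```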

  3. Add a generate_database_name() macro to override the standard behaviour of only ever using the jeremyyeo-production project:
```sql
-- macros/get_custom_database.sql

{% macro generate_database_name(custom_database_name=none, node=none) -%}
    {%- set default_database = target.database -%}
    {%- set development_database = 'jeremyyeo-sandbox' -%}
    {%- if custom_database_name is none -%}
        {%- if target.name == 'dev' -%}
            {{ development_database }}
        {%- else -%}
            {{ default_database }}
        {%- endif -%}
    {%- else -%}
        {{ custom_database_name | trim }}
    {%- endif -%}
{%- endmacro %}
```

Essentially, we're telling dbt to use a model's +database config if one exists and, if not, to use target.name (recall we have two values here: dev and default) to determine which of our two projects to build into.
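For example (the model name and the jeremyyeo-reporting project below are made up purely for illustration), a model that pins its own database is left untouched by the target-based switch:

```sql
-- models/finance_summary.sql (hypothetical model)
-- Because this model sets an explicit database config, the macro returns it
-- as-is, so the model builds in jeremyyeo-reporting under any target.
{{ config(database='jeremyyeo-reporting') }}

select 1 as id
```

A model without any database config, on the other hand, lands in jeremyyeo-sandbox when run from the IDE (target name dev) and in jeremyyeo-production when run by a production job.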

  4. Let's try a production job run:

[Screenshot: production job run results]

Notice that the job runs and builds our table in jeremyyeo-production as expected.

  5. Now let's try a run in the cloud IDE:

[Screenshot: cloud IDE run results]

Note that the runs now take place in our jeremyyeo-sandbox project instead.
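If you want to double-check where a given model actually landed, you can query it directly in BigQuery. The dataset and model names below are placeholders (assuming a development dataset of dbt_jyeo and a model called my_model):

```sql
-- Hypothetical check: the development run should have written to the sandbox project.
select count(*) as row_count
from `jeremyyeo-sandbox.dbt_jyeo.my_model`
```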
