@troyharvey
Last active May 12, 2024 08:25
GitHub Action for running the getdbt.com dbt CLI with BigQuery

Using GitHub Actions to run dbt

This example shows you how to use GitHub Actions to run dbt against BigQuery.

  1. Follow the instructions on getdbt.com for installing and initializing a dbt project.

  2. Copy this workflow (dbt.yml) into your repository's .github/workflows directory.

     mkdir -p .github/workflows
     cp ~/Downloads/dbt.yml .github/workflows/

  3. Follow the instructions on getdbt.com for creating a BigQuery service account, download the JSON key file, and paste its contents into a GitHub Secret named DBT_GOOGLE_BIGQUERY_KEYFILE.

  4. In dbt.yml, replace the values of the DBT_GOOGLE_PROJECT and DBT_GOOGLE_BIGQUERY_DATASET environment variables with your own Google project ID and BigQuery dataset.

  5. Push to the master branch on GitHub and watch the workflow run in the Actions tab.
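
Step 3 can also be done from a terminal instead of the repository settings page. The following is a minimal sketch, assuming the GitHub CLI (gh) is installed and authenticated for the repository. The key path is a hypothetical download location, not something this gist prescribes.

```shell
# Hypothetical location of the downloaded service account key; override KEY_PATH as needed.
KEY_PATH="${KEY_PATH:-$HOME/Downloads/dbt-service-account.json}"

if [ -f "$KEY_PATH" ] && command -v gh >/dev/null 2>&1; then
  # gh reads the secret value from standard input, so the whole JSON file becomes the secret.
  gh secret set DBT_GOOGLE_BIGQUERY_KEYFILE < "$KEY_PATH"
else
  echo "Key file or gh CLI not found; add the secret in the repository settings instead." >&2
fi
```

The secret has to hold the full contents of the key file, because the workflow writes that value back to disk before dbt runs.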

# .github/workflows/dbt.yml
name: dbt
on:
  push:
    branches:
      - master
env:
  DBT_PROFILES_DIR: ./
  DBT_GOOGLE_PROJECT: your-gcp-project-id
  DBT_GOOGLE_BIGQUERY_DATASET: bigquery-dataset-id
  DBT_GOOGLE_BIGQUERY_KEYFILE: ./.gcloud/dbt-service-account.json
jobs:
  dbt:
    name: dbt
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Install dbt Core and the BigQuery adapter.
      - run: pip3 install dbt-core dbt-bigquery
      - run: dbt --version
      # Create the target directory first, then write the key from the GitHub Secret to disk.
      - run: |
          mkdir -p ./.gcloud
          echo "$KEYFILE" > ./.gcloud/dbt-service-account.json
        shell: bash
        env:
          KEYFILE: ${{ secrets.DBT_GOOGLE_BIGQUERY_KEYFILE }}
      - run: dbt run
      - run: dbt test

# profiles.yml
# Add this file to the root of your dbt project and use environment variables
# to configure dev vs production. I recommend using direnv locally with .envrc files.
default:
  target: bigquery
  outputs:
    bigquery:
      type: bigquery
      method: service-account
      keyfile: "{{ env_var('DBT_GOOGLE_BIGQUERY_KEYFILE') }}"
      project: "{{ env_var('DBT_GOOGLE_PROJECT') }}"
      dataset: "{{ env_var('DBT_GOOGLE_BIGQUERY_DATASET') }}"
      threads: 10
      timeout_seconds: 300
      location: US
      priority: interactive
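
For local runs, the profile needs the same environment variables that the workflow sets. The following is a minimal sketch of an .envrc file for direnv. Every value is a hypothetical placeholder (the dataset name and key path in particular are assumptions), so substitute your own.

```shell
# .envrc (hypothetical values; replace with your own project, dataset, and key path)

# Tell dbt to look for profiles.yml in the project root, where this file lives.
export DBT_PROFILES_DIR="$PWD"

# Read by env_var() in the profile.
export DBT_GOOGLE_PROJECT="your-gcp-project-id"
export DBT_GOOGLE_BIGQUERY_DATASET="dev_dataset"
export DBT_GOOGLE_BIGQUERY_KEYFILE="$HOME/.gcloud/dbt-service-account.json"
```

Run `direnv allow` once in the project directory so direnv loads the file. Pointing the dataset variable at a development dataset locally keeps local runs separate from the dataset the workflow writes to.
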
@SoumayaMauthoorMOJ

I recommend using dbt build instead of separate dbt run and dbt test steps. With dbt build, you can choose not to run models that are downstream of models with failed tests.
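
As a rough sketch of that suggestion, the single command below would stand in for the two separate commands. It assumes a dbt version that includes the build command (0.21 or later); the PATH check is only an illustrative guard for machines without dbt installed.

```shell
# Assumption: dbt 0.21 or later, where `dbt build` is available.
if command -v dbt >/dev/null 2>&1; then
  # Runs and tests resources in dependency order, replacing `dbt run` followed by `dbt test`.
  dbt build
else
  echo "dbt is not installed; try: pip3 install dbt-core dbt-bigquery" >&2
fi
```

In the workflow, this would mean replacing the last two run steps with a single step that runs dbt build.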
