Skip to content

Instantly share code, notes, and snippets.

@darkdreamingdan
Created August 25, 2022 09:11
Show Gist options
  • Save darkdreamingdan/c5ded709a90fc3c5b420cee5f644f499 to your computer and use it in GitHub Desktop.
Save darkdreamingdan/c5ded709a90fc3c5b420cee5f644f499 to your computer and use it in GitHub Desktop.
Python Extensions for DBT Jinja
-- Modified from the jaffle_shop dbt example. See the first 3 lines and final line for usage of custom js module
{{ print("HI") }}
{{ print("Running whitelisted function: " ~ modules.datetime.datetime(2002, 10, 27, 6, 0, 0)) }}
{{ print("Running custom function: " ~ modules.js.some_module.get_os_name()) }}
with customers as (
select * from {{ ref('stg_customers') }}
),
orders as (
select * from {{ ref('stg_orders') }}
),
payments as (
select * from {{ ref('stg_payments') }}
),
customer_orders as (
select
customer_id,
min(order_date) as first_order,
max(order_date) as most_recent_order,
count(order_id) as number_of_orders
from orders
group by customer_id
),
customer_payments as (
select
orders.customer_id,
sum(amount) as total_amount
from payments
left join orders on
payments.order_id = orders.order_id
group by orders.customer_id
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order,
customer_orders.most_recent_order,
customer_orders.number_of_orders,
customer_payments.total_amount as customer_lifetime_value
from customers
left join customer_orders
on customers.customer_id = customer_orders.customer_id
left join customer_payments
on customers.customer_id = customer_payments.customer_id
)
-- Use custom python to do `select * from final`
{{ modules.js.some_module.select_final() }}
#!/usr/bin/env python3
"""
Custom `jsdbt` command that injects a `js` library into DBT.
- This library modifies the DBT library at runtime to whitelist 'js' (and all submodules) as an allowed library within Jinja.
- We use shims to modify, so that a fork of DBT does not have to be mantained.
- It then runs dbt exactly as the `dbt` command would do
Notes:
- This works by overriding dbt.context.base.get_context_modules. If future DBT versions move or change this
whitelist location, this shim will need updating correspondingly.
- New packages should use namespace packages feature to add custom functions into the 'js' namespace
https://packaging.python.org/en/latest/guides/packaging-namespace-packages/
- This command could be shipped as part of a `js` master pip package.
- We could decide to overwrite the `dbt` command instead of having a seperate `jsdbt`, but this is probably easier to mantain.
"""
from dbt.main import main
import dbt.context.base
from dbt.context.base import get_context_modules as _get_context_modules
import importlib
import pkgutil
import js
def import_submodules(package, recursive=True):
""" Helper function to Import all submodules of a module, recursively, including subpackages
We use this to ensure anything under the 'js' namespace is loaded in advance
https://stackoverflow.com/a/25562415
"""
if isinstance(package, str):
package = importlib.import_module(package)
results = {}
for loader, name, is_pkg in pkgutil.walk_packages(package.__path__):
full_name = package.__name__ + '.' + name
results[full_name] = importlib.import_module(full_name)
if recursive and is_pkg:
results.update(import_submodules(full_name))
return results
import_submodules(js)
def new_get_context_modules():
"""
Append the js module into the whitelisted DBT modules within Jinja
"""
modules = _get_context_modules()
modules["js"] = js
return modules
# Shim the original DBT whitelist to include our new whitelist
dbt.context.base.get_context_modules = new_get_context_modules
main()
import platform
def get_os_name():
return platform.platform()
def select_final():
return "select * from final"
> jsdbt run
00:51:05 Running with dbt=1.2.0
HI
Running whitelisted function: 2002-10-27 06:00:00
Running custom function: macOS-10.16-x86_64-i386-64bit
00:51:06 Found 5 models, 20 tests, 0 snapshots, 0 analyses, 256 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics
00:51:06
00:51:06 Concurrency: 2 threads (target='dev')
00:51:06
00:51:06 1 of 5 START view model dev.stg_customers ...................................... [RUN]
00:51:06 2 of 5 START view model dev.stg_orders ......................................... [RUN]
00:51:06 2 of 5 OK created view model dev.stg_orders .................................... [CREATE VIEW in 0.08s]
00:51:06 1 of 5 OK created view model dev.stg_customers ................................. [CREATE VIEW in 0.08s]
00:51:06 3 of 5 START view model dev.stg_payments ....................................... [RUN]
00:51:06 3 of 5 OK created view model dev.stg_payments .................................. [CREATE VIEW in 0.02s]
00:51:06 4 of 5 START table model dev.customers ......................................... [RUN]
00:51:06 5 of 5 START table model dev.orders ............................................ [RUN]
HI
Running whitelisted function: 2002-10-27 06:00:00
Running custom function: macOS-10.16-x86_64-i386-64bit
00:51:06 5 of 5 OK created table model dev.orders ....................................... [SELECT 99 in 0.06s]
00:51:06 4 of 5 OK created table model dev.customers .................................... [SELECT 100 in 0.06s]
00:51:06
00:51:06 Finished running 3 view models, 2 table models in 0 hours 0 minutes and 0.27 seconds (0.27s).
00:51:06
00:51:06 Completed successfully
00:51:06
00:51:06 Done. PASS=5 WARN=0 ERROR=0 SKIP=0 TOTAL=5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment