
@clrcrl
Last active November 27, 2023 16:39
Using dbt to version-control macros

This is the process I used to put Redshift user-defined functions (UDFs) into dbt.

  1. Created a subdirectory: macros/udfs/
  2. Created a file for each udf, e.g. macros/udfs/f_future_date.sql.
```sql
{% macro f_future_date() %}
CREATE OR REPLACE FUNCTION {{target.schema}}.f_future_date()
RETURNS TIMESTAMP
IMMUTABLE AS $$
SELECT '2100-01-01'::TIMESTAMP;
$$ LANGUAGE sql
{% endmacro %}
```
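For reference, dbt substitutes the target schema when it renders this macro; with a (hypothetical) target schema of `dbt_claire`, the compiled statement would look like:

```sql
CREATE OR REPLACE FUNCTION dbt_claire.f_future_date()
RETURNS TIMESTAMP
IMMUTABLE AS $$
SELECT '2100-01-01'::TIMESTAMP;
$$ LANGUAGE sql
```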
  3. Created a macro called macros/create_udfs.sql which calls each UDF macro. Note that the statements are separated with semicolons.
```sql
{% macro create_udfs() %}

{{f_list_custom_keys()}};

{{f_count_custom_keys()}};

{{f_future_date()}}

{% endmacro %}
```
  4. Added an on-run-start hook to my project:
```yml
on-run-start:
    - '{{create_udfs()}}'
```
  5. Updated the references to UDFs in my models to use the schema-qualified versions, e.g.
```sql
SELECT
{{target.schema}}.F_FUTURE_DATE()
```
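As a fuller sketch of step 5 (the model name, columns, and `subscriptions` ref are all hypothetical), a model might use the UDF to backfill open-ended records:

```sql
-- models/current_subscriptions.sql (hypothetical)
SELECT
    subscription_id,
    started_at,
    -- treat open subscriptions as ending in the far future
    COALESCE(ended_at, {{ target.schema }}.f_future_date()) AS ended_at
FROM {{ ref('subscriptions') }}
```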

The “gotcha” parts were:

  1. I had to grant permission to users to create Python UDFs:
```sql
-- as your superuser
GRANT USAGE ON LANGUAGE PLPYTHONU TO claire;
```
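For context, that grant is needed because Redshift Python UDFs are written in the plpythonu language. A minimal sketch of such a UDF (the function name and logic are hypothetical):

```sql
-- requires USAGE ON LANGUAGE PLPYTHONU for the creating user
CREATE OR REPLACE FUNCTION {{ target.schema }}.f_is_weekend(d DATE)
RETURNS BOOLEAN
IMMUTABLE AS $$
    # Redshift passes the DATE argument in as a Python datetime.date
    return d is not None and d.weekday() >= 5
$$ LANGUAGE plpythonu
```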
  2. I had to use schemas because users cannot edit each other's UDFs (i.e. if claire creates the UDF, and sinter runs the CREATE OR REPLACE FUNCTION statement, you'll get a "must be owner of function" error). Using schemas means you are creating distinct UDFs,* so you won't hit this issue. I'd recommend using schemas anyway, to maintain separate dev and production UDFs.

* Assuming each Redshift user profile used by dbt has a distinct target schema.
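For example, dev and production targets with distinct schemas might be set up like this in profiles.yml (the profile name and schema values are hypothetical):

```yml
my-redshift-profile:
  target: dev
  outputs:
    dev:
      type: redshift
      schema: dbt_claire   # claire's dev copy of each UDF lives here
      # host, user, dbname, etc. omitted
    prod:
      type: redshift
      schema: analytics    # production UDFs live here
      # host, user, dbname, etc. omitted
```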

ramcdfe commented Nov 27, 2023

Hi Claire!

I wonder if you have tried creating a UDF that queries another available source in your project using an input parameter. For example, I am trying to adapt a SQL Server UDF that reads a table 'Projects' from my raw database and returns a Boolean result based on the input variable 'id'. The function is the following:

```sql
CREATE FUNCTION fn_ProjectIsAvailable
(
    @Id BIGINT
)
RETURNS BIT
AS
BEGIN

RETURN ISNULL((SELECT TOP 1 1
            FROM Projects p WHERE p.Id = @Id
            ), 0)
END
```

I have tried adapting this UDF following your post. I am using dbt with Spark SQL on Databricks. I created the following macros:

my_project/macros/udfs/fn_ProjectIsAvailable.sql
my_project/macros/create_udfs.sql

```sql
{% macro create_fn_ProjectIsAvailable() %}

CREATE OR REPLACE FUNCTION {{ target.schema }}.fn_ProjectIsAvailable(Id BIGINT)
RETURNS BIT
AS

COALESCE
(
    (
        SELECT 1
        FROM {{ source('raw_db', 'projects') }} p
        WHERE
            p.Id = {{ Id }}
        LIMIT 1
    ), 0
)

{% endmacro %}
```

where:
{{ source('raw_db', 'projects') }} is the same Projects table from the SQL Server database that I already ingested into a Databricks schema 'raw_db' using dbt. It is basically a transactional catalogue of projects with their ids.

and

```sql
{% macro create_udfs() %}

{{create_fn_ProjectIsAvailable()}};

{% endmacro %}
```

I also added this entry to my dbt_project.yml file:

```yml
on-run-start:
  - '{{ create_udfs() }}'
```

Now, I would like to use this UDF/macro in another DBT model where I intend to use this UDF in this way:

my_project/models/dwh_ProjectCurrentStatus.sql

```sql
SELECT {{target.schema}}.fn_ProjectIsAvailable(Id) AS IsProjectCurrentlyAvailable
FROM {{ source('raw_db', 'projects') }}
```

Unfortunately, when I compile the preview of this model, I realised that the 'Id' variable is not being passed into this query:

```sql
SELECT
COALESCE
(
    (
        SELECT 1
        FROM `raw_db`.`projects` p
        WHERE
            p.Id =
        LIMIT 1
    ), 0
) AS IsProjectAvailable
FROM `raw_db`.`projects`
```

Please notice how `p.Id =` receives a blank.

I don't really know how to solve this and I would really appreciate any feedback. Many thanks.
