# Gist by @garystafford, last active March 26, 2022 04:21
# Purpose: DataHub example recipe for PostgreSQL datasource
# Author: Gary A. Stafford
# Date: March 2022
# see https://datahubproject.io/docs/metadata-ingestion/source_docs/postgres
source:
  type: postgres
  config:
    # Coordinates
    host_port: ${DB_HOST_PORT}
    database: tickit

    # Credentials
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}

    # Options
    profiling:
      enabled: true

    # Environment
    env: DEV

# see https://datahubproject.io/docs/metadata-ingestion/transformers/#adding-a-set-of-tags
transformers:
  - type: "simple_add_dataset_tags"
    config:
      tag_urns:
        - "urn:li:tag:AWS"
        - "urn:li:tag:${ACCOUNT_ID}"
        - "urn:li:tag:us-east-1"
  - type: "pattern_add_dataset_terms"
    config:
      term_pattern:
        rules:
          ".*users.*": ["urn:li:glossaryTerm:Classification.Sensitive"]
  - type: "simple_add_dataset_ownership"
    config:
      owner_urns:
        - "urn:li:corpuser:Database Administrators"
      ownership_type: "DATAOWNER"

# see https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for complete documentation
sink:
  type: "datahub-rest"
  config:
    server: ${DATAHUB_REST_ENDPOINT}

# see https://datahubproject.io/docs/metadata-ingestion/source_docs/reporting_telemetry/
pipeline_name: "postgres-pipeline-tickit"
reporting:
  - type: "datahub"
    config:
      datahub_api:
        server: ${DATAHUB_REST_ENDPOINT}
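
# Usage sketch (an assumption, not part of the original recipe): with the DataHub CLI
# and its postgres plugin installed, and this recipe saved under a hypothetical name
# such as recipe_postgres.yml, export the variables referenced above and then run the
# ingestion. Every value shown is a placeholder, not a real coordinate or credential.
#   export DB_HOST_PORT="localhost:5432"
#   export DB_USERNAME="<username>"
#   export DB_PASSWORD="<password>"
#   export ACCOUNT_ID="<aws-account-id>"
#   export DATAHUB_REST_ENDPOINT="http://localhost:8080"
#   datahub ingest -c recipe_postgres.yml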