@yanhaeffner
yanhaeffner / utils.py
Created March 31, 2022 11:55
Python Utils Scripts
def get_current_machine_resources():
    import psutil
    from os import sched_getaffinity
    hdd_partitions = {}
    for p in psutil.disk_partitions():
        usage = psutil.disk_usage(p.mountpoint)
        hdd_partitions[p.mountpoint] = {
            "total": usage.total,
            "available": usage.total-usage.used
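A standard-library-only sketch of the same idea, for environments where psutil is not installed. The function name `summarize_machine_resources` and the default path are assumptions made for illustration and are not part of the gist; it also reports a single path through `shutil.disk_usage` instead of iterating over every partition.

```python
import os
import shutil


def summarize_machine_resources(path="/"):
    """Hypothetical helper: usable CPU count and disk usage for one path."""
    # sched_getaffinity is not available on every platform; fall back to cpu_count.
    if hasattr(os, "sched_getaffinity"):
        cpus = len(os.sched_getaffinity(0))
    else:
        cpus = os.cpu_count() or 1
    usage = shutil.disk_usage(path)
    return {
        "cpus": cpus,
        "disk": {
            path: {
                "total": usage.total,
                # Same definition as the gist preview: total minus used.
                "available": usage.total - usage.used,
            }
        },
    }


print(summarize_machine_resources())
```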
yanhaeffner / generate_dbt_dag.py
Last active November 25, 2023 20:27
Python script to automate Airflow DAGs for DBT projects using a manifest.json file
from __future__ import division, print_function, unicode_literals
import json
from datetime import datetime
from os import environ as env
import argparse
def load_manifest():
    # Plain string: the path has no placeholders, so no f-string is needed
    local_filepath = "./dbt/target/manifest.json"
    with open(local_filepath) as f:
        data = json.load(f)
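The preview cuts off before the DAG-building logic, so the following is only a sketch of one way a loaded manifest could drive task ordering. It relies on dbt's manifest.json keeping a `nodes` mapping whose entries carry `resource_type` and `depends_on.nodes`; the helper name `model_dependencies` and the sample manifest are hypothetical.

```python
def model_dependencies(manifest):
    """Hypothetical helper: map each dbt model to the models it depends on."""
    nodes = manifest.get("nodes", {})
    models = {
        node_id
        for node_id, node in nodes.items()
        if node.get("resource_type") == "model"
    }
    deps = {}
    for node_id in sorted(models):
        parents = nodes[node_id].get("depends_on", {}).get("nodes", [])
        # Keep only model-to-model edges; seeds and sources are ignored here.
        deps[node_id] = [p for p in parents if p in models]
    return deps


# Tiny made-up manifest fragment, for illustration only.
sample_manifest = {
    "nodes": {
        "model.jaffle_shop.stg_customers": {
            "resource_type": "model",
            "depends_on": {"nodes": ["seed.jaffle_shop.raw_customers"]},
        },
        "model.jaffle_shop.customers": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.jaffle_shop.stg_customers"]},
        },
    }
}

for child, parents in model_dependencies(sample_manifest).items():
    print(child, "<-", parents)
```

Each parent/child pair found this way could then be wired as an upstream/downstream relationship between two Airflow tasks.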
yanhaeffner / profiles.yml
Created September 20, 2021 16:38
DBT BigQuery profiles file with environment variables
jaffle_shop: # this needs to match the profile: in your dbt_project.yml file
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      keyfile: profile/bigquery-keyfile.json # bigquery-keyfile.json should be located inside your dbt profile folder
      project: your-project-id # Replace this with your project id
      dataset: your-dataset-name # Replace this with dbt_your_name, e.g. dbt_bob
      threads: 1
yanhaeffner / dockerized_dbt_folder_structure
Last active September 20, 2021 16:46
Necessary folder structure in order to containerize a DBT project
Dockerfile
requirements.txt
dbt/
├── analysis/
├── data/
├── macros/
├── models/ # Should contain your model structure and the project definition ".yml" file
│   └── schema.yml
│   └── customers.sql
│   └── stg_customers.sql
yanhaeffner / requirements.txt
Created September 20, 2021 14:11
DBT containerization requirements file
agate==1.6.1
asn1crypto==1.3.0
attrs==19.3.0
azure-common==1.1.25
azure-storage-blob==2.1.0
azure-storage-common==2.1.0
Babel==2.8.0
boto3==1.11.17
botocore==1.14.17
cachetools==4.1.0
yanhaeffner / Dockerfile
Created September 20, 2021 14:08
Dockerfile to containerize a DBT project
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND=noninteractive
# Install python3.8 before clearing the apt lists; apt cannot locate packages once the lists are removed
RUN apt-get update \
    && apt-get install -y --no-install-recommends git software-properties-common make build-essential ca-certificates libpq-dev \
    && apt-get install -y python3.8 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN apt-get update
RUN apt-get install -y python3-pip
RUN pip3 install --upgrade pip setuptools
COPY requirements.txt ./
RUN pip3 install --requirement ./requirements.txt
RUN pip3 install --upgrade cffi
yanhaeffner / diario_de_criptos_DAG.py
Created April 19, 2021 17:26
Airflow DAG that finds the cryptocurrency with the largest daily variation in BRL, comparing bitcoin, ethereum, and dogecoin through the CoinGecko API.
from airflow.utils.dates import days_ago
from airflow.models import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.sensors.http_sensor import HttpSensor
import requests
default_args = {
    "owner": "yhaeffner",
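The comparison step named in the description can be sketched without any network call. This assumes a response shaped like CoinGecko's simple price endpoint queried with `vs_currencies=brl` and `include_24hr_change=true`, where each coin carries a `brl_24h_change` field. The function name and the sample numbers are made up for illustration, and "largest variation" is read here as the largest absolute change.

```python
def largest_daily_variation(prices, currency="brl"):
    """Hypothetical helper: return (coin, change) with the largest absolute 24h change."""
    key = f"{currency}_24h_change"
    changes = {
        coin: data[key]
        for coin, data in prices.items()
        if data.get(key) is not None
    }
    if not changes:
        raise ValueError(f"no '{key}' values found in the response")
    coin = max(changes, key=lambda c: abs(changes[c]))
    return coin, changes[coin]


# Made-up values in the assumed response shape, for illustration only.
sample_response = {
    "bitcoin": {"brl": 300000.0, "brl_24h_change": 1.5},
    "ethereum": {"brl": 12000.0, "brl_24h_change": -2.25},
    "dogecoin": {"brl": 1.8, "brl_24h_change": 7.0},
}

print(largest_daily_variation(sample_response))
```

Inside a DAG, a function like this would typically run in a `PythonOperator` task after an `HttpSensor` confirms the API is reachable.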
yanhaeffner / gist:2f4e998a48c1e4559be22a8be751d2e4
Last active March 2, 2021 13:24
airflow-1-10-12-constraints
# Editable install with no version control (apache-airflow==1.10.12)
Babel==2.8.0
Flask-Admin==1.5.4
Flask-AppBuilder==2.3.4
Flask-Babel==1.0.0
Flask-Bcrypt==0.7.1
Flask-Caching==1.3.3
Flask-JWT-Extended==3.24.1
Flask-Login==0.4.1
Flask-OpenID==1.2.5
yanhaeffner / MyDAG.py
Created February 17, 2021 14:47
Basic DAG file to test parallel execution on Airflow
# First, let's import the basic Airflow modules that we will use in this DAG file
from airflow.utils.dates import days_ago
from airflow.models import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
import time
# Then, define a "default_args" dict to store a few of our DAG arguments. Using this dict is good practice because these values are applied to every task, so we won't need to set them manually later on
default_args = {