Skip to content

Instantly share code, notes, and snippets.

View davehowell's full-sized avatar
🏠
Working from home - just like you

David Howell davehowell

🏠
Working from home - just like you
View GitHub Profile
@davehowell
davehowell / docker.md
Last active November 15, 2022 07:52
docker
@davehowell
davehowell / pyenv.md
Last active May 6, 2021 11:59
python
@davehowell
davehowell / aws-sts-functions.sh
Created February 9, 2021 15:05 — forked from strebitz/aws-sts-functions.sh
Obtain AWS STS session-tokens with BASH for multiple awscli profiles or an IAM role ARN
#!/bin/bash
# Add these functions to your bash_profile or bash_rc
# Dependencies:
# - awscli - AWS command line client
# - jq - Command-line JSON processor
function cfn-validate-template() {
for template in $@
do
@davehowell
davehowell / pyspark.md
Last active December 19, 2023 02:28
spark

PySpark snippets

catalog

spark.catalog.currentDatabase()
spark.catalog.listTables()

fs

# Which table has a column named
SELECT
c.relname
FROM pg_class AS c
INNER JOIN pg_attribute AS a on a.attrelid = c.oid
WHERE true
AND a.attname = '<COLUMN NAME>'
AND c.relkind = 'r'

Checkout a new branch and set upstream

git checkout -b branch origin/branch

Checkout an existing remote branch

git fetch --all git checkout -t / # e.g. origin/featuretastic

or git pull then just check it out?

Clean up branches

Which pid holding a bind port

lsof -ti:54320

mac version of ps -faux

ps -jef

filter the process list but still show the header

ps -jef | head -n1 && ps -jef | grep filter_value

@davehowell
davehowell / max_existing_multiro_macros.sql
Created November 27, 2019 04:34
Some dbt macros for fetching the max value of some column from a target table, useful in incremental models. The canonical subquery example is not performant, and sometimes you need a few different max values.
{% macro max_existing(max_field, target_ref=this) -%}
{#
Gets the max existing value for some field in the target, or some other ref or source.
Use in incremental models, inside `if is_incremental()` or `if execute` blocks,
otherwise the dbt model will not compile.
Useful where you have a primary key or other watermark field and want to construct SQL with
that value determined at compile time.
Why?
When using the max value multiple times in a query it will be better than inlining multiple
subqueries to fetch the same value, and in many cases, hardcoding a value in a where clause
@davehowell
davehowell / jsonlines_gzip_bytesio_bytestream_serialize_serialise.py
Last active November 28, 2018 11:30 — forked from veselosky/s3gzip.py
How to store and retrieve gzip-compressed objects in AWS S3
# Totally ~plagiarised~ inspired from the forked file by Vince Veselosky in this gist, extended to work with jsonlines format
# and skipping the S3 write so the serialisation can be tested without mocking S3
import re
import io
import json
import gzip
import dateutil.parser
from json import JSONEncoder
from pprint import pprint
@davehowell
davehowell / postgres-cheatsheet.md
Created November 15, 2018 01:17 — forked from Kartones/postgres-cheatsheet.md
PostgreSQL command line cheatsheet

PSQL

Magic words:

psql -U postgres

Some interesting flags (to see all, use -h or --help depending on your psql version):

  • -E: will describe the underlaying queries of the \ commands (cool for learning!)
  • -l: psql will list all databases and then exit (useful if the user you connect with doesn't has a default database, like at AWS RDS)