
@matsonj
matsonj / pr_to_master.yml
Last active January 30, 2024 17:35
Running dbt-core with github actions
name: pr_to_master
on:
  pull_request:
    branches:
      - master
env:
  DBT_PROFILES_DIR: ./
  MSSQL_USER: ${{ secrets.MSSQL_USER }}
@mehd-io
mehd-io / csv_to_parquet.sh
Created March 21, 2023 16:23
Convert CSV to Parquet using DuckDB CLI
#!/bin/bash
# You can put this in your .bashrc or .zshrc
function csv_to_parquet() {
    file_path="$1"
    duckdb -c "COPY (SELECT * FROM read_csv_auto('$file_path')) TO '${file_path%.*}.parquet' (FORMAT PARQUET);"
}
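
The output name in the function above comes from the shell expansion `${file_path%.*}`, which strips everything from the last dot onward. A minimal Python sketch of the same path logic and of the SQL string passed to `duckdb -c` follows. The helper names `parquet_path` and `build_copy_statement` are hypothetical, and the quote doubling is an added precaution that the original function does not perform.

```python
def parquet_path(file_path: str) -> str:
    """Mimic the shell expansion ${file_path%.*}.parquet."""
    dot = file_path.rfind(".")
    # With no dot in the path, the shell expansion leaves the value unchanged.
    stem = file_path if dot == -1 else file_path[:dot]
    return f"{stem}.parquet"


def build_copy_statement(file_path: str) -> str:
    """Build the SQL string that the shell function hands to `duckdb -c`."""
    # Assumption: doubling single quotes keeps odd file names from breaking
    # the SQL literal. The original shell function does not do this.
    src = file_path.replace("'", "''")
    dst = parquet_path(file_path).replace("'", "''")
    return (
        f"COPY (SELECT * FROM read_csv_auto('{src}')) "
        f"TO '{dst}' (FORMAT PARQUET);"
    )


print(build_copy_statement("data/sales.csv"))
```

Note that only the last extension is replaced, so `archive.tar.gz` would become `archive.tar.parquet`, matching what the shell expansion does.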
@troyharvey
troyharvey / README.md
Last active May 12, 2024 08:25
GitHub Action for running the getdbt.com dbt CLI with BigQuery

Using GitHub Actions to run dbt

This example shows you how to use GitHub Actions to run dbt against BigQuery.

  1. Follow the instructions on getdbt.com for installing and initializing a dbt project.

  2. Copy this action (dbt.yml) into the workflows directory.

     mkdir .github
     mkdir .github/workflows
     cp ~/Downloads/dbt.yml .github/workflows/
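
For anyone scripting the copy step rather than typing the shell commands, here is a rough Python equivalent. It is only a sketch: the function name `install_workflow`, the placeholder file contents, and the `my_dbt_project` directory are all made up for the demo, which runs in a throwaway temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def install_workflow(action_file: Path, repo_root: Path) -> Path:
    """Copy a workflow file into <repo_root>/.github/workflows/."""
    workflows = repo_root / ".github" / "workflows"
    # Equivalent to the two mkdir commands; safe to re-run.
    workflows.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(action_file, workflows))


# Demo in a temporary directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    downloaded = root / "dbt.yml"          # stands in for ~/Downloads/dbt.yml
    downloaded.write_text("name: dbt\n")   # placeholder contents
    repo = root / "my_dbt_project"         # hypothetical project directory
    repo.mkdir()
    installed = install_workflow(downloaded, repo)
    relative = installed.relative_to(repo)

print(relative)
```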

@bryanhuntesl
bryanhuntesl / terraform provision sftp server.md
Created May 22, 2019 14:56
terraform provision sftp server

Terraform AWS Transfer server (managed SFTP storing to S3)

Create two S3 buckets: the first stores log files and the second holds client uploads. Any request to the client uploads bucket generates logs in the log storage bucket.

resource "aws_s3_bucket" "pbdhosts-logging" {
    bucket = "pbdhosts-logging"
@dev-head
dev-head / Dockerfile
Last active August 11, 2021 17:44
Looker : Docker : Docker Compose : Staging
FROM ubuntu:16.04
RUN echo "[INFO]::[installing]::[base packages]" \
    && apt-get update \
    && apt-get install -y --no-install-recommends --no-install-suggests \
        software-properties-common libssl-dev libmcrypt-dev openssl ca-certificates \
        git ntp curl tzdata bzip2 libfontconfig1 phantomjs mysql-client sudo jq \
    && apt-get autoclean && apt-get clean && apt-get autoremove -y && apt-get clean && rm -rf /var/lib/apt/lists/*
RUN echo "[INFO]::[installing]::[java packages]" \
| PostgreSQL Data Types | AWS DMS Data Types | Redshift Data Types |
| --- | --- | --- |
| INTEGER | INT4 | INT4 |
| SMALLINT | INT2 | INT2 |
| BIGINT | INT8 | INT8 |
| NUMERIC (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| DECIMAL (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| REAL | REAL4 | FLOAT4 |
| DOUBLE | REAL8 | FLOAT8 |
| SMALLSERIAL | INT2 | INT2 |
| SERIAL | INT4 | INT4 |
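
The mapping above can be expressed as a small lookup, which may be handy when checking a schema before a migration. This is a sketch under assumptions: the names `SIMPLE_TYPES`, `dms_numeric_type`, and `redshift_numeric_type` are hypothetical, the NUMERIC/DECIMAL rules are read as "precision decides the DMS type, scale decides the Redshift type", and the table gives no DMS type for precision below 39 and no VARCHAR length, so the code leaves both unspecified.

```python
from typing import Optional

# PostgreSQL type -> (AWS DMS type, Redshift type), copied from the table.
SIMPLE_TYPES = {
    "INTEGER": ("INT4", "INT4"),
    "SMALLINT": ("INT2", "INT2"),
    "BIGINT": ("INT8", "INT8"),
    "REAL": ("REAL4", "FLOAT4"),
    "DOUBLE": ("REAL8", "FLOAT8"),
    "SMALLSERIAL": ("INT2", "INT2"),
    "SERIAL": ("INT4", "INT4"),
}


def dms_numeric_type(precision: int) -> Optional[str]:
    """DMS type for NUMERIC/DECIMAL; None where the table is silent."""
    return "STRING" if precision >= 39 else None


def redshift_numeric_type(precision: int, scale: int) -> str:
    """Redshift type for NUMERIC/DECIMAL, chosen by scale as in the table."""
    if 0 <= scale <= 37:
        return f"NUMERIC({precision},{scale})"
    if 38 <= scale <= 127:
        # The table says VARCHAR (Length) without giving the length.
        return "VARCHAR"
    raise ValueError(f"scale {scale} is outside the ranges in the table")


print(SIMPLE_TYPES["BIGINT"])
print(dms_numeric_type(40), redshift_numeric_type(18, 4))
```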
@iconara
iconara / queries.sql
Last active November 13, 2023 22:26
Low level Redshift cheat sheet
-- Table information like sortkeys, unsorted percentage
-- see http://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html
SELECT * FROM svv_table_info;
-- Table sizes in GB
SELECT t.name, COUNT(tbl) / 1000.0 AS gb
FROM (
  SELECT DISTINCT datname, id, name
  FROM stv_tbl_perm
  JOIN pg_database ON pg_database.oid = db_id
@leeper
leeper / update_github.R
Last active February 2, 2021 22:56
Update packages if a newer version is available from GitHub
library('devtools')
library('utils')
library('httr')
update_github <-
function(ask = TRUE, ...){
    installed <- installed.packages()
    oldVersion <- installed[,'Version']
    urls <- sapply(names(oldVersion), function(x){
        d <- packageDescription(x)
@chitchcock
chitchcock / 20111011_SteveYeggeGooglePlatformRant.md
Created October 12, 2011 15:53
Stevey's Google Platforms Rant

Stevey's Google Platforms Rant

I was at Amazon for about six and a half years, and now I've been at Google for that long. One thing that struck me immediately about the two companies -- an impression that has been reinforced almost daily -- is that Amazon does everything wrong, and Google does everything right. Sure, it's a sweeping generalization, but a surprisingly accurate one. It's pretty crazy. There are probably a hundred or even two hundred different ways you can compare the two companies, and Google is superior in all but three of them, if I recall correctly. I actually did a spreadsheet at one point but Legal wouldn't let me show it to anyone, even though recruiting loved it.

I mean, just to give you a very brief taste: Amazon's recruiting process is fundamentally flawed by having teams hire for themselves, so their hiring bar is incredibly inconsistent across teams, despite various efforts they've made to level it out. And their operations are a mess; they don't real