Kyle G. Lundstedt kylelundstedt

## pr_to_master.yml
name: pr_to_master

on:
  pull_request:
    branches:
      - master

env:
  DBT_PROFILES_DIR: ./
  MSSQL_USER: ${{ secrets.MSSQL_USER }}

## csv_to_parquet.sh
#!/bin/bash
# You can put this in your .bashrc or .zshrc
function csv_to_parquet() {
    # Write a Parquet copy next to the CSV, e.g. data.csv -> data.parquet
    local file_path="$1"
    duckdb -c "COPY (SELECT * FROM read_csv_auto('$file_path')) TO '${file_path%.*}.parquet' (FORMAT PARQUET);"
}
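
The function above pastes the path straight into the SQL string, so a file name containing a single quote would break the statement. As a rough sketch only, the same output-path rule and COPY statement could be built in Python with the quotes escaped. The helper names `parquet_path`, `sql_quote`, and `copy_statement` are hypothetical and not part of the gist.

```python
def parquet_path(file_path: str) -> str:
    """Mimic the shell expansion ${file_path%.*}.parquet."""
    # `%.*` drops everything from the last dot onward; a path with no dot
    # is left unchanged before ".parquet" is appended.
    stem, dot, _ = file_path.rpartition(".")
    return (stem if dot else file_path) + ".parquet"


def sql_quote(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def copy_statement(file_path: str) -> str:
    """Build the same COPY statement the shell function hands to duckdb -c."""
    return (
        f"COPY (SELECT * FROM read_csv_auto({sql_quote(file_path)})) "
        f"TO {sql_quote(parquet_path(file_path))} (FORMAT PARQUET);"
    )


if __name__ == "__main__":
    print(copy_statement("data/sales.csv"))
```

The resulting string could then be passed to `duckdb -c` in place of the inline SQL.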

## README.md

      
troyharvey / README.md, last active May 12, 2024 08:25
GitHub Action for running the getdbt.com dbt CLI with BigQuery

    Using GitHub Actions to run dbt

This example shows you how to use GitHub Actions to run dbt against BigQuery.


Follow the instructions on getdbt.com for installing and initializing a dbt project.


Copy this action (dbt.yml) into the workflows directory:

    mkdir -p .github/workflows
    cp ~/Downloads/dbt.yml .github/workflows/

  
## terraform provision sftp server.md

      
bryanhuntesl / terraform provision sftp server.md, created May 22, 2019 14:56

    Terraform AWS Transfer server (managed SFTP storing to S3)

Create two S3 buckets: the first stores log files and the second holds client
uploads. Any request to the client uploads bucket generates logs in the log
storage bucket.
resource "aws_s3_bucket" "pbdhosts-logging" {
    bucket = "pbdhosts-logging"

  
## Dockerfile
FROM ubuntu:16.04

RUN echo "[INFO]::[installing]::[base packages]" \
    && apt-get update \
    && apt-get install -y --no-install-recommends --no-install-suggests \
        software-properties-common libssl-dev libmcrypt-dev openssl ca-certificates \
        git ntp curl tzdata bzip2 libfontconfig1 phantomjs mysql-client sudo jq \
    && apt-get autoclean && apt-get clean && apt-get autoremove -y && apt-get clean && rm -rf /var/lib/apt/lists/*

RUN echo "[INFO]::[installing]::[java packages]" \

## postgres_to_redshift.csv

          
| PostgreSQL Data Types | AWS DMS Data Types | Redshift Data Types |
|---|---|---|
| INTEGER | INT4 | INT4 |
| SMALLINT | INT2 | INT2 |
| BIGINT | INT8 | INT8 |
| NUMERIC (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| DECIMAL (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| REAL | REAL4 | FLOAT4 |
| DOUBLE | REAL8 | FLOAT8 |
| SMALLSERIAL | INT2 | INT2 |
| SERIAL | INT4 | INT4 |
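
As an illustration only, the Redshift column of the mapping above can be written as a small lookup in Python. The function name `redshift_type` and its parameters are hypothetical. The table does not say how the VARCHAR length is chosen, so this sketch returns a bare `VARCHAR` in that case.

```python
# Fixed-type rows, copied from the mapping table (PostgreSQL -> Redshift).
SIMPLE_TYPES = {
    "INTEGER": "INT4",
    "SMALLINT": "INT2",
    "BIGINT": "INT8",
    "REAL": "FLOAT4",
    "DOUBLE": "FLOAT8",
    "SMALLSERIAL": "INT2",
    "SERIAL": "INT4",
}


def redshift_type(pg_type, precision=None, scale=None):
    """Return the Redshift type listed in the table for a PostgreSQL type."""
    name = pg_type.strip().upper()
    if name in ("NUMERIC", "DECIMAL"):
        if precision is None or scale is None:
            raise ValueError("NUMERIC/DECIMAL needs both precision and scale")
        if 0 <= scale <= 37:
            return f"NUMERIC({precision},{scale})"
        if 38 <= scale <= 127:
            # Assumption: the table gives no rule for Length, so none is set.
            return "VARCHAR"
        raise ValueError(f"scale {scale} is outside the range the table covers")
    if name not in SIMPLE_TYPES:
        raise KeyError(f"{pg_type!r} is not in the mapping table")
    return SIMPLE_TYPES[name]


if __name__ == "__main__":
    for args in [("integer",), ("NUMERIC", 10, 2), ("DECIMAL", 50, 40)]:
        print(args, "->", redshift_type(*args))
```

Types that are not in the table raise an error instead of being guessed.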

## queries.sql
-- Table information like sortkeys, unsorted percentage
-- see http://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html
SELECT * FROM svv_table_info;

-- Table sizes in GB
SELECT t.name, COUNT(tbl) / 1000.0 AS gb
FROM (
  SELECT DISTINCT datname, id, name
  FROM stv_tbl_perm
  JOIN pg_database ON pg_database.oid = db_id

## update_github.R
library('devtools')
library('utils')
library('httr')

update_github <-
function(ask = TRUE, ...){
  installed <- installed.packages()
  oldVersion <- installed[,'Version']
  urls <- sapply(names(oldVersion), function(x){
    d <- packageDescription(x)

## 20111011_SteveYeggeGooglePlatformRant.md

      
chitchcock / 20111011_SteveYeggeGooglePlatformRant.md, created October 12, 2011 15:53

    Stevey's Google Platforms Rant

I was at Amazon for about six and a half years, and now I've been at Google for that long. One thing that struck me immediately about the two companies -- an impression that has been reinforced almost daily -- is that Amazon does everything wrong, and Google does everything right. Sure, it's a sweeping generalization, but a surprisingly accurate one. It's pretty crazy. There are probably a hundred or even two hundred different ways you can compare the two companies, and Google is superior in all but three of them, if I recall correctly. I actually did a spreadsheet at one point but Legal wouldn't let me show it to anyone, even though recruiting loved it.
I mean, just to give you a very brief taste: Amazon's recruiting process is fundamentally flawed by having teams hire for themselves, so their hiring bar is incredibly inconsistent across teams, despite various efforts they've made to level it out. And their operations are a mess; they don't real