@p5k6
p5k6 / cursor_query.sql
Created February 28, 2020 20:54
gives information behind active cursors in redshift
select stv_active_cursors.userid,
       left("name", 23),
       sequence,
       "text"
from stv_active_cursors
join stl_utilitytext on stl_utilitytext.pid = stv_active_cursors.pid
order by stv_active_cursors.name,
         stv_active_cursors.pid,
         stl_utilitytext.starttime,
         sequence;
p5k6 / sticky_assignor.md
Last active September 18, 2019 20:57
sticky assignor comments

The sticky assignor serves two purposes. First, it guarantees an assignment that is as balanced as possible, meaning either:

  • the numbers of topic partitions assigned to consumers differ by at most one; or
  • each consumer that has 2+ fewer topic partitions than some other consumer cannot get any of those topic partitions transferred to it.
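The two balance conditions above can be sketched as a small check. This is only an illustration of the definition as quoted, not Kafka's implementation; the function name `is_balanced`, the `assignment` and `subscriptions` dictionaries, and the tuple representation of a topic partition are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a topic partition is a (topic, partition) pair.
TopicPartition = Tuple[str, int]


def is_balanced(
    assignment: Dict[str, List[TopicPartition]],
    subscriptions: Dict[str, Set[str]],
) -> bool:
    """Check the balance definition quoted above for a given assignment."""
    counts = {consumer: len(tps) for consumer, tps in assignment.items()}
    if not counts:
        return True

    # Condition 1: partition counts differ by at most one.
    if max(counts.values()) - min(counts.values()) <= 1:
        return True

    # Condition 2: a consumer with 2+ fewer partitions than another consumer
    # must be unable to take any of that consumer's partitions, i.e. it is
    # not subscribed to any of the topics involved.
    for poorer, poorer_count in counts.items():
        for richer, richer_tps in assignment.items():
            if counts[richer] - poorer_count < 2:
                continue
            if any(topic in subscriptions[poorer] for topic, _ in richer_tps):
                return False
    return True
```

For example, three partitions of one topic split 2/1 across two consumers satisfy the first condition, while a 3/0 split between two consumers subscribed to the same topic satisfies neither.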
p5k6 / postgres_sample_dag.py
Created May 2, 2019 18:24
postgres version of sample_dag for amundsen databuilder
import logging
import textwrap
from datetime import datetime, timedelta
import uuid
from elasticsearch import Elasticsearch
from airflow import DAG # noqa
from airflow import macros # noqa
from airflow.operators.python_operator import PythonOperator # noqa
from pyhocon import ConfigFactory
p5k6 / postgres_table_metadata_extractor.py
Created May 2, 2019 18:22
Amundsen Postgres version of a metadata extractor. Only tested on 9.6, on local install. No tests provided at this point
import logging
from collections import namedtuple
from pyhocon import ConfigFactory, ConfigTree # noqa: F401
from typing import Iterator, Union, Dict, Any # noqa: F401
from databuilder import Scoped
from databuilder.extractor.base_extractor import Extractor
from databuilder.extractor.sql_alchemy_extractor import SQLAlchemyExtractor
from databuilder.models.table_metadata import TableMetadata, ColumnMetadata
p5k6 / reinvent_2017_notes.md
Last active October 7, 2019 00:50
Notes from 2017 reinvent

re:invent

  • API Gateway - no longer (exclusively) at the edge; can set up links within VPCs - see here

Monday


ABD202 - Best Practices for Building Serverless Big Data Applications

p5k6 / hockey1.md
Created September 27, 2017 21:07
basic hockey practice
  • zone entries - F1, F2, F3
  • breakout by D w/forward, regroup, then 2 on 1
    • involve patty eventually
    • Don't pass at the blue line. Try to pass by red line, or take it in yourself
    • after entering - make judgement. Can I get by D? don't be afraid to slow down, create space
  • dig in with skates - pushing net?
  • if you're not doing anything, just waiting - practice puckhandling
  • 1 on 1 on the boards - don't just throw it up the boards - esp for strong side breakouts
  • Retrieving the puck in deep behind net
  • D playing against F in corner
name: qa-test-transform
frequency: daily
load_time: 08:00
description: |
  testing out new transform step with precondition
steps:
- step_type: extract-s3
  name: ops-input
p5k6 / transform_with_precondition.py
Created May 19, 2016 15:54
custom step to allow a precondition on a transform step - see lines 75-82 and 139
"""
ETL step wrapper for a shell command activity that can be executed on EC2 / EMR, with a precondition
"""
from dataduct.pipeline import S3Node
from dataduct.pipeline import ShellCommandActivity
from dataduct.pipeline import Precondition
from dataduct.s3 import S3Directory
from dataduct.s3 import S3File
from dataduct.s3 import S3Path
from dataduct.utils import constants as const
"""
ETL step wrapper to extract data from mysql to S3, compressing it along the way
"""
import dataduct
from dataduct.config import Config
from dataduct.steps.etl_step import ETLStep
from dataduct.pipeline import CopyActivity
from dataduct.pipeline import MysqlNode
from dataduct.pipeline import PipelineObject
from dataduct.pipeline import Precondition

Keybase proof

I hereby claim:

  • I am p5k6 on github.
  • I am p5k6 (https://keybase.io/p5k6) on keybase.
  • I have a public key whose fingerprint is D844 1A65 94A4 665F CC35 BEA0 7DA7 7D09 8F4A 4E03

To claim this, I am signing this object: