Evan Burke evan-burke

  • Oakland, CA
@evan-burke
evan-burke / bfg
Last active April 7, 2018 00:19
cleaning sensitive data from a repo
- Start somewhere with a Java runtime and a Git command line. Git should be configured with a valid SSH key for the repo.
- Create and push a new commit reverting any changes you want to wipe. This is done to ensure the commit is not 'protected'.
- Create & cd into temporary dir if you want.
- Download .jar file from here - search for 'downloadable jar'.
https://rtyley.github.io/bfg-repo-cleaner/
- Clone repo (with GitHub, use the SSH URL, not HTTPS) with --mirror:
@evan-burke
evan-burke / retry.php
Last active September 14, 2018 04:51 — forked from orottier/RetryTest.php
Retry function for PHP with linear backoff
<?php
/*
* Retry function for e.g. external API calls
*
* Will try the risky API call, and retries with an ever increasing delay if it fails
* Throws the latest error if $maxRetries is reached,
* otherwise it will return the value returned from the closure.
*
*/
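The gist itself is PHP and its body is cut off in this listing. As a rough illustration of the behaviour the comment describes, here is a minimal Python sketch of retrying with linear backoff. The names `retry`, `max_retries`, `delay_step`, and `sleep` are assumptions made for this example, not the gist's own API.

```python
import time


def retry(func, max_retries=3, delay_step=1.0, sleep=time.sleep):
    """Call func(); on failure wait delay_step * attempt seconds and try again.

    Re-raises the latest error once max_retries attempts have failed,
    otherwise returns whatever func() returned.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of attempts: surface the latest error
            # Linear backoff: the delay grows by delay_step on each attempt.
            sleep(delay_step * attempt)
```

Passing `sleep` as a parameter is only a convenience so the delays can be observed or skipped in tests; in normal use the default `time.sleep` applies.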
@evan-burke
evan-burke / fabric_ssh.py
Created January 9, 2020 23:29
ssh with Fabric using an SSH key
# Fabric can be used to run commands on a remote system over SSH.
# Sadly, its documentation is a bit short on connecting using a private key file.
# Other options exist too, like this one using an SSH config file - https://gist.github.com/aubricus/5157931
import fabric
# Openssh formatted private key:
keyfile = "/path/to/your/privkey"
host = "fqdn"
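The rest of this gist is not shown in the listing. One common way to connect with a key file, sketched here under the assumption that Fabric 2.x is installed, is to pass `key_filename` through `connect_kwargs`, which Fabric forwards to Paramiko. The helper names and the `user` argument below are hypothetical and may differ from what the gist does.

```python
def key_connect_kwargs(keyfile):
    """Build the connect_kwargs dict that points Paramiko at a private key file."""
    return {"key_filename": keyfile}


def connect_with_key(host, user, keyfile):
    """Return a fabric.Connection that authenticates with the given key file."""
    # Imported lazily so key_connect_kwargs() works without Fabric installed.
    import fabric

    return fabric.Connection(
        host=host,
        user=user,
        connect_kwargs=key_connect_kwargs(keyfile),
    )


# Example usage (hypothetical host, user, and key path):
# conn = connect_with_key("fqdn", "deploy", "/path/to/your/privkey")
# result = conn.run("hostname", hide=True)
# print(result.stdout.strip())
```

The connection is opened lazily, so nothing touches the network until the first `run()` call.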
@evan-burke
evan-burke / docker_wsl_mounts.md
Last active February 1, 2020 01:23
docker WSL volume mounts

Docker under WSL has some issues with volume mounts. They will tend to fail silently (as far as I can tell).

Steps to follow to get these working:

  1. Change default mount point for the Win host FS under WSL. Edit /etc/wsl.conf and add or change the following. You may need to fully reboot Windows for this to take effect.
[automount]
@evan-burke
evan-burke / postgres bulk percentiles calculation
Last active March 30, 2020 13:16
Postgres bulk percentile calculation with generate_series() and percentile_cont()
This will return the percentiles in order.
-- #1
select unnest(
percentile_cont(
(select array_agg(s) from generate_series(0, 1, 0.2) as s)
) WITHIN GROUP (ORDER BY SIZE))
from mytable
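To make the query's result easier to picture, here is a small Python sketch of what it computes, assuming `percentile_cont` interpolates linearly between adjacent sorted values (which is how PostgreSQL documents it). `generate_series(0, 1, 0.2)` yields the fractions 0, 0.2, ..., 1, which the `step_count=5` default mirrors. The function names are made up for this example.

```python
import math


def percentile_cont(fraction, values):
    """Continuous percentile with linear interpolation between neighbours."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


def bulk_percentiles(values, step_count=5):
    """Return (fraction, percentile) pairs for evenly spaced fractions in [0, 1]."""
    fractions = [i / step_count for i in range(step_count + 1)]
    return [(f, percentile_cont(f, values)) for f in fractions]
```

Like the SQL version, the results come back in the same order as the fractions.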
@evan-burke
evan-burke / insert_paginator.py
Last active April 15, 2020 22:02
psycopg2 execute_values wrapper for accurate row counts
# One of the fastest ways to insert bulk data into Postgres (at least, aside from COPY) is using the psycopg2 extras function execute_values.
# However, this doesn't return an accurate row count value - instead, it just returns the row count for the last page inserted.
# This wraps the execute_values function with its own pagination to return an accurate count of rows inserted.
# Performance is approximately equivalent to the underlying execute_values function: within 5-10% or so in my brief tests.
import psycopg2
import psycopg2.extras
import math
db_connection_string = "dbname=EDITME host=EDITME"
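The wrapper itself is cut off in this listing. A minimal sketch of the idea described above: split the rows into pages yourself, call `execute_values` once per page, and add up `cur.rowcount` after each call. The function name and the `execute` parameter are assumptions for this example (the latter exists only so the logic can be exercised without a database), and this is not the gist's actual code.

```python
def execute_values_counted(cur, sql, rows, page_size=100, execute=None):
    """Insert rows page by page and return the total affected row count."""
    if execute is None:
        # Imported lazily; requires psycopg2 when no substitute is supplied.
        from psycopg2.extras import execute_values as execute

    rows = list(rows)
    total = 0
    for start in range(0, len(rows), page_size):
        page = rows[start:start + page_size]
        # Each call sends at most one page, so cur.rowcount covers the whole call.
        execute(cur, sql, page, page_size=page_size)
        total += cur.rowcount
    return total
```

With a real cursor this would be called as `execute_values_counted(cur, "insert into my_table (a, b) values %s", rows)`, where the table and column names are placeholders.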
@evan-burke
evan-burke / psycopg2_update_execute_values.py
Last active June 3, 2020 23:45
update query using psycopg2 execute_values
# Updates are a little tricky using psycopg2.extras.execute_values(), and documentation is a little sparse.
# http://initd.org/psycopg/docs/extras.html
db_connection_string = "dbname=EDITME host=EDITME"
# uuids, as strings
my_data = ['6ef0f42a-63da-4edb-9a11-5e146cb337ac','e7b1e961-0a68-4c4f-a716-e0959593f27d','1f82c9a5-00c3-4bd8-8c50-0ede441b4e91']
query = """update my_table t
set bool_field = true
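The query above is truncated in this listing, so the following is not the author's statement. It is a sketch of the `UPDATE ... FROM (VALUES %s)` pattern shown in the psycopg2 documentation, adapted to the table and field names above. The `id` column, the `::uuid` cast, the helper name, and the `execute` parameter are all assumptions for illustration.

```python
# Hypothetical completion: assumes my_table has a uuid primary key named "id".
UPDATE_QUERY = """
update my_table t
set bool_field = true
from (values %s) as data (id)
where t.id = data.id::uuid
"""


def mark_rows(cur, ids, execute=None):
    """Set bool_field for every id in ids; returns the rows sent to the query."""
    if execute is None:
        # Imported lazily; requires psycopg2 when no substitute is supplied.
        from psycopg2.extras import execute_values as execute

    # execute_values expects a sequence of sequences, so wrap each id in a tuple.
    rows = [(value,) for value in ids]
    execute(cur, UPDATE_QUERY, rows)
    return rows
```

Wrapping each value in a one-element tuple is the step that is easy to miss when the input is a flat list of strings like `my_data`.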
@evan-burke
evan-burke / seq_idx.py
Last active June 15, 2020 20:43
Detect repeated values in a series and assign an index to each sequence
import pandas as pd
# Use case: triggering an alert only if, say, monitoring is outside of a desired value for 4 hours in a row
def detect_sequential_failures(series, how_many):
# Takes as input a pd.Series of True or False values.
# Calculate like, e.g.: df['my_condition_evaluation'] = df['testcol'] < threshold
# then: detect_sequential_failures(df['my_condition_evaluation'], 3)
#
# Returns a series with None for False items or True items in a sequence < how_many in a row,
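The function body is not shown in this listing. To illustrate the idea without depending on pandas, here is a plain-Python sketch over a list of booleans. It assumes that each qualifying run of True values gets a zero-based sequence index and that everything else becomes None. The name `index_true_runs` and the numbering scheme are guesses for this example, not the gist's implementation.

```python
from itertools import groupby


def index_true_runs(flags, how_many):
    """Label each run of at least how_many consecutive True values with an index.

    Returns a list the same length as flags: None for False items and for
    True runs shorter than how_many, otherwise the run's zero-based index.
    """
    result = []
    run_index = 0
    # groupby yields consecutive runs of equal truthiness.
    for is_true, group in groupby(flags, key=bool):
        length = len(list(group))
        if is_true and length >= how_many:
            result.extend([run_index] * length)
            run_index += 1
        else:
            result.extend([None] * length)
    return result
```

For the alerting use case, any non-None entry marks a period where the condition held for at least `how_many` consecutive samples.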
@evan-burke
evan-burke / configparser
Last active August 12, 2020 17:06
python configfile parser example
# because I have to look up the syntax every time:
import configparser
CONFIG_FILENAME = "/path/to/config.file"
# or
# CONFIG_FILENAME = "../relative/path/to/config.file"
# for current dir, just: "config.file"
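The rest of the example is cut off here. A minimal sketch of the usual `configparser` read-and-lookup pattern follows. The `[database]` section and its keys are made-up placeholders, and `load_config` is a hypothetical helper name.

```python
import configparser


def load_config(filename):
    """Read an INI-style file and return the parsed ConfigParser."""
    config = configparser.ConfigParser()
    # read() returns the list of files it parsed, and silently skips missing ones.
    if not config.read(filename):
        raise FileNotFoundError(filename)
    return config


# The same lookup syntax, shown on an in-memory string (placeholder contents):
SAMPLE = """
[database]
host = localhost
port = 5432
"""

example = configparser.ConfigParser()
example.read_string(SAMPLE)
db_host = example["database"]["host"]          # values are always strings
db_port = example.getint("database", "port")   # typed getter for integers
```

Checking the return value of `read()` matters because a wrong path otherwise produces an empty config rather than an error.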
@evan-burke
evan-burke / sadb.py
Last active September 5, 2020 23:39
simple sqlalchemy wrapper
import configparser
import sqlalchemy as sa
__sadb_version__ = "0.0.2"
# 2020-09-03:
# Jeremy Howard's 'fastsql' is quite a bit better designed than this, but it's also higher-level - uses more ORM stuff
# https://github.com/fastai/fastsql
# honestly this could probably be a full module...