Skip to content

Instantly share code, notes, and snippets.

View evan-burke's full-sized avatar

Evan Burke evan-burke

  • Oakland, CA
View GitHub Profile
@evan-burke
evan-burke / sqlitelib.py
Last active December 23, 2021 16:39
simple python3 sqlite3 wrapper
""" This is pretty basic but makes it a little eaiser to use sqlite3 for common tasks.
Example usage:
db = sqlitelib(dbfile)
result = db.execute("select * from mytable")
or:
with sqlitelib(dbfile) as db:
result = db.execute("select * from mytable")
@evan-burke
evan-burke / retry.php
Last active September 14, 2018 04:51 — forked from orottier/RetryTest.php
Retry function for PHP with linear backoff
<?php
/*
* Retry function for e.g. external API calls
*
* Will try the risky API call, and retries with an ever increasing delay if it fails
* Throws the latest error if $maxRetries is reached,
* otherwise it will return the value returned from the closure.
*
*/
@evan-burke
evan-burke / postgres bulk percentiles calculation
Last active March 30, 2020 13:16
Postgres bulk percentile calculation with generate_series() and percentile_cont()
This will return the percentiles in order.
-- #1
select unnest(
percentile_cont(
(select array_agg(s) from generate_series(0, 1, 0.2) as s)
) WITHIN GROUP (ORDER BY SIZE))
from mytable
@evan-burke
evan-burke / bfg
Last active April 7, 2018 00:19
cleaning sensitive data from a repo
- Start somewhere with Java runtime and a Git command line. Git should be configured with a valid SSH key for the repo.
- Create and push a new commit reverting any changes you want to wipe. This is done to ensure the commit is not 'protected'.
- Create & cd into temporary dir if you want.
- Download .jar file from here - search for 'downloadable jar'.
https://rtyley.github.io/bfg-repo-cleaner/
- Clone repo (with github, use ssh URL, not https) with --mirror:
@evan-burke
evan-burke / dremio-ubuntu
Last active July 10, 2021 01:01 — forked from jcaristy/info.sh
[DREMIO: Install dremio on Ubuntu] #dremio
Installing Dremio 1.4 on Ubuntu 16
### NOTE: this is significantly out of date since I last edited it in Jan 2018.
# See the comments on the gist for suggested changes for more recent versions.
##Install links / references
https://www.dremio.com/tutorials/recommender-scikit-learn-dremio-postgres-mongodb/
https://www.dremio.com/tutorials/dremio-oracle-aws/
https://docs.dremio.com/deployment/standalone-tarball.html
@evan-burke
evan-burke / psycopg2_update_execute_values.py
Last active June 3, 2020 23:45
update query using psycopg2 execute_values
# Updates are a little tricky using psycopg2.extras.execute_values(), and documentation is a little sparse.
# http://initd.org/psycopg/docs/extras.html
db_connection_string = "dbname=EDITME host=EDITME"
# uuids, as strings
my_data = ['6ef0f42a-63da-4edb-9a11-5e146cb337ac','e7b1e961-0a68-4c4f-a716-e0959593f27d','1f82c9a5-00c3-4bd8-8c50-0ede441b4e91']
query = """update my_table t
set bool_field = true
@evan-burke
evan-burke / insert_paginator.py
Last active April 15, 2020 22:02
psycopg2 execute_values wrapper for accurate row counts
# One of the fastest ways to insert bulk data into Postgres (at least, aside from COPY) is using the psycopg2 extras function execute_values.
# However, this doesn't return an accurate row count value - instead, it just returns the row count for the last page inserted.
# This wraps the execute_values function with its own pagination to return an accurate count of rows inserted.
# Performance is approximately equivalent to underlying execute_values function - within 5-10% or so in my brief tests.
import psycopg2
import psycopg2.extras
import math
db_connection_string = "dbname=EDITME host=EDITME"