@mkoehrsen
mkoehrsen / gist:e51ec6f5d167158ef9a4
Created March 20, 2015 19:06
Postgres -- generate view definition mirroring every table in the public schema
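-- Emits one CREATE OR REPLACE VIEW statement per table in the public schema.
-- %I quotes identifiers where needed; ORDER BY keeps columns in table order.
select format('CREATE OR REPLACE VIEW %I', 'dp_' || table_name) ||
E' AS\nSELECT ' || array_to_string(array_agg(format('%I',column_name) order by ordinal_position),E',\n ') ||
E'\nFROM ' || format('%I', table_name) || E';\n'
from information_schema.columns
where table_schema = 'public'
group by table_name;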
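To illustrate the shape of the statements the query above emits, here is a minimal Python sketch that builds the same kind of DDL from plain tuples. The function names `quote_ident` and `mirror_view_ddl`, the `prefix` parameter, and the input tuple layout are assumptions made for this example and are not part of the gist. Unlike Postgres's own `quote_ident`, this helper always double-quotes.

```python
from collections import defaultdict

def quote_ident(name):
    # Hypothetical helper: always double-quote, doubling any embedded quotes.
    return '"' + name.replace('"', '""') + '"'

def mirror_view_ddl(columns, prefix="dp_"):
    """Build one CREATE OR REPLACE VIEW statement per table.

    columns: iterable of (table_name, column_name, ordinal_position) tuples,
    i.e. the same fields the query reads from information_schema.columns.
    """
    by_table = defaultdict(list)
    for table, column, position in columns:
        by_table[table].append((position, column))

    statements = []
    for table in sorted(by_table):
        # Sort by ordinal position so columns appear in table order.
        cols = ",\n ".join(quote_ident(c) for _, c in sorted(by_table[table]))
        statements.append(
            f"CREATE OR REPLACE VIEW {quote_ident(prefix + table)} AS\n"
            f"SELECT {cols}\nFROM {quote_ident(table)};\n"
        )
    return statements

if __name__ == "__main__":
    rows = [("users", "name", 2), ("users", "id", 1)]
    print("".join(mirror_view_ddl(rows)))
```

Building the DDL outside the database like this is mainly useful for previewing or testing the output; the SQL version has the advantage of running in a single query.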
mkoehrsen / postgres-schema-json
Last active April 6, 2018 15:32
Postgres -- dump out a representation of the public schema in json format
with my_columns as
(select table_name,array_agg(json_build_object('column_name', column_name, 'is_nullable', is_nullable, 'data_type', data_type, 'ordinal_position', ordinal_position)) cols
from information_schema.columns
where table_schema='public'
group by table_name),
my_fks as
(select distinct r.constraint_name,
k.table_name from_table,
c.table_name to_table
from information_schema.key_column_usage k
mkoehrsen / gist:ce931b6462f4bb7835c7
Last active August 30, 2015 12:53
Generates the probability distribution describing the maximum number of leading zeroes seen in N random bit-strings, for successive values of N. This is related to the HyperLogLog cardinality estimator, as discussed at http://mkoehrsen.github.io/probability/data-analysis/2015/08/28/hyperloglog-estimator.html.
import numpy

def generate_max_zero_probabilities(maxZ):
    maxZ += 1
    row1 = numpy.array(list(map(lambda x: pow(2,-(x+1)),range(maxZ))))
    row1_cum = row1.cumsum() - row1 # non-inclusive cumsum
    yield (1,row1)
    prev_row = row1
    prev_row_cum = row1_cum
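
The preview above is cut off, so the gist's row-by-row recurrence is not shown here. As a cross-check on the same distribution, the following sketch uses a closed form instead: a single random bit-string has exactly k leading zeroes with probability 2^-(k+1), so it has at most z leading zeroes with probability 1 - 2^-(z+1), and the maximum over N independent strings is at most z with that value raised to the power N. The function name `max_leading_zero_pmf` and its parameters are assumptions for this example, and this is not the author's method.

```python
def max_leading_zero_pmf(n, max_z):
    """P(max leading zeroes == z) for z in 0..max_z, over n random bit-strings."""
    def cdf(z):
        # P(all n strings have at most z leading zeroes); zero for z < 0.
        if z < 0:
            return 0.0
        return (1.0 - 2.0 ** -(z + 1)) ** n

    # The probability of the maximum being exactly z is a difference of CDFs.
    return [cdf(z) - cdf(z - 1) for z in range(max_z + 1)]

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, [round(p, 4) for p in max_leading_zero_pmf(n, 6)])
```

For n = 1 this reduces to the same values as `row1` in the gist, namely 2^-(z+1) for each z.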
mkoehrsen / segment.py
Created November 25, 2015 18:48
Superpixel segmentation in python with SLIC and watershed
# Superpixel segmentation approach that seems to give pretty good contiguous segments.
# (SLIC and quickshift don't seem to guarantee contiguity). The approach is to get initial
# segments from SLIC, use the centroid of each as a marker for watershed, then clean up.
import os, argparse
from skimage import segmentation
from skimage.future import graph
import cv2, numpy
import tempfile
import random