Dan Brown dbro

dbro / partition
Created Apr 24, 2014
partition lines of incoming data into separate files
# write the incoming data on stdin to separate files depending on their contents
# for example, take a file that has different dates in it:
# 2014-02-15+12334567 hello there this is the first line
# 2014-02-16+23345678 hello there this is the second line
# this script can be used to send the first line to a file called /tmp/session.log-20140215-randomnumber
# and the second line to another file called /tmp/session.log-20140216-randomnumber
# it takes the first N characters from the line for use in the output filename
USAGE="usage: $0 -p \"/tmp/session-logs-ready-to-merge-\" [-s \"-ready-for-merge\" -r -c 10 -d'-'] [input_filename] [another_input_filename]
\tp\tprefix path
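The comments above describe keying each input line on its first N characters. A minimal Python sketch of that idea follows; it is not the gist's own shell code. The names `partition_lines`, `key_chars`, `strip_char` and `run_id` are hypothetical, and reading `-c` as a character count and `-d` as a character to strip from the key is an assumption based only on the example filenames.

```python
def partition_lines(lines, prefix, key_chars=10, strip_char="-", run_id=0, suffix=""):
    """Group lines by a key taken from their first key_chars characters.

    Returns a dict mapping output filename -> list of lines.
    run_id stands in for the "randomnumber" part of the filename (assumption).
    """
    groups = {}
    for line in lines:
        # e.g. "2014-02-15" becomes "20140215" once the hyphens are stripped
        key = line[:key_chars].replace(strip_char, "")
        name = f"{prefix}{key}-{run_id}{suffix}"
        groups.setdefault(name, []).append(line)
    return groups


def write_partitions(groups):
    """Append each group of lines to its own file."""
    for name, lines in groups.items():
        with open(name, "a") as fh:
            fh.writelines(lines)
```

Keeping the grouping separate from the file writing makes the key logic easy to check without touching the filesystem.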
dbro / correlation
#!/usr/bin/awk -f
# input should be parallel sets of numbers, one set on each line, tab-separated.
# the input does not need to be sorted
# non-numeric input anywhere on the line will cause the entire line to be ignored
# this uses a naive algorithm that may lose precision in some situations.
# see for an alternate algorithm
n=0; # count of rows (which contain a full set of data)
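The "naive algorithm" mentioned in the comments is commonly the single-pass running-sums formula. The Python sketch below shows that formula for two tab-separated columns; it assumes the script computes a Pearson correlation over exactly two values per line, which the preview does not confirm. `naive_correlation` is a hypothetical name.

```python
import math


def naive_correlation(lines):
    """Pearson correlation of two tab-separated columns using running sums.

    Lines that do not hold exactly two numbers are skipped entirely.
    Returns None when fewer than two rows are usable or a column is constant.
    """
    n = 0
    sx = sy = sxx = syy = sxy = 0.0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        try:
            x, y = (float(f) for f in fields)
        except ValueError:
            continue  # non-numeric field or wrong field count: ignore the line
        n += 1
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
    if n < 2:
        return None
    # Subtracting large, nearly equal sums here is where precision can be lost.
    denom = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    if denom == 0:
        return None
    return (n * sxy - sx * sy) / denom
```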
dbro / summary-stats
Created Apr 24, 2014
summary statistics
#!/usr/bin/awk -f
# input should be a set of numbers, one on each line. can be unsorted.
# non-numeric input will be ignored
# 2-pass algorithm, stores a copy of each number in an array in memory
# this could be changed to assume the input is sorted, but would still
# need to know in advance how many numbers to expect in the full set
# in order to calculate percentiles and the trimmed mean.
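The store-then-sort approach described above can be sketched in Python as follows. This is an illustration, not the gist's awk code: the nearest-rank percentile rule, the 10% default trim and the name `summary_stats` are all assumptions.

```python
def summary_stats(lines, trim_percent=10):
    """Collect numbers (one per line), then compute stats from the sorted copy.

    Non-numeric lines are ignored. Returns None if no numbers were read.
    """
    values = []
    for line in lines:  # pass 1: parse and store every number
        try:
            values.append(float(line.strip()))
        except ValueError:
            continue
    if not values:
        return None
    values.sort()
    n = len(values)

    def percentile(p):
        # nearest-rank rule with integer arithmetic: rank = ceil(p * n / 100)
        rank = max(1, (p * n + 99) // 100)
        return values[rank - 1]

    # pass 2: the trimmed mean needs n up front to know how many values to drop
    k = n * trim_percent // 100
    trimmed = values[k:n - k]
    return {
        "n": n,
        "min": values[0],
        "max": values[-1],
        "mean": sum(values) / n,
        "median": percentile(50),
        "p90": percentile(90),
        "trimmed_mean": sum(trimmed) / len(trimmed),
    }
```

The need to know `n` before choosing ranks or trim bounds is why a streaming single-pass version would still need the total count in advance.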
dbro / urldecode.awk
Created Apr 24, 2014
command line url decoder
#!/usr/bin/awk -f
hextab["0"] = 0; hextab["8"] = 8;
hextab["1"] = 1; hextab["9"] = 9;
hextab["2"] = 2; hextab["A"] = 10; hextab["a"] = 10;
hextab["3"] = 3; hextab["B"] = 11; hextab["b"] = 11;
hextab["4"] = 4; hextab["C"] = 12; hextab["c"] = 12;
hextab["5"] = 5; hextab["D"] = 13; hextab["d"] = 13;
hextab["6"] = 6; hextab["E"] = 14; hextab["e"] = 14;
hextab["7"] = 7; hextab["F"] = 15; hextab["f"] = 15;
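The lookup table above maps each hex digit to its value, which is all that percent-decoding needs. A rough Python equivalent is sketched below; the decoding loop is not visible in the preview, so the handling of `+` as a space and of malformed escapes is an assumption, as is the name `urldecode`.

```python
# Same idea as the awk hextab: hex digit (either case) -> numeric value
HEXTAB = {c: int(c, 16) for c in "0123456789ABCDEFabcdef"}


def urldecode(text, plus_as_space=True):
    """Decode %XX escapes in text; malformed escapes are passed through unchanged."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if (ch == "%" and i + 2 < len(text)
                and text[i + 1] in HEXTAB and text[i + 2] in HEXTAB):
            # high nibble * 16 + low nibble gives the byte value
            out.append(HEXTAB[text[i + 1]] * 16 + HEXTAB[text[i + 2]])
            i += 3
            continue
        if ch == "+" and plus_as_space:  # form-encoding convention (assumption)
            out.append(0x20)
        else:
            out.extend(ch.encode("utf-8"))
        i += 1
    # Escapes may encode multi-byte UTF-8 sequences, so decode at the end
    return out.decode("utf-8", errors="replace")
```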
dbro /
Created Apr 1, 2014
Implementation of Hyper Log-Log probabilistic counting methods in lua inside redis, via python
# Lua routines for use inside the Redis datastore
# Hyperloglog cardinality estimation
# ported from
# Dan Brown, 2012.
# note that lua needs to have the bitlib and murmur3 modules built in, and loaded by redis
# suitable for counting unique items from 0 to billions
# choose a k value to balance storage and precision objectives
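For orientation, here is a small pure-Python sketch of HyperLogLog cardinality estimation. It is not the gist's Lua/Redis code: it swaps murmur3 for a truncated SHA-1 from the standard library, keeps registers in a Python list rather than in Redis, and assumes `k` means the number of hash bits used to pick a register (so there are 2^k registers). The class name `HyperLogLog` is illustrative.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, k=12):
        self.k = k                # index bits (assumed meaning of "k")
        self.m = 1 << k           # number of registers
        self.registers = [0] * self.m
        # standard bias-correction constants
        if self.m == 16:
            self.alpha = 0.673
        elif self.m == 32:
            self.alpha = 0.697
        elif self.m == 64:
            self.alpha = 0.709
        else:
            self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # 64-bit hash taken from SHA-1 (stand-in for murmur3)
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        rest_bits = 64 - self.k
        idx = h >> rest_bits                  # top k bits choose the register
        rest = h & ((1 << rest_bits) - 1)     # remaining bits
        # rank = position of the leftmost 1 bit in the remaining bits
        rank = rest_bits - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # small-range correction: fall back to linear counting
            return self.m * math.log(self.m / zeros)
        return raw
```

A larger `k` costs more memory (2^k registers) and lowers the typical relative error, which is roughly 1.04 / sqrt(2^k).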
dbro / weeklyupdate.js
Created Apr 26, 2013
This is a Google Apps Script that replicates the "Snippets" messaging process as used by Google internally. See the notes at the bottom of this page for more info.
/* **************************************
Weekly Update Scripts
by Dan Brown, March 2013
For automatic collection of weekly
update messages from employees.
* sends reminder messages
* posts to public sites pages
dbro / csvcut
Last active Aug 1, 2019 — forked from JoeGermuska/csvcut
Command line 'cut' utility that can handle csv quoting. This allows proper handling of fields that contain delimiters, both field and record delimiters like commas and newlines. Thanks to for the initial version of the code.
#!/usr/bin/env python
"""
Like cut, but for CSVs. To be used from a shell command line.
Note that fields are 1-based, similar to the UNIX 'cut' command.
Should use something better than getopt, but this works...
"""
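The core of a quoting-aware `cut` can be sketched with Python's standard `csv` module, which already handles delimiters and newlines inside quoted fields. The function below is an illustration only, not the gist's code: `csv_cut` and its parameters are hypothetical names, and writing an empty string for a missing field is an assumption.

```python
import csv
import io


def csv_cut(infile, outfile, fields, delimiter=","):
    """Copy the selected 1-based field numbers from infile to outfile.

    Field numbers past the end of a row produce an empty string (assumption).
    """
    reader = csv.reader(infile, delimiter=delimiter)
    writer = csv.writer(outfile, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow([row[i - 1] if 0 < i <= len(row) else "" for i in fields])


if __name__ == "__main__":
    # The quoted field contains both a comma and a newline
    src = io.StringIO('name,note,n\n"Smith, J","line1\nline2",3\n')
    out = io.StringIO()
    csv_cut(src, out, [1, 3])
    print(out.getvalue(), end="")
```

Because the reader works on records rather than physical lines, a newline inside a quoted field does not split the row.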