@lyoshenka
Last active March 2, 2016 23:46
Read a bunch of numbers (one per line), print some basic stats
#!/bin/bash
# sample usage:
# cat *.csv | cut -d',' -f3 | ./stat.sh
set -euo pipefail
# Read from the file named by the first argument if it exists, otherwise from stdin.
[ -n "${1:-}" ] && [ -f "${1:-}" ] && INPUT="$1" || INPUT="-"
cat "$INPUT" | sort -n | awk '
BEGIN {
    count = 0;
    sum = 0;
    # running mean (M) and running sum of squared deviations (S)
    oldM = newM = oldS = newS = 0;
}
# only plain non-negative decimals are counted; any other line is skipped
$1 ~ /^[0-9]+(\.[0-9]*)?$/ {
    a[count++] = $1;
    sum += $1;
    if (count == 1) {
        oldM = newM = $1;
    } else {
        # online update of the mean and the squared-deviation sum
        newM = oldM + (($1 - oldM) / count);
        newS = oldS + (($1 - oldM) * ($1 - newM));
        oldM = newM;
        oldS = newS;
    }
}
END {
    mean = count > 0 ? sum / count : 0;
    # this is the sample std dev, not the population std dev
    # http://www.johndcook.com/blog/standard_deviation/
    stdev = count > 1 ? sqrt(newS/(count-1)) : 0;
    # the input is already sorted, so the median is the middle element,
    # or the average of the two middle elements when count is even
    if( (count % 2) == 1 ) {
        median = a[ int(count/2) ];
    } else {
        median = (a[count/2] + a[count/2-1]) / 2;
    }
    print "sum\t" sum;
    print "count\t" count;
    print "min\t" a[0];
    print "median\t" median;
    print "mean\t" mean;
    print "stdev\t" stdev;
    # percentiles are read straight from the sorted array, without interpolation
    print "90%\t" a[int(count*0.90-0.5)];
    print "95%\t" a[int(count*0.95-0.5)];
    print "99%\t" a[int(count*0.99-0.5)];
    print "max\t" a[count-1];
}
'
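
The running mean and standard deviation update used in the script can be tried on its own. The following is a minimal sketch, not part of the original gist: the function name `running_stats` and the four-decimal output format are assumptions made for illustration. It applies the same online update (each value adjusts the mean, then adds to a running sum of squared deviations) and reports the sample standard deviation.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: reads numbers from
# stdin (one per line) and prints the mean and the sample standard deviation.
running_stats() {
  awk '
    # Same filter as the script above: plain non-negative decimals only.
    $1 ~ /^[0-9]+(\.[0-9]*)?$/ {
      n++
      delta = $1 - mean            # distance from the previous mean
      mean += delta / n            # updated running mean
      m2 += delta * ($1 - mean)    # running sum of squared deviations
    }
    END {
      # Sample standard deviation (divide by n - 1); 0 for fewer than 2 values.
      stdev = n > 1 ? sqrt(m2 / (n - 1)) : 0
      printf "mean\t%.4f\nstdev\t%.4f\n", mean, stdev
    }
  '
}
```

For example, `printf '2\n4\n4\n4\n5\n5\n7\n9\n' | running_stats` reports a mean of 5.0000 and a stdev of 2.1381. Unlike the full script, this sketch does not need sorted input, because it computes no median or percentiles.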