@infotroph
infotroph / tocsv.sh
Last active December 16, 2015 08:58
Sed example for concatenating multiple space-delimited files, with varying kinds of messy header line, into single CSV files.
#!/bin/bash
trtary=(a b c)
echo -e '1\tfoo.out\n0\tunwanted.out\n1\tbar.out\n1\tbaz.out' > tocsv.files
for t in "${trtary[@]}"; do
# generate sample data. Each treatment overwrites previous .out files.
echo -e 'col1 col2 col3\n1 2 3\n4 5 6\n7 8 9' > foo.out
echo -e 'extrajunk\ncol1 col2 col3 col4\n1 2 3 4\n5 6 7 8\n9 10 11 12' > bar.out
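The sample data above shows the two kinds of mess involved: a header on the first line, or a header preceded by junk. What follows is a minimal sketch of the general sed idea, not the gist's own script (which is cut off in this listing). The `col1` header marker, the `to_csv` helper, and the scratch directory are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sketch: file contents, the "col1" header marker, and the
# to_csv helper are assumptions, not taken from the original script.
workdir=$(mktemp -d)
printf 'col1 col2 col3\n1 2 3\n4 5 6\n' > "$workdir/foo.out"
printf 'extrajunk\ncol1 col2 col3\n7 8 9\n' > "$workdir/bar.out"

to_csv() {
    # Print from the first line starting with "col1 " to the end of the file,
    # then turn each run of spaces into a single comma.
    sed -n '/^col1 /,$p' "$1" | sed 's/  */,/g'
}

# Keep the header from the first file only; drop it (line 1) for the rest.
{
    to_csv "$workdir/foo.out"
    to_csv "$workdir/bar.out" | sed '1d'
} > "$workdir/combined.csv"

cat "$workdir/combined.csv"
```

Anchoring on the header text rather than a fixed line count is what lets the same command handle files with and without leading junk. It only works if every file's header starts the same way.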
infotroph / git-MSOffice-diff
Last active December 17, 2015 02:28
A quick and dirty approach to make git-diff on MS Office documents use the decompressed XML instead of treating them as binary. Known bugs: Complains on empty components (notably including the always-present [Content_Types].xml), and actually shows all changes to the XML (so good luck finding meaningful changes to an Excel file under all the dis…
# /usr/local/bin/xmlzip.sh:
#!/bin/bash
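# Read entry names line by line so names containing spaces or glob
# characters (such as [Content_Types].xml) are not split or expanded.
unzip -Z -1 "$1" | while IFS= read -r i; do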
echo "$i"
unzip -a -p "$1" "$i" | xmllint --format -
done
# ~/.gitconfig:
[diff "xmlzip"]
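The gitconfig excerpt is cut off here, so its actual contents are unknown. As a hedged sketch of how a textconv driver is usually wired up in git: the driver name `xmlzip` and the script path come from the excerpt above, while the file patterns and the scratch repository are assumptions.

```shell
#!/bin/bash
# Sketch only: the file patterns and the scratch repository are assumptions;
# the gist's real configuration is not visible in this listing.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Tell git which command converts a matching file to text before diffing.
git config diff.xmlzip.textconv /usr/local/bin/xmlzip.sh

# Route Office Open XML files through that driver.
printf '%s\n' '*.docx diff=xmlzip' '*.xlsx diff=xmlzip' '*.pptx diff=xmlzip' > .gitattributes

git config --get diff.xmlzip.textconv
```

This writes the setting to the repository's own config rather than `~/.gitconfig`, which keeps the experiment contained; the same key can be set globally with `git config --global`.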
#Exim filter
logfile $home/eximfilter.log
logwrite "orig_local: $original_local_part"
if error_message then
logwrite "Error message; not filtering"
finish
endif
infotroph / git-filtering
Last active October 24, 2018 13:52
How to split a subdirectory of a Git repository off into its own repository, *without* losing history of files that have been moved from parent into subdirectory?
# The overarching problem: I'm an indecisive mofo.
# The solvable problem:
# I started a repo, later decided to move some things to a subdirectory,
# and later still decided to move that subdirectory to its own repo.
# I want the new repo to contain the history of only the files that
# currently live in the subdirectory... *including* their history
# from before I moved them into the subdirectory.
# Note that I'm more interested in preserving all history in subdir
# than I am in removing evidence of the original parent repo...
# the parent isn't secret, just large.
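One ingredient of any solution is knowing every path a file has lived at. The sketch below is not the author's method (the rest of the gist is not shown here); it only demonstrates, in a scratch repository with made-up file names, that `git log --follow` can recover a file's pre-move path.

```shell
#!/bin/bash
# Sketch only: a scratch repository with hypothetical file names.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo 'hello' > notes.txt
git add notes.txt
commit 'add notes at top level'

mkdir subdir
git mv notes.txt subdir/notes.txt
commit 'move notes into subdir'

# --follow traces a single file across renames; --name-only with an empty
# format prints just the path the file had in each commit.
paths=$(git log --follow --name-only --format= -- subdir/notes.txt | grep -v '^$' | sort -u)
echo "$paths"
```

The resulting list contains both the old and the new location, which is the set of paths a history-filtering step would need to keep for that file.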
infotroph / gitxlscmp.sh
Last active January 2, 2016 23:29
Say you have an XLS file you've stupidly allowed into your repository and even more stupidly tried to update. Here's a shortcut to open the current and last-committed versions for a side-by-side eyeball diff.
#!/bin/bash
FN=$(basename "$1")
TMPFILE=$(mktemp -t gitxlscmp."${FN}") || exit 1
git show HEAD:"$1" > "$TMPFILE"
open -a "Microsoft Excel" "$TMPFILE" "$1"
# We want to delete $TMPFILE once Excel has it open,
# but `open` returns immediately, so poll with lsof ourselves
until lsof -a -c "Microsoft Excel" "$TMPFILE" > /dev/null; do
infotroph / gist:9479509
Created March 11, 2014 04:24
Keybase.md
### Keybase proof
I hereby claim:
* I am infotroph on github.
* I am infotroph (https://keybase.io/infotroph) on keybase.
* I have a public key whose fingerprint is E056 C0A5 5FB2 EA2C 897C 591A 19B3 5E7D 101C 0BEF
To claim this, I am signing this object:
infotroph / gist:9751447
Created March 24, 2014 23:19
No speed win from ddply
> system.time({a=by(raw, raw$Img, strip.tracing.dups); b=do.call(rbind,a)})
user system elapsed
35.127 5.569 40.398
> system.time({a=by(raw, raw$Img, strip.tracing.dups); b=do.call(rbind,a)})
user system elapsed
35.559 5.482 40.728
> system.time({a=by(raw, raw$Img, strip.tracing.dups); b=do.call(rbind,a)})
user system elapsed
35.666 4.975 40.366
> system.time({a=ddply(raw, .(Img), strip.tracing.dups)})
infotroph / gist:58608607659a5ce7a989
Last active August 29, 2015 14:00
short-circuiting or (not?) in any()
> f <- function() { print('FALSE'); FALSE }
# infix logical operators short-circuit as expected
> TRUE || f()
[1] TRUE
> FALSE || f()
[1] "FALSE"
[1] FALSE
> TRUE && f()
[1] "FALSE"
infotroph / findstable.R
Last active August 29, 2015 14:04
Find changepoints from irregular timestamps
# Goal: Identify the rows of a time series where I should expect concentration to be steady
# (i.e. setpoint has not changed recently).
# N.B. Not yet testing whether concentration IS steady -- that's the next step downstream.
# Wrinkles: setpoints are logged at a lower frequency than concentrations, and logging intervals
# for both are just irregular enough to be troublesome.
# Generate sample data:
# running log of gas concentrations, recorded approximately every second
concdata = data.frame(
time = as.POSIXct((1:50) + rnorm(50, mean=0, sd=0.1), origin="2014-07-07"),
infotroph / gist:448abd19357a0418e7ad
Last active August 29, 2015 14:07
pointer-like variable updating between classes?
I have a parser inherited from someone else, and would rather not modify it if I don't have to.
The TL;DR on what's below: "Do I have to modify it?"
To pick the character encoding of its input, the parser is capable of either taking a
character encoding argument or of sniffing the encoding itself. Its designated initializer
method takes a pointer to an encoding:
-(instancetype)initWithStream:(NSStream *)stream usedEncoding:(NSStringEncoding *)encoding;
and I understand the idea is that I check the pointee to see whether the parser updated its value when