@infotroph
infotroph / gitxlscmp.sh
Last active January 2, 2016 23:29
Say you have an XLS file you've stupidly allowed into your repository and even more stupidly tried to update. Here's a shortcut to open the current and last-committed versions for a side-by-side eyeball diff.
#!/bin/bash
FN=$(basename "$1")
TMPFILE=$(mktemp -t gitxlscmp."${FN}") || exit 1
git show HEAD:"$1" > "$TMPFILE"
open -a "Microsoft Excel" "$TMPFILE" "$1"
# Want to delete $TMPFILE once open in Excel,
# but $(open) returns immediately, so check for ourselves
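until lsof -a -c "Microsoft Excel" "$TMPFILE" > /dev/null; do
	# Assumed completion (the listing is cut off after the `until` line):
	# poll once a second, then remove the temp file once Excel has it open.
	sleep 1
done
rm "$TMPFILE"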
infotroph / gist:ec031ff026b064879dc0
Created January 20, 2016 21:46
Unwanted ggplot2 update from running install_github(...)
R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
> packageVersion("ggplot2")
[1] ‘1.0.1’
> library(ggplotTicks)
Error in library(ggplotTicks) : there is no package called ‘ggplotTicks’
> library(devtools)
infotroph / gist:7675ccacdfff28cd64ac
Created January 26, 2016 21:43
Controlling new column names from mutate_each?
# Sample data: 2x2 manipulation, replicated 4x, multiple responses measured repeatedly over time.
# The response variables aren't related: y_also doesn't come "after" y_1 in any meaningful sense.
df = expand.grid(
day=1:3,
replicate=1:4,
t1=c("ctrl", "more"),
t2=c("ambient", "manipulated"))
df$y_1 = rnorm(48)
df$y_other = runif(48)
df$y_also = rlnorm(48)
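# One possible base-R sketch of the goal in the title: apply a function to every
# response column and choose the new column names explicitly. The `_scaled` suffix
# and the use of `scale()` are illustrative assumptions, not taken from the gist,
# and this sidesteps `mutate_each` rather than showing how to configure it.

```r
# Rebuild the sample data so the sketch runs on its own.
df = expand.grid(
  day=1:3,
  replicate=1:4,
  t1=c("ctrl", "more"),
  t2=c("ambient", "manipulated"))
df$y_1 = rnorm(48)
df$y_other = runif(48)
df$y_also = rlnorm(48)

# Pick the response columns by their shared prefix.
ycols = grep("^y_", names(df), value=TRUE)

# Hypothetical transformation and suffix: center and scale each response,
# storing the result under an explicitly constructed name.
newcols = paste0(ycols, "_scaled")
df[newcols] = lapply(df[ycols], function(v) as.numeric(scale(v)))

names(df)
```

# Building the names with paste0() keeps the naming rule in one visible place,
# so swapping the suffix for a prefix is a one-line change.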
strs = c("nomatch", "Odum 1969", "2001FACE_3b", "1800", "sdfghj1928qwer")
strs_yrs = gsub(".*((19|20)[0-9]{2}).*", "\\1", strs)
strs_yrs
# [1] "nomatch" "1969" "2001" "1800" "1928"
# Oops, strings with no year (or year from wrong century) are returned unchanged! Let's remove them separately:
strs_haveyr = grepl("(19|20)[0-9]{2}", strs)
strs_yrs[!strs_haveyr] = ""
strs_yrs
# [1] "" "1969" "2001" "" "1928"
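# A variant that avoids the separate clean-up step, sketched with base R's
# regexpr() and regmatches(): extract only where the pattern matched and leave
# everything else as "". The variable names `m` and `yrs` are introduced here
# for illustration only.

```r
strs = c("nomatch", "Odum 1969", "2001FACE_3b", "1800", "sdfghj1928qwer")

# regexpr() gives the position of the first match in each string, or -1 if none.
m = regexpr("(19|20)[0-9]{2}", strs)

# Start from all-empty results, then fill in only the strings that matched.
# regmatches() returns one substring per matching element, in order.
yrs = rep("", length(strs))
yrs[m != -1] = regmatches(strs, m)
yrs
# [1] "" "1969" "2001" "" "1928"
```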
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.3 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets base
infotroph / do_not_attempt.R
Last active March 12, 2016 21:39
Just getting some bad aesthetic decisions out of my system while working out how to do a different thing. Might be "useful" for teaching ggplot themes?
anycolors = function(n)rgb(runif(n), runif(n), runif(n))
color_grid = function(x, y, nx=10, ny=nx){
g = expand.grid(
x=seq(min(x), max(x), by=diff(range(x))/nx),
y=seq(min(y), max(y), by=diff(range(y))/ny))
g$color = anycolors(nrow(g))
g
}
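# A small usage sketch for the two helpers above, repeated here so it runs on
# its own. The input ranges (0 to 10 and 0 to 5) are made-up values; with the
# default nx = ny = 10 the grid has 11 points per axis.

```r
anycolors = function(n) rgb(runif(n), runif(n), runif(n))
color_grid = function(x, y, nx=10, ny=nx){
  g = expand.grid(
    x=seq(min(x), max(x), by=diff(range(x))/nx),
    y=seq(min(y), max(y), by=diff(range(y))/ny))
  g$color = anycolors(nrow(g))
  g
}

# Hypothetical inputs: only the range of each vector matters.
g = color_grid(x=c(0, 10), y=c(0, 5))
nrow(g)   # 11 x values times 11 y values
head(g)
```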
infotroph / gist:c288c39e260548c9d34c
Created March 24, 2016 20:07
A possibly terrible approach to plotting across year boundaries
library(ggplot2)
library(dplyr)
library(lubridate)
dates = c(
seq(from=as.Date("2014-10-01"), to=as.Date("2015-05-30"), by="days"),
seq(from=as.Date("2015-10-01"), to=as.Date("2016-05-30"), by="days"))
Tmin = rnorm(length(dates), mean=10*sin((yday(dates) - 90)/365*2*pi))
Tmax = Tmin + rnorm(length(dates), mean=5)
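# One way to put both October-to-May seasons on a shared axis, sketched in base R.
# This is not necessarily the approach the gist goes on to take: `season_start`
# and `season_day` are hypothetical names, and the October 1 cutoff is an
# assumption read off the sample dates.

```r
dates = c(
  seq(from=as.Date("2014-10-01"), to=as.Date("2015-05-30"), by="days"),
  seq(from=as.Date("2015-10-01"), to=as.Date("2016-05-30"), by="days"))

yr = as.integer(format(dates, "%Y"))
mon = as.integer(format(dates, "%m"))

# A season is labeled by the year it starts in: Oct-Dec belong to the current
# year's season, Jan-May to the season that began the previous October.
season_start = ifelse(mon >= 10, yr, yr - 1L)

# Days elapsed since that season's October 1, so both seasons start at 0.
season_day = as.integer(dates - as.Date(paste0(season_start, "-10-01")))

range(season_day)
```

# Plotting against season_day, with season_start as a grouping or color variable,
# keeps each winter contiguous instead of splitting it at January 1.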
infotroph / gist:7e28b208e43792f7e4ce
Last active March 27, 2016 00:45
Mapping arbitrary strings to arbitrary numbers
# I have a vector of strings that map to known numeric values.
# What's the cleanest/most reader-friendly R idiom for this conversion?
# sample data
df = expand.grid(
x_str = c("string1", "secondstring", "blah", "garbagestring"),
replicate=1:3,
stringsAsFactors=FALSE)
# Approach 1: Encode the look-up table as its own dataframe
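# A minimal sketch of what a data-frame look-up could look like, using base R's
# match(). The preview cuts off before the gist's own code, so the table name
# `lookup`, the column `x_num`, and all numeric values are placeholders.

```r
df = expand.grid(
  x_str = c("string1", "secondstring", "blah", "garbagestring"),
  replicate=1:3,
  stringsAsFactors=FALSE)

# Hypothetical look-up table: one row per known string.
lookup = data.frame(
  x_str = c("string1", "secondstring", "blah", "garbagestring"),
  x_num = c(1.5, 20, 0, -1),
  stringsAsFactors=FALSE)

# match() finds, for each row of df, the row of lookup holding the same string;
# a string missing from the table would yield NA rather than an error.
df$x_num = lookup$x_num[match(df$x_str, lookup$x_str)]
head(df)
```

# Unlike merge(), indexing through match() preserves the original row order of df.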
# Demo of intermittent plotting failures.
# Apparent requirements:
# * Interactive session (I'm in R.app using Quartz graphics)
# * Arranging multiple figures using gridExtra::arrangeGrob
# * Very different figure complexities
# (i.e., one plot big enough to have a rendering lag, one that renders much faster.)
# * No graphics device open at beginning of arrangeGrob call
# Gentle reader, can you reproduce this on your machine?
library(dplyr)
gather = tidyr::gather
df = data.frame(
one=c(1,NA,NA,3, NA),
two=c(NA,3,NA,4,NA),
three=c(NA,NA,2,NA,5),
x=rnorm(5))
result = (df