Erin Shellman erinshellman

@erinshellman
erinshellman / gist:75dd00ca211988b39387
Created October 26, 2015 20:53 — forked from jimbojsb/gist:1630790
Code highlighting for Keynote presentations

Step 0:

Install Homebrew on your Mac if you don't already have it.

Step 1:

Install highlight: "brew install highlight". (This also pulls in Lua and Boost.)

Step 2:

erinshellman / ec2_setup.sh
Created September 8, 2015 17:41
Stuff to run when firing up a fresh EC2 instance
sudo yum upgrade
sudo yum install tmux
tmux new -s session_name
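
As written, the yum commands above prompt for confirmation, and tmux refuses to create a session whose name is already taken. A rough sketch of a non-interactive variant, not part of the original gist, might look like the following. The `run` helper, the `DRY_RUN` switch, and the `SESSION_NAME` variable are hypothetical names introduced here for illustration; `-y` (yum) and `-A` (tmux new-session) are standard flags, though `-A` assumes a reasonably recent tmux.

```shell
#!/usr/bin/env bash
# Sketch of a non-interactive EC2 bootstrap; helper and variable names are illustrative.
set -euo pipefail

SESSION_NAME="${1:-session_name}"   # hypothetical: session name taken from the first argument
DRY_RUN="${DRY_RUN:-1}"             # hypothetical: 1 = only print commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run sudo yum -y upgrade                     # -y answers "yes" to yum's prompts
run sudo yum -y install tmux
run tmux new-session -A -s "$SESSION_NAME"  # -A attaches if the session already exists
```

By default the script only prints what it would do. Assuming it were saved as ec2_setup.sh, something like `DRY_RUN=0 bash ec2_setup.sh work` would run the commands and open (or reattach to) a session named "work".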

Keybase proof

I hereby claim:

  • I am erinshellman on github.
  • I am erinshellman (https://keybase.io/erinshellman) on keybase.
  • I have a public key whose fingerprint is 833D 0373 26F9 8772 7590 D108 7095 2082 5D2D AD29

To claim this, I am signing this object:

erinshellman / shellman_theme.R
Last active November 20, 2015 18:11
This is my own personal theme for ggplots. Mostly theme_minimal() with bigger labels.
library(ggplot2)  # provides theme_minimal(), theme(), element_text(), element_blank(), %+replace%

theme_presentation = function (base_size = 12, base_family = "") {
  theme_minimal(base_size = base_size, base_family = base_family) %+replace%
    theme(
      axis.text.x = element_text(size = 15),
      axis.text.y = element_text(size = 15),
      title = element_text(size = 20),
      axis.title = element_text(size = 18),
      axis.ticks = element_blank(),
      legend.position = 'bottom')
}
// spark-shell --jars /home/otto/algebird-core_2.10-0.9.0.jar,/home/mforns/refinery-core-0.0.9.jar
import java.util.Date
import java.text.SimpleDateFormat
import org.wikimedia.analytics.refinery.core.PageviewDefinition
import org.wikimedia.analytics.refinery.core.Webrequest
import scala.math.pow
import org.apache.spark.rdd.RDD
import com.twitter.algebird.QTree
erinshellman / multiread_delim.R
Last active August 29, 2015 14:16
Read in multiple, identically ordered files.
multiread.delim = function(path, header = TRUE, sep = '\t') {
  # multiread.delim is a function to read in all the files in a given directory
  # and rbind them into one data frame. Input files need to be of the same
  # structure. Headers will be labeled according to the headers in the first file.
  file_names = paste(path, list.files(path), sep = '/')
  for (file in file_names) {
    if (!exists('combined_df')) {
erinshellman / clean-headers.R
Last active November 27, 2017 23:32
A collection of little helper functions for quick data cleaning in R
clean_headers = function(headers) {
  # Make lowercase
  headers = tolower(headers)
  # Remove spaces and replace dots with underscores
  headers = gsub(' ', '', headers, fixed = TRUE)
  headers = gsub('.', '_', headers, fixed = TRUE)
  headers = gsub('[^[:alnum:]_]', '', headers) # remove all symbols except '_'
  headers = gsub('_+', '_', headers) # collapse any run of underscores into one
erinshellman / r_requirements.R
Last active August 29, 2015 14:01
A helper script in the spirit of Python's "requirements" file. Run when beginning a project to load up all requirements, or install them if missing.
libs = c('arm',
'biglm',
'car',
'doParallel',
'dplyr',
'gclus',
'ggplot2',
'gplots',
'ggthemes',
'lpSolveAPI',
erinshellman / dict_to_df.R
Last active September 14, 2021 09:15
Convert Python dictionary to R data.frame
py_dict = readLines('python_dictionary.txt')
# e.g.
#{"cat_name": "Ella", "dwell_status": "tree_dweller", "coat_color": "gray, white, orange", "is_from_hell": "Y"}
#{"cat_name": "Billie", "dwell_status": "bush_dweller", "coat_color": "gray, white", "is_from_hell": "N"}
dict_to_df = function(dict) {
  require(plyr)
  df = data.frame()
  df_temp = list()