Matt Harris mrecos

View GitHub Profile
@mrecos
mrecos / Zubrow Pop Change ggplot2.r
Created March 16, 2016 10:09
Uses Zubrow (1974) data on population change over time in the Pueblos of New Mexico to illustrate new features in ggplot2: subtitles and captions. More details on these new features can be found here: https://gist.github.com/hrbrmstr/283850725519e502e70c
library("ggplot2") # Must be dev version, use: devtools::install_github("hadley/ggplot2")
library("gridExtra")
library("extrafont") # Need to follow steps here: http://zevross.com/blog/2014/07/30/tired-of-using-helvetica-in-your-r-graphics-heres-how-to-use-the-fonts-you-like-2/
# create data frame
year <- c(1760, 1790, 1797, 1850, 1860, 1889, 1900, 1910, 1950)
sites <- c("Isleta", "Acoma", "Laguna", "Zuni", "Sandia", "San Felipe",
           "Santa Ana", "Zia", "Santo Domingo", "Jemez", "Cochiti",
           "Tesuque", "Nambe", "San Ildefonso", "Pojoaque", "Santa Clara",
           "San Juan", "Picuris", "Taos")
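The subtitle and caption features mentioned in the description above can be sketched with `labs()`. This is a minimal illustration, assuming a ggplot2 release that supports the `subtitle` and `caption` arguments; the `toy` data frame and all label text are hypothetical placeholders, not Zubrow's data or the gist's actual plot.

```r
library("ggplot2")

# Hypothetical placeholder values, NOT Zubrow's data; used only to show layout
toy <- data.frame(year = c(1760, 1790, 1797, 1850),
                  population = c(100, 120, 90, 110))

# title, subtitle, and caption are all set through labs()
p <- ggplot(toy, aes(x = year, y = population)) +
  geom_line() +
  labs(title = "Population change over time",
       subtitle = "Placeholder values for illustration",
       caption = "Source: hypothetical data")

# print(p) draws the plot on the active graphics device
```

The subtitle appears beneath the title, and the caption is placed at the bottom of the plot, which makes it a natural spot for a data citation.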
library("ggplot2") # Must use Dev version as of 03/18/16
library("gridExtra")
library("extrafont") # for font selection
library("dplyr") # for data preparation
library("cowplot") # for combining plots
# Prepare data for plotting
# data from Zubrow, E.B.W. (1974), Population, Contact, and Climate in the New Mexican Pueblos
# prepared as a long format to facilitate plotting
year <- c(1760, 1790, 1797, 1850, 1860, 1889, 1900, 1910, 1950)
mrecos / Boxplot_compare.R
Last active May 6, 2023 19:28
Code for blog post: http://matthewdharris.com/2016/03/29/boxplot-or-not-to-boxplot-woe-ful-example/ A post to compare a bunch of visualizations against the boxplot.
library("data.table")
library("rowr")
library("dplyr")
library("ggplot2")
library("Information")
library("knitr")
library("ggrepel")
library("ggthemes")
library("ggalt")
library("xtable")
mrecos / Feature_space_prediction.r
Created April 19, 2016 20:26
Code for a blog post about visualizing predictions within feature space across a number of different models: https://matthewdharris.com/2016/04/19/predicting-in-feature-space/
### FUNCTIONS
# a quick function that plots the lowess response curve for binary data
plot.logreg <- function(dat){
  dat$obs <- as.numeric(as.character(dat$obs)) # convert y factor to {0,1} digits
  clr <- ifelse(dat$obs == 1, "orange", "blue")
  pp <- ggplot(dat, aes(x = pred, y = obs)) +
    geom_point(color = "gray30", alpha = 0.3) +
    geom_jitter(width = 0.1, height = 0.11, color = clr) +
    geom_rug(color = clr) +
    geom_smooth(formula = y ~ x, se = FALSE, color = "gray25") +
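The idea behind the function above, a smoothed response curve drawn over binary outcomes, can also be sketched in base R. This is only an illustration under stated assumptions: it uses `stats::lowess()` rather than the `geom_smooth()` call in the gist, and `pred` and `obs` are simulated here, not taken from the blog post.

```r
set.seed(42)                               # simulated data, for illustration only
n    <- 200
pred <- runif(n)                           # hypothetical predicted probabilities
obs  <- rbinom(n, size = 1, prob = pred)   # simulated {0,1} outcomes

# lowess() returns a list with x sorted ascending and the smoothed y values
fit <- lowess(x = pred, y = obs, f = 2/3)

# Jitter the binary outcomes vertically so overlapping points stay visible
plot(pred, jitter(obs, amount = 0.05),
     col = ifelse(obs == 1, "orange", "blue"),
     xlab = "pred", ylab = "obs")
lines(fit, col = "gray25", lwd = 2)
```

If the predictions are well calibrated, the smoothed curve should rise roughly along the diagonal from 0 to 1.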
mrecos / Gaussian_Process_blog_post.r
Created May 27, 2016 17:46
Code for my blog post on estimating priors, posteriors, and simulating hyperparameters in R using Stan
############### FUNCTIONS ####################
## Simulate GP based on fixed sigma, rho, and eta
## calls gp-predict_SE.stan
## this sims over same data (Y,X), but uses inferred hyperparameters
sim_GP_y <- function(y1, x1, sigma_sq, rho_sq, eta_sq, iter = iter, chains = chains){
  sim_fit <- stan(file="gp-predict_SE.stan", data=list(x1=x1, y1=y1, N1=length(x1),
                                                       x2=x, N2=length(x), eta_sq=eta_sq,
                                                       rho_sq=rho_sq, sigma_sq=sigma_sq),
                  iter=iter, chains=chains)
mrecos / gp-sim_SE.stan
Last active May 27, 2016 17:51
Stan code for estimating Gaussian Process priors with a squared exponential kernel. See: https://github.com/stan-dev/example-models/tree/master/misc/gaussian-process
// Sample from Gaussian process
// All data parameters must be passed as a list to the Stan call
// Based on original file from https://code.google.com/p/stan/source/browse/src/models/misc/gaussian-process/
data {
  int<lower=1> N;
  real x[N];
  real eta_sq;
  real rho_sq;
  real sigma_sq;
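The three hyperparameters declared above can be related to the squared exponential kernel with a short base R sketch. It assumes the parameterization k(x_i, x_j) = eta_sq * exp(-rho_sq * (x_i - x_j)^2) with sigma_sq added on the diagonal, as in the Stan example models this file cites; the truncated preview does not show the kernel, so treat that form as an assumption. The helper `se_kernel()` and the input values are hypothetical.

```r
# Assumed kernel: k(x_i, x_j) = eta_sq * exp(-rho_sq * (x_i - x_j)^2),
# with sigma_sq added on the diagonal as observation noise
se_kernel <- function(x, eta_sq, rho_sq, sigma_sq) {
  d2 <- outer(x, x, function(a, b) (a - b)^2)   # squared pairwise distances
  eta_sq * exp(-rho_sq * d2) + diag(sigma_sq, length(x))
}

x <- seq(-5, 5, length.out = 20)                 # hypothetical input grid
K <- se_kernel(x, eta_sq = 1, rho_sq = 1, sigma_sq = 0.1)

# One draw from a zero-mean GP prior: chol() returns upper-triangular R with
# t(R) %*% R == K, so t(R) %*% z has covariance K when z is standard normal
set.seed(1)
y <- as.vector(t(chol(K)) %*% rnorm(length(x)))
```

Under this form, a larger `rho_sq` makes the correlation decay faster with distance, `eta_sq` scales the overall variance of the function, and `sigma_sq` keeps the matrix numerically positive definite.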
mrecos / gp-predict_SE.stan
Created May 27, 2016 17:50
Stan code for sampling the response via a Gaussian process with a squared exponential kernel. See: https://github.com/stan-dev/example-models/tree/master/misc/gaussian-process
// Predict from Gaussian Process
// All data parameters must be passed as a list to the Stan call
// Based on original file from https://code.google.com/p/stan/source/browse/src/models/misc/gaussian-process/
data {
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  int<lower=1> N2;
  vector[N2] x2;
mrecos / GP_estimate_eta_rho_SE.stan
Created May 27, 2016 17:52
Stan code for estimating the eta and rho hyperparameters of the squared exponential kernel within a Gaussian Process.
// Predict from Gaussian Process
// estimate sigma_sq and rho_sq
// All data parameters must be passed as a list to the Stan call
// Based on original file from https://code.google.com/p/stan/source/browse/src/models/misc/gaussian-process/
data {
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  int<lower=1> N2;
mrecos / intrees.r
Created July 30, 2016 02:59
Loop to run simulations of inTrees methods across RF, GBM, and rpart algorithms. Code supporting blog post: http://matthewdharris.com/2016/07/30/one-tree-to-rule-them-all-intrees-and-rule-based-learing
library("data.table")
library("rowr")
library("inTrees")
library("dplyr")
library("randomForest")
library("xtable")
library("caret")
library("gbm")
library("rpart")
library("reshape2")
mrecos / dogs_vs_cat_books.R
Last active August 9, 2016 00:56
Code for a ggplot2 implementation of the dog vs cat plot posted by @hpster, using the dev (GitHub) versions of ggplot2 and ggalt.
library("ggplot2")
library("ggalt")
library("dplyr")
color_function <- colorRampPalette(c("cadetblue3", "darkolivegreen3"))
num_books <- 35
books <- paste0("book",1:num_books)
dogcat <- runif(length(books), -2.5, 2.5)
dat <- data.frame(books = books, dogcat = dogcat)