View first_orchard_simulation.R
roll_die <- function(status, strategy) {
  roll = sample(c(names(status), "basket"), 1)
  if (roll == "basket") {
    trees = status[names(status) != "raven" & status > 0]
    if (strategy == "optimal") {
      biggest_trees = trees[trees == max(trees)]
      roll = sample(names(biggest_trees), 1)
    } else if (strategy == "random") {
      roll = sample(names(trees), 1)
    } else if (strategy == "worst") {
@andland
andland / JSM Supervised Dimensionality Reduction Talk.md
Last active Jul 31, 2016
Info related to my 2016 JSM talk on Supervised Dimensionality Reduction for Exponential Family Data.
View JSM Supervised Dimensionality Reduction Talk.md
View privacy policy.md

I will not share anyone's information. I will only use aggregated data with no personal information.

andland / JSM Generalized PCA Talk.md
Last active Jul 4, 2016
Info related to my 2015 JSM talk on Generalized PCA.
View JSM Generalized PCA Talk.md
andland / AmazonBookURLs.csv
Last active Aug 29, 2015
Scrape Amazon's Trade-In Value
View AmazonBookURLs.csv
Title,URL
Data Clustering C++,http://www.amazon.com/Data-Clustering-Object-Oriented-Knowledge-Discovery/dp/1439862230
Transportation Statistics and Microsimulation,http://www.amazon.com/Transportation-Statistics-Microsimulation-Clifford-Spiegelman/dp/1439800235
Fundamentals of Transportation and Traffic Operations,http://www.amazon.com/Fundamentals-Transportation-Traffic-Operations-Daganzo/dp/0080427855
A First Course in Stochastic Processes,http://www.amazon.com/First-Course-Stochastic-Processes-Second/dp/0123985528
A Probability Path,http://www.amazon.com/A-Probability-Path-Sidney-Resnick/dp/081764055X
A Primer on Linear Models,http://www.amazon.com/Primer-Linear-Chapman-Statistical-Science/dp/1420062018
Statistical Approach to Genetic Epidemiology,http://www.amazon.com/Statistical-Approach-Genetic-Epidemiology-Applications/dp/3527323899
Intro Trans Engineering,http://www.amazon.com/Introduction-Transportation-Engineering-Banks-James/dp/0072431881
andland / 00Time_Slicing_README.md
Last active Jan 1, 2017
Time Stack and Time Slicing
View 00Time_Slicing_README.md

See this link for an introduction to time stacking and time slicing.

time_slice.R requires the image width or height in pixels to be a multiple of the number of images in your timelapse.

time_slice_v2.R attempts to get around this restriction, so some images will contribute more pixels per slice than others. It does this by making the first x% of the images cover the first x% of the pixels (with appropriate rounding). It does not handle the case where the number of images is greater than the height or width of the images in pixels. Version 2 will probably work better for you.

For example, if the images are 150 pixels wide and your timelapse has 100 images, time_slice.R will make the first image have a slice which is 51 pixels wide. The remaining 99 images will get slices which are 1 pixel wide. time_slice_v2.R will alternate between 1 pixel per i
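
The proportional allocation described for time_slice_v2.R can be sketched as follows. This is a minimal illustration rather than the gist's actual code: the function name slice_bounds is hypothetical, and rounding down via integer division is an assumption standing in for the "appropriate rounding" mentioned above.

```r
# Hypothetical helper (not taken from time_slice_v2.R): give each of
# n_images a contiguous range of pixel columns so that the first x% of
# the images covers roughly the first x% of the pixels.
slice_bounds <- function(n_pixels, n_images) {
  # Cumulative cut points from 0 to n_pixels; rounding down is an assumption.
  cuts <- (seq(0, n_images) * n_pixels) %/% n_images
  data.frame(
    image = seq_len(n_images),
    start = cuts[-length(cuts)] + 1,  # first pixel column of the slice
    end   = cuts[-1]                  # last pixel column of the slice
  )
}

bounds <- slice_bounds(n_pixels = 150, n_images = 100)
widths <- bounds$end - bounds$start + 1
print(table(widths))  # how many slices have each width
print(sum(widths))    # total number of pixel columns covered
```

With 150 pixels and 100 images, every slice under this scheme is 1 or 2 pixels wide, and together the slices cover all 150 columns.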

andland / global.R
Created Dec 28, 2013
CD102.5 Top Songs by Artist in 2013
View global.R
library(XML)
library(lubridate)
library(sqldf)
library(reshape2)
library(ggplot2)
library(mgcv)
cat("loading old data...\n")
playlist=read.csv("CD101Playlist.csv",stringsAsFactors=FALSE)
colnames(playlist)[3]="Last Played"
andland / global.R
Created Nov 6, 2013
Shiny App plotting the average age of everyone in America with a given name in a particular year. To run, install shiny in R and type: library(shiny); runGist(7329230)
View global.R
library(shiny)
library(ggplot2)
exp.age.df=read.csv("https://dl.dropboxusercontent.com/u/17648661/ExpAgeByNameYear.csv")
age.range=range(exp.age.df$Age)
unique.names=sort(unique(exp.age.df$Name))
unique.names=c("<NONE>",as.character(unique.names))
start.names=c("Andrew","Dylan","Fred","Grace","Lillian","John")
andland / kaggle_amazon_nearest_neighbor.R
Last active Dec 18, 2015
A simple nearest neighbor algorithm for a dataset with categorical variables. This code was written for the Amazon Employee Access challenge on Kaggle.com.
View kaggle_amazon_nearest_neighbor.R
# rm(list=ls())
setwd("Kaggle/Amazon Employee")
train = read.csv("train.csv")
test = read.csv("test.csv")
train$ROLE_TITLE <- NULL # Because the same as ROLE_CODE
test$ROLE_TITLE <- NULL # Because the same as ROLE_CODE
jaccard <- function(vec, matrix) {
  rowSums(as.matrix(sweep(matrix, 2, as.numeric(vec), "==")))
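
The preview above is cut off, but the visible line appears to count, for each row, how many columns equal the query vector. A small self-contained sketch of that idea on toy data follows; the function name match_count, the column names, and the values are all made up for illustration and are not from the Kaggle data or the gist.

```r
# Hypothetical helper: for each row of `mat`, count the columns whose
# value equals the corresponding entry of `vec`.
match_count <- function(vec, mat) {
  rowSums(sweep(as.matrix(mat), 2, as.numeric(vec), "=="))
}

# Toy "training" rows with two categorical codes (made-up values).
toy_train <- data.frame(code_a = c(1, 1, 2), code_b = c(10, 20, 10))
query <- c(1, 10)

scores <- match_count(query, toy_train)
nearest <- which.max(scores)  # index of the row sharing the most categories
print(scores)
print(nearest)
```

Here the first toy row matches the query in both columns, so it is picked as the nearest neighbor; ties, if any, are broken by which.max taking the first maximum.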
andland / logithistplot.R
Last active Oct 20, 2017
Plot the relationship between a continuous and a binary variable, with the distribution of the continuous variable conditional on the binary variable. It includes a logistic and spline fit. You can add more layers to the result using standard ggplot2 syntax.
View logithistplot.R
# inspired by http://schamberlain.github.io/2012/01/logistic-regression-barplot-fig/
logithistplot <- function(data,breaks="Sturges",se=TRUE) {
  require(ggplot2)
  col_names=names(data)
  # get min and max axis values
  min_x <- min(data[,1])
  max_x <- max(data[,1])
  # get bin numbers
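
The preview stops here. As a rough base-R sketch of the two ingredients the description mentions (binning the continuous variable by the binary outcome, and a logistic fit), assuming simulated data and leaving out both the spline fit and the ggplot2 layers, one could write the following. The variable names and data are hypothetical and not taken from logithistplot.R.

```r
# Simulated data (an assumption for illustration only).
set.seed(42)
x <- rnorm(200)                                # continuous variable
y <- rbinom(200, size = 1, prob = plogis(x))   # binary variable

# Bin x with the same default rule the function signature uses ("Sturges").
h <- hist(x, breaks = "Sturges", plot = FALSE)
bins <- cut(x, breaks = h$breaks, include.lowest = TRUE)
counts <- table(bin = bins, y = y)             # bin counts split by outcome

# Logistic fit and predicted probabilities on a grid of x values.
fit <- glm(y ~ x, family = binomial)
grid <- data.frame(x = seq(min(x), max(x), length.out = 50))
grid$p_hat <- predict(fit, newdata = grid, type = "response")

print(counts)
print(head(grid))
```

The counts table holds what the conditional histograms would display, and grid holds the points of the fitted logistic curve that would be drawn over them.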