Matt Harris mrecos

@mrecos
mrecos / stratifiedCV.r
Last active August 14, 2020 17:50
Stratified K-folds Cross-Validation with Caret
require(caret)
#load some data
data(USArrests)
### Prepare Data (positive observations)
# add a column to be the strata. In this case it is states; it could be sites or other locations
# the original data has 50 rows, so this adds a state label to each block of 10 consecutive observations
USArrests$state <- rep(c("PA","MD","DE","NY","NJ"), each = 10)
# this replaces the existing rownames (states) with a simple numerical index
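
The preview stops before any folds are built. As a minimal sketch, assuming the goal is folds in which every stratum is evenly represented, `caret::createFolds()` can be applied to the strata factor. The seed, `k = 5`, and the object name `folds` are illustrative assumptions, not the gist's own code.

```r
library(caret)

data(USArrests)
# Assumption: five made-up strata labels, 10 consecutive rows each (50 rows total)
USArrests$state <- rep(c("PA", "MD", "DE", "NY", "NJ"), each = 10)

set.seed(42)  # arbitrary seed so the folds are reproducible
# createFolds() samples within each level of a factor, so every fold
# receives a balanced share of each stratum
folds <- createFolds(factor(USArrests$state), k = 5, list = TRUE, returnTrain = FALSE)

# Each list element holds the held-out row indices for one fold
table(USArrests$state[folds[[1]]])
```

With 10 rows per stratum and 5 folds, each fold holds out two rows from every state, so no fold is dominated by a single location.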
@mrecos
mrecos / assignment loop.r
Created April 24, 2020 13:56
basic assignment loop over last name and case load
##### Your code leading up to this point ######
# initialize active cases
SWDetail$active_cases <- 0
SWDetail[1,"active_cases"] <- 1
# initialize most recent cases
SWDetail$last_case_given <- 0
SWDetail[1,"last_case_given"] <- 1
# re-sort the assignments by date
cps_test_assignment <- cps_test_assignment[order(cps_test_assignment$assignment_date),]
@mrecos
mrecos / Game of Zombies.R
Last active April 12, 2020 02:23
John Ramey's (@ramhiser) implementation for Conway's Game of Life, but updated for gganimate package and adding the undead cells
library('foreach')
library('ggplot2')
library('gganimate')
# library('animation') # require for original code
library('reshape2')
library('doParallel') # for multicore support
## Guts and working concept from post below.
## http://johnramey.net/blog/2011/06/05/conways-game-of-life-in-r-with-ggplot2-and-animation/#comments
## I updated ggplot, melt, varnames, gganimate, and faded cells
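
For readers who want the core update rule without the plotting code, here is a rough base-R sketch of one generation of the standard Game of Life. It assumes a 0/1 integer matrix and treats cells beyond the edge as dead; the function name `life_step` is hypothetical, and the gist's "undead" cell state is not modelled here.

```r
# One generation of Conway's Game of Life on a 0/1 integer matrix.
# Assumption: cells beyond the edge count as dead (no wrap-around).
life_step <- function(board) {
  n <- nrow(board)
  m <- ncol(board)
  # Pad with a border of dead cells so every cell has eight neighbours
  padded <- matrix(0L, n + 2, m + 2)
  padded[2:(n + 1), 2:(m + 1)] <- board
  neighbours <- matrix(0L, n, m)
  for (dr in -1:1) {
    for (dc in -1:1) {
      if (dr == 0 && dc == 0) next
      # Add the board shifted by (dr, dc) to the running neighbour count
      neighbours <- neighbours +
        padded[(2:(n + 1)) + dr, (2:(m + 1)) + dc, drop = FALSE]
    }
  }
  # A live cell survives with 2 or 3 neighbours; a dead cell is born with exactly 3
  alive <- (board == 1 & (neighbours == 2 | neighbours == 3)) |
    (board == 0 & neighbours == 3)
  matrix(as.integer(alive), n, m)
}

# A "blinker" alternates between a horizontal and a vertical bar
blinker <- matrix(0L, 5, 5)
blinker[3, 2:4] <- 1L
next_gen <- life_step(blinker)
```

Counting neighbours with shifted copies of a padded matrix avoids an explicit loop over cells, which keeps each generation cheap enough to animate.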
@mrecos
mrecos / intrees.r
Created July 30, 2016 02:59
loop to run simulations of intrees methods across RF, GBM, and rpart algorithms. Code supporting blog post: http://matthewdharris.com/2016/07/30/one-tree-to-rule-them-all-intrees-and-rule-based-learing
library("data.table")
library("rowr")
library("inTrees")
library("dplyr")
library("randomForest")
library("xtable")
library("caret")
library("gbm")
library("rpart")
library("reshape2")
@mrecos
mrecos / tidymodels.R
Last active March 11, 2020 02:13
reproducible Tidymodels workflow example
#Package installs -------------------------------------------------------------
load.fun <- function(x) {
  x <- as.character(x)
  if (isTRUE(x %in% .packages(all.available = TRUE))) {
    # attach the package by name instead of building and evaluating a code string
    require(x, character.only = TRUE)
    print(paste0(x, " : already installed; requiring"))
  } else {
    #update.packages()
    print(paste0(x, " : not installed; installing"))
    install.packages(x)
@mrecos
mrecos / sf point_in_poly.r
Last active December 4, 2019 15:35
A repro example to get data, and aggregate points into polygons over a list with purrr::map and then animate with ggplot
library(corrplot)
library(viridis)
library(stargazer)
library(tidyverse)
library(dplyr)
library(sf)
library(tigris)
library(ggplot2)
library(rgdal)
library(maptools)
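
The aggregation step named in the description is not visible in the preview. One possible way to count points per polygon with `sf` is sketched below; the two unit squares, the three points, the helper `unit_square`, and the column name `n_points` are all made up for illustration, and no coordinate reference system is set.

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Assumption: two adjacent squares stand in for real polygons (e.g. census tracts)
polys <- st_sf(id = c("a", "b"), geometry = st_sfc(unit_square(0), unit_square(1)))

# Assumption: three made-up points, two in square "a" and one in square "b"
pts <- st_sf(geometry = st_sfc(
  st_point(c(0.2, 0.5)),
  st_point(c(0.7, 0.3)),
  st_point(c(1.5, 0.5))
))

# st_intersects() returns, for each polygon, the indices of the points it contains;
# lengths() turns that list into a count per polygon
polys$n_points <- lengths(st_intersects(polys, pts))
```

The same call could be wrapped in `purrr::map()` to repeat the count over a list of point layers, for example one layer per time step.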
@mrecos
mrecos / purrr_example_iris.r
Created December 3, 2019 23:52
Quick example of a nested-data analysis with tidyr::nest() and purrr
library(tidyverse)
g <- glimpse
g(iris)
dat <- iris %>%
nest(data = c(-Species))
dat$data[[1]]
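
Once the data are nested, a typical next step is to map a model over the list-column. The sketch below assumes a simple per-species linear model chosen only for illustration; the names `fits`, `model`, and `slope` are not from the gist.

```r
library(tidyverse)

# Nest every column except Species: one row per species,
# with a list-column `data` holding that species' rows
dat <- iris %>%
  nest(data = c(-Species))

# Assumption: an illustrative linear model fitted separately to each species
fits <- dat %>%
  mutate(
    model = map(data, ~ lm(Sepal.Length ~ Petal.Length, data = .x)),
    slope = map_dbl(model, ~ coef(.x)[["Petal.Length"]])
  )
```

Keeping the models in a list-column next to their data means later steps, such as extracting coefficients or predictions, stay aligned with the grouping variable.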
@mrecos
mrecos / multiclass_confusion_matrix.R
Last active March 6, 2019 05:03
Reproducible example for the ggplot design and approach to making a multiclass ggplot confusion matrix
### Example of ggplot code for multiclass confusion matrix with caret::confusionMatrix and ggplot
### `Example_plot1` is the result of applying `caret::confusionMatrix()` to the outcome ...
### of a model that included a reference class and a predicted class; both as factors
### calling `as.data.frame(Example_plot1$table)` casts the predicted class frequency table from the ...
### `caret::confusionMatrix()` object into a nice long format table of columns `Reference`, `Prediction`, and `Freq`.
### Do this for a bunch of models, and then use `cowplot::plot_grid()` to arrange them.
library(tidyverse)
library(cowplot)
library(caret)
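
The comments describe casting the `caret::confusionMatrix()` table to long format and plotting it. A minimal sketch of that idea follows, assuming simulated three-class labels in place of real model output; the seed, class names, and colour choices are arbitrary, and `confusionMatrix()` may also need the `e1071` package installed.

```r
library(caret)
library(ggplot2)

set.seed(1)  # arbitrary seed
# Assumption: simulated three-class labels stand in for real model output
classes <- c("A", "B", "C")
reference <- factor(sample(classes, 200, replace = TRUE), levels = classes)
predicted <- reference
flip <- sample(200, 50)  # re-draw a quarter of the predictions at random
predicted[flip] <- sample(classes, 50, replace = TRUE)

Example_plot1 <- confusionMatrix(data = predicted, reference = reference)

# Long format: one row per (Prediction, Reference) pair with its count in Freq
cm_long <- as.data.frame(Example_plot1$table)

p <- ggplot(cm_long, aes(x = Reference, y = Prediction, fill = Freq)) +
  geom_tile(color = "white") +
  geom_text(aes(label = Freq)) +
  scale_fill_gradient(low = "white", high = "steelblue") +
  labs(title = "Multiclass confusion matrix")
```

Building one such plot per model and passing them to `cowplot::plot_grid()` gives the side-by-side arrangement the comments mention.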
@mrecos
mrecos / Beach not Beach NN Loop.r
Last active January 11, 2018 18:09
R stats code for building NN and looping over hidden layer node density for animated gif output. NN code attributed to David Selby; http://selbydavid.com/2018/01/09/neural-network/
########################################################################
### Bespoke Neural Network R code attributed to: David Selby
### From blog post: http://selbydavid.com/2018/01/09/neural-network/
### Adapted here for making animated GIF of node density
### output gifs compiled at gifmaker.me for final output
### output tweeted here:
### https://twitter.com/Md_Harris/status/951257342418608128
########################################################################
two_spirals <- function(N = 200,
@mrecos
mrecos / Purrr Grid Search Parallel.R
Last active December 24, 2017 20:46
A bit of code for conducting parallelized random grid-search of randomForest hyperparameters using purrr::map() and futures (for multicore/multisession). This is a bit of a proof-of-concept as there are plenty of ways to iterate over a grid and do CV. Also, especially with randomForest, this is very memory inefficient. However, the approach may …
### ------- Load Packages ---------- ###
library("purrr")
library("future")
library("dplyr")
library("randomForest")
library("rsample")
library("ggplot2")
library("viridis")
### ------- Helper Functions for map() ---------- ###
# breaks CV splits into train (analysis) and test (assessment) sets
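
The helper itself is cut off in the preview. As a hedged sketch of what such a helper might look like, `rsample::analysis()` and `rsample::assessment()` can pull the two data frames out of each split; the function name `split_train_test`, the `mtcars` data, and `v = 5` are assumptions for illustration.

```r
library(rsample)
library(purrr)

set.seed(123)  # arbitrary seed
# Assumption: 5-fold CV on mtcars stands in for the gist's own data
cv_splits <- vfold_cv(mtcars, v = 5)

# Hypothetical helper: return the train and test data frames for one rsplit
split_train_test <- function(split) {
  list(train = analysis(split), test = assessment(split))
}

# One list(train, test) per fold, ready to hand to a model-fitting function
cv_data <- map(cv_splits$splits, split_train_test)
```

Materialising every train and test set up front is simple to map over but duplicates the data once per fold, which matches the memory caveat in the description.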