Clay Ford clayford

@clayford
clayford / plot_ci.r
Last active December 30, 2015 14:29
Plotting confidence intervals to see which contain true mean.
# create a blank plot
plot(x=c(10,20),y=c(0,40), type="n")
abline(v=15,lty=2)
# plot line segments to represent confidence intervals
# add red lines for CI's that do not capture true mean of 15
# add dots for means
for(i in 1:40){
x <- rnorm(30,15,2)
clayford / simulate_resp_1.Rmd
Last active January 2, 2016 22:49
first code block
Simulating responses from a linear model
=========================
Say you fit a model using R's `lm()` function. What you get in return are coefficients that estimate the linear function that gave rise to the data. The model assumes the response is normally distributed, with a mean equal to the linear function and a standard deviation estimated by the standard deviation of the residuals (the residual standard error). For a simple linear model with intercept and slope coefficients, we write this as:
$latex Y_{i} \sim N(\beta_{0} + \beta_{1}x_{i},\sigma)$
The implication of this in the world of statistical computing is that we can simulate responses given values of *x*. R makes this wonderfully easy with its `simulate()` function. Simply pass it the model object and the number of simulations you want, and it returns a data frame of simulated responses, one column per simulation. A quick example:
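As a minimal sketch of the idea (not the author's original chunk), the code below fits a simple linear model to made-up data and simulates responses from it. The predictor values, the "true" coefficients, the noise level, and the seed are all assumptions chosen for illustration.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
x <- seq(1, 10, length.out = 50)          # hypothetical predictor values
y <- 3 + 2 * x + rnorm(50, sd = 1.5)      # hypothetical data: beta0 = 3, beta1 = 2, sigma = 1.5
mod <- lm(y ~ x)

# simulate() on an lm object returns a data frame:
# one row per observation, one column per simulation
sims <- simulate(mod, nsim = 5)
dim(sims)
head(sims)

# The same idea "by hand": draw from a normal distribution centered on the
# fitted values, with standard deviation equal to the residual standard error
manual <- rnorm(length(x), mean = fitted(mod), sd = sigma(mod))
```

The hand-rolled `rnorm()` call mirrors the distributional statement above: the fitted values play the role of the mean and `sigma(mod)` plays the role of the standard deviation.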
```{r}
clayford / sumExp.R
Last active August 29, 2015 14:10
function to simulate proportional sum of exponential variables
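# function to simulate proportional sum of exponential variables
# p = weights p1,p2,p3,...,pn that sum to one
# r = rates for exponential random variables
sumExp <- function(p,r){
  # compare with a tolerance; an exact sum(p) != 1 test can fail from floating-point rounding
  if(!isTRUE(all.equal(sum(p), 1))) stop("p does not sum to 1")
  if(length(p) != length(r)) stop("p and r not equal lengths")
  # one draw from each exponential distribution
  x <- rexp(length(r),rate = r)
  # weighted sum of the draws, returned as a 1 x 1 matrix
  crossprod(p,x)
}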
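A short usage sketch may help show what the function produces. The block below restates the function under the hypothetical name `sum_exp` so it runs on its own (returning a plain number via `sum(p * x)` rather than a 1 x 1 matrix), then compares the simulated mean with the theoretical one. The weights, rates, seed, and number of draws are assumptions for illustration.

```r
# Stand-alone restatement of the function above (hypothetical name sum_exp)
sum_exp <- function(p, r) {
  stopifnot(isTRUE(all.equal(sum(p), 1)), length(p) == length(r))
  x <- rexp(length(r), rate = r)   # one draw per exponential variable
  sum(p * x)                       # weighted sum as a plain numeric value
}

set.seed(42)                       # arbitrary seed, for reproducibility only
p <- c(0.5, 0.3, 0.2)              # hypothetical weights summing to one
r <- c(1, 2, 4)                    # hypothetical rates

draws <- replicate(10000, sum_exp(p, r))

mean(draws)                        # simulated mean
sum(p / r)                         # theoretical mean, since E[X_i] = 1 / r_i
```

With many draws the two means should agree closely, which is a quick sanity check that the weights and rates are paired as intended.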
clayford / scrape_craftcans.R
Created May 25, 2017 15:49
scrape craftcans.com database
# craftcans.com - web site devoted to canned craft beer
# scrape craftcans.com database
library(rvest)
library(magrittr) # for extract()
library(stringr)
URL <- "http://www.craftcans.com/db.php?search=all&sort=beerid&ord=desc&view=text"
page <- read_html(URL)
# table 11 contains the beer
clayford / webscrape_offgrounds_housing.R
Last active September 12, 2018 20:11
R script to scrape housing information from University of Virginia off-Grounds Housing Service web site
# web scrape off-grounds housing
# for Jeff Boichuk
# 2018-09-12
# https://offgroundshousing.student.virginia.edu/
library(tidyverse)
library(rvest)
library(stringr)
library(pbapply)
clayford / shapiro_levene_test_simulations.Rmd
Created September 26, 2018 18:31
Shapiro Test and Levene Test Simulations
---
title: "Shapiro Test and Levene Test Simulations"
author: "Clay Ford"
date: "September 25, 2018"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
clayford / rlm_with_bootstrap.R
Created September 27, 2018 15:52
Robust linear regression with bootstrapped standard errors
library(car)
library(MASS)
# generate data with slight non-constant variance
x1 <- gl(n = 3, k = 400, labels = c("A","B","C"))
x2 <- gl(n = 2, k = 600, labels = c("1","2"))
set.seed(1)
y <- 1 + 1.2*(x1 == "B") + 1.3*(x1 == "C") -0.5*(x2 == "2") +
clayford / Moving_R_programs_to_Rivanna.md
Last active October 18, 2019 15:40
Notes from Moving R programs to Rivanna workshop, 10/17/19

Moving R programs to Rivanna notes

Workshop date: 10/17/2019

Accessing Rivanna

Home directory: 50 Mb storage
Scratch: 10 TB (90 day limit)
Can purchase storage; requires a PTAO

clayford / pivot_longer_example.R
Last active March 6, 2020 19:55
pivot_longer versus gather
library(tidyverse)
d1 <- tibble(name = c("Clay", "Laura"),
score_1 = c(88, 99),
score_2 = c(77, 88),
score_3 = c(55, 66),
survey_1 = c(4, 5),
survey_2 = c(3, 3),
survey_3 = c(2, 5))
d1
# A tibble: 2 x 7
clayford / simulate_proportional_odds_regression.R
Created June 25, 2020 14:54
Simulate data from a POLR model with proportional odds assumption satisfied and a POLR model without the assumption satisfied to demonstrate how the comparison to a multinomial logit model can provide some evidence of the proportional odds assumption or lack thereof.
# Clay Ford
# 2020-06-22
# Simulate data from a proportional odds model with proportional odds assumption
# satisfied.
# 300 observations and a grouping variable (example: democratic/republican)
n <- 300
set.seed(1)
grp <- sample(0:1, size = n, replace = TRUE)
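The preview above stops after the grouping variable is created. The following is a rough, base-R sketch of one common way to generate an ordinal response that satisfies the proportional odds assumption; it is not the remainder of the original script. The effect size `beta`, the cutpoints `zeta`, the category labels, and the helper `cum_logit` are all hypothetical.

```r
n <- 300
set.seed(1)
grp <- sample(0:1, size = n, replace = TRUE)

beta <- 1.5                 # hypothetical group effect on the log-odds scale
zeta <- c(-0.5, 1)          # hypothetical cutpoints between three categories

# Latent-variable formulation: y* = beta * grp + logistic noise, cut at zeta.
# This implies logit P(Y <= j) = zeta[j] - beta * grp, so the group effect is
# the same at every cutpoint, which is the proportional odds assumption.
ystar <- beta * grp + rlogis(n)
y <- cut(ystar, breaks = c(-Inf, zeta, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)

table(grp, y)

# Empirical cumulative log-odds by group (hypothetical helper). The
# between-group differences at the two cutpoints should be roughly equal,
# both near -beta, up to sampling noise.
cum_logit <- function(y, level) qlogis(mean(as.integer(y) <= level))
diffs <- sapply(1:2, function(j) {
  cum_logit(y[grp == 1], j) - cum_logit(y[grp == 0], j)
})
diffs
```

To break the assumption in a comparison data set, one option would be to let the group effect differ by cutpoint, so the two differences above no longer agree.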