Workshop date: 10/17/2019
Home directory: 50 MB storage
Scratch: 10 TB (90-day limit)
Can purchase storage; requires a PTAO
# create a blank plot
plot(x = c(10, 20), y = c(0, 40), type = "n")
abline(v = 15, lty = 2)
# plot line segments to represent confidence intervals
# add red lines for CIs that do not capture the true mean of 15
# add dots for means
for(i in 1:40){
  x <- rnorm(30, 15, 2)
  ci <- t.test(x)$conf.int
  color <- if(ci[1] > 15 || ci[2] < 15) "red" else "black"
  segments(x0 = ci[1], y0 = i, x1 = ci[2], y1 = i, col = color)
  points(x = mean(x), y = i, col = color)
}
Simulating responses from a linear model
=========================
Say you fit a model using R's `lm()` function. What you get in return are coefficients that estimate the linear function that gave rise to the data. The assumption is that the response is normally distributed with a mean equal to the linear function and a standard deviation equal to the standard deviation of the residuals. In notation, for a simple linear model with intercept and slope coefficients:
$latex Y_{i} \sim N(\beta_{0} + \beta_{1}x_{i},\sigma)$
The implication of this in the world of statistical computing is that we can simulate responses given values of *x*. R makes this wonderfully easy with its `simulate()` function. Simply pass it the model object and the number of simulations you want, and it returns a data frame of simulated responses, one column per simulation. A quick example:
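A minimal sketch with made-up data (the data and coefficients below are illustrative, not from the original):

```{r}
# fit a simple linear model to made-up data
set.seed(1)
x <- runif(50, 0, 10)
y <- 2 + 3*x + rnorm(50, sd = 1.5)
m <- lm(y ~ x)
# simulate 5 new response vectors from the fitted model
sims <- simulate(m, nsim = 5)
dim(sims)  # 50 rows (one per observation), 5 columns (one per simulation)
```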
```{r}
# function to simulate proportional sum of exponential variables
# p = p1,p2,p3,...,pn sum to one
# r = rates for exponential random variables
sumExp <- function(p, r){
  if(sum(p) != 1) stop("p does not sum to 1")
  if(length(p) != length(r)) stop("p and r not equal lengths")
  x <- rexp(length(r), rate = r)
  crossprod(p, x)
}
```
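A quick sanity check of `sumExp()` (the weights below are chosen so they sum to exactly 1 in floating point):

```{r}
set.seed(1)
sumExp(p = c(0.5, 0.25, 0.25), r = c(1, 2, 3))
```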
# craftcans.com - web site devoted to canned craft beer
# scrape craftcans.com database
library(rvest)
library(magrittr) # for extract()
library(stringr)
URL <- "http://www.craftcans.com/db.php?search=all&sort=beerid&ord=desc&view=text"
page <- read_html(URL)
# table 11 contains the beer data
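A sketch of the likely next step (the `fill` argument and the use of the 11th table are taken from the comment above; the object name `beer` is an assumption):

```{r}
# pull all tables on the page, then keep the 11th, which holds the beer data
tbls <- html_nodes(page, "table")
beer <- html_table(tbls[[11]], fill = TRUE)
```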
# web scrape off-grounds housing
# for Jeff Boichuk
# 2018-09-12
# https://offgroundshousing.student.virginia.edu/
library(tidyverse)
library(rvest)
library(pbapply)
library(stringr)
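The scraping code itself is missing here; a minimal first step, assuming the listings live at the URL given in the header comment:

```{r}
URL <- "https://offgroundshousing.student.virginia.edu/"
pg <- read_html(URL)
```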
---
title: "Shapiro Test and Levene Test Simulations"
author: "Clay Ford"
date: "September 25, 2018"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
library(car)
library(MASS)
# generate data with slight non-constant variance
x1 <- gl(n = 3, k = 400, labels = c("A","B","C"))
x2 <- gl(n = 2, k = 600, labels = c("1","2"))
set.seed(1)
# NOTE: the error term below is an assumption; the original line was cut off.
# The sd increases slightly across the levels of x1.
y <- 1 + 1.2*(x1 == "B") + 1.3*(x1 == "C") - 0.5*(x2 == "2") +
  rnorm(n = 1200, mean = 0, sd = rep(c(1, 1.1, 1.2), each = 400))
```
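Given the document's title and the `car` package loaded above, a plausible next step is the two tests themselves (a sketch; it assumes the simulated `y`, `x1`, and `x2` from the chunk above):

```{r}
# the two tests named in the title
m <- lm(y ~ x1 + x2)
shapiro.test(residuals(m))  # test residuals for normality
leveneTest(y ~ x1)          # test for constant variance across x1
```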
library(tidyverse)
d1 <- tibble(name = c("Clay", "Laura"),
             score_1 = c(88, 99),
             score_2 = c(77, 88),
             score_3 = c(55, 66),
             survey_1 = c(4, 5),
             survey_2 = c(3, 3),
             survey_3 = c(2, 5))
d1
# A tibble: 2 x 7
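The wide layout above (three `score_` and three `survey_` columns) suggests a reshape; a sketch with `pivot_longer()` (the `names_to`/`names_sep` split is an assumption about the intended result):

```{r}
d1_long <- pivot_longer(d1, cols = -name,
                        names_to = c("item", "time"),
                        names_sep = "_")
d1_long
```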
# Clay Ford
# 2020-06-22
# Simulate data from a proportional odds model with proportional odds assumption
# satisfied.
# 300 observations and a grouping variable (example: democratic/republican)
n <- 300
set.seed(1)
grp <- sample(0:1, size = n, replace = TRUE)
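A sketch of how the simulation might continue, using the latent-variable formulation (the group effect of 0.7 and the cutpoints are assumed values; logistic errors give the proportional-odds structure):

```{r}
# latent continuous response: group effect plus logistic noise
z <- 0.7*grp + rlogis(n)
# cut into ordered categories at assumed cutpoints
y <- cut(z, breaks = c(-Inf, -0.5, 1, Inf),
         labels = c("low", "medium", "high"), ordered_result = TRUE)
table(y, grp)
```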