Workshop date: 10/17/2019
Home directory: 50 MB storage
Scratch: 10 TB (90-day limit)
Can purchase storage; requires a PTAO
# create a blank plot
plot(x = c(10, 20), y = c(0, 40), type = "n")
abline(v = 15, lty = 2)
# plot line segments to represent confidence intervals
# add red lines for CIs that do not capture the true mean of 15
# add dots for means
for(i in 1:40){
  x <- rnorm(30, 15, 2)
  ci <- t.test(x)$conf.int
  color <- if(ci[1] > 15 || ci[2] < 15) "red" else "black"
  segments(x0 = ci[1], y0 = i, x1 = ci[2], y1 = i, col = color)
  points(x = mean(x), y = i, col = color)
}
Simulating responses from a linear model
=========================
Say you fit a model using R's `lm()` function. What you get in return are coefficients that estimate the linear function that gave rise to the data. The assumption is that the response is normally distributed with a mean equal to the linear function and a standard deviation equal to the standard deviation of the residuals. In notation, for a simple linear model with intercept and slope coefficients:
$latex Y_{i} \sim N(\beta_{0} + \beta_{1}x_{i},\sigma)$
The implication of this in the world of statistical computing is that we can simulate responses given values of *x*. R makes this wonderfully easy with its `simulate()` function. Simply pass it the model object and the number of simulations you want, and it returns a data frame of simulated responses, one column per simulation. A quick example:
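A minimal sketch with made-up data (the data and coefficients below are illustrative, not from the original):

```{r}
# fit a simple linear model to made-up data
set.seed(1)
x <- runif(50, 0, 10)
y <- 2 + 3*x + rnorm(50, sd = 1.5)
m <- lm(y ~ x)
# simulate 5 new response vectors from the fitted model
sims <- simulate(m, nsim = 5)
dim(sims)  # 50 rows (one per observation), 5 columns (one per simulation)
```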
```{r}
# function to simulate proportional sum of exponential variables
# p = p1,p2,p3,...,pn sum to one
# r = rates for exponential random variables
sumExp <- function(p, r){
  if(sum(p) != 1) stop("p does not sum to 1")
  if(length(p) != length(r)) stop("p and r not equal lengths")
  x <- rexp(length(r), rate = r)
  crossprod(p, x)
}
```
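A quick sanity check of `sumExp()` (the weights below are chosen so they sum to exactly 1 in floating point):

```{r}
set.seed(1)
sumExp(p = c(0.5, 0.25, 0.25), r = c(1, 2, 3))
```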
# craftcans.com - web site devoted to canned craft beer
# scrape craftcans.com database
library(rvest)
library(magrittr) # for extract()
library(stringr)
URL <- "http://www.craftcans.com/db.php?search=all&sort=beerid&ord=desc&view=text"
page <- read_html(URL)
# table 11 contains the beer data
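A sketch of the likely next step (the `fill` argument and the use of the 11th table are taken from the comment above; the object name `beer` is an assumption):

```{r}
# pull all tables on the page, then keep the 11th, which holds the beer data
tbls <- html_nodes(page, "table")
beer <- html_table(tbls[[11]], fill = TRUE)
```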
# web scrape off-grounds housing
# for Jeff Boichuk
# 2018-09-12
# https://offgroundshousing.student.virginia.edu/
library(tidyverse)
library(rvest)
library(pbapply)
library(stringr)
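The scraping code itself is missing here; a minimal first step, assuming the listings live at the URL given in the header comment:

```{r}
URL <- "https://offgroundshousing.student.virginia.edu/"
pg <- read_html(URL)
```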
---
title: "Shapiro Test and Levene Test Simulations"
author: "Clay Ford"
date: "September 25, 2018"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
library(car)
library(MASS)
# generate data with slight non-constant variance
x1 <- gl(n = 3, k = 400, labels = c("A","B","C"))
x2 <- gl(n = 2, k = 600, labels = c("1","2"))
set.seed(1)
# NOTE: the error term below is an assumption; the original line was cut off.
# The sd increases slightly across the levels of x1.
y <- 1 + 1.2*(x1 == "B") + 1.3*(x1 == "C") - 0.5*(x2 == "2") +
  rnorm(n = 1200, mean = 0, sd = rep(c(1, 1.1, 1.2), each = 400))
```
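Given the document's title and the `car` package loaded above, a plausible next step is the two tests themselves (a sketch; it assumes the simulated `y`, `x1`, and `x2` from the chunk above):

```{r}
# the two tests named in the title
m <- lm(y ~ x1 + x2)
shapiro.test(residuals(m))  # test residuals for normality
leveneTest(y ~ x1)          # test for constant variance across x1
```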
library(tidyverse)
d1 <- tibble(name = c("Clay", "Laura"),
             score_1 = c(88, 99),
             score_2 = c(77, 88),
             score_3 = c(55, 66),
             survey_1 = c(4, 5),
             survey_2 = c(3, 3),
             survey_3 = c(2, 5))
d1
# A tibble: 2 x 7
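The wide layout above (three `score_` and three `survey_` columns) suggests a reshape; a sketch with `pivot_longer()` (the `names_to`/`names_sep` split is an assumption about the intended result):

```{r}
d1_long <- pivot_longer(d1, cols = -name,
                        names_to = c("item", "time"),
                        names_sep = "_")
d1_long
```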
# Clay Ford
# 2020-06-22
# Simulate data from a proportional odds model with proportional odds assumption
# satisfied.
# 300 observations and a grouping variable (example: democratic/republican)
n <- 300
set.seed(1)
grp <- sample(0:1, size = n, replace = TRUE)
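A sketch of how the simulation might continue, using the latent-variable formulation (the group effect of 0.7 and the cutpoints are assumed values; logistic errors give the proportional-odds structure):

```{r}
# latent continuous response: group effect plus logistic noise
z <- 0.7*grp + rlogis(n)
# cut into ordered categories at assumed cutpoints
y <- cut(z, breaks = c(-Inf, -0.5, 1, Inf),
         labels = c("low", "medium", "high"), ordered_result = TRUE)
table(y, grp)
```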