@cbrown5
Created November 6, 2020 11:22
# Why you should plot data before doing statistical tests
# CJ Brown 2020-11-06
# More at www.conservationhackers.org
library(ggplot2)
#
# Make some data
#
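set.seed(42) # Fix the random seed so the results are reproducible
n <- 50 # Sample size per group
x <- 1:n
err_sd <- 1 # SD of the errors (avoids reusing the name of stats::sd)
# Two mirror-image quadratic curves in x, each with normal errors
y1 <- x - 0.02*x^2 - 8 + rnorm(n, sd = err_sd)
y2 <- -1*x + 0.02*x^2 + 8 + rnorm(n, sd = err_sd)
dat <- data.frame(x = c(x, x), y = c(y1, y2),
                  grp = rep(c("y1", "y2"), each = n))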
# Do a t-test first - the wrong way
t.test(y1, y2)
# Shows no 'significant' difference between the means of y1 and y2,
# because the test compares the group means only and ignores x
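# Why the t-test finds nothing: a quick check, using the noise-free curves
# from the simulation above. The names x0, mean_curve1 and mean_curve2 are
# introduced here for illustration only.
x0 <- 1:50
mean_curve1 <- mean(x0 - 0.02*x0^2 - 8)  # 0.33
mean_curve2 <- mean(-x0 + 0.02*x0^2 + 8) # -0.33
# The two curves average out to almost the same value, so a comparison of
# means cannot see that their shapes are opposite
mean_curve1 - mean_curve2                # 0.66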
head(dat)
# Now plot the data
ggplot(dat) +
  aes(x = x, y = y, color = grp) +
  geom_point() +
  stat_smooth()
# Clearly y1 and y2 are different: they follow opposite curved trends in x,
# even though their overall means are similar
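# One possible follow-up once the plot has shown the pattern: test whether
# the trend in x differs between groups. This is a minimal sketch, not the
# only valid analysis. compare_trends() is a hypothetical helper, and the
# quadratic form is an assumption taken from how the data were simulated.
compare_trends <- function(d) {
  m_common <- lm(y ~ poly(x, 2), data = d)       # One curve shared by both groups
  m_by_grp <- lm(y ~ poly(x, 2) * grp, data = d) # A separate curve per group
  anova(m_common, m_by_grp)                      # F-test for the group terms
}
# Run it on the simulated data if it is available in this session
if (exists("dat")) print(compare_trends(dat))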