@briandk
Created September 3, 2015 14:44
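When a set of data produces a well-fitting line with a nearly zero slope, R^2 approaches zero. Why? The line clearly fits well. We think the answer is that R^2 measures how much of the variation in y the fitted line explains beyond a horizontal line at the mean of y, which is to say it assesses the usefulness of the slope-dependent term in the linear model. A nearly horizontal line doesn't need a slope-dependent term, so that term explains almost none of the variation and R^2 lands near zero even though the residuals are small.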
library(ggplot2)
library(magrittr)
set.seed(1001)
time_spent_lecturing <- c(
0.09
, 0.14
, 0.21
, 0.76
, 0.82
)
enrollment <- c(
183
, 111
, 149
, 146
, 146
)
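# Build the data frame from the vectors defined above; the column names
# must match the names used in aes() below.
jt_data <- data.frame(time_spent_lecturing, enrollment)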
p <- ggplot(
aes(
x = time_spent_lecturing,
y = enrollment)
, data = jt_data
)
p <- p + geom_point()
p <- p + geom_smooth(method = "lm")
print(p)
fake_y <- rnorm(n = 1000, mean = 1000, sd = 0.1)
fake_x <- 1:1000
p <- ggplot(aes(x = fake_x, y = fake_y), data = data.frame(fake_x, fake_y)) + geom_point() + geom_smooth(method = "lm")
print(p)
lm(fake_y ~ fake_x) %>% summary()
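
One way to check the explanation above is to compute R^2 by hand and compare the sloped fit against an intercept-only fit. The following is a minimal sketch, not part of the original gist: the variable names (`fit_slope`, `fit_flat`, `ss_res`, `ss_tot`, and so on) are made up for illustration, and the data simply mirrors the `fake_x` / `fake_y` example under the same seed.

```r
# Illustrative data: constant mean plus small noise, so the true slope is zero.
set.seed(1001)
x <- 1:1000
y <- rnorm(n = 1000, mean = 1000, sd = 0.1)

fit_slope <- lm(y ~ x)  # intercept + slope
fit_flat  <- lm(y ~ 1)  # intercept only: a horizontal line at mean(y)

# R^2 = 1 - (variation left over after the fit) / (variation around the mean)
ss_res <- sum(residuals(fit_slope)^2)
ss_tot <- sum((y - mean(y))^2)
r_squared_manual <- 1 - ss_res / ss_tot
r_squared_lm     <- summary(fit_slope)$r.squared

# Residual standard error of each model: a measure of how tightly the
# points sit around the fitted line, in the units of y.
sigma_slope <- summary(fit_slope)$sigma
sigma_flat  <- summary(fit_flat)$sigma

print(c(manual = r_squared_manual, lm = r_squared_lm))
print(c(with_slope = sigma_slope, intercept_only = sigma_flat))
```

If the reasoning holds, the two R^2 values should agree and be close to zero, while the two residual standard errors should be nearly identical and close to the noise level of 0.1. In other words, the line fits well, but it fits no better than the mean alone, and that comparison is what R^2 reports.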