@florianhartig
Last active September 5, 2020 09:37
This example shows how AIC selection, followed by a conventional regression analysis of the selected model, massively inflates false positives
# This example shows how AIC selection, followed by a conventional regression analysis of the selected model, massively inflates false positives. CC BY-NC-SA 4.0 Florian Hartig
set.seed(1)
library(MASS)
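# 200 observations of 100 uniform random predictors
dat = data.frame(matrix(runif(20000), ncol = 100))
# the response is pure noise, generated independently of every predictor
dat$y = rnorm(200)
# full model: regress y on all 100 predictors
fullModel = lm(y ~ . , data = dat)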
summary(fullModel)
# 2 of the 100 predictors are significant at p < 0.05 (under the null, we expect 5 of 100 on average)
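# Sketch, not part of the original gist: rather than reading the count off the
# printed summary, the p-values can be pulled from the coefficient table.
# The names pFull and nSigFull are introduced here for illustration only.
pFull = summary(fullModel)$coefficients[-1, "Pr(>|t|)"]  # drop the intercept row
nSigFull = sum(pFull < 0.05)                             # predictors with p < 0.05
nSigFull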
# stepwise selection by AIC; with no scope argument, stepAIC searches backward from the full model
selection = stepAIC(fullModel)
summary(selection)
# voilà: 15 of the 28 retained predictors (down from 100) are significant, although y is pure noise. Looks like we could have good fun discussing / publishing these results!
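# Sketch, not part of the original gist: the same count for the selected model,
# taken from its coefficient table. The names pSel, nKept and nSigSel are
# introduced here for illustration only.
pSel = summary(selection)$coefficients[-1, "Pr(>|t|)"]  # drop the intercept row
nKept = length(pSel)                                    # predictors retained by stepAIC
nSigSel = sum(pSel < 0.05)                              # retained predictors with p < 0.05
c(kept = nKept, significant = nSigSel, share = nSigSel / nKept)
# These p-values ignore the selection step, so the share of "significant"
# predictors is far above the nominal 5% even though no predictor has any effect.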