@drsimonj
Last active June 6, 2017 01:46
Snippets of code that demonstrate issues with model and predict functions, serving as a motivator for the twidlr package
## predict for Analysis of Variance (aov) searches for object in global environment
d <- datasets::mtcars
fit <- aov(hp ~ am * cyl, d)
predict(fit)
d <- NULL
predict(fit)
## predict for Principal Components (prcomp) can't recreate new variables defined in formula
fit <- prcomp(~.*., mtcars[1:25, ])
predict(fit, mtcars[26:32,])
## predict for Linear Mixed Effects Model (lmer from lme4 package) doesn't match when data omitted or included as `newdata`
d <- datasets::airquality
fit <- lme4::lmer(Ozone ~ Wind + (Wind | Month), d)
nrow(d)
length(lme4:::predict.merMod(fit))
length(lme4:::predict.merMod(fit, newdata = d))
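## A likely explanation for the mismatch above (an assumption, not stated in the original
## snippet): rows with a missing Ozone value are dropped when the model is fitted, whereas
## `newdata` keeps every row. This base-R sketch counts the rows that are complete for the
## model variables; `aq`, `model_vars` and `n_complete` are illustrative names only.
aq <- datasets::airquality
model_vars <- c("Ozone", "Wind", "Month")
n_complete <- sum(stats::complete.cases(aq[, model_vars]))
n_complete  # rows available for fitting
nrow(aq)    # rows passed as `newdata`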
## predict for Linear Discriminant Analysis (lda from MASS package) searches for object in global environment
d <- iris[c(1:5, 51:55, 101:105),]
fit <- MASS::lda(Species ~ ., d)
predict(fit)$class
d <- d[1:3, ]
predict(fit)$class
rm(d)
predict(fit)$class
## predict for Generalized Additive Models for Location Scale and Shape (gamlss from gamlss package)
## searches for original data in global environment when predicting new data
d <- datasets::mtcars[1:20,]
fit <- gamlss::gamlss(vs ~ hp + wt, data = d, family = gamlss.dist::BI())
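new_d <- d[1:5,]  # store the new data separately so that changing or removing `d` only affects the lookup of the original data
gamlss:::predict.gamlss(fit, newdata = new_d)
d <- d[1:10,]
gamlss:::predict.gamlss(fit, newdata = new_d)
rm(d)
gamlss:::predict.gamlss(fit, newdata = new_d)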
## predict for Generalized Linear Models (glm) omits missing values unless data is explicitly added
d <- datasets::iris
d[1,1] <- NA
a <- predict(stats::glm(Species=="setosa" ~ ., data = d))
b <- predict(stats::glm(Species=="setosa" ~ ., data = d), newdata = d)
length(a)
length(b)
sum(is.na(a))
sum(is.na(b))
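## A possible workaround (a sketch, not part of the original snippets): fitting with
## na.action = na.exclude makes predict() pad the omitted rows with NA, so the result
## has one value per input row even without `newdata`. A simpler formula than the one
## above is assumed here; `d_na`, `fit_excl` and `p_excl` are illustrative names only.
d_na <- datasets::iris
d_na[1, 1] <- NA  # same missing value as above (Sepal.Length in row 1)
fit_excl <- stats::glm(Sepal.Length ~ Sepal.Width, data = d_na, na.action = na.exclude)
p_excl <- predict(fit_excl)
length(p_excl) == nrow(d_na)  # predictions line up with the input rows
sum(is.na(p_excl))            # the row with the missing value is returned as NA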