Skip to content

Instantly share code, notes, and snippets.

@LeeMendelowitz
Last active August 29, 2015 14:02
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save LeeMendelowitz/f6da6480eb87cb55c1ad to your computer and use it in GitHub Desktop.
Save LeeMendelowitz/f6da6480eb87cb55c1ad to your computer and use it in GitHub Desktop.
Predict.lm not working for matrix data
# Make a dataset where Y is cubic in X, plus noise:
n <- 1000
x <- runif(n)
x <- sort(x)
eps <- rnorm(n)
beta <- c(1, 2, 3)
X <- poly(x, 3, raw = TRUE)
y <- X %*% beta + eps
# Fit the data
data <- data.frame(X = X, y = y)
# This formula will not work for predicting, predictions will be wrong length.
lm.fit <- lm(y ~ X, data = data)
# This explicit formula works for predictions.
lm.fit2 <- lm(y ~ X.1 + X.2 + X.3, data = data)
# This will also work for predictions, but can handle variable number of columns in the
# predictor matrix:
predictor.names <- grep('X', names(data), value = TRUE)
lm.fit.formula <- sprintf("y ~ %s", paste(predictor.names, collapse = " + "))
lm.fit3 <- lm(lm.fit.formula, data = data)
# Plot the data
#quartz()
plot(x, y)
lines(x, predict(lm.fit), col = "red", lwd = 3)
log.out <- function(msg) {
cat(msg, "\n")
}
make.predict <- function(lm.model, new.x) {
new.X <- poly(new.x, 3, raw = TRUE)
new.data <- data.frame(X = new.X)
new.prediction <- predict(lm.model, newdata = new.data)
if ( length(new.x) != length(new.prediction) ) {
log.out("Warning: length(new.x) != length(new.prediction)\n")
} else {
log.out("Prediction length is okay!\n")
}
}
# This will not work, prediction is wrong length
log.out("Predicting with lm.fit: ")
make.predict(lm.fit, new.x)
# This works, prediction is correct length
log.out("Predicting with lm.fit2: ")
make.predict(lm.fit2, new.x)
log.out("Prediction with lm.fit3: ")
make.predict(lm.fit3, new.x)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment