lm_cv5.R
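data = read.csv("mydata.csv")
n = nrow(data)
nfolds = 5 # 5-fold CV
# Model formula. This is an assumption: the original script never defines `model`.
# The response must be `y` (it is used for the CV value below); adjust the predictors as needed.
model = y ~ .
# Assign each row to one of nfolds roughly equal-sized groups, then shuffle the labels
groups = rep(1:nfolds, length.out = n)
set.seed(2) # make the random fold assignment reproducible
cvgroups = sample(groups, n)
# Out-of-fold predictions, filled in one fold at a time
allpredicted = rep(NA, n)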
for (ii in 1:nfolds) {
  group_ii = (cvgroups == ii) # TRUE for rows held out in this fold
  train_data = data[!group_ii, ]
  test_data = data[group_ii, ]
  # Fit on the other folds, then predict the held-out fold
  model_fit = lm(model, data = train_data)
  predicted = predict(model_fit, newdata = test_data)
  allpredicted[group_ii] = predicted
}
# Compute CV value (mean squared prediction error over all held-out rows)
y = data$y
cv_value = mean((y - allpredicted)^2)
# Print CV value
print(cv_value)