@rinze
Last active August 29, 2015 14:25
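Just a little reminder: be careful with leave-one-out cross-validation on a perfectly balanced problem. Holding out one sample leaves its class in the minority of the training set, so a model with no useful predictors always leans toward the wrong class.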
library(C50)
library(caret)  # provides createFolds()
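# Test data: two perfectly balanced classes and a constant, uninformative predictor
# (factor() is explicit because R >= 4.0 no longer converts strings to factors)
group <- factor(c(rep('groupA', 10), rep('groupB', 10)))
data <- data.frame(group = group, var = rep(0, 20))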
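# Leave-one-out: each training set holds 9 rows of the held-out class
# and 10 of the other, so the majority class is never the held-out one
probs <- lapply(seq_len(nrow(data)), function(i) {
    train <- data[-i, ]
    test <- data[i, ]
    model <- C5.0(group ~ ., train)
    return(predict(model, test, type = "prob"))
})
probs <- do.call(rbind, probs)
# Oops: the probabilities lean toward the class each sample does NOT belong to
cat("Wrong!\n")
print(probs)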
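# 5-fold CV: createFolds() stratifies by class, so every training set stays balanced
set.seed(1)  # arbitrary seed so the fold assignment is reproducible
kfolds <- createFolds(data$group, 5)
probs <- lapply(kfolds, function(i) {
    train <- data[-i, ]
    test <- data[i, ]
    model <- C5.0(group ~ ., train)
    return(predict(model, test, type = "prob"))
})
probs <- do.call(rbind, probs)
# Restore the original row order
probs <- probs[order(as.numeric(rownames(probs))), ]
# Good: no class is systematically favored
cat("Yes!\n")
print(probs)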
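
The reason leave-one-out fails here can be checked without fitting any model. The following is a minimal base-R sketch, not part of the original gist: it rebuilds the same balanced labels and, for each held-out row, looks up the majority class among the remaining 19 rows. The names `majority` and `loo_check` are made up for this illustration.

```r
# Same balanced labels as in the script above
group <- factor(c(rep("groupA", 10), rep("groupB", 10)))

# For each held-out row, find the majority class of the remaining 19 rows
majority <- sapply(seq_along(group), function(i) {
    counts <- table(group[-i])
    names(counts)[which.max(counts)]
})

# Compare the held-out label with the training-set majority, row by row
loo_check <- data.frame(held_out = group, train_majority = majority)
print(loo_check)

# Share of rows where the training majority matches the held-out label
print(mean(as.character(loo_check$held_out) == loo_check$train_majority))
```

Removing one row always leaves a 9 versus 10 split against the held-out class, so the last line prints 0. Any learner that falls back on the training majority when the predictors carry no signal will therefore be wrong on every fold.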