@tobigithub
Created September 24, 2015 17:37
# reproducible caret models
# http://stackoverflow.com/questions/13403427/fully-reproducible-parallel-models-using-caret
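library(doParallel)
library(caret)

# Create the list of seeds that trainControl() hands to each resampling
# iteration, using a different set of seeds for each resample.
set.seed(123)
# Length is (number of resamples) + 1, i.e. (n_repeats * n_folds) + 1; here 10 + 1.
seeds <- vector(mode = "list", length = 11)
# One seed per candidate model in each resample. 3 is the number of tuning
# parameter values tried (mtry for rf), here equal to ncol(iris) - 2.
for (i in 1:10) seeds[[i]] <- sample.int(n = 1000, 3)
# A single seed for the last (final) model.
seeds[[11]] <- sample.int(1000, 1)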
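
# The seed list above can be generalised to other resampling setups. What
# follows is a minimal sketch, not part of the original gist: make_seeds() and
# its arguments are hypothetical names, and it assumes the same layout as above
# (n_models seeds per resample, plus one seed for the final model).
make_seeds <- function(n_resamples, n_models, seed = 123, max_seed = 1000L) {
  set.seed(seed)                                   # makes the seed list itself reproducible
  out <- vector(mode = "list", length = n_resamples + 1L)
  for (i in seq_len(n_resamples)) {
    out[[i]] <- sample.int(max_seed, n_models)     # one seed per candidate model
  }
  out[[n_resamples + 1L]] <- sample.int(max_seed, 1L)  # seed for the final model
  out
}
# Example (left commented out so it does not disturb the RNG state of this script):
# seeds_rcv <- make_seeds(n_resamples = 5 * 10, n_models = 3)  # 10-fold CV, 5 repeats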
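# Control list. `index` expects the rows used for TRAINING in each resample,
# so ask createFolds() for training indices (its default returns the held-out rows).
myControl <- trainControl(method = 'cv', seeds = seeds,
                          index = createFolds(iris$Species, k = 10, returnTrain = TRUE))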
# Fit the same model twice in parallel
cl <- makeCluster(detectCores())
registerDoParallel(cl)
model1 <- train(Species~., iris, method='rf', trControl=myControl)
model2 <- train(Species~., iris, method='rf', trControl=myControl)
stopCluster(cl)
# Compare the predicted class probabilities of the two fits
all.equal(predict(model1, type='prob'), predict(model2, type='prob'))
# [1] TRUE