@tobigithub
Last active September 23, 2015 03:14
# things that affect the speed of random forests
# http://stackoverflow.com/questions/14106010/parallel-execution-of-random-forest-in-r/15771458#15771458
# also
# http://stackoverflow.com/questions/23075506/how-to-improve-randomforest-performance?lq=1
# also
#
Setting .multicombine to TRUE can make a significant difference:
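````R
library(randomForest)  # provides randomForest() and combine()
library(doParallel)    # also attaches foreach
registerDoParallel()   # without a registered backend, %dopar% runs sequentially
# x (predictors) and y (response) are assumed to be defined already
rf <- foreach(ntree=rep(25000, 6), .combine=combine, .multicombine=TRUE,
              .packages='randomForest') %dopar% {
  randomForest(x, y, ntree=ntree)
}
````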
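With `.multicombine=TRUE`, foreach passes all six forests to `combine` in a single call instead of merging them pairwise, so `combine` is called once rather than five times. On my desktop machine, this runs in 8 seconds rather than 19 seconds.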
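The call count can be checked without training any forests. The following is a minimal sketch, not part of the original answer: `count_combine_calls` and `counting_c` are hypothetical helpers that wrap `c()` and count how often foreach invokes the combine function. It uses `%do%`, so no parallel backend is needed; the combine step runs in the calling process either way.

```r
library(foreach)

# Hypothetical helper: run a six-task foreach loop and count how many
# times the combine function is invoked.
count_combine_calls <- function(multicombine) {
  calls <- 0
  counting_c <- function(...) {
    calls <<- calls + 1  # updates the counter in the enclosing function
    c(...)
  }
  res <- foreach(i = 1:6, .combine = counting_c,
                 .multicombine = multicombine) %do% {
    i
  }
  list(result = res, calls = calls)
}

pairwise <- count_combine_calls(FALSE)  # combine receives two arguments at a time
single   <- count_combine_calls(TRUE)   # combine receives all results at once

cat("calls with .multicombine=FALSE:", pairwise$calls, "\n")
cat("calls with .multicombine=TRUE: ", single$calls, "\n")
```

With six tasks this should report five calls versus one, which matches the counts given above. The time saved depends on how expensive the combine function is; merging large forests with `combine` is costly, whereas `c()` here is nearly free.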