@jie-nissel
Forked from ramhiser/random-forest.r
Created August 31, 2016 20:08
Plots Variable Importance from Random Forest in R
library(randomForest)
library(dplyr)
library(ggplot2)
set.seed(42)
rf_out <- randomForest(Species ~ ., data=iris)
# Extracts variable importance (Mean Decrease in Gini Index)
# Sorts by variable importance and relevels factors to match ordering
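# Takes variable names from the importance matrix's row names so that names and scores stay paired
imp <- importance(rf_out)
var_importance <- tibble(variable=rownames(imp),
                         importance=as.vector(imp[, "MeanDecreaseGini"]))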
var_importance <- arrange(var_importance, desc(importance))
var_importance$variable <- factor(var_importance$variable, levels=var_importance$variable)
p <- ggplot(var_importance, aes(x=variable, weight=importance, fill=variable))
p <- p + geom_bar() + ggtitle("Variable Importance from Random Forest Fit")
p <- p + xlab("Variable") + ylab("Variable Importance (Mean Decrease in Gini Index)")
p <- p + scale_fill_discrete(name="Variable Name")
p + theme(axis.text.x=element_blank(),
          axis.text.y=element_text(size=12),
          axis.title=element_text(size=16),
          plot.title=element_text(size=18),
          legend.title=element_text(size=16),
          legend.text=element_text(size=12))
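
# Optional cross-check (a sketch, not part of the original gist): randomForest
# ships its own importance plot, varImpPlot(), which should rank the predictors
# in the same order as the ggplot2 bar chart above. type=2 selects the mean
# decrease in node impurity (the Gini index for a classification forest).
# The names gini and gini_ranked are illustrative only.
gini <- importance(rf_out, type=2)
gini_ranked <- rownames(gini)[order(gini[, 1], decreasing=TRUE)]
print(gini_ranked)
varImpPlot(rf_out, type=2, main="Variable Importance (built-in plot)")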