Skip to content

Instantly share code, notes, and snippets.

@dsparks
Created December 18, 2012 22:31
Show Gist options
  • Star 21 You must be signed in to star a gist
  • Fork 9 You must be signed in to fork a gist
  • Save dsparks/4332698 to your computer and use it in GitHub Desktop.
Save dsparks/4332698 to your computer and use it in GitHub Desktop.
A coefficient plot for multiple models
doInstall <- TRUE
toInstall <- c("ggplot2")
if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")}
lapply(toInstall, library, character.only = TRUE)
ANES <- read.csv("http://www.oberlin.edu/faculty/cdesante/assets/downloads/ANES.csv")
ANES <- ANES[ANES$year == 2008, -c(1, 11, 17)] # Limit to just 2008 respondents,
head(ANES) # remove some non-helpful variables
# Fit several models with the same DV:
model1 <- lm(pid7 ~ ideo7 + female + age + south, data = ANES)
model2 <- lm(pid7 ~ ideo7 + female + age + female:age, data = ANES)
model3 <- lm(pid7 ~ ideo7, data = ANES) # These are just arbitrary examples
# Put model estimates into temporary data.frames:
model1Frame <- data.frame(Variable = rownames(summary(model1)$coef),
Coefficient = summary(model1)$coef[, 1],
SE = summary(model1)$coef[, 2],
modelName = "South Indicator")
model2Frame <- data.frame(Variable = rownames(summary(model2)$coef),
Coefficient = summary(model2)$coef[, 1],
SE = summary(model2)$coef[, 2],
modelName = "Age Interaction")
model3Frame <- data.frame(Variable = rownames(summary(model3)$coef),
Coefficient = summary(model3)$coef[, 1],
SE = summary(model3)$coef[, 2],
modelName = "Univariate")
# Combine these data.frames
allModelFrame <- data.frame(rbind(model1Frame, model2Frame, model3Frame)) # etc.
# Specify the width of your confidence intervals
interval1 <- -qnorm((1-0.9)/2) # 90% multiplier
interval2 <- -qnorm((1-0.95)/2) # 95% multiplier
# Plot
zp1 <- ggplot(allModelFrame, aes(colour = modelName))
zp1 <- zp1 + geom_hline(yintercept = 0, colour = gray(1/2), lty = 2)
zp1 <- zp1 + geom_linerange(aes(x = Variable, ymin = Coefficient - SE*interval1,
ymax = Coefficient + SE*interval1),
lwd = 1, position = position_dodge(width = 1/2))
zp1 <- zp1 + geom_pointrange(aes(x = Variable, y = Coefficient, ymin = Coefficient - SE*interval2,
ymax = Coefficient + SE*interval2),
lwd = 1/2, position = position_dodge(width = 1/2),
shape = 21, fill = "WHITE")
zp1 <- zp1 + coord_flip() + theme_bw()
zp1 <- zp1 + ggtitle("Comparing several models")
print(zp1) # The trick to these is position_dodge().
@gavago
Copy link

gavago commented Jun 13, 2017

A wonderful example. I added position_dodge() to my existing code to make the comparisons I wanted.

I have a question though regarding the order of the colour factor levels...

My data frame coefs:

     Estimate         se     lowerCI     upperCI  dyads            vars
3   0.106478735 0.06169716 -0.01444769  0.22740516    All         Kinship
2  -2.591911491 0.20060472 -2.98509674 -2.19872624    All  Age difference
4  -0.169305922 0.07252870 -0.31146217 -0.02714967    All Rank difference
31  0.007409558 0.10206631 -0.19264041  0.20745952 Female         Kinship
21 -1.944700523 0.25320891 -2.44099000 -1.44841105 Female  Age difference
41 -0.220988389 0.10169709 -0.42031468 -0.02166210 Female Rank difference
32  0.137868642 0.06934290  0.00195656  0.27378072   Male         Kinship
22 -2.880896180 0.24594286 -3.36294418 -2.39884818   Male  Age difference
42 -0.136386916 0.08218800 -0.29747540  0.02470157   Male Rank difference

My code:

g<-ggplot(coefs, aes(vars, Estimate, colour=dyads)) + 
  theme(axis.title.x =element_text(margin=margin(0,20,0,0)))+
  theme(axis.title.x = element_blank())+
  
  geom_hline(yintercept=0, lty=2, lwd=1, colour="grey50") +
  
  geom_errorbar(aes(ymin=lowerCI, ymax=upperCI), 
                lwd=1, width=0, position = position_dodge(width = 0.5)) + 
  labs(x ="")+
  geom_point(size=4, pch=21,position = position_dodge(width = 0.5)) +
  
  theme(panel.background = element_rect(fill = "white"))
g

When I add coord_flip() at the end of this code, the lines are no longer ordered by the levels of the factor coefs$dyads. As you can see in your example also, @dsparks, South Indicator is the first level of the factor modelName, but its lines are below the lines of Age Interaction and Univariate. Do you know how to custom rearrange these lines such that they appear in the same order as the levels of the colour factor?

I realize I can simply omit coord_flip(), save the figure, and rotate it by hand, but rotating the coordinates beforehand is, of course, much more desirable.

Thanks for any insight.

@superdayv
Copy link

superdayv commented Oct 28, 2017

Is it possible to do this with models that were fit using "brms"?

@wraymond
Copy link

This is awesome! Thanks.

@stapial
Copy link

stapial commented Mar 7, 2018

This is awesome, thanks!

Quick question, is there anyway to do the same but have each model in a different column and not stacked? I have 6 binary regression models (they don't have the same dependent variable). It looks messy when I lump them all together. I would prefer to have a column for each model with its respective coefficients in the same plot. Any ideas on how to do this???

Thanks!

@ksorby
Copy link

ksorby commented Mar 12, 2018

@stapial
You can use ggplot2's facet functions to split the models into columns:

zp1<- zp1 + facet_grid(.~modelName)

@stapial
Copy link

stapial commented Mar 16, 2018

perfect! Thank you so much @ksorby!

@pgiro14
Copy link

pgiro14 commented Jul 10, 2021

This is exactly what I needed. Thank you!

@Tanj-ko
Copy link

Tanj-ko commented Jul 8, 2022

Thank you so much for this walk-through! Now, just one question: How can I select only certain variables to depict in the figure? Or at least drop the intercept?

@Tanj-ko
Copy link

Tanj-ko commented Aug 29, 2022

Thank you so much for this walk-through! Now, just one question: How can I select only certain variables to depict in the figure? Or at least drop the intercept?

I found a solution:
allModelFrame <- subset(allModelFrame, allModelFrame$Variable == "variable1" |
allModelFrame$Variable == "variable2")

Also, if you want heteroscedastic standard errors, define them for example as so: coff_model1 <- coeftest(model1, vcov = vcovHC)
and plug it into the code:
model1.Frame <- data.frame(Variable = rownames(summary(model1)$coef),
Coefficient = summary(model1)$coef[, 1],
SE = coff_model1[, 2],
modelName = "model1")

Hope this helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment