Skip to content

Instantly share code, notes, and snippets.

@noamross
Created May 31, 2023 00:18
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save noamross/e58e63389e54e24919b20184d574e0a6 to your computer and use it in GitHub Desktop.
Save noamross/e58e63389e54e24919b20184d574e0a6 to your computer and use it in GitHub Desktop.
Example code for an ordered categorical model
library(mgcv)
library(ggplot2)
library(dplyr)
## Simulate data with two categories and outcomes in 1:10
## Outcomes must be positive nonzero integers, transform if necessary
set.seed(3);
dat$Group = as.factor(paste("Group", rep(1:2, each = 200)))
dat$y = as.integer(round(pmax(pmin(rnorm(400, ifelse(dat$Group == "Group 1", 4, 6), 3), 10), 1)))
# Plot the data
ggplot(dat, aes(x=as.numeric(y), fill=Group)) +
geom_histogram(bins = 10, col="grey10") +
facet_wrap(~Group, nrow=2) +
theme(legend.position = "none") +
scale_x_continuous(breaks = 1:10)
## fit ordered categorical (ocat) model to data by group, give number of categories
model <- gam(y ~ Group,family=ocat(R=10),data=dat)
plot(model,pages=1, all.terms = TRUE) # Show intercepts of two groups, in latent terms
# Look at model summary. See that the effect of group is significant
summary(model)
model$family$getTheta(trans = TRUE) ## the estimated cut points, transforming latent predictions to integer outcomes
## predict probabilities of being in each category
preds <- predict(model, newdata = data.frame(Group = as.factor(c("Group 1", "Group 2"))), type="response")
preds
pred_dat <- tibble(
Group = as.factor(paste("Group", rep(1:2, each = 10))),
x = rep(1:10, 2),
y = c(preds[1,], preds[2,])*200
)
# Plot data and model predictions
ggplot(dat, aes(x=as.numeric(y), fill=Group)) +
geom_histogram(bins = 10, col="grey10") +
geom_bar(data = pred_dat, mapping = aes(x=x, y = y), col = "yellow", fill = NA, linetype = 2, linewidth = 1, stat = "identity") +
facet_wrap(~Group, nrow=2) +
labs(title = "Ordered Categorical Latent Model Fit with mgcv", y = "Count (filled bars) or Prediction (dotted)", x = "Numeric Category") +
scale_x_continuous(breaks = 1:10) +
theme(legend.position = "none")
@noamross
Copy link
Author

Screenshot 2023-05-30 at 8 11 12 PM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment