Skip to content

Instantly share code, notes, and snippets.

@friendly
Last active February 12, 2023 16:36
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save friendly/89b9adc07a21477dbccb1445f4b99b33 to your computer and use it in GitHub Desktop.
Save friendly/89b9adc07a21477dbccb1445f4b99b33 to your computer and use it in GitHub Desktop.
Extract information about two-way terms in a loglinear model
#' Extract information about two-way terms in a loglinear model
#'
#' This function is designed for use with an association graph showing the pairwise
#' dependencies among categorical variables in a loglinear model for frequencies
#' fit using \code{\link[MASS]{loglm}} or \code{\link[stats]{glm}} in a poisson family.
#' The goal is to show
#' an association graph diagram with edges indicating the strength of associations
#' between each pair of variables.
#'
#' The weight for two-way terms can be found using the deviance or LRT or AIC statistic
#' for the effect of dropping a two-way term from the model or for adding a two-way
#' term to the model of mutual independence.
#'
#'
#' @param model A model object fit by \code{\link[MASS]{loglm}} or equivalent fit by
#' \code{\link[stats]{glm}}
#'
#' @param direction Either "backward" or "forward". Only "backward" is currently implemented.
#'
#' @return A data frame identifying the variables (V1, V2) in each two way term,
#' and statistics obtained from \code{\link[MASS]{dropterm}} for the effect on
#' model fit of dropping that term from the given model.
#'
#' @importFrom MASS addterm dropterm
#' @importFrom dplyr filter select
#' @importFrom tidyr separate
#'
#' @examples
#' data(DaytonSurvey, package = "vcdExtra")
#' Dayton_tab <- xtabs(Freq ~ ., data=DaytonSurvey)
#' structable(cigarette + alcohol + marijuana ~ sex + race, data=Dayton_tab)
#'
#' # A potential model: [A M] [A C] [M C] [A R] [A G] [R G]
#' dayton.loglm <- loglm(~ alcohol * (marijuana + cigarette + sex + race) +
#' marijuana * cigarette +
#' sex * race,
#' data=Dayton_tab)
#'
#' term_info(dayton.loglm)
#' term_info(dayton.loglm, direction="forward")
#'
#' # Fit the same model with glm()
#'
#' dayton.glm <- glm(Freq ~ alcohol * (marijuana + cigarette + sex + race) +
#' marijuana * cigarette +
#' sex * race,
#' family = poisson,
#' data=DaytonSurvey)
#' term_info(dayton.glm)
#'
#' # All two-way model
#' dayton.2way <- loglm(Freq ~ (cigarette + alcohol + marijuana + sex + race)^2,
#' data = DaytonSurvey)
#' term_info(dayton.2way)
term_info <- function(model,
direction = c("backward", "forward")) {
class <- class(model)
if(!inherits(model, c("loglm", "glm")))
stop(paste("term_info requires a model of class 'loglm' or 'glm', not one of", class))
dir <- match.arg(direction)
if (dir == "forward") stop("the `forward` method is not yet implemented")
info <- terms(model)
info <- data.frame(
order = attributes(info)$order,
label = attributes(info)$term.label)
terms_2way <- info |>
dplyr::filter(order==2) |>
tidyr::separate(label, c("v1", "v2")) |>
dplyr::select(-order)
if(dir == "backward")
dt <- MASS::dropterm(model, test = "Chisq")
else
# TODO: figure out proper scope to avoid 3+ way terms
# need to start with the independence model, add terms to that
dt <- MASS::addterm(model, test = "Chisq", scope = ~ . + .^2)
dt <- dt |>
as.data.frame() |>
tibble::rownames_to_column(var = "term") |>
dplyr::filter(term != "<none>") |>
# remove heading attribute
dplyr::mutate(across(everything(), as.vector))
# TODO: sort out the returned statistics
# loglm() delivers: AIC, LRT and Pr(Chi) as measures
# glm() also delivers Deviance; the AIC statistics are computed differently
res <- cbind(terms_2way, dt)
# perhaps add a `class` attribute to facilitate methods?
res
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment