@alexpkeil1
Last active June 29, 2024 00:37
Using a GEE-like approach to address clustering or longitudinal data with qgcomp
# qgcomp with a GEE-like approach, using either bootstrapping or estimating equations.
# Useful for clustered or longitudinal data when interest is in the effect of x -> y, pooled over multiple time points.
library(qgcomp)
library(qgcompint)
set.seed(50)
####### simulate some clustered data -----
# linear model, continuous modifier (ztype = "continuous" below)
# simulate cluster specific exposures and outcome means by just treating these like independent observations
dat <- qgcompint::simdata_quantized_emm(
  outcometype = "continuous",
  n = 100,
  corr = c(0.8, 0.5, 0.1),
  mainterms = c(0.2, 0.1, -0.3, 0.0),
  prodterms = c(0.2, 0.1, -0.3, 0.0),
  ztype = "continuous"
)
dat$ID = 1:nrow(dat) # cluster/individual ID
clustdat <- rbind(dat, dat) # 2 observations per cluster
# each cluster has two outcomes that are normally distributed around a cluster-specific mean
clustdat$y <- clustdat$y + rnorm(n = nrow(clustdat), sd = 0.5)
####### analyze clustered data using qgcomp -----
# here we just ignore the simulated effect measure modification
# standard errors here are NOT cluster appropriate, but the weights are still usable
# because the point estimates don't change (for a linear model, at least)
(qfit_wrong <- qgcomp.noboot(f = y ~ z + x1 + x2 + x3 + x4,
                             expnms = paste0("x", 1:4), data = clustdat, q = 4, family = gaussian()))
# cluster appropriate standard errors using bootstrap
(qfit_long <- qgcomp.boot(f = y ~ z + x1 + x2 + x3 + x4, id = "ID",
                          expnms = paste0("x", 1:4), data = clustdat, q = 4, family = gaussian()))
# cluster appropriate standard errors using estimating equations
(qfit_long2 <- qgcomp.glm.ee(f = y ~ z + x1 + x2 + x3 + x4, id = "ID",
                             expnms = paste0("x", 1:4), data = clustdat, q = 4, family = gaussian()))
####### analyze clustered data using qgcompint -----
# standard errors here are NOT cluster appropriate, but the weights are still usable
# because the point estimates don't change (for a linear model, at least)
(qfit_emm_wrong <- qgcomp.emm.noboot(f = y ~ z + x1 + x2 + x3 + x4, emmvar = "z",
                                     expnms = paste0("x", 1:4), data = clustdat, q = 4, family = gaussian()))
# cluster appropriate standard errors using bootstrap
(qfit_emm_long <- qgcomp.emm.boot(f = y ~ z + x1 + x2 + x3 + x4, emmvar = "z", id = "ID",
                                  expnms = paste0("x", 1:4), data = clustdat, q = 4, family = gaussian()))
# cluster appropriate standard errors using estimating equations
# not yet implemented
@wpRunningSnail

Is this method applicable to high-dimensional repeated measures data?

@alexpkeil1
Author

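It depends on what you mean by "high dimensional." It is intended for repeated measures data in the same way a GEE is: the point estimates come from a model pooled over all observations, and the standard errors account for clustering. The difference is that the focus is on the joint effect of a number of variables. I doubt it will work for p>n problems.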
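
For anyone wondering what "in the same way a GEE works" means in practice, here is a minimal base R sketch of a cluster bootstrap on a plain linear model. It is an illustration of the general idea only, not the internals of `qgcomp.boot`; the toy data, the variable names (`toy`, `rows_by_id`, `boot_est`), and the choice of 500 resamples are all assumptions made for this example. The point estimate comes from the pooled fit, and only the standard error changes once whole clusters are resampled together.

```r
set.seed(1)

# Hypothetical toy data: 100 clusters, 2 observations each, one exposure
n_clusters <- 100
id <- rep(seq_len(n_clusters), each = 2)
x  <- rep(rnorm(n_clusters), each = 2)             # exposure is shared within a cluster
mu <- 0.5 * x + rep(rnorm(n_clusters), each = 2)   # cluster-specific outcome mean
toy <- data.frame(id = id, x = x, y = mu + rnorm(2 * n_clusters, sd = 0.5))

# Naive fit: treats all 200 rows as independent observations
naive    <- lm(y ~ x, data = toy)
naive_se <- unname(sqrt(diag(vcov(naive)))["x"])

# Cluster bootstrap: resample whole clusters with replacement and refit each time
rows_by_id <- split(seq_len(nrow(toy)), toy$id)
boot_est <- replicate(500, {
  sampled <- sample(names(rows_by_id), replace = TRUE)
  idx <- unlist(rows_by_id[sampled], use.names = FALSE)
  coef(lm(y ~ x, data = toy[idx, ]))["x"]
})
boot_se <- sd(boot_est)

# Same point estimate, two different standard errors
c(estimate = unname(coef(naive)["x"]), naive_se = naive_se, cluster_boot_se = boot_se)
```

Because both rows in a cluster share the same exposure and the same cluster-level mean, the naive standard error treats the data as more informative than it is, which is the same problem the `id` argument is meant to address in the gist above.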
