Skip to content

Instantly share code, notes, and snippets.

@vincentarelbundock
Last active February 7, 2022 21:38
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save vincentarelbundock/eb5284a4afd9e543acbf72bd8a34e8ca to your computer and use it in GitHub Desktop.
Save vincentarelbundock/eb5284a4afd9e543acbf72bd8a34e8ca to your computer and use it in GitHub Desktop.
Bayesian bootstrap in `R` with `update()`
library(fixest)
library(data.table)
library(insight)
bboot_vab <- function(model, reps = 1e3, conf.level = .95, cluster = NULL) {
data <- insight::get_data(model)
setDT(data)
if (!is.null(cluster)) {
if (anyNA(data[[cluster]])) {
stop("The cluster variable cannot include missing values.")
}
cluster_idx <- unique(data[[cluster]])
wts <- rexp(reps * length(cluster_idx))
wts <- matrix(wts, ncol = reps)
wts <- data.table(cluster_idx, wts)
colnames(wts)[1] <- cluster
wts <- merge(data, wts, all.x = TRUE, sort = FALSE)
wts <- as.matrix(wts[, (ncol(data)+1):ncol(wts)]) } else {
wts <- rexp(reps * nrow(data))
wts <- matrix(wts, ncol = reps)
}
fit <- function(i) {
coef(update(model, weights = wts[, i]))
}
boot_mat <- t(sapply(seq_len(reps), fit))
alpha <- 1 - conf.level
ci <- c(alpha / 2, 1 - alpha / 2)
out <- t(apply(boot_mat, 2, \(x) c(mean(x), quantile(x, ci))))
return(out)
}
model <- fepois(Sepal.Length ~ Sepal.Width + Petal.Length | Species,
data = iris)
bboot_vab(model, cluster = "Species")
@julianhinz
Copy link

Line 18 can be a problem when more than one set of fixed effects is included. More flexible:

wts <- as.matrix(wts[, (ncol(data)+1):ncol(wts)])

@vincentarelbundock
Copy link
Author

Won't ncol(data)+1 be out of range?

@julianhinz
Copy link

julianhinz commented Feb 7, 2022

I don't think so, as in line 17 you merge wts to data. So in effect with (ncol(data)+1):ncol(wts) you only retain the wts part.

@vincentarelbundock
Copy link
Author

Ah, got it! Read too quickly.

To be fair, this is something I wrote in 10 minutes as a way to understand this thing. I didn't expect anyone to look at it it ;)

Made the change.

@julianhinz
Copy link

:) It does the job!

Follow up question that I have just now been looking into: Any intuition on how to allow for multiway clustering?

@vincentarelbundock
Copy link
Author

Maybe line 12 you would create a unique ID for each combination of your various cluster variables? A row-wise apply(data[[cluster_variables]], 1, paste) would work, maybe?

@julianhinz
Copy link

Not sure whether that does the trick. I'll have a look sometime, if I find something, I'll post it here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment