Skip to content

Instantly share code, notes, and snippets.

@daob
Last active June 17, 2021 09:22
Show Gist options
  • Save daob/c3fa6af06dfd8639ae9d832ff15e5216 to your computer and use it in GitHub Desktop.
Save daob/c3fa6af06dfd8639ae9d832ff15e5216 to your computer and use it in GitHub Desktop.
library(tidyverse)
library(polycor) # for hetcor
set.seed(7567)
N <- 1e4 # Sample size
J <- 100 # Number of groups
ρ <- 0.7 # Within-group correlation
τ <- c(-Inf, -2, 0, 1, Inf) # Thresholds for categories
K <- length(τ) - 1 # Number of categories
z <- sample(1:J, size = N, replace = TRUE) # Group
θ <- rnorm(J, sd = 2) # Group means
# LRVs
x_star <- rnorm(N) + θ[z]
y_star <- ρ * x_star + sqrt(1 - ρ^2) * rnorm(N)
x <- as.numeric(cut(x_star, breaks = τ))
df <- tibble(x, x_star, y_star, z)
# Calculate group-wise statistics (mean, sd, cov, cor, and polychoric cor)
res <- df %>%
group_by(z) %>%
summarize(
mean_x = mean(x),
group_sd_x = sd(x),
group_cor = cor(x, y_star),
group_cov = cov(x, y_star),
group_polycor = polycor::polyserial(y_star, as.factor(x)),
) %>%
arrange(mean_x)
res # Check results
# Plot dependency of aggregate statistics on mean:
res %>%
pivot_longer(group_sd_x:group_polycor) %>%
ggplot(aes(mean_x, value)) + geom_point() +
facet_wrap(~ name) +
ggtitle("Dependency on mean only exists in linear models") +
theme_light()
@daob
Copy link
Author

daob commented Jun 17, 2021

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment