@sdaza
Created June 18, 2020 09:52
# stratified sampling example and weighted means
set.seed(1)
library(data.table)
# toy data: two stratification variables (A, D) and a sampling weight (W)
dat = data.table(
    ID = 1:100,
    A = sample(c("AA", "BB", "CC", "DD", "EE"), 100, replace = TRUE),
    B = rnorm(100),
    C = abs(round(rnorm(100), digits = 1)),
    D = sample(c("CA", "NY", "TX"), 100, replace = TRUE),
    E = sample(c("M", "F"), 100, replace = TRUE),
    W = runif(100, 3, 10))
# group sizes in the original data
table(dat$A)
table(dat$D)
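# 10 stratified samples with replacement: within each (A, D) stratum,
# draw .N rows from that stratum's own rows, so stratum sizes are preserved
samples = list()
for (i in 1:10) {
    samples[[i]] = dat[, .SD[sample(.N, replace = TRUE)], .(A, D)]
}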
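# anyDuplicated() returns 0 when there are no repeated IDs, otherwise the
# index of the first duplicate; resampled data are expected to repeat IDs
anyDuplicated(dat[, ID])
anyDuplicated(samples[[1]][, ID])
anyDuplicated(samples[[2]][, ID])
# marginal counts of A and D in a resample should match the original data
table(samples[[3]]$A)
table(samples[[3]]$D)
table(dat$A)
table(dat$D)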
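# A possible sanity check (a sketch, not part of the original script): since
# each (A, D) stratum is resampled to its own size, every replicate should
# reproduce the original stratum counts. The names `strata_n` and
# `sizes_match` are introduced here for illustration only.
strata_n = dat[, .N, keyby = .(A, D)]
sizes_match = sapply(samples, function(s) {
    # keyby sorts the strata, so the count vectors are directly comparable
    identical(s[, .N, keyby = .(A, D)]$N, strata_n$N)
})
all(sizes_match)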
# weighted mean of C using W as weights, next to the unweighted mean
dat[, weighted.mean(C, W)]
dat[, mean(C)]
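# One way the two parts of this script could be combined (a sketch under
# assumptions, not the author's stated method): compute the weighted mean of C
# in each resampled data set and look at its spread across replicates.
# With only 10 replicates this is a rough illustration, not a usable standard
# error. `boot_means` is a name introduced here for illustration.
boot_means = sapply(samples, function(s) s[, weighted.mean(C, W)])
boot_means
sd(boot_means)
# weighted.mean(C, W) is equivalent to sum(C * W) / sum(W)
dat[, all.equal(weighted.mean(C, W), sum(C * W) / sum(W))]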