@eddjberry
Last active May 26, 2018 10:36
Simulate a binomial target and some features
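sim_binom <- function(n_samples = 1000, n_features = 2,
                      true_target_prob = 0.5, beta = NULL, seed = NULL) {
  # Optionally fix the RNG seed so the simulation is reproducible
  if (!is.null(seed)) {
    set.seed(seed)
  }
  # Features: independent standard-normal draws, one column per feature
  x <- matrix(rnorm(n_samples * n_features),
              nrow = n_samples, ncol = n_features)
  # Intercept on the log-odds scale: P(y = 1) equals true_target_prob
  # when every feature is 0
  b0 <- log(true_target_prob / (1 - true_target_prob))
  if (is.null(beta)) {
    # No coefficients supplied: draw them from N(0, 0.5^2)
    beta <- rnorm(n_features, mean = 0, sd = 0.5)
  } else {
    stopifnot(length(beta) == n_features, is.vector(beta), is.numeric(beta))
  }
  # Linear predictor (1 x n_samples), mapped to probabilities by the inverse logit
  z <- b0 + beta %*% t(x)
  prob <- as.vector(1 / (1 + exp(-z)))
  # One Bernoulli draw per sample
  y <- rbinom(n_samples, 1, prob = prob)
  # Return the features (V1, V2, ...), the target, and the true probabilities,
  # together with the coefficients that generated them
  df <- as.data.frame(x)
  df$y <- y
  df$prob <- prob
  list(df = df,
       beta_0 = b0,
       beta = beta)
}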
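
A minimal, self-contained sketch of the same data-generating steps, useful as a sanity check. It assumes illustrative values (`p`, `n`, `beta_true`, and the seed are hypothetical, not taken from the gist) and uses base R's `qlogis()` and `plogis()` in place of the hand-written logit and inverse logit. If the steps are right, a logistic regression fitted with `glm()` should recover coefficients close to the ones used to simulate the data.

```r
# Hypothetical values chosen only for illustration
p <- 0.3
n <- 5000
beta_true <- c(1, -0.5)

# The manual logit matches base R's qlogis(),
# and the manual inverse logit matches plogis()
b0 <- log(p / (1 - p))
stopifnot(isTRUE(all.equal(b0, qlogis(p))))
stopifnot(isTRUE(all.equal(1 / (1 + exp(-b0)), plogis(b0))))

# Same steps as the function: normal features, linear predictor,
# inverse logit, then one Bernoulli draw per row
set.seed(1)
x <- matrix(rnorm(n * length(beta_true)), nrow = n)
prob <- plogis(b0 + as.vector(x %*% beta_true))
y <- rbinom(n, 1, prob = prob)

# Fit a logistic regression and compare estimates with the true values
fit <- glm(y ~ x, family = binomial())
est <- unname(coef(fit))
print(rbind(true = c(b0, beta_true), estimated = est))
```

The estimates will not match exactly; with a few thousand samples they should land within sampling error of the true values.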