@tvladeck
Created April 28, 2021 19:38
simulation of algorithmic fairness
library(tidyverse)
set.seed(1)
sims <- map(1:200, function(x) {
  # draw the true coefficients for this simulation run
  intercept   <- -runif(1) / 2
  beta_male   <- runif(1) / 4
  beta_smoker <- runif(1) / 4

  dat <-
    data.frame(male = sample(c(0, 1), 1000, replace = TRUE)) %>% # female: 0, male: 1; makes coding easier
    rowwise() %>% # so that each runif(1) below is drawn once per row
    # smoking rate: 10% for females, 20% for males
    mutate(smoker = as.integer(runif(1) < 0.1 + male * 0.1)) %>%
    mutate(mortality_logit = intercept + male * beta_male + smoker * beta_smoker) %>%
    # inverse logit; stats::plogis is the base-R equivalent of boot::inv.logit
    mutate(mortality_prob = plogis(mortality_logit)) %>%
    mutate(observation = as.integer(runif(1) < mortality_prob))

  # include gender
  mod_gender <- glm(observation ~ 1 + male + smoker, dat, family = "binomial")
  # don't include gender
  mod_fair <- glm(observation ~ 1 + smoker, dat, family = "binomial")

  dat_with_pred <- dat %>%
    ungroup() %>%
    mutate(risk_gender = predict(mod_gender, ., type = "response")) %>%
    mutate(risk_fair = predict(mod_fair, ., type = "response"))

  # mean true vs. predicted risk per (male, smoker) cell, and each model's error
  dat_with_pred %>%
    group_by(male, smoker) %>%
    summarize(across(c(mortality_prob, risk_gender, risk_fair), mean), .groups = "drop") %>%
    mutate(error_gender = risk_gender - mortality_prob) %>%
    mutate(error_fair = risk_fair - mortality_prob)
})

# average the per-cell results over all simulation runs
sims %>%
  bind_rows() %>%
  group_by(male, smoker) %>%
  summarize(across(everything(), mean), .groups = "drop")