@vladiibine
Last active March 7, 2020 12:31
import math


def algorithmic_chance_men_in_group(pop_size, sample_size, num_men):
    """Estimate the chance of getting exactly `num_men` men when taking a
    sample of size `sample_size` from a population of size `pop_size`.

    Returns a tuple:
    (chance_as_percent, numerator, denominator, multiplier)

    Usage:
    >>> chance_as_percent = algorithmic_chance_men_in_group(30000, 1000, 600)[0]
    """
    # Assume that 50% of the population is men (an odd pop_size is rounded down).
    numerator = 1
    denominator = 1
    total_men = pop_size // 2
    total_women = total_men

    # Chance that the first `num_men` people drawn are all men.
    for _ in range(num_men):
        numerator *= total_men
        denominator *= total_men + total_women
        total_men -= 1

    # Chance that the remaining (sample_size - num_men) people are all women.
    # Without this step we would get the chance of AT LEAST num_men men,
    # not EXACTLY num_men men.
    for _ in range(sample_size - num_men):
        numerator *= total_women
        denominator *= total_men + total_women
        total_women -= 1

    # This factor is interesting!
    # So far we have only the odds that the first `num_men` people are men and
    # all the rest are women. Every other ordering of the same men and women is
    # equally likely, and for our question they all count: it doesn't matter
    # whether the men came first, the women came first, or they were mixed.
    # The number of such orderings is "sample_size choose num_men".
    multiplier = (
        math.factorial(sample_size)
        // math.factorial(num_men)
        // math.factorial(sample_size - num_men)
    )

    # Multiply the integers before dividing, so that the tiny ratio
    # numerator / denominator cannot underflow to 0.0 for large samples.
    chance_as_percent = 100 * numerator * multiplier / denominator
    return chance_as_percent, numerator, denominator, multiplier


def bernouli_chance_men_in_group(sample_size, num_men):
    """Used to test the correctness of algorithmic_chance_men_in_group.

    The algorithmic function accounts for the fact that removing an individual
    from the population slightly decreases the chance of meeting another
    individual of the same sex. A Bernoulli trial assumes the chance is
    always 0.5.

    The two functions give nearly the same answer if the population is large
    enough. Note that this one returns a probability between 0 and 1, while
    the algorithmic one returns a percentage.
    """
    return (
        math.factorial(sample_size)
        // math.factorial(num_men)
        // math.factorial(sample_size - num_men)
        * pow(0.5, sample_size)
    )
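The claim that the two approaches agree for large populations can be checked with a short, independent sketch. It is not part of the original gist: the names `hypergeometric_chance` and `binomial_chance` are made up for illustration, and `math.comb` is assumed to be available (Python 3.8+). The first function is the closed form of the step-by-step product computed above, under the same assumption that half the population is men.

```python
import math


def hypergeometric_chance(pop_size, sample_size, num_men):
    """Chance of exactly `num_men` men, sampling without replacement.

    Illustrative helper; assumes half of the population is men.
    """
    total_men = pop_size // 2
    total_women = total_men
    # Ways to pick the men, times ways to pick the women ...
    favourable = math.comb(total_men, num_men) * math.comb(
        total_women, sample_size - num_men
    )
    # ... out of all ways to pick the whole sample.
    possible = math.comb(total_men + total_women, sample_size)
    return favourable / possible


def binomial_chance(sample_size, num_men):
    """Chance of exactly `num_men` men if every draw is an independent 50/50."""
    return math.comb(sample_size, num_men) / 2 ** sample_size


small = hypergeometric_chance(10, 4, 2)         # small population
large = hypergeometric_chance(1_000_000, 4, 2)  # large population
limit = binomial_chance(4, 2)                   # the 50/50 approximation
print(small, large, limit)
```

With a population of 10 the two models differ noticeably, while with a population of a million the hypergeometric value is very close to the binomial one. Both helpers return probabilities between 0 and 1, not percentages.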