@statwonk
Created July 12, 2019 13:52
Assuming outages are totally unrelated and occur at a rate of one every other month, what's the chance of seeing more than four in a given week? Roughly one in eight million.
# Source thread: https://news.ycombinator.com/item?id=20356610
# The question as posed there: chance of the cluster size being >5 when we can
# cluster events within a one-week period and the rate of events is one every
# 2 months (debatable, but feels right to me)?
# My rewrite:
# probability of more than four outages (i.e. five or more) in a given week,
# given a rate of one outage every other month,
# _assuming_ totally unrelated infra (none share AWS, etc.).
# Weekly rate: 1 outage / 2 months / 4.5 weeks per month = 1/9 per week.
ppois(4, lambda = 1/2/4.5, lower.tail = FALSE) # P(X > 4) ~ 1.3e-7, roughly 1 in 7.8M
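# A hedged cross-check of the tail probability above (a sketch, not part of the
# original analysis): sum the Poisson pmf over 5, 6, ..., 30 by hand and compare
# the result with ppois(). Terms beyond 30 are negligible at this rate.
# `weekly_rate` and `tail_by_hand` are names introduced here for illustration.
weekly_rate <- 1 / 2 / 4.5
k <- 5:30
tail_by_hand <- sum(exp(-weekly_rate) * weekly_rate^k / factorial(k))
all.equal(tail_by_hand, ppois(4, lambda = weekly_rate, lower.tail = FALSE))
# The answer is sensitive to the weeks-per-month figure. Using 52 / 12 weeks per
# month (my assumption for an average month, not the figure used above) gives a
# slightly higher weekly rate and therefore a somewhat larger tail probability.
ppois(4, lambda = 1 / 2 / (52 / 12), lower.tail = FALSE)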
# Given that we've observed five outages in one week,
# what could we guess about the weekly rate?
library(tidyverse)

# Evaluate the Poisson likelihood of seeing 5 outages in a week over a grid of
# candidate weekly rates, then mark the rate with the highest likelihood.
tibble(possible_lambdas = seq(0.1, 20, 0.1)) %>%
  mutate(given_five_outages_chance_of_lambda = dpois(5, lambda = possible_lambdas)) %>%
  mutate(max_likelihood = possible_lambdas[which.max(given_five_outages_chance_of_lambda)]) %>%
  ggplot(aes(x = possible_lambdas, y = given_five_outages_chance_of_lambda)) +
  geom_line() +
  geom_vline(aes(xintercept = max_likelihood), color = "red") +
  ggtitle("Likelihood of unknown outage rate\ngiven one week with five outages.") +
  xlab("Possible unknown outage rate candidates") +
  ylab("Likelihood of candidate rate")
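
# A hedged numerical check of the likelihood peak (a sketch, not part of the
# original analysis). For a single Poisson count k, the log-likelihood is
# k * log(lambda) - lambda - log(k!), whose derivative k / lambda - 1 is zero at
# lambda = k, so the peak should sit at the observed count.
# `observed_outages` and `fit` are names introduced here for illustration.
observed_outages <- 5
fit <- optimize(
  function(lambda) dpois(observed_outages, lambda = lambda, log = TRUE),
  interval = c(0.1, 20),  # same range as the grid of candidate rates
  maximum = TRUE
)
fit$maximum # close to observed_outages, up to optimize()'s default tolerance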