Created
November 26, 2011 09:16
-
-
Save arsatiki/1395348 to your computer and use it in GitHub Desktop.
A/B test result computation. Simulation and exact versions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from __future__ import division | |
import math | |
import sys | |
def gamma(n): | |
return math.factorial(n - 1) | |
def h(a, b, c, d): | |
num = gamma(a + c) * gamma(b + d) * gamma(a + b) * gamma(c + d) | |
den = gamma(a) * gamma(b) * gamma(c) * gamma(d) * gamma(a + b + c + d) | |
return num / den | |
def g0(a, b, c): | |
return gamma(a + b) * gamma(a + c) / (gamma(a + b + c) * gamma(a)) | |
def hiter(a, b, c, d): | |
while d > 1: | |
d -= 1 | |
yield h(a, b, c, d) / d | |
def g(a, b, c, d): | |
return g0(a, b, c) + sum(hiter(a, b, c, d)) | |
def print_odds(p): | |
o = p / (1 - p) | |
b = 10 * math.log(o, 10) | |
if o > 1: | |
s = "%.4f to 1" % o | |
else: | |
s = "1 to %.4f" % (1 / o) | |
print s, "or %.4f dB" % b | |
def main(): | |
if len(sys.argv) != 5: | |
sys.exit("Usage: %s succ1 fail1 succ2 fail2" % sys.argv[0]) | |
s1, f1, s2, f2 = map(int, sys.argv[1:]) | |
print_odds(g(s1 + 1, f1 + 1, s2 + 1, f2 + 1)) | |
if __name__ == "__main__": | |
main() |
For those of you who interested in all three parts, I restored them from the webarchive and convert to markdown. Enjoy!
https://gist.github.com/lucidyan/89eee0db8ce353d91e6bfbc51b5dcf19
Also Evan Miller wrote another great article about it:
https://www.evanmiller.org/bayesian-ab-testing.html
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Thank you very much for this code!
Adding to @lrasinen argumentation: working directly with the factorial can lead to rounding (and, finally, NaNs) issues, and so better to pass to natural logarithms and exponentials.
As example: try with
s1=s2=1
andf1=f2=10000
, the original code will return aNaN
, while with logarithm it gives the right answer.By the way, there is a typo:
should be:
More, in order to have a huge perfomance improvement (20x when the parameters are of the order of 10, 1300x when the parameters are of the order of 100) Numba comes to help.
Finally, this is the modified version: