@arsatiki
Created November 26, 2011 09:16
A/B test result computation. Simulation and exact versions.
from __future__ import division  # required on Python 2; a no-op on Python 3
import math
import sys


def gamma(n):
    # Gamma(n) = (n - 1)! for positive integers
    return math.factorial(n - 1)


def h(a, b, c, d):
    num = gamma(a + c) * gamma(b + d) * gamma(a + b) * gamma(c + d)
    den = gamma(a) * gamma(b) * gamma(c) * gamma(d) * gamma(a + b + c + d)
    return num / den


def g0(a, b, c):
    return gamma(a + b) * gamma(a + c) / (gamma(a + b + c) * gamma(a))


def hiter(a, b, c, d):
    while d > 1:
        d -= 1
        yield h(a, b, c, d) / d


def g(a, b, c, d):
    return g0(a, b, c) + sum(hiter(a, b, c, d))


def print_odds(p):
    o = p / (1 - p)
    b = 10 * math.log(o, 10)
    if o > 1:
        s = "%.4f to 1" % o
    else:
        s = "1 to %.4f" % (1 / o)
    print(s, "or %.4f dB" % b)


def main():
    if len(sys.argv) != 5:
        sys.exit("Usage: %s succ1 fail1 succ2 fail2" % sys.argv[0])
    s1, f1, s2, f2 = map(int, sys.argv[1:])
    print_odds(g(s1 + 1, f1 + 1, s2 + 1, f2 + 1))


if __name__ == "__main__":
    main()
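The gist's description promises both simulation and exact versions. Here is a minimal Monte Carlo cross-check (a sketch, not part of the original gist) that the closed form `g(a, b, c, d)` really is P(p1 > p2) for p1 ~ Beta(a, b), p2 ~ Beta(c, d) — the convention implied by `main`'s call `g(s1 + 1, f1 + 1, s2 + 1, f2 + 1)`:

```python
# Monte Carlo cross-check of the exact formula (a sketch, not from the gist).
import math
import random


def gamma(n):
    return math.factorial(n - 1)


def h(a, b, c, d):
    num = gamma(a + c) * gamma(b + d) * gamma(a + b) * gamma(c + d)
    den = gamma(a) * gamma(b) * gamma(c) * gamma(d) * gamma(a + b + c + d)
    return num / den


def g0(a, b, c):
    return gamma(a + b) * gamma(a + c) / (gamma(a + b + c) * gamma(a))


def g(a, b, c, d):
    # Same recurrence as the gist's hiter loop, written as a comprehension.
    return g0(a, b, c) + sum(h(a, b, c, k) / k for k in range(1, d))


def g_mc(a, b, c, d, n=100000):
    """Estimate P(p1 > p2) by sampling both Beta posteriors directly."""
    wins = sum(random.betavariate(a, b) > random.betavariate(c, d)
               for _ in range(n))
    return wins / n


# Example: 7/10 vs 4/10 successes -> posteriors Beta(8, 4) and Beta(5, 7)
print("exact  %.4f" % g(8, 4, 5, 7))
print("approx %.4f" % g_mc(8, 4, 5, 7))
```

The two numbers should agree to roughly two decimal places at this sample size; the exact version is what `main` prints as odds.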
@chris-hailstorm

Any pointers on how to adapt for >2 comparisons? I.e. A-B-C-D testing.
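The closed form in the gist is specific to two variants. One common generalization (a sketch, not from the gist) is to estimate P(each variant is best) by jointly sampling every variant's Beta posterior and counting how often each draw is the maximum:

```python
# A/B/C/D sketch (assumption: not part of the original gist). Each variant
# keeps a Beta(successes + 1, failures + 1) posterior; the probability that
# each variant is best is estimated by joint sampling.
import random


def prob_best(counts, n=100000):
    """counts: list of (successes, failures) per variant.
    Returns the estimated probability that each variant is the best one."""
    wins = [0] * len(counts)
    for _ in range(n):
        draws = [random.betavariate(s + 1, f + 1) for s, f in counts]
        wins[draws.index(max(draws))] += 1
    return [w / n for w in wins]


# Example: four variants with 100 trials each
print(prob_best([(10, 90), (15, 85), (12, 88), (20, 80)]))
```

For two variants this converges to the same answer as the exact `g`; the simulation route is simply the one that scales to any number of arms.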

@vsmelov

vsmelov commented Jun 25, 2018

Thanks for the real-code implementation with explanations!
Have you considered moving your great article https://web.archive.org/web/20130331072458/http://arsatiki.posterous.com/bayesian-ab-testing-with-theory-and-code to GitHub too?

Also, I would like to ask:
How do you calculate the Type II error (false-negative rate)?
How do you calculate the sample size needed for a given confidence level?
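The gist gives no closed form for either question, but both can be answered by simulation (a sketch; the 0.95 decision threshold and the true conversion rates below are assumptions for illustration): fix true rates, simulate experiments of size n per arm, and count how often the Bayesian decision rule detects the difference. The Type II error is one minus that detection rate (the power); scan n upward until the power reaches your target.

```python
# Power / sample-size sketch (assumption: not from the gist).
import random


def prob_b_beats_a(sa, fa, sb, fb, draws=2000):
    """Monte Carlo estimate of P(pB > pA) under Beta(s + 1, f + 1) posteriors."""
    wins = sum(random.betavariate(sb + 1, fb + 1) >
               random.betavariate(sa + 1, fa + 1) for _ in range(draws))
    return wins / draws


def power(p_a, p_b, n, threshold=0.95, trials=200):
    """Fraction of simulated experiments (n visitors per arm) in which the
    rule "declare B better if P(pB > pA) > threshold" fires."""
    detected = 0
    for _ in range(trials):
        sa = sum(random.random() < p_a for _ in range(n))
        sb = sum(random.random() < p_b for _ in range(n))
        if prob_b_beats_a(sa, n - sa, sb, n - sb) > threshold:
            detected += 1
    return detected / trials


# Type II error at n = 500 per arm for a 5% vs 7% true difference:
print(1 - power(0.05, 0.07, 500))
```

To size an experiment, evaluate `power` over a grid of n and pick the smallest n whose power clears your target (say, 0.8).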

@vlavorini

Thank you very much for this code!

Adding to @lrasinen's argument: working directly with factorials can lead to rounding issues (and, eventually, NaNs), so it is better to switch to natural logarithms and exponentials.
For example, with s1 = s2 = 1 and f1 = f2 = 10000 the original code fails, while the logarithmic version gives the right answer.
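To illustrate the point (a sketch using only the standard library): the lgamma form agrees with the factorial form for small counts, and stays finite and fast at counts in the thousands, where the factorials themselves become enormous numbers:

```python
# Factorial vs. log-gamma version of h (a sketch, standard library only).
import math


def h_factorial(a, b, c, d):
    # The gist's original formulation, via exact integer factorials.
    gamma = lambda n: math.factorial(n - 1)
    num = gamma(a + c) * gamma(b + d) * gamma(a + b) * gamma(c + d)
    den = gamma(a) * gamma(b) * gamma(c) * gamma(d) * gamma(a + b + c + d)
    return num / den


def h_lgamma(a, b, c, d):
    # Same quantity computed in log space: no huge intermediates.
    num = (math.lgamma(a + c) + math.lgamma(b + d) +
           math.lgamma(a + b) + math.lgamma(c + d))
    den = (math.lgamma(a) + math.lgamma(b) + math.lgamma(c) +
           math.lgamma(d) + math.lgamma(a + b + c + d))
    return math.exp(num - den)


print(h_factorial(2, 3, 4, 5))  # small counts: both agree
print(h_lgamma(2, 3, 4, 5))
print(h_lgamma(1, 10001, 1, 10001))  # large counts: log space stays cheap
```

Note that the pure-Python original survives longer than a float version would, because `math.factorial` works on exact integers; the NaNs appear as soon as floating-point gamma values enter the pipeline, which is exactly what the lgamma rewrite avoids.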

By the way, there is a typo:

def g0(a, b, c):
    return exp(lgamma(a + b) + lgamma(a + c) / (lgamma(a + b + c) + lgamma(a)))

should be:

def g0(a, b, c):
    return np.exp(lgamma(a + b) + lgamma(a + c) - (lgamma(a + b + c) + lgamma(a)))

Moreover, for a huge performance improvement (20x when the parameters are of the order of 10, 1300x when they are of the order of 100), Numba comes to the rescue.

Finally, this is the modified version:

from math import lgamma

import numpy as np  # needed for np.exp below
from numba import jit


@jit
def h(a, b, c, d):
    num = lgamma(a + c) + lgamma(b + d) + lgamma(a + b) + lgamma(c + d)
    den = lgamma(a) + lgamma(b) + lgamma(c) + lgamma(d) + lgamma(a + b + c + d)
    return np.exp(num - den)


@jit
def g0(a, b, c):
    return np.exp(lgamma(a + b) + lgamma(a + c) - (lgamma(a + b + c) + lgamma(a)))


@jit
def hiter(a, b, c, d):
    while d > 1:
        d -= 1
        yield h(a, b, c, d) / d

@lucidyan

lucidyan commented Feb 4, 2020

For those of you who are interested in all three parts, I restored them from the Web Archive and converted them to Markdown. Enjoy!
https://gist.github.com/lucidyan/89eee0db8ce353d91e6bfbc51b5dcf19

Also Evan Miller wrote another great article about it:
https://www.evanmiller.org/bayesian-ab-testing.html
