Compute error rate confidence interval based on CLT approximation to binomial distribution
import numpy as np

def clt_bound(n: int, e: float):
    """Return the lower and upper bound on error rate when test set size is n and empirical error rate is e"""
    assert 0. <= e <= 1. and n >= 0, f'Invalid input: n={n}, e={e}'
    a = 4. + n                          # denominator: n + z**2 with z = 2
    b = 2. + n*e                        # numerator of the interval center: n*e + z**2/2
    d = 2.*np.sqrt(1. + n*e*(1. - e))   # numerator of the half-width
    return ((b - d)/a, (b + d)/a)
The function constructs an approximate 95% confidence interval for the true error rate from the empirical test error rate e and the test set size n. It is based on the CLT (normal) approximation to the binomial distribution.
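As a rough usage sketch: the constants 4 and 2 in the function appear to correspond to a Wilson-style score interval with z = 2 (close to the 1.96 used for 95% coverage). The snippet below checks this numerically. Note that `wilson_interval` is a hypothetical helper written only for this comparison, not part of the gist, and `clt_bound` is copied here with `math.sqrt` standing in for `np.sqrt` so the example has no dependencies.

```python
import math

def clt_bound(n: int, e: float):
    # Copy of the gist's function; math.sqrt replaces np.sqrt (assumption
    # made only to keep this sketch dependency-free).
    a = 4. + n
    b = 2. + n * e
    d = 2. * math.sqrt(1. + n * e * (1. - e))
    return ((b - d) / a, (b + d) / a)

def wilson_interval(n: int, e: float, z: float = 2.0):
    # Hypothetical helper: Wilson score interval for a binomial proportion,
    # written for a general z so it can be compared against clt_bound.
    center = (n * e + z**2 / 2) / (n + z**2)
    half_width = z * math.sqrt(n * e * (1 - e) + z**2 / 4) / (n + z**2)
    return (center - half_width, center + half_width)

# Example: 1000 test points, 10% empirical error rate.
n, e = 1000, 0.1
lo, hi = clt_bound(n, e)
w_lo, w_hi = wilson_interval(n, e, z=2.0)

print(f"clt_bound:        ({lo:.4f}, {hi:.4f})")
print(f"wilson (z = 2.0): ({w_lo:.4f}, {w_hi:.4f})")
```

With these inputs the two intervals agree, at roughly 8.3% to 12.1%. Unlike the plain e ± 2·sqrt(e(1-e)/n) interval, this form stays inside [0, 1] even when e is 0 or 1.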