@djhsu
Created October 5, 2023 11:54
Compute error rate confidence interval based on CLT approximation to binomial distribution
import numpy as np

def clt_bound(n: int, e: float):
    """Return the lower and upper bound on the error rate when the test set size is n and the empirical error rate is e."""
    assert 0. <= e <= 1. and n >= 0, f'Invalid input: n={n}, e={e}'
    a = 4. + n  # denominator: n + z**2 with z = 2
    b = 2. + n * e  # number of test errors plus z**2 / 2
    d = 2. * np.sqrt(1. + n * e * (1. - e))  # z * sqrt(n*e*(1-e) + z**2 / 4)
    return ((b - d) / a, (b + d) / a)
djhsu commented Oct 5, 2023

The function constructs an approximate 95% confidence interval for the true error rate from the empirical error rate e measured on a test set of size n. It is based on the CLT (normal) approximation to the binomial distribution.
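As a rough illustration of the interval described above, the sketch below calls the function on one example and compares it with the Wilson score interval at z = 2 (with z rounded up from 1.96 for roughly 95% coverage). Reading the formula as a Wilson interval is an interpretation of the algebra, not something the author states. The helper `wilson_interval` is a hypothetical name introduced here, and `math.sqrt` stands in for `np.sqrt` only to keep the example dependency-free.

```python
import math


def clt_bound(n: int, e: float):
    """Copy of the gist's function, using math.sqrt instead of np.sqrt."""
    assert 0.0 <= e <= 1.0 and n >= 0, f"Invalid input: n={n}, e={e}"
    a = 4.0 + n
    b = 2.0 + n * e
    d = 2.0 * math.sqrt(1.0 + n * e * (1.0 - e))
    return ((b - d) / a, (b + d) / a)


def wilson_interval(n: int, e: float, z: float = 2.0):
    """Hypothetical helper: textbook Wilson score interval (requires n > 0)."""
    scale = 1.0 + z * z / n
    center = (e + z * z / (2.0 * n)) / scale
    half = (z / scale) * math.sqrt(e * (1.0 - e) / n + z * z / (4.0 * n * n))
    return (center - half, center + half)


# Example: 100 errors observed on a test set of 1000 examples.
lo, hi = clt_bound(1000, 0.1)
print(f"clt_bound:       [{lo:.4f}, {hi:.4f}]")  # [0.0826, 0.1206]

w_lo, w_hi = wilson_interval(1000, 0.1)
print(f"wilson_interval: [{w_lo:.4f}, {w_hi:.4f}]")  # same values at z = 2
```

Unlike the plain interval e ± z·sqrt(e(1−e)/n), this form never leaves [0, 1] and keeps a nonzero width when e is exactly 0 or 1, which matters for small test sets or very low error rates.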
