Skip to content

Instantly share code, notes, and snippets.

@rsnemmen
Created April 1, 2015 19:15
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rsnemmen/f2c03beb391db809c90f to your computer and use it in GitHub Desktop.
Save rsnemmen/f2c03beb391db809c90f to your computer and use it in GitHub Desktop.
Calculates the confidence band of a linear regression model at the desired confidence level, using analytical methods
def confband(xd,yd,a,b,conf=0.95,x=None):
"""
Calculates the confidence band of the linear regression model at the desired confidence
level, using analytical methods. The 2sigma confidence interval is 95% sure to contain
the best-fit regression line. This is not the same as saying it will contain 95% of
the data points.
Arguments:
- conf: desired confidence level, by default 0.95 (2 sigma)
- xd,yd: data arrays
- a,b: linear fit parameters as in y=ax+b
- x: (optional) array with x values to calculate the confidence band. If none is provided, will
by default generate 100 points in the original x-range of the data.
Returns:
Sequence (lcb,ucb,x) with the arrays holding the lower and upper confidence bands
corresponding to the [input] x array.
Usage:
>>> lcb,ucb,x=nemmen.confband(all.kp,all.lg,a,b,conf=0.95)
calculates the confidence bands for the given input arrays
>>> pylab.fill_between(x, lcb, ucb, alpha=0.3, facecolor='gray')
plots a shaded area containing the confidence band
References:
1. http://en.wikipedia.org/wiki/Simple_linear_regression, see Section Confidence intervals
2. http://www.weibull.com/DOEWeb/confidence_intervals_in_simple_linear_regression.htm
Author: Rodrigo Nemmen
v1 Dec. 2011
v2 Jun. 2012: corrected bug in computing dy
"""
alpha=1.-conf # significance
n=xd.size # data sample size
if x==None: x=numpy.linspace(xd.min(),xd.max(),100)
# Predicted values (best-fit model)
y=a*x+b
# Auxiliary definitions
sd=scatterfit(xd,yd,a,b) # Scatter of data about the model
sxd=numpy.sum((xd-xd.mean())**2)
sx=(x-xd.mean())**2 # array
# Quantile of Student's t distribution for p=1-alpha/2
q=scipy.stats.t.ppf(1.-alpha/2.,n-2)
# Confidence band
dy=q*sd*numpy.sqrt( 1./n + sx/sxd )
ucb=y+dy # Upper confidence band
lcb=y-dy # Lower confidence band
return lcb,ucb,x
@Scimonster
Copy link

For those who are wondering, scatterfit would appear to be from this gist: https://gist.github.com/rsnemmen/0eb32832c657c19e4d39

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment