@timm
Last active March 26, 2024 14:05
Gaussian discretization

"""
Divides a numeric range into a fixed number of bins,
each covering an equal area under a Gaussian curve.
Raw numbers are converted to their 'z' score
(normalized to mean=0, sd=1), and that score is then
looked up in an equal-area division of the region
under the bell curve.
tim@menzies.us, (c) 2012
Creative Commons Attribution 3.0 (Unported) [1]
Share and Enjoy! [2]
;-)
{ 1: "http://goo.gl/o9gzT"; 2:"http://goo.gl/7l4YL"}
"""
def gbin(n, bins, mean, sd):
    "Main driver: map the raw number 'n' to a bin index in 0..bins-1."
    return bin((n - mean) / sd, breaks[bins])
# For each bin count, the bins-1 z-score cut points that split the area
# under the standard normal curve into equal slices.
breaks = {
    2:  [0],
    3:  [-0.43, 0.43],
    4:  [-0.67, 0, 0.67],
    5:  [-0.84, -0.25, 0.25, 0.84],
    6:  [-0.97, -0.43, 0, 0.43, 0.97],
    7:  [-1.07, -0.57, -0.18, 0.18, 0.57, 1.07],
    8:  [-1.15, -0.67, -0.32, 0, 0.32, 0.67, 1.15],
    9:  [-1.22, -0.76, -0.43, -0.14, 0.14, 0.43, 0.76, 1.22],
    10: [-1.28, -0.84, -0.52, -0.25, 0, 0.25, 0.52, 0.84, 1.28]
}
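# Note: this name shadows Python's built-in bin().
def bin(val, breaks, ninf=float("-inf")):
    "Low-level discretization of 'val' using the supplied 'breaks'."
    before = ninf  # lower edge of the bin currently being tested
    last = 0
    for i, now in enumerate(breaks):
        if val > before and val <= now:
            return i
        before = now
        last = i
    return last + 1  # val lies above the final break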
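
The cut points in the table above can also be derived rather than hard-coded. The following is a minimal sketch, not part of the original gist, that assumes Python 3.8+ for `statistics.NormalDist`; the names `equal_area_breaks` and `gbin_sketch` are hypothetical helpers introduced only for illustration. It computes the z-scores that split the standard normal curve into equal areas and then locates a value among them with a binary search.

```python
from bisect import bisect_left
from statistics import NormalDist  # available in Python 3.8+


def equal_area_breaks(bins):
    """Return the bins-1 z-scores that cut N(0, 1) into equal-area slices.

    Hypothetical helper: values are rounded to two decimals to match the
    precision of a hand-written lookup table.
    """
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # The k-th cut point is the z-score below which k/bins of the area lies.
    return [round(std_normal.inv_cdf(k / bins), 2) for k in range(1, bins)]


def gbin_sketch(n, bins, mean, sd):
    """Map raw value n to a bin index in 0..bins-1 (hypothetical helper)."""
    z = (n - mean) / sd
    # bisect_left returns the index of the first cut point >= z, so a value
    # exactly on a cut point falls into the lower bin.
    return bisect_left(equal_area_breaks(bins), z)


print(equal_area_breaks(5))
print(gbin_sketch(70, 4, mean=50, sd=10))
```

With `mean=50` and `sd=10`, the value 70 has a z-score of 2, which lies above every cut point for four bins and so lands in the top bin.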