@newtover
Last active October 9, 2019 16:35
Percentiles from frequency distribution
import numpy as np
import scipy.stats
# Frequency table: each row is [value, count].
freqs = np.array([
    [1, 3],
    [2, 10],
    [3, 13],
    [4, 12],
    [5, 9],
    [6, 4],
])
def distrib_from_freqs(arr: np.ndarray) -> scipy.stats.rv_discrete:
    pmf = arr[:, 1] / arr[:, 1].sum()
    distrib = scipy.stats.rv_discrete(values=(arr[:, 0], pmf))
    return distrib
distrib = distrib_from_freqs(freqs)
print(distrib.pmf(freqs[:, 0]))
print(distrib.cdf(freqs[:, 0]))
print(distrib.ppf(distrib.cdf(freqs[:, 0])))  # ppf inverts cdf, so this recovers the original values
# [0.05882353 0.19607843 0.25490196 0.23529412 0.17647059 0.07843137]
# [0.05882353 0.25490196 0.50980392 0.74509804 0.92156863 1. ]
# [1. 2. 3. 4. 5. 6.]
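
The snippet above only round-trips the CDF through `ppf`. To read off specific percentiles from the same frequency table, `ppf` can be called directly with probabilities. The sketch below is one possible way to do that, not part of the original gist. The chosen probabilities (`quantiles`) and the cross-check against the expanded sample (`expanded`, `from_sample`) are illustrative assumptions. The cross-check uses the "smallest value whose cumulative share reaches q" definition, which is what `ppf` computes for a discrete distribution.

```python
import numpy as np
import scipy.stats

# Same frequency table as above: each row is [value, count].
freqs = np.array([[1, 3], [2, 10], [3, 13], [4, 12], [5, 9], [6, 4]])
values, counts = freqs[:, 0], freqs[:, 1]

distrib = scipy.stats.rv_discrete(values=(values, counts / counts.sum()))

# Illustrative probabilities (assumption): 25th, 50th and 90th percentiles.
quantiles = np.array([0.25, 0.5, 0.9])
from_ppf = distrib.ppf(quantiles)

# Cross-check: expand the table into the raw sorted sample and take the
# smallest observation whose cumulative share is at least q.
expanded = np.repeat(values, counts)
idx = np.ceil(quantiles * expanded.size).astype(int) - 1
from_sample = expanded[idx]

print(from_ppf)     # [2. 3. 5.]
print(from_sample)  # [2 3 5]
```

Both approaches agree here because neither interpolates between values. `np.percentile` with its default settings interpolates linearly, so it can return different numbers on the same data.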