@DanielTakeshi
Last active July 26, 2020 15:28
How to sample from a log-uniform distribution.
"""
How we might sample from a log-uniform distribution.
See: https://stats.stackexchange.com/questions/155552/what-does-log-uniformly-distribution-mean

Enable only one of the three cases below at a time; otherwise the histograms
are drawn onto the same figure. Tested with these versions:
    matplotlib 3.2.1
    numpy 1.18.3
"""
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(suppress=True, linewidth=200, edgeitems=10)
low = 0.01
high = 0.7
size = 10000
nb_bins = 50
if False:
    data = np.random.uniform(low=low, high=high, size=size)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('Uniform({}, {})'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform.png')
if False:
    # Uniform samples in log space; these are log(epsilon), not yet exponentiated.
    data = np.random.uniform(low=np.log(low), high=np.log(high), size=size)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('Uniform(log({}), log({}))'.format(low, high))
    plt.xlabel('log(Epsilon)')
    plt.savefig('distr_uniform_log.png')
# nb_classes is the number of bin boundaries, so there are nb_classes - 1 intervals.
nb_classes = 5 + 1
if True:
    # Sample uniformly in log space, then exponentiate to get log-uniform samples.
    data = np.random.uniform(low=np.log(low), high=np.log(high), size=size)
    discretized = np.linspace(np.log(low), np.log(high), num=nb_classes)
    data = np.exp(data)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('exp( Uniform(log({}), log({})) )'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform_log_true.png')

    # Now let's add discretized ranges.
    print('Discretized bounds (len {}) for epsilons:\nLog: {}\nNormal: {}'.format(
        len(discretized), discretized, np.exp(discretized)))
    for idx, item in enumerate(discretized):
        # Draw each boundary at its value in the original (non-log) scale.
        plt.axvline(x=np.exp(item), color='black')
        if idx < len(discretized) - 1:
            start = np.exp(discretized[idx])
            end = np.exp(discretized[idx+1])
            count = np.sum( (start <= data) & (data < end) )
            print('{:.3f} <= x < {:.3f} count: {}'.format(start, end, count))
    plt.savefig('distr_uniform_log_true_bounds.png')
DanielTakeshi commented Apr 23, 2020

With the range changed to 0.01 to 0.5, here are the same three plots, in order:

[Plots: distr_uniform, distr_uniform_log, distr_uniform_log_true]
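
The third plot piles up near the low end because exponentiating a uniform sample in log space gives a density proportional to 1/x. Here is a minimal sketch of the same sampling written in closed form, `low * (high / low) ** u` with `u` drawn from Uniform(0, 1). The helper name `sample_log_uniform` and the seed are assumptions for illustration and are not part of the gist.

```python
import numpy as np

def sample_log_uniform(low, high, size, rng=None):
    """Hypothetical helper: draw log-uniform samples between low and high."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(0.0, 1.0, size=size)
    # Same distribution as np.exp(uniform(log(low), log(high))), in closed form.
    return low * (high / low) ** u

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
x = sample_log_uniform(0.01, 0.5, size=10000, rng=rng)

# About half of the samples should fall below the geometric mean sqrt(low * high).
geo_mean = np.sqrt(0.01 * 0.5)
frac_below = np.mean(x < geo_mean)
print('min {:.4f}, max {:.4f}, fraction below {:.4f}: {:.3f}'.format(
    x.min(), x.max(), geo_mean, frac_below))
```

The geometric mean, not the arithmetic mean, is the median of this distribution, which is one quick way to sanity-check the samples.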

DanielTakeshi commented Jul 26, 2020

(July 26) Now sampling from log(0.01) to log(0.7), with discretized bins.

Here is a plot with nb_classes=5+1. The extra one is there because nb_classes is really the number of vertical boundary lines, so 5+1 lines give 5 intervals.

[Plot: distr_uniform_log_true_bounds, nb_classes=5+1]

For the five classes, I get these counts:

0.010 <= x < 0.023 count: 1998
0.023 <= x < 0.055 count: 1968
0.055 <= x < 0.128 count: 2020
0.128 <= x < 0.299 count: 2021
0.299 <= x < 0.700 count: 1993
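
These near-equal counts are what we should expect: intervals of equal width in log space each hold one fifth of the probability mass, so about 2000 of the 10000 samples. Below is a rough sketch of the same check, assuming `np.geomspace` for the boundaries (it gives the same values as `np.exp(np.linspace(np.log(low), np.log(high), ...))`) and `np.histogram` in place of the explicit loop. The seed is arbitrary, so the counts will not match the numbers above exactly.

```python
import numpy as np

low, high, size = 0.01, 0.7, 10000
nb_classes = 5 + 1  # number of boundaries, so 5 intervals

rng = np.random.default_rng(0)  # arbitrary seed, an assumption for repeatability
data = np.exp(rng.uniform(np.log(low), np.log(high), size=size))

# Geometrically spaced edges: same values as exp(linspace(log(low), log(high))).
edges = np.geomspace(low, high, num=nb_classes)
counts, _ = np.histogram(data, bins=edges)

expected = size / (nb_classes - 1)
for start, end, count in zip(edges[:-1], edges[1:], counts):
    print('{:.3f} <= x < {:.3f} count: {} (expected about {:.0f})'.format(
        start, end, count, expected))
```

Note that `np.histogram` treats the last bin as closed on the right, which differs from the strict `data < end` test in the script only for a sample exactly equal to `high`.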

With nb_classes=10+1, we get:

[Plot: distr_uniform_log_true_bounds, nb_classes=10+1]

0.010 <= x < 0.015 count: 991
0.015 <= x < 0.023 count: 1034
0.023 <= x < 0.036 count: 1043
0.036 <= x < 0.055 count: 968
0.055 <= x < 0.084 count: 983
0.084 <= x < 0.128 count: 1020
0.128 <= x < 0.196 count: 1051
0.196 <= x < 0.299 count: 1022
0.299 <= x < 0.458 count: 961
0.458 <= x < 0.700 count: 927
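
The spread in these ten counts (927 to 1051) looks consistent with sampling noise. Each count is roughly binomial with n=10000 and p=0.1, so the expected value is 1000 with a standard deviation of sqrt(10000 * 0.1 * 0.9) = 30, which puts 927 about 2.4 standard deviations low. The sketch below assigns each sample a class index with `np.digitize` and prints the deviations. The variable names and the seed are assumptions, not part of the gist.

```python
import numpy as np

low, high, size = 0.01, 0.7, 10000
nb_classes = 10 + 1  # 11 boundaries, 10 intervals

rng = np.random.default_rng(0)  # arbitrary seed
data = np.exp(rng.uniform(np.log(low), np.log(high), size=size))
edges = np.geomspace(low, high, num=nb_classes)

# Class k means edges[k] <= x < edges[k+1]. Only the interior edges are passed,
# so every sample maps to a label in 0..9.
labels = np.digitize(data, edges[1:-1])
counts = np.bincount(labels, minlength=nb_classes - 1)

p = 1.0 / (nb_classes - 1)
expected = size * p
std = np.sqrt(size * p * (1 - p))  # binomial standard deviation, 30 here
print('expected {:.0f} +/- {:.0f} per class'.format(expected, std))
print('counts:', counts)
print('z-scores:', np.round((counts - expected) / std, 2))
```

The per-class labels are also what you would feed to a classifier if the discretized epsilon ranges are used as targets.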
