@DanielTakeshi
Last active July 26, 2020 15:28
How to sample from a log-uniform distribution.
"""
How we might sample from a log-uniform distribution
https://stats.stackexchange.com/questions/155552/what-does-log-uniformly-distribution-mean
Only run one of these three cases at a time, otherwise the plots update each
other. Run with these versions:
matplotlib 3.2.1
numpy 1.18.3
"""
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(suppress=True, linewidth=200, edgeitems=10)
low = 0.01
high = 0.7
size = 10000
nb_bins = 50
if False:
    data = np.random.uniform(low=low, high=high, size=size)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('Uniform({}, {})'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform.png')

if False:
    data = np.random.uniform(low=np.log(low), high=np.log(high), size=size)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('Uniform(log({}), log({}))'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform_log.png')

# nb_classes is the number of interval boundaries, so there are
# nb_classes - 1 intervals.
nb_classes = 5 + 1

if True:
    data = np.random.uniform(low=np.log(low), high=np.log(high), size=size)
    discretized = np.linspace(np.log(low), np.log(high), num=nb_classes)
    data = np.exp(data)
    count, bins, ignored = plt.hist(data, bins=nb_bins, align='mid')
    plt.title('exp( Uniform(log({}), log({})) )'.format(low, high))
    plt.xlabel('Epsilon')
    plt.savefig('distr_uniform_log_true.png')

    # Now let's add discretized ranges.
    print('Discretized bounds (len {}) for epsilons:\nLog: {}\nNormal: {}'.format(
        len(discretized), discretized, np.exp(discretized)))
    for idx, item in enumerate(discretized):
        plt.axvline(x=np.exp(item), color='black')
        if idx < len(discretized) - 1:
            start = np.exp(discretized[idx])
            end = np.exp(discretized[idx + 1])
            count = np.sum((start <= data) & (data < end))
            print('{:.3f} <= x < {:.3f} count: {}'.format(start, end, count))
    plt.savefig('distr_uniform_log_true_bounds.png')
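As a quick sanity check (my addition, pure NumPy, not part of the script above): if the exponentiated samples really are log-uniform, each of the `nb_classes - 1` equally-log-spaced intervals should receive roughly `size / (nb_classes - 1)` samples.

```python
import numpy as np

rng = np.random.default_rng(0)
low, high, size, nb_classes = 0.01, 0.7, 10000, 5 + 1

# Exponentiate a uniform draw in log space -> log-uniform samples.
data = np.exp(rng.uniform(np.log(low), np.log(high), size=size))

# Bin edges that are equally spaced in log space.
edges = np.exp(np.linspace(np.log(low), np.log(high), num=nb_classes))

counts, _ = np.histogram(data, bins=edges)
expected = size / (nb_classes - 1)
print(counts, expected)  # each count should be close to 2000
```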
DanielTakeshi commented Apr 9, 2020

0.01 to 0.2

The standard uniform distribution.

distr_uniform

A visualization of what the log looks like:

distr_uniform_log

And the exponentiated version, which is what we will actually use.

distr_uniform_log_true

DanielTakeshi commented Apr 9, 2020

Note that this is similar to, but not quite the same as, calling torch.logspace(start, end, steps). torch.logspace returns a deterministic grid of steps values spaced logarithmically between 10**start and 10**end; it does not draw random samples. E.g.:

For 5 and 10 steps between 0.1 and 0.01, calling torch.logspace(math.log10(0.1), math.log10(0.01), steps=n) gives:

tensor([0.1000, 0.0562, 0.0316, 0.0178, 0.0100])
tensor([0.1000, 0.0774, 0.0599, 0.0464, 0.0359, 0.0278, 0.0215, 0.0167, 0.0129, 0.0100])
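For comparison (my addition, using NumPy rather than torch so it stays in one library), numpy.logspace behaves the same way: a deterministic, logarithmically spaced grid of points, which is a different thing from random log-uniform sampling.

```python
import numpy as np

# Deterministic, logarithmically spaced grids from 0.1 down to 0.01,
# analogous to torch.logspace: points are 10**linspace(start, end, num).
grid5 = np.logspace(np.log10(0.1), np.log10(0.01), num=5)
grid10 = np.logspace(np.log10(0.1), np.log10(0.01), num=10)

# Endpoints are hit exactly; interior points are spaced evenly in log10.
print(np.round(grid5, 4))
print(np.round(grid10, 4))
```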

DanielTakeshi commented Apr 23, 2020

0.01 to 0.5 now (same three plots in order)

distr_uniform

distr_uniform_log

distr_uniform_log_true

DanielTakeshi commented Jul 26, 2020

(July 26) Now log(0.01) to log(0.7) with discretized bins.

Here is a plot with nb_classes=5+1; nb_classes counts the vertical boundary lines, so there are 5 intervals.

distr_uniform_log_true_bounds

For the five intervals, I get:

0.010 <= x < 0.023 count: 1998
0.023 <= x < 0.055 count: 1968
0.055 <= x < 0.128 count: 2020
0.128 <= x < 0.299 count: 2021
0.299 <= x < 0.700 count: 1993
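These counts line up with the analytic expectation (my check, not in the original comment): under a log-uniform distribution, every interval of equal width in log space has equal probability, so each of the 5 bins should get about 10000 / 5 = 2000 samples.

```python
import numpy as np

low, high, size, n_bins = 0.01, 0.7, 10000, 5
edges = np.exp(np.linspace(np.log(low), np.log(high), num=n_bins + 1))

# Probability of each bin under log-uniform on [low, high]:
#   P(a <= X < b) = (log b - log a) / (log high - log low)
probs = np.diff(np.log(edges)) / (np.log(high) - np.log(low))
print(probs * size)  # each bin expects 2000 samples (up to float error)
```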

With nb_classes=10+1 we get:

distr_uniform_log_true_bounds

0.010 <= x < 0.015 count: 991
0.015 <= x < 0.023 count: 1034
0.023 <= x < 0.036 count: 1043
0.036 <= x < 0.055 count: 968
0.055 <= x < 0.084 count: 983
0.084 <= x < 0.128 count: 1020
0.128 <= x < 0.196 count: 1051
0.196 <= x < 0.299 count: 1022
0.299 <= x < 0.458 count: 961
0.458 <= x < 0.700 count: 927
