Last active
December 12, 2018 02:26
An alternative to numpy.random.choice
import numpy as np


def random_select(items, nsample=None, p=None, size=None):
    """
    Select random samples from `items`.

    The function randomly selects `nsample` items from `items` without
    replacement.

    Parameters
    ----------
    items : sequence
        The collection of items from which the selection is made.
    nsample : int, optional
        Number of items to select without replacement in each draw.
        It must be between 0 and len(items), inclusive.
    p : array-like of floats, same length as `items`, optional
        Probabilities of the items.  If this argument is not given,
        the elements in `items` are assumed to have equal probability.
    size : int or tuple of ints, optional
        Number of variates to draw.

    Notes
    -----
    `size=None` means "generate a single selection".

    If `size` is None, the result is equivalent to

        numpy.random.choice(items, size=nsample, replace=False)

    `nsample=None` means draw one (scalar) sample.

    If `nsample` is None, the function acts (almost) like `nsample=1` (see
    below for more information), and the result is equivalent to

        numpy.random.choice(items, size=size)

    In effect, it does choice with replacement.  The case `nsample=None`
    can be interpreted as meaning that each sample is a scalar, and
    `nsample=k` means that each sample is a sequence with length k.

    If `nsample` is not None, it must be a nonnegative integer with
    0 <= nsample <= len(items).

    If `size` is not None, it must be an integer or a tuple of integers.
    When `size` is an integer, it is treated as the tuple ``(size,)``.

    When both `nsample` and `size` are not None, the result
    has shape ``size + (nsample,)``.

    Examples
    --------
    Make 6 choices with replacement from [10, 20, 30, 40].  (This is
    equivalent to "Make 1 choice without replacement from [10, 20, 30, 40];
    do it six times.")

    >>> random_select([10, 20, 30, 40], size=6)
    array([20, 20, 40, 10, 40, 30])

    Choose two items from [10, 20, 30, 40] without replacement.  Do it six
    times.

    >>> random_select([10, 20, 30, 40], nsample=2, size=6)
    array([[40, 10],
           [20, 30],
           [10, 40],
           [30, 10],
           [10, 30],
           [10, 20]])

    When `nsample` is an integer, there is always an axis at the end of the
    result with length `nsample`, even when `nsample=1`.  For example, the
    shape of the array returned in the following call is (2, 3, 1).

    >>> random_select([10, 20, 30, 40], nsample=1, size=(2, 3))
    array([[[10],
            [30],
            [20]],
    <BLANKLINE>
           [[10],
            [40],
            [20]]])

    When `nsample` is None, it acts like `nsample=1`, but the trivial
    dimension is not included.  The shape of the array returned in the
    following call is (2, 3).

    >>> random_select([10, 20, 30, 40], size=(2, 3))
    array([[20, 40, 30],
           [30, 20, 40]])

    """
    # This implementation is a proof of concept, and provides a demonstration
    # of a possible API.  Efficiency was not considered.  The actual
    # implementation would probably use Cython or C.

    if nsample is None:
        # Scalar samples: plain choice with replacement.
        return np.random.choice(items, size=size, p=p)

    # Normalize `size` to a tuple so it can be concatenated with other shapes.
    if size is None:
        size = ()
    elif np.isscalar(size):
        size = (size,)

    # `tmp` has a zero-length last axis.  It holds no data; it only makes
    # apply_along_axis call `draw` once for each index in `size`.
    tmp = np.empty(size + (0,))

    def draw(_):
        # One selection of `nsample` items without replacement.
        return np.random.choice(items, size=nsample, p=p, replace=False)

    # Each call to `draw` fills one slice of length `nsample` along the
    # last axis, so the result has shape ``size + (nsample,)``.
    return np.apply_along_axis(draw, -1, tmp)
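
The proof of concept above calls `np.random.choice` once per draw, and its comment notes that efficiency was not considered. As a rough illustration only, the equal-probability case (`p=None`) could be vectorized by sorting independent uniform random keys: the argsort of each row of keys is a uniformly random permutation, and its first `nsample` entries form a sample without replacement. The name `random_select_uniform` and the `rng` parameter are hypothetical and not part of the gist; the sketch does not handle `p` or `nsample=None`, and it is not the author's implementation.

```python
import numpy as np


def random_select_uniform(items, nsample, size=None, rng=None):
    """Hypothetical vectorized variant of random_select for p=None only."""
    items = np.asarray(items)
    if rng is None:
        rng = np.random.default_rng()

    # Normalize `size` to a tuple, as in the proof of concept.
    if size is None:
        size = ()
    elif np.isscalar(size):
        size = (size,)
    else:
        size = tuple(size)

    n = len(items)
    if not 0 <= nsample <= n:
        raise ValueError("nsample must satisfy 0 <= nsample <= len(items)")

    # One row of n independent uniform keys per draw.  Sorting the keys
    # yields a uniformly random permutation of range(n) in each row.
    keys = rng.random(size + (n,))
    idx = np.argsort(keys, axis=-1)[..., :nsample]

    # Fancy indexing gives an array of shape size + (nsample,).
    return items[idx]


rng = np.random.default_rng(12345)
out = random_select_uniform([10, 20, 30, 40], nsample=2, size=(2, 3), rng=rng)
print(out.shape)  # (2, 3, 2)
```

This trades extra random numbers and a sort per draw (n keys for each selection, even when `nsample` is small) for the absence of a Python-level loop, so it is only attractive when `len(items)` is modest.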