@fmder
Last active September 18, 2015 15:37
Shuffled iterator with lookahead
import itertools

import numpy


def shuffled_iterator(iterator, look_ahead=5000, skip=0, seed=None):
    """Return an iterator over a (partially) shuffled version of *iterator*.

    The provided iterator is consumed lazily. Consumption is limited by the
    *look_ahead* argument, which caps how many items are buffered at once.
    The *skip* argument discards the first n items of *iterator*, and
    *seed* allows for repeatability.
    """
    rng = numpy.random.RandomState(seed)
    iterator = iter(iterator)  # also accept plain iterables such as lists
    batch = list(itertools.islice(iterator, skip, skip + look_ahead))
    rng.shuffle(batch)
    # Iterate directly instead of testing truthiness, so falsy items
    # (0, None, "") do not end the loop early.
    for next_item in iterator:
        yield batch.pop(0)
        idx = rng.randint(0, look_ahead)
        batch.insert(idx, next_item)
    # The input is exhausted: emit the items still held in the buffer.
    for item in batch:
        yield item
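
The same look-ahead idea can be sketched without numpy. The block below is a hypothetical variant, not the gist's code: the name `windowed_shuffle` is made up for illustration, `random.Random` stands in for `numpy.random.RandomState`, and a random buffered item is swapped out for each incoming one instead of using `pop(0)`/`insert`, which avoids shifting the list on every step. It assumes `look_ahead >= 1`.

```python
import itertools
import random


def windowed_shuffle(iterable, look_ahead=5000, skip=0, seed=None):
    """Hypothetical stdlib-only look-ahead shuffle (illustrative sketch)."""
    if look_ahead < 1:
        raise ValueError("look_ahead must be at least 1")
    rng = random.Random(seed)
    source = iter(iterable)
    # Discard the first `skip` items, then buffer up to `look_ahead` items.
    buffer = list(itertools.islice(source, skip, skip + look_ahead))
    for item in source:
        # Swap a randomly chosen buffered item with the incoming one and
        # emit the item that was swapped out.
        idx = rng.randrange(len(buffer))
        buffer[idx], item = item, buffer[idx]
        yield item
    # Input exhausted: shuffle whatever is left and emit it.
    rng.shuffle(buffer)
    for item in buffer:
        yield item


# Usage: the result is a permutation of the input minus the skipped items,
# and the same seed reproduces the same order.
first = list(windowed_shuffle(range(20), look_ahead=5, skip=2, seed=0))
second = list(windowed_shuffle(range(20), look_ahead=5, skip=2, seed=0))
print(first == second, sorted(first) == list(range(2, 20)))
```

Because only `look_ahead` items are held at a time, the k-th output (counting from zero) always comes from the first `k + look_ahead` unskipped inputs, so a small window gives only a local shuffle.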