@jonathaneunice
Last active August 29, 2015 14:19
from itertools import chain, islice


def groups_of_n(iterable, n):
    """
    Collect data into fixed-length chunks or blocks. Generates (yields)
    a sequence of iterators. Useful for segmenting very large sequences
    into more manageable chunks. If the final chunk has fewer than n
    items, no worries. It will return what there is. NB This is *much*
    more efficient than the `grouper` recipe in the Python documentation,
    which oddly builds a temporary list of length `n` and yields
    `n`-tuples (very not good for large values of `n`!)

    Each chunk reads lazily from the same underlying iterator, so
    consume one chunk completely before asking for the next.
    """
    it = iter(iterable)
    while True:
        x = islice(it, n)
        try:
            first = next(x)
        except StopIteration:
            # Nothing left in the underlying iterator: end the generator.
            # (Letting StopIteration escape a generator body is a
            # RuntimeError on Python 3.7+, per PEP 479.)
            return
        yield chain((first,), x)
        # The fancy dancing, taking the first element out and then chaining
        # it back in, has a very specific purpose. It baits a `StopIteration`
        # exception. Iterators like `islice` don't have normal methods to
        # test whether they are done/exhausted. Instead they raise
        # `StopIteration`. If that happens when getting the first element,
        # the `islice` is already exhausted, the generator returns, and
        # Bob's your uncle! Otherwise, there is at least one item for this
        # tranche, so bundle the `first` item back with all of the rest of
        # the `islice` results and send it to the caller. Do this until the
        # underlying iterable is exhausted.


def lists_of_n(iterable, n):
    """
    Like groups_of_n, but generates sublists rather than subiterators.
    """
    for g in groups_of_n(iterable, n):
        yield list(g)


if __name__ == '__main__':
    seen = set()
    ngroups = 0
    for g in groups_of_n(range(13, 27), 5):
        ngroups += 1
        for gg in g:
            print(gg)
            seen.add(gg)
        print("---")
    assert ngroups == 3
    assert seen == set(range(13, 27))

    assert list(lists_of_n(range(13, 27), 5)) == [[13, 14, 15, 16, 17],
                                                  [18, 19, 20, 21, 22],
                                                  [23, 24, 25, 26]]
    assert list(lists_of_n([], 5)) == []
    assert list(lists_of_n([1, 2], 5)) == [[1, 2]]
    assert list(lists_of_n([1, 2, 3, 4, 5], 5)) == [[1, 2, 3, 4, 5]]
    assert list(lists_of_n([1, 2, 3, 4, 5, 6], 5)) == [[1, 2, 3, 4, 5],
                                                       [6]]
    print("passes all tests")
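
The docstring's comparison with `grouper` can be made concrete with a small side-by-side sketch. It restates `groups_of_n` so it runs on its own (Python 3), and the `grouper` body here is an assumption: it follows a simple form of the recipe from the itertools documentation, which may differ from the current text of the docs. The names `padded`, `unpadded`, and `skipped` are purely illustrative.

```python
from itertools import chain, islice, zip_longest


def groups_of_n(iterable, n):
    """Same technique as in the gist, restated so this sketch is self-contained."""
    it = iter(iterable)
    while True:
        chunk = islice(it, n)
        try:
            first = next(chunk)
        except StopIteration:
            return  # underlying iterator is exhausted
        yield chain((first,), chunk)


def grouper(iterable, n, fillvalue=None):
    """Assumed simple form of the itertools-docs `grouper` recipe."""
    args = [iter(iterable)] * n  # a list of n references to ONE iterator
    return zip_longest(*args, fillvalue=fillvalue)


# grouper pads the final chunk with fillvalue; groups_of_n returns it short.
padded = list(grouper(range(7), 3))
unpadded = [list(g) for g in groups_of_n(range(7), 3)]

# Every group reads from one shared iterator. If a group is abandoned after
# its first item, the next group starts right where the previous one stopped.
skipped = [next(g) for g in groups_of_n(range(7), 3)]
```

The last line is the thing to watch for in practice: taking only the first item of each group produces seven one-item "groups" rather than three, so finish each group (or use `lists_of_n`) before moving on.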