
@mkocikowski
Created June 3, 2015 14:30
Iterable chunker with no batch size limits
import itertools


def chunker(iterable=None, chunklen=None):
    """Collects data into fixed-length chunks.

    Returns: iterator of iterators. Does not pad the last chunk.
    Raises: TypeError on bad iterable or chunklen.
    Example: chunker("abcde", 2) yields groups equivalent to
        ('a', 'b'), ('c', 'd'), ('e',)

    This improves on the itertools 'grouper' recipe (zipping
    [iter(iterable)] * n) in that there is no performance penalty
    for very large chunks.
    """
    # for chunklen=3 generates (0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, ...)
    keys = (
        key
        for n in itertools.cycle((0, 1))
        for key in itertools.repeat(n, chunklen)
    )
    # groupby starts a new group each time the key flips between 0 and 1,
    # yielding ((a, b, c), (d, e, f), ...); each group is a lazy
    # sub-iterator that must be consumed before advancing to the next.
    groups = itertools.groupby(
        iter(iterable),
        key=lambda v: next(keys),
    )
    return (v for _, v in groups)
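A quick usage sketch (the chunker definition is repeated here so the example runs standalone). One caveat worth knowing: groupby yields lazy sub-iterators, so each chunk must be materialized or otherwise consumed before advancing to the next one.

```python
import itertools


def chunker(iterable=None, chunklen=None):
    """Collects data into fixed-length chunks (as in the gist above)."""
    # for chunklen=2 generates (0, 0, 1, 1, 0, 0, ...)
    keys = (
        key
        for n in itertools.cycle((0, 1))
        for key in itertools.repeat(n, chunklen)
    )
    groups = itertools.groupby(iter(iterable), key=lambda v: next(keys))
    return (v for _, v in groups)


# Materialize each group with tuple() before moving on; skipping a
# group without consuming it would discard its items.
chunks = [tuple(chunk) for chunk in chunker("abcde", 2)]
print(chunks)  # [('a', 'b'), ('c', 'd'), ('e',)]
```

Because the groups are generated lazily, this works for chunk sizes and inputs far too large to fit in memory, as long as the chunks are consumed in order.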