Skip to content

Instantly share code, notes, and snippets.

@joyrexus
Last active October 23, 2019 23:32
Show Gist options
  • Save joyrexus/5571582 to your computer and use it in GitHub Desktop.
Save joyrexus/5571582 to your computer and use it in GitHub Desktop.
Partition a list into N chunks of nearly equal size.

UPDATE: Use the method in this gist instead.

List Partitioning

We want to partition a list into N chunks where we use every item in the original list and the resulting chunks differ in size by at most one element.

We devised a chunk method for this, admittedly kinda hackish ... but it works.

Usage

L = 'a b c d'.split(" ")
assert chunk(L, 2) == [['a', 'b'], ['c', 'd']]
assert chunk(L, 3) == [['a', 'b'], ['c'], ['d']]
assert chunk(L, 4) == [['a'], ['b'], ['c'], ['d']]
assert chunk(L, 5) == [['a'], ['b'], ['c'], ['d'], []]

Behavior

Check that list L gets partitioned into n chunks and that the total items in the resulting chunks is equal to the size of the original list L.

def check(L, n, verbose=True):
    chunks = chunk(L, n)
    size = len(L)
    assert len(chunks) == n
    assert size == sum(len(chunk) for chunk in chunks)
    if verbose:
        msg = "\nYou want a list with {} items split into {} chunks"
        print msg.format(size, n)
        print "{} chunks produced:".format(len(chunks))
        for i, x in enumerate(chunks):
            print "\tchunk {}, size {}".format(i+1, len(x))

For each p, check that L gets partitioned into p chunks:

L = range(1, 101)

for p in range(1, 10): check(L, p, verbose=True)

Output

You want a list with 100 items split into 1 chunks
1 chunks produced:
    chunk 1, size 100

You want a list with 100 items split into 2 chunks
2 chunks produced:
    chunk 1, size 50
    chunk 2, size 50

You want a list with 100 items split into 3 chunks
3 chunks produced:
    chunk 1, size 34
    chunk 2, size 33
    chunk 3, size 33

You want a list with 100 items split into 4 chunks
4 chunks produced:
    chunk 1, size 25
    chunk 2, size 25
    chunk 3, size 25
    chunk 4, size 25

You want a list with 100 items split into 5 chunks
5 chunks produced:
    chunk 1, size 20
    chunk 2, size 20
    chunk 3, size 20
    chunk 4, size 20
    chunk 5, size 20

You want a list with 100 items split into 6 chunks
6 chunks produced:
    chunk 1, size 17
    chunk 2, size 17
    chunk 3, size 17
    chunk 4, size 17
    chunk 5, size 16
    chunk 6, size 16

You want a list with 100 items split into 7 chunks
7 chunks produced:
    chunk 1, size 15
    chunk 2, size 15
    chunk 3, size 14
    chunk 4, size 14
    chunk 5, size 14
    chunk 6, size 14
    chunk 7, size 14

You want a list with 100 items split into 8 chunks
8 chunks produced:
    chunk 1, size 13
    chunk 2, size 13
    chunk 3, size 13
    chunk 4, size 13
    chunk 5, size 12
    chunk 6, size 12
    chunk 7, size 12
    chunk 8, size 12

You want a list with 100 items split into 9 chunks
9 chunks produced:
    chunk 1, size 12
    chunk 2, size 11
    chunk 3, size 11
    chunk 4, size 11
    chunk 5, size 11
    chunk 6, size 11
    chunk 7, size 11
    chunk 8, size 11
    chunk 9, size 11
def chunk(L, n=1, verbose=False):
'''
Partition list L into n chunks.
Returns a list of n lists/chunks, where each chunk is
of nearly equal size.
>>> L = 'a b c d'.split(" ")
['a', 'b', 'c', 'd']
>>> chunk(L, 2)
[['a', 'b'], ['c', 'd']]
>>> chunk(L, 3)
[['a', 'b'], ['c'], ['d']]
>>> chunk(L, 4)
[['a'], ['b'], ['c'], ['d']]
>>> chunk(L, 5)
[['a'], ['b'], ['c'], ['d'], []]
'''
total = len(L)
if n > 0:
size = total / n
rest = total % n
ranges = []
if verbose:
msg = "{} items to be split into {} chunks of size {} with {} extra"
print msg.format(total, n, size, rest)
if not size:
return [[x] for x in L] + [[] for i in range(n - total)]
if rest:
index = [x for x in range(0, total, size)]
extra = [index[i] + i for i in range(rest + 1)] + \
[x + rest for x in index[rest+1:][:n-rest]]
ranges = [(extra[i], extra[i+1]) for i in range(len(extra) - 1)]
else:
index = [x for x in range(0, total+1, size)]
ranges = [(index[i], index[i+1]) for i in range(len(index) - 1)]
return [L[i:j] for i, j in ranges]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment