Skip to content

Instantly share code, notes, and snippets.

@treyhunner
Created November 2, 2021 20:27
Show Gist options
  • Save treyhunner/f0c069496a1e41aadcf0d7050e475eb3 to your computer and use it in GitHub Desktop.
Save treyhunner/f0c069496a1e41aadcf0d7050e475eb3 to your computer and use it in GitHub Desktop.
Demo of why sum(list_of_lists, []) is a bad idea (from a performance perspective)
"""
Demonstration of how slow sum can be for flattening lists-of-lists
Output on my machine:
0.019118324998999014 itertools.chain.from_iterable
0.03325132900135941 comprehension
10.947899631002656 sum
Using sum on a list results in a loop inside a loop (a new list is made for
each new sub-list) so the more sub lists there are, the bigger the timing
difference will be.
"""
import itertools
import random
import timeit
list_of_lists = [
[random.randint(1, 1000) for _ in range(3)]
for _ in range(7_000)
]
def flatten_chain(iterables):
return list(itertools.chain.from_iterable(iterables))
def flatten_comp(iterables):
return [
item
for iterable in iterables
for item in iterable
]
def flatten_sum(iterables):
return sum(iterables, [])
def time(function):
return max(timeit.repeat(
f"{function}(list_of_lists)",
number=50,
repeat=3,
globals=globals(),
))
print(time("flatten_chain"), "itertools.chain.from_iterable")
print(time("flatten_comp"), "comprehension")
print(time("flatten_sum"), "sum")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment