@kevinburke
Last active July 28, 2022 02:33
Performance of ''.join vs .format() in Python
$ python test_time.py
join_test: 21.3 seconds
format_test: 36.6 seconds
# Setup code is executed once before timing starts and is excluded from the measured time.
j_setup = """
def join_test(s1, s2, s3):
    return ''.join([s1, "foo", s2, "foo", s3])
a = 'foo' * 1000
b = 'baz' * 1000
c = 'bang' * 1000
"""
# Statement that is actually timed.
j = """
join_test(a, b, c)
"""
f_setup = """
def format_test(s1, s2, s3):
    return '{}foo{}foo{}'.format(s1, s2, s3)
a = 'foo' * 1000
b = 'baz' * 1000
c = 'bang' * 1000
"""
f = """
format_test(a, b, c)
"""
import timeit
jointimer = timeit.Timer(j, j_setup)
# print(...) with a single argument works on both Python 2 and Python 3.
print(jointimer.timeit(number=20000000))
formattimer = timeit.Timer(f, f_setup)
print(formattimer.timeit(number=20000000))
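
The sample output at the top labels each timing, while the script as posted prints bare numbers. A minimal sketch of a labelled variant follows, assuming Python 3. The run_benchmarks helper and its number and repeat parameters are hypothetical names introduced here, not part of the original script, and the default iteration count is far smaller than the original 20,000,000 so that it finishes quickly.

```python
import timeit


def join_test(s1, s2, s3):
    return ''.join([s1, "foo", s2, "foo", s3])


def format_test(s1, s2, s3):
    return '{}foo{}foo{}'.format(s1, s2, s3)


def run_benchmarks(number=100000, repeat=3):
    """Return {function name: best total seconds for `number` calls}."""
    a = 'foo' * 1000
    b = 'baz' * 1000
    c = 'bang' * 1000
    results = {}
    for func in (join_test, format_test):
        timer = timeit.Timer(lambda: func(a, b, c))
        # Take the minimum of several runs to reduce noise from other processes.
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=number))
    return results


if __name__ == "__main__":
    for name, seconds in run_benchmarks().items():
        print("{}: {:.3f} seconds".format(name, seconds))
```

Wrapping each call in a lambda adds a small, equal overhead to both measurements, so the absolute numbers will not match the string-based timers above, but the comparison between the two functions should still be meaningful.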
@zbentley

zbentley commented Apr 10, 2020

The performance of join_test can be improved even further by passing a tuple to join instead of a list. Tuples are generally cheaper to construct than lists, and tuples made up entirely of constants are sometimes "interned" (built once at compile time), so that their construction speed goes from "faster than lists" to "MUCH faster than lists".
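
One way to check this on a given interpreter is sketched below, assuming CPython 3. All function names here (build_list, build_tuple, constant_tuple, is_precomputed, seconds_for) are hypothetical and introduced only for illustration. The sketch times list versus tuple construction with the same five elements as join_test, and inspects a function's compiled constants to see whether an all-literal tuple was built ahead of time.

```python
import timeit


def build_list(s1, s2, s3):
    return [s1, "foo", s2, "foo", s3]


def build_tuple(s1, s2, s3):
    return (s1, "foo", s2, "foo", s3)


def constant_tuple():
    # Every element is a literal, so CPython can build this tuple at compile time.
    return ("foo", "bar", "baz")


def is_precomputed(func, value):
    """True if `value` is stored among the function's compiled constants."""
    return value in func.__code__.co_consts


def seconds_for(func, *args, number=200000):
    """Total seconds for `number` calls of func(*args)."""
    return timeit.timeit(lambda: func(*args), number=number)


if __name__ == "__main__":
    a, b, c = 'foo' * 1000, 'baz' * 1000, 'bang' * 1000
    print("list: ", seconds_for(build_list, a, b, c))
    print("tuple:", seconds_for(build_tuple, a, b, c))
    print("constant tuple precomputed:",
          is_precomputed(constant_tuple, ("foo", "bar", "baz")))
```

Note that the tuple in join_test contains variables, so it cannot be precomputed; any speedup there would come only from cheaper construction.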

@kevinburke
Author

These must be half a decade old at this point; I would verify that these performance results still hold with modern CPUs and a modern Python version.

@zbentley

They seem to hold. With Python 2.7.15, the differences are even more pronounced than in your published results:

7.85430908203
17.1770138741

Python 3.7.7 (altered the print statements and nothing else):

7.840473283000009
19.88268370099999

This benchmark indicates that passing a tuple instead of a list to join improves performance on 2.7.15, but shows no significant difference between the first two tests (list and tuple) on Python 3.7.7:

from timeit import Timer

join_setup_list = """
def test(s1, s2, s3):
    return ''.join([s1, "foo", s2, "foo", s3])
a = 'foo' * 1000
b = 'baz' * 1000
c = 'bang' * 1000
"""

join_setup_tuple = """
def test(s1, s2, s3):
    return ''.join((s1, "foo", s2, "foo", s3))
a = 'foo' * 1000
b = 'baz' * 1000
c = 'bang' * 1000
"""

format_setup = """
def test(s1, s2, s3):
    return '{}foo{}foo{}'.format(s1, s2, s3)
a = 'foo' * 1000
b = 'baz' * 1000
c = 'bang' * 1000
"""


jointimer = Timer("test(a, b, c)", join_setup_list)
print(jointimer.timeit(number=20000000))
jointimer = Timer("test(a, b, c)", join_setup_tuple)
print(jointimer.timeit(number=20000000))
formattimer = Timer("test(a, b, c)", format_setup)
print(formattimer.timeit(number=20000000))

2.7.15:

8.01555895805
6.60311198235
15.6411790848

3.7.7:

7.767214711000008
7.350209489000008
18.818318449000003

The non-mutated-list-literal optimizations in Python 3 can be "defeated" by using ''.join([s1, "foo"] + [s2, "foo", s3]) and ''.join((s1, "foo") + (s2, "foo", s3)) for their respective test functions, at which point the tuple version gains an advantage on both 2 and 3, though the advantage is significantly more pronounced on 2.7.15:

2.7.15:

12.4432151318
7.96366500854
16.306732893

3.7.7:

9.509968151999999
8.512370731000004
18.97672093999998
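
The "defeated" variants described above can be written out as follows. This is a sketch only, assuming Python 3: the function names are hypothetical, and the bodies are taken from the expressions quoted in the comment. Running it prints the bytecode of each variant so the difference in how the argument is built can be inspected on a given interpreter.

```python
import dis


def join_list(s1, s2, s3):
    return ''.join([s1, "foo", s2, "foo", s3])


def join_list_concat(s1, s2, s3):
    # Concatenation builds two lists and then a third list for the result.
    return ''.join([s1, "foo"] + [s2, "foo", s3])


def join_tuple(s1, s2, s3):
    return ''.join((s1, "foo", s2, "foo", s3))


def join_tuple_concat(s1, s2, s3):
    # Same shape as above, but with tuples.
    return ''.join((s1, "foo") + (s2, "foo", s3))


VARIANTS = (join_list, join_list_concat, join_tuple, join_tuple_concat)


if __name__ == "__main__":
    for func in VARIANTS:
        print(func.__name__)
        dis.dis(func)
        print()
```

All four functions return the same string; only the way the argument to join is constructed differs, which is what the timings above compare.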

All tests were run on an idle, stock 2019 MacBook Pro (2.3 GHz i9) with an up-to-date macOS 10.14.6 and the latest Xcode/SDK/headers as of April 12, 2020, using pyenv-installed Pythons 2.7.15 and 3.7.7 (compiled and installed on April 12, 2020) with no compiler flag or library customizations.
