@zed
Created June 15, 2017 23:32

namedtuple vs. dict micro-benchmark
$ python -m perf timeit --rigorous --duplicate 10 --hist --stats -s 'import collections; Point = collections.namedtuple("Point", "x y")' 'Point(10.5, 11.5)'
$ python -m perf timeit --rigorous --duplicate 10 --hist --stats '{"x": 10.5, "y": 11}'

Point(10.5, 11.5) is about 6 times slower than {'x': 10.5, 'y': 11.5}: the absolute times are 635 ns +- 26 ns vs. 105 ns +- 4 ns. Don't create classes at the function level unless you know why you need to. If your API requires a dict, then use a dict -- it has nothing to do with performance.
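The same comparison can be reproduced without the perf package, using only the stdlib timeit module. This is a minimal sketch; the absolute numbers will differ by machine and Python version, but the ratio should be similar:

```python
import collections
import timeit

# Hoist the class creation out of the measured code: namedtuple() itself
# builds a whole new class, which is far more expensive than creating
# one instance of it.
Point = collections.namedtuple("Point", "x y")

dict_time = timeit.timeit('{"x": 10.5, "y": 11.5}', number=1_000_000)
nt_time = timeit.timeit(
    "Point(10.5, 11.5)", globals={"Point": Point}, number=1_000_000
)

print(f"dict literal: {dict_time:.3f} s, namedtuple: {nt_time:.3f} s")
```

The dict literal compiles down to a single bytecode map-building step, while the namedtuple goes through a regular constructor call, which is where the gap comes from.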

python -m perf timeit --rigorous --duplicate 10 --hist --stats '{"x": 10.5, "y": 11}'
.........................................
94.5 ns: 1 ####
95.8 ns: 1 ####
97.1 ns: 4 #################
98.5 ns: 2 ########
99.8 ns: 8 #################################
101 ns: 12 ##################################################
102 ns: 15 ##############################################################
104 ns: 10 ##########################################
105 ns: 19 ###############################################################################
106 ns: 17 #######################################################################
108 ns: 11 ##############################################
109 ns: 10 ##########################################
110 ns: 5 #####################
112 ns: 3 ############
113 ns: 0 |
114 ns: 0 |
116 ns: 0 |
117 ns: 0 |
118 ns: 0 |
120 ns: 1 ####
121 ns: 1 ####
Total duration: 23.7 sec
Start date: 2017-06-16 02:25:15
End date: 2017-06-16 02:25:44
Raw value minimum: 125 ms
Raw value maximum: 160 ms
Number of calibration run: 1
Number of run with values: 40
Total number of run: 41
Number of warmup per run: 1
Number of value per run: 3
Loop iterations per value: 1310720 (2^17 outer-loops x 10 inner-loops)
Total number of values: 120
Minimum: 95.4 ns
Median +- MAD: 106 ns +- 3 ns
Mean +- std dev: 105 ns +- 4 ns
Maximum: 122 ns
0th percentile: 95.4 ns (-10% of the mean) -- minimum
5th percentile: 98.6 ns (-7% of the mean)
25th percentile: 103 ns (-3% of the mean) -- Q1
50th percentile: 106 ns (+0% of the mean) -- median
75th percentile: 108 ns (+2% of the mean) -- Q3
95th percentile: 111 ns (+5% of the mean)
100th percentile: 122 ns (+16% of the mean) -- maximum
Number of outlier (out of 95.2 ns..115.5 ns): 2
Mean +- std dev: 105 ns +- 4 ns
python -m perf timeit --rigorous --duplicate 10 --hist --stats -s 'import collections; Point = collections.namedtuple("Point", "x y")' 'Point(10.5, 11.5)'
.........................................
596 ns: 3 #############
603 ns: 15 ##################################################################
610 ns: 18 ###############################################################################
618 ns: 18 ###############################################################################
625 ns: 8 ###################################
632 ns: 11 ################################################
640 ns: 13 #########################################################
647 ns: 12 #####################################################
655 ns: 4 ##################
662 ns: 6 ##########################
669 ns: 6 ##########################
677 ns: 1 ####
684 ns: 1 ####
691 ns: 2 #########
699 ns: 0 |
706 ns: 0 |
713 ns: 0 |
721 ns: 0 |
728 ns: 0 |
735 ns: 1 ####
743 ns: 1 ####
Total duration: 35.4 sec
Start date: 2017-06-16 02:23:35
End date: 2017-06-16 02:24:15
Raw value minimum: 196 ms
Raw value maximum: 244 ms
Number of calibration run: 1
Number of run with values: 40
Total number of run: 41
Number of warmup per run: 1
Number of value per run: 3
Loop iterations per value: 327680 (2^15 outer-loops x 10 inner-loops)
Total number of values: 120
Minimum: 597 ns
Median +- MAD: 629 ns +- 15 ns
Mean +- std dev: 635 ns +- 26 ns
Maximum: 744 ns
0th percentile: 597 ns (-6% of the mean) -- minimum
5th percentile: 605 ns (-5% of the mean)
25th percentile: 617 ns (-3% of the mean) -- Q1
50th percentile: 629 ns (-1% of the mean) -- median
75th percentile: 650 ns (+2% of the mean) -- Q3
95th percentile: 675 ns (+6% of the mean)
100th percentile: 744 ns (+17% of the mean) -- maximum
Number of outlier (out of 567 ns..700 ns): 2
Mean +- std dev: 635 ns +- 26 ns
@minusf commented May 8, 2018

This microbenchmark tells only half of the story at best: it only creates the structures and never accesses their elements.
First, if possible, determine whether the program is

  • read heavy (accessing elements of the structure)
  • write heavy (creating structures/updating elements)

If the program is write heavy, then the dict literal (the non-keyword form) is indeed the fastest:

In [13]: %timeit {"x": 10.5, "y": 11.5}
69.5 ns ± 0.982 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [14]: %timeit dict(x=10.5, y=11.5)
232 ns ± 1.86 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [15]: import collections; Point = collections.namedtuple("Point", "x y")

In [16]: %timeit Point(10.5, 11.5)
403 ns ± 0.899 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

namedtuple is not designed for updating: its _replace() returns a new instance, so there is little point in benchmarking it -- it will be very slow, and that measurement is left as an exercise for the reader.
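For completeness, _replace's copy-not-mutate behaviour is easy to demonstrate (a small sketch):

```python
import collections

Point = collections.namedtuple("Point", "x y")
p = Point(10.5, 11.5)

# _replace() never mutates: it constructs and returns a new instance
# with the named fields swapped out.
p2 = p._replace(x=99.0)

print(p)   # Point(x=10.5, y=11.5) -- the original is unchanged
print(p2)  # Point(x=99.0, y=11.5)
```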

However, if the program is read heavy, namedtuple does not seem significantly slower even when using dot-notation lookup, the style that makes the code much more readable:

In [18]: d={"x": 10.5, "y": 11.5}

In [19]: p=Point(10.5, 11.5)

In [24]: %timeit d['x']
35.8 ns ± 0.119 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [25]: %timeit p.x
60.4 ns ± 0.259 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [26]: %timeit p[0]
33.6 ns ± 0.154 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
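The three lookup styles above can also be timed without IPython, using only the stdlib timeit module (a sketch; absolute numbers are machine-dependent):

```python
import collections
import timeit

Point = collections.namedtuple("Point", "x y")
ns = {"d": {"x": 10.5, "y": 11.5}, "p": Point(10.5, 11.5)}

# Compare dict key lookup, namedtuple attribute lookup, and
# namedtuple index lookup on pre-built objects.
for expr in ("d['x']", "p.x", "p[0]"):
    t = timeit.timeit(expr, globals=ns, number=1_000_000)
    print(f"{expr:8s} {t:.4f} s per 1M lookups")
```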

For number crunching, probably none of these is the fastest. For everything else, readability beats a minuscule speedup, especially if the initial values never change.
