Instantly share code, notes, and snippets.

@zigg /time_del.py
Last active Apr 20, 2018

Embed
What would you like to do?
Time various methods of removing a possibly-present item from a dict
#!/usr/bin/python
import time
def new_d():
return {
1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
11: 12, 13: 14, 15: 16, 17: 18, 19: 20
}
def time_func(iterations, f, *args):
start = time.time()
for i in xrange(iterations):
f(*args)
return time.time() - start
def try_del(key):
d = new_d()
try:
del d[key]
except KeyError:
pass
def if_del(key):
d = new_d()
if key in d:
del d[key]
def pop_del(key):
d = new_d()
d.pop(key, None)
def succeed(f):
f(3)
def fail(f):
f(4)
iterations = 1000000
print "pop succeed:", time_func(iterations, succeed, pop_del)
print "pop fail:", time_func(iterations, fail, pop_del)
print "try succeed:", time_func(iterations, succeed, try_del)
print "try fail:", time_func(iterations, fail, try_del)
print "if succeed:", time_func(iterations, succeed, if_del)
print "if fail:", time_func(iterations, fail, if_del)
@pquentin

This comment has been minimized.

pquentin commented Aug 24, 2015

Interestingly, on my machine, Python 3 is always faster. Additionally, try fail is 33% slower than try success, while it's 78% for Python .2

@oblinger

This comment has been minimized.

oblinger commented Nov 5, 2015

For all viewers who just want a quick answer here are results from python 2.7.10 on OSX.
(why not show us the output of your program?!?!?)

iterations = 1000000
print "pop succeed:", time_func(iterations, succeed, pop_del)
pop succeed: 0.901857852936
print "pop fail:", time_func(iterations, fail, pop_del)
pop fail: 0.912168979645
print "try succeed:", time_func(iterations, succeed, try_del)
try succeed: 0.813394069672
print "try fail:", time_func(iterations, fail, try_del)
try fail: 1.79899096489
print "if succeed:", time_func(iterations, succeed, if_del)
if succeed: 0.824536800385
print "if fail:", time_func(iterations, fail, if_del)
if fail: 0.80895113945

@oblinger

This comment has been minimized.

oblinger commented Nov 5, 2015

most people care about the relative performance anyway, and that is like to remain qualitatively unchanged, right?

@papower1

This comment has been minimized.

papower1 commented Aug 2, 2016

from python 2.7.11 on OSX.

pop succeed: 1.10197281837
pop fail: 1.08385300636
try succeed: 0.888073921204
try fail: 2.11601901054
if succeed: 0.899685144424
if fail: 0.938803911209

from python 2.6.6 on CentOS 5.5 final (CPU full-loaded by another program)

pop succeed: 2.71315407753
pop fail: 2.43682193756
try succeed: 2.1031870842
try fail: 7.71003699303
if succeed: 2.03478884697
if fail: 1.65251803398

from python 2.7.10 on CentOS 5.5 final (CPU full loaded by another program)

pop succeed: 2.16329598427
pop fail: 2.06613516808
try succeed: 2.03216099739
try fail: 6.88375806808
if succeed: 2.00452184677
if fail: 2.11254906654

@brettinternet

This comment has been minimized.

brettinternet commented Jul 10, 2017

Python3

#!/usr/bin/python3

import time

def new_d():
    return {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }

def time_func(iterations, f, *args):
    start = time.time()
    for i in range(iterations):
        f(*args)
    return time.time() - start

def try_del(key):
    d = new_d()
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    d = new_d()
    if key in d:
        del d[key]

def pop_del(key):
    d = new_d()
    d.pop(key, None)

def succeed(f):
    f(3)

def fail(f):
    f(4)

iterations = 1000000
print("pop succeed:", time_func(iterations, succeed, pop_del))
print("pop fail:", time_func(iterations, fail, pop_del))
print("try succeed:", time_func(iterations, succeed, try_del))
print("try fail:", time_func(iterations, fail, try_del))
print("if succeed:", time_func(iterations, succeed, if_del))
print("if fail:", time_func(iterations, fail, if_del))

Here are some of my results on my PC workstation's WSL with python3:

pop succeed: 0.618718147277832
pop fail: 0.6250848770141602
try succeed: 0.5316371917724609
try fail: 0.7190759181976318
if succeed: 0.548640251159668
if fail: 0.5335237979888916
@davoscollective

This comment has been minimized.

davoscollective commented Apr 20, 2018

The general approach to testing is right,

  • avoid shallow copy of the dict with new_d()
  • avoid timing time_func itself by starting and ending the measurement within it
  • run the functions in a tight loop with a large number of iterations to get a more meaningful comparison number

I'm not so sure about the premise of the testing, and the actual methodology.

The premise
The assertion that catching exceptions is slow may be true, but a real world measurement of try/except versus the other options isn't just an average between try success and try failure. The ratio of success/fail will vary to the situation, and I would hope that as consenting adults we would be applying try on data where we expect it to succeed the majority of the time. That's the situation where try is great, where it is indeed easier to ask for forgiveness than permission. If that's not the case then try/except is a bit of a rough approach to dealing with the data structure. In that sense I don't like the reductive assumptions to comparing try-success & try-failure to the other patterns. More generally the discussion is about the debate between using a built-in or one of the EAFP or LBYL patterns**[*]**

The methodology

The code is calling new_d(), and passing the functions into succeed() or fail() 1000000 times as well as the test subject functions, so these measurements are also including calls to an additional two functions on every iteration. They aren't particularly busy functions, but function calls still take time. It's DRY, sure, and a nice demonstration of higher order function usage which would be great in real code, but an empirical test setup should try to avoid measuring the test harness as well as the subject**[^]**

Rolling your own time_func is not that easy, better to use timeit which will isolate the function, do setup & and measure it more independently.

Compare the results of the original function with these modifications.

The first is the same as the original but removes the calls to new_d(). You might argue that a real world example function would need to call something to return a dict. It is also likely that it would be explicitly passed a parameter or directly access a global, class, instance, or locally scoped dictionary in the function. Perhaps I'm wrong and a function would indeed usually generate a new dict by calling a function, but this test is supposed to zoom in and focus on the performance of the various approaches to pop, try, and if.

#!/usr/bin/python3

import time


def time_func(iterations, f, *args):
    start = time.time()
    for i in range(iterations):
        f(*args)
    return time.time() - start

def try_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    if key in d:
        del d[key]

def pop_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    d.pop(key, None)

def succeed(f):
    f(3)

def fail(f):
    f(4)

iterations = 1000000
print("pop succeed:", time_func(iterations, succeed, pop_del))
print("pop fail:", time_func(iterations, fail, pop_del))
print("try succeed:", time_func(iterations, succeed, try_del))
print("try fail:", time_func(iterations, fail, try_del))
print("if succeed:", time_func(iterations, succeed, if_del))
print("if fail:", time_func(iterations, fail, if_del))

The second keeps the calls to new_d(), but removes the calls to the success and fail functions in order to show the impact those have on the measurement, and modifies it to use timeit. Since timeit performs the iteration loop for you, the time_func had to be removed. In the original, the time_func was only called once, but internally ran 1000000 iterations. I could have left it as-is and run timeit once and time_func many times, but the point is to make use of the machinery of timeit. I could have also changed time_func iterations to 1, but that seemed a bit silly to run a loop of 1.

#!/usr/bin/python3

from timeit import timeit

def new_d():
    return {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }

def try_del(key):
    d = new_d()
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    d = new_d()
    if key in d:
        del d[key]

def pop_del(key):
    d = new_d()
    d.pop(key, None)

if __name__ == '__main__':
    print("pop succeed:", timeit('pop_del(3)', setup='from __main__ import new_d, pop_del', number=1000000))
    print("pop fail:", timeit('pop_del(4)', setup='from __main__ import new_d, pop_del', number=1000000))
    print("try succeed:", timeit('try_del(3)', setup='from __main__ import new_d, try_del', number=1000000))
    print("try fail:", timeit('try_del(4)', setup='from __main__ import new_d, try_del', number=1000000))
    print("if succeed:", timeit('if_del(3)', setup='from __main__ import new_d, if_del', number=1000000))
    print("if fail:", timeit('if_del(4)', setup='from __main__ import new_d, if_del', number=1000000))

The third is a combination of the previous two

#!/usr/bin/python3

from timeit import timeit


def try_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    if key in d:
        del d[key]

def pop_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    d.pop(key, None)

if __name__ == '__main__':
    print("pop succeed:", timeit('pop_del(3)', setup='from __main__ import pop_del', number=1000000))
    print("pop fail:", timeit('pop_del(4)', setup='from __main__ import pop_del', number=1000000))
    print("try succeed:", timeit('try_del(3)', setup='from __main__ import try_del', number=1000000))
    print("try fail:", timeit('try_del(4)', setup='from __main__ import try_del', number=1000000))
    print("if succeed:", timeit('if_del(3)', setup='from __main__ import if_del', number=1000000))
    print("if fail:", timeit('if_del(4)', setup='from __main__ import if_del', number=1000000))

These are my times of the 4 scripts, starting with the original and then my 3 modified versions.

> python .\tests\pop_try_del_og.py
pop succeed: 0.6799745559692383
pop fail: 0.7050254344940186
try succeed: 0.625943660736084
try fail: 0.8100082874298096
if succeed: 0.6490054130554199
if fail: 0.6170117855072021

> python .\tests\pop_try_del_og_dicts.py
pop succeed: 0.648021936416626
pop fail: 0.60898756980896
try succeed: 0.5719962120056152
try fail: 0.7960281372070312
if succeed: 0.5789785385131836
if fail: 0.5299675464630127

> python .\tests\pop_try_del.py
pop succeed: 0.6000474864572732
pop fail: 0.6013029936221185
try succeed: 0.5362181026908466
try fail: 0.7164418500281282
if succeed: 0.5597693311757679
if fail: 0.5227962649254438

> python .\tests\pop_try_del_dicts.py
pop succeed: 0.5466019133644747
pop fail: 0.5360004235065076
try succeed: 0.5201604109909419
try fail: 0.7285216620418071
if succeed: 0.509157538099783
if fail: 0.45228045612282974

There is some variability in these, so to take it next level and be scientific you should run each of these multiple times and alongside an average also report at least some measures of variance or standard deviation.

So I did that using 10 rounds of each:

image

Apart from matplotlib identifying some outliers (which will be related to sporadic things on my system) there is a fairly clear pattern here. They are all very quick, and apart from try-fail they are all pretty much the same performance, and given this is 1000000 iterations, each sits around 500ns, so it's a massively premature optimization to even think this through as far as has been done here. So stop it! Go and just use what works.

So I shamelessly borrowed matplotlib code for the boxplots from here; https://stackoverflow.com/a/20132614/1335793
I made no attempt at making it more DRY, so this code is for reference only, not an example of how you should do it:

#!/usr/bin/python3

from timeit import timeit
import numpy as np
import matplotlib.pyplot as plt

def try_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    try:
        del d[key]
    except KeyError:
        pass

def if_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    if key in d:
        del d[key]

def pop_del(key):
    d = {
        1: 2, 3: 4, 5: 6, 7: 8, 9: 10,
        11: 12, 13: 14, 15: 16, 17: 18, 19: 20
    }
    d.pop(key, None)

def set_box_color(bp, color):
    plt.setp(bp['boxes'], color=color)
    plt.setp(bp['whiskers'], color=color)
    plt.setp(bp['caps'], color=color)
    plt.setp(bp['medians'], color=color)

if __name__ == '__main__':
    pop_succeed , pop_fail, try_succeed, try_fail, if_succeed, if_fail = [], [], [], [], [], []
    for i in range(10):
        pop_succeed.append( timeit('pop_del(3)', setup='from __main__ import pop_del', number=1000000))
        pop_fail.append(timeit('pop_del(4)', setup='from __main__ import pop_del', number=1000000))
        try_succeed.append(timeit('try_del(3)', setup='from __main__ import try_del', number=1000000))
        try_fail.append(timeit('try_del(4)', setup='from __main__ import try_del', number=1000000))
        if_succeed.append(timeit('if_del(3)', setup='from __main__ import if_del', number=1000000))
        if_fail.append(timeit('if_del(4)', setup='from __main__ import if_del', number=1000000))

    plt.figure()

    bp_ps = plt.boxplot(pop_succeed, positions=[2], widths=0.6)
    bp_pf = plt.boxplot(pop_fail, positions=[4], widths=0.6)
    bp_ts = plt.boxplot(try_succeed, positions=[6], widths=0.6)
    bp_tf = plt.boxplot(try_fail, positions=[8], widths=0.6)
    bp_is = plt.boxplot(if_succeed, positions=[10], widths=0.6)
    bp_if = plt.boxplot(if_fail, positions=[12], widths=0.6)

    set_box_color(bp_ps, '#c994c7')
    set_box_color(bp_pf, '#dd1c77')
    set_box_color(bp_ts, '#99d8c9')
    set_box_color(bp_tf, '#2ca25f')
    set_box_color(bp_is, '#a6bddb')
    set_box_color(bp_if, '#2b8cbe')

    plt.plot([], c='#c994c7', label='pop succeed')
    plt.plot([], c='#dd1c77', label='pop fail')
    plt.plot([], c='#99d8c9', label='try_succeed')
    plt.plot([], c='#2ca25f', label='try fail')
    plt.plot([], c='#a6bddb', label='if succeed')
    plt.plot([], c='#2b8cbe', label='if fail')
    plt.legend()

    ticks = ['', 'pop succeed', 'pop fail', 'try_succeed', 'try fail', 'if succeed','if fail']
    plt.xticks(range(0, len(ticks)*2, 2), ticks)
    plt.xlim(0, len(ticks)*2)
    plt.ylim(0, 1)
    plt.tight_layout()
    plt.title('Comparing pop, try and if for deleting dict keys')
    plt.show()

[*] pop is the built-in which can handle the key not found problem, try is the EAFP approach and if is the LBYL approach, and many people (including the glossary) are of the opinion that LBYL is not pythonic. The example of another thread potentially deleting the key inbetween the first thread doing a successful key-test and then trying to access that key only to have it gone and throw an exception anyway is valid, but also says that locks fix that issue. It's only really an issue in Python because the GIL does even round-robin thread switching and can break code execution paths at strange places (like in the middle of an if/elif/else block). For real concurrency or parallelism you wouldn't be using threads (in most cases). The other argument relates to duck-typing where you aren't supposed to be doing type() checks (LBYL style) but then goes on to say you'd be doing hasattr() checks or use Abstract Base Classes and then do isinstance() or issubclass() tests, and how would you? Surely with LBYL style, so in fact there is no basis in the argument that EAFP is better than LBYL. There are situations to use either, and situations where one is preferred. I tend to think using the built-in is ideal because they are simpler, and you benefit from implementation improvements in the language without having to change your code. For example using range() in python 2 was fine, but now in python 3 (like so many other things) it's a generator so you get awesome pipeline behaviour and reduced memory usage. Sure there was xrange but my point is about benefiting from python enhancements without re-writing code.

[^] as much as possible, in reality you can't eliminate all instrument interference from measurement, and there is no such thing as objective truth, but I digress. Philosophy of science is a deep topic, and completely falsifiable, so maybe I'm wrong and there is objective truth and maybe there is a god (the atheists aren't very scientific to be honest, it's hardly a falsifiable perspective to hold).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment