Last active
November 2, 2019 21:36
Python benchmark: is set() really fast?
import sys
import random
import time

# fetch python version string, e.g. "3.8.0"
version = str(sys.version).split(" ", 2)[0]

# define get_time() function
if sys.version_info.major == 3:
    def get_time():
        return time.perf_counter()
else:
    # python 2 version
    def get_time():
        return time.time()


# benchmark
def test(size, iterations):
    # prepare data: `size` random floats in [0.0, 1.0)
    data_list = list()
    i = 0
    while i < size:
        data_list.append(random.random())
        i += 1

    # convert to tuple and set
    data_tuple = tuple(data_list)
    data_set = set(data_list)

    # run tests
    # The looked-up values are integers, which (almost) never occur among
    # the random floats, so every list/tuple lookup scans the whole sequence.
    t = get_time()
    j = 0
    while j < iterations:
        x = j in data_list
        j += 1
    t1 = get_time() - t

    t = get_time()
    j = 0
    while j < iterations:
        x = j in data_tuple
        j += 1
    t2 = get_time() - t

    t = get_time()
    j = 0
    while j < iterations:
        x = j in data_set
        j += 1
    t3 = get_time() - t

    # print results as CSV
    print(", ".join([version, str(size), str(iterations), str(t1), str(t2), str(t3)]))


print("# Python Version, Array Size, Iterations, in list, in tuple, in set")

# run the benchmark for sizes 1, 10, 100, 1000, 10000
i = 1
while i <= 10000:
    test(i, 10000)
    i = i * 10
#!/bin/bash
for ver in 2.7 3.6 3.7 3.8; do
    echo "# running tests for version $ver"
    image=python:$ver
    docker run --rm -v "$PWD":/src "$image" \
        bash -c "python /src/perf-test.py"
done
Results:
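Looking up a key in a set() is much faster than looking it up in a list() or tuple() in all Python versions tested (2.7, 3.6, 3.7, 3.8).
Set lookup time also stays roughly constant as the collection grows, whereas list and tuple lookup time grows with the number of elements.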
Numbers: 0.002 sec (set) vs. 10.7 sec (list, tuple) in Python 3.x
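The same comparison can be sketched more compactly with the standard timeit module. This is only an illustrative sketch, not part of perf-test.py: the helper names time_lookups and compare, the container size, and the probe count are all assumptions chosen for the example. Like the benchmark, it probes with integers that cannot occur among the random floats, so every list and tuple lookup is a full scan.

```python
import random
import timeit


def time_lookups(container, probes, repeat=3):
    """Best-of-`repeat` wall-clock time (seconds) to test every probe
    for membership in `container`. Hypothetical helper for this sketch."""
    def run():
        for p in probes:
            p in container  # membership test; result is discarded
    return min(timeit.repeat(run, number=1, repeat=repeat))


def compare(size, n_probes=100):
    """Time membership tests against a list, a tuple and a set that
    hold the same `size` random floats. Hypothetical helper."""
    values = [random.random() for _ in range(size)]
    # random.random() returns floats in [0.0, 1.0), so integers >= 1
    # are guaranteed misses (worst case for list and tuple).
    probes = list(range(1, n_probes + 1))
    return {
        "list": time_lookups(values, probes),
        "tuple": time_lookups(tuple(values), probes),
        "set": time_lookups(set(values), probes),
    }


if __name__ == "__main__":
    for size in (10, 1000, 10000):
        timings = compare(size)
        print(size, timings)
```

Absolute timings depend on the machine and interpreter, so only the relative gap between the set and the two sequence types is meaningful here.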
Script output: