@ctb
Last active March 9, 2021 17:15
Cython maps

Here, the .get method of the map wrapper is about a million times slower than its __getitem__ method. Why?

To run (requires Python 3, Cython, and a C/C++ build environment):

python setup.py build_ext -i
python test.py

# cydata.pyx
from libcpp.map cimport map

cdef class int_int_map:
    cdef public map[long, long] _values

    def __getitem__(self, k):
        return self._values[k]

    def __setitem__(self, k, v):
        self._values[k] = v

    def get(self, k):
        return self._values.get(k)

# setup.py
from setuptools import setup, find_packages
from Cython.Build import cythonize

setup(
    name='yada',
    packages=find_packages(),
    ext_modules=cythonize('cydata.pyx', language='c++'),
)

# test.py
import time

import cydata

x = cydata.int_int_map()
for k in range(200000):
    x[k] = k

a = time.time()
for k in range(50000, 50100):
    _ = x[k]
b = time.time()
print('getitem took {} seconds'.format(b - a))

a = time.time()
for k in range(50000, 50100):
    _ = x.get(k)
b = time.time()
print('get took {} seconds'.format(b - a))

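One possible explanation, offered here as an assumption rather than something confirmed in this thread: C++ std::map has no get method, so Cython may be converting the whole map into a temporary Python dict on every call and then calling dict.get on that copy. The pure-Python sketch below only imitates that suspected behavior. The class CopyingMap and the helper time_lookups are hypothetical stand-ins, not part of cydata.

```python
import time


class CopyingMap:
    """Hypothetical stand-in for int_int_map; not the real Cython class."""

    def __init__(self):
        self._values = {}

    def __setitem__(self, k, v):
        self._values[k] = v

    def __getitem__(self, k):
        # Direct lookup: no copy of the container is made.
        return self._values[k]

    def get(self, k):
        # Imitates the suspected conversion: copy everything, then look up.
        return dict(self._values).get(k)


def time_lookups(lookup, keys):
    """Return the elapsed seconds for calling lookup(k) on every key."""
    start = time.perf_counter()
    for k in keys:
        lookup(k)
    return time.perf_counter() - start


x = CopyingMap()
for k in range(200000):
    x[k] = k

keys = range(50000, 50100)
getitem_seconds = time_lookups(x.__getitem__, keys)
get_seconds = time_lookups(x.get, keys)
print('getitem took {} seconds'.format(getitem_seconds))
print('get took {} seconds'.format(get_seconds))
```

A large gap between the two timings here would be consistent with the assumption but would not prove it. Inspecting the C++ that Cython generates for get (for example through its annotated output, cython -a) is the more direct check.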

fawaz-dabbaghieh commented Mar 9, 2021

Maybe try to use lower_bound and dereferencing the iterator. This is faster for maps

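from cython.operator cimport dereference

# to insert an item in the map faster
# (the_value must be a pair[long, long]; end() is only a position hint)
it = _values.end()
_values.insert(it, the_value)

# to get an item
it = _values.lower_bound(value_to_get)
# lower_bound returns the first entry whose key is >= value_to_get,
# so also check that the key matches
if it != _values.end() and dereference(it).first == value_to_get:
    the_value = dereference(it).second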
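
For readers less familiar with lower_bound: it returns the position of the first entry whose key is not less than the requested key, so a miss can still yield a valid iterator pointing at a larger key. The Python sketch below imitates that behavior with bisect.bisect_left over a sorted key list. The function lower_bound_lookup and its arguments are hypothetical names used only for illustration.

```python
from bisect import bisect_left


def lower_bound_lookup(sorted_keys, values, key, default=None):
    """Look up key the way a lower_bound-based search would.

    sorted_keys and values are parallel lists, with sorted_keys ascending.
    """
    i = bisect_left(sorted_keys, key)  # first index with sorted_keys[i] >= key
    # Reaching the end means every key is smaller; otherwise the key at i
    # may still be larger than the one requested, so compare before using it.
    if i != len(sorted_keys) and sorted_keys[i] == key:
        return values[i]
    return default


keys = [10, 20, 30]
vals = [100, 200, 300]
print(lower_bound_lookup(keys, vals, 20))  # present key
print(lower_bound_lookup(keys, vals, 25))  # absent key: lower bound is 30
print(lower_bound_lookup(keys, vals, 99))  # absent key: lower bound is the end
```

Without the equality check, the lookup for 25 would return the value stored under 30.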


ctb commented Mar 9, 2021

thanks!
