@GaelVaroquaux
Last active September 15, 2023 03:58
Copy-less bindings of C-generated arrays with Cython

Cython example of exposing C-computed arrays in Python without data copies

The goal of this example is to show how an existing C codebase for numerical computing (here c_code.c) can be wrapped in Cython to be exposed in Python.

The meat of the example is that the data is allocated in C, but exposed in Python without a copy using the PyArray_SimpleNewFromData numpy function in the Cython file cython_wrapper.pyx.
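The zero-copy idea can be tried from pure Python before building anything. The following is a minimal sketch, not part of the gist: it assumes a ctypes array as a stand-in for memory allocated by C code, and uses np.frombuffer in place of PyArray_SimpleNewFromData.

```python
import ctypes

import numpy as np

size = 10

# Stand-in for memory owned by C code: a block of `size` C ints.
c_buffer = (ctypes.c_int * size)(*range(size))

# Wrap the existing memory as an ndarray; no element is copied.
view = np.frombuffer(c_buffer, dtype=np.intc)

# By contrast, np.array() makes an independent copy by default.
copied = np.array(view)

# A write on the C side is visible through the view, not the copy.
c_buffer[0] = 42
print(view[0], copied[0])
```

The view follows the write because it shares the buffer, while the copy keeps the old value.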

The purpose of the ArrayWrapper object is to be garbage-collected by Python when the ndarray Python object disappears. The memory is then freed. Note that there is no control over when Python will deallocate the memory. If the memory is still being used by the C code, please refer to the following blog post by Travis Oliphant:

http://blog.enthought.com/python/numpy-arrays-with-pre-allocated-memory
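The ownership rule described above (the buffer lives exactly as long as some array refers to it) can be observed without compiling anything. This is a hedged sketch under stated assumptions: BufferOwner is a hypothetical pure-Python stand-in for ArrayWrapper, it exposes its memory through NumPy's `__array_interface__` protocol rather than `__array__` plus a manual `base` assignment, and ctypes plays the role of malloc.

```python
import ctypes
import gc
import weakref

import numpy as np


class BufferOwner:
    """Hypothetical stand-in for ArrayWrapper: owns a block of C ints."""

    def __init__(self, size):
        self._size = size
        # ctypes allocates the memory here; in the gist, C's malloc does.
        self._buffer = (ctypes.c_int * size)(*range(size))

    @property
    def __array_interface__(self):
        # Describe the raw memory to NumPy: shape, dtype and address.
        return {
            "shape": (self._size,),
            "typestr": np.dtype(np.intc).str,
            "data": (ctypes.addressof(self._buffer), False),
            "version": 3,
        }


events = []
owner = BufferOwner(5)
# Record the moment the owner (and with it the buffer) is destroyed.
weakref.finalize(owner, events.append, "buffer released")

arr = np.asarray(owner)  # wraps the owner's memory without copying
del owner
gc.collect()
alive_while_array_exists = not events  # the array still references the owner
values = arr.tolist()

del arr  # last reference gone: the owner is destroyed
gc.collect()
print(alive_while_array_exists, values, events)
```

As in the Cython wrapper, the release happens whenever the last array referring to the buffer is collected, not at a point chosen by the C side.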

You will need Cython, numpy, and a C compiler.

To build the C extension in-place run:

$ python setup.py build_ext -i

To test the C-Python bindings, run the test.py file.

Files
c_code.c The C code to bind. Knows nothing about Python
cython_wrapper.pyx The Cython code implementing the binding
setup.py The configure/make/install script
test.py Python code using the C extension
Author

Gael Varoquaux

License

BSD 3 clause

c_code.c

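/* Small C file creating an array to demo C -> Python data passing
 *
 * Author: Gael Varoquaux
 * License: BSD 3 clause
 */
#include <stdlib.h>

/* Allocate and fill an array of `size` ints. The array holds ints, so the
 * function returns int* (the wrapper exposes it as NPY_INT). The caller
 * owns the memory and must release it with free(). */
int *compute(int size)
{
    int *array;
    array = malloc(sizeof(int) * size);
    int i;
    for (i = 0; i < size; i++)
    {
        array[i] = i;
    }
    return array;
}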

cython_wrapper.pyx

""" Small Cython file to demonstrate the use of PyArray_SimpleNewFromData
in Cython to create an array from already allocated memory.

Cython enables mixing C-level calls and Python-level calls in the same
file with a Python-like syntax and easy type coercion. See
http://cython.org for more information
"""

# Author: Gael Varoquaux
# License: BSD 3 clause

# Declare the prototype of the C function we are interested in calling.
# The array holds C ints, matching the NPY_INT type used below.
cdef extern from "c_code.c":
    int *compute(int size)

from libc.stdlib cimport free
from cpython cimport PyObject, Py_INCREF

# Import the Python-level symbols of numpy
import numpy as np

# Import the C-level symbols of numpy
cimport numpy as np

# Numpy must be initialized. When using numpy from C or Cython you must
# _always_ do that, or you will have segfaults
np.import_array()

# We need to build an array-wrapper class to deallocate our array when
# the Python object is deleted.

cdef class ArrayWrapper:
    cdef void* data_ptr
    cdef int size

    cdef set_data(self, int size, void* data_ptr):
        """ Set the data of the array

        This cannot be done in the constructor as it must receive C-level
        arguments.

        Parameters:
        -----------
        size: int
            Length of the array.
        data_ptr: void*
            Pointer to the data
        """
        self.data_ptr = data_ptr
        self.size = size

    def __array__(self):
        """ Here we use the __array__ method, that is called when numpy
            tries to get an array from the object."""
        cdef np.npy_intp shape[1]
        shape[0] = <np.npy_intp> self.size
        # Create a 1D array, of length 'size'
        ndarray = np.PyArray_SimpleNewFromData(1, shape,
                                               np.NPY_INT, self.data_ptr)
        return ndarray

    def __dealloc__(self):
        """ Frees the array. This is called by Python when all the
        references to the object are gone. """
        free(<void*>self.data_ptr)


def py_compute(int size):
    """ Python binding of the 'compute' function in 'c_code.c' that does
        not copy the data allocated in C.
    """
    cdef int *array
    cdef np.ndarray ndarray
    # Call the C function
    array = compute(size)

    array_wrapper = ArrayWrapper()
    array_wrapper.set_data(size, <void*> array)
    ndarray = np.array(array_wrapper, copy=False)
    # Assign our object to the 'base' of the ndarray object
    ndarray.base = <PyObject*> array_wrapper
    # Increment the reference count, as the above assignment was done in
    # C, and Python does not know that there is this additional reference
    Py_INCREF(array_wrapper)

    return ndarray

setup.py

""" Example of building a module with a Cython file. See the distutils
and numpy distutils documentation for more info:
http://docs.scipy.org/doc/numpy/reference/distutils.html
"""
# Author: Gael Varoquaux
# License: BSD 3 clause

# Note: numpy.distutils is deprecated and is not available on
# Python 3.12 and later; this script targets older environments.
import numpy
from Cython.Distutils import build_ext


def configuration(parent_package='', top_path=None):
    """ Function used to build our configuration.
    """
    from numpy.distutils.misc_util import Configuration

    # The configuration object that holds information on all the files
    # to be built.
    config = Configuration('', parent_package, top_path)
    config.add_extension('cython_wrapper',
                         sources=['cython_wrapper.pyx'],
                         # libraries=['m'],
                         depends=['c_code.c'],
                         include_dirs=[numpy.get_include()])
    return config


if __name__ == '__main__':
    # Retrieve the parameters of our local configuration
    params = configuration(top_path='').todict()

    # Override the C-extension building so that it knows about '.pyx'
    # Cython files
    params['cmdclass'] = dict(build_ext=build_ext)

    # Call the actual building/packaging function (see distutils docs)
    from numpy.distutils.core import setup
    setup(**params)

test.py

""" Script to smoke-test our Cython wrappers
"""
# Author: Gael Varoquaux
# License: BSD 3 clause

import numpy as np

import cython_wrapper

a = cython_wrapper.py_compute(10)

# print() calls so that the script also runs under Python 3
print('The array created is %s' % a)
print('It carries a reference to our deallocator: %s ' % a.base)
np.testing.assert_allclose(a, np.arange(10))

@danieldanciu

danieldanciu commented Oct 5, 2021

return np.array(array_wrapper)

Should probably be

return np.array(array_wrapper, copy=False)

otherwise a copy of the array will be made anyway, right?
(I am assuming that the reference count for array_wrapper will be correctly updated so that it's not garbage collected)

@SeanDS

SeanDS commented Feb 25, 2022

The original blog post is no longer available, but I found it on the archive: http://web.archive.org/web/20160321001549/http://blog.enthought.com/python/numpy-arrays-with-pre-allocated-memory/
