@bfroehle
Last active January 12, 2021 08:02
Proof of concept of using the LLVM JIT compiler to load Python extension modules.

Python & LLVM Extension Modules

This is a quick proof of concept of using the LLVM JIT compiler as Python's extension module loader. This implementation completely replaces the standard extension module framework, so regular extension modules cannot be loaded. It would not be too difficult to extend the patch to allow loading both LLVM bitcode (.bc) and regular extension (.so) modules.

Building Python

The patch is designed to be applied on top of the 3.3 branch of the Mercurial CPython repository. It probably works on other branches, but this is untested.

hg clone http://hg.python.org/cpython
cd cpython
hg update 3.3
cp ~/path/to/dynload_llvm.cpp Python/dynload_llvm.cpp
patch -p1 < ~/path/to/python-dynload-llvm.patch

Three changes to the Python build process are required.

  • We use the LLVM C++ API, so we must use a C++ linker to build the python executable. This is accomplished with the --with-cxx-main flag.

  • The LLVM extension module code is in Python/dynload_llvm.cpp so we inform the build process using DYNLOADFILE=dynload_llvm.o.

  • We need to know how to find the LLVM include directory and libraries. This is accomplished by setting the LLVMCONFIG variable, during the make process, to the path to llvm-config.

    ./configure --with-cxx-main DYNLOADFILE=dynload_llvm.o
    LLVMCONFIG=llvm-config make
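
The Makefile changes obtain the LLVM compiler and linker flags by calling llvm-config with --cxxflags, --libs and --ldflags. As a rough sketch of the same lookup, for instance to check an LLVM installation before starting a build, the Python below collects the three outputs. The function name llvm_build_flags and its injectable run parameter are illustrative assumptions and are not part of the patch.

```python
import shlex
import subprocess


def llvm_build_flags(llvm_config="llvm-config", run=subprocess.check_output):
    """Return the flag lists that the patched Makefile asks llvm-config for.

    `run` is injectable (an assumption made so the lookup can be exercised
    without LLVM installed); by default it executes the real llvm-config.
    """
    flags = {}
    for option in ("--cxxflags", "--libs", "--ldflags"):
        output = run([llvm_config, option])
        if isinstance(output, bytes):
            output = output.decode()
        # llvm-config prints one shell-style line of flags per option.
        flags[option] = shlex.split(output)
    return flags
```

Calling llvm_build_flags("/path/to/llvm-config") with the same path that is passed as LLVMCONFIG should raise an error if the binary cannot be run, which is a cheap check to make before a long build.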

Compiling Modules

Compiling extension modules is relatively simple. The loader expects the modules to be in LLVM bitcode, which may be generated using the -c -emit-llvm flags. The bitcode file should have the extension .bc. For example, to compile the xx module in the Python source:

clang -g -O2 -emit-llvm -c -I. -IInclude -o xx.bc Modules/xxmodule.c
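
The loader source lists three accepted suffixes (an SOABI-tagged one, an "abi"-tagged one and plain .bc) and looks up an init function named PyInit_<module>. The sketch below spells out the resulting file names and symbol name for a module. The default values "cpython-33m" and "3" are assumptions about a typical Python 3.3 Linux build, and both helper names are hypothetical.

```python
def bitcode_candidates(name, soabi="cpython-33m", abi="3"):
    """File names accepted for module `name`, in suffix-table order.

    `soabi` and `abi` stand in for the SOABI and PYTHON_ABI_STRING build
    macros; the defaults are assumed values, not read from a real build.
    """
    suffixes = [".%s.bc" % soabi, ".abi%s.bc" % abi, ".bc"]
    return [name + suffix for suffix in suffixes]


def init_symbol(name, lead_underscore=""):
    """Symbol the loader searches for in the bitcode module.

    The loader formats the name with "%.200s", so only the first 200
    characters of the module name are used.
    """
    return "%sPyInit_%s" % (lead_underscore, name[:200])
```

For the xx module this gives xx.cpython-33m.bc, xx.abi3.bc and xx.bc as acceptable file names, and PyInit_xx as the function the bitcode must define.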

Multiple bitcode objects may be linked together into one bitcode file using llvm-link:

clang -O2 -g -emit-llvm -c -I. -IInclude Modules/_math.c \
    -o Modules/_math.bc
clang -O2 -g -emit-llvm -c -I. -IInclude Modules/mathmodule.c \
    -o Modules/mathmodule.bc
llvm-link -o math.bc Modules/_math.bc Modules/mathmodule.bc
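
When several modules need the same treatment, the two steps above can be generated rather than typed. The following is a minimal sketch that only builds the argument lists and runs nothing. The function name bitcode_build_commands is hypothetical, and the include flags are copied from the examples above and assume the commands run from the top of the Python source tree.

```python
import os


def bitcode_build_commands(module, sources, cflags=("-O2", "-g")):
    """Return argv lists: one clang call per source, then one llvm-link call.

    With a single source the bitcode is written straight to <module>.bc and
    no link step is generated.
    """
    if not sources:
        raise ValueError("at least one source file is required")
    base = ["clang"] + list(cflags) + ["-emit-llvm", "-c", "-I.", "-IInclude"]
    output = module + ".bc"
    if len(sources) == 1:
        return [base + ["-o", output, sources[0]]]
    commands, objects = [], []
    for source in sources:
        # Each source gets an intermediate .bc file next to it.
        obj = os.path.splitext(source)[0] + ".bc"
        objects.append(obj)
        commands.append(base + [source, "-o", obj])
    commands.append(["llvm-link", "-o", output] + objects)
    return commands
```

Each returned list can be passed to subprocess.check_call in order. For "math" with Modules/_math.c and Modules/mathmodule.c, the final command is the llvm-link invocation shown above.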

After being compiled, the extension modules may be imported and used as expected:

$ ./python
Python 3.3.0+ (dynload_llvm qbase qtip tip:e606bfc02df2+, Jan 14 2013, 08:51:28)
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xx
>>> xx.__file__
'./xx.bc'

Issues

Extension modules which refer to symbols in dynamically loaded libraries may be used, but the library first needs to be loaded so that its symbols are available. This can be done, for example, using ctypes:

>>> import ctypes
>>> libexample = ctypes.CDLL('/path/to/libexample.so',
...                          ctypes.RTLD_GLOBAL)
>>> import example # Now has access to symbols exported in libexample.so.
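
The preload-then-import sequence can be wrapped in a small helper. The sketch below is illustrative only: import_with_libraries and its loader parameter are hypothetical names, and it assumes every listed library can be opened with ctypes.CDLL.

```python
import ctypes
import importlib


def import_with_libraries(module_name, library_paths, loader=ctypes.CDLL):
    """Load shared libraries with RTLD_GLOBAL, then import the module.

    The library handles are returned alongside the module so the caller
    can keep references to them for as long as the module is in use.
    """
    handles = [loader(path, mode=ctypes.RTLD_GLOBAL) for path in library_paths]
    module = importlib.import_module(module_name)
    return module, handles
```

With the example above, this would be called as import_with_libraries("example", ["/path/to/libexample.so"]).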

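Because this loader replaces the standard one, _ctypes itself cannot be loaded as a regular .so module. It may therefore be necessary to add a line to Modules/Setup so that ctypes is compiled statically into the Python executable: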

_ctypes _ctypes/_ctypes.c _ctypes/callbacks.c _ctypes/callproc.c _ctypes/stgdict.c _ctypes/cfield.c -lffi

dynload_llvm.cpp

/* Support for dynamic loading of extension modules */

#include "Python.h"
#include "importdl.h"

#include <cerrno>
#include <cstring>
#include <string>

#include <llvm/LLVMContext.h>
#include <llvm/Support/TargetSelect.h>
#include <llvm/ADT/OwningPtr.h>
#include <llvm/Support/MemoryBuffer.h>
#include <llvm/Support/system_error.h>
#include <llvm/Bitcode/ReaderWriter.h>
#include <llvm/Module.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
#include <llvm/ExecutionEngine/JIT.h>

#if (defined(__OpenBSD__) || defined(__NetBSD__)) && !defined(__ELF__)
#define LEAD_UNDERSCORE "_"
#else
#define LEAD_UNDERSCORE ""
#endif

/* The .bc extension module ABI tag, supplied by the Makefile via
   Makefile.pre.in and configure.  This is used to discriminate between
   incompatible .bc files so that extensions for different Python builds can
   live in the same directory.  E.g. foomodule.cpython-32.bc
*/

const char *_PyImport_DynLoadFiletab[] = {
    "." SOABI ".bc",
    ".abi" PYTHON_ABI_STRING ".bc",
    ".bc",
    NULL,
};

/* Raise ImportError with the given message, module name and path. */
static void
set_import_error(const char *error, const char *shortname, const char *pathname)
{
    PyObject *error_ob = PyUnicode_FromString(error);
    PyObject *mod_name = PyUnicode_FromString(shortname);
    PyObject *path = PyUnicode_FromString(pathname);

    PyErr_SetImportError(error_ob, mod_name, path);
    Py_XDECREF(error_ob);
    Py_XDECREF(path);
    Py_XDECREF(mod_name);
}

extern "C"
dl_funcptr _PyImport_GetDynLoadFunc(const char *shortname,
                                    const char *pathname, FILE *fp)
{
    using namespace llvm;

    InitializeNativeTarget();

    void *handle;
    char funcname[258];
    char pathbuf[260];

    if (strchr(pathname, '/') == NULL) {
        /* Prefix bare filename with "./" */
        PyOS_snprintf(pathbuf, sizeof(pathbuf), "./%-.255s", pathname);
        pathname = pathbuf;
    }

    PyOS_snprintf(funcname, sizeof(funcname),
                  LEAD_UNDERSCORE "PyInit_%.200s", shortname);

    if (fp == NULL) {
        fp = fopen(pathname, "rb"); /* XXX Who is responsible for closing? */
        if (fp == NULL) {
            /* Without this check fileno(fp) below would dereference NULL. */
            set_import_error(strerror(errno), shortname, pathname);
            return NULL;
        }
    }

    // Read contents of file.
    std::string msg;
    OwningPtr<MemoryBuffer> mb;
    if (error_code ec = MemoryBuffer::getOpenFile(fileno(fp), pathname, mb)) {
        set_import_error(ec.message().c_str(), shortname, pathname);
        return NULL;
    }

    // Parse the bitcode into an LLVM module.
    Module *m = ParseBitcodeFile(mb.get(), getGlobalContext(), &msg);
    if (!m) {
        set_import_error(msg.c_str(), shortname, pathname);
        return NULL;
    }

    // Locate the module's PyInit_<name> function.
    Function *F = m->getFunction(funcname);
    if (!F) {
        set_import_error("Cannot find init function", shortname, pathname);
        return NULL;
    }

    // Build an execution engine for the module.
    TargetOptions opts;
    opts.JITEmitDebugInfo = true;
    opts.JITExceptionHandling = true;

    EngineBuilder builder(m);
    builder.setEngineKind(EngineKind::JIT)
           .setUseMCJIT(true)
           .setErrorStr(&msg)
           .setTargetOptions(opts)
           .setOptLevel(llvm::CodeGenOpt::None); // XXX Default);

    ExecutionEngine *JIT = builder.create();
    if (!JIT) {
        set_import_error(msg.c_str(), shortname, pathname);
        return NULL;
    }

    // Run static constructors, then compile the init function.
    JIT->runStaticConstructorsDestructors(false);
    handle = JIT->getPointerToFunction(F);
    if (handle == NULL) {
        set_import_error("handle == NULL; unknown error", shortname, pathname);
        return NULL;
    }

    return (dl_funcptr) handle;
}

python-dynload-llvm.patch

--- a/Makefile.pre.in Fri Dec 28 19:08:49 2012 +0100
+++ b/Makefile.pre.in Mon Jan 14 09:28:06 2013 -0800
@@ -35,6 +35,7 @@
 CXX= @CXX@
 MAINCC= @MAINCC@
 LINKCC= @LINKCC@
+LLVMCONFIG?= llvm-config
 AR= @AR@
 RANLIB= @RANLIB@
 READELF= @READELF@
@@ -185,7 +186,8 @@
 LIBS= @LIBS@
 LIBM= @LIBM@
 LIBC= @LIBC@
-SYSLIBS= $(LIBM) $(LIBC)
+LIBLLVM= $(shell $(LLVMCONFIG) --libs) $(shell $(LLVMCONFIG) --ldflags)
+SYSLIBS= $(LIBM) $(LIBC) $(LIBLLVM)
 SHLIBS= @SHLIBS@
 
 THREADOBJ= @THREADOBJ@
@@ -640,6 +642,11 @@
 		-DSOABI='"$(SOABI)"' \
 		-o $@ $(srcdir)/Python/dynload_shlib.c
 
+Python/dynload_llvm.o: $(srcdir)/Python/dynload_llvm.cpp Makefile
+	$(CXX) -c $(PY_CORE_CFLAGS) $(shell $(LLVMCONFIG) --cxxflags) \
+		-DSOABI='"$(SOABI)"' \
+		-o $@ $(srcdir)/Python/dynload_llvm.cpp
+
 Python/sysmodule.o: $(srcdir)/Python/sysmodule.c Makefile
 	$(CC) -c $(PY_CORE_CFLAGS) \
 		-DABIFLAGS='"$(ABIFLAGS)"' \