@magnific0
Created February 27, 2014 14:06
Implementing cooperative multi-tasking for multi-threaded C++ Python calls

In response to pagmo issue #44

By default the GIL is not managed, which means that multiple threads can access the Python interpreter without acquiring a lock. This default behaviour saves the user the hassle of working with the GIL, and it works perfectly fine as long as only one thread accesses the Python interpreter, as is the case with most algorithms. However, if an algorithm spawns multiple threads (e.g. PaDe) that try to access the interpreter at the same time (because the problem is defined in Python), this will eventually cause segfaults.

There are three possible ways of managing this problem:

  1. do not spawn multiple threads in algorithms;
  2. discourage users from defining a PyGMO problem for these select few algorithms;
  3. use cooperative multitasking.

Below, an implementation of option 3 is discussed. In cooperative multitasking, each Python callback function is wrapped with the following lines (source):

PyGILState_STATE gstate;
gstate = PyGILState_Ensure();

/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result or handle exception */

/* Release the thread. No Python API allowed beyond this point. */
PyGILState_Release(gstate);

This requires a thread to acquire the GIL (Global Interpreter Lock) before making a Python call and to release it afterwards. The GIL is explained in more detail in the PyCon presentation by Dave Beazley.

Using the GIL requires the main process (the C++ function called by Python) to first initialize threading and obtain the lock, using:

PyEval_InitThreads();

The main thread now holds the GIL. Any threads it spawns do not automatically obtain the GIL; if the main thread keeps holding the lock while waiting for the spawned threads to finish, the program hangs, which is known as a deadlock.

How the main process should release the GIL for the threads is subject to much debate (e.g. here and here); the common failure modes are, again, deadlocks and segfaults.

During the last test, with Python 2.7.6, the following approach worked best, where inside the algorithm the arch.evolve method is wrapped as follows:

PyEval_ReleaseLock();
arch.evolve(pop);
PyEval_AcquireLock();

In the future, however, the following strategy should be used, since PyEval_ReleaseLock() has been deprecated. (CPython's Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros expand to exactly this save/restore pair.)

PyThreadState *_save;
_save = PyEval_SaveThread();    
arch.evolve(pop);
PyEval_RestoreThread(_save);

Finally, it would be possible to simplify wrapping the callback functions by making use of the Boost.Python call policies, as described in this tutorial. A new policy takes care of acquiring and releasing the lock using its precall() and postcall() functions for every callback.

The definition of the policy would be:

#include <boost/python.hpp>

namespace boost { namespace python {

struct release_gil_policy
{
  // Ownership of this argument tuple will ultimately be adopted by
  // the caller.
  template <class ArgumentPackage>
  static bool precall(ArgumentPackage const&)
  {
    // Release the GIL and save the PyThreadState for this thread
    thread_state() = PyEval_SaveThread();
    return true;
  }

  // Pass the result through
  template <class ArgumentPackage>
  static PyObject* postcall(ArgumentPackage const&, PyObject* result)
  {
    // Reacquire the GIL using the PyThreadState saved by precall()
    PyEval_RestoreThread(thread_state());
    thread_state() = NULL;
    return result;
  }

  typedef default_result_converter result_converter;
  typedef PyObject* argument_package;

  template <class Sig>
  struct extract_return_type : mpl::front<Sig>
  {
  };

private:
  // Retain a pointer to the PyThreadState on a per-thread basis
  // (thread_local assumes C++11; boost::thread_specific_ptr also works)
  static PyThreadState*& thread_state()
  {
    static thread_local PyThreadState* state = NULL;
    return state;
  }
};

}}

The policy is then enforced on each binding simply by:

def("myFunction", make_function(&myFunction, release_gil_policy()));