@PierreSelim
Last active May 6, 2016 20:23
Parallel Workers in Python

To read: https://gist.github.com/mangecoeur/9540178

Parallel computing can be done in many ways including:

  • using multiple processes, based on fork(). This behaves differently on Windows, because Windows does not have the fork() system call. I would recommend avoiding it if you need Windows compatibility.
  • using multiple threads.
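To see how the process-based option differs between platforms, the sketch below asks the standard library which process start methods are available. It assumes Python 3.4 or later, where multiprocessing.get_all_start_methods() and multiprocessing.get_start_method() exist; describe_start_methods is a hypothetical helper name used only for illustration.

```python
import multiprocessing as mp
import sys


def describe_start_methods():
    """Hypothetical helper: report how worker processes can be started here."""
    return {
        "platform": sys.platform,
        # On Windows this list contains only "spawn"; Unix-like systems
        # typically also offer "fork".
        "available": mp.get_all_start_methods(),
        # The method a plain mp.Pool(...) would use on this machine.
        "default": mp.get_start_method(),
    }


print(describe_start_methods())
```

When the start method is not "fork", each worker re-imports the main module, which is why process-based scripts are usually written with an if __name__ == "__main__": guard.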

Most Python libraries for parallel computing provide a pool of workers that can map a function f over a list x. The computation of each f(x[i]) is automatically scheduled on the pool of workers (and done in parallel if possible).
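Conceptually, such a map can be sketched by hand: submit one task per item, then collect the results in input order. The version below is a minimal illustration built on ThreadPoolExecutor.submit from the standard library (Python 3); pool_map and double are hypothetical names, and this is not how any particular library implements its own map.

```python
from concurrent.futures import ThreadPoolExecutor


def pool_map(f, xs, workers=4):
    """Hypothetical helper: roughly what a pool's map method does."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Schedule every f(x) on the pool; idle workers pick up pending tasks.
        futures = [pool.submit(f, x) for x in xs]
        # Wait for each future in turn, so results keep the input order.
        return [future.result() for future in futures]


def double(x):
    return 2 * x


print(pool_map(double, range(5)))  # [0, 2, 4, 6, 8]
```

The results come back in the order of the inputs even if the individual tasks finish in a different order.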

Here are two simple ways to use these map methods.

Process

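I suggest using multiprocess rather than the multiprocessing package from the standard library. It is a fork of multiprocessing that serializes with dill, so it can send more kinds of objects (lambdas, for example) to the workers. Install it with pip install multiprocess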

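import time
# import multiprocessing as mp  # standard-library alternative with the same Pool API
import multiprocess as mp

def f(x):
    # Simulate one second of work, then return the doubled input.
    time.sleep(1)
    return 2 * x

if __name__ == "__main__":
    # The guard keeps workers from re-running this block on platforms without fork().
    pool = mp.Pool(16)
    # Parallel work done by the workers
    results = pool.map(f, range(20))
    # Stop accepting work and wait for the workers to exit.
    pool.close()
    pool.join()
    print(results)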

Threading

If you are using Python 2.7, you can install the backport of concurrent.futures, the futures package, with pip install futures

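import time
from concurrent.futures import ThreadPoolExecutor

def f(x):
    # Simulate one second of blocking work, then return the doubled input.
    time.sleep(1)
    return 2 * x

# The with block shuts the pool down once the work is finished.
with ThreadPoolExecutor(max_workers=16) as pool:
    # Parallel work done by the workers
    results = list(pool.map(f, range(20)))
print(results)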
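To check that the work really overlaps, one can time the same map with different pool sizes. The sketch below assumes Python 3 (for time.perf_counter) and uses a shorter sleep than the examples above so that it finishes quickly; timed_map is a hypothetical helper, and the exact timings will vary from machine to machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def f(x):
    # Stand-in for a blocking call such as a network request.
    time.sleep(0.1)
    return 2 * x


def timed_map(workers, n=8):
    """Hypothetical helper: map f over range(n) and measure the wall time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(f, range(n)))
    return results, time.perf_counter() - start


results_1, t1 = timed_map(workers=1)
results_8, t8 = timed_map(workers=8)

print(results_1 == results_8)
print("1 worker: %.2fs, 8 workers: %.2fs" % (t1, t8))
```

With one worker the sleeps run back to back, while with eight workers they overlap. Threads help here because time.sleep blocks without keeping the interpreter busy; for CPU-bound functions in CPython, the process pool is usually the better fit.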