@jabbany
Last active January 31, 2017 08:01
Quick Guide to Python Multiprocessing
from multiprocessing import Pool
# --- Filler code that you can ignore
# Read in your sentences and labels somehow
X_train, y_train = read_train()
X_eval, y_eval = read_eval()
# Train your model (high level filler code that looks like sklearn :) )
# myModel must be global here so that each forked worker process inherits its own *copy* of it
# (this relies on the "fork" start method, which is not available on Windows)
myModel = MyHMMModel()
myModel.fit(X_train, y_train)
# --- The Multiprocessing magic happens below
# Define a function for each process to run
def evaluate(pair):
    # 'pair' is a tuple (X, y): one sentence and its correct reference label
    X, y = pair
    # Make a prediction with your model
    # IMPORTANT: predict must not change your model since that would
    # desynchronize the models across the processes
    y_pred = myModel.predict(X)
    # Evaluate the prediction against the reference label
    result = some_way_to_evaluate(y_pred, y)
    return result
# Create a default Pool with as many processes as your CPU core count
p = Pool()
# Create some variables to produce final result
finished, allResults = 0, None
# imap_unordered returns an iterator that yields results as they complete, not in input order
# We let the workers work on sentences without caring about which one is done first
# zip produces the (X, y) pairs
for result in p.imap_unordered(evaluate, zip(X_eval, y_eval)):
    finished += 1
    # Combine your results from a single task, implement this yourself
    allResults = combineResult(result, allResults)
    # Show some progress :)
    print("Finished {} ... \r".format(finished), end="")
# Clean up: stop accepting new tasks, then wait for the workers to exit
p.close()
p.join()
# Output results, implement yourself
showResults(allResults)
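
The template above leaves the model, the scoring and the result merging as placeholders. As a rough illustration of how the pieces fit together, here is a minimal runnable sketch. Everything named in it (`MajorityModel`, `MODEL`, `run_evaluation`, the `start_method` parameter and the toy data) is a hypothetical stand-in and not part of the original guide. The "model" simply predicts the most common training label, and each task returns 1 for a correct prediction and 0 otherwise.

```python
import multiprocessing


class MajorityModel:
    """Hypothetical stand-in for a trained model: predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        # Read-only: never mutates the model, so the worker copies stay identical
        return self.label_


# Built at module level so every worker process has its own copy
# (inherited under "fork", rebuilt on import under "spawn")
MODEL = MajorityModel().fit(["a b", "c d", "e f"], ["pos", "pos", "neg"])


def evaluate(pair):
    """Return 1 if the prediction for one (X, y) pair is correct, else 0."""
    X, y = pair
    return int(MODEL.predict(X) == y)


def run_evaluation(X_eval, y_eval, processes=2, start_method=None):
    """Score all pairs in parallel and return (correct, total)."""
    # start_method=None falls back to the platform default start method
    ctx = multiprocessing.get_context(start_method)
    correct = total = 0
    # The with-block shuts the pool down once all results are consumed
    with ctx.Pool(processes) as pool:
        for hit in pool.imap_unordered(evaluate, zip(X_eval, y_eval)):
            correct += hit
            total += 1
    return correct, total


if __name__ == "__main__":
    X_eval = ["g h", "i j", "k l", "m n"]
    y_eval = ["pos", "neg", "pos", "pos"]
    print(run_evaluation(X_eval, y_eval))  # prints (3, 4)
```

Because the per-task results are plain counts, they can be summed in whatever order `imap_unordered` delivers them. The `if __name__ == "__main__":` guard keeps the script safe on platforms where worker processes re-import the main module.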