@joezuntz
Created February 5, 2015 15:41
Simplest possible mpi4py job splitting
#Suppose you have a collection of tasks, which in this example I'll assume is just running a function f.
#If these tasks are completely separate and independent then you can parallelize them easily.
#In this gist I'll show the simplest possible way to do this using mpi4py.
#There are better ways to do this, in particular if the tasks vary significantly in time taken to run.
import mpi4py.MPI
def f(i):
    "A fake task - in this case let's just open a file and write a number to it"
    #open a file with a name based on the task number
    #(the with block closes the file for us when it ends)
    with open("%d.txt"%i,"w") as outfile:
        #write some info to it
        outfile.write("%d * 10 = %d\n"%(i, 10*i))
#A list of all the tasks to do. In your case you will probably build this task list in a more complex way.
#You don't even need to build it in advance for this approach to work
task_list = range(100)
#main program loop. This is the unparallelized version, for comparison
#(note that when run under MPI every process executes this loop in full)
for task in task_list:
    f(task)
#And now moving on to the parallel version
#mpi4py has the notion of a "communicator" - a collection of processors
#all operating together, usually on the same program. Each processor
#in the communicator is identified by a number, its rank. We'll use that
#number to split the tasks
#find out which number processor this particular instance is,
#and how many there are in total
rank = mpi4py.MPI.COMM_WORLD.Get_rank()
size = mpi4py.MPI.COMM_WORLD.Get_size()
#parallelized version
#the enumerate function gives us a number i in addition
#to the task. (In this specific case i is the same as task! But that's
#not true usually)
for i,task in enumerate(task_list):
    #This is how we split up the jobs.
    #The % sign is a modulus, and the "continue" means
    #"skip the rest of this bit and go to the next time
    #through the loop"
    # If we had e.g. 4 processors, this would mean
    # that proc zero did tasks 0, 4, 8, 12, 16, ...
    # and proc one did tasks 1, 5, 9, 13, 17, ...
    # and so on.
    if i%size!=rank: continue
    print("Task number %d (%d) being done by processor %d of %d" % (i, task, rank, size))
    f(task)
@nicolasH1027

Hi, my function returns an array and the input is a list of arrays. How can I return the results in the same order as the input? For example, suppose we have 20 tasks and 4 processors: tasks 1, 2, 3, 4, 5 are done by the first processor, 6, 7, 8, 9, 10 by the second one, and so on, and the final result is a list of the returned arrays.
Sorry if it's a dumb question, I'm new to MPI.
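
One possible way to do this, as a minimal sketch rather than anything from the gist itself: give each rank a contiguous block of tasks instead of the round-robin split above, then collect the per-rank result lists on rank 0 with mpi4py's lowercase `gather`, which pickles ordinary Python objects (NumPy arrays included). The names `block_bounds` and `run_in_order` are hypothetical helpers invented for this illustration; they are not part of mpi4py.

```python
def block_bounds(n_tasks, size, rank):
    """Return (start, stop) of the contiguous block of tasks owned by `rank`.

    Hypothetical helper: block lengths differ by at most one task.
    """
    base, extra = divmod(n_tasks, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def run_in_order(func, tasks):
    """Apply func to every task across all ranks.

    Rank 0 returns the results in input order; other ranks return None.
    """
    # Imported here so block_bounds can be tried without an MPI installation.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    tasks = list(tasks)
    start, stop = block_bounds(len(tasks), size, rank)
    local_results = [func(task) for task in tasks[start:stop]]

    # gather() returns, on the root only, a list with one entry per rank,
    # ordered by rank; every other rank receives None.
    chunks = comm.gather(local_results, root=0)
    if rank != 0:
        return None
    # Blocks are contiguous and rank-ordered, so concatenating them
    # reproduces the input order.
    return [result for chunk in chunks for result in chunk]
```

With 20 tasks and 4 processes, rank 0 would take tasks 0 to 4, rank 1 tasks 5 to 9, and so on, and calling `run_in_order(my_function, list_of_arrays)` on every rank would leave the ordered list on rank 0 only (`my_function` and `list_of_arrays` standing in for your own code). If the round-robin split from the gist is kept instead, the same `gather` call works, but the results then have to be re-interleaved by task index on rank 0.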