@minrk
Forked from Midnighter/parallel_tweaking.ipynb
Created February 19, 2013 01:34
{
"metadata": {
"name": "parallel_tweaking2"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parallel Inner Products"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from IPython.parallel import Client, require, interactive"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"rc = Client()\n",
"dv = rc.direct_view()\n",
"lv = rc.load_balanced_view()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"with dv.sync_imports():\n",
" import numpy"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"importing numpy on engine(s)\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"mat = numpy.random.random_sample((800, 800))\n",
"mat = numpy.asfortranarray(mat)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def simple_inner(i):\n",
" column = mat[:, i]\n",
" # use a list comprehension, not a generator expression: a generator would close over `column`, and closures cannot be sent to the engines\n",
" return sum([numpy.inner(column, mat[:, j]) for j in xrange(i + 1, mat.shape[1])])"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Local, serial performance."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(simple_inner(i) for i in xrange(mat.shape[1] - 1))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 892 ms per loop\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dv.push(dict(mat=mat), block=True);"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parallel implementation using a `DirectView`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(dv.map(simple_inner, range(mat.shape[1] - 1), block=False))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 3.24 s per loop\n"
]
}
],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parallel implementation using a `LoadBalancedView` with a large `chunksize` and unordered results."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(lv.map(simple_inner, range(mat.shape[1] - 1), ordered=False, chunksize=(mat.shape[1] - 1) // len(lv), block=False))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 2.79 s per loop\n"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But the transfer forced the array back to C-contiguous layout, which explains why both parallel runs are slower than the serial one.\n",
"\n",
"If we re-apply the Fortran-contiguous transformation *on the engines*,\n",
"we should get our performance back."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%px mat = numpy.asfortranarray(mat)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now re-run the timings."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(dv.map(simple_inner, range(mat.shape[1] - 1), block=False))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 439 ms per loop\n"
]
}
],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(lv.map(simple_inner, range(mat.shape[1] - 1), ordered=False, chunksize=(mat.shape[1] - 1) // len(lv), block=False))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 473 ms per loop\n"
]
}
],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mapping over pairs of indices, one inner product per call, takes even more time due to the additional communication."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def inner(i, j):\n",
" return numpy.inner(mat[:, i], mat[:, j])"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"first = [i for i in xrange(mat.shape[1] - 1) for j in xrange(i + 1, mat.shape[1])]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"second = [j for i in xrange(mat.shape[1] - 1) for j in xrange(i + 1, mat.shape[1])]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(dv.map(inner, first, second, block=False))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 1.11 s per loop\n"
]
}
],
"prompt_number": 19
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(lv.map(inner, first, second, ordered=False, chunksize=len(first) // len(lv), block=False))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 1.27 s per loop\n"
]
}
],
"prompt_number": 20
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit sum(map(inner, first, second))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 loops, best of 3: 1.91 s per loop\n"
]
}
],
"prompt_number": 21
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So the double-index version is slower than the single-index version in every case (worse locality, more per-call overhead),\n",
"but it is still faster in parallel than in serial."
]
}
],
"metadata": {}
}
]
}
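
The layout effect the notebook relies on can be checked locally, without a cluster. The following is a minimal sketch, not part of the original notebook: it assumes only NumPy, uses a small hypothetical 3 x 4 matrix in place of the 800 x 800 one, and introduces a hypothetical helper `column_is_contiguous`. It shows that a column slice `a[:, i]` is one contiguous block of memory only when the array is Fortran-contiguous, which is why `simple_inner` benefits from `numpy.asfortranarray`.

```python
import numpy

# Hypothetical small matrix, standing in for the 800 x 800 one in the notebook.
c_mat = numpy.arange(12, dtype=float).reshape(3, 4)  # C-contiguous (row-major)
f_mat = numpy.asfortranarray(c_mat)                   # same values, column-major


def column_is_contiguous(a, i=0):
    """Return True if column i of `a` occupies one contiguous block of memory."""
    return bool(a[:, i].flags['C_CONTIGUOUS'])


# The two arrays hold the same values; only the memory layout differs.
print(numpy.array_equal(c_mat, f_mat))
print(c_mat.flags['C_CONTIGUOUS'], f_mat.flags['F_CONTIGUOUS'])

# Column slices are strided in the C-ordered array, contiguous in the Fortran-ordered one.
print(column_is_contiguous(c_mat), column_is_contiguous(f_mat))
```

Running the same flag check on the engines (for example through `%px`) before and after the `asfortranarray` call would be one way to confirm what layout the pushed array actually has there.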