Skip to content

Instantly share code, notes, and snippets.

@arogozhnikov
Last active August 29, 2015 14:24
Show Gist options
  • Save arogozhnikov/d8b0827557e7295f2aa9 to your computer and use it in GitHub Desktop.
Save arogozhnikov/d8b0827557e7295f2aa9 to your computer and use it in GitHub Desktop.
Speed benchmark of number crunchers in python
Display the source blob
Display the rendered blob
Raw
{"nbformat_minor": 0, "cells": [{"source": "## Log likelihood benchmark\n\ntested: \n0. numpy + scipy's pdf\n1. numpy\n2. cython\n4. numexpr\n3. theano\n4. parakeet\n\ncomputing log-likelihood for normal distribution \n\n__Notes__\n1. Not optimizing computations here (but in theory theano and parakeet may remove unnecessary computations)\n2. This test includes exp, log, division and summation of array\n2. Everything is running on CPU in one thread, this limitation is for 'fairness' of tests\n", "cell_type": "markdown", "metadata": {}}, {"execution_count": 1, "cell_type": "code", "source": "import numpy\nimport scipy\nfrom scipy.stats import norm\nimport theano", "outputs": [], "metadata": {"collapsed": true, "trusted": true}}, {"source": "### Scipy implementation", "cell_type": "markdown", "metadata": {}}, {"execution_count": 2, "cell_type": "code", "source": "def llh_scipy(data, mean, sigma):\n lh = norm(mean, sigma).pdf(data)\n return numpy.log(lh).sum()", "outputs": [], "metadata": {"collapsed": true, "trusted": true}}, {"source": "### Numpy implementation", "cell_type": "markdown", "metadata": {}}, {"execution_count": 3, "cell_type": "code", "source": "def llh_numpy(data, mean, sigma):\n s = (data - mean) ** 2 / (2 * (sigma ** 2))\n pdfs = numpy.exp(- s)\n pdfs /= numpy.sqrt(2 * numpy.pi) * sigma\n return numpy.log(pdfs).sum()", "outputs": [], "metadata": {"collapsed": true, "trusted": true}}, {"source": "### Cython implementation", "cell_type": "markdown", "metadata": {}}, {"execution_count": 4, "cell_type": "code", "source": "%load_ext Cython", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 5, "cell_type": "code", "source": "%%cython\ncdef extern from \"math.h\":\n double sqrt(double m)\n double exp(double m)\n double log(double m)\n\nimport cython\nimport numpy as np\n\ncimport numpy as np\nfrom numpy cimport ndarray\npi = np.pi\n\n@cython.boundscheck(False)\n@cython.wraparound(False)\ndef llh_cython(ndarray[np.float64_t, ndim=1] data, double mean, double sigma):\n cdef int l = len(data)\n cdef double llh = 0\n cdef double s = 0\n for i in xrange(l):\n s = (data[i] - mean) ** 2 / (2 * (sigma ** 2))\n llh += log(exp(-s) / (sqrt(2 * pi) * sigma))\n return llh", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"source": "### Numexpr implementation", "cell_type": "markdown", "metadata": {}}, {"execution_count": 6, "cell_type": "code", "source": "import numexpr\nnumexpr.set_num_threads(1)\n\ndef llh_numexpr(data, mean, sigma):\n expression = 'sum(log(exp(- (data-mean) **2 / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)))'\n return numexpr.evaluate(expression, local_dict=dict(data=data, mean=mean, sigma=sigma, pi=numpy.pi))", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"source": "### Parakeet implementation", "cell_type": "markdown", "metadata": {}}, {"execution_count": 7, "cell_type": "code", "source": "import parakeet\nparakeet.config.backend = 'c'\n\n@parakeet.jit \ndef llh_parakeet(data, mean, sigma):\n s = (data - mean) ** 2 / (2 * (sigma ** 2))\n pdfs = numpy.exp(- s)\n pdfs /= numpy.sqrt(2 * numpy.pi) * sigma\n return numpy.log(pdfs).sum()", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"source": "### Theano implementation", "cell_type": "markdown", "metadata": {}}, {"execution_count": 8, "cell_type": "code", "source": "import theano\nimport theano.tensor as T\n\nprint theano.config.device", "outputs": [{"output_type": "stream", "name": "stdout", "text": "cpu\n"}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 9, "cell_type": "code", "source": "theano.config.openmp", "outputs": [{"execution_count": 9, "output_type": "execute_result", "data": {"text/plain": "False"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 10, "cell_type": "code", "source": "def llh_theano(data, mean, sigma):\n s = (data - mean) ** 2 / (2 * (sigma ** 2))\n pdfs = T.exp(- s)\n pdfs /= T.sqrt(2 * numpy.pi) * sigma\n return T.log(pdfs).sum()\n\nmean, sigma = T.scalars('m', 's')\nd = T.vector('data')\n\nllh_theano = theano.function([d, mean, sigma], llh_theano(d, mean, sigma))", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"source": "## C++ implementation for comparison of speed\nwe are neither passing, nor returning anything in c++. Just doing same operations in C++ for some array to compare speed.\n\nMind the overhead for creating new process - it is essential for small sizes.", "cell_type": "markdown", "metadata": {}}, {"execution_count": 11, "cell_type": "code", "source": "%%writefile test_speed.cpp\n#include <iostream>\n#include <stdio.h>\n#include <stdlib.h>\n#include <math.h>\n\n// using namespace std;\n\n\nint main(int argc, char** argv) {\n if (argc < 4){\n std::cout << \"run with n_samples, mean, sigma!\";\n return 1;\n }\n int size = atoi(argv[1]);\n double mean = atof(argv[2]);\n double sigma = atof(argv[3]);\n \n double * data = new double[size];\n double factor = 1. / size;\n for (int i=0; i<size; ++i){\n data[i] = i * factor;\n }\n double result = 0.;\n double s = 0.;\n double x = 0.;\n \n for (int i=0; i<size; ++i){\n x = (data[i] - mean);\n s = x * x / (2 * (sigma * sigma));\n result += log(exp(-s) / (sqrt(2 * M_PI) * sigma));\n }\n \n std::cout << std::endl << result << std::endl;\n return 0;\n}", "outputs": [{"output_type": "stream", "name": "stdout", "text": "Overwriting test_speed.cpp\n"}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 12, "cell_type": "code", "source": "!g++ test_speed.cpp -o test_speed -O3", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 13, "cell_type": "code", "source": "def llh_cpp(data, mean, sigma):\n size = len(data)\n out = !./test_speed {len(data)} {mean} {sigma}", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"source": "### Data generation, checking that all functions output the same value", "cell_type": "markdown", "metadata": {}}, {"execution_count": 14, "cell_type": "code", "source": "from collections import OrderedDict\nfunctions = OrderedDict()\nfunctions['scipy'] = llh_scipy\nfunctions['numpy'] = llh_numpy\nfunctions['cython'] = llh_cython\nfunctions['numexpr'] = llh_numexpr\nfunctions['parakeet'] = llh_parakeet\nfunctions['theano'] = llh_theano\nfunctions['c++'] = llh_cpp", "outputs": [], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 15, "cell_type": "code", "source": "data = numpy.random.normal(size=1000000).astype('float64')\nfor name, function in functions.items():\n print name, function(data, 0.1, 1.1)", "outputs": [{"output_type": "stream", "name": "stdout", "text": "scipy -1431620.79845\nnumpy -1431620.79845\ncython -1431620.79845\nnumexpr -1431620.79845\nparakeet -1431620.79845\ntheano -1431620.79845\nc++ None\n"}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 16, "cell_type": "code", "source": "import timeit \nsizes = [10 ** 5, 10 ** 6, 10 ** 7, 10 ** 8]\nimport pandas\nscores = pandas.DataFrame(data=0, columns=functions.keys(), index=sizes)\nfor size in sizes:\n for name, function in functions.items():\n data = numpy.random.normal(size=size).astype('float64')\n result = %timeit -o function(data, 0.1, 1.1)\n scores.loc[size, name] = result.best", "outputs": [{"output_type": "stream", "name": "stdout", "text": "100 loops, best of 3: 10.9 ms per loop\n100 loops, best of 3: 5.97 ms per loop\n100 loops, best of 3: 10.6 ms per loop\n100 loops, best of 3: 7.01 ms per loop\n100 loops, best of 3: 6.32 ms per loop\n100 loops, best of 3: 7.7 ms per loop\n100 loops, best of 3: 17 ms per loop\n10 loops, best of 3: 131 ms per loop\n10 loops, best of 3: 62.1 ms per loop\n10 loops, best of 3: 109 ms per loop\n10 loops, best of 3: 72.6 ms per loop\n10 loops, best of 3: 64.5 ms per loop\n10 loops, best of 3: 80 ms per loop\n10 loops, best of 3: 79.8 ms per loop\n1 loops, best of 3: 1.8 s per loop\n1 loops, best of 3: 840 ms per loop\n1 loops, best of 3: 1.07 s per loop\n1 loops, best of 3: 700 ms per loop\n1 loops, best of 3: 628 ms per loop\n1 loops, best of 3: 822 ms per loop\n1 loops, best of 3: 697 ms per loop\n1 loops, best of 3: 18.9 s per loop\n1 loops, best of 3: 8.54 s per loop\n1 loops, best of 3: 10.9 s per loop\n1 loops, best of 3: 7.16 s per loop\n1 loops, best of 3: 6.25 s per loop\n1 loops, best of 3: 8.23 s per loop\n1 loops, best of 3: 6.92 s per loop\n"}], "metadata": {"collapsed": false, "trusted": true}}, {"source": "## Results (time in seconds, less is better) ", "cell_type": "markdown", "metadata": {}}, {"execution_count": 17, "cell_type": "code", "source": "scores", "outputs": [{"execution_count": 17, "output_type": "execute_result", "data": {"text/plain": " scipy numpy cython numexpr parakeet theano \\\n100000 0.010891 0.005968 0.010637 0.007012 0.006323 0.007702 \n1000000 0.130639 0.062146 0.108749 0.072568 0.064520 0.079974 \n10000000 1.795048 0.840145 1.067230 0.700086 0.627737 0.821548 \n100000000 18.917289 8.539139 10.910751 7.157359 6.250242 8.229951 \n\n c++ \n100000 0.017023 \n1000000 0.079771 \n10000000 0.696519 \n100000000 6.924702 ", "text/html": "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>scipy</th>\n <th>numpy</th>\n <th>cython</th>\n <th>numexpr</th>\n <th>parakeet</th>\n <th>theano</th>\n <th>c++</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>100000 </th>\n <td> 0.010891</td>\n <td> 0.005968</td>\n <td> 0.010637</td>\n <td> 0.007012</td>\n <td> 0.006323</td>\n <td> 0.007702</td>\n <td> 0.017023</td>\n </tr>\n <tr>\n <th>1000000 </th>\n <td> 0.130639</td>\n <td> 0.062146</td>\n <td> 0.108749</td>\n <td> 0.072568</td>\n <td> 0.064520</td>\n <td> 0.079974</td>\n <td> 0.079771</td>\n </tr>\n <tr>\n <th>10000000 </th>\n <td> 1.795048</td>\n <td> 0.840145</td>\n <td> 1.067230</td>\n <td> 0.700086</td>\n <td> 0.627737</td>\n <td> 0.821548</td>\n <td> 0.696519</td>\n </tr>\n <tr>\n <th>100000000</th>\n <td> 18.917289</td>\n <td> 8.539139</td>\n <td> 10.910751</td>\n <td> 7.157359</td>\n <td> 6.250242</td>\n <td> 8.229951</td>\n <td> 6.924702</td>\n </tr>\n </tbody>\n</table>\n</div>"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"source": "## Comparison to numpy time (less is better)", "cell_type": "markdown", "metadata": {}}, {"execution_count": 18, "cell_type": "code", "source": "normalized_scores = scores.copy()\nfor column in normalized_scores.columns:\n normalized_scores[column] /= scores['numpy'] ", "outputs": [], "metadata": {"collapsed": true, "trusted": true}}, {"execution_count": 19, "cell_type": "code", "source": "normalized_scores", "outputs": [{"execution_count": 19, "output_type": "execute_result", "data": {"text/plain": " scipy numpy cython numexpr parakeet theano c++\n100000 1.824860 1 1.782218 1.174857 1.059506 1.290531 2.852221\n1000000 2.102137 1 1.749897 1.167704 1.038194 1.286867 1.283616\n10000000 2.136593 1 1.270292 0.833292 0.747177 0.977864 0.829046\n100000000 2.215363 1 1.277734 0.838183 0.731952 0.963792 0.810937", "text/html": "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>scipy</th>\n <th>numpy</th>\n <th>cython</th>\n <th>numexpr</th>\n <th>parakeet</th>\n <th>theano</th>\n <th>c++</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>100000 </th>\n <td> 1.824860</td>\n <td> 1</td>\n <td> 1.782218</td>\n <td> 1.174857</td>\n <td> 1.059506</td>\n <td> 1.290531</td>\n <td> 2.852221</td>\n </tr>\n <tr>\n <th>1000000 </th>\n <td> 2.102137</td>\n <td> 1</td>\n <td> 1.749897</td>\n <td> 1.167704</td>\n <td> 1.038194</td>\n <td> 1.286867</td>\n <td> 1.283616</td>\n </tr>\n <tr>\n <th>10000000 </th>\n <td> 2.136593</td>\n <td> 1</td>\n <td> 1.270292</td>\n <td> 0.833292</td>\n <td> 0.747177</td>\n <td> 0.977864</td>\n <td> 0.829046</td>\n </tr>\n <tr>\n <th>100000000</th>\n <td> 2.215363</td>\n <td> 1</td>\n <td> 1.277734</td>\n <td> 0.838183</td>\n <td> 0.731952</td>\n <td> 0.963792</td>\n <td> 0.810937</td>\n </tr>\n </tbody>\n</table>\n</div>"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"source": "## Conclusion\n\nMany libraries claim that they can speed up number crunching in python.\nResults of this test are floating (+- 0.1), but what we can see\n1. numpy turned out to be fastest at moderate sizes of arrays\n2. numpy implementation at least not more complex than others\n3. parakeet was the only to get sensible speed up", "cell_type": "markdown", "metadata": {}}, {"source": "# Technical info", "cell_type": "markdown", "metadata": {}}, {"execution_count": 20, "cell_type": "code", "source": "import multiprocessing\nprint multiprocessing.cpu_count(), 'xeon cores'", "outputs": [{"output_type": "stream", "name": "stdout", "text": "16 xeon cores\n"}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 21, "cell_type": "code", "source": "numpy.__version__", "outputs": [{"execution_count": 21, "output_type": "execute_result", "data": {"text/plain": "'1.9.2'"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 22, "cell_type": "code", "source": "scipy.__version__", "outputs": [{"execution_count": 22, "output_type": "execute_result", "data": {"text/plain": "'0.14.0'"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 23, "cell_type": "code", "source": "import cython\ncython.__version__", "outputs": [{"execution_count": 23, "output_type": "execute_result", "data": {"text/plain": "'0.21.1'"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 24, "cell_type": "code", "source": "numexpr.__version__", "outputs": [{"execution_count": 24, "output_type": "execute_result", "data": {"text/plain": "'2.4'"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 25, "cell_type": "code", "source": "parakeet.__version__", "outputs": [{"execution_count": 25, "output_type": "execute_result", "data": {"text/plain": "'0.23.2'"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 26, "cell_type": "code", "source": "theano.__version__", "outputs": [{"execution_count": 26, "output_type": "execute_result", "data": {"text/plain": "'0.7.0'"}, "metadata": {}}], "metadata": {"collapsed": false, "trusted": true}}, {"execution_count": 27, "cell_type": "code", "source": "!g++ -v", "outputs": [{"output_type": "stream", "name": "stdout", "text": "Using built-in specs.\r\nCOLLECT_GCC=g++\r\nCOLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper\r\nTarget: x86_64-linux-gnu\r\nConfigured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu\r\nThread model: posix\r\ngcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) \r\n"}], "metadata": {"collapsed": false, "trusted": true}}], "nbformat": 4, "metadata": {"kernelspec": {"display_name": "Python 2", "name": "python2", "language": "python"}, "language_info": {"mimetype": "text/x-python", "nbconvert_exporter": "python", "version": "2.7.3", "name": "python", "file_extension": ".py", "pygments_lexer": "ipython2", "codemirror_mode": {"version": 2, "name": "ipython"}}}}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment