
@moble
Created November 16, 2017 19:11
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Speed of matrix multiplications\n",
"\n",
"Alex, Hugo, and I discussed why [`clifford`](https://github.com/arsenovic/clifford) is slow, and they suggested that it's because of the matrix multiplications that `clifford` uses to perform geometric products, like [this piece of code](https://github.com/arsenovic/clifford/blob/master/clifford/__init__.py#L725). Basically, any multivector in $n$ dimensions is represented as an array of the $2^n$ possible coefficients of the various basis blades, so any product is represented by an array of dimension $2^n$x$2^n$x$2^n$ where we contract the two multivectors on two of the indices, and the remaining index ranges over the $2^n$ coefficients of the output multivector.\n",
"\n",
"Numpy has various options for the backend libraries that do matrix multiplication. There's a nice overview of this [here](http://markus-beuckelmann.de/blog/boosting-numpy-blas.html), which also shows some benchmarks. The takeaway is that MKL is probably best if you have an Intel chip, and OpenBLAS is probably best otherwise. You can see which libraries are linked with your numpy installation by running `numpy.show_config()`. There are various ways to install these. I think `conda` automatically installs the best one, but I'm not sure; I know that `conda` gives you a choice.\n",
"\n",
"Here, I'll try a few numpy functions that can do the matrix multiplication in question and show their timings. The upshot is that each multiplication can be done a bit more efficiently using python's `@` operator, but in any case will take tens of microseconds for $n=5$. Obviously, the sparse bitwise representation discussed by [Dorst et al.](http://www.geometricalgebra.net/) will be better; I would expect typical results (depending on sparseness) tens to hundreds of times faster."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2017-11-16T19:05:59.739813Z",
"start_time": "2017-11-16T19:05:59.520914Z"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"np.random.seed(1234)\n",
"\n",
"def setup_arrays(n_dimensions):\n",
" size = 2**n_dimensions\n",
" return (\n",
" np.random.random((1, size)), # Random multivector\n",
" np.random.choice([-1, 0, 1], (size, size, size)), # Array with elements randomly chosen from [-1, 0, 1]\n",
" np.random.random((size, 1)) # Random multivector\n",
" )"
]
},
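{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference (this cell and the next are just an illustration, not code from `clifford`; the function name `explicit_product` is made up for this sketch), the contraction described above can be written out as explicit loops: the output coefficient is $c_l = \\sum_{j,k} a_j \\, M_{ljk} \\, b_k$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def explicit_product(a, mt, b):\n",
"    \"\"\"Contract a (1, N) row, an (N, N, N) multiplication table, and an (N, 1) column with plain loops.\"\"\"\n",
"    size = mt.shape[0]\n",
"    c = np.zeros(size)\n",
"    for l in range(size):\n",
"        for j in range(size):\n",
"            for k in range(size):\n",
"                c[l] += a[0, j] * mt[l, j, k] * b[k, 0]\n",
"    return c\n",
"\n",
"a, mt, b = setup_arrays(3)  # keep this small; the explicit loops are slow\n",
"np.allclose(explicit_product(a, mt, b), np.einsum('ij,ljk,ki', a, mt, b))"
]
},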
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, I'll check to make sure that my methods of computing the result all give the same answer. Since each term in the output is the result of $\\sim 4^5 \\approx 1000$ operations, I'll demand that they're the same to about 1000 times machine precision:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2017-11-16T19:08:54.241136Z",
"start_time": "2017-11-16T19:08:54.233203Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a, mt, b = setup_arrays(5)\n",
"\n",
"dot = np.dot(a, np.dot(mt, b)).flatten() # This is the current implementation in the code\n",
"einsum = np.einsum('ij,ljk,ki', a, mt, b) # This is the function I suggested on the phone call\n",
"at = (a @ mt @ b).flatten() # This also works\n",
"\n",
"np.allclose(dot, einsum, rtol=2e-13, atol=0) and np.allclose(dot, at, rtol=2e-13, atol=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looks good; they all agree to within the tolerances.\n",
"\n",
"Now let's see how long each one takes:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2017-11-16T19:06:05.117215Z",
"start_time": "2017-11-16T19:06:00.354170Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"58.5 µs ± 385 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
]
}
],
"source": [
"%%timeit a, mt, b = setup_arrays(5)\n",
"\n",
"np.dot(a, np.dot(mt, b))"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2017-11-16T19:06:09.897928Z",
"start_time": "2017-11-16T19:06:05.118990Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"58.7 µs ± 2.69 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
]
}
],
"source": [
"%%timeit a, mt, b = setup_arrays(5)\n",
"\n",
"np.einsum('ij,ljk,ki', a, mt, b)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2017-11-16T19:06:13.278976Z",
"start_time": "2017-11-16T19:06:09.899782Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"41.6 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
]
}
],
"source": [
"%%timeit a, mt, b = setup_arrays(5)\n",
"\n",
"(a @ mt @ b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Interestingly, the `@` operator is consistently faster than either of the other options.\n",
"\n",
"Presumably, this all depends a lot on the matrix libraries numpy is using. Here's what I have:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2017-11-16T19:06:13.287358Z",
"start_time": "2017-11-16T19:06:13.280942Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"blas_mkl_info:\n",
" libraries = ['mkl_rt', 'pthread']\n",
" library_dirs = ['/Users/boyle/.continuum/anaconda3/lib']\n",
" define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]\n",
" include_dirs = ['/Users/boyle/.continuum/anaconda3/include']\n",
"blas_opt_info:\n",
" libraries = ['mkl_rt', 'pthread']\n",
" library_dirs = ['/Users/boyle/.continuum/anaconda3/lib']\n",
" define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]\n",
" include_dirs = ['/Users/boyle/.continuum/anaconda3/include']\n",
"lapack_mkl_info:\n",
" libraries = ['mkl_rt', 'pthread']\n",
" library_dirs = ['/Users/boyle/.continuum/anaconda3/lib']\n",
" define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]\n",
" include_dirs = ['/Users/boyle/.continuum/anaconda3/include']\n",
"lapack_opt_info:\n",
" libraries = ['mkl_rt', 'pthread']\n",
" library_dirs = ['/Users/boyle/.continuum/anaconda3/lib']\n",
" define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]\n",
" include_dirs = ['/Users/boyle/.continuum/anaconda3/include']\n"
]
}
],
"source": [
"np.show_config()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So I'm using MKL."
]
}
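,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Appendix: sketch of a sparse, bitwise product\n",
"\n",
"As a rough illustration of the sparse bitwise representation mentioned in the introduction (a minimal sketch, not an optimized implementation; the function names are made up here, and a Euclidean metric is assumed, in which every basis vector squares to $+1$): store a multivector as a `dict` mapping each basis blade's bitmap to its coefficient, multiply blades with XOR, and get the sign by counting the basis-vector swaps needed to reach canonical order."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def reordering_sign(a, b):\n",
"    \"\"\"Sign from the number of swaps needed to bring e_a * e_b into canonical order (Euclidean metric).\"\"\"\n",
"    a >>= 1\n",
"    swaps = 0\n",
"    while a != 0:\n",
"        swaps += bin(a & b).count('1')\n",
"        a >>= 1\n",
"    return 1.0 if swaps % 2 == 0 else -1.0\n",
"\n",
"def sparse_geometric_product(x, y):\n",
"    \"\"\"Geometric product of multivectors stored as {bitmap: coefficient} dicts.\"\"\"\n",
"    out = {}\n",
"    for ba, ca in x.items():\n",
"        for bb, cb in y.items():\n",
"            bitmap = ba ^ bb  # the product blade is the symmetric difference of the factors\n",
"            out[bitmap] = out.get(bitmap, 0.0) + reordering_sign(ba, bb) * ca * cb\n",
"    return out\n",
"\n",
"# Example: (1 + 2*e1) * (3*e2) should give 3*e2 + 6*e12\n",
"sparse_geometric_product({0b00: 1.0, 0b01: 2.0}, {0b10: 3.0})"
]
}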
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
@hugohadfield

This is really interesting; the matrices are ludicrously large and very sparse. I had a quick look at scipy's sparse matrix library to see if there was an easy two- or three-line fix, but unfortunately it does not like dealing with 3D arrays!
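
For reference, here's a minimal sketch (just an illustration, not code from `clifford`; the variable names are made up) of one possible way around the 3D limitation: flatten the (2^n, 2^n, 2^n) table over its first two axes so scipy.sparse can hold it as a 2D matrix, then finish the contraction with dense products.

    import numpy as np
    from scipy import sparse

    n = 5
    size = 2**n
    a = np.random.random((1, size))                        # random multivector (row)
    mt = np.random.choice([-1, 0, 1], (size, size, size))  # dense multiplication table
    b = np.random.random((size, 1))                        # random multivector (column)

    # Store the table as an (N*N, N) sparse matrix by flattening its first two axes
    table = sparse.csr_matrix(mt.reshape(size * size, size))

    # intermediate[l, j] = sum_k mt[l, j, k] * b[k, 0]
    intermediate = np.asarray(table.dot(b)).reshape(size, size)

    # result[l] = sum_j a[0, j] * intermediate[l, j]
    result = (intermediate @ a.T).flatten()

    print(np.allclose(result, np.einsum('ij,ljk,ki', a, mt, b)))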

@chrisjldoran

Matrices are nearly always going to be the slowest way to implement the geometric product; they are a worst-case scenario. The only reason you might want to consider them is if there is some serious hardware optimisation you can take advantage of (such as 4x4 matrix multiplication on a GPU). There is also some overhead in converting to and from matrices and multivectors, which involves a bunch of further multiplies and traces. I would always go for a sparse representation in terms of explicit multivector components, and then figure out how to optimise the (highly parallel) multivector product.

@moble
Author

moble commented Nov 17, 2017

@chrisjldoran There's another reason to consider matrices: Someone else has already done the coding! :) But yes, the goal is certainly to get to a better implementation of the products.

@hugohadfield Have you tried these timings on your installation? In particular, I'm wondering if the @ operator really is faster for many configurations, or if it's something peculiar to mine.


ghost commented Nov 20, 2017

@moble Hi guys, I'm also really keen to see better performance in the clifford library. Just chiming in to say I executed the notebook, and the `@` operator is fastest for me too (dot 56 µs ± 1.29 µs, einsum 64.4 µs ± 804 ns, @ 43.1 µs ± 457 ns).
Output of show_config():

blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['libopenblas_v0.2.20_mingwpy', 'libopenblas_v0.2.20_mingwpy']
    library_dirs = ['c:\\opt\\64\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['libopenblas_v0.2.20_mingwpy', 'libopenblas_v0.2.20_mingwpy']
    library_dirs = ['c:\\opt\\64\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['libopenblas_v0.2.20_mingwpy', 'libopenblas_v0.2.20_mingwpy']
    library_dirs = ['c:\\opt\\64\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['libopenblas_v0.2.20_mingwpy', 'libopenblas_v0.2.20_mingwpy']
    library_dirs = ['c:\\opt\\64\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
