@BielStela
Created October 6, 2018 08:13
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a vector of random column indices, used to build a one-hot encoded matrix and demonstrate matrix-building performance."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"shape = (1000,200)\n",
"seed = np.random.randint(0,shape[1], shape[0])"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"def make_mat_prealloc(seed, shape):\n",
"    \"\"\"Create the matrix by preallocating it and filling it in place.\"\"\"\n",
"\n",
"    mat = np.zeros(shape)\n",
"    for i, n in enumerate(seed):\n",
"        mat[i, n] = 1\n",
"    return mat\n",
"\n",
"def make_mat_rbr(seed, shape):\n",
"    \"\"\"What we do. Create the matrix row by row.\"\"\"\n",
"\n",
"    for i, n in enumerate(seed):\n",
"        # init matrix for first row\n",
"        if i == 0:\n",
"            mat = np.zeros(shape[1])\n",
"            mat[n] = 1\n",
"\n",
"        # create and stack other rows (np.vstack copies the whole matrix each time)\n",
"        else:\n",
"            mat_i = np.zeros(shape[1])\n",
"            mat_i[n] = 1\n",
"            mat = np.vstack((mat, mat_i))\n",
"    return mat"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"mat1 = make_mat_prealloc(seed, shape)\n",
"mat2 = make_mat_rbr(seed, shape)\n",
"\n",
"assert np.isclose(mat1,mat2).all()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"769 µs ± 24.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
]
}
],
"source": [
"%timeit make_mat_prealloc(seed, shape)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"248 ms ± 17.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit make_mat_rbr(seed, shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Preallocating is not only orders of magnitude faster (about 769 µs versus 248 ms per call in the timings above), it is also **cleaner**, and it is the canonical way to create a matrix when the shape is known in advance."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
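
Since the shape is known up front, the remaining Python loop in `make_mat_prealloc` can also be replaced by a single indexing operation. The sketch below is not part of the notebook above: `make_mat_vectorized` is a hypothetical name, and it has not been timed here, so treat any speed-up as something to measure rather than a given.

```python
import numpy as np


def make_mat_vectorized(seed, shape):
    """Hypothetical helper, not in the original notebook.

    Preallocates the matrix, then sets one entry per row in a single step.
    """
    mat = np.zeros(shape)
    # Integer-array indexing pairs row i with column seed[i],
    # so this sets mat[i, seed[i]] = 1 for every row at once.
    mat[np.arange(shape[0]), seed] = 1
    return mat


shape = (1000, 200)
seed = np.random.randint(0, shape[1], shape[0])
mat3 = make_mat_vectorized(seed, shape)
```

The result can be checked against the notebook's functions the same way `mat1` and `mat2` are compared, and timed with `%timeit make_mat_vectorized(seed, shape)`.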