@springcoil
Created November 16, 2015 14:43
Some NumPy fun with the new operators
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Playing with NumPy\n",
"* Although I've used NumPy for a while now, I haven't learned all of its functionality.\n",
"* Based on [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) by Wes McKinney"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Installed watermark.py. To use it, type:\n",
" %load_ext watermark\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/peadarcoyle/anaconda/envs/py35/lib/python3.5/site-packages/IPython/core/magics/extension.py:47: UserWarning: %install_ext` is deprecated, please distribute your extension(s)as a python packages.\n",
" \"as a python packages.\", UserWarning)\n"
]
}
],
"source": [
"%install_ext https://raw.githubusercontent.com/rasbt/watermark/master/watermark.py"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%load_ext watermark"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"11/16/2015 \n",
"\n",
"CPython 3.5.0\n",
"IPython 4.0.0\n",
"\n",
"numpy 1.10.1\n",
"pandas 0.17.0\n",
"scipy 0.16.0\n",
"matplotlib 1.4.3\n",
"\n",
"compiler : GCC 4.2.1 (Apple Inc. build 5577)\n",
"system : Darwin\n",
"release : 14.4.0\n",
"machine : x86_64\n",
"processor : i386\n",
"CPU cores : 8\n",
"interpreter: 64bit\n"
]
}
],
"source": [
"%watermark -d -v -m -p numpy,pandas,scipy,matplotlib"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"data2 = [[1,2,3,4],[5,6,7,8]]\n",
"arr2 = np.array(data2)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 2, 3, 4],\n",
" [5, 6, 7, 8]])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr2"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr2.ndim"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(2, 4)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr2.shape"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"dtype('int64')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr2.dtype"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[[ 0.00000000e+000, -2.68156175e+154],\n",
" [ 2.96439388e-323, 2.49653716e+237],\n",
" [ 0.00000000e+000, 0.00000000e+000]],\n",
"\n",
" [[ 3.95363904e-207, 0.00000000e+000],\n",
" [ 0.00000000e+000, -1.85611574e+204],\n",
" [ 0.00000000e+000, 8.34402697e-309]]])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# np.empty allocates memory without initializing it, so the values shown are arbitrary\n",
"np.empty((2,3,2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Operations between Arrays and Scalars\n",
"Arrays are important because they enable you to express batch operations on data without writing any for loops. This is usually called **vectorization**. Any arithmetic operation between equal-size arrays is applied elementwise, and an operation with a scalar, such as `1/arr` or `arr ** 0.5`, is applied to every element:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"arr = np.array([[1., 2., 3.], [5., 7., 9.]])"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1., 2., 3.],\n",
" [ 5., 7., 9.]])"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python 3.5 (PEP 465) and NumPy 1.10 add `@`, a new binary operator for matrix multiplication:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 14., 46.],\n",
" [ 46., 155.]])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr @ arr.T"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1. , 0.5 , 0.33333333],\n",
" [ 0.2 , 0.14285714, 0.11111111]])"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"1/arr"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1. , 1.41421356, 1.73205081],\n",
" [ 2.23606798, 2.64575131, 3. ]])"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr ** 0.5"
]
},
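{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the elementwise rule (not part of the original run, so the cell below is left unexecuted), arithmetic between two equal-size arrays pairs up the entries by position. The array repeats `arr` from above so the cell can run on its own."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"arr = np.array([[1., 2., 3.], [5., 7., 9.]])\n",
"\n",
"# Each entry is combined with the entry in the same position\n",
"print(arr * arr)      # squares every element\n",
"print(arr - arr)      # all zeros\n",
"print(arr + 2 * arr)  # the scalar 2 is applied to every element first"
]
},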
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Universal Functions: Fast Element-wise Array Functions\n",
"A universal function, or *ufunc*, is a function that performs elementwise operations on data in ndarrays. You can think of them as fast vectorized wrappers for simple functions that take one or more scalar values and produce one or more scalar results. Many ufuncs are simple elementwise transformations, such as `sqrt` or `exp`; others, such as `maximum`, take two arrays and return a single array."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"arr_2 = np.arange(20)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0. , 1. , 1.41421356, 1.73205081, 2. ,\n",
" 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ,\n",
" 3.16227766, 3.31662479, 3.46410162, 3.60555128, 3.74165739,\n",
" 3.87298335, 4. , 4.12310563, 4.24264069, 4.35889894])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.sqrt(arr_2)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"x = np.random.randn(9)\n",
"y = np.random.randn(9)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([-1.09886578, 1.14519147, -0.47733132, -0.50168259, -0.81878711,\n",
" 2.51633349, 1.06050518, 0.33874671, 1.29634673])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.maximum(x, y)"
]
},
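{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small sketch of the unary/binary distinction, using fixed inputs so the results are reproducible. The arrays `a` and `b` are illustrative and not from the original notebook, and the cell is left unexecuted. `np.exp` takes one array, `np.maximum` takes two, and `np.modf` is an example of a ufunc that returns more than one array."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"a = np.array([1.5, -2.25, 3.0])\n",
"b = np.array([0.5, 4.0, -1.0])\n",
"\n",
"print(np.exp(a))          # unary ufunc: e**x for every element\n",
"print(np.maximum(a, b))   # binary ufunc: larger value at each position\n",
"frac, whole = np.modf(a)  # two results: fractional and integral parts\n",
"print(frac, whole)"
]
},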
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Linear Algebra\n",
"Linear algebra, such as matrix multiplication, decompositions, determinants, and other square-matrix math, is an important part of any array library. Unlike in some languages such as MATLAB, multiplying two two-dimensional arrays with `*` is an elementwise product rather than a matrix dot product. So we'll rewrite the examples from Wes McKinney's book with the new **@** operator."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"x = np.array([[1., 2., 3.], [4., 5., 6.]])"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"y = np.array([[6., 23.], [-1, 7], [8, 9]])"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1., 2., 3.],\n",
" [ 4., 5., 6.]])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 6., 23.],\n",
" [ -1., 7.],\n",
" [ 8., 9.]])"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 28., 64.],\n",
" [ 67., 181.]])"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x @ y"
]
},
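{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the contrast with `*` concrete, here is a minimal unexecuted sketch. It uses small illustrative matrices `a` and `b` rather than the `x` and `y` above, because `*` needs compatible shapes. `*` multiplies matching entries, while `@` computes the matrix product, which for two-dimensional arrays agrees with `np.dot`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"a = np.array([[1., 2.], [3., 4.]])\n",
"b = np.array([[0., 1.], [1., 0.]])\n",
"\n",
"print(a * b)   # elementwise product: [[0, 2], [3, 0]]\n",
"print(a @ b)   # matrix product: [[2, 1], [4, 3]]\n",
"print(np.allclose(a @ b, np.dot(a, b)))   # same result for 2-D arrays"
]
},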
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`numpy.linalg` provides a standard set of matrix decompositions, along with operations such as inverse and determinant. A nice thing about these routines is that they are implemented under the hood using the same industry-standard libraries used by other languages such as MATLAB and R, namely BLAS, LAPACK, or possibly (depending on your NumPy build) the Intel MKL:"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from numpy.linalg import inv, qr, solve"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"X = np.random.randn(5, 5)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mat = X.T @ X"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 6.42234924, 2.08150619, 0.03211693, 2.66338807,\n",
" -0.33702587],\n",
" [ 2.08150619, 9.2522974 , 4.49475303, -4.7292663 ,\n",
" -2.22419419],\n",
" [ 0.03211693, 4.49475303, 7.35230855, -0.1390147 , 0.9943707 ],\n",
" [ 2.66338807, -4.7292663 , -0.1390147 , 10.35496047,\n",
" 2.15878809],\n",
" [ -0.33702587, -2.22419419, 0.9943707 , 2.15878809,\n",
" 1.75132664]])"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mat"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.361194 , -0.45142924, 0.33029988, -0.20256795, -0.44165023],\n",
" [-0.45142924, 1.25344022, -0.98194916, 0.33027445, 1.65541936],\n",
" [ 0.33029988, -0.98194916, 0.9247809 , -0.22180807, -1.4351774 ],\n",
" [-0.20256795, 0.33027445, -0.22180807, 0.25701021, 0.18960068],\n",
" [-0.44165023, 1.65541936, -1.4351774 , 0.18960068, 3.16954993]])"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inv(mat)"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.67943618, -1.85125015, 1.54681555, -0.43132427,\n",
" -2.81910453],\n",
" [ -1.85125015, 5.58861936, -4.7370874 , 1.12198117,\n",
" 8.99316886],\n",
" [ 1.54681555, -4.7370874 , 4.03747484, -0.92546231,\n",
" -7.68956068],\n",
" [ -0.43132427, 1.12198117, -0.92546231, 0.30131647,\n",
" 1.60421898],\n",
" [ -2.81910453, 8.99316886, -7.68956068, 1.60421898,\n",
" 15.07719752]])"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Solves mat @ Z = inv(mat) for Z, so the result is inv(mat) @ inv(mat)\n",
"solve(mat, inv(mat))"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"q, r = qr(mat)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ -7.2655001 , -2.88004555, -1.2515143 , -4.69457804,\n",
" 0.22060252],\n",
" [ 0. , -11.36480608, -6.11904856, 9.33857417,\n",
" 2.66440145],\n",
" [ 0. , 0. , -6.02157223, -4.9458143 ,\n",
" -2.54484136],\n",
" [ 0. , 0. , 0. , -2.7668364 ,\n",
" 0.29439234],\n",
" [ 0. , 0. , 0. , 0. ,\n",
" 0.25753703]])"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"r"
]
},
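{
"cell_type": "markdown",
"metadata": {},
"source": [
"A hedged sanity check for the routines above (a sketch, not part of the original run, so no output is shown). The product of a matrix and its inverse should be close to the identity, `solve(mat, b)` should agree with `inv(mat) @ b`, and `q @ r` should reconstruct `mat`. The seed and the right-hand side `b` are assumptions made only so the cell is repeatable and self-contained."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"from numpy.linalg import inv, qr, solve\n",
"\n",
"np.random.seed(0)            # assumed seed, only to make the sketch repeatable\n",
"X = np.random.randn(5, 5)\n",
"mat = X.T @ X\n",
"b = np.ones(5)               # illustrative right-hand side\n",
"\n",
"# Each check compares up to floating-point tolerance\n",
"print(np.allclose(mat @ inv(mat), np.eye(5)))\n",
"print(np.allclose(solve(mat, b), inv(mat) @ b))\n",
"q, r = qr(mat)\n",
"print(np.allclose(q @ r, mat))"
]
},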
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}