@deepanshululla
Last active November 18, 2018 17:12
Numpy Tutorial for blog part 1
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy: Data Science Library, Part 1\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Creating NumPy arrays"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3]\n"
]
}
],
"source": [
"import numpy as np\n",
"a=np.array([0,1,2,3])\n",
"print(a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Dimensions of array"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#print dimensions\n",
"\n",
"a.ndim"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(4,)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#shape\n",
"\n",
"a.shape"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(a)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1, 2],\n",
" [3, 4, 5]])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 2-D, 3-D....\n",
"\n",
"b = np.array([[0, 1, 2], [3, 4, 5]])\n",
"\n",
"b"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(b)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(2, 3)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Functions for creating arrays"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###### The arange function is similar to Python's built-in range"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.arange(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creates an array of the integers from 0 to 9 (the stop value 10 is excluded)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 2, 4, 6, 8])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b=np.arange(0,10,2)\n",
"b"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function signature is np.arange(start, stop, step); the stop value is excluded."
]
},
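{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of how the three arguments interact, assuming only the standard `np.arange(start, stop, step)` call. The variable names `evens` and `countdown` are illustrative and not part of the original notebook. The stop value is excluded in both directions, and a negative step counts down."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"evens = np.arange(0, 10, 2)      # start at 0, step by 2, stop before 10\n",
"countdown = np.arange(5, 0, -1)  # negative step counts down, stop before 0\n",
"\n",
"print(evens)\n",
"print(countdown)"
]
},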
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###### Linspace"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a=np.linspace(0,1,6)\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function signature is np.linspace(start, stop, num), where num is the number of points; the stop value is included by default."
]
},
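{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small illustration, `np.linspace` includes the stop value unless `endpoint=False` is passed. This is only a sketch; the names `closed` and `half_open` are illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"closed = np.linspace(0, 1, 6)                     # 6 points, 1.0 is included\n",
"half_open = np.linspace(0, 1, 5, endpoint=False)  # 5 points, 1.0 is excluded\n",
"\n",
"print(closed)\n",
"print(half_open)"
]
},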
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Zeros"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0., 0., 0., 0., 0.])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.zeros(5)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0., 0., 0.],\n",
" [0., 0., 0.]])"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.zeros((2,3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Ones"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 1., 1.],\n",
" [1., 1., 1.],\n",
" [1., 1., 1.]])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.ones((3,3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Identity Matrix"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 0., 0.],\n",
" [0., 1., 0.],\n",
" [0., 0., 1.]])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.eye(3)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 0.],\n",
" [0., 1.],\n",
" [0., 0.]])"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d = np.eye(3, 2)  # 3 rows, 2 columns; ones go on the main diagonal (diagonal index k=0 by default)\n",
"\n",
"d"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Diagonal Matrix"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 0, 0, 0],\n",
" [0, 2, 0, 0],\n",
" [0, 0, 3, 0],\n",
" [0, 0, 0, 4]])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a=np.diag([1,2,3,4])\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3, 4])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.diag(a)  # given a 2-D array, extract its diagonal as a 1-D array"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Random Arrays"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.38095827, 0.05454193, 0.5649424 , 0.35600484])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#create array using random\n",
"\n",
"#Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).\n",
"a = np.random.rand(4) \n",
"\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0.288684 , -0.32566062, 1.09628212, -0.30523579])"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a = np.random.randn(4)  # samples from the standard normal (Gaussian) distribution\n",
"\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Note: vectorized operations on NumPy arrays are typically much faster than equivalent Python loops"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"277 µs ± 6.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
]
}
],
"source": [
"L = range(1000)\n",
"%timeit [i**2 for i in L]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"916 ns ± 14.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n"
]
}
],
"source": [
"a = np.arange(1000)\n",
"%timeit a**2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Types"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3 4 5 6 7 8 9]\n"
]
},
{
"data": {
"text/plain": [
"dtype('int64')"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a=np.arange(10)\n",
"print(a)\n",
"a.dtype"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]\n"
]
},
{
"data": {
"text/plain": [
"dtype('float64')"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#You can explicitly specify which data-type you want:\n",
"\n",
"a = np.arange(10, dtype='float64')\n",
"print(a)\n",
"a.dtype"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0. 0. 0.]\n",
" [0. 0. 0.]\n",
" [0. 0. 0.]]\n"
]
},
{
"data": {
"text/plain": [
"dtype('float64')"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The default data type for np.zeros and np.ones is float64\n",
"\n",
"a = np.zeros((3, 3))\n",
"\n",
"print(a)\n",
"\n",
"a.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Each built-in data type has a character code that identifies its kind.**\n",
"\n",
"'b' − boolean\n",
"\n",
"'i' − (signed) integer\n",
"\n",
"'u' − unsigned integer\n",
"\n",
"'f' − floating-point\n",
"\n",
"'c' − complex-floating point\n",
"\n",
"'m' − timedelta\n",
"\n",
"'M' − datetime\n",
"\n",
"'O' − (Python) objects\n",
"\n",
"'S', 'a' − (byte-)string\n",
"\n",
"'U' − Unicode\n",
"\n",
"'V' − raw data (void)"
]
},
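{
"cell_type": "markdown",
"metadata": {},
"source": [
"These codes can be read from the `kind` attribute of an array's dtype. The following is a short sketch; the `samples` dictionary and its labels are illustrative and not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sample arrays, one per kind of data\n",
"samples = {\n",
"    'bool': np.array([True, False]),\n",
"    'int': np.array([1, 2]),\n",
"    'float': np.array([1.0, 2.0]),\n",
"    'complex': np.array([1 + 2j]),\n",
"    'unicode': np.array(['Ram', 'Robert']),\n",
"}\n",
"\n",
"# dtype.kind is the one-character code for the general kind of data\n",
"kinds = {name: arr.dtype.kind for name, arr in samples.items()}\n",
"print(kinds)"
]
},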
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"complex128\n"
]
}
],
"source": [
"d = np.array([1+2j, 2+4j]) #Complex datatype\n",
"\n",
"print(d.dtype)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bool\n"
]
}
],
"source": [
"b = np.array([True, False, True, False]) #Boolean datatype\n",
"\n",
"print(b.dtype)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dtype('<U6')"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = np.array(['Ram', 'Robert', 'Rahim'])\n",
"\n",
"s.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Indexing and slicing an array"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3 4 5 6 7 8 9]\n"
]
},
{
"data": {
"text/plain": [
"5"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a=np.arange(10)\n",
"print(a)\n",
"a[5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Indices begin at 0, as in other Python sequences such as lists."
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3\n"
]
}
],
"source": [
"# For multidimensional arrays, indexes are tuples of integers:\n",
"\n",
"a = np.diag([1, 2, 3])\n",
"\n",
"print(a[2, 2])"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 0, 0],\n",
" [0, 2, 0],\n",
" [0, 5, 3]])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a[2, 1] = 5 #assigning value\n",
"\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 3, 5, 7])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# slicing\n",
"a = np.arange(10)\n",
"\n",
"a[1:8:2]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is like saying: get every second value from index 1 up to, but not including, index 8. Here that yields the odd numbers."
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 1, 2, 3, 4, 10, 10, 10, 10, 10])"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#we can also combine assignment and slicing:\n",
"\n",
"a = np.arange(10)\n",
"a[5:] = 10\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sharing memory"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a=np.arange(10)\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 2, 4, 6, 8])"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b=a[::2]\n",
"b\n"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"id(a)==id(b)\n"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4548358352"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"id(a)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4548359632"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"id(b)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.shares_memory(a,b)"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([10, 2, 4, 6, 8])"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b[0]=10\n",
"b"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[10 1 2 3 4 5 6 7 8 9]\n"
]
}
],
"source": [
"print(a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Changing b also changes a, even though we modified only b, because a slice is a view that shares memory with the original array.**"
]
},
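{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another way to tell a view from a copy is the `base` attribute, which refers to the array that owns the memory and is `None` for an array that owns its own data. A minimal sketch, with illustrative names (`original`, `view`, `copied`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"original = np.arange(10)\n",
"view = original[::2]            # basic slicing returns a view\n",
"copied = original[::2].copy()   # copy() allocates new memory\n",
"\n",
"view_is_view = view.base is original   # the view's memory belongs to original\n",
"copy_owns_data = copied.base is None   # the copy owns its own memory\n",
"\n",
"print(view_is_view, copy_owns_data)"
]
},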
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 2, 4, 6, 8])"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a = np.arange(10)\n",
"\n",
"c = a[::2].copy() #force a copy\n",
"c"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.shares_memory(a, c)"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c[0] = 10\n",
"\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Indexing using masks"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [],
"source": [
"a=np.random.randint(0,20,15)"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 2, 1, 4, 0, 15, 10, 16, 9, 7, 7, 7, 2, 3, 14, 4])"
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ True, False, True, True, False, True, True, False, False,\n",
" False, False, True, False, True, True])"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mask=(a%2==0)\n",
"mask"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 2, 4, 0, 10, 16, 2, 14, 4])"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"extract_from_a=a[mask]\n",
"extract_from_a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Indexing with an array of Integers"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a = np.arange(0, 100, 10)\n",
"\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([20, 30, 20, 40, 20])"
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Indexing can be done with an array of integers, in which the same index may be repeated several times:\n",
"\n",
"a[[2, 3, 2, 4, 2]]"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 10, 20, 30, 40, 50, 60, -200, 80, -200])"
]
},
"execution_count": 80,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# New values can be assigned with this kind of indexing\n",
"\n",
"a[[9, 7]] = -200\n",
"\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 1
}