Z30G0D/Coursera_Multiclass_Ex3

## Coursera_Multiclass_Ex3
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 3 - Multiclass classification\n",
    "## Hello all!\n",
    "\n",
    "This is my solution for Exercise 3 of the Coursera Machine Learning course by Andrew Ng.\n",
    "The PDF is located [Here](https://github.com/merwan/ml-class/blob/master/ex3.pdf).\n",
    "\n",
    "Let's first import the packages for this exercise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.io import loadmat\n",
    "from scipy.optimize import minimize\n",
    "from IPython.display import Image  # used below to display the formula images\n",
    "from IPython.core.display import HTML \n",
    "import sys\n",
    "\n",
    "# for debugging - seeing entire array\n",
    "#np.set_printoptions(threshold=np.inf)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll use the loadmat function to load the Matlab data, since all the exercises are written in Matlab (why?!?!) and I had to convert them to Python. In some exercises I was assisted by various pieces of code from the internet; credits follow these lines."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = loadmat(\"Exercise3/ex3data1.mat\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "X is the samples matrix; it has 5000 samples of handwritten digits. Each row is a sample picture of 20x20 pixels, so each row in X contains 400 values, and overall X is 5000x400. y is the labels vector, so it is 5000x1. Let's check."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((5000L, 400L), (5000L, 1L), numpy.ndarray, numpy.ndarray)"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X=data['X']\n",
    "y=data['y']\n",
    "X.shape, y.shape, type(X), type(y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great. Please note that in my setup the loadmat function did not work with Python 2.7, so try Python 3 if you hit the same problem; it took me a while to figure this out.\n",
    "\n",
    "Let's define the sigmoid function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sigmoid(z):\n",
    "    \"\"\"sigmoid function\"\"\"\n",
    "    return 1/(1+np.exp(-z))"
   ]
  },
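  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The training run further down reports `overflow encountered in exp` warnings, which come from `np.exp(-z)` when `z` is a large negative number. As a hedged aside (this is not part of the original solution), here is a minimal sketch of a sigmoid that avoids the overflow by delegating to `scipy.special.expit`. The name `stable_sigmoid` is my own, introduced only for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.special import expit\n",
    "\n",
    "def stable_sigmoid(z):\n",
    "    \"\"\"Sigmoid via scipy.special.expit, which handles large |z| without overflow warnings\"\"\"\n",
    "    return expit(z)\n",
    "\n",
    "# large negative inputs no longer trigger an overflow in exp\n",
    "z = np.array([-1000.0, 0.0, 1000.0])\n",
    "print(stable_sigmoid(z))"
   ]
  },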
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And now for the cost function as stated in the exercise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def cost(theta, X, y, lamb):\n",
    "    \"\"\"computing the cost function according to logistic regression including regularization term\"\"\"\n",
    "    # Avoiding loops , vectorized approach\n",
    "    X = np.matrix(X)\n",
    "    m = len(X)\n",
    "    y = np.matrix(y)\n",
    "    theta = np.matrix(theta)\n",
    "    # first term including y=1 classes\n",
    "    first = np.multiply(-y, np.log(sigmoid(X * theta.T)))\n",
    "    # second term includes y=0 classes\n",
    "    second = np.multiply((1 - y), np.log(1 - sigmoid(X * theta.T)))\n",
    "    # reg term to avoid overfitting - excluding theta(0); note the parentheses, lamb / (2 * m)\n",
    "    reg = (lamb / (2 * m)) * np.sum(np.power(theta[:, 1:theta.shape[1]], 2))\n",
    "    # concluding cost\n",
    "    j = reg + np.sum(first - second) / m\n",
    "    return j\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cost function is not very different from the one in the earlier exercises. It is based on this formula:\n",
    "<br>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=\"https://i.stack.imgur.com/XbU4S.png\"/>"
      ],
      "text/plain": [
       "<IPython.core.display.Image object>"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Image(url= \"https://i.stack.imgur.com/XbU4S.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's write the gradients function. Remember, this is the derivative of the cost with respect to each parameter:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=\"https://i.stack.imgur.com/pYVzl.png\"/>"
      ],
      "text/plain": [
       "<IPython.core.display.Image object>"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Image(url= \"https://i.stack.imgur.com/pYVzl.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "def gradients(theta, X, y, lamb):\n",
    "    \"\"\"Calculating gradients (derivatives) for updating the parameters\"\"\"\n",
    "    X = np.matrix(X)\n",
    "    y = np.matrix(y)\n",
    "    theta = np.matrix(theta)\n",
    "    m = len(X)\n",
    "    z = X * theta.T\n",
    "    # error vector\n",
    "    error = sigmoid(z) - y\n",
    "    # regularized gradient for all parameters, shape (1, n)\n",
    "    grads = ((X.T * error) / m).T + ((lamb / m) * theta)\n",
    "    # the intercept (bias) term gets *no* regularization, so overwrite it\n",
    "    grads[0, 0] = np.sum(np.multiply(error, X[:, 0])) / m\n",
    "\n",
    "    # return a flat 1-D array, as scipy.optimize.minimize expects\n",
    "    return np.array(grads).ravel()\n"
   ]
  },
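  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A common way to sanity-check a hand-written gradient is to compare it against a finite-difference approximation of the cost. The sketch below is self-contained and uses its own helper names (`toy_sigmoid`, `toy_cost`, `toy_grad`, `numerical_grad`) and random toy data, all of which are assumptions for illustration and not part of the exercise. It uses plain NumPy arrays rather than `np.matrix`, and assumes the regularization term is `lamb / (2 * m)` applied to every parameter except the intercept. If the analytic gradient is right, the printed maximum difference should be very small."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def toy_sigmoid(z):\n",
    "    return 1 / (1 + np.exp(-z))\n",
    "\n",
    "def toy_cost(theta, X, y, lamb):\n",
    "    \"\"\"Regularized logistic regression cost; theta[0] is not penalized\"\"\"\n",
    "    m = len(y)\n",
    "    h = toy_sigmoid(np.dot(X, theta))\n",
    "    reg = (lamb / (2 * m)) * np.sum(theta[1:] ** 2)\n",
    "    return np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / m + reg\n",
    "\n",
    "def toy_grad(theta, X, y, lamb):\n",
    "    \"\"\"Analytic gradient of toy_cost\"\"\"\n",
    "    m = len(y)\n",
    "    grad = np.dot(X.T, toy_sigmoid(np.dot(X, theta)) - y) / m\n",
    "    grad[1:] += (lamb / m) * theta[1:]\n",
    "    return grad\n",
    "\n",
    "def numerical_grad(f, theta, eps=1e-6):\n",
    "    \"\"\"Central finite-difference approximation of the gradient of f at theta\"\"\"\n",
    "    approx = np.zeros_like(theta)\n",
    "    for j in range(theta.size):\n",
    "        step = np.zeros_like(theta)\n",
    "        step[j] = eps\n",
    "        approx[j] = (f(theta + step) - f(theta - step)) / (2 * eps)\n",
    "    return approx\n",
    "\n",
    "# random toy problem: 20 examples, bias column plus 3 features\n",
    "rng = np.random.RandomState(0)\n",
    "X_toy = np.hstack([np.ones((20, 1)), rng.randn(20, 3)])\n",
    "y_toy = (rng.rand(20) > 0.5).astype(float)\n",
    "theta_toy = rng.randn(4) * 0.1\n",
    "\n",
    "analytic = toy_grad(theta_toy, X_toy, y_toy, 1.0)\n",
    "numeric = numerical_grad(lambda t: toy_cost(t, X_toy, y_toy, 1.0), theta_toy)\n",
    "print(np.max(np.abs(analytic - numeric)))"
   ]
  },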
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can define a training session for every class.\n",
    "Eventually we will get a theta vector (weights) for every class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "def one_vs_all(X, y, num_labels, lamb):\n",
    "    \"\"\"Train one regularized logistic regression classifier per class\"\"\"\n",
    "    # number of columns (features)\n",
    "    params = X.shape[1]\n",
    "    # number of rows (examples)\n",
    "    rows = X.shape[0]\n",
    "    # creating a theta matrix (rows as number of classes, columns as number of parameters)\n",
    "    theta_vector = np.zeros((num_labels, params + 1))\n",
    "\n",
    "    # according to the pdf exercise, insert the bias term (intercept)\n",
    "    X = np.insert(X, 0, values=np.ones(rows), axis=1)\n",
    "\n",
    "    # For one vs all we go through each class and label it as 1, and all others as 0,\n",
    "    # so y_i is a vector of zeros and ones\n",
    "    for i in range(1, num_labels + 1):\n",
    "        # initialize theta for minimizing function\n",
    "        theta = np.zeros(params + 1)\n",
    "        # creating a y specific for our i category\n",
    "        y_i = np.array([1 if label == i else 0 for label in y])\n",
    "        y_i = np.reshape(y_i, (rows, 1))\n",
    "        # minimize the objective function - taken from scipy documentation\n",
    "        fmin = minimize(fun=cost, x0=theta, args=(X, y_i, lamb), method='TNC', jac=gradients)\n",
    "        theta_vector[i - 1, :] = fmin.x\n",
    "\n",
    "    return theta_vector"
   ]
  },
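  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The per-class labels built inside the training loop can also be produced without a list comprehension. As a small, hedged illustration on made-up labels (the array `y_toy` is an assumption, not the exercise data), comparing the label array against a class index gives a boolean array that converts directly to zeros and ones:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# toy labels shaped (rows, 1), like y loaded from the .mat file\n",
    "y_toy = np.array([[10], [1], [2], [1], [10]])\n",
    "\n",
    "# for class i, examples of that class become 1 and all others become 0\n",
    "for i in (1, 2, 10):\n",
    "    y_i = (y_toy == i).astype(int)\n",
    "    print(i, y_i.ravel())"
   ]
  },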
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\zeogo\\Miniconda2\\envs\\py35\\lib\\site-packages\\ipykernel_launcher.py:6: RuntimeWarning: divide by zero encountered in log\n",
      "  \n",
      "C:\\Users\\zeogo\\Miniconda2\\envs\\py35\\lib\\site-packages\\ipykernel_launcher.py:6: RuntimeWarning: invalid value encountered in multiply\n",
      "  \n",
      "C:\\Users\\zeogo\\Miniconda2\\envs\\py35\\lib\\site-packages\\ipykernel_launcher.py:7: RuntimeWarning: invalid value encountered in power\n",
      "  import sys\n",
      "C:\\Users\\zeogo\\Miniconda2\\envs\\py35\\lib\\site-packages\\ipykernel_launcher.py:3: RuntimeWarning: overflow encountered in exp\n",
      "  This is separate from the ipykernel package so we can avoid doing imports until\n",
      "C:\\Users\\zeogo\\Miniconda2\\envs\\py35\\lib\\site-packages\\ipykernel_launcher.py:5: RuntimeWarning: divide by zero encountered in log\n",
      "  \"\"\"\n",
      "C:\\Users\\zeogo\\Miniconda2\\envs\\py35\\lib\\site-packages\\ipykernel_launcher.py:5: RuntimeWarning: invalid value encountered in multiply\n",
      "  \"\"\"\n"
     ]
    }
   ],
   "source": [
    "# note to self: checking sizes of matrices called by fmin\n",
    "# training the algorithm, 10 classes, arbitrary regularization\n",
    "all_theta = one_vs_all(X, y, 10, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[-8.05522912e+00,  0.00000000e+00,  0.00000000e+00, ...,\n",
       "          2.18619279e-02,  2.85921938e-07,  0.00000000e+00],\n",
       "        [-5.90990431e+00,  0.00000000e+00,  0.00000000e+00, ...,\n",
       "          6.72129871e-02, -6.85937921e-03,  0.00000000e+00],\n",
       "        [-8.71826341e+00,  0.00000000e+00,  0.00000000e+00, ...,\n",
       "         -2.56532495e-04, -1.14937641e-06,  0.00000000e+00],\n",
       "        ...,\n",
       "        [-1.33464325e+01,  0.00000000e+00,  0.00000000e+00, ...,\n",
       "         -6.15496460e+00,  7.10885457e-01,  0.00000000e+00],\n",
       "        [-8.55318810e+00,  0.00000000e+00,  0.00000000e+00, ...,\n",
       "         -1.89349871e-01,  8.57477934e-03,  0.00000000e+00],\n",
       "        [-1.29493922e+01,  0.00000000e+00,  0.00000000e+00, ...,\n",
       "          2.58438619e-04,  4.11482696e-05,  0.00000000e+00]]), (10L, 401L))"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_theta, all_theta.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So now we have a matrix with 10 rows and 401 columns (1 bias term and 400 weights for each label)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [],
   "source": [
    "def predict_all(X, all_theta):\n",
    "    rows = X.shape[0]\n",
    "    params = X.shape[1]\n",
    "    num_labels = all_theta.shape[0]\n",
    "    \n",
    "    # same as before, insert ones to match the shape\n",
    "    X = np.insert(X, 0, values=np.ones(rows), axis=1)\n",
    "    \n",
    "    # convert to matrices\n",
    "    X = np.matrix(X)\n",
    "    all_theta = np.matrix(all_theta)\n",
    "    \n",
    "    # calculating our hypotheses,\n",
    "    h = sigmoid(X * all_theta.T)\n",
    "    \n",
    "    # create array of the index with the maximum probability\n",
    "    maximum = np.argmax(h, axis=1)\n",
    "    \n",
    "    # because our array was zero-indexed we need to add one for the true label prediction\n",
    "    maximum = maximum + 1\n",
    "    \n",
    "    return maximum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('accuracy is:', 0.9748)\n"
     ]
    }
   ],
   "source": [
    "y_pred = predict_all(data['X'], all_theta)\n",
    "correct = [1 if a == b else 0 for (a, b) in zip(y_pred, data['y'])]\n",
    "accuracy = (sum(map(int, correct)) / float(len(correct)))\n",
    "print(\"accuracy is:\", accuracy)"
   ]
  },
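  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy computation above loops over pairs of predictions and labels. A shorter, equivalent idea is to compare the two arrays element-wise and take the mean of the resulting booleans. The sketch below uses small made-up arrays (`y_pred_toy` and `y_true_toy` are assumptions for illustration) shaped like the notebook's `(rows, 1)` arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# toy predictions and labels, three of the four agree\n",
    "y_pred_toy = np.array([[1], [2], [10], [4]])\n",
    "y_true_toy = np.array([[1], [2], [3], [4]])\n",
    "\n",
    "# element-wise comparison gives booleans; their mean is the fraction correct\n",
    "accuracy_toy = np.mean(np.asarray(y_pred_toy) == y_true_toy)\n",
    "print(\"accuracy is:\", accuracy_toy)"
   ]
  },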
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "97.48% is our final accuracy on the training set, nice.\n",
    "For remarks, please email me at tomer@nahshon.net"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}