CreatCodeBuild/how_to_use_numpy_correctly_for_neural_nets.ipynb

## how_to_use_numpy_correctly_for_neural_nets.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Motivation\n",
    "I create this notebook to point out common mistakes among students who have done the __\"Build Your First Neural Network\"__ project\n",
    "\n",
    "__Erros in this notebook are intentional. Do not fix them.__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear ALgebra refresher\n",
    "Before we dive into deep neural networks, let's review some linear algebra and some numpy gotcha\n",
    "\n",
    "__This toturial focus on numpy programming, not math.__  \n",
    "__For a mathamatical refresher, see [this great tutorial](https://www.youtube.com/watch?v=sYlOjyPyX3g&t=419s)__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from numpy import array, dot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[14]\n",
      "[[1 4 9]]\n"
     ]
    }
   ],
   "source": [
    "# First\n",
    "# dot(A, B) != A * B\n",
    "A = array([[1,2,3]])\n",
    "B = array([1,2,3])\n",
    "dot_C = dot(A, B)\n",
    "element_wise_c = A * B\n",
    "print(dot_C)\n",
    "print(element_wise_c)\n",
    "\n",
    "# dot is the dot product\n",
    "#  *  is the element-wise product"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-8-b76d9186a652>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      4\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      5\u001b[0m \u001b[1;31m# this line will fail, because it make no sense to dot product two matrices with shape (1, 3) and (1, 3)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 6\u001b[1;33m \u001b[0mdot_C\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)"
     ]
    }
   ],
   "source": [
    "# But, these is a gotcha\n",
    "A = array([[1,2,3]])\n",
    "B = array([[1,2,3]])\n",
    "\n",
    "# this line will fail, because it make no sense to dot product two matrices with shape (1, 3) and (1, 3)\n",
    "dot_C = dot(A, B)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[14]]\n"
     ]
    }
   ],
   "source": [
    "# However\n",
    "A = array([[1,2,3]])\n",
    "B = array([[1,2,3]])\n",
    "\n",
    "# This is the same as if B = [1,2,3]. But actually, this is the correct way. \n",
    "dot_C = dot(A, B.T)\n",
    "print(dot_C)\n",
    "\n",
    "# dot([[1,2,3]], [1,2,3]) works probably just because it's a numpy short hand.\n",
    "# Because mathamatically speaking, it make less sense."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 4]\n",
      " [2 5]\n",
      " [3 6]]\n",
      "[[ 7  0]\n",
      " [ 8  1]\n",
      " [ 9  2]\n",
      " [10  3]]\n"
     ]
    }
   ],
   "source": [
    "# Now, let's consider bigger matrix\n",
    "A = array([\n",
    "        [1,2,3],\n",
    "        [4,5,6]\n",
    "    ])\n",
    "B = array([\n",
    "        [7,8,9,10],\n",
    "        [0,1,2,3]\n",
    "    ])\n",
    "\n",
    "\n",
    "# First let's see that is a transpose matrix .T\n",
    "print(A.T)\n",
    "print(B.T)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "X:\n",
      "[[ 7 12 17 22]\n",
      " [14 21 28 35]\n",
      " [21 30 39 48]]\n",
      "\n",
      "Y:\n",
      "[[ 7 14 21]\n",
      " [12 21 30]\n",
      " [17 28 39]\n",
      " [22 35 48]]\n"
     ]
    }
   ],
   "source": [
    "X = dot(A.T, B)\n",
    "Y = dot(B.T, A)\n",
    "\n",
    "print('X:')\n",
    "print(X)\n",
    "print()\n",
    "print('Y:')\n",
    "print(Y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ True  True  True]\n",
      " [ True  True  True]\n",
      " [ True  True  True]\n",
      " [ True  True  True]]\n",
      "[[ True  True  True  True]\n",
      " [ True  True  True  True]\n",
      " [ True  True  True  True]]\n",
      "[[ True  True  True]\n",
      " [ True  True  True]\n",
      " [ True  True  True]\n",
      " [ True  True  True]]\n",
      "[[ True  True  True  True]\n",
      " [ True  True  True  True]\n",
      " [ True  True  True  True]]\n"
     ]
    }
   ],
   "source": [
    "# You see, X and Y are just transpose of each other\n",
    "print(X.T == Y)\n",
    "print(X == Y.T)\n",
    "print(Y == X.T)\n",
    "print(Y.T == X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 7 12 17 22]\n",
      " [14 21 28 35]\n",
      " [21 30 39 48]]\n"
     ]
    }
   ],
   "source": [
    "# Also consider\n",
    "A = array([\n",
    "        [1, 4],\n",
    "        [2, 5],\n",
    "        [3, 6],\n",
    "    ])\n",
    "B = array([\n",
    "        [7,8,9,10],\n",
    "        [0,1,2,3]\n",
    "    ])\n",
    "\n",
    "X = dot(A, B)\n",
    "print(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-34-aba7d292d99b>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      1\u001b[0m \u001b[1;31m# But this will fail\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mY\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m      3\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mY\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mValueError\u001b[0m: shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)"
     ]
    }
   ],
   "source": [
    "# But this will fail, because the shape doesn't match\n",
    "Y = dot(B, A)\n",
    "print(Y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[11 16 21]\n",
      " [19 26 33]\n",
      " [27 36 45]]\n",
      "[[ 50 122]\n",
      " [ 14  32]]\n"
     ]
    }
   ],
   "source": [
    "# Therefore, dot(A, B) != dot(B, A)\n",
    "# also consider\n",
    "A = array([\n",
    "        [1, 4],\n",
    "        [2, 5],\n",
    "        [3, 6],\n",
    "    ])\n",
    "B = array([\n",
    "        [7, 8, 9],\n",
    "        [1, 2, 3]\n",
    "    ])\n",
    "X = dot(A, B)\n",
    "Y = dot(B, A)\n",
    "print(X)\n",
    "print(Y)\n",
    "\n",
    "# As you can see, \n",
    "# just because 2 operations with the same arguments but different order get computed without error,\n",
    "# it doesn't mean the semantics are the same.\n",
    "\n",
    "# But, for *, A * B == B * A because it's just element-wise!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[70 80 90]\n",
      " [10 20 30]]\n",
      "[[70 80 90]\n",
      " [10 20 30]]\n",
      "[[70 80 90]\n",
      " [10 20 30]]\n",
      "[[70 80 90]\n",
      " [10 20 30]]\n"
     ]
    }
   ],
   "source": [
    "# Now, let's consider scalar or 1 unit vector/matrix\n",
    "\n",
    "# salar\n",
    "A = 10\n",
    "B = array([\n",
    "        [7, 8, 9],\n",
    "        [1, 2, 3]\n",
    "    ])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[70 80 90]\n",
      " [10 20 30]]\n",
      "[[70 80 90]\n",
      " [10 20 30]]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "shapes (1,) and (2,3) not aligned: 1 (dim 0) != 2 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-46-00b31614be64>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      7\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      8\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 9\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     10\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mValueError\u001b[0m: shapes (1,) and (2,3) not aligned: 1 (dim 0) != 2 (dim 0)"
     ]
    }
   ],
   "source": [
    "# 1 unit vector\n",
    "A = array([10])\n",
    "B = array([\n",
    "        [7, 8, 9],\n",
    "        [1, 2, 3]\n",
    "    ])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[70 80 90]\n",
      " [10 20 30]]\n",
      "[[70 80 90]\n",
      " [10 20 30]]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "shapes (1,1) and (2,3) not aligned: 1 (dim 1) != 2 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-47-61d8df2d1573>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      7\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      8\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 9\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     10\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mValueError\u001b[0m: shapes (1,1) and (2,3) not aligned: 1 (dim 1) != 2 (dim 0)"
     ]
    }
   ],
   "source": [
    "# 1 unit matrix\n",
    "A = array([[10]])\n",
    "B = array([\n",
    "        [7, 8, 9],\n",
    "        [1, 2, 3]\n",
    "    ])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[70 80 90]]\n",
      "[[70 80 90]]\n",
      "[[70 80 90]]\n",
      "[[70 80 90]] \n",
      "\n",
      "1 unit vector\n",
      "[[70 80 90]]\n",
      "[[70 80 90]]\n",
      "[70 80 90]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "shapes (1,3) and (1,) not aligned: 3 (dim 1) != 1 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-56-9e972ee5436b>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m     15\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     16\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 17\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m: shapes (1,3) and (1,) not aligned: 3 (dim 1) != 1 (dim 0)"
     ]
    }
   ],
   "source": [
    "# But, what if B has only one layer?\n",
    "# saclar\n",
    "A = array(10)\n",
    "B = array([[7, 8, 9]])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A), '\\n')\n",
    "\n",
    "# 1-unit vector\n",
    "print('1 unit vector')\n",
    "A = array([10])\n",
    "B = array([[7, 8, 9]])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 unit matrix\n",
      "[[70 80 90]]\n",
      "[[70 80 90]]\n",
      "[[70 80 90]]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "shapes (1,3) and (1,1) not aligned: 3 (dim 1) != 1 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-55-6183a2d96180>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      6\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      7\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 8\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m: shapes (1,3) and (1,1) not aligned: 3 (dim 1) != 1 (dim 0)"
     ]
    }
   ],
   "source": [
    "# 1-unit matrix\n",
    "print('1 unit matrix')\n",
    "A = array([[10]])\n",
    "B = array([[7, 8, 9]])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Have you noticed that [10] dot [[7,8,9]] != [[10]] dot [[7,8,9]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[70 80 90]\n",
      "[70 80 90]\n",
      "[70 80 90]\n",
      "[70 80 90] \n",
      "\n",
      "[70 80 90]\n",
      "[70 80 90]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "shapes (1,) and (3,) not aligned: 1 (dim 0) != 3 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-59-714314867e51>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m     11\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     12\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 13\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     14\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mValueError\u001b[0m: shapes (1,) and (3,) not aligned: 1 (dim 0) != 3 (dim 0)"
     ]
    }
   ],
   "source": [
    "# Now, what if B is also a vector?\n",
    "A = array(10)\n",
    "B = array([7, 8, 9])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A), '\\n')\n",
    "\n",
    "A = array([10])\n",
    "B = array([7, 8, 9])\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(B, A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "And I will stop here. That's enough numpy matrix multiplication\n"
     ]
    }
   ],
   "source": [
    "# The above is another gotcha!!!\n",
    "# matrix dot product only make sense when it's a matrix,\n",
    "# that is, has shape (x, y), even if x or y could be 1!\n",
    "\n",
    "print(\"And I will stop here. That's enough numpy matrix multiplication\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Now, let's dive into deep neural network\n",
    "__Just simple fully connected neural nets. No convolution, no recurrence__  \n",
    "For example, if layer A and layer B are adjacent and B is received input from A.\n",
    "\n",
    "A has 10 nodes and B has 15 nodes.\n",
    "\n",
    "Then there are 150 connections between A and B. Each connection is an edge if you think of it as a computational graph.\n",
    "\n",
    "Therefore, B received 150 values(scalars) from A. Each node in B received 10 values from A. Each node in A ouput 15 values to B.\n",
    "\n",
    "Therefore, no matter it's forward pass or backward pass, 150 values get computed between A and B each pass.\n",
    "\n",
    "The forward pass computes the activation. The backward pass computes the gradient and updating values.\n",
    "\n",
    "These 150 edges contains 150 weights, 1 weight each."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]\n",
      " [  2   4   6   8  10  12  14  16  18  20  22  24  26  28  30]\n",
      " [  3   6   9  12  15  18  21  24  27  30  33  36  39  42  45]\n",
      " [  4   8  12  16  20  24  28  32  36  40  44  48  52  56  60]\n",
      " [  5  10  15  20  25  30  35  40  45  50  55  60  65  70  75]\n",
      " [  6  12  18  24  30  36  42  48  54  60  66  72  78  84  90]\n",
      " [  7  14  21  28  35  42  49  56  63  70  77  84  91  98 105]\n",
      " [  8  16  24  32  40  48  56  64  72  80  88  96 104 112 120]\n",
      " [  9  18  27  36  45  54  63  72  81  90  99 108 117 126 135]\n",
      " [ 10  20  30  40  50  60  70  80  90 100 110 120 130 140 150]]\n",
      "[[  1   2   3   4   5   6   7   8   9  10]\n",
      " [  2   4   6   8  10  12  14  16  18  20]\n",
      " [  3   6   9  12  15  18  21  24  27  30]\n",
      " [  4   8  12  16  20  24  28  32  36  40]\n",
      " [  5  10  15  20  25  30  35  40  45  50]\n",
      " [  6  12  18  24  30  36  42  48  54  60]\n",
      " [  7  14  21  28  35  42  49  56  63  70]\n",
      " [  8  16  24  32  40  48  56  64  72  80]\n",
      " [  9  18  27  36  45  54  63  72  81  90]\n",
      " [ 10  20  30  40  50  60  70  80  90 100]\n",
      " [ 11  22  33  44  55  66  77  88  99 110]\n",
      " [ 12  24  36  48  60  72  84  96 108 120]\n",
      " [ 13  26  39  52  65  78  91 104 117 130]\n",
      " [ 14  28  42  56  70  84  98 112 126 140]\n",
      " [ 15  30  45  60  75  90 105 120 135 150]]\n"
     ]
    }
   ],
   "source": [
    "# Therefore, assuming\n",
    "A = array([[1,2,3,4,5,6,7,8,9,10]])\n",
    "B = array([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]])\n",
    "# it only make semantic sense to write\n",
    "whatever_been_computed = dot(A.T, B)\n",
    "print(whatever_been_computed)\n",
    "# or\n",
    "whatever_been_computed = dot(B.T, A)\n",
    "print(whatever_been_computed)\n",
    "\n",
    "# no matter it's computing the weights, gradient, error, or what ever intermedia values between 2 layers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 1  2  3  4  5  6  7  8  9 10]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "shapes (10,) and (15,) not aligned: 10 (dim 0) != 15 (dim 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-67-c890709b7c01>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      6\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mT\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      7\u001b[0m \u001b[1;31m# and this will fall and it hurts\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 8\u001b[1;33m \u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mT\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m: shapes (10,) and (15,) not aligned: 10 (dim 0) != 15 (dim 0)"
     ]
    }
   ],
   "source": [
    "# You can't store A and B in plain vectors\n",
    "A = array([1,2,3,4,5,6,7,8,9,10])\n",
    "B = array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])\n",
    "\n",
    "# Because A.T make zero impact\n",
    "print(A.T)\n",
    "# and this will fall and it hurts\n",
    "dot(A.T, B)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]]\n",
      "[[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]]\n",
      "[[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]]\n",
      "[[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]]\n",
      "[[ 1]\n",
      " [ 2]\n",
      " [ 3]\n",
      " [ 4]\n",
      " [ 5]\n",
      " [ 6]\n",
      " [ 7]\n",
      " [ 8]\n",
      " [ 9]\n",
      " [10]\n",
      " [11]\n",
      " [12]\n",
      " [13]\n",
      " [14]\n",
      " [15]]\n",
      "[[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]]\n"
     ]
    }
   ],
   "source": [
    "# But, if one of you layer has only 1 node\n",
    "A = array([[1]])\n",
    "B = array([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]])\n",
    "\n",
    "print(A * B)\n",
    "print(B * A)\n",
    "print(dot(A, B))\n",
    "print(dot(A.T, B))\n",
    "print(dot(B.T, A)) # oh, and this one's transpose is just\n",
    "print(dot(B.T, A).T) # which == all of the first 4 equations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__And the above exmple is the reason some students start to do crazy combinations of `*`, `dot` and `.T`__\n",
    "\n",
    "Eventually, They will probably gets the shapes matched. But, does that mean they understand how it actually work?\n",
    "\n",
    "Also, even if the shapes match, the value could be wrong."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 7 16 13]\n",
      " [14 26 23]\n",
      " [21 36 33]]\n",
      "[[ 49  56  63]\n",
      " [112 148 154]\n",
      " [189 228 249]]\n"
     ]
    }
   ],
   "source": [
    "# For example\n",
    "A = array([\n",
    "        [1,2,3],\n",
    "        [4,5,6]\n",
    "    ])\n",
    "B = array([\n",
    "        [7,8,9],\n",
    "        [0,2,1]\n",
    "    ])\n",
    "\n",
    "print(dot(A.T, B))\n",
    "print(dot((A * B).T, B))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### End\n",
    "I hope this will help you to clear your mind and understand what exactly happend when you are implementing neural netowrks with numpy as your matrix manipulation tool"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}