raytroop/12-1_CNN.ipynb

## 12-1_CNN.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# <center>Nuts and Bolts of Convolution Neural Networks</center>\n",
    "<center>\n",
    "        \"Shan-Hung Wu & DataLab\"\n",
    "        <br>\n",
    "        \"Fall 2017\"\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
   "[nthu-datalab/ml/labs/12-1_CNN/12-1_CNN.ipynb](https://github.com/nthu-datalab/ml/blob/7f09e4c2f9704495d184edbd55b7aca71c418407/labs/12-1_CNN/12-1_CNN.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this lab, we introduce two datasets, **mnist** and **cifar**, then we will talk about how to implement CNN models for these two datasets using tensorflow. The major difference between mnist and cifar is their size. Due to the limit of memory size and time issue, we offer a guide to illustrate typical **input pipeline** of tensorflow. Let's dive into tensorflow!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## MNIST Dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We start from a simple dataset. MNIST is a simple computer vision dataset. It consists of images of handwritten digits like:\n",
    "\n",
    "<center><img style='width: 30%' src='imgsrc/MNIST.png' /></center>\n",
    "\n",
    "It also includes labels for each image, telling us which digit it is. For example, the labels for the above images are 5, 0, 4, and 1. Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers:\n",
    "\n",
    "<center><img style='width: 30%' src='imgsrc/MNIST2.png' /></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The MNIST data is hosted on [Yann LeCun's website](http://yann.lecun.com/exdb/mnist/). We can directly import MNIST dataset from Tensorflow. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting dataset/mnist/train-images-idx3-ubyte.gz\n",
      "Extracting dataset/mnist/train-labels-idx1-ubyte.gz\n",
      "Extracting dataset/mnist/t10k-images-idx3-ubyte.gz\n",
      "Extracting dataset/mnist/t10k-labels-idx1-ubyte.gz\n"
     ]
    }
   ],
   "source": [
    "from tensorflow.examples.tutorials.mnist import input_data\n",
    "import tensorflow as tf\n",
    "import os\n",
    "\n",
    "dest_directory = 'dataset/mnist'\n",
    "# check the directory\n",
    "if not os.path.exists(dest_directory):\n",
    "  os.makedirs(dest_directory)\n",
    "# import data\n",
    "mnist = input_data.read_data_sets(\"dataset/mnist/\", one_hot=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Softmax Regression on MNIST"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before jumping to *Convolutional Neural Network* model, we're going to start with a very simple model with a single layer and softmax regression.\n",
    "\n",
    "We know that every image in MNIST is a handwritten digit between zero and nine. So there are only ten possible digits that a given image can be. We want to give the probability of the input image for being each digit. That is, input an image, the model outputs a ten-dimension vector.\n",
    "\n",
    "This is a classic case where a softmax regression is a natural, simple model. If you want to assign probabilities to an object being one of several different things, softmax is the thing to do."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Create the model (Softmax Regression)\n",
    "\n",
    "x = tf.placeholder(tf.float32,[None, 784])  # flatten into vector of 28 x 28 = 784\n",
    "y_true = tf.placeholder(tf.float32, [None, 10])  # true answers\n",
    "W = tf.Variable(tf.zeros([784, 10]))  # Weights\n",
    "b = tf.Variable(tf.zeros([10]))  # bias\n",
    "y_pred = tf.matmul(x, W) + b  # y = Wx + b\n",
    "\n",
    "# Define loss and optimizer\n",
    "cross_entropy = tf.reduce_mean(\n",
    "    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, \n",
    "                                            logits=y_pred))  # our loss\n",
    "train_step = tf.train.GradientDescentOptimizer(0.5).minimize(\n",
    "    cross_entropy)  # our optimizer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After creating our model and defining the loss and optimizer, we can start training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Accuracy: 91.8%\n"
     ]
    }
   ],
   "source": [
    "# Training and Testing\n",
    "\n",
    "with tf.Session() as sess:\n",
    "  # initialize the variables we created\n",
    "  sess.run(tf.global_variables_initializer())  \n",
    "  # run the training step 1000 times\n",
    "  for _ in range(1000):\n",
    "    batch_xs, batch_ys = mnist.train.next_batch(100)\n",
    "    # feed training data x and y_ for training\n",
    "    sess.run(train_step, feed_dict={\n",
    "            x: batch_xs,\n",
    "            y_true: batch_ys\n",
    "        })  \n",
    "\n",
    "  # Testing\n",
    "  correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))\n",
    "  accuracy_op = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))\n",
    "  accuracy = sess.run(accuracy_op, feed_dict={\n",
    "          x: mnist.test.images,\n",
    "          y_true: mnist.test.labels\n",
    "      })\n",
    "  # feed our testing data for testing\n",
    "  print('Accuracy: %.1f%%' % (accuracy * 100))  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "From the above result, we got about 92% accuracy for *Softmax Regression* on MNIST. In fact, it's not so good. This is because we're using a very simple model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multilayer Convolutional Network on MNIST"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We're now jumping from a very simple model to something moderately sophisticated: a small *Convolutional Neural Network*. This will get us to around 99.2% accuracy, not state of the art, but respectable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the diagram of the model we're going to build:\n",
    "\n",
    "<center><img style='width: 30%' src='imgsrc/mnist_deep.png' /></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To create this model, we need to create a lot of weights and biases. Instead of doing this repeatedly, let's create two handy functions to do it for us."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def weight_variable(shape):\n",
    "  initial = tf.truncated_normal(shape, stddev=0.1)\n",
    "  return tf.Variable(initial)\n",
    "\n",
    "def bias_variable(shape):\n",
    "  initial = tf.constant(0.1, shape=shape)\n",
    "  return tf.Variable(initial)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TensorFlow gives us a lot of flexibility in **convolution** and **pooling** operations. How do we handle the boundaries? What is our stride size? For now, we're going to choose the vanilla version. To keep our code cleaner, let's also abstract those operations into functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Our convolutions uses a stride of one and are zero padded so that the output is the same size as the input.\n",
    "# Our pooling is plain old max pooling over 2x2 blocks.\n",
    "\n",
    "def conv2d(x, W):\n",
    "  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')\n",
    "\n",
    "def max_pool_2x2(x):\n",
    "  return tf.nn.max_pool(\n",
    "      x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now implement our layers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# [batch_size, height, width, channel]\n",
    "x_image = tf.reshape(x, [-1, 28, 28, 1])\n",
    "\n",
    "# First Convolutional Layer\n",
    "W_conv1 = weight_variable([5, 5, 1, 32]) # (filter_height, filter_width, number of input channels, number of output channels)\n",
    "b_conv1 = bias_variable([32])\n",
    "\n",
    "# convolve x_image with the weight tensor, add the bias, then apply the ReLU function\n",
    "h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)\n",
    "# and finally max pool \n",
    "h_pool1 = max_pool_2x2(h_conv1) # It will reduce the image size to \"14x14\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Second Convolutional Layer\n",
    "\n",
    "W_conv2 = weight_variable([5, 5, 32, 64])\n",
    "b_conv2 = bias_variable([64])\n",
    "\n",
    "h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)\n",
    "h_pool2 = max_pool_2x2(h_conv2) # It will reduce the image size to \"7x7\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Densely Connected Layer\n",
    "\n",
    "W_fc1 = weight_variable([7 * 7 * 64, 1024]) \n",
    "b_fc1 = bias_variable([1024])\n",
    "\n",
    "h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64]) # flatten\n",
    "h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To reduce overfitting, we will apply [*dropout*](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf) before the readout layer. The idea behind dropout is to train an ensemble of model instead of a single model. During training, we drop out neurons with probability $p$, i.e., the probability to keep is $1-p$. When a neuron is dropped, its output is set to zero. These dropped neurons do not contribute to the training phase in forward pass and backward pass. For each training phase, we train the network slightly different from the previous one. It's just like we train different networks in each training phrase. However, during testing phase, we **don't** drop any neuron, and thus, implement dropout is kind of like doing ensemble. Also, randomly drop units in training phase can prevent units from co-adapting too much. Thus, dropout is a powerful regularization techique to deal with *overfitting*. \n",
    " \n",
    "We create a placeholder for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Dropout\n",
    "\n",
    "keep_prob = tf.placeholder(tf.float32)\n",
    "h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we add a layer, just like for the one layer softmax regression above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Readout Layer\n",
    "\n",
    "W_fc2 = weight_variable([1024, 10])\n",
    "b_fc2 = bias_variable([10])\n",
    "\n",
    "y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After defining our model, we then define our loss and optimizer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Define loss and optimizer\n",
    "\n",
    "cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_conv)) # our loss\n",
    "train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) # our optimizer\n",
    "correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_true, 1))\n",
    "accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check how well does this model do! Note that we will include the additional parameter **keep_prob** in feed_dict to control the dropout rate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting dataset/mnist/train-images-idx3-ubyte.gz\n",
      "Extracting dataset/mnist/train-labels-idx1-ubyte.gz\n",
      "Extracting dataset/mnist/t10k-images-idx3-ubyte.gz\n",
      "Extracting dataset/mnist/t10k-labels-idx1-ubyte.gz\n",
      "step 0, training accuracy 14.0%\n",
      "step 1000, training accuracy 98.0%\n",
      "step 2000, training accuracy 96.0%\n",
      "step 3000, training accuracy 100.0%\n",
      "step 4000, training accuracy 98.0%\n",
      "step 5000, training accuracy 100.0%\n",
      "step 6000, training accuracy 100.0%\n",
      "step 7000, training accuracy 100.0%\n",
      "step 8000, training accuracy 100.0%\n",
      "step 9000, training accuracy 100.0%\n",
      "step 10000, training accuracy 100.0%\n",
      "step 11000, training accuracy 100.0%\n",
      "step 12000, training accuracy 100.0%\n",
      "step 13000, training accuracy 100.0%\n",
      "step 14000, training accuracy 100.0%\n",
      "step 15000, training accuracy 100.0%\n",
      "step 16000, training accuracy 100.0%\n",
      "step 17000, training accuracy 100.0%\n",
      "step 18000, training accuracy 98.0%\n",
      "step 19000, training accuracy 100.0%\n",
      "test accuracy 99.2%\n"
     ]
    }
   ],
   "source": [
    "# Training and Testing\n",
    "\n",
    "# Re-import data for initializing batch\n",
    "mnist = input_data.read_data_sets(\"dataset/mnist\", one_hot=True)\n",
    "\n",
    "with tf.Session() as sess:\n",
    "  sess.run(\n",
    "      tf.global_variables_initializer())  # initialize the variables we created\n",
    "  # run the training step 20000 times\n",
    "  for i in range(20000):\n",
    "    batch = mnist.train.next_batch(50)\n",
    "    if i % 1000 == 0:\n",
    "      train_accuracy = accuracy.eval(feed_dict={\n",
    "          x: batch[0],\n",
    "          y_true: batch[1],\n",
    "          keep_prob: 1.0\n",
    "      })\n",
    "      print('step %d, training accuracy %.1f%%' % (i, train_accuracy * 100))\n",
    "    train_step.run(feed_dict={\n",
    "        x: batch[0],\n",
    "        y_true: batch[1],\n",
    "        keep_prob: 0.5\n",
    "    })  # feed into x, y_ and keep_prob for training\n",
    "\n",
    "  print('test accuracy %.1f%%' % (100 * accuracy.eval(feed_dict={\n",
    "      x: mnist.test.images,\n",
    "      y_true: mnist.test.labels,\n",
    "      keep_prob: 1.0\n",
    "  })))  # feed for testing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The final testing accuracy should be approximately 99.2%"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cifar-10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Actually MNIST is a easy dataset for the beginner. To demonstrate the power of *Neural Networks*, we need a larger dataset *CIFAR-10*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. Here are the classes in the dataset, as well as 10 random images from each:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center><img style='width: 40%' src='imgsrc/CIFAR10.png' /></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before jumping to a complicated neural network model, we're going to start with **KNN** and **SVM**. The motivation here is to compare neural network model with traditional classifiers, and highlight the performance of neural network model.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### K Nearest Neighbors (KNN) on CIFAR-10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keras offers convenient facilities that automatically download some well-known datasets and store them in the ~/.keras/datasets directory. Let's load the CIFAR-10 in Keras:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "X_train shape: (50000, 32, 32, 3)\n",
      "Y_train shape: (50000, 10)\n",
      "X_test shape: (10000, 32, 32, 3)\n",
      "Y_test shape: (10000, 10)\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n"
     ]
    }
   ],
   "source": [
    "# Loading Data\n",
    "from keras.datasets import cifar10\n",
    "from keras.utils import np_utils\n",
    "import numpy as np\n",
    "import math\n",
    "\n",
    "(X_train, y_train), (X_test, y_test) = cifar10.load_data()\n",
    "# convert class vectors to binary vectors\n",
    "Y_train = np_utils.to_categorical(y_train)\n",
    "Y_test = np_utils.to_categorical(y_test)\n",
    "\n",
    "print('X_train shape:', X_train.shape)\n",
    "print('Y_train shape:', Y_train.shape)\n",
    "print('X_test shape:', X_test.shape)\n",
    "print('Y_test shape:', Y_test.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The datas are loaded as integers, so we need to cast it to floating point values in order to perform the division:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Data Preprocessing\n",
    "# normalize inputs from 0-255 to 0.0-1.0\n",
    "X_train = X_train.astype('float32')\n",
    "X_test = X_test.astype('float32')\n",
    "X_train = X_train / 255.0\n",
    "X_test = X_test / 255.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For simplicity, we also convert the images into the grayscale. We use the [Luma coding](https://en.wikipedia.org/wiki/Grayscale#Luma_coding_in_video_systems) that is common in video systems:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWMAAAC5CAYAAAD9EF4cAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAAPYQAAD2EBqD+naQAAIABJREFUeJztnXlwXNd15r/TGxo7QIAACIIQCZAiSJGUZEeilliSRcWW\nXDOK4ngcO4412UoVOUk5SU3FTo090lhVSdkZjz1xQk8qM1GiONLYLmWRbEmUbMmWLGqxFos7KYK7\nSOwk9t7v/NEA+51zL7sbjW7ywTy/KlThNu677773Lu67/Z17ziFjDBRFUZRLS+BSd0BRFEXRyVhR\nFMUX6GSsKIriA3QyVhRF8QE6GSuKovgAnYwVRVF8gE7GiqIoPkAnY0VRFB+gk7GiKIoP0MlYURTF\nB1RsMiai3yeio0Q0S0SvEtF1lTqXopSCjlHFT1RkMiaiXwPwVQAPALgWwDsAdhBRayXOpygLRceo\n4jeoEoGCiOhVAK8ZYz47VyYAJwH8lTHmK2U/oaIsEB2jit8IlbtBIgoDeD+AP5//zBhjiOgHAG50\n1G8B8GEAxwDEyt0f5eeSKIDVAHYYY0YXerCOUeUisOAxWvbJGEArgCCAQfH5IID1jvofBvDPFeiH\n8vPPpwA8WsJxOkaVi0XRY7QSk/FCOQYAn//UR9Hdvhzf/LdncP89d6IlNcUqTUxMWAe+MsTrtC1v\nZuW+5Pj537/+43fwR7dejfbkNKsTT9syzdsp/tnOkUlWTsYy53/f3X8Km3u70NYYYXW6lzdY7a4k\nLtH3Gb7ICqdz5S/uPIaHblqNcYRZnZ+dy0DSFOT9bakNsvJrI+nzvz+19yg+ctUavAdep7kparVb\nHyBWruVdwdnJ2fO/P/NWP+58Xy+Oj82wOsvb2612p86OsHLa8GuKRKvO/75z1xHctKUHtZ6+jE7M\n4PuvHwTmxs5F4BgAfP7zn0d3dze++c1v4v7770cgwJ/n2bNnrQMPHTrEyp2dnaxcXV19/vdvfetb\n+I3f+A2r3WQyabU7MsLv4bvvvsvKsVhuLO3evRubN29GS0sLq9PV1WW1W1dXl7ecyeSe1T/90z/h\n05/+NFKpFKtz4sQJq91olI+vpqYmVu7v7z//+4svvohbbrkF09P8f7W11ZbzIxH+fyfP430mzz//\nPG6//XacOXOG1Vm5cqXV7ugoX8ym02mrzvyz27lzJ2666SbWl7GxMezYsQNYwBitxGQ8AiANQP4X\ntgMYcNSPAcD3X3kTtdVRnBoewSM7foRoJom7runDR67tAwCcHQtbBx5L8+6v7FjOyn2J3KCuqwqj\nr70ZXXE+Cc2m7Ml4THx2IMkfRDycG5DhUBBN9TVoE5PZqrZGq90e4ufebPjEFUnlrqchEsKW5XUY\nAx9sZ8keFK0h3t+Oen5fTnr+WaLhIDqb6hA3vM7y1hqr3SYxGTfwriAazrURjYSwYlk9Jvn/Jdpb\n+D8dAEQzs6ycNvyaqmpyE1QylcYb+0+wgRpPnj9JqZJBaWP0+99HbW0tTp06hUceeQREhNtvvx3b\ntm0DAAwPD1sHykVEd3c3K3snu5qaGqxZs8aajBOJhNVuVVUVK8vJY2YmN7bC4TCampqwfDn//3BN\nxg0NfBHR2MjHsXcynu+vfFm4Xh41NXx8yRfD5GRuwVNVVYW2tjb2GQB0dHRY7crJV57He5+qqqrQ\n0dHBXlQXalfimoxra2sBZK/3jTfeYM/N88yKHqNln4yNMUkiehPANgBPAOeNI9sA/NWFjrv/njux\nrqsTX/y/j+Kh3/l1dKTGL1RVuYyora7CXTdtZKv0wbNTeOQHb5fcZslj9P77sW7dOnzxi1/EQw89\nhGAweKGqymVEbW0t7rrrLjbxDw0N4bHHHltQO5WSKf4ngH+YG/CvA/hjADUA/uFCB5wan0CgJoqZ\nRAL9Y2OYqOZfXavSKeuYdIJ/LTyT4KvniWBuKZeiACaCEaTFt/zkLF+dAkBAvAUDU/zFUB3JrWaC\nBFSHDIJBfkwqxld/AJAIVrPyWJLXafNMOMYAmRShqY4f07OMr1YBoM7wexMhvjJpCHlW8pQtzw7y\nVdwg6q12azraWHlonN+H0cncSz+RSmN0chaxBD93wrFKCob4sKuO8tXMbDK3EjTGIJXJYHImd6+m\nZ+xnVgILHqODg4OIRqOIxWI4efKktWqUKy7XZ+fOnWNl70rOGMNWnvPMOK43Ho/nreP9yhwIBBCJ\nRKyXh6u/csU9Lp657G86nbakDNfXfnluWfaucAOBAKLRKI4fP87quHZ+rV69mpXlt5OxsbHzvyeT\nSYyNjVnX7boPITFG6+vt/4/54+bvg/deTU1NWfULUZHJ2Bjznbn9ml9C9qvfzwB82Bhjf49TlEuA\njlHFb1TMgGeM2Q5g+0KPu+4qlzF78Xyw74qKtNvd3ly4Ugn8Sm9lfA/e11WZdq/sbClcqQR6K9Rf\noPQxev3111egN8CNN1q76sqC1KnLRaX6u2HDhoq029vb6+t2fReb4rpNFZqMN6yuSLurOyo1GS8v\nXKkE3r+qQpPxygpNxqsqcx8Www033FCRdnUyzrJx48aKtFupyXjt2rVlaccPW9sAAKdGRpFEThMa\nb+TaVVfA1oyX13DdaXqWa7DjdbYlPxbk75+Qy0o6zfW3LdX8NqWabf0oXc37S8Y27kxnuOY1GeT6\nb1tYbFcAQGKnQSBua1FEvN2kOE97rb1tbd1KvpFg/6htME2HxLahBmFZjzm2+4htdZOOLYmNNbw/\n9XW1rBxxaHgUyrU7bcvQF4VTp04xq7rcETBvXfcit3DJnRGunQdyu5ZLK5XttIsthHJXBGDrwS5k\nu+FwOG+5mDYAW4OVmndzs72okZOc1JABW3uWOr5rS6zUf6WODxTeVQLYOr33Wbp06EL4bmWsKIpy\nOaKTsaIoig/QyVhRFMUH6GSsKIriA3xjwDs7PomAJ3ZDxHDBfNZhrO+s44aO8XPcyHcuYhuYTDMX\n5oPnxqw67cIAtnY1jydwlGwjxj6xLz8tHDwAYDbIjTVTwvhw1uHYQsIoOTtrGxtCVcLQJs7dGLTf\nucmz3NU0bWxnkmnhwBFLzeb9OwBMz3DDTNJx7oYaYaBKCKeVpG2wqmrIORZMJOznejEYGxtjLq/S\nbdkbZ2IeaeST8StczgHSkcFVR7oBX3cdj4vvchQZHJRxkQqTdUzM4TJMyXO5+ivdlKWR0nXvZDsu\nl2R5buk0MztrO19JN2uXUVIa+VxGSXkur7HWdd5C6MpYURTFB+hkrCiK4gN0MlYURfEBvtGMg2QQ\nIo8GI2LczjocDJpEaMtAimuYIy69qJFrU93L66w668Te+FCU62YT07Z+lJrmfZkg+9ZGq7kWOhvh\nOtopR3+jMX6uqCNSWMsyrq+nZ7lu2zTD48ICQGKah1ykkH0fUkmu5Q6MCb3ajmmDuqgI1jRjX5MM\n9FNfxdcE52ZtB5Tm6pyGl0pdGq8PImI6sXTGcOm00tlB6p4yEI+rjgx9Cdgaq3Q4kU4Vrv7JOMSu\ndqUuXkwwJJcG29bGg05JTdXVFxnP2BUlTzrNSF3cFXhJ6tcujVt+5nLokQ4lXqcalzNPIcq+Miai\nB4goI372lfs8ilIqOkYVP1KplfEeZGPDzi8p7deeolxadIwqvqJSk3FKQxEqPkfHqOIrKmXAW0dE\n7xFRPxF9i4hWVeg8ilIqOkYVX1GJlfGrAH4TwEEAKwA8COBFItpkjLEtSec7kkbY800xOc2NRXFj\nGwUmE9xIQUlejk7azgOZVrHZvMGOwBYNC4PdDDeitYbs27ZaGPl2T9nJKYfBjSNtnTycZcbxRTkg\nHFcaq+xcdeEM709UZB2JzfJN7gAg43dR2jY4nBKJJWPg7XYtt5ONBjO8zozDkBETz210lBtzMmRH\nF5uNeyJiJRatKJQ0RomIOUFII4801s0f46WYfHHSmOWKaCaNUNKRQf7d1Y7LCUQa7JYtW8bKLsNg\nochpgH0fpNOHK5mrPMbl9HH48OG8dVxZRySua5JGyYEBOzWi7J/XKOlqsxCVyIG3w1PcQ0SvAzgO\n4OMAHi73+RRloegYVfxIxbe2GWPGiegQgLwRmF/e9S4inmzDoQCwobsDG7tXVLqLio85/N4oDp8e\nRSiUW7HFF78yZhQ7Rnfu3MlWdMFgEOvXr0dfX19Z+6MsLY4cOYIjR46wb0Yu9+lCVHwyJqI6ZAf5\nI/nq3bxlHZY35SSD+rBm3lWAtStbsHZlC2rrctLF4NgUHn2u9OzQkmLH6E033cT2/LriKSiXHz09\nPejp6WEB6X2RHZqI/hLAk8h+7VsJ4L8DSALI27NAOoFgOqezhFNSH7I1sJkU39C9Isgvp6/eznBR\nLzfqJ2xdOSBuS6SavxiWpeygOjfUck07MWhv5t83xa9peJa3U+1wbAkGubZW68gGYkSQHyN0s6qI\nfe8iGa7TJmccTgIp7iSQCPJ7NV1jO4pUW7qpfa+iIrBRWGQziVbZWSoCnpVxOLQ4p49Sx6jM3ix1\nxWKCw0jngdZWOw2W1GBd7UptV2rELgcJuYJ3OUTI7MrSscGlccsMIsVkFJEOM65j5DVKJxDA1mZl\nuy79upj+yToue0C+4EfFZESRVGJl3AXgUQAtAIYB/ATADcaY0bxHKcrFQ8eo4jsqYcD7ZLnbVJRy\nomNU8SMaKEhRFMUH+CZQUGM0gmU1OZ2mk7jFPBy3tapakf25KcI1sJ5G28CSTPM6sYytrYVE8Pgp\nw8/tiMOO5eLc17TY2u7IED9wdJxbXBsdenBdlD+i2SpbBx8Vulk0IMr1drtNrfy6I2fsfZ7pAL/u\noNh/7dpLWRXi7dY69rtWRXl/2uv5c4pN2pboiCdwVLVxRCi6CNTV1TENUmqGrmA3UrOsq+M6e2cn\nT1wA2FquS6eVGqas49KDpV4tg9gDtj49NsaTL8ig9oB9TS59Vd4b2T9XNmuZ8Vr2xYXUmV16u9Rz\nZSB5oDhtX+rp3nPLPcjFoCtjRVEUH6CTsaIoig/QyVhRFMUH6GSsKIriA3xjwFtWX4u2ppyI357k\nQVgaQrZzRiLCjUUz0ikhbRuYagNcvI/DNm4Fq7mR4tS5EVaOOTafb2nlgn9H2Da6dIIbpn42yp0d\nMs12NoHO5TxQS6zKfmTnDDe+NQlHkYaYfe/GhMPCdNI2PkVr+H1IG14nFLI3tmeEgTTk2GAfCovA\nRsJIWZW2DXheZ51Y0L6ei0FTUxPzwCtmY7+sI50SXEbQYtqVxsMTIqiTDBwEAGvWrGFlaXgDbK9C\nGSBHZrsG7GA8LqcK2R9p4HIZHKWBzGWMk4Y2aSiUAYlc53LVKSZ7tXyWXsNlKd6ZujJWFEXxAToZ\nK4qi+ACdjBVFUXyAbzRjGAPK5DQYEs4NNfZec8yKbLczQnYaS9laUH2Ua0pVIVvbSdRzLS0+zvXr\ndxO2Ztwhgv7Uxu1MwcuF3ksiOM+ZGbsvNdP8wmvTth43PcsvPD3Dteigw2Fm9xTvy1TVMqtOVR3X\n4xqb+Mb3QMTWHJNxfm+mx09bdRDnISASAa5/Lqu2n1vUowVGApcuXZ1XJ5TaqMshQuqeUtN0ZZSW\nAd1dQX+kJin1S1ewdtmuK1i7S0f24nK8GB3lz9OVQVpqxjIwv8uxRerVrvsgtXN5jS79Woa3lP13\n1XE9W+nQ430GLseXQix4ZUxEHyCiJ+ZS1mSI6G5HnS8R0WkimiGi54gob5xYRSknOkaVpUgpMkUt\ngJ8B+AwAy6xNRJ8D8AcA7gNwPYBpADuIyF7uKEpl0DGqLDkWvJY2xjwD4BkAILcD9mcBPGSM+d5c\nnXsBDAK4B8B3Su+qohSHjlFlKVJWAx4RrQHQAeCH858ZYyYAvAbgxnKeS1FKQceo4lfKbcDrQPZr\noUw7Ozj3twsSm01hZjon4p+J8veECTs2+gf4Z9VBvlk+Y+x3TVJEYKsP25vNR2a40aVRiPHjDkeR\ngxku5teGbWPcAeHLMCKaOUd2VoKREZnJwDYeBgPcsDF0nGcZiTkcACbEZU/FbUeLUyP8XOuvuYKV\nr7/+w9YxTSL7x+jp/VadM+/8mJXHzg2xcptjidDoGQ91C0+i4KXkMTozM8MMT9J5wGUQk0iDkjS8\nAbaRz5XpeXycP2Np0HMZxKQRzWUQkwZF2T/XNfb397Oyy4FDnkse43JSkUY0l7HzvffeY+Ubb+Tv\n0zvuuMM6Rhrejh8/btV54403WNlluJTPxRv9rZhsIhLd2qYoiuIDyr0yHkA26Vk7+MqjHUDeDJJP\nvLkP1R430GCQcF1PJ67rXZnnKOXnndeODuP1o8OIBHLS7/TiskOXPEafffZZtuIJhULYsmULrrnm\nmsX0R1ni7Nq1C7t27WLb2YrJhygp62RsjDlKRAMAtgHYBQBE1ABgK4C/yXfs3e/fiK5lua8PdVFd\ntCvA1jXLsXXNcrR4xkP/yBT+y7+Ulh16MWP0Qx/6EFasWHG+7JIPlMuPLVu2YMuWLUymOHnyJL76\n1a8uqJ0FT8ZEVItsWvP5pUoPEV0NYMwYcxLA1wF8gYgOAzgG4CEApwD8e752DQIwlPuHOzbE9cpg\npyMaf4hP2G0kMuYGbE0sQUKjy9iBWoJJseFbaMbG4ezwSop/tuv4MavO0ARv5+RZfo3RsN2XxnYu\nkNY12hNAtdCnGldxbbc1bq8kx4VGfHLgjFVnbJhrua++xnU0qm62jvngtl9h5d5r77TqdK/exMoD\nO59i5cj0PuuYjlBuZTwetPVt1q8KjVGJDM7T09Nj1ZGb/2UQIJdzgNRpXRqs1IRlYBtX8JuhIf48\n9+7da9WR2qh0vHA5P7S1tbFyU1OTVUcG9LniCj5GXVq0zAbt0nbPnOHj9oc//CEruwL2fPSjH2Vl\nqTMDdibtl156yaojn5N3MpbXWwylrIx/AcALyBpBDID56f8fAfy2MeYrRFQD4G8BNAF4CcBdxpj8\n/0GKUj50jCpLjlL2Gf8YBQx/xpgHATxYWpcUZXHoGFWWIirMKoqi+ADfBAqKTU9gxrPnNxrimlfa\n8dowIlA8klxbCwRt5yupysbStp46m+K62LEAL49E7I2uBya5blvdc7VVp1sEHBqZPcjKPV0rIOla\nybe+ZhyZkc+dO8fKnV3drHziONc2AcCEuZ7evnKVVScQ4fr0yCgPPrN/n63tRsJ8D+cyoScCQJ0I\nBFRfz7MAj5w7bB2zKp3bYxqfuTSBgiYnJ5kGKfeSupz95P5aqf+6NGNZxxWAXu7BLcZ6PzjIt1a7\nNG5v8HzA3s985ZVXWsfIoPUu/VcG45HHHDp0yDpG3hupMwO2Bi918TfffNM6RureXqPsPDJgkstY\nOzw8zMrevdJS7y4GXRkriqL4AJ2MFUVRfIBOxoqiKD5AJ2NFURQf4BsDXsgkEPY4YHS3N7C/JxyZ\nnjMZbggKRfi7JRGwDQkBkb34XNIO1DIa4oL+T8E3jh+F7UzS3sINVVdt6LTqHD/EDV57T/EsGLVd\ntut3g2j3lMOZZHySZ1bIhPmG83VbfsE6ZtfPXmVlCtj3IVLHN+83ETeWyKwKAPDe4bdYefdbdoAV\nEobXBmHQ6w3aQWN6O3MGlWmHo86lYNUqbvR0OWcUE/RHIh06XAYxeS6Z2cMVVKezk4/Ja6+91qqz\nZ88eVj54kBuZu7q6rGOk08fhw7YBVhqZpXHu+uuvt455+eWXWdkV2Kihgc8T0ogq+wbYzi4vvvii\nVUeey5UBxevkAfBrcAVqKoSujBVFUXyATsaKoig+QCdjRVEUH+AbzXhFUx26W3IaTFs912zGM7am\nOTbJdZlAPQ9cU5WxQw0kA3yj/rGEvVH/UExsJG/ljheNHVynAoDWENdpm5rsOOVvzR5g5St6eMCc\n9pW9drvLRNCVoO1w0tHNN9BXV4sN67W23tW1mm/ez6RsjevKau7A0dzIz72miuuAAHDk1WdY+Z2p\nAavOIPH7F8jw/q1qtIOs1HsyVdfMXpoQEm1tbVi5MqfrNzfz8eZyzpBaqXQ4cDmKBAJ8jSQdL4DC\ngeK9/ZxH6p7SwcPV7saNG1l59erV1jHSacKl7UoHE9kXqb8CwPr161nZpcPKdqQdQwaSB2yN2JXN\nWp7LZQ+Q9897LtczK0TZs0MT0cNzn3t/nrpQe4pSbnSMKkuRsmeHnuNpZIN1d8z9fLKk3ilKaegY\nVZYclcgODQBxY8zwBf6mKBVFx6iyFKmUAe82IhokogNEtJ2I7A2pinJp0TGq+IpKGPCeBvA4gKMA\negH8BYCniOhG40qFO0d1VRh11TnjWm2KGwGqo7bR7GyIb4Y/mOQb6je02kaMs+d4NKUTEdtY9OT+\nd1k5PMUNV1ddY298bxaGg0TE3ty/pm8LK28SDgA1NXZfIkH+iNprbUNHUGygrxJGoljCNiz1ruLX\n0OzIziCNTXUJbpQYfezL1jH1Z3i23sw5R/QqEXRtAtxBoTpg34fA8tw1BlKLNuCVNkarq1kGB2lo\ncxmLZPQu6ZzhikQms21IxxEA+MlPfsLK0qHjhhtusI6RDhAuQ9vVV/Nog9JA5nJ+kFlFXM5AMrqa\nzMDhMqKtW7eOlVtbW6060olGGtq+/e1vW8dI45zrPkgD3MTEhFVH3nPvc3I9s0KUfTI2xnzHU9xL\nRLsB9AO4DdnsC4pySdExqviRim9tm0sAOYJsTrILDvS/f3k/aj1xgiOBAO64ciV+ab29ClUuH/YO\njWLv0BheOJqLyTyVWLiraT6KHaOPP/44W9GFQiHccMMNzlWocvlw5MgRHD16FG+9lQsFcMmzQ7sg\noi4ALQDsjJcefvvmDehdnvuatzxsJ1RULj+uamvBVW0t+KW1ua/Yh0bGcd8TL+c5amEUO0Z/9Vd/\nlcWjcO2NVS4/enp60NPTw2JTnDhxAl/+si3j5aOs2aHnfh5AVo8bmKv3ZQCHAOzI1+5sMokpj7a5\nDFwjnkrbmvGE4RP2kYxw6CCeQQIA2tq5VtXQYUf5/8AVfKN7hvht2rxps3WM1LxmZ+w343XXXMXK\nRjiypB1ZR1Ki3aNHTlp1GoRW2dvDv03EHX2R2tqBQ7usOuMp7gDQKDJs1A3b9t+9p3j5RNCesGaq\nuRadEnri8bSdgfgccp9NIv+qo1JjNBaLsRWP1D1dTgkyyI90AnEFDpKT/Nq1a6060uFAbhpxBd6R\nwYOmpqasOjfffDMry/67nB/kKnD//v1WHakjX3UV/19wZcaQ53rnnXesOvKapOQvs3EAdlYR1ypW\natwyqwtg3z/v879YmnG+zLufAbAFwL3IZt09jewA/2/GmPJ+t1SUC6NjVFlyVCI79J2ld0dRFo+O\nUWUpooGCFEVRfIBvAgUlE2kk4jl96kgd19LezdgGvX0iW3FLNw8ssn7zB61jzs3y/YKxkK2BLRPZ\nn+MTXDs9uNvOODs2xvsycc4OFBIR+4Fjs1wPnpy09zLGZ/i5Yw596woRvOWZf3mMlWenbD3ujjvu\nYOW9e3dbdcZj/BpCdVzLbGuxdfzXo3yP8HDS3j+aCHIdUsa1TwzZ+6KbozlNdGjC1jovBolEgtkG\nZGAgGWQHsAOty4A5W7dutY6Re1ylfgnYe1zlPtjXX3/dOkbqpyMjI1YdqY1KLVdq3q46Lv1XBv15\n+OGHWdm1j/djH/sYK//0pz+16sh7LvdBe/eFX+hcrnMXk/m7v7//gueW2bCLQVfGiqIoPkAnY0VR\nFB+gk7GiKIoP0MlYURTFB/jGgNe4agVaVrScL++b4Aa7Z/f2y0NwYBc3pHW38cy2fV12hubnfsoN\nG9MTDqE9zjeSZ8QG7kTC3tCdTPLgNSlHnVSC15Eb6tOOjeLpFO9LKGi/P/eITM/jE9zIQmQfMzhy\nnJVXtNvOL4PD/N7EqnhAlQMheyN8RmTOnh1yGDKr+XFBYaA6G7YDt+z0BByacTixXAy6u7uxZk0u\nq8rAAM9isnPnTuuYV1/lz6a7u5uVr7ySZ1wBgB07uO+Jy2iWEGOpUBmwDY4uJxVZRzovuJwZZDsy\n8zMAvPLKK6wsgyG5DGSnT/Ps6TIbt6uODN7kMn7Ka3A5hkhnHBkMCbADDHkNei6HmkLoylhRFMUH\n6GSsKIriA3QyVhRF8QG+0Yz3jM5gxJO5+eWdb7G/v3PolDwEa3q4hhSt4u+W7zz2v61jphPcyePs\niK0Zh4R8FQoLbdShb2WMcB6RZQAmLQMDySAsvAwAZLjTRNIVTCjD+/Op//w7rLx8hR0w6bF//DtW\n3nfQdgBIpvn9DIjLrg3xQDkAMJPimuOUCBwPAJEkvwaK8/uSCtiaMaK5zxKxS5Md+vjx40wffe65\n59jf33zTdgbq6+tjZRlc6Bvf+IZ1jAxcMzg4aNWRGmsxTgoyiI4rjr4ck1JfdQUKku24nD5ku3/y\nJ3/Cyl1ddqjcr33ta6z89ttvW3Vk/+R1ywQJQHHauXwGrmuS5/LqyqWE0FzQypiI/oyIXieiibmU\nNf9KRJYFgoi+RESniWiGiJ4jIjvslKJUAB2jylJloTLFBwB8A8BWAHcACAN4lojOv+6J6HMA/gDA\nfQCuBzANYAcRaYBi5WKgY1RZkixIpjDGfMRbJqLfBDAE4P0A5pNyfRbAQ8aY783VuRfAIIB7AHjT\n3ShK2dExqixVFmvAa0I2XuwYABDRGgAdAH44X8EYMwHgNQA3LvJcilIKOkaVJUHJBjzKqtdfB/AT\nY8y+uY87kB340uIwOPe3C/L487sR8mzQPjvEM+DULbMz797z6d9l5Q2re1n5f3zxj61jzp2VTgj2\n+ygNLsxTihsfSFqyABgjjHEOA15GGvBS0lhiG6YI3ICXSNhGvjXrN7Hyh+7+JCt3Opxf3nntJVZ+\n9tmnrTrVUTvilZfJlN2X6Tjvr3G87mdn+b0JEN+YHwnZGYhbq3MR4mZMAEN5e5al7GP08ceZE8GZ\nM3yMtrS0yEPwh3/4h6y8cSPPInPfffdZx7icECTSeCSNaKUa8Ao5ebiMXbIdaSADgC1beGb0e++9\nl5VdWbJ/9KMfsfJ3v/tdq44rU4oXV/Q0mR3Eda+k04Z0JgFso6k3O7gr43QhFrObYjuAjQBuLlRR\nUS4ROkaLjZjGAAAUAElEQVSVJUNJkzER/TWAjwD4gDHGuzwYQDbvWDv4yqMdgL0vxcPM1AR7Q82/\na+13lnI5kU4lkE4ncKT/QO4zx/Y+SSXG6MTEhHMVpVzepFIppFIp7Nu37/xncitfMZSSkPSvAfwy\ngFuNMSe8f5tLeT4AYBuAXXP1G5C1bP9NvnZr6hryyhTK5UkwFEEwFEFPb26/7szMFA4dtIPhz1Op\nMdrQ0JBXplAuT0KhEEKhEJOgpqamnAlU87azkMpEtB3AJwHcDWCa6Hz65XFjznsnfB3AF4joMIBj\nAB4CcArAv+drO1pTh0hVboP22DBXBZta7UA20fpWVq5bzjeON7bZzg7vHecBcsIhO5CI1HtSRaSp\nlHvhi8kUDKG1OXVmsRKbdawKG5qaWDkc5vrWjCODiPdeA0AiZeuHmVmp/fG+BBzaOQX5/ayptTW9\nJpHNurWVP6dola0Ze50lQhMXHraVHKO1tbXMiUBOxu3t9nhrFNfa0dGRtwwA7777LisXE6TGFRhI\nIh02XMfIMSr1YJfTh8TVrswOLQP4uLKkFJN92+WM4cWl3crPZDZuwO7vihX2/CP1am/Z9cwKsdCV\n8e8hqyD8SHz+WwAeAQBjzFeIqAbA3yJryX4JwF3GmEvjNqVcbugYVZYkC91nXNRWOGPMgwAeLKE/\nirIodIwqSxUNFKQoiuIDfBMoaGX3FahryGmfJs31oWiVvef1zBmuK8/M8G+Zw46sr2mhy0YcmlLa\n5A+WYqurQFWYZ0puabT3nFKAn9sYWbbbNUGu7WYcOm1afDQyzIOe87zVWU6eeo+VO6/oseo0N3Ld\nrP8wD/DvzZQ8D4X4+31d3yarjgwSnhL7ldNpW5f0apmB2KUZtr29vWhubj5fluPCtef1xAlmP7T2\nr7oyNEtd1qV7yjquoO8SGTTHtS+60P5ll2bsCiZfiEJB4QE7s/batXb4EHkNe/fuZWW5pxiw+3vN\nNddYdXp7uc+CSwfP9wzkHuRi0JWxoiiKD9DJWFEUxQfoZKwoiuIDdDJWFEXxAb4x4EVr6lBTn9sg\nf+11W9nfx0dtQ8fJAzwb9B6xAXx62s5MfPtH7mHlnh7bKPCWyNjwyqs/YuVg0HYUuX3bf2DlzZtt\no0AiyQ2K0zO8vzPTtrEhEeNGgqRjK+yZMW4M+fb/+xYrhx3v3CNHj7Fyz1WbrTpN9dyZZPQc7/+A\nwwMtWssNrYGwvfl9eIybFGVgGZfzS8BjWJqZsh0ELgZ1dXXMieO2225jfx8assMX7dq1i5UnhFFZ\nlgHgE5/4BCvL4EIA8OKLL7KyzDriMvrJdrdu3WrVkUZZaXB0OWfIY1zGROkgs337dlZ2GfAOHjzI\nytdee61VRzpnSIOoNKAC2efoxZVBWmZXcWXuyBecyfVcC6ErY0VRFB+gk7GiKIoP0MlYURTFB/hG\nM06bANIeT1YjMh6HonYm4gN7uR43KTTjVs8G/XnqWnmg9emMfQuaVvBA180ruUNEJGpv7k+E+We7\njxy36sRmxvi5hWYcm7WdKIzIZp0hOzRfwnA9a2KCO32EyX7nLm9fycrxhO1xMjLCtd22dn7vlrXa\ngXEoyM+VcPgipDPiQ9E/cuiH5MkYLYMRXSyMMUwnlEF1ZGAbwM4YLbXE1lYe7AoA2traWNkVIEdm\nU5aONK6+SB1592478p3UhKVm7ArM4+qfROrIZ8/yseXSjOU1uoLWDwzwsb5yJR/XrkBM8j4UE9TL\nFTpV9tnbbinB5cueHZqIHiaijPh5asE9U5QS0DGqLFXKnh16jqeRDdbdMffzSSjKxUHHqLIkqUR2\naACIG2MKJ/JSlDKjY1RZqpQ1O7SH2+a+Ih4gou1EtMxxrKJcDHSMKksCcmWILerArKL9JIB6Y8yt\nns8/DmAGwFEAvQD+AsAkgBuN42RE9D4Ab/Zt3YaahpzBLS0E/4wjVlpaRE0Ki0wUVY5oUqEQF9Zd\nEfllxCUSm8IzjlsmMz+TsS1XlOIGumLyZAUDvH+hoOPLjLAtRCK8/8Gg653L+5tM2v2V9zcY4PfO\nmQ5O3N+MY3xlxHWnMyKztiMymHfoTJwdxWvPfw8A3m+MecvRi7n+lXeM3nLLLWjyZFUpxnAln7E0\n+hSTxUNGW3N9Jttx3cNiIrvJ/hZzjIyC5oriJg1g8n+sGINXMZHTimlH9s91r+R9cP2vyuO85dHR\nUTz55JNAgTHK+lVMpQvgzLxrjPmOp7iXiHYD6AdwG4AXFnE+RVkoOkaVJUO5s0NbzCWAHAGwFnkG\n+qmD7yDoyUdnjEFTexea27sudIhyGXDmxBEMnDwK41nJp4pYkVZijO7Zs4e5zhpjsGLFCnR2dl7o\nEOUyoL+/H0eOHGGfFZOTUFLW7NAXqN8FoAVA3n+IrvVX55UplMuTFd09WNHdcyGZwkmlxuimTZsW\nLFMoP//09vait7f3QjJF0ZQ1OzQR1QJ4AMDjAAaQXWl8GcAhADvytV1dXYPa2lwAD6lzymwWABAS\n+lBQSD8hKaYCqK7h2ppLY0qJDMwBw+uEQ7bOZ0mNDs04k8nvrODS76VmHAjYjywjRGz7mhy6bUZm\nGXFkehYZN2Q2aONo14h77tKVpX4YFDo4UX7NOJBHF6zkGK2trUVDQy6jSyn6pHzGLmeHWhFsyaXB\nFtJyXVq0/cwddpgCdgzXMfIaiumvKzhPob4U01/Zl2JsYq5nIO+V65rynbviTh/IZt5tQDbz7mnP\nz8fn+wdgC7Ipzw8C+DsAPwVwizFGlxHKxUDHqLIkKWt2aGNMDMCdi+qRoiwCHaPKUkUDBSmKovgA\n3wQKqq+vR1NzzjhiZV51aj/8swhxnSbgOCYSEXuGM7ZGFg7xOiEj9lEGHBmlRX8DjmA2Muu0NAC5\ngpFIPTXgOHcolD+jLzkCBUm9y6VxSV3PtR9TYqSu7NCMUwUyELv64q1TSjbictDc3MwC+8j7UdS+\n8SK0RLmH2NWufDa2Dm+fR9ZxaaXyXHJXgGuMyufhOnch7dzVrtSZXc+9mP3VhXDdh0JZsl398Z67\nGE3c6seCj1AURVHKjk7GiqIoPkAnY0VRFB+gk7GiKIoP8I0Bj0iI5kLPl0FqANsJISCcFMhhPTIi\n87DLcGB/Jh1QHMcIo4UxttGFRPYSaUx0OV4EpPHNYceUgXZsg5jjnVuEAUUiDTMuo0bSyL7Y7UiD\niZ1VwXWMJ4uCYyz4gWIC5Ehc91AaoYoZo4XKrv65jF2FAvoU4/ThQj7jYpwqinHYkHXKFQRIXpPL\nyUbeK++5L4bTh6IoilIBdDJWFEXxAToZK4qi+ADfaMYZBJDxvBssvciZMZh/FoAIHOSQbaTOTIHC\nWqkRziSuvkC0E4RD4xbyle1w4uhLRmROdjhwQAQ2MjL6vew/AMgs045TSwlbOrY4A26T0KKL0XdJ\nBqS3dT5vUChX9uhLQTFBfwpph8UcU4wmW0y78jOXriyvqZgsyVKXdZ1baq7FaNyl1JHnKcYRp5j7\nW0wwIW+dkp7ZQioT0e8R0TtEND73s5OI7hR1vkREp4lohoieI6K1C+6VopSIjlFlqbLQ6fskgM8B\neB+yCR6fB/DvRLQBAIjocwD+AMB9AK4HMA1gBxHZ8fwUpTLoGFWWJAuajI0x3zfGPGOM6TfGHDbG\nfAHAFIAb5qp8FsBDxpjvGWP2ALgXQCeAe8raa0W5ADpGlaVKyeIbEQWI6BMAagDsJKI1ADoA/HC+\njjFmAsBrAG5cbEcVZaHoGFWWEqWkXdoE4BUAUWQz6v6KMeYgEd2IrEvCoDhkENl/gLwkKYy455ti\nqEpk1xAZjwEgLDIRhyEikTk8JEhk7XBl5KBA/iwYGVcoMkEGthEqmeHniieEUcPR35A0iMHVX2GY\nkUYzh9HPiDoZ2IYOI+5DRrYTtLMWG8sBxaoCIhHBKygziNh9SXqqpBzGUd5+ZcaoRDpEuLJryM+K\nMUpJXM4PhY4rJXoZYEcSnJ2dZeViDIMuZJ1SjinmmuR9cUVPK8aoV0wUvFKMdPkoZTfFAQBXA2gE\n8DEAjxDRLWXtlaIsDh2jypJjwZOxMSYFYD4V6ttEdD2yOtxXkN0g1Q6+8mgH8Hahdg++9gOEIlWI\nTU8iWlsPAqFz7VXoXLtpoV10cvTALqzp21KWtrwcO7gbq9dvLnu7J/sPYFVvX9nbPXZ4H1av3Vj2\ndk/170dX74ayt7v71ecRm55gnyUT8bzHVGqM/vjHP0ZVVRUmJydRX18PIkJfXx82bCjPde/atQtb\ntpR/jO7evRubN5d/jO7fv79s1+5l37592Lix/GP0wIED6Osr///UCy+8gMnJSbaajsfzj1EX5dhn\nHABQNZfufADANgC7AICIGgBsBfA3hRpZv/UONLR24O3nvotrf+k/WclGF8uxQ7srMxkfqtBkfORg\nRSbj44f3V2YyPnKgIpNxbHoC1227h+2DHh8ZxEtP/vNCminLGL311lvR1taGJ554AnfffXfZg9zv\n3r17SU3GBw4c0MkYwOTkJO6++242GQ8NDeHRRx9dUDsLzQ795wCeBnACQD2ATwG4FcCH5qp8HcAX\niOgwgGMAHgJwCtnkj3mpa2hEY3MLQuEIGptbLD3OpXsm4zFWDmTE28jShsjSMF16nBFOFBkR4Mcr\n/ZpMBsl40nJUCIUdm8SFZ4XM9Bz0OI4EiBAKhSHkVOlb4iQgnSi8XaOsxiwzLMuMIgCQIn4fIkIj\nzqS9m9yDiESqkUrxZ+AOLCO0Zrlx36OLEwUQjFQxfS7ksB/k6ldujDY1NaG1tRWRSAStra1WRg6X\njhuL8TEq74dLBy1HHa8umslkEI/HrTrFZKPIl8UjEAggHA4X5UwiyacHExGCwWBR/ZX3IV92mmAw\niGg0amUvcWZlF/8f+ZxdAoEAIpEIO8ZlPyjEQl/tbQD+EcAKAOPIri4+ZIx5HgCMMV8hohoAfwug\nCcBLAO4yxiQu0J6ilBsdo8qSZKHZoX+3iDoPAniwxP4oyqLQMaosVfwQmyIKABOjQwCy0sPZwVMI\nh8Uy35WwMMkXM6GMWNx4vqIk4zGMDp0GhOTg2tpmxPYxuZXNe0giHsfZ4TPIiG1pwZDdX7ntKy3i\nL3sliWy7A5YsUYxMQUKm8CooyUQcY8MDSMmYA66QFySTlorYtulcZxKJ7H1IpfnWKNfeNgrkf7Zp\nz9a2ZCKOcyMDCHjqTJ4dmf/V3ltXGaIAMDw8DCArPZw+fbrgtjWgsCHH+zV6vl35tdkppRWQKbzl\neDyOgYGBkmIIy1gPXnlhvt1yxM6Q9+HMmTN5z32h/uZLfBqLxTAwMGBt3ysmRnO+OB7xeByDg4Ps\nmNHR0flfix6jVEwA50pCRL8OYEHWGEWZ41PGmIVZSUpAx6iyCIoeo36YjFsAfBhZY0osf21FAZBd\nbawGsMMYM1qg7qLRMaqUwILH6CWfjBVFURQNLq8oiuILdDJWFEXxAToZK4qi+ACdjBVFUXyAbyZj\nIvp9IjpKRLNE9CoRXbfI9h4gooz42VdCOx8goieI6L25Nu521FlwGp9C7RLRw47+P1VEu39GRK8T\n0QQRDRLRvxLRlYvtczHtltLnpZQmSceojtFKjlFfTMZE9GsAvgrgAQDXAngH2VQ4rYtseg+yEbk6\n5n5+sYQ2agH8DMBnADvgMJWexidvu3M8Lfr/ySL6+wEA30A2+M0dAMIAniWi6kX2uWC7JfZ5SaRJ\n0jGqYxSVHqPGmEv+A+BVAP/LUyZkg7f86SLafADAW2XuZwbA3eKz0wD+2FNuADAL4OOLbPdhAP9S\nhj63zrX/i2Xus6vdcvV5FMBvlauvZXr2OkZ1jFZ0jF7ylTERhZF923hT4RgAP8DiU+Gsm/uK1U9E\n3yKiVYtsj0GVT+Nz29zXrQNEtJ2IlpXQRhOyq5qxMveZtVuOPpNP0yTpGM2LjtEy3d9LPhkj+/YK\nokypcDy8CuA3kfWc+j0AawC8SES1i2hT0oEypvERPI1ssszbAfwpsmEgnyIqIj7hHHN1vw7gJ8aY\neS1y0X2+QLsl95mINhHRJIA4gO2YS5NUjr6WCR2jbnSMlvH++iFQUEUwxuzwFPcQ0esAjgP4OLJf\nVXyNMeY7nuJeItoNoB/AbQBeKLKZ7QA2Ari5vL1zt7uIPl+WaZJ0jALQMXoeP6yMRwCkkRXUvbQD\nGCjXSYwx4wAOASinJX4AuTQ+XsradwAwxhxF9l4V1X8i+msAHwFwmzHmjOdPi+pznnZL7rMxJmWM\nOWKMedsY81+RNY59drF9LSM6RotAx+jC++rlkk/GxpgkgDeRTYUD4PxXjG0AdpbrPERUh+wNz/tw\nFsLcg5xP4zN/nvk0PmXr+1y7XQBaUET/5wbjLwP4oDHmRLn6nK/dxfZZcD5NUql9LSc6RotDx+jC\n+mqxWKtiOX6Q/Vo2g6yW04dsFoZRAMsX0eZfArgFwBUAbgLwHLI6TssC26lF9uvJNchaZv9orrxq\n7u9/OtfX/whgM4B/A/AugEip7c797StzD/SKuQf9BoD9AMIF2t0O4Cyy23zaPT9RT50F97lQu6X2\nGcCfz7V5BYBNAP4CQArA7Yu5vzpGdYwutTF60QZzEQPqM8iGKJwF8AqAX1hke48hu/VoFtl8aI8C\nWFNCO7fODcS0+Pl7T50Hkd3eMgNgB4C1i2kX2fB7zyD7xo0hm+n4myjiH/8CbaYB3CvqLajPhdot\ntc8A/s9c3dm5Y5+dH+SLub86RnWMLrUxqiE0FUVRfMAl14wVRVEUnYwVRVF8gU7GiqIoPkAnY0VR\nFB+gk7GiKIoP0MlYURTFB+hkrCiK4gN0MlYURfEBOhkriqL4AJ2MFUVRfIBOxoqiKD5AJ2NFURQf\n8P8BLusX/CHAP18AAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x7f8bee84eba8>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "# transform an 3-channel image into one channel\n",
    "def grayscale(data, dtype='float32'):\n",
    "  # luma coding weighted average in video systems\n",
    "  r = np.asarray(.3, dtype=dtype)\n",
    "  g = np.asarray(.59, dtype=dtype)\n",
    "  b = np.asarray(.11, dtype=dtype)\n",
    "  rst = r * data[:, :, :, 0] + g * data[:, :, :, 1] + b * data[:, :, :, 2]\n",
    "  # add channel dimension\n",
    "  rst = np.expand_dims(rst, axis=3)\n",
    "  return rst\n",
    "\n",
    "X_train_gray = grayscale(X_train)\n",
    "X_test_gray = grayscale(X_test)\n",
    "\n",
    "# plot a randomly chosen image\n",
    "img = round(np.random.rand() * X_train.shape[0])\n",
    "plt.figure(figsize=(4, 2))\n",
    "plt.subplot(1, 2, 1)\n",
    "plt.imshow(X_train[img], interpolation='none')\n",
    "plt.subplot(1, 2, 2)\n",
    "plt.imshow(\n",
    "    X_train_gray[img, :, :, 0], cmap=plt.get_cmap('gray'), interpolation='none')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, the objects in grayscale images can still be recognizable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Feature Selection\n",
    "When coming to object detection, HOG (histogram of oriented gradients) is often extracted as a feature for classification. It first calculates the gradients of each image patch using sobel filter, then use the magnitudes and orientations of derived gradients to form a histogram per patch (a vector). After normalizing these histograms, it concatenates them into one HOG feature. For more details, read this [tutorial](https://www.learnopencv.com/histogram-of-oriented-gradients/). \n",
    ">Note.  one can directly feed the original images for classification; however, it will take lots of time to train and get worse performance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# The code is credit to: \"http://www.itdadao.com/articles/c15a1243072p0.html\"\n",
    "def getHOGfeat(image,\n",
    "               stride=8,\n",
    "               orientations=8,\n",
    "               pixels_per_cell=(8, 8),\n",
    "               cells_per_block=(2, 2)):\n",
    "  cx, cy = pixels_per_cell\n",
    "  bx, by = cells_per_block\n",
    "  sx, sy, sz = image.shape\n",
    "  n_cellsx = int(np.floor(sx // cx))  # number of cells in x\n",
    "  n_cellsy = int(np.floor(sy // cy))  # number of cells in y\n",
    "  n_blocksx = (n_cellsx - bx) + 1\n",
    "  n_blocksy = (n_cellsy - by) + 1\n",
    "  gx = np.zeros((sx, sy), dtype=np.double)\n",
    "  gy = np.zeros((sx, sy), dtype=np.double)\n",
    "  eps = 1e-5\n",
    "  grad = np.zeros((sx, sy, 2), dtype=np.double)\n",
    "  for i in range(1, sx - 1):\n",
    "    for j in range(1, sy - 1):\n",
    "      gx[i, j] = image[i, j - 1] - image[i, j + 1]\n",
    "      gy[i, j] = image[i + 1, j] - image[i - 1, j]\n",
    "      grad[i, j, 0] = np.arctan(gy[i, j] / (gx[i, j] + eps)) * 180 / math.pi\n",
    "      if gx[i, j] < 0:\n",
    "        grad[i, j, 0] += 180\n",
    "      grad[i, j, 0] = (grad[i, j, 0] + 360) % 360\n",
    "      grad[i, j, 1] = np.sqrt(gy[i, j]**2 + gx[i, j]**2)\n",
    "  normalised_blocks = np.zeros((n_blocksy, n_blocksx, by * bx * orientations))\n",
    "  for y in range(n_blocksy):\n",
    "    for x in range(n_blocksx):\n",
    "      block = grad[y * stride:y * stride + 16, x * stride:x * stride + 16]\n",
    "      hist_block = np.zeros(32, dtype=np.double)\n",
    "      eps = 1e-5\n",
    "      for k in range(by):\n",
    "        for m in range(bx):\n",
    "          cell = block[k * 8:(k + 1) * 8, m * 8:(m + 1) * 8]\n",
    "          hist_cell = np.zeros(8, dtype=np.double)\n",
    "          for i in range(cy):\n",
    "            for j in range(cx):\n",
    "              n = int(cell[i, j, 0] / 45)\n",
    "              hist_cell[n] += cell[i, j, 1]\n",
    "          hist_block[(k * bx + m) * orientations:(k * bx + m + 1) * orientations] = hist_cell[:]\n",
    "      normalised_blocks[y, x, :] = hist_block / np.sqrt(\n",
    "          hist_block.sum()**2 + eps)\n",
    "  return normalised_blocks.ravel()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once we have our *getHOGfeat* function, we then get the HOG features of all images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "X_train_hog = []\n",
    "X_test_hog = []\n",
    "\n",
    "print('This will take some minutes.')\n",
    "\n",
    "for img in X_train_gray:\n",
    "  img_hog = getHOGfeat(img)\n",
    "  X_train_hog.append(img_hog)\n",
    "\n",
    "for img in X_test_gray:\n",
    "  img_hog = getHOGfeat(img)\n",
    "  X_test_hog.append(img_hog)\n",
    "\n",
    "X_train_hog_array = np.asarray(X_train_hog)\n",
    "X_test_hog_array = np.asarray(X_test_hog)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[scikit-learn](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) provides off-the-shelf libraries for classification. For KNN and SVM classifiers, we can just import from scikit-learn to use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[KNN]\n",
      "Misclassified samples: 5334\n",
      "Accuracy: 0.47\n"
     ]
    }
   ],
   "source": [
    "# KNN\n",
    "from sklearn.neighbors import KNeighborsClassifier \n",
    "from sklearn.metrics import accuracy_score\n",
    "\n",
    "# p=2 and metric='minkowski' means the Euclidean Distance\n",
    "knn = KNeighborsClassifier(n_neighbors=11, p=2, metric='minkowski')\n",
    "\n",
    "knn.fit(X_train_hog_array, y_train.ravel())\n",
    "y_pred = knn.predict(X_test_hog_array)\n",
    "print('[KNN]')\n",
    "print('Misclassified samples: %d' % (y_test.ravel() != y_pred).sum())\n",
    "print('Accuracy: %.2f' % accuracy_score(y_test, y_pred))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can observe that the accuracy of KNN on CIFAR-10 is embarrassingly bad."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Support Vector Machine (SVM) on CIFAR-10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Linear SVC]\n",
      "Misclassified samples: 4940\n",
      "Accuracy: 0.51\n"
     ]
    }
   ],
   "source": [
    "# SVM\n",
    "from sklearn.svm import SVC \n",
    "\n",
    "# C is the hyperparameter for the error penalty term\n",
    "# gamma is the hyperparameter for the rbf kernel\n",
    "svm_linear = SVC(kernel='linear', random_state=0, gamma=0.2, C=10.0)\n",
    "\n",
    "svm_linear.fit(X_train_hog_array, y_train.ravel())\n",
    "y_pred = svm_linear.predict(X_test_hog_array)\n",
    "print('[Linear SVC]')\n",
    "print('Misclassified samples: %d' % (y_test.ravel() != y_pred).sum())\n",
    "print('Accuracy: %.2f' % accuracy_score(y_test.ravel(), y_pred))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By above, SVM is slightly better than KNN, but still poor. Next, we'll design a CNN model using tensorflow. Because the cifar10 is not a small dataset, we can't just use feed_dict to feed all training data to the model due to the limit of memory size. Even if we can feed all data into the model, we still want the process of loading data is efficient. **Input pipeline** is the common way to solve these."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Input Pipeline\n",
    "\n",
    "### Queues\n",
    "Because tf.Session objects are designed to be **multithreaded** and thread-safe, so multiple threads can easily use the same session and run ops in parallel. [Queues](https://www.tensorflow.org/programmers_guide/threading_and_queues) are useful because of the ability to **compute tensor asynchronously** in a graph. Most of the time, we use queues to handle inputs. In this way, multiple threads prepare training example and enequeue these examples. In addition, only parts of inputs would be read into memory a time, instead of all of them. This can avoid **out of memory error** when data is large.\n",
    "> Tensorflow recommended queue-base input pipeline before version 1.2. Beginning with version 1.2, tensorflow recommend using the [tf.contrib.data module](https://www.tensorflow.org/programmers_guide/datasets) instead. Read [more](https://github.com/tensorflow/tensorflow/issues/7951).\n",
    "\n",
    "### Typical Input Pipeline\n",
    "1. The list of filenames\n",
    "2. Optional filename shuffling\n",
    "3. Optional epoch limit\n",
    "4. Filename queue\n",
    "5. A Reader for the file format\n",
    "6. A decoder for a record read by the reader\n",
    "7. Optional preprocessing\n",
    "8. Example queue  \n",
    "\n",
    "<center><img style='width: 70%' src='imgsrc/AnimatedFileQueues.gif' /></center>  \n",
    "\n",
    "We've specified the order of input pipeline in the followng codes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import sys\n",
    "from six.moves import urllib\n",
    "import tarfile\n",
    "import tensorflow as tf\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loading Data Manually\n",
    "To know how it works under the hood, let's load CIFAR-10 by our own (not using keras). According the descripion, the dataset file is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. We define some constants based on the above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# the url to download CIFAR-10 dataset (binary version)\n",
    "# see format and details here: http://www.cs.toronto.edu/~kriz/cifar.html\n",
    "DATA_URL = 'http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz'\n",
    "DEST_DIRECTORY = 'dataset/cifar10'\n",
    "# the image size we want to keep\n",
    "IMAGE_HEIGHT = 32\n",
    "IMAGE_WIDTH = 32\n",
    "IMAGE_DEPTH = 3\n",
    "IMAGE_SIZE_CROPPED = 24\n",
    "BATCH_SIZE = 128\n",
    "# Global constants describing the CIFAR-10 data set.\n",
    "NUM_CLASSES = 10 \n",
    "NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000\n",
    "NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = 10000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ">> Downloading cifar-10-binary.tar.gz ...\n",
      ">> Total 170052171 bytes\n",
      ">> Done\n"
     ]
    }
   ],
   "source": [
    "def maybe_download_and_extract(dest_directory, url):\n",
    "  if not os.path.exists(dest_directory):\n",
    "    os.makedirs(dest_directory)\n",
    "  file_name = 'cifar-10-binary.tar.gz'\n",
    "  file_path = os.path.join(dest_directory, file_name)\n",
    "  # if have not downloaded yet\n",
    "  if not os.path.exists(file_path):\n",
    "    def _progress(count, block_size, total_size):\n",
    "      sys.stdout.write('\\r%.1f%%' % \n",
    "            (float(count * block_size) / float(total_size) * 100.0))\n",
    "      sys.stdout.flush()  # flush the buffer\n",
    "\n",
    "    print('>> Downloading %s ...' % file_name)\n",
    "    file_path, _ = urllib.request.urlretrieve(url, file_path, _progress)\n",
    "    file_size = os.stat(file_path).st_size\n",
    "    print('\\r>> Total %d bytes' % file_size)\n",
    "  extracted_dir_path = os.path.join(dest_directory, 'cifar-10-batches-bin')\n",
    "  if not os.path.exists(extracted_dir_path):\n",
    "    # Open for reading with gzip compression, then extract all\n",
    "    tarfile.open(file_path, 'r:gz').extractall(dest_directory)\n",
    "  print('>> Done')\n",
    "\n",
    "# download it\n",
    "maybe_download_and_extract(DEST_DIRECTORY, DATA_URL)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After downloading the dataset, we create functions\n",
    "* ```distort_input(training_file, batch_size)``` to get a training example queue.\n",
    "* ```eval_input(testing_file, batch_size)``` to get a testing example queue.\n",
    "* ```read_cifar10(filename_queue)``` to read a record from dataset with a filename queue. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# the folder store the dataset\n",
    "DATA_DIRECTORY = DEST_DIRECTORY + '/cifar-10-batches-bin'\n",
    "# (1) a list of training/testing filenames\n",
    "training_files = [os.path.join(DATA_DIRECTORY, 'data_batch_%d.bin' % i) for i in range(1,6)]\n",
    "testing_files = [os.path.join(DATA_DIRECTORY, 'test_batch.bin')]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# (5) + (6)\n",
    "def read_cifar10(filename_queue):\n",
    "  \"\"\" Reads and parses examples from CIFAR10 data files.\n",
    "    -----\n",
    "    Args:\n",
    "        filename_queue: \n",
    "            A queue of strings with the filenames to read from.\n",
    "    Returns:\n",
    "        An object representing a single example, with the following fields:\n",
    "        height: \n",
    "            number of rows in the result (32)\n",
    "        width: \n",
    "            number of columns in the result (32)\n",
    "        depth: \n",
    "            number of color channels in the result (3)\n",
    "        key: \n",
    "            a scalar string Tensor describing the filename & record number for this example.\n",
    "        label: \n",
    "            an int32 Tensor with the label in the range 0..9.\n",
    "        image: \n",
    "            a [height, width, depth] uint8 Tensor with the image data\n",
    "  \"\"\"\n",
    "\n",
    "  class CIFAR10Record(object):\n",
    "    pass\n",
    "\n",
    "  result = CIFAR10Record()\n",
    "  # CIFAR10 consists of 60000 32x32 'color' images in 10 classes\n",
    "  label_bytes = 1  # 10 class\n",
    "  result.height = IMAGE_HEIGHT\n",
    "  result.width = IMAGE_WIDTH\n",
    "  result.depth = IMAGE_DEPTH\n",
    "  image_bytes = result.height * result.width * result.depth\n",
    "  # bytes of a record: label(1 byte) followed by pixels(3072 bytes)\n",
    "  record_bytes = label_bytes + image_bytes\n",
    "  # (5) reader for cifar10 file format\n",
    "  reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)\n",
    "  # read a record\n",
    "  result.key, record_string = reader.read(filename_queue)\n",
    "  # Convert from a string to a vector of uint8 that is record_bytes long.\n",
    "  # (6) decoder\n",
    "  record_uint8 = tf.decode_raw(record_string, tf.uint8)\n",
    "  # get the label and cast it to int32\n",
    "  result.label = tf.cast(\n",
    "      tf.strided_slice(record_uint8, [0], [label_bytes]), tf.int32)\n",
    "  # [depth, height, width], uint8\n",
    "  depth_major = tf.reshape(\n",
    "      tf.strided_slice(record_uint8, [label_bytes],\n",
    "                       [label_bytes + image_bytes]),\n",
    "      [result.depth, result.height, result.width])\n",
    "  # change to [height, width, depth], uint8\n",
    "  result.image = tf.transpose(depth_major, [1, 2, 0])\n",
    "  return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def distort_input(training_files, batch_size):\n",
    "  \"\"\" Construct distorted input for CIFAR training using the Reader ops.\n",
    "    -----\n",
    "    Args:\n",
    "        training_files: \n",
    "            an array of paths of the training files.\n",
    "        batch_size: \n",
    "            Number of images per batch.\n",
    "    Returns:\n",
    "        images: Images. \n",
    "            4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.\n",
    "        labels: Labels. \n",
    "            1D tensor of [batch_size] size.\n",
    "  \"\"\"\n",
    "  for f in training_files:\n",
    "    if not tf.gfile.Exists(f):\n",
    "      raise ValueError('Failed to find file: ' + f)\n",
    "  # create a queue that produces filenames to read\n",
    "  # (4) filename queue\n",
    "  file_queue = tf.train.string_input_producer(training_files)\n",
    "  # (5) + (6)\n",
    "  cifar10_record = read_cifar10(file_queue)\n",
    "  # (7) image preprocessing for training\n",
    "  height = IMAGE_SIZE_CROPPED\n",
    "  width = IMAGE_SIZE_CROPPED\n",
    "  float_image = tf.cast(cifar10_record.image, tf.float32)\n",
    "  distorted_image = tf.random_crop(float_image, [height, width, 3])\n",
    "  distorted_image = tf.image.random_flip_left_right(distorted_image)\n",
    "  distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)\n",
    "  distorted_image = tf.image.random_contrast(\n",
    "      distorted_image, lower=0.2, upper=1.8)\n",
    "  # standardization: subtract off the mean and divide by the variance of the pixels\n",
    "  distorted_image = tf.image.per_image_standardization(distorted_image)\n",
    "  # Set the shapes of tensors.\n",
    "  distorted_image.set_shape([height, width, 3])\n",
    "  cifar10_record.label.set_shape([1])\n",
    "  # ensure a level of mixing of elements.\n",
    "  min_fraction_of_examples_in_queue = 0.4\n",
    "  min_queue_examples = int(\n",
    "      NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN * min_fraction_of_examples_in_queue)\n",
    "  # (8) example queue\n",
    "  # Filling queue with min_queue_examples CIFAR images before starting to train\n",
    "  image_batch, label_batch = tf.train.shuffle_batch(\n",
    "      [distorted_image, cifar10_record.label],\n",
    "      batch_size=batch_size,\n",
    "      num_threads=16,\n",
    "      capacity=min_queue_examples + 3 * batch_size,\n",
    "      min_after_dequeue=min_queue_examples)\n",
    "  return image_batch, tf.reshape(label_batch, [batch_size])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following code is to generate the data for testing. Now, you are able to specify the order of input pipeline in the following code block."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def eval_input(testing_files, batch_size):\n",
    "  for f in testing_files:\n",
    "    if not tf.gfile.Exists(f):\n",
    "      raise ValueError('Failed to find file: ' + f)\n",
    "  # create a queue that produces filenames to read\n",
    "  file_queue = tf.train.string_input_producer(testing_files)\n",
    "  cifar10_record = read_cifar10(file_queue)\n",
    "  # image preprocessing for training\n",
    "  height = IMAGE_SIZE_CROPPED\n",
    "  width = IMAGE_SIZE_CROPPED\n",
    "  float_image = tf.cast(cifar10_record.image, tf.float32)\n",
    "  resized_image = tf.image.resize_image_with_crop_or_pad(\n",
    "      float_image, height, width)\n",
    "  image_eval = tf.image.per_image_standardization(resized_image)\n",
    "  image_eval.set_shape([height, width, 3])\n",
    "  cifar10_record.label.set_shape([1])\n",
    "  # Ensure that the random shuffling has good mixing properties.\n",
    "  min_fraction_of_examples_in_queue = 0.4\n",
    "  min_queue_examples = int(\n",
    "      NUM_EXAMPLES_PER_EPOCH_FOR_EVAL * min_fraction_of_examples_in_queue)\n",
    "  image_batch, label_batch = tf.train.batch(\n",
    "      [image_eval, cifar10_record.label],\n",
    "      batch_size=batch_size,\n",
    "      num_threads=16,\n",
    "      capacity=min_queue_examples + 3 * batch_size)\n",
    "  return image_batch, tf.reshape(label_batch, [batch_size])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After building the input pipeline, we can check the functionality of the example queues."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Shape of cropped image: (128, 24, 24, 3)\n",
      "Shape of label: (128,)\n"
     ]
    }
   ],
   "source": [
    "# test function distort_input\n",
    "with tf.Session() as sess:\n",
    "  coord = tf.train.Coordinator()\n",
    "  image, label = distort_input(training_files, BATCH_SIZE)\n",
    "  # --- Note ---\n",
    "  # If you forget to call start_queue_runners(), it will hang\n",
    "  # indefinitely and deadlock the user program.\n",
    "  # ------------\n",
    "  threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n",
    "  image_batch, label_batch = sess.run([image, label])\n",
    "  coord.request_stop()\n",
    "  coord.join(threads)\n",
    "  image_batch_np = np.asarray(image_batch)\n",
    "  label_batch_np = np.asarray(label_batch)\n",
    "  print('Shape of cropped image:', image.shape)\n",
    "  print('Shape of label:', label.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So far, we have prepared input queues. Let's start designing our cnn model!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## CNN Model\n",
    "### Model Structure\n",
    "<center><img style='width: 80%' src='imgsrc/model.png' /></center> \n",
    "### Model Details\n",
    "* We put all variables on CPU because we want GPU to only focus on calculation. \n",
    "* The cost function we use is simply the *cross entropy* of labels and predictions.\n",
    "* *Weight decay* is a very common regularization technique. For NNs, we can penalize large weights in the cost function. The implementation of weight decay is simple: add a term in the cost function that penalizes the $L^{2}$-norm of the weight matrix at each layer.\n",
    "$$\\operatorname{arg}\\underset{\\Theta=\\{\\boldsymbol{W^{(1)}}{\\cdots}\\boldsymbol{W^{(L)}}\\}}{\\operatorname{min}}C(\\Theta)+\\alpha\\sum_{i=1}^{L} \\lVert \\boldsymbol{W^{(i)}} \\rVert_{2}^{2}$$ \n",
    "* *Local response normalization* is mentioned in original [*AlexNet*](http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf) article in NIPS 2012. Because the activation function we used in our CNN model is *ReLU*, whose output has no upper bound. Thus, we need a local response normalization to normalize that.  \n",
    "Denoting by $a_{x,y}^i$ the activity of a neuron computed by applying kernel $i$ at position $(x, y)$ and then applying the ReLU nonlinearity, the response-normalized activity $b^i_{x,y}$ is given by the expression \n",
    "$$ b^i_{x,y} = a^i_{x,y} / \\left( k + \\alpha \\sum_{j=max(0,i-n/2)}^{min(N-1, i+n/2)} (a^j_{x,y})^2 \\right)^\\beta$$ \n",
    "where the sum runs over $n$ **adjacent** kernel maps at the same spatial position, and $N$ is the total number of kernels in the layer. The ordering of the kernel maps is arbitrary and determined before training begins.  The constants $k$, $n$, $\\alpha$, and $\\beta$ are hyper-parameters. Check the following figure drawn by Hu Yixuan.\n",
    "<center><img style='width: 80%' src='imgsrc/localResponseNormalization.jpeg' /></center> \n",
    "* When using gradient descent to update the weights of a neural network, sometimes the weights might move in the wrong direction. Thus, we take a [moving average](https://www.tensorflow.org/versions/r0.12/api_docs/python/train/moving_averages) of the weights over a bunch of previous updates.  \n",
    "  \n",
    "  $$\\boldsymbol{w_{avg_i}} = decay\\times\\boldsymbol{w_{avg_{i-1}}} + (1-decay)\\times\\boldsymbol{w_{i}}$$\n",
    "where $w_{i}$ is the $i_{th}$ updated weight.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class CNN_Model(object):\n",
    "  def __init__(self, batch_size, num_classes, num_training_example,\n",
    "               num_epoch_per_decay, init_lr, moving_average_decay):\n",
    "    self.batch_size = batch_size\n",
    "    self.num_classes = num_classes\n",
    "    self.num_training_example = num_training_example\n",
    "    self.num_epoch_per_decay = num_epoch_per_decay\n",
    "    self.init_lr = init_lr  # initial learn rate\n",
    "    self.moving_average_decay = moving_average_decay\n",
    "\n",
    "  def _variable_on_cpu(self, name, shape, initializer):\n",
    "    with tf.device('/cpu:0'):\n",
    "      var = tf.get_variable(\n",
    "          name, shape, initializer=initializer, dtype=tf.float32)\n",
    "    return var\n",
    "\n",
    "  def _variable_with_weight_decay(self, name, shape, stddev, wd=0.0):\n",
    "    \"\"\" Helper to create an initialized Variable with weight decay.\n",
    "        Note that the Variable is initialized with a truncated normal \n",
    "        distribution. A weight decay is added only if one is specified.\n",
    "        -----\n",
    "        Args:\n",
    "            name: \n",
    "                name of the variable\n",
    "            shape: \n",
    "                a list of ints\n",
    "            stddev: \n",
    "                standard deviation of a truncated Gaussian\n",
    "            wd: \n",
    "                add L2Loss weight decay multiplied by this float. If None, weight\n",
    "                decay is not added for this Variable.\n",
    "        Returns:\n",
    "            Variable Tensor\n",
    "    \"\"\"\n",
    "    initializer = tf.truncated_normal_initializer(\n",
    "        stddev=stddev, dtype=tf.float32)\n",
    "    var = self._variable_on_cpu(name, shape, initializer)\n",
    "    # deal with weight decay\n",
    "    weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')\n",
    "    tf.add_to_collection('losses', weight_decay)\n",
    "    return var\n",
    "\n",
    "  def inference(self, images):\n",
    "    \"\"\" build the model\n",
    "        -----\n",
    "        Args:\n",
    "            images with shape [batch_size,24,24,3]\n",
    "        Return:\n",
    "            logits with shape [batch_size,10]\n",
    "    \"\"\"\n",
    "    with tf.variable_scope('conv_1') as scope:\n",
    "      kernel = self._variable_with_weight_decay('weights', [5, 5, 3, 64], 5e-2)\n",
    "      conv = tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding=\"SAME\")\n",
    "      biases = self._variable_on_cpu('bias', [64], tf.constant_initializer(0.0))\n",
    "      pre_activation = tf.nn.bias_add(conv, biases)\n",
    "      conv_1 = tf.nn.relu(pre_activation, name=scope.name)\n",
    "    # pool_1\n",
    "    pool_1 = tf.nn.max_pool(\n",
    "        conv_1,\n",
    "        ksize=[1, 3, 3, 1],\n",
    "        strides=[1, 2, 2, 1],\n",
    "        padding='SAME',\n",
    "        name='pool_1')\n",
    "    # norm_1 (local_response_normalization)\n",
    "    norm_1 = tf.nn.lrn(\n",
    "        pool_1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75, name='norm_1')\n",
    "    # conv2\n",
    "    with tf.variable_scope('conv_2') as scope:\n",
    "      kernel = self._variable_with_weight_decay('weights', [5, 5, 64, 64], 5e-2)\n",
    "      conv = tf.nn.conv2d(norm_1, kernel, [1, 1, 1, 1], padding='SAME')\n",
    "      biases = self._variable_on_cpu('biases', [64],\n",
    "                                     tf.constant_initializer(0.1))\n",
    "      pre_activation = tf.nn.bias_add(conv, biases)\n",
    "      conv_2 = tf.nn.relu(pre_activation, name=scope.name)\n",
    "    # norm2\n",
    "    norm_2 = tf.nn.lrn(\n",
    "        conv_2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75, name='norm_2')\n",
    "    # pool2\n",
    "    pool_2 = tf.nn.max_pool(\n",
    "        norm_2,\n",
    "        ksize=[1, 3, 3, 1],\n",
    "        strides=[1, 2, 2, 1],\n",
    "        padding='SAME',\n",
    "        name='pool_2')\n",
    "    # FC_1 (fully-connected layer)\n",
    "    with tf.variable_scope('FC_1') as scope:\n",
    "      flat_features = tf.reshape(pool_2, [self.batch_size, -1])\n",
    "      dim = flat_features.get_shape()[1].value\n",
    "      weights = self._variable_with_weight_decay('weights', [dim, 384], 0.04,\n",
    "                                                 0.004)\n",
    "      biases = self._variable_on_cpu('biases', [384],\n",
    "                                     tf.constant_initializer(0.1))\n",
    "      FC_1 = tf.nn.relu(\n",
    "          tf.matmul(flat_features, weights) + biases, name=scope.name)\n",
    "    # FC_2\n",
    "    with tf.variable_scope('FC_2') as scope:\n",
    "      weights = self._variable_with_weight_decay('weights', [384, 192], 0.04,\n",
    "                                                 0.004)\n",
    "      biases = self._variable_on_cpu('biases', [192],\n",
    "                                     tf.constant_initializer(0.1))\n",
    "      FC_2 = tf.nn.relu(tf.matmul(FC_1, weights) + biases, name=scope.name)\n",
    "    with tf.variable_scope('softmax_linear') as scope:\n",
    "      weights = self._variable_with_weight_decay(\n",
    "          'weights', [192, self.num_classes], 1 / 192.0)\n",
    "      biases = self._variable_on_cpu('biases', [self.num_classes],\n",
    "                                     tf.constant_initializer(0.0))\n",
    "      logits = tf.add(tf.matmul(FC_2, weights), biases, name=scope.name)\n",
    "    return logits\n",
    "\n",
    "  def loss(self, logits, labels):\n",
    "    '''calculate the loss'''\n",
    "    labels = tf.cast(labels, tf.int64)\n",
    "    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(\n",
    "        labels=labels, logits=logits, name='cross_entropy_per_example')\n",
    "    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')\n",
    "    tf.add_to_collection('losses', cross_entropy_mean)\n",
    "    # The total loss is defined as the cross entropy loss plus all of the weight\n",
    "    # decay terms (L2 loss).\n",
    "    return tf.add_n(tf.get_collection('losses'), name='total_loss')\n",
    "\n",
    "  def train(self, total_loss, global_step):\n",
    "    '''train a step'''\n",
    "    num_batches_per_epoch = self.num_training_example / self.batch_size\n",
    "    decay_steps = int(num_batches_per_epoch * self.num_epoch_per_decay)\n",
    "    # Decay the learning rate exponentially based on the number of steps.\n",
    "    lr = tf.train.exponential_decay(\n",
    "        self.init_lr, global_step, decay_steps, decay_rate=0.1, staircase=True)\n",
    "    opt = tf.train.GradientDescentOptimizer(lr)\n",
    "    grads = opt.compute_gradients(total_loss)\n",
    "    apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)\n",
    "    # Track the moving averages of all trainable variables.\n",
    "    # This step just records the moving average weights but not uses them\n",
    "    ema = tf.train.ExponentialMovingAverage(self.moving_average_decay,\n",
    "                                            global_step)\n",
    "    self.ema = ema\n",
    "    variables_averages_op = ema.apply(tf.trainable_variables())\n",
    "    with tf.control_dependencies([apply_gradient_op, variables_averages_op]):\n",
    "      train_op = tf.no_op(name='train')\n",
    "    return train_op"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we can train our model. First, we need to feed some hyperparameters into it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "# CNN model\n",
    "model = CNN_Model(batch_size=BATCH_SIZE, \n",
    "                  num_classes=NUM_CLASSES, \n",
    "                  num_training_example=NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN, \n",
    "                  num_epoch_per_decay=350.0, \n",
    "                  init_lr=0.1,\n",
    "                  moving_average_decay=0.9999)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we use CPU to handle the input because we want GPU to only focus on training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# op for training\n",
    "global_step = tf.contrib.framework.get_or_create_global_step()\n",
    "with tf.device('/cpu:0'):\n",
    "  images, labels = distort_input(training_files, BATCH_SIZE)\n",
    "with tf.variable_scope('model'):\n",
    "  logits = model.inference(images)\n",
    "total_loss = model.loss(logits, labels)\n",
    "train_op = model.train(total_loss, global_step)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we train our model 180 epochs and save it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Done\n"
     ]
    }
   ],
   "source": [
    "NUM_EPOCH = 180\n",
    "NUM_BATCH_PER_EPOCH = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN // BATCH_SIZE\n",
    "ckpt_dir = './model/'\n",
    "\n",
    "# train\n",
    "saver = tf.train.Saver()\n",
    "with tf.Session() as sess:\n",
    "  ckpt = tf.train.get_checkpoint_state(ckpt_dir)\n",
    "  if (ckpt and ckpt.model_checkpoint_path):\n",
    "    saver.restore(sess, ckpt.model_checkpoint_path)\n",
    "    # assume the name of checkpoint is like '.../model.ckpt-1000'\n",
    "    gs = int(ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1])\n",
    "    sess.run(tf.assign(global_step, gs))\n",
    "  else:\n",
    "    # no checkpoint found\n",
    "    sess.run(tf.global_variables_initializer())\n",
    "  coord = tf.train.Coordinator()\n",
    "  threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n",
    "  loss = []\n",
    "  for i in range(NUM_EPOCH):\n",
    "    _loss = []\n",
    "    for _ in range(NUM_BATCH_PER_EPOCH):\n",
    "      l, _ = sess.run([total_loss, train_op])\n",
    "      _loss.append(l)\n",
    "    loss_this_epoch = np.sum(_loss)\n",
    "    gs = global_step.eval()\n",
    "    # print('loss of epoch %d: %f' % (gs / NUM_BATCH_PER_EPOCH, loss_this_epoch))\n",
    "    loss.append(loss_this_epoch)\n",
    "    saver.save(sess, ckpt_dir + 'model.ckpt', global_step=gs)\n",
    "  coord.request_stop()\n",
    "  coord.join(threads)\n",
    "  \n",
    "print('Done')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have done our training! Let's see whether our model is great or not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "with tf.device('/cpu:0'):\n",
    "    # build testing example queue\n",
    "    images, labels = eval_input(testing_files, BATCH_SIZE)\n",
    "with tf.variable_scope('model', reuse=True):\n",
    "    logits = model.inference(images)\n",
    "# use to calculate top-1 error\n",
    "top_k_op = tf.nn.in_top_k(logits, labels, 1) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because now the weights are not moving average weights, we need to manually change this.  \n",
    "```python\n",
    "tf.train.ExponentialMovingAverage(decay).variables_to_restore()\n",
    "```  \n",
    "gives us a dictionary about the mapping between the weights and the moving average shadow weights. We can use this mapping to replace the original weights by moving average shadow weights."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Restoring parameters from ./model/model.ckpt-70200\n",
      "Accurarcy: 8584/9984 = 0.859776\n"
     ]
    }
   ],
   "source": [
    "variables_to_restore = model.ema.variables_to_restore()\n",
    "saver = tf.train.Saver(variables_to_restore)\n",
    "with tf.Session() as sess:\n",
    "  # Restore variables from disk.\n",
    "  ckpt = tf.train.get_checkpoint_state(ckpt_dir)\n",
    "  if ckpt and ckpt.model_checkpoint_path:\n",
    "    saver.restore(sess, ckpt.model_checkpoint_path)\n",
    "    coord = tf.train.Coordinator()\n",
    "    threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n",
    "    num_iter = NUM_EXAMPLES_PER_EPOCH_FOR_EVAL // BATCH_SIZE\n",
    "    total_sample_count = num_iter * BATCH_SIZE\n",
    "    true_count = 0\n",
    "    for _ in range(num_iter):\n",
    "      predictions = sess.run(top_k_op)\n",
    "      true_count += np.sum(predictions)\n",
    "    print('Accurarcy: %d/%d = %f' % (true_count, total_sample_count,\n",
    "                                     true_count / total_sample_count))\n",
    "    coord.request_stop()\n",
    "    coord.join(threads)\n",
    "  else:\n",
    "    print('train first')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We get a much higher accuracy than KNN and SVM. This is good enough!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***\n",
    "# Assignment\n",
    "Implement the input pipeline of the CNN model with [dataset](https://www.tensorflow.org/programmers_guide/datasets) API mentioned last lab. The dataset should be multithreaded (16 threads). To simplify, you only need to train the model for 10 epochs. Finally, get the accuracy of this 10-epoch model. There are 4 'TODO' parts you need to finish. You only need to hand out the Lab12_{id}.ipynb.  \n",
    "The notebook should include \n",
    "* Training loss per epoch\n",
    "* The testing accuracy\n",
    "* The total time to train and test  \n",
    "\n",
    "Good luck!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from lab12_util import *\n",
    "\n",
    "DATA_URL = 'http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz'\n",
    "DEST_DIRECTORY = 'dataset/cifar10'\n",
    "DATA_DIRECTORY = DEST_DIRECTORY + '/cifar-10-batches-bin'\n",
    "IMAGE_HEIGHT = 32\n",
    "IMAGE_WIDTH = 32\n",
    "IMAGE_DEPTH = 3\n",
    "IMAGE_SIZE_CROPPED = 24\n",
    "BATCH_SIZE = 128\n",
    "NUM_CLASSES = 10 \n",
    "LABEL_BYTES = 1\n",
    "IMAGE_BYTES = 32 * 32 * 3\n",
    "NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000\n",
    "NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = 10000\n",
    "\n",
    "# download it\n",
    "maybe_download_and_extract(DEST_DIRECTORY, DATA_URL)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from tensorflow.contrib.data import FixedLengthRecordDataset, Iterator\n",
    "\n",
    "def cifar10_record_distort_parser(record):\n",
    "  ''' Parse the record into label, cropped and distorted image\n",
    "    -----\n",
    "    Args:\n",
    "        record: \n",
    "            a record containing label and image.\n",
    "    Returns:\n",
    "        label: \n",
    "            the label in the record.\n",
    "        image: \n",
    "            the cropped and distorted image in the record.\n",
    "  '''\n",
    "  # TODO1\n",
    "  pass\n",
    "\n",
    "\n",
    "def cifar10_record_crop_parser(record):\n",
    "  ''' Parse the record into label, cropped image\n",
    "    -----\n",
    "    Args:\n",
    "        record: \n",
    "            a record containing label and image.\n",
    "    Returns:\n",
    "        label: \n",
    "            the label in the record.\n",
    "        image: \n",
    "            the cropped image in the record.\n",
    "  '''\n",
    "  # TODO2\n",
    "  pass\n",
    "\n",
    "\n",
    "def cifar10_iterator(filenames, batch_size, cifar10_record_parser):\n",
    "  ''' Create a dataset and return a tf.contrib.data.Iterator \n",
    "    which provides a way to extract elements from this dataset.\n",
    "    -----\n",
    "    Args:\n",
    "        filenames: \n",
    "            a tensor of filenames.\n",
    "        batch_size: \n",
    "            batch size.\n",
    "    Returns:\n",
    "        iterator: \n",
    "            an Iterator providing a way to extract elements from the created dataset.\n",
    "        output_types: \n",
    "            the output types of the created dataset.\n",
    "        output_shapes: \n",
    "            the output shapes of the created dataset.\n",
    "  '''\n",
    "  record_bytes = LABEL_BYTES + IMAGE_BYTES\n",
    "  dataset = FixedLengthRecordDataset(filenames, record_bytes)\n",
    "  # TODO3\n",
    "  # tips: use dataset.map with cifar10_record_parser(record)\n",
    "  #       output_types = dataset.output_types\n",
    "  #       output_shapes = dataset.output_shapes\n",
    "  pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "\n",
    "training_files = [\n",
    "    os.path.join(DATA_DIRECTORY, 'data_batch_%d.bin' % i) for i in range(1, 6)]\n",
    "testing_files = [os.path.join(DATA_DIRECTORY, 'test_batch.bin')]\n",
    "\n",
    "filenames_train = tf.constant(training_files)\n",
    "filenames_test = tf.constant(testing_files)\n",
    "\n",
    "iterator_train, types, shapes = cifar10_iterator(filenames_train, BATCH_SIZE,\n",
    "                                                 cifar10_record_distort_parser)\n",
    "iterator_test, _, _ = cifar10_iterator(filenames_test, BATCH_SIZE,\n",
    "                                       cifar10_record_crop_parser)\n",
    "\n",
    "# use to handle training and testing\n",
    "handle = tf.placeholder(tf.string, shape=[])\n",
    "iterator = Iterator.from_string_handle(handle, types, shapes)\n",
    "labels_images_pairs = iterator.get_next()\n",
    "\n",
    "# CNN model\n",
    "model = CNN_Model(\n",
    "    batch_size=BATCH_SIZE,\n",
    "    num_classes=NUM_CLASSES,\n",
    "    num_training_example=NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN,\n",
    "    num_epoch_per_decay=350.0,\n",
    "    init_lr=0.1,\n",
    "    moving_average_decay=0.9999)\n",
    "\n",
    "with tf.device('/cpu:0'):\n",
    "  labels, images = labels_images_pairs\n",
    "  labels = tf.reshape(labels, [BATCH_SIZE])\n",
    "  images = tf.reshape(\n",
    "      images, [BATCH_SIZE, IMAGE_SIZE_CROPPED, IMAGE_SIZE_CROPPED, IMAGE_DEPTH])\n",
    "with tf.variable_scope('model'):\n",
    "  logits = model.inference(images)\n",
    "# train\n",
    "global_step = tf.contrib.framework.get_or_create_global_step()\n",
    "total_loss = model.loss(logits, labels)\n",
    "train_op = model.train(total_loss, global_step)\n",
    "# test\n",
    "top_k_op = tf.nn.in_top_k(logits, labels, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "# TODO4:\n",
    "# 1. train the CNN model 10 epochs\n",
    "# 2. show the loss per epoch\n",
    "# 3. get the accuracy of this 10-epoch model\n",
    "# 4. measure the time using '%%time' instruction\n",
    "# tips:\n",
    "# use placeholder handle to determine if training or testing. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}

## ExponentialMovingAverage.md

      
    Raw
  

              ExponentialMovingAverage.md
            
          
    ema.variables_to_restore dont rely on ema.apply and is not part of graph. it solely output a dict

variables_to_restore(moving_avg_variables=None)
moving_avg_variables defaults to variables.moving_average_variables() + variables.trainable_variables()


tensorflow/tensorflow#11839 (comment)

This behavior is confusing but it is not a bug. When calling variables_to_restore, you are supposed to indicate what variables have moving averages through its moving_avg_variables parameter. ExponentialMovingAverage doesn't actually use its own stored averages to determine what variables have moving averages, but instead relies on moving_avg_variables. moving_avg_variables defaults to variables.moving_average_variables() + variables.trainable_variables(), which is why Variable c is included.


I believe the reason variables_to_restore acts this way is that it is typically called during evaluation, while the moving averages were set during training. During evaluation, no moving averages are set, so variables_to_restore cannot rely on the ExponentialMovingAverage's stored moving averages, which is why it uses moving_avg_variables instead.


@sguada, I think we should change the description to something like "If a variable is in moving_avg_variables, use its moving average variable name as the restore name; otherwise, use the variable name." What do you think?


a = tf.Variable(tf.constant(1.0), name='a')
b = tf.Variable(tf.constant(3.0), name='b')
c = tf.Variable(tf.constant(5.0), name='c')
ema = tf.train.ExponentialMovingAverage(decay=0.9999)
# ema.apply([a, b])

# print(tf.get_default_graph().get_all_collection_keys())
# print(tf.get_collection('moving_average_variables'))
# print(tf.global_variables())
variables_to_restore = ema.variables_to_restore([a, b])
d = tf.Variable(tf.constant(7.0), name='d')
for mv, var in variables_to_restore.items():
    print(mv, ' : ', var)

print(tf.get_collection(tf.GraphKeys.MOVING_AVERAGE_VARIABLES))
c  :  <tf.Variable 'c:0' shape=() dtype=float32_ref>
b/ExponentialMovingAverage  :  <tf.Variable 'b:0' shape=() dtype=float32_ref>
a/ExponentialMovingAverage  :  <tf.Variable 'a:0' shape=() dtype=float32_ref>
[]