@kplr-io
Created January 30, 2023 12:53
MNIST in Keras.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/kplr-io/1c40fb56a436762b4821a4f714f0be6d/mnist-in-keras.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "L8d0CykwSZWi"
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1ajvOFM9SZWm"
},
"source": [
"# Introduction to Deep Learning with Keras and TensorFlow\n",
"\n",
"**Daniel Moser (UT Southwestern Medical Center)**\n",
"\n",
"**Resources: [Xavier Snelgrove](https://github.com/wxs/keras-mnist-tutorial), [Yash Katariya](https://github.com/yashk2810/MNIST-Keras)**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pejv4cP_SZWp"
},
"source": [
"To help you understand the fundamentals of deep learning, this demo will walk through the basic steps of building two toy models for classifying handwritten numbers with accuracies surpassing 95%. The first model will be a basic fully-connected neural network, and the second model will be a deeper network that introduces the concepts of convolution and pooling."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mOw7yH5oSZWp"
},
"source": [
"## The Task for the AI\n",
"\n",
"Our goal is to construct and train an artificial neural network on thousands of images of handwritten digits so that it can correctly identify digits it has not seen before. We will use the MNIST database, which contains 60,000 training images and 10,000 test images, and the Keras Python API with TensorFlow as the backend."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ydcjyK1vSZWq"
},
"source": [
"<img src=\"https://github.com/AviatorMoser/keras-mnist-tutorial/blob/master/mnist.png?raw=1\" >"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TqCc4tJ3SZWq"
},
"source": [
"## Prerequisite Python Modules\n",
"\n",
"First, some software needs to be loaded into the Python environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "QZCGECl9SZWr"
},
"outputs": [],
"source": [
"import numpy as np # advanced math library\n",
"import matplotlib.pyplot as plt # MATLAB-like plotting routines\n",
"import random # for generating random numbers\n",
"\n",
"from keras.datasets import mnist # MNIST dataset is included in Keras\n",
"from keras.models import Sequential # Model type to be used\n",
"\n",
"from keras.layers import Dense, Dropout, Activation # Types of layers to be used in our model\n",
"from keras.utils import np_utils # NumPy related tools"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "H8zkcBSsSZWs"
},
"source": [
"## Loading Training Data\n",
"\n",
"The MNIST dataset is conveniently bundled within Keras, and we can easily analyze some of its features in Python."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1zfY9eQoSZWt",
"outputId": "86212ae6-0335-4774-8e1d-ad4e6ebc44a9",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"X_train shape (60000, 28, 28)\n",
"y_train shape (60000,)\n",
"X_test shape (10000, 28, 28)\n",
"y_test shape (10000,)\n"
]
}
],
"source": [
"# The MNIST data is split between 60,000 28 x 28 pixel training images and 10,000 28 x 28 pixel test images\n",
"(X_train, y_train), (X_test, y_test) = mnist.load_data()\n",
"\n",
"print(\"X_train shape\", X_train.shape)\n",
"print(\"y_train shape\", y_train.shape)\n",
"print(\"X_test shape\", X_test.shape)\n",
"print(\"y_test shape\", y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YAulEmd_SZWu"
},
"source": [
"Using matplotlib, we can plot some sample images from the training set directly into this Jupyter Notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9bFgLEtRSZWv",
"outputId": "a638d753-dad8-49ef-ad4f-a13680c56be1",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 657
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 648x648 with 9 Axes>"
],
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
],
"source": [
"plt.rcParams['figure.figsize'] = (9,9) # Make the figures a bit bigger\n",
"\n",
"for i in range(9):\n",
" plt.subplot(3,3,i+1)\n",
" num = random.randint(0, len(X_train) - 1) # randint is inclusive at both ends, so stop at the last valid index\n",
" plt.imshow(X_train[num], cmap='gray', interpolation='none')\n",
" plt.title(\"Class {}\".format(y_train[num]))\n",
" \n",
"plt.tight_layout()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "035GHESzSZWv"
},
"source": [
"Let's examine a single digit a little closer, and print out the array representing the last digit plotted above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "XAbodwlxSZWw",
"outputId": "cd90f2da-0401-4efc-daba-2ff4e7edc354",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 194 254 254 254 255 207 92 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 164 253 253 253 253 253 248 200 65 16 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 17 187 253 253 253 253 253 253 253 84 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 15 180 253 223 7 54 160 205 253 239 51 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 63 253 253 198 0 0 0 30 207 253 225 52 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 193 253 253 68 0 0 0 0 32 207 253 225 51 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 193 253 224 46 0 0 0 0 0 169 253 253 92 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 68 237 253 168 0 0 0 0 0 0 151 253 253 204 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 93 253 253 168 0 0 0 0 0 0 16 253 253 245 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 93 253 253 28 0 0 0 0 0 0 16 253 253 245 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 93 253 253 14 0 0 0 0 0 0 16 253 253 245 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 168 253 253 14 0 0 0 0 0 0 16 253 253 245 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 247 253 253 14 0 0 0 0 0 0 16 253 253 245 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 247 253 253 14 0 0 0 0 0 0 120 253 253 236 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 247 253 253 14 0 0 0 0 0 103 208 253 253 92 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 247 253 253 145 0 0 0 44 137 249 253 253 162 14 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 145 251 253 222 143 88 162 226 253 253 253 249 36 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 193 253 253 253 253 253 253 253 253 180 79 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 58 229 253 253 253 253 253 227 156 15 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 55 99 211 253 253 145 51 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \n"
]
}
],
"source": [
"# just a little function for pretty printing a matrix\n",
"def matprint(mat, fmt=\"g\"):\n",
" col_maxes = [max([len((\"{:\"+fmt+\"}\").format(x)) for x in col]) for col in mat.T]\n",
" for x in mat:\n",
" for i, y in enumerate(x):\n",
" print((\"{:\"+str(col_maxes[i])+fmt+\"}\").format(y), end=\" \")\n",
" print(\"\")\n",
"\n",
"# now print! \n",
"matprint(X_train[num])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7dSo-0-VSZWw"
},
"source": [
"Each pixel is an 8-bit integer from 0 to 255. 0 is full black, while 255 is full white. This is what we call a single-channel pixel, also known as monochrome.\n",
"\n",
"*Fun-fact! Your computer screen has three channels for each pixel: red, green, blue. Each of these channels also likely takes an 8-bit integer. 3 channels -- 24 bits total -- 16,777,216 possible colors!*"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "64CrKKSTSZWx"
},
"source": [
"## Formatting the input data layer\n",
"\n",
"Instead of a 28 x 28 matrix, we build our network to accept a 784-length vector.\n",
"\n",
"Each image therefore needs to be reshaped (or flattened) into a vector. We'll also normalize the inputs to be in the range [0, 1] rather than [0, 255]. Normalizing inputs is generally recommended, so that any additional dimensions (for other network architectures) are of the same scale."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vC8EDgfBSZWx"
},
"source": [
"<img src='https://github.com/AviatorMoser/keras-mnist-tutorial/blob/master/flatten.png?raw=1' >"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "PwbuhtvvSZWx",
"outputId": "57f2049f-31ed-4e14-ba22-389aa532a98b",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Training matrix shape (60000, 784)\n",
"Testing matrix shape (10000, 784)\n"
]
}
],
"source": [
"X_train = X_train.reshape(60000, 784) # reshape 60,000 28 x 28 matrices into 60,000 784-length vectors.\n",
"X_test = X_test.reshape(10000, 784) # reshape 10,000 28 x 28 matrices into 10,000 784-length vectors.\n",
"\n",
"X_train = X_train.astype('float32') # change integers to 32-bit floating point numbers\n",
"X_test = X_test.astype('float32')\n",
"\n",
"X_train /= 255 # normalize each value for each pixel for the entire vector for each input\n",
"X_test /= 255\n",
"\n",
"print(\"Training matrix shape\", X_train.shape)\n",
"print(\"Testing matrix shape\", X_test.shape)"
]
},
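{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a worked illustration of the two steps above (a sketch, assuming NumPy's default row-major order for `reshape`; the symbols $r$, $c$, $k$, $x$ and $x'$ are introduced here for illustration only), the pixel at row $r$ and column $c$ of a 28 x 28 image lands at position $k$ of the flattened vector, and each raw value $x$ is rescaled to $x'$:\n",
"\n",
"$$k = 28\\,r + c, \\qquad x' = \\frac{x}{255}$$\n",
"\n",
"For example, the pixel at $r = 4$, $c = 9$ moves to $k = 28 \\cdot 4 + 9 = 121$, and a raw value of $194$ becomes $194 / 255 \\approx 0.761$."
]
},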
{
"cell_type": "markdown",
"metadata": {
"id": "wV3YTokPSZWy"
},
"source": [
"We then modify our classes (unique digits) to be in the one-hot format, i.e.\n",
"\n",
"```\n",
"0 -> [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n",
"1 -> [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]\n",
"2 -> [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]\n",
"etc.\n",
"```\n",
"\n",
"If the final output of our network is very close to one of these classes, then it is most likely that class. For example, if the final output is:\n",
"\n",
"```\n",
"[0, 0.94, 0, 0, 0, 0, 0.06, 0, 0, 0]\n",
"```\n",
"then it is most probable that the image is that of the digit `1`."
]
},
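{
"cell_type": "markdown",
"metadata": {},
"source": [
"The encoding above can be written compactly (a sketch; the symbols $y$, $Y_k$, $\\hat{Y}_k$ and $\\hat{y}$ are notation introduced here, not names from the code). For a label $y \\in \\{0, \\dots, 9\\}$, entry $k$ of the one-hot vector is\n",
"\n",
"$$Y_k = \\begin{cases} 1 & \\text{if } k = y \\\\\\\\ 0 & \\text{otherwise} \\end{cases}$$\n",
"\n",
"and a network output $\\hat{Y}$ is read back as the class with the largest entry:\n",
"\n",
"$$\\hat{y} = \\arg\\max_k \\hat{Y}_k$$\n",
"\n",
"In the example output above, the largest entry is $0.94$ at index $1$, so $\\hat{y} = 1$."
]
},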
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "3WcJN1SjSZWy"
},
"outputs": [],
"source": [
"nb_classes = 10 # number of unique digits\n",
"\n",
"Y_train = np_utils.to_categorical(y_train, nb_classes)\n",
"Y_test = np_utils.to_categorical(y_test, nb_classes)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "V0meJYjdSZWy"
},
"source": [
"# Building a 3-layer fully connected network (FCN)\n",
"\n",
"<img src=\"https://github.com/AviatorMoser/keras-mnist-tutorial/blob/master/figure.png?raw=1\" />"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "pMRztIAHSZWz"
},
"outputs": [],
"source": [
"# The Sequential model is a linear stack of layers and is very common.\n",
"\n",
"model = Sequential()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "85H3ItNISZWz"
},
"source": [
"## The first hidden layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "xtbUnhyvSZWz"
},
"outputs": [],
"source": [
"# The first hidden layer is a set of 512 nodes (artificial neurons).\n",
"# Each node receives every element of the input vector, multiplies each by a learned weight, and adds a bias.\n",
"\n",
"model.add(Dense(512, input_shape=(784,))) #(784,) is not a typo -- that represents a 784 length vector!"
]
},
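{
"cell_type": "markdown",
"metadata": {},
"source": [
"In equation form (a sketch; $\\mathbf{x}$, $W$, $\\mathbf{b}$ and $\\mathbf{h}$ are symbols introduced here for illustration), a dense layer computes one weighted sum plus bias per node:\n",
"\n",
"$$h_j = \\sum_{i=1}^{784} W_{ji}\\, x_i + b_j, \\qquad j = 1, \\dots, 512$$\n",
"\n",
"Here $\\mathbf{x}$ is the 784-length input vector, $W$ is a $512 \\times 784$ weight matrix and $\\mathbf{b}$ is a 512-length bias vector. The entries of $W$ and $\\mathbf{b}$ are the values adjusted during training."
]
},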
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "muQXYwRkSZWz"
},
"outputs": [],
"source": [
"# An \"activation\" is a non-linear function applied to the output of the layer above.\n",
"# It takes the value computed by each node and decides whether that artificial neuron has \"fired\".\n",
"# The Rectified Linear Unit (ReLU) sets every negative value to zero before it is passed to the next layer;\n",
"# those nodes are considered not to have fired.\n",
"# Positive values pass through unchanged.\n",
"\n",
"model.add(Activation('relu'))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mbr9yOvQSZW0"
},
"source": [
"$$f(x) = \\max(0, x)$$\n",
"<img src = 'relu.jpg' >"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "a_UOk65ZSZW0"
},
"outputs": [],
"source": [
"# Dropout zeroes a selection of random outputs (i.e., disables their activation)\n",
"# Dropout helps protect the model from memorizing or \"overfitting\" the training data.\n",
"model.add(Dropout(0.2))"
]
},
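{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of what this layer does during training, assuming the inverted-dropout scaling that Keras documents for `Dropout` ($p$, $m_j$, $h_j$ and $\\tilde{h}_j$ are symbols introduced here): each output $h_j$ is kept with probability $1 - p$, and the kept values are scaled up so that the expected sum is unchanged:\n",
"\n",
"$$\\tilde{h}_j = \\frac{m_j\\, h_j}{1 - p}, \\qquad m_j \\sim \\mathrm{Bernoulli}(1 - p)$$\n",
"\n",
"With $p = 0.2$, roughly one output in five is zeroed and the rest are multiplied by $1 / 0.8 = 1.25$. At inference time the layer passes its input through unchanged."
]
},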
{
"cell_type": "markdown",
"metadata": {
"id": "MpY_KMlSSZW0"
},
"source": [
"## Adding the second hidden layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "tsbpjukcSZW0"
},
"outputs": [],
"source": [
"# The second hidden layer appears identical to our first layer.\n",
"# However, instead of each of the 512 nodes receiving 784 inputs from the input image data,\n",
"# each node receives 512 inputs from the output of the first 512-node layer.\n",
"\n",
"model.add(Dense(512))\n",
"model.add(Activation('relu'))\n",
"model.add(Dropout(0.2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XNilP6HwSZW1"
},
"source": [
"## The Final Output Layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "QiSeDRaDSZW1"
},
"outputs": [],
"source": [
"# The final layer of 10 neurons is fully connected to the previous 512-node layer.\n",
"# The number of nodes in the final layer of an FCN should equal the number of classes (10 in this case).\n",
"model.add(Dense(10))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "-mr-z2KgSZW1"
},
"outputs": [],
"source": [
"# The \"softmax\" activation represents a probability distribution over K different possible outcomes.\n",
"# Its values are all non-negative and sum to 1.\n",
"\n",
"model.add(Activation('softmax'))"
]
},
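{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the standard softmax definition (with $z_k$ denoting the $k$-th output of the preceding dense layer, a symbol introduced here) is\n",
"\n",
"$$\\mathrm{softmax}(\\mathbf{z})_k = \\frac{e^{z_k}}{\\sum_{j=1}^{K} e^{z_j}}$$\n",
"\n",
"As a small illustrative example with $K = 3$ rather than 10, the made-up inputs $\\mathbf{z} = (2, 1, 0)$ give $e^{2} \\approx 7.389$, $e^{1} \\approx 2.718$ and $e^{0} = 1$, which sum to about $11.107$, so the outputs are approximately $(0.665, 0.245, 0.090)$."
]
},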
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "HEOQz_a5SZW1",
"outputId": "327070f1-f153-453a-f519-67e4adbb7e43",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Model: \"sequential_1\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_3 (Dense) (None, 512) 401920 \n",
" \n",
" activation_3 (Activation) (None, 512) 0 \n",
" \n",
" dropout_2 (Dropout) (None, 512) 0 \n",
" \n",
" dense_4 (Dense) (None, 512) 262656 \n",
" \n",
" activation_4 (Activation) (None, 512) 0 \n",
" \n",
" dropout_3 (Dropout) (None, 512) 0 \n",
" \n",
" dense_5 (Dense) (None, 10) 5130 \n",
" \n",
" activation_5 (Activation) (None, 10) 0 \n",
" \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"# Summarize the built model\n",
"\n",
"model.summary()"
]
},
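{
"cell_type": "markdown",
"metadata": {},
"source": [
"The parameter counts in the summary can be checked by hand: a dense layer with $n_{\\text{in}}$ inputs and $n_{\\text{out}}$ nodes has $n_{\\text{in}} \\cdot n_{\\text{out}}$ weights plus $n_{\\text{out}}$ biases (the symbols are notation introduced here), while the activation and dropout layers have no parameters.\n",
"\n",
"$$\\begin{aligned}\n",
"784 \\cdot 512 + 512 &= 401{,}920 \\\\\\\\\n",
"512 \\cdot 512 + 512 &= 262{,}656 \\\\\\\\\n",
"512 \\cdot 10 + 10 &= 5{,}130 \\\\\\\\\n",
"\\text{total} &= 669{,}706\n",
"\\end{aligned}$$"
]
},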
{
"cell_type": "markdown",
"metadata": {
"id": "sWFZFpyjSZW1"
},
"source": [
"## Compiling the model\n",
"\n",
"Keras was originally designed to run on top of backends such as Theano and TensorFlow; this notebook uses TensorFlow. These packages allow you to define a *computation graph* in Python, which then compiles and runs efficiently on the CPU or GPU without the overhead of the Python interpreter.\n",
"\n",
"When compiling a model, Keras asks you to specify your **loss function** and your **optimizer**. The loss function we'll use here is called *categorical cross-entropy*, which is well suited to comparing two probability distributions.\n",
"\n",
"Our predictions are probability distributions across the ten different digits (e.g. \"we're 80% confident this image is a 3, 10% sure it's an 8, 5% it's a 2, etc.\"), and the target is a probability distribution with 100% for the correct category, and 0 for everything else. The cross-entropy is a measure of how different your predicted distribution is from the target distribution. [More detail at Wikipedia](https://en.wikipedia.org/wiki/Cross_entropy)\n",
"\n",
"The optimizer determines how the model's weights are updated through **gradient descent**. The size of each step it takes down the gradient is controlled by the **learning rate**."
]
},
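{
"cell_type": "markdown",
"metadata": {},
"source": [
"Returning to the loss function described above, in symbols (a sketch using notation introduced here: $y_k$ is the one-hot target and $\\hat{y}_k$ the predicted probability for class $k$), the categorical cross-entropy for a single image is\n",
"\n",
"$$L = -\\sum_{k=0}^{9} y_k \\log \\hat{y}_k$$\n",
"\n",
"Because the target is one-hot, only the term for the correct class $c$ survives, so $L = -\\log \\hat{y}_c$. For instance, if the network assigns probability $0.8$ to the correct digit, then $L = -\\ln 0.8 \\approx 0.223$; a perfect prediction of $1.0$ gives $L = 0$."
]
},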
{
"cell_type": "markdown",
"metadata": {
"id": "i416OMQ9SZW1"
},
"source": [
"<img src = \"gradient_descent.png\" >"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CWMED60kSZW2"
},
"source": [
"<img src = \"learning_rate.png\" >"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2RYA0SvnSZW2"
},
"source": [
"So are smaller learning rates better? Not quite! It's important for an optimizer not to get stuck in local minima while neglecting the global minimum of the loss function. Sometimes that means trying a larger learning rate to jump out of a local minimum."
]
},
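{
"cell_type": "markdown",
"metadata": {},
"source": [
"The role of the learning rate can be seen in the update rule of plain gradient descent (a simplified sketch; the Adam optimizer used below adapts the step size for each parameter, and $\\theta$, $\\eta$ and $L$ are symbols introduced here):\n",
"\n",
"$$\\theta \\leftarrow \\theta - \\eta\\, \\nabla_{\\theta} L(\\theta)$$\n",
"\n",
"Here $\\theta$ stands for the network's weights and biases, $L$ is the loss and $\\eta$ is the learning rate. A small $\\eta$ takes short, cautious steps; a large $\\eta$ takes long steps that can overshoot a minimum or hop out of a shallow one."
]
},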
{
"cell_type": "markdown",
"metadata": {
"id": "Az2yjptpSZW2"
},
"source": [
"<img src = 'complicated_loss_function.png' >"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "t0HI0iLzSZW2"
},
"outputs": [],
"source": [
"# Let's use the Adam optimizer for learning\n",
"model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "s8ZL4XflSZW2"
},
"source": [
"## Train the model!\n",
"This is the fun part! "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0HclES3qSZW2"
},
"source": [
"The batch size determines how much data is used per step to compute the loss function, gradients, and backpropagation. Large batch sizes allow the network to complete its training faster; however, there are other factors beyond training speed to consider.\n",
"\n",
"Too large a batch size smooths the local minima of the loss function, so the optimizer may settle in one because it thinks it has found the global minimum.\n",
"\n",
"Too small a batch size creates a very noisy loss function, and the optimizer may never find the global minimum.\n",
"\n",
"So a good batch size may take some trial and error to find!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "BIupkxW8SZW3",
"outputId": "42ee7d64-4599-490e-b8a0-ce37308d2792",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Epoch 1/5\n",
"469/469 [==============================] - 10s 20ms/step - loss: 0.2475 - accuracy: 0.9254\n",
"Epoch 2/5\n",
"469/469 [==============================] - 8s 17ms/step - loss: 0.1014 - accuracy: 0.9681\n",
"Epoch 3/5\n",
"469/469 [==============================] - 8s 17ms/step - loss: 0.0709 - accuracy: 0.9771\n",
"Epoch 4/5\n",
"469/469 [==============================] - 8s 16ms/step - loss: 0.0544 - accuracy: 0.9827\n",
"Epoch 5/5\n",
"469/469 [==============================] - 8s 17ms/step - loss: 0.0475 - accuracy: 0.9848\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f4b50afb8d0>"
]
},
"metadata": {},
"execution_count": 40
}
],
"source": [
"model.fit(X_train, Y_train,\n",
" batch_size=128, epochs=5,\n",
" verbose=1)"
]
},
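{
"cell_type": "markdown",
"metadata": {},
"source": [
"The step count shown in the training log follows from the batch size (a worked check; $N$ and $B$ are symbols introduced here for the number of training images and the batch size):\n",
"\n",
"$$\\text{steps per epoch} = \\left\\lceil \\frac{N}{B} \\right\\rceil = \\left\\lceil \\frac{60000}{128} \\right\\rceil = \\lceil 468.75 \\rceil = 469$$\n",
"\n",
"By the same formula, a batch size of 10,000 would give 6 steps per epoch and a batch size of 32 would give 1,875."
]
},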
{
"cell_type": "markdown",
"metadata": {
"id": "PDPsFYzgSZW3"
},
"source": [
"The two numbers, in order, represent the value of the loss function of the network on the training set, and the overall accuracy of the network on the training data. But how does it do on data it did not train on?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "e0ZXqmFFSZW3"
},
"source": [
"## Evaluate Model's Accuracy on Test Data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "lucBi0IUSZW3",
"outputId": "748bfa3e-f2a2-4965-f075-73e4657ffbb7",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"313/313 [==============================] - 1s 3ms/step - loss: 0.0792 - accuracy: 0.9772\n",
"Test score: 0.07915858179330826\n",
"Test accuracy: 0.9771999716758728\n"
]
}
],
"source": [
"score = model.evaluate(X_test, Y_test)\n",
"print('Test score:', score[0])\n",
"print('Test accuracy:', score[1])"
]
},
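{
"cell_type": "markdown",
"metadata": {},
"source": [
"To put the test accuracy in concrete terms (a back-of-the-envelope reading of the output above, not an additional measurement), accuracy is the fraction of test images classified correctly:\n",
"\n",
"$$\\text{accuracy} = \\frac{\\text{number correct}}{\\text{number of test images}}$$\n",
"\n",
"An accuracy of about $0.9772$ on $10{,}000$ test images therefore corresponds to roughly $9{,}772$ correct predictions and $228$ mistakes."
]
},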
{
"cell_type": "markdown",
"metadata": {
"id": "S4ZS7ZDsSZW3"
},
"source": [
"### Inspecting the output\n",
"\n",
"It's always a good idea to inspect the output and make sure everything looks sane. Here we'll look at some examples it gets right, and some examples it gets wrong."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"id": "S5JPQ0aASZW3",
"outputId": "e767e41d-8778-4899-e7b3-885d72bd216a",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:6: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.\n",
" \n",
"/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:8: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.\n",
" \n"
]
}
],
"source": [
"# model.predict returns one row of 10 class probabilities per input example;\n",
"# np.argmax along axis 1 picks the most probable class for each example.\n",
"predicted_classes = np.argmax(model.predict(X_test), axis=1)\n",
"\n",
"# Check which items we got right / wrong\n",
"correct_indices = np.nonzero(predicted_classes == y_test)[0]\n",
"\n",
"incorrect_indices = np.nonzero(predicted_classes != y_test)[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Qc3mT5aqSZW4",
"outputId": "09da71e2-e00f-40b9-ace3-37d4fe069a96",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 102
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 648x648 with 0 Axes>"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 648x648 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
],
"source": [
"plt.figure()\n",
"for i, correct in enumerate(correct_indices[:9]):\n",
" plt.subplot(3,3,i+1)\n",
" plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')\n",
" plt.title(\"Predicted {}, Class {}\".format(predicted_classes[correct], y_test[correct]))\n",
" \n",
"plt.tight_layout()\n",
" \n",
"plt.figure()\n",
"for i, incorrect in enumerate(incorrect_indices[:9]):\n",
" plt.subplot(3,3,i+1)\n",
" plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')\n",
" plt.title(\"Predicted {}, Class {}\".format(predicted_classes[incorrect], y_test[incorrect]))\n",
" \n",
"plt.tight_layout()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "W2Sl18ZMSZW4"
},
"source": [
"# Try experimenting with the batch size!\n",
"\n",
"#### How does increasing the batch size to 10,000 affect the training time and test accuracy?\n",
"\n",
"#### How about a batch size of 32?"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Tensorflow (GPU)",
"language": "python",
"name": "py3.6-tfgpu"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
},
"colab": {
"name": "MNIST in Keras.ipynb",
"provenance": [],
"include_colab_link": true
}
},
"nbformat": 4,
"nbformat_minor": 0
}