{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Intro to Deep Learning with Keras\n",
"### Starter Code\n",
"* **IBM Code London Meetup:** https://www.meetup.com/IBM-Code-London/events/255417147/\n",
"* **Date:** Wed 31st October 2018\n",
"* **Instructor:** John Sandall\n",
"* **Contact:** john@coefficient.ai / @john_sandall\n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Imports\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"from pathlib import Path\n",
"import seaborn as sns\n",
"from sklearn import datasets, ensemble, linear_model, model_selection, neighbors, metrics, preprocessing, neural_network\n",
"import warnings\n",
"\n",
"%matplotlib inline\n",
"warnings.filterwarnings('ignore')\n",
"np.random.seed(0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lab: Multi-Layer Perceptron\n",
"### The MNIST Dataset\n",
"![MNIST](https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png)\n",
"\n",
"From [Wikipedia](https://en.wikipedia.org/wiki/MNIST_database):\n",
"> The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning.\n",
"\n",
"From [OpenML](https://www.openml.org/d/554) (the source for this specific data):\n",
"> The MNIST database of handwritten digits with 784 features, raw data available at: http://yann.lecun.com/exdb/mnist/. It can be split in a training set of the first 60,000 examples, and a test set of 10,000 examples "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras import datasets"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download MNIST via Keras (requires internet connection)\n",
"(X_train, y_train), (X_test, y_test) = datasets.mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# What does this data look like?\n",
"print(\"X_train:\", X_train.shape)\n",
"print(\"X_test:\", X_test.shape)\n",
"print(\"y_train:\", y_train.shape)\n",
"print(\"y_test:\", y_test.shape)\n",
"print(\"One sample from X_train:\", X_train[0].shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Visualise some samples\n",
"print(\"Class =\", y_train[1])\n",
"plt.matshow(X_train[1], cmap=plt.cm.gray)\n",
"\n",
"print(\"Class =\", y_train[2])\n",
"plt.matshow(X_train[2], cmap=plt.cm.gray)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Let's look at a single sample\n",
"X_train[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# This needs to be flattened before we can feed it into sklearn's MLPClassifier\n",
"X_train_flat = np.array([elt.reshape(784,) for elt in X_train])\n",
"X_test_flat = np.array([elt.reshape(784,) for elt in X_test])\n",
"print(X_train[0].shape)\n",
"print(X_train_flat[0].shape)"
]
},
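{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Tip (optional):** the pixel values are raw integers from 0 to 255. Scaling them to the range [0, 1] often helps gradient-based optimisers converge. The sketch below is one way to do this; the `*_scaled` names are illustrative only, and the rest of this notebook continues with the unscaled data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sketch: scale pixel intensities from 0-255 down to 0-1.\n",
"# (Illustrative only; the cells below continue to use X_train_flat as-is.)\n",
"X_train_scaled = X_train_flat / 255.0\n",
"X_test_scaled = X_test_flat / 255.0\n",
"print(X_train_scaled.min(), X_train_scaled.max())"
]
},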
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a multi-layer perceptron in scikit-learn"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Fit a basic model\n",
"mlp = neural_network.MLPClassifier(hidden_layer_sizes=(50,), max_iter=10, alpha=1e-4,\n",
" solver='sgd', verbose=10, tol=1e-4, random_state=1,\n",
" learning_rate_init=.1)\n",
"mlp.fit(X_train_flat, y_train)\n",
"print(\"Training set score: %f\" % mlp.score(X_train_flat, y_train))\n",
"print(\"Test set score: %f\" % mlp.score(X_test_flat, y_test))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# What do the coefficients look like?\n",
"print(\"Hidden layer:\", mlp.coefs_[0].shape)\n",
"print(\"Output layer:\", mlp.coefs_[1].shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# There are 50 of these \"weight matrices\", each specialising in enhancing signal from certain shapes/areas.\n",
"# Here are first few.\n",
"plt.matshow(mlp.coefs_[0][:,1].reshape(28,28))\n",
"plt.matshow(mlp.coefs_[0][:,2].reshape(28,28))\n",
"plt.matshow(mlp.coefs_[0][:,3].reshape(28,28))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Fit a deep neural network with two hidden layers (both with 100 neurons)\n",
"# WARNING: Takes a while! scikit-learn really isn't designed for this kind of work!\n",
"mlp = neural_network.MLPClassifier(hidden_layer_sizes=(100, 100), max_iter=400, alpha=1e-4,\n",
" solver='sgd', verbose=10, tol=1e-4, random_state=1)\n",
"mlp.fit(X_train_flat, y_train)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"Training set score: %f\" % mlp.score(X_train_flat, y_train))\n",
"print(\"Test set score: %f\" % mlp.score(X_test_flat, y_test))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# What do the coefficients look like?\n",
"print(\"Hidden layer #1:\", mlp.coefs_[0].shape)\n",
"print(\"Hidden layer #2:\", mlp.coefs_[1].shape)\n",
"print(\"Output layer:\", mlp.coefs_[2].shape)\n",
"\n",
"# Look at some of the weight matrices in the first and second layer.\n",
"plt.matshow(mlp.coefs_[0][:,1].reshape(28,28))\n",
"plt.matshow(mlp.coefs_[0][:,2].reshape(28,28))\n",
"plt.matshow(mlp.coefs_[1])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Visualise and predict for one of the test set classes\n",
"print(\"Predicted class:\", mlp.predict(X_test_flat[:1])[0])\n",
"print(\"Predicted probabilities:\", [round(x, 4) for x in mlp.predict_proba(X_test_flat[:1])[0]])\n",
"plt.matshow(X_test[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Visualise confusion matrix\n",
"# Adapted from sklearn example code:\n",
"# http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html\n",
"import itertools\n",
"\n",
"\n",
"def plot_confusion_matrix(cm, classes):\n",
" \"\"\"\n",
" This function prints and plots the confusion matrix.\n",
" \"\"\"\n",
" plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)\n",
" plt.title('Confusion matrix')\n",
" plt.colorbar()\n",
" tick_marks = np.arange(len(classes))\n",
" plt.xticks(tick_marks, classes, rotation=45)\n",
" plt.yticks(tick_marks, classes)\n",
"\n",
" thresh = cm.max() / 2.\n",
" for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):\n",
" plt.text(j, i, format(cm[i, j], 'd'),\n",
" horizontalalignment=\"center\",\n",
" color=\"white\" if cm[i, j] > thresh else \"black\")\n",
"\n",
" plt.tight_layout()\n",
" plt.ylabel('True label')\n",
" plt.xlabel('Predicted label')\n",
"\n",
"cm = metrics.confusion_matrix(y_test, mlp.predict(X_test_flat))\n",
"plot_confusion_matrix(cm, classes=range(10))\n",
"\n",
"# 5 is often confused with 8, as is 4/9, and 3/5."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lab: Build a MLP in Keras"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras.callbacks import EarlyStopping, TensorBoard, ModelCheckpoint, LearningRateScheduler\n",
"from keras.models import Sequential, Model, load_model, model_from_json, model_from_yaml, save_model\n",
"from keras.layers import Input, Dense, Activation, BatchNormalization\n",
"from keras import initializers, optimizers, utils"
]
},
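{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Aside:** not all of the callbacks imported above are used below. As a minimal sketch, this is how `EarlyStopping` would typically be wired into `model.fit`; the `monitor` and `patience` values here are illustrative, not prescriptive."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: stop training once validation loss has not improved for 3 epochs.\n",
"early_stopping = EarlyStopping(monitor='val_loss', patience=3)\n",
"# Usage: model.fit(X, y, validation_split=0.1, callbacks=[early_stopping], ...)"
]
},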
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check if Keras is using GPU version of TensorFlow\n",
"from tensorflow.python.client import device_lib\n",
"\n",
"print(device_lib.list_local_devices())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> #### Exercise: Create a `Sequential()` MLP with:\n",
"> - One hidden layer containing 50-neurons that accepts the flattened MNIST data as input (i.e. vector of length 784) + ReLU activation.\n",
"> - One 10-class output layer with Softmax activation.\n",
"> - Assign this to a variable called `model`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define\n",
"model = Sequential(...)"
]
},
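{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Spoiler:** one possible solution sketch for the exercise above, using `Dense` layers (try it yourself first!)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible solution: 784 inputs -> 50 ReLU neurons -> 10-class softmax output.\n",
"model = Sequential()\n",
"model.add(Dense(50, activation='relu', input_shape=(784,)))\n",
"model.add(Dense(10, activation='softmax'))\n",
"model.summary()"
]
},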
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> #### Exercise: Compile `model` using `sgd` optimizer, `categorical_crossentropy` loss, and `accuracy` metric.\n",
"> Cross-entropy aims to penalise models that estimate a low probability for the target class. For more intuition on how cross-entropy works, see https://www.quora.com/Whats-an-intuitive-way-to-think-of-cross-entropy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Compile\n",
"model.compile(...)"
]
},
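{
"cell_type": "markdown",
"metadata": {},
"source": [
"> One way to compile it (spoiler for the exercise above), matching the optimizer, loss and metric named in the exercise:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible solution: SGD optimizer, categorical cross-entropy loss, accuracy metric.\n",
"model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])"
]
},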
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> #### Exercise: Fit the compiled model to `X_train_flat` and `y_train` using 1 epoch and a `batch_size` of 32. How does the accuracy result compare to the 50-neuron MLP from sklearn?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# If you don't have a GPU you may wish to reduce the dataset for expedience. Otherwise leave this commented out.\n",
"# X_train = X_train[:10000]\n",
"# X_train_flat = X_train_flat[:10000]\n",
"# y_train = y_train[:10000]\n",
"\n",
"# Convert labels to categorical one-hot encoding\n",
"y_train_encoded = utils.to_categorical(y_train, num_classes=10)\n",
"y_test_encoded = utils.to_categorical(y_test, num_classes=10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Fit (this may take a while!)\n",
"model.fit(...)"
]
},
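{
"cell_type": "markdown",
"metadata": {},
"source": [
"> A minimal fit call matching the exercise (spoiler); note that it uses the one-hot encoded labels created above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible solution: 1 epoch, batch size of 32, one-hot encoded targets.\n",
"model.fit(X_train_flat, y_train_encoded, epochs=1, batch_size=32)"
]
},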
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"score = model.evaluate(...)\n",
"score"
]
},
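{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Evaluation could then look like this; with the compile settings above, `evaluate` returns `[loss, accuracy]`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible solution: evaluate on the held-out test set.\n",
"score = model.evaluate(X_test_flat, y_test_encoded)\n",
"print(\"Test loss:\", score[0])\n",
"print(\"Test accuracy:\", score[1])"
]
},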
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> #### Exercise: Repeat for a DNN with two hidden layers with 100 neurons in each layer. How does this compare (in terms of both speed and accuracy) with the MLP DNN?\n",
"> \n",
"> **Tip!** This model is a lot more complex, you may want to run this for 50+ epochs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ...\n"
]
},
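{
"cell_type": "markdown",
"metadata": {},
"source": [
"> And a possible sketch for the deeper network. The hyperparameters (e.g. `epochs=50`) follow the tip above and are illustrative, not prescriptive."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: two 100-neuron ReLU hidden layers + a 10-class softmax output.\n",
"deep_model = Sequential()\n",
"deep_model.add(Dense(100, activation='relu', input_shape=(784,)))\n",
"deep_model.add(Dense(100, activation='relu'))\n",
"deep_model.add(Dense(10, activation='softmax'))\n",
"deep_model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])\n",
"deep_model.fit(X_train_flat, y_train_encoded, epochs=50, batch_size=32)\n",
"deep_model.evaluate(X_test_flat, y_test_encoded)"
]
},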
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Re-using Keras \"Application\" Models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras.applications.resnet50 import ResNet50\n",
"from keras.preprocessing import image\n",
"from keras.applications.resnet50 import preprocess_input, decode_predictions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = ResNet50(weights='imagenet')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Feel free to replace this with any image!\n",
"import urllib.request\n",
"\n",
"img_path = 'elephant.jpg'\n",
"image_url = \"https://upload.wikimedia.org/wikipedia/commons/thumb/3/37/African_Bush_Elephant.jpg/220px-African_Bush_Elephant.jpg\"\n",
"urllib.request.urlretrieve(image_url, img_path)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"img = image.load_img(img_path, target_size=(224, 224))\n",
"x = image.img_to_array(img)\n",
"x = np.expand_dims(x, axis=0)\n",
"x = preprocess_input(x)\n",
"\n",
"preds = model.predict(x)\n",
"# Decode the results into a list of tuples (class, description, probability)\n",
"print('Predicted:', decode_predictions(preds, top=3)[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Next steps\n",
"* **[Intro to Python for Data Science](https://www.eventbrite.co.uk/e/intro-to-python-for-data-science-registration-51843211441)** Two workshops on 5th November (Part I) and 7th November (Part II). Use `LEARN2CODE29` for 10% off (gives you both workshops for just £90).\n",
"* **[Future workshops](https://mailchi.mp/a06466074a39/coefficient-training):** [Sign up here](https://mailchi.mp/a06466074a39/coefficient-training) to hear about future workshops.\n",
"* **[Learn Python The Hard Way](https://learnpythonthehardway.org/book/):** Free online resource to learn Python to a somewhat advanced level.\n",
"* **[Learn pandas & sklearn on Kaggle](https://www.kaggle.com/learn/overview):** Jupyter Notebook based training exercises and examples.\n",
"* **Contact:** john@coefficient.ai / @john_sandall\n",
"\n",
"---"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python (conda base)",
"language": "python",
"name": "base"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}