{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"accelerator": "GPU",
"colab": {
"name": "transfer_learning_with_hub.ipynb",
"provenance": [],
"private_outputs": true,
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/ia35/1537da88835e631a1c6d0c2d3dd0d476/transfer_learning_with_hub.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TRJoD8nITMtA",
"colab_type": "text"
},
"source": [
"[![](http://bec552ebfe.url-de-test.ws/ml/buttonBackProp.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "D9NOqI51TZAu",
"colab_type": "text"
},
"source": [
"The original code is described [here](https://www.tensorflow.org/tutorials/images/transfer_learning_with_hub) on the TensorFlow website.\n",
"\n",
"Modifications and comments added by BackProp.fr for educational purposes."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cFD-FuDNVV48",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"The BackProp logo is shown whenever a significant change is made to the code or whenever a comment needs to be pointed out.\n",
"\n",
"The English text is either the original text or an excerpt from a website that provides explanations."
]
},
{
"cell_type": "code",
"metadata": {
"id": "TbU7ctkMafQe",
"colab_type": "code",
"colab": {}
},
"source": [
"# BP means that the line of code has been removed by us (commented out)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "W_tvPdyfA-BL"
},
"source": [
"##### Copyright 2018 The TensorFlow Authors."
]
},
{
"cell_type": "code",
"metadata": {
"cellView": "form",
"colab_type": "code",
"id": "0O_LFhwSBCjm",
"colab": {}
},
"source": [
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "PWUmcKKjtwXL"
},
"source": [
"# Transfer learning with TensorFlow Hub\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/tutorials/images/transfer_learning_with_hub\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/images/transfer_learning_with_hub.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/docs/blob/master/site/en/tutorials/images/transfer_learning_with_hub.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/images/transfer_learning_with_hub.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "crU-iluJIEzw"
},
"source": [
"[TensorFlow Hub](http://tensorflow.org/hub) is a way to share pretrained model components. See the [TensorFlow Module Hub](https://tfhub.dev/) for a searchable listing of pre-trained models. This tutorial demonstrates:\n",
"\n",
"1. How to use TensorFlow Hub with `tf.keras`.\n",
"1. How to do image classification using TensorFlow Hub.\n",
"1. How to do simple transfer learning."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "CKFUvuEho9Th"
},
"source": [
"## <font color=\"teal\">Setup</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "55DOF5cmYaJy",
"colab_type": "text"
},
"source": [
"The `from __future__` import does not really make sense in our case, because we force a Python version > 3.x and a TensorFlow version 2.x"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3VoRrCeoY9Rf",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "VwJnbE3YZPAK",
"colab_type": "code",
"colab": {}
},
"source": [
"! python -V"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "ST_dng1uY3qs",
"colab_type": "code",
"colab": {}
},
"source": [
"# BP from __future__ import absolute_import, division, print_function, unicode_literals"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "mZ2Qq8TAZhBz",
"colab_type": "code",
"colab": {}
},
"source": [
"import matplotlib.pylab as plt"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wqmtWqbAZv9H",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"We do not need the very latest version of tf (tf-nightly); 2.x is enough for us"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "OGNpmn43C0O6",
"colab": {}
},
"source": [
"try:\n",
" # %tensorflow_version only exists in Colab.\n",
" # BP !pip install tf-nightly\n",
" %tensorflow_version 2.x\n",
"except Exception:\n",
" pass\n",
"import tensorflow as tf"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "SKjJbHICa9Az",
"colab_type": "code",
"colab": {}
},
"source": [
"print(\"TensorFlow version: \", tf.__version__)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "YGYYsG1YZ9d7",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"Same thing for tensorflow_hub"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "s4Z9vFE1IQ2Q",
"colab": {}
},
"source": [
"# BP !pip install -U tf-hub-nightly\n",
"import tensorflow_hub as hub\n",
"\n",
"from tensorflow.keras import layers"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "c6lWti0va_R7",
"colab_type": "code",
"colab": {}
},
"source": [
"print(\"TensorFlow Hub version: \", hub.__version__)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "s4YuF5HvpM1W"
},
"source": [
"## <font color=\"teal\">An ImageNet classifier</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "xEY_Ow5loN6q"
},
"source": [
"### <font color=\"olive\">Download the classifier</font>\n",
"\n",
"Use `hub.KerasLayer` to load a MobileNet and wrap it up as a Keras layer. Any [TensorFlow 2 compatible image classifier URL](https://tfhub.dev/s?q=tf2&module-type=image-classification) from tfhub.dev will work here."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fTvJb49VcLmC",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"The model used here by default is mobilenet_v2_100_224. See our [article](https://meetup.backprop.fr/category/tensorflow/) on the subject for a deeper look at the model.\n",
"\n",
"This model was trained on ImageNet.\n",
"\n",
"Input images must be square, 224x224."
]
},
{
"cell_type": "code",
"metadata": {
"cellView": "both",
"colab_type": "code",
"id": "feiXojVXAbI9",
"colab": {}
},
"source": [
"classifier_url =\"https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/2\" #@param {type:\"string\"}"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ouEv2VVvdhFk",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"\n",
"input_shape=IMAGE_SHAPE+(3,), i.e. (224, 224, 3)\n",
"\n",
"Indeed, the images are 224x224 squares, but they are in color (hence 3 channels per image)"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "y_6bGjoPtzau",
"colab": {}
},
"source": [
"IMAGE_SHAPE = (224, 224)\n",
"\n",
"classifier = tf.keras.Sequential([\n",
" hub.KerasLayer(classifier_url, input_shape=IMAGE_SHAPE+(3,))\n",
"])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "ZZFFUIrX2ZUl",
"colab_type": "code",
"colab": {}
},
"source": [
"classifier.outputs"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "jpjqnUQMUR5y",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "eGziscv7RX94",
"colab_type": "code",
"colab": {}
},
"source": [
"classifier.get_config()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "pdG-EepIUTpL",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"Note that the model is sequential, has more than 3M parameters, and that those parameters are non-trainable."
]
},
{
"cell_type": "code",
"metadata": {
"id": "C2wzA8B-T-5O",
"colab_type": "code",
"colab": {}
},
"source": [
"classifier.summary()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "pwZXaoV0uXp2"
},
"source": [
"### <font color=\"olive\">Run it on a single image</font>\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "TQItP1i55-di"
},
"source": [
"Download a single image to try the model on."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6pgWttjRGLsf",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"\n",
"In case you did not know, Grace Hopper was an American computer scientist and one of the designers of the COBOL language"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "w5wDjXNjuXGD",
"colab": {}
},
"source": [
"import numpy as np\n",
"import PIL.Image as Image\n",
"\n",
"grace_hopper = tf.keras.utils.get_file('image.jpg','https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg')\n",
"grace_hopper = Image.open(grace_hopper).resize(IMAGE_SHAPE)\n",
"grace_hopper"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "q4DRTGzAemh3",
"colab_type": "text"
},
"source": [
"The image is normalized"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "BEmmBnGbLxPp",
"colab": {}
},
"source": [
"grace_hopper = np.array(grace_hopper)/255.0\n",
"grace_hopper.shape"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "0Ic8OEEo2b73"
},
"source": [
"Add a batch dimension, and pass the image to the model."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9JEslxN3etAB",
"colab_type": "text"
},
"source": [
"A batch dimension is required, even if the batch size is 1 (see the sketch below)"
]
},
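{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of what \"adding a batch dimension\" means: `np.newaxis` (used in the prediction cell below) and `tf.expand_dims` are two equivalent ways to turn a `(224, 224, 3)` image into a `(1, 224, 224, 3)` batch containing a single image."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Sketch: two equivalent ways to prepend a batch axis to the normalized image.\n",
"batched_np = grace_hopper[np.newaxis, ...]         # (1, 224, 224, 3) via NumPy indexing\n",
"batched_tf = tf.expand_dims(grace_hopper, axis=0)  # (1, 224, 224, 3) via TensorFlow\n",
"print(batched_np.shape, batched_tf.shape)"
],
"execution_count": 0,
"outputs": []
},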
{
"cell_type": "markdown",
"metadata": {
"id": "uOOAOYfAuheP",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"\n",
"ImageNet is widely used for benchmarking image classification models. It contains 14 million images in more than 20,000 categories.\n",
"\n",
"One way to get the data would be to use the ImageNet LSVRC 2012 dataset, which is a [1000](https://towardsdatascience.com/how-to-scrape-the-imagenet-f309e02de1f4)-class selection of the whole ImageNet and contains 1.28 million images.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "En_zTjOv6ZvL",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"One point needs clarifying: the \"light\" version of ImageNet has only 1000 categories, yet the classifier here has 1001 (a quick check follows below)."
]
},
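{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick check of the point above, assuming (to be verified) that the extra class is a \"background\" entry at index 0 of the ImageNetLabels.txt file used later in this notebook:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Sketch: inspect the label list used later in this notebook.\n",
"# Assumption to verify: the extra class is a \"background\" entry at index 0.\n",
"labels_path = tf.keras.utils.get_file('ImageNetLabels.txt','https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')\n",
"labels = open(labels_path).read().splitlines()\n",
"print(len(labels), labels[0])"
],
"execution_count": 0,
"outputs": []
},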
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "EMquyn29v8q3",
"colab": {}
},
"source": [
"result = classifier.predict(grace_hopper[np.newaxis, ...])\n",
"result.shape"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "NKzjqENF6jDF"
},
"source": [
"The result is a 1001-element vector of logits, rating the probability of each class for the image.\n",
"\n",
"So the top class ID can be found with argmax:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "u18iY9tIFO9n",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"For logit, see the definition [here](https://stackoverflow.com/questions/41455101/what-is-the-meaning-of-the-word-logits-in-tensorflow):\n",
"\n",
"logit means the vector of raw (non-normalized) predictions that a classification model generates, which is ordinarily then passed to a normalization function. If the model is solving a multi-class classification problem, logits typically become an input to the softmax function. The softmax function then generates a vector of (normalized) probabilities with one value for each possible class."
]
},
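{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small illustration of the definition above (a minimal sketch): applying a softmax to the raw logits in `result` yields normalized probabilities."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Sketch: turn the raw logits into probabilities with a softmax.\n",
"probabilities = tf.nn.softmax(result[0])\n",
"top_class = np.argmax(probabilities)\n",
"print(top_class, float(probabilities[top_class]))"
],
"execution_count": 0,
"outputs": []
},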
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "rgXb44vt6goJ",
"colab": {}
},
"source": [
"predicted_class = np.argmax(result[0], axis=-1)\n",
"predicted_class"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "YrxLMajMoxkf"
},
"source": [
"### <font color=\"olive\">Decode the predictions</font>\n",
"\n",
"\n",
"We have the predicted class ID.\n",
"Fetch the `ImageNet` labels and decode the predictions."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9Zg2qtqrVEGj",
"colab_type": "text"
},
"source": [
"We know that the predicted class is 653. To find out what it corresponds to, we need a lookup table."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "ij6SrDxcxzry",
"colab": {}
},
"source": [
"labels_path = tf.keras.utils.get_file('ImageNetLabels.txt','https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')\n",
"imagenet_labels = np.array(open(labels_path).read().splitlines())"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "64qUgEGlgfT9",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"tf.keras.utils.get_file returns the path of the downloaded file"
]
},
{
"cell_type": "code",
"metadata": {
"id": "NhRt201lfoeu",
"colab_type": "code",
"colab": {}
},
"source": [
"labels_path"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "yyGjhCfSgsjg",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"A few categories (labels)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "jeET0xOofxi5",
"colab_type": "code",
"colab": {}
},
"source": [
"imagenet_labels"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "5oytbgcv4ZsE",
"colab_type": "code",
"colab": {}
},
"source": [
"len(imagenet_labels)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "uzziRK3Z2VQo",
"colab": {}
},
"source": [
"plt.imshow(grace_hopper)\n",
"plt.axis('off')\n",
"predicted_class_name = imagenet_labels[predicted_class]\n",
"_ = plt.title(\"Prediction: \" + predicted_class_name.title())"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "amfzqn1Oo7Om"
},
"source": [
"## <font color=\"teal\">Simple transfer learning</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "K-nIpVJ94xrw"
},
"source": [
"Using TF Hub it is simple to retrain the top layer of the model to recognize the classes in our dataset."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Z93vvAdGxDMD"
},
"source": [
"### <font color=\"olive\">Dataset</font>\n",
"\n",
"\n",
" For this example you will use the TensorFlow flowers dataset:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1hWl2SHWCv93",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"Among the flower categories of interest to us, only daisy is present in the ImageNet subset that was used for training"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GX9big1DV2Ea",
"colab_type": "text"
},
"source": [
"The exercise consists of classifying a dataset of flowers, split into 5 categories."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "DrIUV3V0xDL_",
"colab": {}
},
"source": [
"data_root = tf.keras.utils.get_file(\n",
" 'flower_photos','https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',\n",
" untar=True)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "jFHdp18ccah7"
},
"source": [
"The simplest way to load this data into our model is to use `tf.keras.preprocessing.image.ImageDataGenerator`.\n",
"\n",
"All of TensorFlow Hub's image modules expect float inputs in the `[0, 1]` range. Use the `ImageDataGenerator`'s `rescale` parameter to achieve this.\n",
"\n",
"The image size will be handled later."
]
},
{
"cell_type": "code",
"metadata": {
"id": "p9EJwboTgU3d",
"colab_type": "code",
"colab": {}
},
"source": [
"data_root"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "2PwQ_wYDcii9",
"colab": {}
},
"source": [
"image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1/255)\n",
"image_data = image_generator.flow_from_directory(str(data_root), target_size=IMAGE_SHAPE)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "jx3XuBonWkKu",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "6tZpOesthAuw",
"colab_type": "code",
"colab": {}
},
"source": [
"type(image_data)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "QAFam-s9WlpL",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "bt-QYHa8WUML",
"colab_type": "code",
"colab": {}
},
"source": [
"image_data.labels"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "9yMKOaCLWnPp",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "5vJ_PexJWZ33",
"colab_type": "code",
"colab": {}
},
"source": [
"image_data.batch_size"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "WCnUcqXgG3tr",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"The 5 classes are:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "lA9ohMIlA0ki",
"colab_type": "code",
"colab": {}
},
"source": [
"image_data.class_indices"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "0p7iDOhIcqY2"
},
"source": [
"The resulting object is an iterator that returns `image_batch, label_batch` pairs."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "W4lDPkn2cpWZ",
"colab": {}
},
"source": [
"for image_batch, label_batch in image_data:\n",
" print(\"Image batch shape: \", image_batch.shape)\n",
" print(\"Label batch shape: \", label_batch.shape)\n",
" break"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "0gTN7M_GxDLx"
},
"source": [
"### <font color=\"olive\">Run the classifier on a batch of images</font>\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "N1xdrMNiXLq3",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"The batch size is 32."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "O3fvrZR8xDLv"
},
"source": [
"Now run the classifier on the image batch."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "nbyg6tcyxDLh",
"colab": {}
},
"source": [
"result_batch = classifier.predict(image_batch)\n",
"result_batch.shape"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Kv7ZwuR4xDLc",
"colab": {}
},
"source": [
"predicted_class_names = imagenet_labels[np.argmax(result_batch, axis=-1)]\n",
"predicted_class_names"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "QmvSWg9nxDLa"
},
"source": [
"Now check how these predictions line up with the images:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "IXTB22SpxDLP",
"colab": {}
},
"source": [
"plt.figure(figsize=(10,9))\n",
"plt.subplots_adjust(hspace=0.5)\n",
"for n in range(30):\n",
" plt.subplot(6,5,n+1)\n",
" plt.imshow(image_batch[n])\n",
" plt.title(predicted_class_names[n])\n",
" plt.axis('off')\n",
"_ = plt.suptitle(\"ImageNet predictions\")"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "FUa3YkvhxDLM"
},
"source": [
"See the `LICENSE.txt` file for image attributions.\n",
"\n",
"The results are far from perfect, but reasonable considering that these are not the classes the model was trained for (except \"daisy\")."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "JzV457OXreQP"
},
"source": [
"### <font color=\"olive\">Download the headless model</font>\n",
"\n",
"TensorFlow Hub also distributes models without the top classification layer. These can be used to easily do transfer learning.\n",
"\n",
"Any [Tensorflow 2 compatible image feature vector URL](https://tfhub.dev/s?module-type=image-feature-vector&q=tf2) from tfhub.dev will work here."
]
},
{
"cell_type": "code",
"metadata": {
"cellView": "both",
"colab_type": "code",
"id": "4bw8Jf94DSnP",
"colab": {}
},
"source": [
"feature_extractor_url = \"https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/2\" #@param {type:\"string\"}"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "sgwmHugQF-PD"
},
"source": [
"Create the feature extractor."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "5wB030nezBwI",
"colab": {}
},
"source": [
"feature_extractor_layer = hub.KerasLayer(feature_extractor_url,\n",
" input_shape=(224,224,3))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "0QzVdu4ZhcDE"
},
"source": [
"It returns a 1280-length vector for each image:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Of7i-35F09ls",
"colab": {}
},
"source": [
"feature_batch = feature_extractor_layer(image_batch)\n",
"print(feature_batch.shape)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "CtFmF7A5E4tk"
},
"source": [
"Freeze the variables in the feature extractor layer, so that the training only modifies the new classifier layer."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Jg5ar6rcE4H-",
"colab": {}
},
"source": [
"feature_extractor_layer.trainable = False"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "RPVeouTksO9q"
},
"source": [
"### <font color=\"olive\">Attach a classification head</font>\n",
"\n",
"\n",
"Now wrap the hub layer in a `tf.keras.Sequential` model, and add a new classification layer."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EJBNrFbgYDZo",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"What we do here is quite simple: at the end of our model, we add a Dense layer with 5 neurons (since there are 5 flower classes) and a softmax activation function"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "mGcY27fY1q3Q",
"colab": {}
},
"source": [
"model = tf.keras.Sequential([\n",
" feature_extractor_layer,\n",
" layers.Dense(image_data.num_classes, activation='softmax')\n",
"])\n",
"\n",
"model.summary()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "G9VkAz00HOJx",
"colab": {}
},
"source": [
"predictions = model(image_batch)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "sB7sVGJ23vrY",
"colab": {}
},
"source": [
"predictions.shape"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "OHbXQqIquFxQ"
},
"source": [
"### <font color=\"olive\">Train the model</font>\n",
"\n",
"\n",
"Use compile to configure the training process:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "3n0Wb9ylKd8R",
"colab": {}
},
"source": [
"model.compile(\n",
" optimizer=tf.keras.optimizers.Adam(),\n",
" loss='categorical_crossentropy',\n",
" metrics=['acc'])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "58-BLV7dupJA"
},
"source": [
"Now use the `.fit` method to train the model.\n",
"\n",
"To keep this example short, train for just 2 epochs. To visualize the training progress, use a custom callback to log the loss and accuracy of each batch individually, instead of the epoch average."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "jZ54Gubac4Lu",
"colab": {}
},
"source": [
"class CollectBatchStats(tf.keras.callbacks.Callback):\n",
" def __init__(self):\n",
" self.batch_losses = []\n",
" self.batch_acc = []\n",
"\n",
" def on_train_batch_end(self, batch, logs=None):\n",
" self.batch_losses.append(logs['loss'])\n",
" self.batch_acc.append(logs['acc'])\n",
" self.model.reset_metrics()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "DYJ7XaHlfQYs",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"Let us spell out the computation of steps_per_epoch"
]
},
{
"cell_type": "code",
"metadata": {
"id": "nwtHupJVeZdz",
"colab_type": "code",
"colab": {}
},
"source": [
"steps_per_epoch = np.ceil(image_data.samples/image_data.batch_size)\n",
"steps_per_epoch"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "vWh3bDUlegMI",
"colab_type": "code",
"colab": {}
},
"source": [
"image_data.samples"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "EyMDJxt2HdHr",
"colab": {}
},
"source": [
"#steps_per_epoch = np.ceil(image_data.samples/image_data.batch_size)\n",
"\n",
"batch_stats_callback = CollectBatchStats()\n",
"\n",
"history = model.fit_generator(image_data, epochs=2,\n",
" steps_per_epoch=steps_per_epoch,\n",
" callbacks = [batch_stats_callback])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Kd0N272B9Q0b"
},
"source": [
"Now, after even just a few training iterations, we can already see that the model is making progress on the task."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "A5RfS1QIIP-P",
"colab": {}
},
"source": [
"plt.figure()\n",
"plt.ylabel(\"Loss\")\n",
"plt.xlabel(\"Training Steps\")\n",
"plt.ylim([0,2])\n",
"plt.plot(batch_stats_callback.batch_losses)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "3uvX11avTiDg",
"colab": {}
},
"source": [
"plt.figure()\n",
"plt.ylabel(\"Accuracy\")\n",
"plt.xlabel(\"Training Steps\")\n",
"plt.ylim([0,1])\n",
"plt.plot(batch_stats_callback.batch_acc)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "kb__ZN8uFn-D"
},
"source": [
"### <font color=\"olive\">Check the predictions</font>\n",
"\n",
"\n",
"To redo the plot from before, first get the ordered list of class names:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tdhGUkmtl3ru",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "mbdbvz7ClG6N",
"colab_type": "code",
"colab": {}
},
"source": [
"image_data.class_indices"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "zSIrrLyol5rC",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "gQ2ypUyXlaxB",
"colab_type": "code",
"colab": {}
},
"source": [
"image_data.class_indices.items()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "kGoEDJDKl719",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "HndS5VEXlr8K",
"colab_type": "code",
"colab": {}
},
"source": [
"class_names = sorted(image_data.class_indices.items(), key=lambda pair:pair[1])\n",
"class_names"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "KuM9VHKQmAw6",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "XrodfWDKmCsz",
"colab_type": "code",
"colab": {}
},
"source": [
"class_names = np.array([key.title() for key, value in class_names])\n",
"class_names"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "JGbEf5l1I4jz",
"colab": {}
},
"source": [
"#class_names = sorted(image_data.class_indices.items(), key=lambda pair:pair[1])\n",
"#class_names = np.array([key.title() for key, value in class_names])\n",
"#class_names"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "4Olg6MsNGJTL"
},
"source": [
"Run the image batch through the model and convert the indices to class names."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "fCLVCpEjJ_VP",
"colab": {}
},
"source": [
"predicted_batch = model.predict(image_batch)\n",
"predicted_id = np.argmax(predicted_batch, axis=-1)\n",
"predicted_label_batch = class_names[predicted_id]"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "CkGbZxl9GZs-"
},
"source": [
"Plot the result"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "rpFQR1MPMtT1",
"colab": {}
},
"source": [
"label_id = np.argmax(label_batch, axis=-1)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "wC_AYRJU9NQe",
"colab": {}
},
"source": [
"plt.figure(figsize=(10,9))\n",
"plt.subplots_adjust(hspace=0.5)\n",
"for n in range(30):\n",
" plt.subplot(6,5,n+1)\n",
" plt.imshow(image_batch[n])\n",
" color = \"green\" if predicted_id[n] == label_id[n] else \"red\"\n",
" plt.title(predicted_label_batch[n].title(), color=color)\n",
" plt.axis('off')\n",
"_ = plt.suptitle(\"Model predictions (green: correct, red: incorrect)\")"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "bTI_5aCzmqnG",
"colab_type": "text"
},
"source": [
"[![](https://raw.githubusercontent.com/BackProp-fr/meetup/master/images/LogoBackPropTranspSmall.png)](https://www.backprop.fr)\n",
"\n",
"Green when the prediction is correct, red otherwise"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "uRcJnAABr22x"
},
"source": [
"## <font color=\"teal\">Export your model</font>\n",
"\n",
"Now that you've trained the model, export it as a saved model:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "PLcqg-RmsLno",
"colab": {}
},
"source": [
"import time\n",
"t = time.time()\n",
"\n",
"export_path = \"/tmp/saved_models/{}\".format(int(t))\n",
"model.save(export_path, save_format='tf')\n",
"\n",
"export_path"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "AhQ9liIUsPsi"
},
"source": [
"Now confirm that we can reload it, and it still gives the same results:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "7nI5fvkAQvbS",
"colab": {}
},
"source": [
"reloaded = tf.keras.models.load_model(export_path)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "jor83-LqI8xW",
"colab": {}
},
"source": [
"result_batch = model.predict(image_batch)\n",
"reloaded_result_batch = reloaded.predict(image_batch)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "dnZO14taYPH6",
"colab": {}
},
"source": [
"abs(reloaded_result_batch - result_batch).max()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "TYZd4MNiV3Rc"
},
"source": [
"This saved model can be loaded for inference later, or converted to [TFLite](https://www.tensorflow.org/lite/convert/) or [TFjs](https://github.com/tensorflow/tfjs-converter).\n",
"\n"
]
},
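{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of a TFLite conversion from the SavedModel exported above, assuming a TF 2.x converter (the output path /tmp/model.tflite is just for illustration):"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Sketch: convert the exported SavedModel to TFLite (assumes TF 2.x).\n",
"converter = tf.lite.TFLiteConverter.from_saved_model(export_path)\n",
"tflite_model = converter.convert()\n",
"with open('/tmp/model.tflite', 'wb') as f:  # illustrative output path\n",
"  f.write(tflite_model)"
],
"execution_count": 0,
"outputs": []
},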
{
"cell_type": "code",
"metadata": {
"id": "Jou9BReiH6gN",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
}
]
}