alinaselega/tensorflow_tutorial.ipynb

## tensorflow_tutorial.ipynb
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "tensorflow-tutorial.ipynb",
      "version": "0.3.2",
      "provenance": [],
      "collapsed_sections": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AKdYl9QyjukG",
        "colab_type": "text"
      },
      "source": [
        "This notebook follows the TensorFlow tutorial for the low-level API, which can be found [here](https://www.tensorflow.org/guide/low_level_intro).\n",
        "\n",
        "# Useful definitions to start with\n",
        "\n",
        "- `tf.Graph` -- a TensorFlow program, expressed as a graph of operations\n",
        "\n",
        "- `tf.Session` -- the TensorFlow runtime, in which operations are run\n",
        "\n",
        "- `tf.Tensor` -- the unit of data: a set of values shaped into an array of any number of dimensions.  \n",
        "Rank: number of dimensions  \n",
        "Shape: tuple of integers specifying the array's length along each dimension"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Ld5Ny2CHhaoq",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from __future__ import absolute_import\n",
        "from __future__ import division\n",
        "from __future__ import print_function\n",
        "\n",
        "import numpy as np\n",
        "import tensorflow as tf"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hGaChqQYK1AP",
        "colab_type": "text"
      },
      "source": [
        "In the example below, the rank is 2 and the shape is (2, 3)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "97FM35juiMcI",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "2ab04b1d-cb7e-4ce8-b523-89905ea1b182"
      },
      "source": [
        "x = [[1., 2., 3.], [4., 5., 6.]]\n",
        "x[1][2]"
      ],
      "execution_count": 45,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "6.0"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 45
        }
      ]
    },
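    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "The cell above only indexes into the nested list. As a quick check of the stated rank and shape, the list can be converted to a NumPy array. This is a minimal sketch that uses plain NumPy rather than TensorFlow; the name `x_array` is introduced here for illustration only."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "\n",
        "# Same nested list as in the cell above.\n",
        "x = [[1., 2., 3.], [4., 5., 6.]]\n",
        "x_array = np.array(x)\n",
        "\n",
        "# Rank is the number of dimensions; shape is the length along each one.\n",
        "rank = x_array.ndim\n",
        "shape = x_array.shape\n",
        "print(rank, shape)"
      ],
      "execution_count": 0,
      "outputs": []
    },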
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zGiBkcpLvaDw",
        "colab_type": "text"
      },
      "source": [
        "# Computational graph\n",
        "\n",
        "A computational graph is a series of operations arranged into a graph, where\n",
        "nodes are the *operations* and edges are *tensors*.\n",
        "\n",
        "Build a simple computation graph, where we define tensors for two constant scalars and their sum:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "wvF3Fek8rNhc",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 68
        },
        "outputId": "13410e7b-1e8a-49d7-acec-74e27960a82c"
      },
      "source": [
        "a = tf.constant(3.0, dtype=tf.float32)\n",
        "b = tf.constant(4.0) # also tf.float32 implicitly\n",
        "total = a + b\n",
        "print(a)\n",
        "print(b)\n",
        "print(total)"
      ],
      "execution_count": 46,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Tensor(\"Const_10:0\", shape=(), dtype=float32)\n",
            "Tensor(\"Const_11:0\", shape=(), dtype=float32)\n",
            "Tensor(\"add_2:0\", shape=(), dtype=float32)\n"
          ],
          "name": "stdout"
        }
      ]
    },
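    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fM-CZNrwlGZX",
        "colab_type": "text"
      },
      "source": [
        "Tensors `a`, `b`, and `total` represent the results of the operations\n",
        "that will be run; printing them shows no values because nothing has been\n",
        "executed yet. Each tensor is named after the operation that produces it,\n",
        "followed by an output index (e.g. `add:0`). TensorFlow keeps operation names\n",
        "unique within a graph, so re-running the cell yields suffixed names such as\n",
        "`add_2:0` in the output above.\n",
        "\n",
        "The computation graph can be visualised with TensorBoard."
      ]
    },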
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hgO1eoOVvhIP",
        "colab_type": "text"
      },
      "source": [
        "## Running the graph\n",
        "\n",
        "Instantiate a session object to run operations:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "NaglcBmLtjCa",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "17c8a779-3e8e-439c-d9a7-ba02738eedcc"
      },
      "source": [
        "sess = tf.Session()\n",
        "print(sess.run(total))"
      ],
      "execution_count": 47,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "7.0\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8FSvNOO7lu-D",
        "colab_type": "text"
      },
      "source": [
        "Upon executing `Session.run()`, TensorFlow backtracks through the graph\n",
        "and runs all nodes that provide input to the requested output node\n",
        "(here, `total`).  \n",
        "`Session.run()` accepts any number of tensors, which may be nested in tuples\n",
        "or dictionaries, and returns the values in the same structure:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "0NwOHnoPuuZE",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "d4511ed4-0bbb-4067-b532-63188f826708"
      },
      "source": [
        "print(sess.run({'ab':(a, b), 'total':total}))"
      ],
      "execution_count": 48,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "{'ab': (3.0, 4.0), 'total': 7.0}\n"
          ],
          "name": "stdout"
        }
      ]
    },
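    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "To make the backtracking idea concrete, here is a toy evaluator in plain Python. It only illustrates the principle of running the nodes that feed into the requested one; it is not how TensorFlow is implemented, and `toy_graph` and `toy_run` are hypothetical names introduced for this sketch. Note that the `unused` node is never evaluated, because `total` does not depend on it."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Each node name maps to (function, names of its input nodes).\n",
        "toy_graph = {\n",
        "    'a': (lambda: 3.0, []),\n",
        "    'b': (lambda: 4.0, []),\n",
        "    'total': (lambda a, b: a + b, ['a', 'b']),\n",
        "    'unused': (lambda a: a * 100.0, ['a']),\n",
        "}\n",
        "\n",
        "def toy_run(graph, fetch, evaluated=None):\n",
        "    # Recursively evaluate only the nodes that feed into `fetch`.\n",
        "    if evaluated is None:\n",
        "        evaluated = {}\n",
        "    if fetch not in evaluated:\n",
        "        fn, input_names = graph[fetch]\n",
        "        inputs = [toy_run(graph, name, evaluated) for name in input_names]\n",
        "        evaluated[fetch] = fn(*inputs)\n",
        "    return evaluated[fetch]\n",
        "\n",
        "evaluated_nodes = {}\n",
        "result = toy_run(toy_graph, 'total', evaluated_nodes)\n",
        "print(result)\n",
        "print(sorted(evaluated_nodes))"
      ],
      "execution_count": 0,
      "outputs": []
    },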
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5P3AAuWrl4ak",
        "colab_type": "text"
      },
      "source": [
        "Running an Operation (rather than an output Tensor) returns `None`; it is done\n",
        "to cause an effect, not to retrieve a value."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_tkkbp61wmiP",
        "colab_type": "text"
      },
      "source": [
        "## Placeholders\n",
        "\n",
        "A *placeholder* is a tensor whose value is supplied from outside when the\n",
        "graph is run, much like a parameter of a function. Placeholders behave like\n",
        "other Tensors (any tensor can be overwritten by feeding it a value), except\n",
        "that they raise an error if no value is fed to them."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "xSxgr0EOwkJW",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x = tf.placeholder(tf.float32)\n",
        "y = tf.placeholder(tf.float32)\n",
        "z = x + y"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ycWMRHfqNwNh",
        "colab_type": "text"
      },
      "source": [
        "`feed_dict` is an argument of `Session.run()` used to feed values to placeholders."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "kMFRQNwnwsdB",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "360af5bf-ac20-4a29-b00e-c4f3cd917a53"
      },
      "source": [
        "print(sess.run(z, feed_dict={x: 3, y: 4.5}))\n",
        "print(sess.run(z, feed_dict={x: [1, 3], y: [2, 4]}))"
      ],
      "execution_count": 50,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "7.5\n",
            "[3. 7.]\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "k_VyaqnrPkPM",
        "colab_type": "text"
      },
      "source": [
        "# Handling data\n",
        "\n",
        "To load data into a model, the `tf.data` API should be used. To obtain a\n",
        "Tensor from a Dataset, create an Iterator and call its `get_next()` method:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "G86LXCK_OIWF",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "my_data = [\n",
        "    [0, 1,],\n",
        "    [2, 3,],\n",
        "    [4, 5,],\n",
        "    [6, 7,],\n",
        "]\n",
        "slices = tf.data.Dataset.from_tensor_slices(my_data)\n",
        "next_item = slices.make_one_shot_iterator().get_next()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tC90Wwm3m4YT",
        "colab_type": "text"
      },
      "source": [
        "Reaching the end of the data causes the Dataset to raise an\n",
        "`OutOfRangeError`, so all items can be read like this:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PNLPJAUzSn52",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 85
        },
        "outputId": "7a6c527c-d9a6-4035-b659-f62b4d4c7f16"
      },
      "source": [
        "while True:\n",
        "  try:\n",
        "    print(sess.run(next_item))\n",
        "  except tf.errors.OutOfRangeError:\n",
        "    break"
      ],
      "execution_count": 52,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[0 1]\n",
            "[2 3]\n",
            "[4 5]\n",
            "[6 7]\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-1leZIH-T3ZF",
        "colab_type": "text"
      },
      "source": [
        "If the Dataset depends on stateful operations, such as Variables or Tensors\n",
        "created by `tf.random_normal`, you need an initialisable iterator, and its\n",
        "initialiser must be run before the first item is read:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "TJkfT7WMTisL",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 187
        },
        "outputId": "da0a3332-8afd-46af-e0cc-6340773a5915"
      },
      "source": [
        "r = tf.random_normal([10,3])\n",
        "dataset = tf.data.Dataset.from_tensor_slices(r)\n",
        "iterator = dataset.make_initializable_iterator()\n",
        "next_row = iterator.get_next()\n",
        "\n",
        "sess.run(iterator.initializer)\n",
        "while True:\n",
        "  try:\n",
        "    print(sess.run(next_row))\n",
        "  except tf.errors.OutOfRangeError:\n",
        "    break"
      ],
      "execution_count": 53,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[ 0.04021718 -1.2435639  -0.48213512]\n",
            "[ 0.7044722  -0.12914501  0.01962276]\n",
            "[-1.6218902  -0.1258606  -0.50261456]\n",
            "[-0.47550339  0.567337   -0.9583471 ]\n",
            "[ 0.18744798 -0.46359915 -0.02855888]\n",
            "[-2.0715013  -3.0730639  -0.31076798]\n",
            "[-1.1439593  -0.25405806  0.9719103 ]\n",
            "[-0.19179374  0.31662202 -0.12613805]\n",
            "[ 0.40931207  0.1564601  -0.64182574]\n",
            "[-0.24219503  0.4491331   0.75209266]\n"
          ],
          "name": "stdout"
        }
      ]
    },
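    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4zSfxGTenU4v",
        "colab_type": "text"
      },
      "source": [
        "Depending on the TensorFlow version, the cells above may print deprecation warnings for `make_one_shot_iterator` and `make_initializable_iterator`."
      ]
    },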
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aeHaFNxlbEHq",
        "colab_type": "text"
      },
      "source": [
        "# Layers\n",
        "\n",
        "Layers contain variables and the operations that act on them. For example,\n",
        "a densely-connected layer computes a weighted sum of all inputs for each output\n",
        "and applies an optional activation function.  \n",
        "Create a Dense layer and apply it to an input as you would call a function.\n",
        "The layer inspects the shape of its input to build a weight matrix of the\n",
        "correct size, so the shape of the input must be set:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "5hrJ5I1RbGpK",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x = tf.placeholder(tf.float32, shape=[None, 3])\n",
        "linear_model = tf.layers.Dense(units=1)\n",
        "y = linear_model(x)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rQMt2ZP5nmPd",
        "colab_type": "text"
      },
      "source": [
        "The variables in a layer must be initialised before they can be used. The\n",
        "command below initialises all variables that exist in the graph at the time\n",
        "the initialiser is created, so it should be added after the layers."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4h5lIf4Yceas",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "init = tf.global_variables_initializer()\n",
        "sess.run(init)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bMMFTHQ4c1WL",
        "colab_type": "text"
      },
      "source": [
        "Execute the layer:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mqAbSAYHc27K",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "49b58e4f-9678-47db-dcf0-e5109ecd4a56"
      },
      "source": [
        "print(sess.run(y, {x: [[1, 2, 3],[4, 5, 6]]}))"
      ],
      "execution_count": 56,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[[1.7451062]\n",
            " [2.2306447]]\n"
          ],
          "name": "stdout"
        }
      ]
    },
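    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "What the layer computes can be sketched in NumPy. The weights below are made-up values chosen for illustration; TensorFlow initialises the kernel randomly, so they will not reproduce the output above. The names `dense_forward`, `layer_inputs`, `kernel` and `bias` are hypothetical and belong to this sketch only."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "\n",
        "def dense_forward(inputs, kernel, bias):\n",
        "    # Weighted sum of the inputs for each output unit, plus a bias.\n",
        "    # No activation is applied, as with Dense(units=1) and its defaults.\n",
        "    return inputs @ kernel + bias\n",
        "\n",
        "layer_inputs = np.array([[1., 2., 3.], [4., 5., 6.]])  # shape (2, 3)\n",
        "kernel = np.array([[0.5], [-1.0], [0.25]])  # shape (3, 1): 3 inputs, 1 unit\n",
        "bias = np.zeros(1)\n",
        "\n",
        "outputs = dense_forward(layer_inputs, kernel, bias)\n",
        "print(outputs.shape)\n",
        "print(outputs)"
      ],
      "execution_count": 0,
      "outputs": []
    },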
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kJoUnrWhd2ht",
        "colab_type": "text"
      },
      "source": [
        "Each layer class has a shortcut function that creates and applies the layer in a single call:  \n",
        "`tf.layers.dense(x, units=1)`"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ieEL6xiMdqjI",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "1d0d0620-cd25-4734-8234-6a883f6eacbf"
      },
      "source": [
        "x = tf.placeholder(tf.float32, shape=[None, 3])\n",
        "y = tf.layers.dense(x, units=1)\n",
        "\n",
        "init = tf.global_variables_initializer()\n",
        "sess.run(init)\n",
        "\n",
        "print(sess.run(y, {x: [[1, 2, 3], [4, 5, 6]]}))"
      ],
      "execution_count": 57,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[[-3.6543021]\n",
            " [-6.9044867]]\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aibPEP33ga8N",
        "colab_type": "text"
      },
      "source": [
        "# Feature columns\n",
        "\n",
        "Column data can be encoded as a dense input with the `tf.feature_column.input_layer` function. It accepts only dense columns, so a categorical column must first be wrapped in `tf.feature_column.indicator_column`."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_DyQh4dwgePh",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "features = {\n",
        "    'sales' : [[5], [10], [8], [9]],\n",
        "    'department': ['sports', 'sports', 'gardening', 'gardening']}\n",
        "\n",
        "department_column = tf.feature_column.categorical_column_with_vocabulary_list(\n",
        "        'department', ['sports', 'gardening'])\n",
        "department_column = tf.feature_column.indicator_column(department_column)\n",
        "\n",
        "columns = [\n",
        "    tf.feature_column.numeric_column('sales'),\n",
        "    department_column\n",
        "]\n",
        "\n",
        "inputs = tf.feature_column.input_layer(features, columns)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TWVajfyFpiiS",
        "colab_type": "text"
      },
      "source": [
        "Categorical columns use lookup tables internally. These need their own initialisation operation, `tf.tables_initializer`, which is run in addition to the initialiser for the variables in the graph.\n",
        "\n",
        "The output shows a one-hot encoding of the categorical values of the 'department' feature in the first two columns, followed by a column with the values of the 'sales' feature."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zYkW2mGFgu8Q",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 85
        },
        "outputId": "a5236e22-1e12-4dba-a618-5e8d180d2680"
      },
      "source": [
        "init = tf.global_variables_initializer()\n",
        "table_init = tf.tables_initializer()\n",
        "sess.run((init, table_init))\n",
        "\n",
        "print(sess.run(inputs))"
      ],
      "execution_count": 59,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[[ 1.  0.  5.]\n",
            " [ 1.  0. 10.]\n",
            " [ 0.  1.  8.]\n",
            " [ 0.  1.  9.]]\n"
          ],
          "name": "stdout"
        }
      ]
    },
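    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "The same matrix can be built by hand in NumPy, which shows what the indicator column does. This is a sketch of the encoding only, not of how `input_layer` is implemented; `vocabulary`, `one_hot` and `manual_inputs` are names introduced here for illustration."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "\n",
        "sales = np.array([[5.], [10.], [8.], [9.]])\n",
        "department = ['sports', 'sports', 'gardening', 'gardening']\n",
        "vocabulary = ['sports', 'gardening']\n",
        "\n",
        "# One row per example, one column per vocabulary entry.\n",
        "one_hot = np.zeros((len(department), len(vocabulary)))\n",
        "for row, value in enumerate(department):\n",
        "    one_hot[row, vocabulary.index(value)] = 1.0\n",
        "\n",
        "# Department columns first, then sales, as in the output above.\n",
        "manual_inputs = np.hstack([one_hot, sales])\n",
        "print(manual_inputs)"
      ],
      "execution_count": 0,
      "outputs": []
    },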
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ChlgSbvRqjvY",
        "colab_type": "text"
      },
      "source": [
        "# Training a model\n",
        "\n",
        "Define some inputs `x` and the expected outputs `y_true`:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "a-eG3tOPqtwQ",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)\n",
        "y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o-NqDNjIqyIS",
        "colab_type": "text"
      },
      "source": [
        "Define a simple linear model with 1 output:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_zdiXbcVq4Nt",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "linear_model = tf.layers.Dense(units=1)\n",
        "\n",
        "y_pred = linear_model(x)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1OqfIb8GrAgk",
        "colab_type": "text"
      },
      "source": [
        "Evaluate the untrained predictions:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "5-sFMG8erCDD",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 85
        },
        "outputId": "b40c2c63-df0f-464f-d714-8387f266e142"
      },
      "source": [
        "init = tf.global_variables_initializer()\n",
        "sess.run(init)\n",
        "print(sess.run(y_pred))"
      ],
      "execution_count": 62,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[[-1.0078285]\n",
            " [-2.015657 ]\n",
            " [-3.0234854]\n",
            " [-4.031314 ]]\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GrX2pC4srW39",
        "colab_type": "text"
      },
      "source": [
        "To train a model, we first need to define a loss. The `tf.losses` module provides common loss functions; here we use the mean squared error, a standard loss for regression problems."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zBWlUJ9qsE-m",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "636c5147-04d8-4f8e-e536-ef1fa346566d"
      },
      "source": [
        "loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)\n",
        "print(sess.run(loss))"
      ],
      "execution_count": 63,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "1.039602\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nMsTFC5GsW-5",
        "colab_type": "text"
      },
      "source": [
        "We now need an optimiser. TensorFlow provides optimisers as subclasses of `tf.train.Optimizer`. An optimiser changes each variable in incremental steps to minimise the loss. Let's use gradient descent, which modifies each variable according to the derivative of the loss with respect to that variable. The code below builds the graph components needed for the optimisation and returns a training operation."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pr9H2S5-s24I",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "optimizer = tf.train.GradientDescentOptimizer(0.01)\n",
        "train = optimizer.minimize(loss)"
      ],
      "execution_count": 0,
      "outputs": []
    },
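    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BI6wy51FtTwL",
        "colab_type": "text"
      },
      "source": [
        "Run the training operation 100 times to update the variables. Since `train` is an operation rather than a tensor, running it returns no value, so the loss tensor is fetched at the same time to monitor progress:"
      ]
    },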
    {
      "cell_type": "code",
      "metadata": {
        "id": "4VOZisV5tWAl",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "for i in range(100):\n",
        "  _, loss_value = sess.run((train, loss))\n",
        "  print(loss_value)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "3wdb5bbItxxn",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 85
        },
        "outputId": "f4a33b9a-d322-425e-b5f4-10d2106ae3d7"
      },
      "source": [
        "print(sess.run(y_pred))\n",
        "sess.close()"
      ],
      "execution_count": 67,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[[-0.4370654]\n",
            " [-1.211788 ]\n",
            " [-1.9865109]\n",
            " [-2.7612336]]\n"
          ],
          "name": "stdout"
        }
      ]
    },
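    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "For intuition, the same procedure can be written out in NumPy: compute the predictions, the mean squared error and its gradients, then move each parameter a small step against its gradient. This is a sketch under assumptions: the parameters start at zero rather than at TensorFlow's random initial values, so the numbers will differ from the run above, and `x_np`, `y_np`, `w`, `bias` and `losses` are hypothetical names used only here."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "\n",
        "x_np = np.array([[1.], [2.], [3.], [4.]])\n",
        "y_np = np.array([[0.], [-1.], [-2.], [-3.]])\n",
        "\n",
        "w, bias = 0.0, 0.0  # assumed starting values\n",
        "learning_rate = 0.01  # same rate as the optimiser above\n",
        "losses = []\n",
        "\n",
        "for step in range(100):\n",
        "    error = (w * x_np + bias) - y_np\n",
        "    losses.append(np.mean(error ** 2))\n",
        "    # Gradients of the mean squared error with respect to w and bias.\n",
        "    grad_w = 2.0 * np.mean(error * x_np)\n",
        "    grad_bias = 2.0 * np.mean(error)\n",
        "    w -= learning_rate * grad_w\n",
        "    bias -= learning_rate * grad_bias\n",
        "\n",
        "print(losses[0], losses[-1])"
      ],
      "execution_count": 0,
      "outputs": []
    },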
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0pkSza16uLUD",
        "colab_type": "text"
      },
      "source": [
        "A complete program would look like this:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4h3dzK5wuOi8",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Training data: inputs and the outputs the model should learn to predict\n",
        "x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)\n",
        "y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)\n",
        "\n",
        "# Linear model with a single output\n",
        "linear_model = tf.layers.Dense(units=1)\n",
        "\n",
        "y_pred = linear_model(x)\n",
        "loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)\n",
        "\n",
        "# Training operation that takes one gradient descent step per run\n",
        "optimizer = tf.train.GradientDescentOptimizer(0.01)\n",
        "train = optimizer.minimize(loss)\n",
        "\n",
        "# Create the initialiser after the layer so that it covers the layer's variables\n",
        "init = tf.global_variables_initializer()\n",
        "\n",
        "sess = tf.Session()\n",
        "sess.run(init)\n",
        "for i in range(100):\n",
        "  _, loss_value = sess.run((train, loss))\n",
        "  print(loss_value)\n",
        "\n",
        "# Predictions after training\n",
        "print(sess.run(y_pred))\n",
        "sess.close()"
      ],
      "execution_count": 0,
      "outputs": []
    }
  ]
}
	{
	"nbformat": 4,
	"nbformat_minor": 0,
	"metadata": {
	"colab": {
	"name": "tensorflow-tutorial.ipynb",
	"version": "0.3.2",
	"provenance": [],
	"collapsed_sections": []
	},
	"kernelspec": {
	"name": "python3",
	"display_name": "Python 3"
	}
	},
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "AKdYl9QyjukG",
	"colab_type": "text"
	},
	"source": [
	"This notebook follows the Tensorflow tutorial for the low-level API, which can be found [here](https://www.tensorflow.org/guide/low_level_intro).\n",
	"\n",
	"#Useful definitions to start with \n",
	"\n",
	"- tf.Graph -- Tensorflow programme\n",
	"\n",
	"- tf.Session -- Tensorflow runtime, where one runs operations\n",
	"\n",
	"- tf.Tensor -- unit of data, consists of values shaped into array of any number of dimensions. \n",
	"Rank: number of dimensions \n",
	"Shape: tuple of integers specifying the array's length along each dimension"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "Ld5Ny2CHhaoq",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"from __future__ import absolute_import\n",
	"from __future__ import division\n",
	"from __future__ import print_function\n",
	"\n",
	"import numpy as np\n",
	"import tensorflow as tf"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "hGaChqQYK1AP",
	"colab_type": "text"
	},
	"source": [
	"In the below example, the rank is 2 and shape is (2, 3)."
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "97FM35juiMcI",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 34
	},
	"outputId": "2ab04b1d-cb7e-4ce8-b523-89905ea1b182"
	},
	"source": [
	"x = [[1., 2., 3.], [4., 5., 6.]]\n",
	"x[1][2]"
	],
	"execution_count": 45,
	"outputs": [
	{
	"output_type": "execute_result",
	"data": {
	"text/plain": [
	"6.0"
	]
	},
	"metadata": {
	"tags": []
	},
	"execution_count": 45
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "zGiBkcpLvaDw",
	"colab_type": "text"
	},
	"source": [
	"# Computational graph\n",
	"\n",
	"A computational graph is a series of operations arranged into a graph, where\n",
	"nodes are the operations and edges are tensors.\n",
	"\n",
	"Build a simple computation graph, where we define tensors for two constant scalars and their sum:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "wvF3Fek8rNhc",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 68
	},
	"outputId": "13410e7b-1e8a-49d7-acec-74e27960a82c"
	},
	"source": [
	"a = tf.constant(3.0, dtype=tf.float32)\n",
	"b = tf.constant(4.0) # also tf.float32 implicitly\n",
	"total = a + b\n",
	"print(a)\n",
	"print(b)\n",
	"print(total)"
	],
	"execution_count": 46,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"Tensor(\"Const_10:0\", shape=(), dtype=float32)\n",
	"Tensor(\"Const_11:0\", shape=(), dtype=float32)\n",
	"Tensor(\"add_2:0\", shape=(), dtype=float32)\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "fM-CZNrwlGZX",
	"colab_type": "text"
	},
	"source": [
	"Tensors `a`, `b`, and `total` represent the results of the operations\n",
	"that will be run. They are given names after the operation that produced\n",
	"them followed by an output index (e.g. `add:0`).\n",
	"\n",
	"One can visualise the computation graph using the TensorBoard."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "hgO1eoOVvhIP",
	"colab_type": "text"
	},
	"source": [
	"## Running the graph\n",
	"\n",
	"Instantiate a session object to run operations:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "NaglcBmLtjCa",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 34
	},
	"outputId": "17c8a779-3e8e-439c-d9a7-ba02738eedcc"
	},
	"source": [
	"sess = tf.Session()\n",
	"print(sess.run(total))"
	],
	"execution_count": 47,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"7.0\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "8FSvNOO7lu-D",
	"colab_type": "text"
	},
	"source": [
	"Upon executing `Session.run()`, Tensorflow backtracks through the graph\n",
	"and runs all nodes that provide input into the requested output node\n",
	"(here, `total`). \n",
	"Requesting any number of nodes:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "0NwOHnoPuuZE",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 34
	},
	"outputId": "d4511ed4-0bbb-4067-b532-63188f826708"
	},
	"source": [
	"print(sess.run({'ab':(a, b), 'total':total}))"
	],
	"execution_count": 48,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"{'ab': (3.0, 4.0), 'total': 7.0}\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "5P3AAuWrl4ak",
	"colab_type": "text"
	},
	"source": [
	"Running an Operation (rather than an output Tensor) returns `None`; it is done\n",
	"to cause an effect, not to retrieve a value."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "_tkkbp61wmiP",
	"colab_type": "text"
	},
	"source": [
	"## Placeholders\n",
	"\n",
	"A placeholder is an edge that accepts external input (e.g. a parameter of a\n",
	"function). Placeholders are like Tensors (and can be used to overwrite Tensors)\n",
	"except that they throw an error if no value is fed to them."
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "xSxgr0EOwkJW",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"x = tf.placeholder(tf.float32)\n",
	"y = tf.placeholder(tf.float32)\n",
	"z = x + y"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "ycWMRHfqNwNh",
	"colab_type": "text"
	},
	"source": [
	"`feed_dict` is an argument of `Session.run()` used to feed values to placeholders."
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "kMFRQNwnwsdB",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 51
	},
	"outputId": "360af5bf-ac20-4a29-b00e-c4f3cd917a53"
	},
	"source": [
	"print(sess.run(z, feed_dict={x: 3, y: 4.5}))\n",
	"print(sess.run(z, feed_dict={x: [1, 3], y: [2, 4]}))"
	],
	"execution_count": 50,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"7.5\n",
	"[3. 7.]\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "k_VyaqnrPkPM",
	"colab_type": "text"
	},
	"source": [
	"# Handling data\n",
	"\n",
	"To load data into the model, `tf.data` object should be used. A Tensor can be\n",
	"created from a Dataset by using an Iterator:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "G86LXCK_OIWF",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"my_data = [\n",
	" [0, 1,],\n",
	" [2, 3,],\n",
	" [4, 5,],\n",
	" [6, 7,],\n",
	"]\n",
	"slices = tf.data.Dataset.from_tensor_slices(my_data)\n",
	"next_item = slices.make_one_shot_iterator().get_next()"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "tC90Wwm3m4YT",
	"colab_type": "text"
	},
	"source": [
	"As reaching the end of data causes DataSet to throw an error, reading all items\n",
	"from Dataset can be done like this:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "PNLPJAUzSn52",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 85
	},
	"outputId": "7a6c527c-d9a6-4035-b659-f62b4d4c7f16"
	},
	"source": [
	"while True:\n",
	" try:\n",
	" print(sess.run(next_item))\n",
	" except tf.errors.OutOfRangeError:\n",
	" break"
	],
	"execution_count": 52,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[0 1]\n",
	"[2 3]\n",
	"[4 5]\n",
	"[6 7]\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "-1leZIH-T3ZF",
	"colab_type": "text"
	},
	"source": [
	"If the Dataset depends on stateful operations, such as Variables or\n",
	"`tf.random_normal`, the iterator must be initialised before use, so an\n",
	"initialisable iterator is needed:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "TJkfT7WMTisL",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 187
	},
	"outputId": "da0a3332-8afd-46af-e0cc-6340773a5915"
	},
	"source": [
	"r = tf.random_normal([10, 3])\n",
	"dataset = tf.data.Dataset.from_tensor_slices(r)\n",
	"iterator = dataset.make_initializable_iterator()\n",
	"next_row = iterator.get_next()\n",
	"\n",
	"sess.run(iterator.initializer)\n",
	"while True:\n",
	"  try:\n",
	"    print(sess.run(next_row))\n",
	"  except tf.errors.OutOfRangeError:\n",
	"    break"
	],
	"execution_count": 53,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[ 0.04021718 -1.2435639 -0.48213512]\n",
	"[ 0.7044722 -0.12914501 0.01962276]\n",
	"[-1.6218902 -0.1258606 -0.50261456]\n",
	"[-0.47550339 0.567337 -0.9583471 ]\n",
	"[ 0.18744798 -0.46359915 -0.02855888]\n",
	"[-2.0715013 -3.0730639 -0.31076798]\n",
	"[-1.1439593 -0.25405806 0.9719103 ]\n",
	"[-0.19179374 0.31662202 -0.12613805]\n",
	"[ 0.40931207 0.1564601 -0.64182574]\n",
	"[-0.24219503 0.4491331 0.75209266]\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "4zSfxGTenU4v",
	"colab_type": "text"
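	},
	"source": [
	"Running these cells may print deprecation warnings, for example for `make_one_shot_iterator` and `make_initializable_iterator` in later TensorFlow 1.x releases."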
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "aeHaFNxlbEHq",
	"colab_type": "text"
	},
	"source": [
	"# Layers\n",
	"\n",
	"Layers contain variables and the operations that act on them. For example,\n",
	"a densely-connected layer computes a weighted sum of its inputs for each output\n",
	"and applies an optional activation function.\n",
	"\n",
	"Create a Dense layer and apply it to an input as you would call a function.\n",
	"The layer needs to know the shape of its input to build a weight matrix of the\n",
	"correct size, so the placeholder is given the shape `[None, 3]`:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "5hrJ5I1RbGpK",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"x = tf.placeholder(tf.float32, shape=[None, 3])\n",
	"linear_model = tf.layers.Dense(units=1)\n",
	"y = linear_model(x)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "rQMt2ZP5nmPd",
	"colab_type": "text"
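	},
	"source": [
	"Variables in a layer must be initialised before the layer is run. The command\n",
	"below initialises all variables that exist in the graph when the initialiser is\n",
	"created, so create it after the layers have been built."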
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "4h5lIf4Yceas",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"init = tf.global_variables_initializer()\n",
	"sess.run(init)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "bMMFTHQ4c1WL",
	"colab_type": "text"
	},
	"source": [
	"Execute the layer:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "mqAbSAYHc27K",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 51
	},
	"outputId": "49b58e4f-9678-47db-dcf0-e5109ecd4a56"
	},
	"source": [
	"print(sess.run(y, {x: [[1, 2, 3],[4, 5, 6]]}))"
	],
	"execution_count": 56,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[[1.7451062]\n",
	" [2.2306447]]\n"
	],
	"name": "stdout"
	}
	]
	},
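	{
	"cell_type": "markdown",
	"metadata": {
	"colab_type": "text"
	},
	"source": [
	"With a single output unit and no activation, the Dense layer above reduces to a weighted sum plus a bias for each input row. The sketch below spells that out in plain Python; `dense_forward` is a hypothetical helper, and the weights and bias are made-up values for illustration, not the randomly initialised ones held by `linear_model`.\n",
	"\n",
	"```python\n",
	"def dense_forward(rows, weights, bias):\n",
	"    # One output per row: the dot product of the row with the weights, plus the bias.\n",
	"    return [[sum(v * w for v, w in zip(row, weights)) + bias] for row in rows]\n",
	"\n",
	"# Hypothetical parameters, chosen only to make the arithmetic easy to follow.\n",
	"weights = [0.5, -1.0, 2.0]\n",
	"bias = 0.25\n",
	"\n",
	"print(dense_forward([[1, 2, 3], [4, 5, 6]], weights, bias))\n",
	"```\n",
	"\n",
	"The result has one row per input row and one column per unit, which is the same shape as the output of the layer above."
	]
	},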
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "kJoUnrWhd2ht",
	"colab_type": "text"
	},
	"source": [
	"Each layer class also has a shortcut function that creates and applies the layer in a single call, for example `tf.layers.dense(x, units=1)`:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "ieEL6xiMdqjI",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 51
	},
	"outputId": "1d0d0620-cd25-4734-8234-6a883f6eacbf"
	},
	"source": [
	"x = tf.placeholder(tf.float32, shape=[None, 3])\n",
	"y = tf.layers.dense(x, units=1)\n",
	"\n",
	"init = tf.global_variables_initializer()\n",
	"sess.run(init)\n",
	"\n",
	"print(sess.run(y, {x: [[1, 2, 3], [4, 5, 6]]}))"
	],
	"execution_count": 57,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[[-3.6543021]\n",
	" [-6.9044867]]\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "aibPEP33ga8N",
	"colab_type": "text"
	},
	"source": [
	"# Feature columns\n",
	"\n",
	"Feature data can be turned into a dense input Tensor with the `tf.feature_column.input_layer` function. It only accepts dense columns, so a categorical column must first be wrapped in `tf.feature_column.indicator_column`:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "_DyQh4dwgePh",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"features = {\n",
	" 'sales' : [[5], [10], [8], [9]],\n",
	" 'department': ['sports', 'sports', 'gardening', 'gardening']}\n",
	"\n",
	"department_column = tf.feature_column.categorical_column_with_vocabulary_list(\n",
	" 'department', ['sports', 'gardening'])\n",
	"department_column = tf.feature_column.indicator_column(department_column)\n",
	"\n",
	"columns = [\n",
	" tf.feature_column.numeric_column('sales'),\n",
	" department_column\n",
	"]\n",
	"\n",
	"inputs = tf.feature_column.input_layer(features, columns)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "TWVajfyFpiiS",
	"colab_type": "text"
	},
	"source": [
	"Categorical columns use lookup tables internally, and these need their own initialisation op, `tf.tables_initializer`, which is run here alongside the variable initialiser.\n",
	"\n",
	"The first two columns of the output are the one-hot encoding of the 'department' feature; the third column holds the values of the 'sales' feature."
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "zYkW2mGFgu8Q",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 85
	},
	"outputId": "a5236e22-1e12-4dba-a618-5e8d180d2680"
	},
	"source": [
	"init = tf.global_variables_initializer()\n",
	"table_init = tf.tables_initializer()\n",
	"sess.run((init, table_init))\n",
	"\n",
	"print(sess.run(inputs))"
	],
	"execution_count": 59,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[[ 1. 0. 5.]\n",
	" [ 1. 0. 10.]\n",
	" [ 0. 1. 8.]\n",
	" [ 0. 1. 9.]]\n"
	],
	"name": "stdout"
	}
	]
	},
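	{
	"cell_type": "markdown",
	"metadata": {
	"colab_type": "text"
	},
	"source": [
	"The matrix printed above can be rebuilt by hand, which makes the encoding explicit. This is a plain-Python sketch rather than what `input_layer` does internally; `one_hot` and `encode` are hypothetical helpers, and placing the department columns first simply mirrors the printed output.\n",
	"\n",
	"```python\n",
	"def one_hot(value, vocabulary):\n",
	"    # 1.0 at the position of value in the vocabulary, 0.0 elsewhere.\n",
	"    return [1.0 if value == entry else 0.0 for entry in vocabulary]\n",
	"\n",
	"def encode(features, vocabulary):\n",
	"    # Concatenate the one-hot department columns with the numeric sales column.\n",
	"    rows = []\n",
	"    for department, sales in zip(features['department'], features['sales']):\n",
	"        rows.append(one_hot(department, vocabulary) + [float(sales[0])])\n",
	"    return rows\n",
	"\n",
	"features = {\n",
	"    'sales': [[5], [10], [8], [9]],\n",
	"    'department': ['sports', 'sports', 'gardening', 'gardening'],\n",
	"}\n",
	"\n",
	"for row in encode(features, ['sports', 'gardening']):\n",
	"    print(row)\n",
	"```\n",
	"\n",
	"Each department is mapped to its position in the vocabulary list, so 'sports' sets the first column and 'gardening' the second."
	]
	},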
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "ChlgSbvRqjvY",
	"colab_type": "text"
	},
	"source": [
	"# Training a model\n",
	"\n",
	"Define some inputs `x` and the expected outputs `y_true`:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "a-eG3tOPqtwQ",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)\n",
	"y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "o-NqDNjIqyIS",
	"colab_type": "text"
	},
	"source": [
	"Define a simple linear model with 1 output:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "_zdiXbcVq4Nt",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"linear_model = tf.layers.Dense(units=1)\n",
	"\n",
	"y_pred = linear_model(x)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "1OqfIb8GrAgk",
	"colab_type": "text"
	},
	"source": [
	"Evaluate the untrained predictions:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "5-sFMG8erCDD",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 85
	},
	"outputId": "b40c2c63-df0f-464f-d714-8387f266e142"
	},
	"source": [
	"init = tf.global_variables_initializer()\n",
	"sess.run(init)\n",
	"print(sess.run(y_pred))"
	],
	"execution_count": 62,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[[-1.0078285]\n",
	" [-2.015657 ]\n",
	" [-3.0234854]\n",
	" [-4.031314 ]]\n"
	],
	"name": "stdout"
	}
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "GrX2pC4srW39",
	"colab_type": "text"
	},
	"source": [
	"To train a model, we need to define a loss. `tf.losses` provides common loss functions. Here we use the mean squared error, a standard loss for regression problems."
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "zBWlUJ9qsE-m",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 34
	},
	"outputId": "636c5147-04d8-4f8e-e536-ef1fa346566d"
	},
	"source": [
	"loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)\n",
	"print(sess.run(loss))"
	],
	"execution_count": 63,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"1.039602\n"
	],
	"name": "stdout"
	}
	]
	},
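	{
	"cell_type": "markdown",
	"metadata": {
	"colab_type": "text"
	},
	"source": [
	"The mean squared error is the average of the squared differences between predictions and labels. As a check, the plain-Python sketch below recomputes it from the untrained predictions printed earlier; `mean_squared_error` here is a hypothetical helper, not the `tf.losses` function.\n",
	"\n",
	"```python\n",
	"def mean_squared_error(labels, predictions):\n",
	"    # Average of the squared differences between each prediction and its label.\n",
	"    squared = [(p - t) ** 2 for t, p in zip(labels, predictions)]\n",
	"    return sum(squared) / len(squared)\n",
	"\n",
	"y_true_values = [0.0, -1.0, -2.0, -3.0]\n",
	"# Untrained predictions copied from the output printed earlier.\n",
	"y_pred_values = [-1.0078285, -2.015657, -3.0234854, -4.031314]\n",
	"\n",
	"print(mean_squared_error(y_true_values, y_pred_values))\n",
	"```\n",
	"\n",
	"The result agrees with the loss printed above up to rounding."
	]
	},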
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "nMsTFC5GsW-5",
	"colab_type": "text"
	},
	"source": [
	"We now need an optimiser. Optimisers are implemented as subclasses of `tf.train.Optimizer` and change each variable in incremental steps to minimise the loss. Let's use gradient descent, which changes each variable in proportion to the derivative of the loss with respect to that variable, here with a learning rate of 0.01. The code below adds the operations needed for optimisation to the graph and returns a training operation. "
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "pr9H2S5-s24I",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"optimizer = tf.train.GradientDescentOptimizer(0.01)\n",
	"train = optimizer.minimize(loss)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "BI6wy51FtTwL",
	"colab_type": "text"
	},
	"source": [
	"Run the training operation 100 times to update the variables, printing the loss at each step:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "4VOZisV5tWAl",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"for i in range(100):\n",
	"  _, loss_value = sess.run((train, loss))\n",
	"  print(loss_value)"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "3wdb5bbItxxn",
	"colab_type": "code",
	"colab": {
	"base_uri": "https://localhost:8080/",
	"height": 85
	},
	"outputId": "f4a33b9a-d322-425e-b5f4-10d2106ae3d7"
	},
	"source": [
	"print(sess.run(y_pred))\n",
	"sess.close()"
	],
	"execution_count": 67,
	"outputs": [
	{
	"output_type": "stream",
	"text": [
	"[[-0.4370654]\n",
	" [-1.211788 ]\n",
	" [-1.9865109]\n",
	" [-2.7612336]]\n"
	],
	"name": "stdout"
	}
	]
	},
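	{
	"cell_type": "markdown",
	"metadata": {
	"colab_type": "text"
	},
	"source": [
	"To make the update rule concrete, the loop below repeats the same kind of training in plain Python for the model `y = w * x + b`. It is a sketch under assumptions: the gradients are derived by hand for the mean squared error, the starting weight is read off the untrained predictions above with a zero bias, and the numbers are not expected to match the TensorFlow run exactly. The names `mse` and `train_step` are hypothetical.\n",
	"\n",
	"```python\n",
	"xs = [1.0, 2.0, 3.0, 4.0]\n",
	"ys = [0.0, -1.0, -2.0, -3.0]\n",
	"\n",
	"def mse(w, b):\n",
	"    # Mean squared error of the line w * x + b on the training data.\n",
	"    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)\n",
	"\n",
	"def train_step(w, b, learning_rate=0.01):\n",
	"    # Gradients of the mean squared error with respect to w and b.\n",
	"    n = len(xs)\n",
	"    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n\n",
	"    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n\n",
	"    # Move each parameter a small step against its gradient.\n",
	"    return w - learning_rate * grad_w, b - learning_rate * grad_b\n",
	"\n",
	"# Assumed starting point: weight inferred from the untrained predictions, zero bias.\n",
	"w, b = -1.0078285, 0.0\n",
	"initial_loss = mse(w, b)\n",
	"for _ in range(100):\n",
	"    w, b = train_step(w, b)\n",
	"final_loss = mse(w, b)\n",
	"\n",
	"print(initial_loss, final_loss)\n",
	"print([w * x + b for x in xs])\n",
	"```\n",
	"\n",
	"The training data lie exactly on the line `y = -x + 1`, so with enough steps `w` approaches -1 and `b` approaches 1."
	]
	},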
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "0pkSza16uLUD",
	"colab_type": "text"
	},
	"source": [
	"A complete program would look like this:"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "4h3dzK5wuOi8",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"# Inputs and expected outputs.\n",
	"x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)\n",
	"y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)\n",
	"\n",
	"# Linear model with a single output.\n",
	"linear_model = tf.layers.Dense(units=1)\n",
	"\n",
	"y_pred = linear_model(x)\n",
	"loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)\n",
	"\n",
	"# Gradient descent with a learning rate of 0.01.\n",
	"optimizer = tf.train.GradientDescentOptimizer(0.01)\n",
	"train = optimizer.minimize(loss)\n",
	"\n",
	"# Create the initialiser after all variables have been added to the graph.\n",
	"init = tf.global_variables_initializer()\n",
	"\n",
	"sess = tf.Session()\n",
	"sess.run(init)\n",
	"# Run 100 training steps, printing the loss at each step.\n",
	"for i in range(100):\n",
	"  _, loss_value = sess.run((train, loss))\n",
	"  print(loss_value)\n",
	"\n",
	"# Predictions of the trained model.\n",
	"print(sess.run(y_pred))\n",
	"sess.close()"
	],
	"execution_count": 0,
	"outputs": []
	}
	]
	}