JonasSchroeder/deep-learning-fundamentals.ipynb

## deep-learning-fundamentals.ipynb
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Deep Learning Fundamentals",
      "provenance": [],
      "collapsed_sections": [],
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/gist/JonasSchroeder/e62d4e9bb7a5371a67c79c1db2b2a2ed/deep-learning-fundamentals.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Jq52Dette_dt",
        "colab_type": "text"
      },
      "source": [
        "# Deep networks\n",
        "Deep learning is a class of machine learning techniques in which information is processed in hierarchical layers to learn representations and features from data at increasing levels of complexity. Deep networks consist of interconnected neurons that are organized in layers.\n",
        "\n",
        "Some examples of deep learning algorithms are:\n",
        "- **Multi-layer perceptrons (MLPs):** A neural network with feedforward propagation, fully connected layers, and at least one hidden layer (see example in Introduction to Neural Networks https://colab.research.google.com/drive/1H-opMVRCqBsTJl7Ewhl1GoxTpjfbjvju)\n",
        "- **Convolutional neural networks (CNNs):** A CNN is a feedforward neural network with several types of special layers that act as filters. CNNs are organized similarly to the cells in the visual cortex of the brain, and they perform very well on a large number of computer vision tasks and on some NLP tasks.\n",
        "- **Recurrent networks:** Neural networks with an internal state (memory) that summarizes the input data already fed into the network. The network's output is a combination of the memory and the latest input sample. This makes recurrent networks good for tasks with sequential data like time-series data or audio signals.\n",
        "- **Autoencoders:** Class of unsupervised learning algorithms where the output shape is the same as the input shape, for example Variational Autoencoders (VAEs). Generative Adversarial Networks (GANs) are often mentioned alongside them as generative models, but they are not autoencoders. Autoencoders take a high-dimensional signal, compress it to a dense code with an encoder, and then turn it back into a high-dimensional output with a decoder. This makes them useful for dimensionality reduction, similar in purpose to principal component analysis (PCA).\n",
        "\n",
        "\n",
        "# Training of deep networks\n",
        "Stochastic Gradient Descent (SGD) together with backpropagation is commonly used to train deep networks. \n",
        "\n",
        "**Momentum** is a parameter that is added to the weight-updating process and refers to ...\n",
        "\n",
        "...\n",
        "\n",
        "# Applications of deep learning\n",
        "\n",
        "- Tesla's famous Autopilot feature relies on CNNs, and the company's director of AI is a well-known deep learning researcher, Andrej Karpathy.\n",
        "- **Computer vision** services such as Google's Vision API and Amazon's Rekognition rely on deep learning models. Examples are recognizing and detecting objects and scenes in images, text recognition, face recognition, ...\n",
        "- Cloud solutions provided by Amazon (AWS Deep Learning AMIs) or Google Cloud AI with Tensor Processing Units (TPUs), which are processors optimized for fast neural network operations (matrix multiplications and activation functions).\n",
        "- Medical imaging: fast screening of MRI and CT (CAT) scans.\n",
        "- Computer-aided diagnosis for cancer patients or diagnosis of medical history records, even handwritten ones.\n",
        "- Translation: Google's Neural Machine Translation API.\n",
        "- **Natural Language Processing and Speech Recognition**: Google Duplex for natural conversations over the phone, e.g. booking a restaurant reservation, and voice assistants like Siri, Google Assistant, or Amazon Alexa, ...\n",
        "- **Reinforcement learning** to beat the world's best players in various games.\n",
        "\n",
        "# Popular open source libraries for deep learning\n",
        "\n",
        "The three popular deep learning libraries covered here (TensorFlow, PyTorch, and Keras) share common features. \n",
        "\n",
        "They store data in **tensors** (= generalization of a matrix to higher dimensions). Tensors have an arbitrary number of axes (dimensions). A 0D tensor is just a scalar value, a 1D tensor is a vector, a 2D tensor is a matrix, and so on. By convention, the first dimension of a data tensor indexes the samples. For example, a set of two-dimensional grayscale images is represented as a three-dimensional tensor (sample, height, width).\n",
        "\n",
        "Neural networks are represented as a **computational graph** of operations, where the nodes represent operations (weighted sums, activation functions,...) and the edges represent the flow of data. The inputs and outputs of the operations are tensors.\n",
        "\n",
        "All libraries include **automatic differentiation**, so that we only need to define the network architecture and activation functions, and the library will automatically figure out the derivatives required for training with backpropagation.\n",
        "\n",
        "They implement **GPU operations** using CUDA and cuDNN.\n",
        "\n",
        "\n",
        "# TensorFlow\n",
        "TensorFlow is one of the most widely used deep learning libraries. It is developed and maintained by Google. You can assign operations to specific CPUs and GPUs, for example:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "yk8gGSfnNUe4",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
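        "import tensorflow as tf\n",
        "# Pin the operations defined in this block to the second GPU\n",
        "with tf.device(\"/gpu:1\"):\n",
        "    pass  # model definition goes here\n"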
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3fzqWWNENW5Z",
        "colab_type": "text"
      },
      "source": [
        "TensorFlow has a steeper learning curve than the other two libraries, but referring to the documentation helps.\n",
        "\n",
        "# PyTorch\n",
        "PyTorch is a deep learning library based on Torch, developed by Facebook. It is relatively easy to use. By default, tensors and modules are created on the CPU; to use a GPU, you select a device and move them to it explicitly, as in the following code:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "vVX9-a8uOMML",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import torch\n",
        "\n",
        "# at the beginning of the script\n",
        "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
        "...\n",
        "# then whenever you get a new Tensor or Module, move it to the device\n",
        "# (this won't copy if it is already on the desired device);\n",
        "# `data` and `MyModule` stand for your own tensor and nn.Module\n",
        "inputs = data.to(device)\n",
        "model = MyModule(...).to(device)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-9EN2u2mOXpl",
        "colab_type": "text"
      },
      "source": [
        "# Keras\n",
        "Keras is a high-level neural network API that runs on top of TensorFlow, CNTK, or Theano. It is relatively easy to use compared to TensorFlow and enables rapid experimentation. It also automatically detects an available GPU. To specify a device, you use the same code as for TensorFlow.\n",
        "\n",
        "Here's an example of using **Keras to classify handwritten digits**:\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "tH1DzM6eoBwr",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "17d99d6f-ad43-4e28-84f8-d8acd87432d9"
      },
      "source": [
        "# Import a dataset of 70,000 handwritten digits\n",
        "from keras.datasets import mnist\n",
        "\n",
        "# Import classes for a feed-forward network\n",
        "from keras.models import Sequential\n",
        "from keras.layers import Dense, Activation\n",
        "# keras.utils provides to_categorical; the alias keeps the np_utils name used later\n",
        "from keras import utils as np_utils\n",
        "\n",
        "# Load training and test data\n",
        "(X_train, Y_train), (X_test, Y_test) = mnist.load_data()\n",
        "\n",
        "print(X_train.shape)\n",
        "print(X_test.shape)"
      ],
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "(60000, 28, 28)\n",
            "(10000, 28, 28)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6xqtzJbwph7q",
        "colab_type": "text"
      },
      "source": [
        "To feed the training data into a fully connected network, we need to flatten each image from a 28x28 matrix into a 784-element vector."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IP0za-etpqK2",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "355f59d2-4168-4f76-fd6c-defc955f1f35"
      },
      "source": [
        "X_train = X_train.reshape(60000, 784)\n",
        "X_test = X_test.reshape(10000, 784)\n",
        "\n",
        "print(X_train.shape)\n",
        "print(X_test.shape)"
      ],
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "(60000, 784)\n",
            "(10000, 784)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6R9PtH8op9kG",
        "colab_type": "text"
      },
      "source": [
        "The target for each sample is the digit it shows, e.g. \"1\" or \"3\". We transform each target into a 10-entry one-hot encoded vector, matching the 10 output neurons our network will have.\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "xLMA1mr7q_O6",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "classes = 10\n",
        "Y_train = np_utils.to_categorical(Y_train, classes)\n",
        "Y_test = np_utils.to_categorical(Y_test, classes)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "I8TNu1y3tVIS",
        "colab_type": "text"
      },
      "source": [
        "Next, we set the size of the input layer to 784, one input per pixel of each image. We also set the number of hidden neurons, the number of training epochs (= number of passes over the entire training data), and the batch size (= number of samples per gradient update)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "6zMXaisFuBiF",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        ""
      ],
      "execution_count": 0,
      "outputs": []
    }
  ]
}