@ritog
Created November 10, 2020 11:52
MIT_S191_Lab1_Part1_TensorFlow.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "MIT_S191_Lab1_Part1_TensorFlow.ipynb",
"provenance": [],
"collapsed_sections": [
"WBk0ZDWY-ff8"
],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/ghosh-r/06c97e45aed6b73c514feaa39b231f0c/mit_s191_lab1_part1_tensorflow.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WBk0ZDWY-ff8"
},
"source": [
"<table align=\"center\">\n",
" <td align=\"center\"><a target=\"_blank\" href=\"http://introtodeeplearning.com\">\n",
" <img src=\"http://introtodeeplearning.com/images/colab/mit.png\" style=\"padding-bottom:5px;\" />\n",
" Visit MIT Deep Learning</a></td>\n",
" <td align=\"center\"><a target=\"_blank\" href=\"https://colab.research.google.com/github/aamini/introtodeeplearning/blob/master/lab1/Part1_TensorFlow.ipynb\">\n",
" <img src=\"http://introtodeeplearning.com/images/colab/colab.png?v2.0\" style=\"padding-bottom:5px;\" />Run in Google Colab</a></td>\n",
" <td align=\"center\"><a target=\"_blank\" href=\"https://github.com/aamini/introtodeeplearning/blob/master/lab1/Part1_TensorFlow.ipynb\">\n",
" <img src=\"http://introtodeeplearning.com/images/colab/github.png\" height=\"70px\" style=\"padding-bottom:5px;\" />View Source on GitHub</a></td>\n",
"</table>\n",
"\n",
"# Copyright Information\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "3eI6DUic-6jo"
},
"source": [
"# Copyright 2020 MIT 6.S191 Introduction to Deep Learning. All Rights Reserved.\n",
"# \n",
"# Licensed under the MIT License. You may not use this file except in compliance\n",
"# with the License. Use and/or modification of this code outside of 6.S191 must\n",
"# reference:\n",
"#\n",
"# © MIT 6.S191: Introduction to Deep Learning\n",
"# http://introtodeeplearning.com\n",
"#"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "57knM8jrYZ2t"
},
"source": [
"# Lab 1: Intro to TensorFlow and Music Generation with RNNs\n",
"\n",
"In this lab, you'll get exposure to using TensorFlow and learn how it can be used for solving deep learning tasks. Go through the code and run each cell. Along the way, you'll encounter several ***TODO*** blocks -- follow the instructions to fill them out before running those cells and continuing.\n",
"\n",
"\n",
"# Part 1: Intro to TensorFlow\n",
"\n",
"## 0.1 Install TensorFlow\n",
"\n",
"TensorFlow is a software library extensively used in machine learning. Here we'll learn how computations are represented and how to define a simple neural network in TensorFlow. For all the labs in 6.S191 2020, we'll be using the latest version of TensorFlow, TensorFlow 2, which affords great flexibility and the ability to imperatively execute operations, just like in Python. You'll notice that TensorFlow 2 is quite similar to Python in its syntax and imperative execution. Let's install TensorFlow and a couple of dependencies.\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "LkaimNJfYZ2w",
"outputId": "01665ca0-9b25-4bad-cba4-1a6efcb336e7",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"%tensorflow_version 2.x\n",
"import tensorflow as tf\n",
"\n",
"# Download and import the MIT 6.S191 package\n",
"!pip install mitdeeplearning\n",
"import mitdeeplearning as mdl\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting mitdeeplearning\n",
"  Downloading https://files.pythonhosted.org/packages/8b/3b/b9174b68dc10832356d02a2d83a64b43a24f1762c172754407d22fc8f960/mitdeeplearning-0.1.2.tar.gz (2.1MB)\n",
"Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from mitdeeplearning) (1.18.5)\n",
"Requirement already satisfied: regex in /usr/local/lib/python3.6/dist-packages (from mitdeeplearning) (2019.12.20)\n",
"Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (from mitdeeplearning) (4.41.1)\n",
"Requirement already satisfied: gym in /usr/local/lib/python3.6/dist-packages (from mitdeeplearning) (0.17.3)\n",
"Requirement already satisfied: cloudpickle<1.7.0,>=1.2.0 in /usr/local/lib/python3.6/dist-packages (from gym->mitdeeplearning) (1.3.0)\n",
"Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from gym->mitdeeplearning) (1.4.1)\n",
"Requirement already satisfied: pyglet<=1.5.0,>=1.4.0 in /usr/local/lib/python3.6/dist-packages (from gym->mitdeeplearning) (1.5.0)\n",
"Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from pyglet<=1.5.0,>=1.4.0->gym->mitdeeplearning) (0.16.0)\n",
"Building wheels for collected packages: mitdeeplearning\n",
"  Building wheel for mitdeeplearning (setup.py) ... done\n",
" Created wheel for mitdeeplearning: filename=mitdeeplearning-0.1.2-cp36-none-any.whl size=2114585 sha256=6b86eb7507a10a3de349c64c10aad6bdd2f04944df0199d523658fc2cdb39816\n",
" Stored in directory: /root/.cache/pip/wheels/27/e1/73/5f01c787621d8a3c857f59876c79e304b9b64db9ff5bd61b74\n",
"Successfully built mitdeeplearning\n",
"Installing collected packages: mitdeeplearning\n",
"Successfully installed mitdeeplearning-0.1.2\n"
],
"name": "stdout"
}
]
},
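{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before moving on, it can help to confirm which TensorFlow version and hardware the runtime actually provides. The cell below is a minimal sketch, assuming a TensorFlow 2.1 or later runtime where `tf.config.list_physical_devices` is available; the printed values depend on your environment."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import tensorflow as tf\n",
"\n",
"# Report the installed TensorFlow version (expected to start with \"2.\" for this lab)\n",
"print(tf.__version__)\n",
"\n",
"# List the GPUs visible to TensorFlow; an empty list means computation runs on the CPU\n",
"print(tf.config.list_physical_devices(\"GPU\"))"
],
"execution_count": null,
"outputs": []
},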
{
"cell_type": "markdown",
"metadata": {
"id": "2QNMcdP4m3Vs"
},
"source": [
"## 1.1 Why is TensorFlow called TensorFlow?\n",
"\n",
"TensorFlow is called 'TensorFlow' because it handles the flow (node/mathematical operation) of Tensors, which are data structures that you can think of as multi-dimensional arrays. Tensors are represented as n-dimensional arrays of base datatypes such as a string or integer -- they provide a way to generalize vectors and matrices to higher dimensions.\n",
"\n",
"The ```shape``` of a Tensor defines its number of dimensions and the size of each dimension. The ```rank``` of a Tensor provides the number of dimensions (n-dimensions) -- you can also think of this as the Tensor's order or degree.\n",
"\n",
"Let's first look at 0-d Tensors, of which a scalar is an example:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "tFxztZQInlAB",
"outputId": "3d1e4ce2-d9ac-4015-a15f-b3f5d8760fbf",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"sport = tf.constant(\"Tennis\", tf.string)\n",
"number = tf.constant(1.41421356237, tf.float64)\n",
"\n",
"print(\"`sport` is a {}-d Tensor\".format(tf.rank(sport).numpy()))\n",
"print(\"`number` is a {}-d Tensor\".format(tf.rank(number).numpy()))"
],
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": [
"`sport` is a 0-d Tensor\n",
"`number` is a 0-d Tensor\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-dljcPUcoJZ6"
},
"source": [
"Vectors and lists can be used to create 1-d Tensors:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "oaHXABe8oPcO",
"outputId": "d759e059-eda6-486b-eb18-31b3b26acdd7",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"sports = tf.constant([\"Tennis\", \"Basketball\"], tf.string)\n",
"numbers = tf.constant([3.141592, 1.414213, 2.718281], tf.float64)\n",
"\n",
"print(\"`sports` is a {}-d Tensor with shape: {}\".format(tf.rank(sports).numpy(), tf.shape(sports)))\n",
"print(\"`numbers` is a {}-d Tensor with shape: {}\".format(tf.rank(numbers).numpy(), tf.shape(numbers)))"
],
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": [
"`sports` is a 1-d Tensor with shape: [2]\n",
"`numbers` is a 1-d Tensor with shape: [3]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gvffwkvtodLP"
},
"source": [
"Next we consider creating 2-d (i.e., matrices) and higher-rank Tensors. For example, in future labs involving image processing and computer vision, we will use 4-d Tensors. Here the dimensions correspond to the number of example images in our batch, image height, image width, and the number of color channels."
]
},
{
"cell_type": "code",
"metadata": {
"id": "tFeBBe1IouS3"
},
"source": [
"### Defining higher-order Tensors ###\n",
"\n",
"'''TODO: Define a 2-d Tensor'''\n",
"matrix = tf.constant([[1, 2, 3], [3, 4., 5], [6, 7, 9]], tf.float64)\n",
"\n",
"assert isinstance(matrix, tf.Tensor), \"matrix must be a tf Tensor object\"\n",
"assert tf.rank(matrix).numpy() == 2"
],
"execution_count": 4,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Zv1fTn_Ya_cz"
},
"source": [
"'''TODO: Define a 4-d Tensor.'''\n",
"# Use tf.zeros to initialize a 4-d Tensor of zeros with size 10 x 256 x 256 x 3. \n",
"# You can think of this as 10 images where each image is RGB 256 x 256.\n",
"images = tf.zeros([10, 256, 256, 3], tf.float64)\n",
"\n",
"assert isinstance(images, tf.Tensor), \"matrix must be a tf Tensor object\"\n",
"assert tf.rank(images).numpy() == 4, \"matrix must be of rank 4\"\n",
"assert tf.shape(images).numpy().tolist() == [10, 256, 256, 3], \"matrix is incorrect shape\""
],
"execution_count": 5,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wkaCDOGapMyl"
},
"source": [
"As you have seen, the ```shape``` of a Tensor provides the number of elements in each Tensor dimension. The ```shape``` is quite useful, and we'll use it often. You can also use slicing to access subtensors within a higher-rank Tensor:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "RbQ0_73uKodO",
"outputId": "d7e4833e-3637-4c56-ba17-a98f9017f4f4",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"matrix"
],
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<tf.Tensor: shape=(3, 3), dtype=float64, numpy=\n",
"array([[1., 2., 3.],\n",
"       [3., 4., 5.],\n",
"       [6., 7., 9.]])>"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "FhaufyObuLEG",
"outputId": "270d18a0-4d49-43fb-bae6-71a1f4349ea1",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"row_vector = matrix[1]\n",
"column_vector = matrix[:,2]\n",
"scalar = matrix[1, 2]\n",
"\n",
"print(\"`row_vector`: {}\".format(row_vector.numpy()))\n",
"print(\"`column_vector`: {}\".format(column_vector.numpy()))\n",
"print(\"`scalar`: {}\".format(scalar.numpy()))"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"`row_vector`: [3. 4. 5.]\n",
"`column_vector`: [3. 5. 9.]\n",
"`scalar`: 5.0\n"
],
"name": "stdout"
}
]
},
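{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same slicing syntax extends to higher-rank Tensors. As a sketch, the cell below slices a small batch of images laid out like the 4-d Tensor defined earlier. The name `batch` is illustrative, and the batch is kept small (4 images of 8 x 8 pixels) only to keep the example light."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import tensorflow as tf\n",
"\n",
"# A batch of 4 RGB images, each 8 x 8 pixels: (batch, height, width, channels)\n",
"batch = tf.zeros([4, 8, 8, 3])\n",
"\n",
"first_image = batch[0]           # indexing drops the batch dimension\n",
"red_channel = batch[:, :, :, 0]  # indexing drops the channel dimension\n",
"first_two = batch[:2]            # slicing keeps all four dimensions\n",
"\n",
"print(first_image.shape)  # (8, 8, 3)\n",
"print(red_channel.shape)  # (4, 8, 8)\n",
"print(first_two.shape)    # (2, 8, 8, 3)"
],
"execution_count": null,
"outputs": []
},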
{
"cell_type": "markdown",
"metadata": {
"id": "iD3VO-LZYZ2z"
},
"source": [
"## 1.2 Computations on Tensors\n",
"\n",
"A convenient way to think about and visualize computations in TensorFlow is in terms of graphs. We can define this graph in terms of Tensors, which hold data, and the mathematical operations that act on these Tensors in some order. Let's look at a simple example, and define this computation using TensorFlow:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/aamini/introtodeeplearning/master/lab1/img/add-graph.png)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "X_YJrZsxYZ2z",
"outputId": "079f78de-8baf-4d6e-df76-256006a8ba7b",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Create the nodes in the graph, and initialize values\n",
"a = tf.constant(15)\n",
"b = tf.constant(61)\n",
"\n",
"# Add them!\n",
"c1 = tf.add(a,b)\n",
"c2 = a + b # TensorFlow overrides the \"+\" operation so that it is able to act on Tensors\n",
"print(c1)\n",
"print(c2)"
],
"execution_count": 8,
"outputs": [
{
"output_type": "stream",
"text": [
"tf.Tensor(76, shape=(), dtype=int32)\n",
"tf.Tensor(76, shape=(), dtype=int32)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Mbfv_QOiYZ23"
},
"source": [
"Notice how we've created a computation graph consisting of TensorFlow operations, and how TensorFlow executed those operations immediately and returned the result: a Tensor with value 76.\n",
"\n",
"Now let's consider a slightly more complicated example:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/aamini/introtodeeplearning/master/lab1/img/computation-graph.png)\n",
"\n",
"Here, we take two inputs, `a, b`, and compute an output `e`. Each node in the graph represents an operation that takes some input, does some computation, and passes its output to another node.\n",
"\n",
"Let's define a simple function in TensorFlow to construct this computation function:"
]
},
{
"cell_type": "code",
"metadata": {
"scrolled": true,
"id": "PJnfzpWyYZ23"
},
"source": [
"### Defining Tensor computations ###\n",
"\n",
"# Construct a simple computation function\n",
"def func(a,b):\n",
"  '''TODO: Define the operation for c, d, e (use tf.add, tf.subtract, tf.multiply).'''\n",
"  c = tf.add(a, b)\n",
"  d = tf.subtract(b, 1)\n",
"  e = tf.multiply(c, d)\n",
"  return e"
],
"execution_count": 9,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "AwrRfDMS2-oy"
},
"source": [
"Now, we can call this function to execute the computation graph given some inputs `a,b`:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "pnwsf8w2uF7p",
"outputId": "1ba0101f-00a6-4375-a198-b1fe97d121bd",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Consider example values for a,b\n",
"a, b = 1.5, 2.5\n",
"# Execute the computation\n",
"e_out = func(a,b)\n",
"print(e_out)"
],
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"text": [
"tf.Tensor(6.0, shape=(), dtype=float32)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6HqgUIUhYZ29"
},
"source": [
"Notice how our output is a Tensor with value defined by the output of the computation, and that the output has an empty shape, `shape=()`, because it is a single scalar value."
]
},
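{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a check on the result, the computation can be traced by hand for the inputs $a = 1.5$ and $b = 2.5$ used above:\n",
"\n",
"$$c = a + b = 1.5 + 2.5 = 4.0$$\n",
"$$d = b - 1 = 2.5 - 1 = 1.5$$\n",
"$$e = c \\cdot d = 4.0 \\cdot 1.5 = 6.0$$\n",
"\n",
"This matches the value 6.0 in the printed Tensor. The `float32` dtype arises because TensorFlow converts Python floats to 32-bit floats by default."
]
},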
{
"cell_type": "markdown",
"metadata": {
"id": "1h4o9Bb0YZ29"
},
"source": [
"## 1.3 Neural networks in TensorFlow\n",
"We can also define neural networks in TensorFlow. TensorFlow uses a high-level API called [Keras](https://www.tensorflow.org/guide/keras) that provides a powerful, intuitive framework for building and training deep learning models.\n",
"\n",
"Let's first consider the example of a simple perceptron defined by just one dense layer: $ y = \\sigma(Wx + b)$, where $W$ represents a matrix of weights, $b$ is a bias, $x$ is the input, $\\sigma$ is the sigmoid activation function, and $y$ is the output. We can also visualize this operation using a graph: \n",
"\n",
"![alt text](https://raw.githubusercontent.com/aamini/introtodeeplearning/master/lab1/img/computation-graph-2.png)\n",
"\n",
"Tensors can flow through abstract types called [```Layers```](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer) -- the building blocks of neural networks. ```Layers``` implement common neural network operations, and are used to update weights, compute losses, and define inter-layer connectivity. We will first define a ```Layer``` to implement the simple perceptron defined above."
]
},
{
"cell_type": "code",
"metadata": {
"id": "HutbJk-1kHPh",
"outputId": "9f3298c4-4ae4-46a5-ba59-0556ce529ef3",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"### Defining a network Layer ###\n",
"\n",
"# n_output_nodes: number of output nodes\n",
"# input_shape: shape of the input\n",
"# x: input to the layer\n",
"\n",
"class OurDenseLayer(tf.keras.layers.Layer):\n",
"  def __init__(self, n_output_nodes):\n",
"    super(OurDenseLayer, self).__init__()\n",
"    self.n_output_nodes = n_output_nodes\n",
"\n",
"  def build(self, input_shape):\n",
"    d = int(input_shape[-1])\n",
"    # Define and initialize parameters: a weight matrix W and bias b\n",
"    # Note that parameter initialization is random!\n",
"    self.W = self.add_weight(\"weight\", shape=[d, self.n_output_nodes]) # note the dimensionality\n",
"    self.b = self.add_weight(\"bias\", shape=[1, self.n_output_nodes]) # note the dimensionality\n",
"\n",
"  def call(self, x):\n",
"    '''TODO: define the operation for z (hint: use tf.matmul)'''\n",
"    z = tf.add(tf.matmul(x, self.W), self.b)\n",
"\n",
"    '''TODO: define the operation for out (hint: use tf.sigmoid)'''\n",
"    y = tf.sigmoid(z)\n",
"    return y\n",
"\n",
"# Since layer parameters are initialized randomly, we will set a random seed for reproducibility\n",
"tf.random.set_seed(1)\n",
"layer = OurDenseLayer(3)\n",
"layer.build((1,2))\n",
"x_input = tf.constant([[1,2.]], shape=(1,2))\n",
"y = layer.call(x_input)\n",
"\n",
"# test the output!\n",
"print(y.numpy())\n",
"mdl.lab1.test_custom_dense_layer_output(y)"
],
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"text": [
"[[0.2697859 0.45750412 0.66536945]]\n",
"[PASS] test_custom_dense_layer_output\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
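{
"cell_type": "markdown",
"metadata": {},
"source": [
"The comments in `build` ask you to note the dimensionality. As a sketch of the shape bookkeeping for the example above: the input is treated as a row vector, so the layer computes $xW + b$ rather than $Wx + b$.\n",
"\n",
"$$x \\in \\mathbb{R}^{1 \\times 2}, \\quad W \\in \\mathbb{R}^{2 \\times 3}, \\quad b \\in \\mathbb{R}^{1 \\times 3}$$\n",
"$$z = xW + b \\in \\mathbb{R}^{1 \\times 3}, \\quad y = \\sigma(z) \\in \\mathbb{R}^{1 \\times 3}$$\n",
"\n",
"Each entry of $y$ lies between 0 and 1 because of the sigmoid, which is consistent with the three printed values."
]
},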
{
"cell_type": "markdown",
"metadata": {
"id": "Jt1FgM7qYZ3D"
},
"source": [
"Conveniently, TensorFlow has defined a number of ```Layers``` that are commonly used in neural networks, such as the [```Dense```](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense?version=stable) layer. Now, instead of using a single ```Layer``` to define our simple neural network, we'll use the [`Sequential`](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/Sequential) model from Keras and a single [`Dense`](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/Dense) layer to define our network. With the `Sequential` API, you can readily create neural networks by stacking together layers like building blocks."
]
},
{
"cell_type": "code",
"metadata": {
"id": "7WXTpmoL6TDz"
},
"source": [
"### Defining a neural network using the Sequential API ###\n",
"\n",
"# Import relevant packages\n",
"from tensorflow.keras import Sequential\n",
"from tensorflow.keras.layers import Dense\n",
"\n",
"# Define the number of outputs\n",
"n_output_nodes = 3\n",
"\n",
"# First define the model \n",
"model = Sequential()\n",
"\n",
"'''TODO: Define a dense (fully connected) layer to compute z'''\n",
"# Remember: dense layers are defined by the parameters W and b!\n",
"# You can read more about the initialization of W and b in the TF documentation :) \n",
"# https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense?version=stable\n",
"# Note: units is hard-coded to 2 here, so n_output_nodes above is not used by this layer\n",
"dense_layer = Dense(units=2)\n",
"\n",
"# Add the dense layer to the model\n",
"model.add(dense_layer)\n"
],
"execution_count": 12,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "HDGcwYfUyR-U"
},
"source": [
"That's it! We've defined our model using the Sequential API. Now, we can test it out using an example input:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "sg23OczByRDb",
"outputId": "0baaced8-1e46-480d-c4fe-22ec06642b16",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Test model with example input\n",
"x_input = tf.constant([[1,2.]], shape=(1,2))\n",
"\n",
"'''TODO: feed input into the model and predict the output!'''\n",
"model_output = model(x_input)\n",
"print(model_output)"
],
"execution_count": 14,
"outputs": [
{
"output_type": "stream",
"text": [
"tf.Tensor([[ 0.8787118 -0.20480263]], shape=(1, 2), dtype=float32)\n"
],
"name": "stdout"
}
]
},
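{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Dense` layer above has two units and no activation function, so its output is the linear term $z$ with two values. As a sketch of a variant that matches the perceptron $y = \\sigma(Wx + b)$ with `n_output_nodes` outputs, the cell below passes the number of units and a sigmoid activation to `Dense`. The names `perceptron_model` and `y_out` are illustrative, and because the weights are initialized randomly, the printed values will differ between runs."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import tensorflow as tf\n",
"from tensorflow.keras import Sequential\n",
"from tensorflow.keras.layers import Dense\n",
"\n",
"n_output_nodes = 3\n",
"\n",
"# A single dense layer with a sigmoid activation: y = sigmoid(xW + b)\n",
"perceptron_model = Sequential([Dense(n_output_nodes, activation=\"sigmoid\")])\n",
"\n",
"x_input = tf.constant([[1, 2.]], shape=(1, 2))\n",
"y_out = perceptron_model(x_input)\n",
"\n",
"print(y_out.shape)    # (1, 3)\n",
"print(y_out.numpy())  # three values, each between 0 and 1"
],
"execution_count": null,
"outputs": []
},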
{
"cell_type": "markdown",
"metadata": {
"id": "596NvsOOtr9F"
},
"source": [
"In addition to defining models using the `Sequential` API, we can also define neural networks by directly subclassing the [`Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model?version=stable) class, which groups layers together to enable model training and inference. The `Model` class captures what we refer to as a \"model\" or as a \"network\". Using Subclassing, we can create a class for our model, and then define the forward pass through the network using the `call` function. Subclassing affords the flexibility to define custom layers, custom training loops, custom activation functions, and custom models. Let's define the same neural network as above now using Subclassing rather than the `Sequential` model."
]
},
{
"cell_type": "code",
"metadata": {
"id": "K4aCflPVyViD"
},
"source": [
"### Defining a model using subclassing ###\n",
"\n",
"from tensorflow.keras import Model\n",
"from tensorflow.keras.layers import Dense\n",
"\n",
"class SubclassModel(tf.keras.Model):\n",
"\n",
"  # In __init__, we define the Model's layers\n",
"  def __init__(self, n_output_nodes):\n",
"    super().__init__()\n",
"    '''TODO: Our model consists of a single Dense layer. Define this layer.'''\n",
"    # Size the layer from the constructor argument instead of a hard-coded width\n",
"    self.dense_layer = Dense(units=n_output_nodes)\n",
"\n",
"  # In the call function, we define the Model's forward pass.\n",
"  def call(self, inputs):\n",
"    return self.dense_layer(inputs)"
],
"execution_count": 15,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "U0-lwHDk4irB"
},
"source": [
"Just like the model we built using the `Sequential` API, let's test out our `SubclassModel` using an example input.\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "LhB34RA-4gXb",
"outputId": "8de724fa-0a74-4a88-8e12-d7f29d68b68d",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"# Two output nodes, matching the Sequential model defined above\n",
"n_output_nodes = 2\n",
"model = SubclassModel(n_output_nodes)\n",
"\n",
"x_input = tf.constant([[1,2.]], shape=(1,2))\n",
"\n",
"# Call the model itself rather than model.call so Keras runs its own call machinery\n",
"print(model(x_input))"
],
"execution_count": 16,
"outputs": [
{
"output_type": "stream",
"text": [
"tf.Tensor([[-0.6277685 0.7001949]], shape=(1, 2), dtype=float32)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HTIFMJLAzsyE"
},
"source": [
"Importantly, Subclassing affords us a lot of flexibility to define custom models. For example, we can use boolean arguments in the `call` function to specify different network behaviors, such as different behaviors during training and inference. Suppose that in some cases we want our network to simply output the input, without any perturbation. We define a boolean argument `isidentity` to control this behavior:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "P7jzGX5D1xT5"
},
"source": [
"### Defining a model using subclassing and specifying custom behavior ###\n",
"\n",
"from tensorflow.keras import Model\n",
"from tensorflow.keras.layers import Dense\n",
"\n",
"class IdentityModel(tf.keras.Model):\n",
"\n",
"  # As before, in __init__ we define the Model's layers\n",
"  # Since our desired behavior involves the forward pass, this part is unchanged\n",
"  def __init__(self, n_output_nodes):\n",
"    super().__init__()\n",
"    self.dense_layer = tf.keras.layers.Dense(n_output_nodes, activation='sigmoid')\n",
"\n",
"  '''TODO: Implement the behavior where the network outputs the input, unchanged,\n",
"  under control of the isidentity argument.'''\n",
"  def call(self, inputs, isidentity=False):\n",
"    # Return the input untouched when isidentity is set; otherwise apply the dense layer\n",
"    if isidentity:\n",
"      return inputs\n",
"    return self.dense_layer(inputs)"
],
"execution_count": 20,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ku4rcCGx5T3y"
},
"source": [
"Let's test this behavior:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "NzC0mgbk5dp2",
"outputId": "188603c2-566d-46b5-e7a0-4ece283182ec",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"source": [
"n_output_nodes = 3\n",
"model = IdentityModel(n_output_nodes)\n",
"\n",
"x_input = tf.constant([[1,2.]], shape=(1,2))\n",
"'''TODO: pass the input into the model and call with and without the input identity option.'''\n",
"out_activate = model(x_input)\n",
"out_identity = model(x_input, isidentity=True)\n",
"\n",
"print(\"Network output with activation: {}; network identity output: {}\".format(out_activate.numpy(), out_identity.numpy()))"
],
"execution_count": 21,
"outputs": [
{
"output_type": "stream",
"text": [
"Network output with activation: [[0.19695838 0.6330006 0.7668015 ]]; network identity output: [[1. 2.]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7V1dEqdk6VI5"
},
"source": [
"Now that we have learned how to define `Layers` as well as neural networks in TensorFlow using both the `Sequential` and Subclassing APIs, we're ready to turn our attention to how to actually implement network training with backpropagation."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dQwDhKn8kbO2"
},
"source": [
"## 1.4 Automatic differentiation in TensorFlow\n",
"\n",
"[Automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation)\n",
"is one of the most important parts of TensorFlow and is the backbone of training with \n",
"[backpropagation](https://en.wikipedia.org/wiki/Backpropagation). We will use the TensorFlow GradientTape [`tf.GradientTape`](https://www.tensorflow.org/api_docs/python/tf/GradientTape?version=stable) to trace operations for computing gradients later. \n",
"\n",
"When a forward pass is made through the network, all forward-pass operations get recorded to a \"tape\"; then, to compute the gradient, the tape is played backwards. By default, the tape is discarded after it is played backwards; this means that a particular `tf.GradientTape` can only\n",
"compute one gradient, and subsequent calls throw a runtime error. However, we can compute multiple gradients over the same computation by creating a persistent gradient tape with `tf.GradientTape(persistent=True)`. \n",
"\n",
"First, we will look at how we can compute gradients using `GradientTape` and access them for computation. We define the simple function $y = x^2$ and compute the gradient:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "tdkqk8pw5yJM"
},
"source": [
"### Gradient computation with GradientTape ###\n",
"\n",
"# y = x^2\n",
"# Example: x = 3.0\n",
"x = tf.Variable(3.0)\n",
"\n",
"# Initiate the gradient tape\n",
"with tf.GradientTape() as tape:\n",
" # Define the function\n",
" y = x * x\n",
"# Access the gradient -- derivative of y with respect to x\n",
"dy_dx = tape.gradient(y, x)\n",
"\n",
"assert dy_dx.numpy() == 6.0"
],
"execution_count": 22,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "JhU5metS5xF3"
},
"source": [
"In training neural networks, we use differentiation and stochastic gradient descent (SGD) to optimize a loss function. Now that we have a sense of how `GradientTape` can be used to compute and access derivatives, we will look at an example where we use automatic differentiation and SGD to find the minimum of $L=(x-x_f)^2$. Here $x_f$ is a variable for a desired value we are trying to optimize for; $L$ represents a loss that we are trying to minimize. While we can clearly solve this problem analytically ($x_{min}=x_f$), considering how we can compute this using `GradientTape` sets us up nicely for future labs where we use gradient descent to optimize entire neural network losses."
]
},
{
"cell_type": "code",
"metadata": {
"attributes": {
"classes": [
"py"
],
"id": ""
},
"id": "7g1yWiSXqEf-",
"outputId": "9eae6abb-532c-4139-fba3-43461b93bc4e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 313
}
},
"source": [
"### Function minimization with automatic differentiation and SGD ###\n",
"\n",
"# Initialize a random value for our initial x\n",
"x = tf.Variable([tf.random.normal([1])])\n",
"print(\"Initializing x={}\".format(x.numpy()))\n",
"\n",
"learning_rate = 1e-2 # learning rate for SGD\n",
"history = []\n",
"# Define the target value\n",
"x_f = 4\n",
"\n",
"# We will run SGD for a number of iterations. At each iteration, we compute the loss, \n",
"# compute the derivative of the loss with respect to x, and perform the SGD update.\n",
"for i in range(500):\n",
"  with tf.GradientTape() as tape:\n",
"    '''TODO: define the loss as described above'''\n",
"    loss = tf.square(x - x_f)\n",
"\n",
"  # loss minimization using gradient tape\n",
"  grad = tape.gradient(loss, x) # compute the derivative of the loss with respect to x\n",
"  new_x = x - learning_rate*grad # sgd update\n",
"  x.assign(new_x) # update the value of x\n",
"  history.append(x.numpy()[0])\n",
"\n",
"# Plot the evolution of x as we optimize towards x_f!\n",
"plt.plot(history)\n",
"plt.plot([0, 500],[x_f,x_f])\n",
"plt.legend(('Predicted', 'True'))\n",
"plt.xlabel('Iteration')\n",
"plt.ylabel('x value')"
],
"execution_count": 23,
"outputs": [
{
"output_type": "stream",
"text": [
"Initializing x=[[-0.00839665]]\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Text(0, 0.5, 'x value')"
]
},
"metadata": {
"tags": []
},
"execution_count": 23
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pC7czCwk3ceH"
},
"source": [
"`GradientTape` provides an extremely flexible framework for automatic differentiation. To backpropagate errors through a neural network, we record the forward pass on the tape, use that record to compute the gradients, and then use these gradients for optimization with SGD."
]
}
]
}