cooganb/Conditional-Word-level-RNN.ipynb

## Conditional-Word-level-RNN.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Conditional Word-level RNN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Requirements\n",
    "\n",
    "You will need [PyTorch](http://pytorch.org/) to build and train the models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import unicodedata\n",
    "import string\n",
    "import re\n",
    "import random\n",
    "import time\n",
    "import math\n",
    "import ast\n",
    "\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "from torch.autograd import Variable\n",
    "from torch import optim\n",
    "import torch.nn.functional as F"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we will also define a constant to decide whether to use the GPU (with CUDA specifically) or the CPU. **If you don't have a GPU, set this to `False`**. Later when we create tensors, this variable will be used to decide whether we keep them on CPU or move them to GPU."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [],
   "source": [
    "USE_CUDA = False"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Indexing words\n",
    "\n",
    "We'll need a unique index per word to use as the inputs and targets of the networks later. To keep track of all this we will use a helper class called `Lang` which has word &rarr; index (`word2index`) and index &rarr; word (`index2word`) dictionaries, as well as a count of each word `word2count` to use to later replace rare words."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "SOA_token = 0\n",
    "EOA_token = 1\n",
    "\n",
    "class Lang:\n",
    "    '''Instantiates an object class that can keep its own word index count'''\n",
    "    def __init__(self, name):\n",
    "        self.name = name\n",
    "        self.word2index = {}\n",
    "        self.word2count = {}\n",
    "        self.index2word = {0: \"SOA\", 1: \"EOA\"}\n",
    "        self.n_words = 2 # Count SOA and EOA\n",
    "\n",
    "    def index_words(self, sentence):\n",
    "        for word in sentence.split(' '):\n",
    "            self.index_word(word)\n",
    "\n",
    "    def index_word(self, word):\n",
    "        if word not in self.word2index:\n",
    "            self.word2index[word] = self.n_words\n",
    "            self.word2count[word] = 1\n",
    "            self.index2word[self.n_words] = word\n",
    "            self.n_words += 1\n",
    "        else:\n",
    "            self.word2count[word] += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Reading and decoding files\n",
    "\n",
    "We need to create word indexes for the Tensors to train a Word-level RNN.\n",
    "\n",
    "#### Normalize text\n",
    "\n",
    "The function below normalizes the text in the lead paragraph, taking out certain symbols that will create two indexes (Bush and Bush's, for example). We're also adding a space to the HTML tag `<p>` and `</p>` so they will be treated as separate words"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def normalize_string(s):\n",
    "    '''Function meant to clean up lead paragraph strings, removing a different combo of items'''\n",
    "    # Some of these are very crude and may need to be cleaned up\n",
    "    translator = str.maketrans('', '', r'''.,'\"''')\n",
    "    s = s.lower().strip()\n",
    "    s = re.sub(r\"<p>\", r\"<p> \", s)\n",
    "    s = re.sub(r\"</p>\", r\" </p>\", s)\n",
    "    s = s.translate(translator)\n",
    "    s = re.sub(r\"[^a-zA-Z!?/<>[0-9]]+\", r\" \", s)\n",
    "\n",
    "    return s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To read the data file we will split the file into lines, and then split lines into pairs.\n",
    "\n",
    "We will also create a Lang instance, which will create a word2index and index2word we'll use for encoding / decoding Tensors.\n",
    "\n",
    "Also, some articles may not have keywords associated with them. We will drop those while reading in the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def read_lines(lang_class_name, doc):\n",
    "    '''Takes in the kw_text pairs, normalizes and tokenizes'''\n",
    "    print(\"Reading lines...\")\n",
    "\n",
    "    # The input document is a list of keyword and text pairings\n",
    "    # joined by tab /t and then separated by a line break /n\n",
    "\n",
    "    # Read the file and split into lines\n",
    "    lines = open(doc).read().strip().split('\\n')\n",
    "\n",
    "    # Split every line into pairs and normalizes pair[1] which is lead text\n",
    "\n",
    "    pairs = [l.split('\\t') for l in lines]\n",
    "    normalizationFail = 0\n",
    "    for pair in pairs:\n",
    "            \n",
    "        try:\n",
    "            pair[0] = ast.literal_eval(pair[0])\n",
    "            pair[1] = normalize_string(str(pair[1]))\n",
    "\n",
    "        except:\n",
    "            print(\"Issue with normalizing strings\")\n",
    "            pairs.remove(pair)\n",
    "\n",
    "    \n",
    "    lang_class_name = Lang(lang_class_name)\n",
    "    return lang_class_name, pairs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The full process for preparing the data is:\n",
    "\n",
    "* Read text file and split into lines, split lines into pairs\n",
    "* Normalize text\n",
    "* Make word lists from sentences in pairs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading lines...\n",
      "Read 4070 sentence pairs\n",
      "Indexing words...\n",
      "2\n"
     ]
    }
   ],
   "source": [
    "def prepare_data(nyt_russ, doc):\n",
    "    '''Preprocesses text data, returns a Lang object and list'''\n",
    "    all_categories = set()\n",
    "    \n",
    "    # Split, organize, preprocess and clean text\n",
    "\n",
    "    nyt_russ, pairs = read_lines(nyt_russ, doc)\n",
    "    print(\"Read %s sentence pairs\" % len(pairs))\n",
    "    \n",
    "    # Using Lang object returned from read_lead_kw,\n",
    "    # Create a word index list\n",
    "\n",
    "    print(\"Indexing words...\")\n",
    "    for pair in pairs:\n",
    "\n",
    "        try:\n",
    "            nyt_russ.index_words(pair[1])\n",
    "\n",
    "        except:\n",
    "            print(\"Issue Indexing\")\n",
    "        \n",
    "        try:\n",
    "\n",
    "            all_categories = all_categories.union(set(pair[0]))\n",
    "\n",
    "        except:\n",
    "            print(\"Issue creating categories\")\n",
    "    \n",
    "    all_categories = list(all_categories)\n",
    "    \n",
    "    return nyt_russ, pairs, all_categories\n",
    "\n",
    "nyt_russ, pairs, all_categories = prepare_data('nyt_russ', 'outfile.txt')\n",
    "\n",
    "# Print an example pair\n",
    "# print(random.choice(pairs))\n",
    "print(nyt_russ.word2index['<p>'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preparing for Training\n",
    "\n",
    "Grab a random training sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['united states armament and defense'], '<p> you dont need to be clausewitz to figure out what finally brought slobodan milosevic to the negotiating table on natos terms and why he is now stalling again the answer to both questions is russia </p> <p> the best way to understand russias critical role is to think of a divorce proceeding: you are negotiating a divorce with your wife she has no lawyer and you have johnnie cochran every time your wife asks for something mr cochran steps in to protect you with a point of law things are going so well you ask for a lunch break when you come back from lunch you find that mr cochran has switched sides and is now acting as your wifes lawyer nodding in agreement at all her demands youre in trouble </p>']\n"
     ]
    }
   ],
   "source": [
    "# Get a random sample\n",
    "def random_training_pair():\n",
    "    training_sample = random.choice(pairs)\n",
    "    return training_sample\n",
    "\n",
    "print(random_training_pair())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For each timestep (that is, for each word in a training sequence) the inputs of the network will be `(category, current word, hidden state)` and the outputs will be `(next word, next hidden state)`. So for each training set, we'll need the category vector, a set of input words, and a set of output/target words."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we are predicting the next letter from the current letter for each timestep, the letter pairs are groups of consecutive letters from the line - e.g. for `\"Apple Banana Compost Diderot <EOA>\"` we would create (\"Apple\", \"Banana\"), (\"Banana\", \"Compost\"), (\"Compost\", \"Diderot\"), (\"D\", \"EOA\").\n",
    "\n",
    "![](https://i.imgur.com/JH58tXY.png)\n",
    "\n",
    "The category tensor is a tensor of size `<1 x n_categories>`. When training we feed it to the network at every timestep - this is a design choice, it could have been included as part of initial hidden state or some other strategy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [],
   "source": [
    "n_words = len(nyt_russ.word2index)\n",
    "n_categories = len(all_categories)\n",
    "\n",
    "# Vector with multiply entries for each category present\n",
    "\n",
    "def make_category_input(categories):\n",
    "#     tensor = torch.zeros(1, n_categories)\n",
    "    l = [0] * n_categories\n",
    "    for i in categories:\n",
    "        li = all_categories.index(i)\n",
    "        l[li] = 1\n",
    "    var = torch.LongTensor(l)\n",
    "    var = var.view(1, n_categories)\n",
    "#     var = torch.unsqueeze(var, 0)\n",
    "    var = Variable(var)\n",
    "    if USE_CUDA: var = var.cuda()\n",
    "    \n",
    "    return var\n",
    "\n",
    "# Return a list of indexes, one for each word in the sentence\n",
    "# Pad to meet MAX_LENGTH (needed because issues arose of variable input length)\n",
    "\n",
    "MAX_LENGTH = 300\n",
    "\n",
    "def indexes_from_input(lang, input):\n",
    "    wordIndex = [lang.word2index[word] for word in input.split(' ')]\n",
    "#     if len(wordIndex) > MAX_LENGTH:\n",
    "#         fullLength = MAX_LENGTH - 1\n",
    "#         wordIndex = wordIndex[0:fullLength]\n",
    "#     else:\n",
    "#         diff = MAX_LENGTH - len(wordIndex)\n",
    "#         l = []\n",
    "#         l = [0] * diff\n",
    "#         wordIndex = wordIndex + l\n",
    "        \n",
    "    return wordIndex\n",
    "\n",
    "# FIRST MATRIX\n",
    "# One-hot matrix of first to last letters (not including EOS) for input\n",
    "# def make_word_input(lang, words):\n",
    "#     indexList = indexes_from_input(lang, words)\n",
    "#     var = Variable(torch.FloatTensor(indexList).view(-1, 1))\n",
    "# #     print('var =', var)\n",
    "#     if USE_CUDA: var = var.cuda()\n",
    "    \n",
    "#     return var\n",
    "\n",
    "\n",
    "# NEW DRAFT MATRIX\n",
    "# One-hot matrix of first to last letters (not including EOS) for input\n",
    "def make_word_input(lang, words):\n",
    "    # Return a list of word entries for text:\n",
    "    indexList = indexes_from_input(lang, words)\n",
    "    \n",
    "    # Make a 2D matrix, a stack of one-hot vectors\n",
    "    # 1 x n_words, representing the text\n",
    "    \n",
    "    tensor = torch.LongTensor(len(words), n_words)\n",
    "    words = words.split(' ')\n",
    "    for wi in range(len(words)):\n",
    "        tensor[wi][indexList[wi]] = 1\n",
    "    tensor = tensor.view(-1, 1, n_words)\n",
    "    var = Variable(tensor)\n",
    "    \n",
    "    #     print('var =', var)\n",
    "    if USE_CUDA: var = var.cuda()\n",
    "    \n",
    "    return var\n",
    "\n",
    "# LongTensor of second word to end token (EOS) for target\n",
    "\n",
    "def make_target(lang, words):\n",
    "    indexList = indexes_from_input(lang, words)\n",
    "    \n",
    "    # Shift it all one forward, append EOA to make it a target\n",
    "    indexList = indexList[1:]\n",
    "    indexList.append(1) # EOA\n",
    "    \n",
    "    # Create a stack of one-hot tensors like in make_word_input\n",
    "    tensor = torch.zeros(len(words), n_words)\n",
    "    words = words.split(' ')\n",
    "    for wi in range(len(words)):\n",
    "        tensor[wi][indexList[wi]] = 1\n",
    "    tensor = tensor.view(-1, 1, n_words)\n",
    "    var = Variable(tensor)\n",
    "    \n",
    "#     print('var =', var)\n",
    "    \n",
    "    if USE_CUDA: var = var.cuda()\n",
    "    return var"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For convenience during training we'll make a `random_training_set` function that fetches a random (category, line) pair and turns them into the required (category, input, target) tensors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Make category, input, and target tensors from a random category, line pair\n",
    "def random_training_set():\n",
    "    training_sample = random_training_pair()\n",
    "    category_input = make_category_input(training_sample[0])\n",
    "    line_input = make_word_input(nyt_russ, training_sample[1])\n",
    "    line_target = make_target(nyt_russ, training_sample[1])\n",
    "    \n",
    "    return category_input, line_input, line_target\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Creating the Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RNN(nn.Module):\n",
    "    def __init__(self, input_size, hidden_size, output_size):\n",
    "        super(RNN, self).__init__()\n",
    "        self.input_size = input_size\n",
    "        self.hidden_size = hidden_size\n",
    "        self.output_size = output_size\n",
    "        self.bias = False\n",
    "        print(input_size, hidden_size, output_size)\n",
    "        self.i2h = nn.GRU(n_categories + input_size + hidden_size, hidden_size)\n",
    "        self.i2o = nn.Linear(n_categories + input_size + hidden_size, output_size)\n",
    "        self.o2o = nn.Linear(hidden_size + output_size, output_size)\n",
    "        self.softmax = nn.LogSoftmax()\n",
    "    \n",
    "    def forward(self, category, input, hidden):\n",
    "        print(\"Category: \",category.data.type(), category.size(), \"\\n\", \"Input: \",input.data.type(), input.size(), \"\\n\", \"Hidden: \", hidden.data.type(), hidden.size(), \"\\n\")\n",
    "        input_combined = torch.cat((category, input, hidden), 1)\n",
    "        print(\"Input_Combined: \", input_combined.data.type())\n",
    "        hidden = self.i2h(input_combined)\n",
    "        output = self.i2o(input_combined)\n",
    "        output_combined = torch.cat((hidden, output), 1)\n",
    "        output = self.o2o(output_combined)\n",
    "        return output, hidden\n",
    "\n",
    "    def init_hidden(self):\n",
    "        hidden = Variable(torch.LongTensor(1, self.hidden_size))\n",
    "        if USE_CUDA: hidden = hidden.cuda()\n",
    "        return hidden\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training the Network\n",
    "\n",
    "In contrast to classification, where only the last output is used, we are making a prediction at every step, so we are calculating loss at every step.\n",
    "\n",
    "The magic of autograd allows you to simply sum these losses at each step and call backward at the end. But don't ask me why initializing loss with 0 works."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [],
   "source": [
    "def train(category_tensor, input_line_tensor, target_line_tensor):\n",
    "    hidden = rnn.init_hidden()\n",
    "    optimizer.zero_grad()\n",
    "    loss = 0\n",
    "\n",
    "\n",
    "    ## The below needs to be the range of the size of the incoming article\n",
    "\n",
    "    \n",
    "    for i in range(input_line_tensor.size()[1]):\n",
    "        output, hidden = rnn(category_tensor, input_line_tensor[i], hidden)\n",
    "        print(output, target_line_tensor[i])\n",
    "        loss += criterion(output, target_line_tensor[i])\n",
    "\n",
    "    loss.backward()\n",
    "    optimizer.step()\n",
    "    \n",
    "    return output, loss.data[0] / input_line_tensor.size()[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To keep track of how long training takes I am adding a `time_since(t)` function which returns a human readable string:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [],
   "source": [
    "def time_since(t):\n",
    "    now = time.time()\n",
    "    s = now - t\n",
    "    m = math.floor(s / 60)\n",
    "    s -= m * 60\n",
    "    return '%dm %ds' % (m, s)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Training is business as usual - call train a bunch of times and wait a few minutes, printing the current time and loss every `print_every` epochs, and keeping store of an average loss per `plot_every` epochs in `all_losses` for plotting later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "21141 128 21141\n",
      "Category:  torch.LongTensor torch.Size([1, 56]) \n",
      " Input:  torch.LongTensor torch.Size([1, 21141]) \n",
      " Hidden:  torch.LongTensor torch.Size([1, 128]) \n",
      "\n",
      "Input_Combined:  torch.LongTensor\n"
     ]
    },
    {
     "ename": "TypeError",
     "evalue": "torch.addmm received an invalid combination of arguments - got (int, torch.LongTensor, int, torch.LongTensor, torch.FloatTensor, out=torch.LongTensor), but expected one of:\n * (torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n      didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n * (int beta, torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n      didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[31;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-117-3570efd110ad>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m     20\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     21\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mepoch\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn_epochs\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 22\u001b[0;31m     \u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mloss\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mrandom_training_set\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     23\u001b[0m     \u001b[0mloss_avg\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mloss\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     24\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m<ipython-input-106-6ff071efaec3>\u001b[0m in \u001b[0;36mtrain\u001b[0;34m(category_tensor, input_line_tensor, target_line_tensor)\u001b[0m\n\u001b[1;32m      9\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     10\u001b[0m     \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_line_tensor\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m         \u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mrnn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcategory_tensor\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput_line_tensor\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     12\u001b[0m         \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_line_tensor\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     13\u001b[0m         \u001b[0mloss\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mcriterion\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_line_tensor\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m    222\u001b[0m         \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_pre_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    223\u001b[0m             \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 224\u001b[0;31m         \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    225\u001b[0m         \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    226\u001b[0m             \u001b[0mhook_result\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m<ipython-input-116-31a4f7187888>\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, category, input, hidden)\u001b[0m\n\u001b[1;32m     16\u001b[0m         \u001b[0minput_combined\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcategory\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     17\u001b[0m         \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Input_Combined: \"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput_combined\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 18\u001b[0;31m         \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mi2h\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_combined\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     19\u001b[0m         \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mi2o\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_combined\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     20\u001b[0m         \u001b[0moutput_combined\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m    222\u001b[0m         \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_pre_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    223\u001b[0m             \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 224\u001b[0;31m         \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    225\u001b[0m         \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    226\u001b[0m             \u001b[0mhook_result\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/modules/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, input, hx)\u001b[0m\n\u001b[1;32m    160\u001b[0m             \u001b[0mflat_weight\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mflat_weight\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    161\u001b[0m         )\n\u001b[0;32m--> 162\u001b[0;31m         \u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mall_weights\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    163\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mis_packed\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    164\u001b[0m             \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mPackedSequence\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch_sizes\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, *fargs, **fkwargs)\u001b[0m\n\u001b[1;32m    349\u001b[0m         \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    350\u001b[0m             \u001b[0mfunc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mAutogradRNN\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 351\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0mfargs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mfkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    352\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    353\u001b[0m     \u001b[0;32mreturn\u001b[0m \u001b[0mforward\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, weight, hidden)\u001b[0m\n\u001b[1;32m    242\u001b[0m             \u001b[0minput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtranspose\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    243\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 244\u001b[0;31m         \u001b[0mnexth\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    245\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    246\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mbatch_first\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mbatch_sizes\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, hidden, weight)\u001b[0m\n\u001b[1;32m     82\u001b[0m                 \u001b[0ml\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mi\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mnum_directions\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mj\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     83\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 84\u001b[0;31m                 \u001b[0mhy\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minner\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0ml\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0ml\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     85\u001b[0m                 \u001b[0mnext_hidden\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhy\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     86\u001b[0m                 \u001b[0mall_output\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, hidden, weight)\u001b[0m\n\u001b[1;32m    111\u001b[0m         \u001b[0msteps\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mreverse\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    112\u001b[0m         \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0msteps\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 113\u001b[0;31m             \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minner\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    114\u001b[0m             \u001b[0;31m# hack to handle LSTM\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    115\u001b[0m             \u001b[0moutput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mGRUCell\u001b[0;34m(input, hidden, w_ih, w_hh, b_ih, b_hh)\u001b[0m\n\u001b[1;32m     52\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mstate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgh\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mb_ih\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mstate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgh\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_ih\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_hh\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     53\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 54\u001b[0;31m     \u001b[0mgi\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mF\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_ih\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_ih\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     55\u001b[0m     \u001b[0mgh\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mF\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_hh\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_hh\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     56\u001b[0m     \u001b[0mi_r\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi_i\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi_n\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchunk\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py\u001b[0m in \u001b[0;36mlinear\u001b[0;34m(input, weight, bias)\u001b[0m\n\u001b[1;32m    553\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0maddmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbias\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    554\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 555\u001b[0;31m     \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    556\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0mbias\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    557\u001b[0m         \u001b[0moutput\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mbias\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py\u001b[0m in \u001b[0;36mmatmul\u001b[0;34m(self, other)\u001b[0m\n\u001b[1;32m    558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    559\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 560\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    561\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    562\u001b[0m     \u001b[0;34m@\u001b[0m\u001b[0mstaticmethod\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/functional.py\u001b[0m in \u001b[0;36mmatmul\u001b[0;34m(tensor1, tensor2, out)\u001b[0m\n\u001b[1;32m    166\u001b[0m     \u001b[0;32melif\u001b[0m \u001b[0mdim_tensor1\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m1\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mdim_tensor2\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    167\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mout\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 168\u001b[0;31m             \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtensor1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0munsqueeze\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtensor2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msqueeze_\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    169\u001b[0m         \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    170\u001b[0m             \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtensor1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0munsqueeze\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtensor2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mout\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mout\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msqueeze_\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py\u001b[0m in \u001b[0;36mmm\u001b[0;34m(self, matrix)\u001b[0m\n\u001b[1;32m    577\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0mmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    578\u001b[0m         \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mVariable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnew\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 579\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0mAddmm\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mapply\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    580\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    581\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0mbmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/blas.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(ctx, add_matrix, matrix1, matrix2, alpha, beta, inplace)\u001b[0m\n\u001b[1;32m     24\u001b[0m         \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_get_output\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mctx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0madd_matrix\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minplace\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0minplace\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     25\u001b[0m         return torch.addmm(alpha, add_matrix, beta,\n\u001b[0;32m---> 26\u001b[0;31m                            matrix1, matrix2, out=output)\n\u001b[0m\u001b[1;32m     27\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     28\u001b[0m     \u001b[0;34m@\u001b[0m\u001b[0mstaticmethod\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mTypeError\u001b[0m: torch.addmm received an invalid combination of arguments - got (int, torch.LongTensor, int, torch.LongTensor, torch.FloatTensor, out=torch.LongTensor), but expected one of:\n * (torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n      didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n * (int beta, torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n      didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[31;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n"
     ]
    }
   ],
   "source": [
    "n_epochs = 100000\n",
    "print_every = 10\n",
    "plot_every = 500\n",
    "all_losses = []\n",
    "loss_avg = 0 # Zero every plot_every epochs to keep a running average\n",
    "learning_rate = 0.0005\n",
    "\n",
    "rnn = RNN(n_words, 128, n_words)\n",
    "\n",
    "# Move models to GPU\n",
    "if USE_CUDA:\n",
    "    rnn.cuda()\n",
    "\n",
    "\n",
    "\n",
    "optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)\n",
    "criterion = nn.CrossEntropyLoss()\n",
    "\n",
    "start = time.time()\n",
    "\n",
    "for epoch in range(1, n_epochs + 1):\n",
    "    output, loss = train(*random_training_set())\n",
    "    loss_avg += loss\n",
    "    \n",
    "    if epoch % print_every == 0:\n",
    "        print('%s (%d %d%%) %.4f' % (time_since(start), epoch, epoch / n_epochs * 100, loss))\n",
    "\n",
    "    if epoch % plot_every == 0:\n",
    "        all_losses.append(loss_avg / plot_every)\n",
    "        loss_avg = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Plotting the Network\n",
    "\n",
    "Plotting the historical loss from all_losses (hopefully) shows the network learning:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<matplotlib.lines.Line2D at 0x109b00358>]"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAD8CAYAAABzTgP2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADqFJREFUeJzt23+o3fV9x/Hnq7k0axE00WitMbu2CiNu0MJBKdvA1V9x\n0EZa/7D7o2FryR+rf6yl0BTHtOof6tZZSruN0BZCYdXOURqQItFWGGNYT6yjzdo0t7HFpLZNjQhO\nqmR974/7dTufy4k3ud9z78nR5wMO93y/38+99/3xgs97zvcmVYUkSa9607QHkCSdWQyDJKlhGCRJ\nDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ15qY9wEqcd955NT8/P+0xJGmm7N+//9dVtWm5dTMZ\nhvn5eYbD4bTHkKSZkuRnp7LOt5IkSQ3DIElqGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJ\nDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEYJEkNwyBJahgGSVLDMEiSGoZBktQwDJKk\nhmGQJDUMgySpMZEwJNmW5GCShSS7xlxfn+SB7vrjSeaXXN+S5MUkn5zEPJKklesdhiTrgC8CNwBb\ngQ8l2bpk2UeA56vqUuA+4J4l1/8e+FbfWSRJ/U3iFcMVwEJVHa6qV4D7ge1L1mwH9nTPHwSuThKA\nJDcCTwMHJjCLJKmnSYThIuCZkeMj3bmxa6rqBPACcG6Ss4BPAZ+ZwBySpAmY9s3n24H7qurF5RYm\n2ZlkmGR47Nix1Z9Mkt6g5ibwNY4CF48cb+7OjVtzJMkccDbwHHAlcFOSe4FzgN8m+U1VfWHpN6mq\n3cBugMFgUBOYW5I0xiTC8ARwWZJLWAzAzcCfLVmzF9gB/AdwE/Dtqirgj19dkOR24MVxUZAkrZ3e\nYaiqE0luAR4G1gFfqaoDSe4AhlW1F/gy8NUkC8BxFuMhSToDZfEX99kyGAxqOBxOewxJmilJ9lfV\nYLl10775LEk6wxgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElqGAZJUsMwSJIahkGS\n1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEYJEkNwyBJ\nahgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpMZEwJNmW5GCShSS7xlxfn+SB7vrjSea789cm2Z/k\n+93H905iHknSyvUOQ5J1wBeBG4CtwIeSbF2y7CPA81V1KXAfcE93/tfA+6rqD4AdwFf7ziNJ6mcS\nrxiuABaq6nBVvQLcD2xfsmY7sKd7/iBwdZJU1feq6ufd+QPAW5Ksn8BMkqQVmkQYLgKeGTk+0p0b\nu6aqTgAvAOcuWfNB4MmqenkCM0mSVmhu2gMAJLmcxbeXrnuNNTuBnQBbtmxZo8kk6Y1nEq8YjgIX\njxxv7s6NXZNkDjgbeK473gx8A/hwVf3kZN+kqnZX1aCqBps2bZrA2JKkcSYRhieAy5JckuTNwM3A\n3iVr9rJ4cxngJuDbVVVJzgEeAnZV1b9PYBZJUk+9w9DdM7gFeBj4IfD1qjqQ5I4k7++WfRk4N8kC\n8Ang1T9pvQW4FPibJE91j/P7ziRJWrlU1bRnOG2DwaCGw+G0x5CkmZJkf1UNllvnv3yWJDUMgySp\nYRgkSQ3DIElqGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLU\nMAySpIZhkCQ1DIMkqWEYJEkNwyBJahgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElq\nGAZJUsMwSJIaEwlDkm1JDiZZSLJrzPX1SR7orj+eZH7k2qe78weTXD+JeSRJK9c7DEnWAV8EbgC2\nAh9KsnXJso8Az1fVpcB9wD3d524FbgYuB7YB/9B9PUnSlEziFcMVwEJVHa6qV4D7ge1L1mwH9nTP\nHwSuTpLu/P1V9XJVPQ0sdF9PkjQlkwjDRcAzI8dHunNj11TVCeAF4NxT/FxJ0hqamZvPSXYmGSYZ\nHjt2bNrjSNLr1iTCcBS4eOR4c3du7Jokc8DZwHOn+LkAVNXuqhpU1WDTpk0TGFuSNM4kwvAEcFmS\nS5K8mcWbyXuXrNkL7Oie3wR8u6qqO39z91dLlwCXAd+dwEySpBWa6/sFqupEkluAh4F1wFeq6kCS\nO4BhVe0Fvgx8NckCcJzFeNCt+zrwX8AJ4GNV9T99Z5IkrVwWf3GfLYPBoIbD4bTHkKSZkmR/VQ2W\nWzczN58lSWvDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElqGAZJUsMwSJIahkGS1DAMkqSG\nYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEYJEkNwyBJahgGSVLD\nMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSY1eYUiyMcm+JIe6jxtOsm5Ht+ZQkh3dubcmeSjJj5Ic\nSHJ3n1kkSZPR9xXDLuDRqroMeLQ7biTZCNwGXAlcAdw2EpC/q6rfA94N/GGSG3rOI0nqqW8YtgN7\nuud7gBvHrLke2FdVx6vqeWAfsK2qXqqq7wBU1SvAk8DmnvNIknrqG4YLqurZ7vkvgAvGrLkIeGbk\n+Eh37v8kOQd4H4uvOiRJUzS33IIkjwBvG3Pp1tGDqqokdboDJJkDvgZ8vqoOv8a6ncBOgC1btpzu\nt5EknaJlw1BV15zsWpJfJrmwqp5NciHwqzHLjgJXjRxvBh4bOd4NHKqqzy0zx+5uLYPB4LQDJEk6\nNX3fStoL7Oie7wC+OWbNw8B1STZ0N52v686R5C7gbOCves4hSZqQvmG4G7g2ySHgmu6YJIMkXwKo\nquPAncAT3eOOqjqeZDOLb0dtBZ5M8lSSj/acR5LUU6pm712ZwWBQw+Fw2mNI0kxJsr+qBsut818+\nS5IahkGS1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEY\nJEkNwyBJahgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElqGAZJUsMwSJIahkGS1DAM\nkqSGYZAkNQyDJKlhGCRJjV5hSLIxyb4kh7qPG06ybke35lCSHWOu703ygz6zSJImo+8rhl3Ao1V1\nGfBod9xIshG4DbgSuAK4bTQgST4AvNhzDknShPQNw3ZgT/d8D3DjmDXXA/uq6nhVPQ/sA7YBJDkL\n+ARwV885JEkT0jcMF1TVs93zXwAXjFlzEfDMyPGR7hzAncBngZd6ziFJmpC55RYkeQR425hLt44e\nVFUlqVP9xkneBbyzqj6eZP4U1u8EdgJs2bLlVL+NJOk0LRuGqrrmZNeS/DLJhVX1bJILgV+NWXYU\nuGrkeDPwGPAeYJDkp90c5yd5rKquYoyq2g3sBhgMBqccIEnS6en7VtJe4NW/MtoBfHPMmoeB65Js\n6G46Xwc8XFX/WFVvr6p54I+AH58sCpKktdM3DHcD1yY5BFzTHZNkkORLAFV1nMV7CU90jzu6c5Kk\nM1CqZu9dmcFgUMPhcNpjSNJMSbK/qgbLrfNfPkuSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElq\nGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1\nDIMkqWEYJEkNwyBJahgGSVLDMEiSGqmqac9w2pIcA3427TlO03nAr6c9xBpzz28M7nl2/G5VbVpu\n0UyGYRYlGVbVYNpzrCX3/Mbgnl9/fCtJktQwDJKkhmFYO7unPcAUuOc3Bvf8OuM9BklSw1cMkqSG\nYZigJBuT7EtyqPu44STrdnRrDiXZMeb63iQ/WP2J++uz5yRvTfJQkh8lOZDk7rWd/vQk2ZbkYJKF\nJLvGXF+f5IHu+uNJ5keufbo7fzDJ9Ws5dx8r3XOSa5PsT/L97uN713r2lejzM+6ub0nyYpJPrtXM\nq6KqfEzoAdwL7Oqe7wLuGbNmI3C4+7ihe75h5PoHgH8GfjDt/az2noG3An/SrXkz8G/ADdPe00n2\nuQ74CfCObtb/BLYuWfOXwD91z28GHuieb+3Wrwcu6b7OumnvaZX3/G7g7d3z3weOTns/q7nfkesP\nAv8CfHLa++nz8BXDZG0H9nTP9wA3jllzPbCvqo5X1fPAPmAbQJKzgE8Ad63BrJOy4j1X1UtV9R2A\nqnoFeBLYvAYzr8QVwEJVHe5mvZ/FvY8a/W/xIHB1knTn76+ql6vqaWCh+3pnuhXvuaq+V1U/784f\nAN6SZP2aTL1yfX7GJLkReJrF/c40wzBZF1TVs93zXwAXjFlzEfDMyPGR7hzAncBngZdWbcLJ67tn\nAJKcA7wPeHQ1hpyAZfcwuqaqTgAvAOee4ueeifrsedQHgSer6uVVmnNSVrzf7pe6TwGfWYM5V93c\ntAeYNUkeAd425tKtowdVVUlO+U++krwLeGdVfXzp+5bTtlp7Hvn6c8DXgM9X1eGVTakzUZLLgXuA\n66Y9yyq7Hbivql7sXkDMNMNwmqrqmpNdS/LLJBdW1bNJLgR+NWbZUeCqkePNwGPAe4BBkp+y+HM5\nP8ljVXUVU7aKe37VbuBQVX1uAuOulqPAxSPHm7tz49Yc6WJ3NvDcKX7umajPnkmyGfgG8OGq+snq\nj9tbn/1eCdyU5F7gHOC3SX5TVV9Y/bFXwbRvcryeHsDf0t6IvXfMmo0svg+5oXs8DWxcsmae2bn5\n3GvPLN5P+VfgTdPeyzL7nGPxpvkl/P+NycuXrPkY7Y3Jr3fPL6e9+XyY2bj53GfP53TrPzDtfazF\nfpesuZ0Zv/k89QFeTw8W31t9FDgEPDLyP78B8KWRdX/B4g3IBeDPx3ydWQrDivfM4m9kBfwQeKp7\nfHTae3qNvf4p8GMW/3Ll1u7cHcD7u+e/w+JfpCwA3wXeMfK5t3afd5Az9C+vJrln4K+B/x75uT4F\nnD/t/azmz3jka8x8GPyXz5Kkhn+VJElqGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJjf8F\nFDYZsBaypoYAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x10842c470>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.ticker as ticker\n",
    "%matplotlib inline\n",
    "\n",
    "plt.figure()\n",
    "plt.plot(all_losses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}