Skip to content

Instantly share code, notes, and snippets.

@cooganb
Created November 4, 2017 20:40
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save cooganb/cde46d0fa3e1769e68724b0c81e9f2f4 to your computer and use it in GitHub Desktop.
Save cooganb/cde46d0fa3e1769e68724b0c81e9f2f4 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conditional Word-level RNN"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Requirements\n",
"\n",
"You will need [PyTorch](http://pytorch.org/) to build and train the models"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import unicodedata\n",
"import string\n",
"import re\n",
"import random\n",
"import time\n",
"import math\n",
"import ast\n",
"\n",
"import torch\n",
"import torch.nn as nn\n",
"from torch.autograd import Variable\n",
"from torch import optim\n",
"import torch.nn.functional as F"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we will also define a constant to decide whether to use the GPU (with CUDA specifically) or the CPU. **If you don't have a GPU, set this to `False`**. Later when we create tensors, this variable will be used to decide whether we keep them on CPU or move them to GPU."
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [],
"source": [
"USE_CUDA = False"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Indexing words\n",
"\n",
"We'll need a unique index per word to use as the inputs and targets of the networks later. To keep track of all this we will use a helper class called `Lang` which has word → index (`word2index`) and index → word (`index2word`) dictionaries, as well as a count of each word `word2count` to use to later replace rare words."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"SOA_token = 0\n",
"EOA_token = 1\n",
"\n",
"class Lang:\n",
" '''Instantiates an object class that can keep its own word index count'''\n",
" def __init__(self, name):\n",
" self.name = name\n",
" self.word2index = {}\n",
" self.word2count = {}\n",
" self.index2word = {0: \"SOA\", 1: \"EOA\"}\n",
" self.n_words = 2 # Count SOA and EOA\n",
"\n",
" def index_words(self, sentence):\n",
" for word in sentence.split(' '):\n",
" self.index_word(word)\n",
"\n",
" def index_word(self, word):\n",
" if word not in self.word2index:\n",
" self.word2index[word] = self.n_words\n",
" self.word2count[word] = 1\n",
" self.index2word[self.n_words] = word\n",
" self.n_words += 1\n",
" else:\n",
" self.word2count[word] += 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Reading and decoding files\n",
"\n",
"We need to create word indexes for the Tensors to train a Word-level RNN.\n",
"\n",
"#### Normalize text\n",
"\n",
"The function below normalizes the text in the lead paragraph, taking out certain symbols that will create two indexes (Bush and Bush's, for example). We're also adding a space to the HTML tag `<p>` and `</p>` so they will be treated as separate words"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def normalize_string(s):\n",
" '''Function meant to clean up lead paragraph strings, removing a different combo of items'''\n",
" # Some of these are very crude and may need to be cleaned up\n",
" translator = str.maketrans('', '', r'''.,'\"''')\n",
" s = s.lower().strip()\n",
" s = re.sub(r\"<p>\", r\"<p> \", s)\n",
" s = re.sub(r\"</p>\", r\" </p>\", s)\n",
" s = s.translate(translator)\n",
" s = re.sub(r\"[^a-zA-Z!?/<>[0-9]]+\", r\" \", s)\n",
"\n",
" return s"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To read the data file we will split the file into lines, and then split lines into pairs.\n",
"\n",
"We will also create a Lang instance, which will create a word2index and index2word we'll use for encoding / decoding Tensors.\n",
"\n",
"Also, some articles may not have keywords associated with them. We will drop those while reading in the data."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"def read_lines(lang_class_name, doc):\n",
" '''Takes in the kw_text pairs, normalizes and tokenizes'''\n",
" print(\"Reading lines...\")\n",
"\n",
" # The input document is a list of keyword and text pairings\n",
" # joined by tab /t and then separated by a line break /n\n",
"\n",
" # Read the file and split into lines\n",
" lines = open(doc).read().strip().split('\\n')\n",
"\n",
" # Split every line into pairs and normalizes pair[1] which is lead text\n",
"\n",
" pairs = [l.split('\\t') for l in lines]\n",
" normalizationFail = 0\n",
" for pair in pairs:\n",
" \n",
" try:\n",
" pair[0] = ast.literal_eval(pair[0])\n",
" pair[1] = normalize_string(str(pair[1]))\n",
"\n",
" except:\n",
" print(\"Issue with normalizing strings\")\n",
" pairs.remove(pair)\n",
"\n",
" \n",
" lang_class_name = Lang(lang_class_name)\n",
" return lang_class_name, pairs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The full process for preparing the data is:\n",
"\n",
"* Read text file and split into lines, split lines into pairs\n",
"* Normalize text\n",
"* Make word lists from sentences in pairs"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Reading lines...\n",
"Read 4070 sentence pairs\n",
"Indexing words...\n",
"2\n"
]
}
],
"source": [
"def prepare_data(nyt_russ, doc):\n",
" '''Preprocesses text data, returns a Lang object and list'''\n",
" all_categories = set()\n",
" \n",
" # Split, organize, preprocess and clean text\n",
"\n",
" nyt_russ, pairs = read_lines(nyt_russ, doc)\n",
" print(\"Read %s sentence pairs\" % len(pairs))\n",
" \n",
" # Using Lang object returned from read_lead_kw,\n",
" # Create a word index list\n",
"\n",
" print(\"Indexing words...\")\n",
" for pair in pairs:\n",
"\n",
" try:\n",
" nyt_russ.index_words(pair[1])\n",
"\n",
" except:\n",
" print(\"Issue Indexing\")\n",
" \n",
" try:\n",
"\n",
" all_categories = all_categories.union(set(pair[0]))\n",
"\n",
" except:\n",
" print(\"Issue creating categories\")\n",
" \n",
" all_categories = list(all_categories)\n",
" \n",
" return nyt_russ, pairs, all_categories\n",
"\n",
"nyt_russ, pairs, all_categories = prepare_data('nyt_russ', 'outfile.txt')\n",
"\n",
"# Print an example pair\n",
"# print(random.choice(pairs))\n",
"print(nyt_russ.word2index['<p>'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preparing for Training\n",
"\n",
"Grab a random training sample"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[['united states armament and defense'], '<p> you dont need to be clausewitz to figure out what finally brought slobodan milosevic to the negotiating table on natos terms and why he is now stalling again the answer to both questions is russia </p> <p> the best way to understand russias critical role is to think of a divorce proceeding: you are negotiating a divorce with your wife she has no lawyer and you have johnnie cochran every time your wife asks for something mr cochran steps in to protect you with a point of law things are going so well you ask for a lunch break when you come back from lunch you find that mr cochran has switched sides and is now acting as your wifes lawyer nodding in agreement at all her demands youre in trouble </p>']\n"
]
}
],
"source": [
"# Get a random sample\n",
"def random_training_pair():\n",
" training_sample = random.choice(pairs)\n",
" return training_sample\n",
"\n",
"print(random_training_pair())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For each timestep (that is, for each word in a training sequence) the inputs of the network will be `(category, current word, hidden state)` and the outputs will be `(next word, next hidden state)`. So for each training set, we'll need the category vector, a set of input words, and a set of output/target words."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we are predicting the next letter from the current letter for each timestep, the letter pairs are groups of consecutive letters from the line - e.g. for `\"Apple Banana Compost Diderot <EOA>\"` we would create (\"Apple\", \"Banana\"), (\"Banana\", \"Compost\"), (\"Compost\", \"Diderot\"), (\"D\", \"EOA\").\n",
"\n",
"![](https://i.imgur.com/JH58tXY.png)\n",
"\n",
"The category tensor is a tensor of size `<1 x n_categories>`. When training we feed it to the network at every timestep - this is a design choice, it could have been included as part of initial hidden state or some other strategy."
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [],
"source": [
"n_words = len(nyt_russ.word2index)\n",
"n_categories = len(all_categories)\n",
"\n",
"# Vector with multiply entries for each category present\n",
"\n",
"def make_category_input(categories):\n",
"# tensor = torch.zeros(1, n_categories)\n",
" l = [0] * n_categories\n",
" for i in categories:\n",
" li = all_categories.index(i)\n",
" l[li] = 1\n",
" var = torch.LongTensor(l)\n",
" var = var.view(1, n_categories)\n",
"# var = torch.unsqueeze(var, 0)\n",
" var = Variable(var)\n",
" if USE_CUDA: var = var.cuda()\n",
" \n",
" return var\n",
"\n",
"# Return a list of indexes, one for each word in the sentence\n",
"# Pad to meet MAX_LENGTH (needed because issues arose of variable input length)\n",
"\n",
"MAX_LENGTH = 300\n",
"\n",
"def indexes_from_input(lang, input):\n",
" wordIndex = [lang.word2index[word] for word in input.split(' ')]\n",
"# if len(wordIndex) > MAX_LENGTH:\n",
"# fullLength = MAX_LENGTH - 1\n",
"# wordIndex = wordIndex[0:fullLength]\n",
"# else:\n",
"# diff = MAX_LENGTH - len(wordIndex)\n",
"# l = []\n",
"# l = [0] * diff\n",
"# wordIndex = wordIndex + l\n",
" \n",
" return wordIndex\n",
"\n",
"# FIRST MATRIX\n",
"# One-hot matrix of first to last letters (not including EOS) for input\n",
"# def make_word_input(lang, words):\n",
"# indexList = indexes_from_input(lang, words)\n",
"# var = Variable(torch.FloatTensor(indexList).view(-1, 1))\n",
"# # print('var =', var)\n",
"# if USE_CUDA: var = var.cuda()\n",
" \n",
"# return var\n",
"\n",
"\n",
"# NEW DRAFT MATRIX\n",
"# One-hot matrix of first to last letters (not including EOS) for input\n",
"def make_word_input(lang, words):\n",
" # Return a list of word entries for text:\n",
" indexList = indexes_from_input(lang, words)\n",
" \n",
" # Make a 2D matrix, a stack of one-hot vectors\n",
" # 1 x n_words, representing the text\n",
" \n",
" tensor = torch.LongTensor(len(words), n_words)\n",
" words = words.split(' ')\n",
" for wi in range(len(words)):\n",
" tensor[wi][indexList[wi]] = 1\n",
" tensor = tensor.view(-1, 1, n_words)\n",
" var = Variable(tensor)\n",
" \n",
" # print('var =', var)\n",
" if USE_CUDA: var = var.cuda()\n",
" \n",
" return var\n",
"\n",
"# LongTensor of second word to end token (EOS) for target\n",
"\n",
"def make_target(lang, words):\n",
" indexList = indexes_from_input(lang, words)\n",
" \n",
" # Shift it all one forward, append EOA to make it a target\n",
" indexList = indexList[1:]\n",
" indexList.append(1) # EOA\n",
" \n",
" # Create a stack of one-hot tensors like in make_word_input\n",
" tensor = torch.zeros(len(words), n_words)\n",
" words = words.split(' ')\n",
" for wi in range(len(words)):\n",
" tensor[wi][indexList[wi]] = 1\n",
" tensor = tensor.view(-1, 1, n_words)\n",
" var = Variable(tensor)\n",
" \n",
"# print('var =', var)\n",
" \n",
" if USE_CUDA: var = var.cuda()\n",
" return var"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For convenience during training we'll make a `random_training_set` function that fetches a random (category, line) pair and turns them into the required (category, input, target) tensors."
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [],
"source": [
"# Make category, input, and target tensors from a random category, line pair\n",
"def random_training_set():\n",
" training_sample = random_training_pair()\n",
" category_input = make_category_input(training_sample[0])\n",
" line_input = make_word_input(nyt_russ, training_sample[1])\n",
" line_target = make_target(nyt_russ, training_sample[1])\n",
" \n",
" return category_input, line_input, line_target\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Creating the Network"
]
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [],
"source": [
"class RNN(nn.Module):\n",
" def __init__(self, input_size, hidden_size, output_size):\n",
" super(RNN, self).__init__()\n",
" self.input_size = input_size\n",
" self.hidden_size = hidden_size\n",
" self.output_size = output_size\n",
" self.bias = False\n",
" print(input_size, hidden_size, output_size)\n",
" self.i2h = nn.GRU(n_categories + input_size + hidden_size, hidden_size)\n",
" self.i2o = nn.Linear(n_categories + input_size + hidden_size, output_size)\n",
" self.o2o = nn.Linear(hidden_size + output_size, output_size)\n",
" self.softmax = nn.LogSoftmax()\n",
" \n",
" def forward(self, category, input, hidden):\n",
" print(\"Category: \",category.data.type(), category.size(), \"\\n\", \"Input: \",input.data.type(), input.size(), \"\\n\", \"Hidden: \", hidden.data.type(), hidden.size(), \"\\n\")\n",
" input_combined = torch.cat((category, input, hidden), 1)\n",
" print(\"Input_Combined: \", input_combined.data.type())\n",
" hidden = self.i2h(input_combined)\n",
" output = self.i2o(input_combined)\n",
" output_combined = torch.cat((hidden, output), 1)\n",
" output = self.o2o(output_combined)\n",
" return output, hidden\n",
"\n",
" def init_hidden(self):\n",
" hidden = Variable(torch.LongTensor(1, self.hidden_size))\n",
" if USE_CUDA: hidden = hidden.cuda()\n",
" return hidden\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Training the Network\n",
"\n",
"In contrast to classification, where only the last output is used, we are making a prediction at every step, so we are calculating loss at every step.\n",
"\n",
"The magic of autograd allows you to simply sum these losses at each step and call backward at the end. But don't ask me why initializing loss with 0 works."
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [],
"source": [
"def train(category_tensor, input_line_tensor, target_line_tensor):\n",
" hidden = rnn.init_hidden()\n",
" optimizer.zero_grad()\n",
" loss = 0\n",
"\n",
"\n",
" ## The below needs to be the range of the size of the incoming article\n",
"\n",
" \n",
" for i in range(input_line_tensor.size()[1]):\n",
" output, hidden = rnn(category_tensor, input_line_tensor[i], hidden)\n",
" print(output, target_line_tensor[i])\n",
" loss += criterion(output, target_line_tensor[i])\n",
"\n",
" loss.backward()\n",
" optimizer.step()\n",
" \n",
" return output, loss.data[0] / input_line_tensor.size()[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To keep track of how long training takes I am adding a `time_since(t)` function which returns a human readable string:"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [],
"source": [
"def time_since(t):\n",
" now = time.time()\n",
" s = now - t\n",
" m = math.floor(s / 60)\n",
" s -= m * 60\n",
" return '%dm %ds' % (m, s)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Training is business as usual - call train a bunch of times and wait a few minutes, printing the current time and loss every `print_every` epochs, and keeping store of an average loss per `plot_every` epochs in `all_losses` for plotting later."
]
},
{
"cell_type": "code",
"execution_count": 117,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"21141 128 21141\n",
"Category: torch.LongTensor torch.Size([1, 56]) \n",
" Input: torch.LongTensor torch.Size([1, 21141]) \n",
" Hidden: torch.LongTensor torch.Size([1, 128]) \n",
"\n",
"Input_Combined: torch.LongTensor\n"
]
},
{
"ename": "TypeError",
"evalue": "torch.addmm received an invalid combination of arguments - got (int, torch.LongTensor, int, torch.LongTensor, torch.FloatTensor, out=torch.LongTensor), but expected one of:\n * (torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n * (int beta, torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[31;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-117-3570efd110ad>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 21\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mepoch\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn_epochs\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 22\u001b[0;31m \u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mloss\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mrandom_training_set\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 23\u001b[0m \u001b[0mloss_avg\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mloss\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 24\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m<ipython-input-106-6ff071efaec3>\u001b[0m in \u001b[0;36mtrain\u001b[0;34m(category_tensor, input_line_tensor, target_line_tensor)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_line_tensor\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mrnn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcategory_tensor\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput_line_tensor\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 12\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_line_tensor\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0mloss\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mcriterion\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_line_tensor\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m 222\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_pre_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 223\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 224\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 225\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 226\u001b[0m \u001b[0mhook_result\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m<ipython-input-116-31a4f7187888>\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, category, input, hidden)\u001b[0m\n\u001b[1;32m 16\u001b[0m \u001b[0minput_combined\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcategory\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Input_Combined: \"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput_combined\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 18\u001b[0;31m \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mi2h\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_combined\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 19\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mi2o\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput_combined\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0moutput_combined\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m 222\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_pre_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 223\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 224\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 225\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 226\u001b[0m \u001b[0mhook_result\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/modules/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, input, hx)\u001b[0m\n\u001b[1;32m 160\u001b[0m \u001b[0mflat_weight\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mflat_weight\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 161\u001b[0m )\n\u001b[0;32m--> 162\u001b[0;31m \u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mall_weights\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 163\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mis_packed\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 164\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mPackedSequence\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch_sizes\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, *fargs, **fkwargs)\u001b[0m\n\u001b[1;32m 349\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 350\u001b[0m \u001b[0mfunc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mAutogradRNN\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 351\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0mfargs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mfkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 352\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 353\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mforward\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, weight, hidden)\u001b[0m\n\u001b[1;32m 242\u001b[0m \u001b[0minput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtranspose\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 243\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 244\u001b[0;31m \u001b[0mnexth\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 245\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 246\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mbatch_first\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mbatch_sizes\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, hidden, weight)\u001b[0m\n\u001b[1;32m 82\u001b[0m \u001b[0ml\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mi\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mnum_directions\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mj\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 83\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 84\u001b[0;31m \u001b[0mhy\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minner\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0ml\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0ml\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 85\u001b[0m \u001b[0mnext_hidden\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhy\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 86\u001b[0m \u001b[0mall_output\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(input, hidden, weight)\u001b[0m\n\u001b[1;32m 111\u001b[0m \u001b[0msteps\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mreverse\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 112\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0msteps\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 113\u001b[0;31m \u001b[0mhidden\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minner\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 114\u001b[0m \u001b[0;31m# hack to handle LSTM\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 115\u001b[0m \u001b[0moutput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py\u001b[0m in \u001b[0;36mGRUCell\u001b[0;34m(input, hidden, w_ih, w_hh, b_ih, b_hh)\u001b[0m\n\u001b[1;32m 52\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mstate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgh\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mb_ih\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mstate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgh\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_ih\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_hh\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 53\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 54\u001b[0;31m \u001b[0mgi\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mF\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_ih\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_ih\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 55\u001b[0m \u001b[0mgh\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mF\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mhidden\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_hh\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_hh\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 56\u001b[0m \u001b[0mi_r\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi_i\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi_n\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchunk\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py\u001b[0m in \u001b[0;36mlinear\u001b[0;34m(input, weight, bias)\u001b[0m\n\u001b[1;32m 553\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0maddmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbias\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 554\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 555\u001b[0;31m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 556\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mbias\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 557\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mbias\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py\u001b[0m in \u001b[0;36mmatmul\u001b[0;34m(self, other)\u001b[0m\n\u001b[1;32m 558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 559\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 560\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mother\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 561\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 562\u001b[0m \u001b[0;34m@\u001b[0m\u001b[0mstaticmethod\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/functional.py\u001b[0m in \u001b[0;36mmatmul\u001b[0;34m(tensor1, tensor2, out)\u001b[0m\n\u001b[1;32m 166\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0mdim_tensor1\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m1\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mdim_tensor2\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 167\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mout\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 168\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtensor1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0munsqueeze\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtensor2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msqueeze_\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 169\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 170\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtensor1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0munsqueeze\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtensor2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mout\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mout\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msqueeze_\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py\u001b[0m in \u001b[0;36mmm\u001b[0;34m(self, matrix)\u001b[0m\n\u001b[1;32m 577\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 578\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mVariable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnew\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 579\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mAddmm\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mapply\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmatrix\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 580\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 581\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mbmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/blas.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(ctx, add_matrix, matrix1, matrix2, alpha, beta, inplace)\u001b[0m\n\u001b[1;32m 24\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_get_output\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mctx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0madd_matrix\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minplace\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0minplace\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 25\u001b[0m return torch.addmm(alpha, add_matrix, beta,\n\u001b[0;32m---> 26\u001b[0;31m matrix1, matrix2, out=output)\n\u001b[0m\u001b[1;32m 27\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 28\u001b[0m \u001b[0;34m@\u001b[0m\u001b[0mstaticmethod\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mTypeError\u001b[0m: torch.addmm received an invalid combination of arguments - got (int, torch.LongTensor, int, torch.LongTensor, torch.FloatTensor, out=torch.LongTensor), but expected one of:\n * (torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n * (int beta, torch.LongTensor source, int alpha, torch.LongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n * (int beta, torch.LongTensor source, int alpha, torch.SparseLongTensor mat1, torch.LongTensor mat2, *, torch.LongTensor out)\n didn't match because some of the arguments have invalid types: (\u001b[32;1mint\u001b[0m, \u001b[32;1mtorch.LongTensor\u001b[0m, \u001b[32;1mint\u001b[0m, \u001b[31;1mtorch.LongTensor\u001b[0m, \u001b[31;1mtorch.FloatTensor\u001b[0m, \u001b[32;1mout=torch.LongTensor\u001b[0m)\n"
]
}
],
"source": [
"n_epochs = 100000\n",
"print_every = 10\n",
"plot_every = 500\n",
"all_losses = []\n",
"loss_avg = 0 # Zero every plot_every epochs to keep a running average\n",
"learning_rate = 0.0005\n",
"\n",
"rnn = RNN(n_words, 128, n_words)\n",
"\n",
"# Move models to GPU\n",
"if USE_CUDA:\n",
" rnn.cuda()\n",
"\n",
"\n",
"\n",
"optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)\n",
"criterion = nn.CrossEntropyLoss()\n",
"\n",
"start = time.time()\n",
"\n",
"for epoch in range(1, n_epochs + 1):\n",
" output, loss = train(*random_training_set())\n",
" loss_avg += loss\n",
" \n",
" if epoch % print_every == 0:\n",
" print('%s (%d %d%%) %.4f' % (time_since(start), epoch, epoch / n_epochs * 100, loss))\n",
"\n",
" if epoch % plot_every == 0:\n",
" all_losses.append(loss_avg / plot_every)\n",
" loss_avg = 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Plotting the Network\n",
"\n",
"Plotting the historical loss from all_losses (hopefully) shows the network learning:"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x109b00358>]"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAD8CAYAAABzTgP2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADqFJREFUeJzt23+o3fV9x/Hnq7k0axE00WitMbu2CiNu0MJBKdvA1V9x\n0EZa/7D7o2FryR+rf6yl0BTHtOof6tZZSruN0BZCYdXOURqQItFWGGNYT6yjzdo0t7HFpLZNjQhO\nqmR974/7dTufy4k3ud9z78nR5wMO93y/38+99/3xgs97zvcmVYUkSa9607QHkCSdWQyDJKlhGCRJ\nDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ15qY9wEqcd955NT8/P+0xJGmm7N+//9dVtWm5dTMZ\nhvn5eYbD4bTHkKSZkuRnp7LOt5IkSQ3DIElqGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJ\nDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEYJEkNwyBJahgGSVLDMEiSGoZBktQwDJKk\nhmGQJDUMgySpMZEwJNmW5GCShSS7xlxfn+SB7vrjSeaXXN+S5MUkn5zEPJKklesdhiTrgC8CNwBb\ngQ8l2bpk2UeA56vqUuA+4J4l1/8e+FbfWSRJ/U3iFcMVwEJVHa6qV4D7ge1L1mwH9nTPHwSuThKA\nJDcCTwMHJjCLJKmnSYThIuCZkeMj3bmxa6rqBPACcG6Ss4BPAZ+ZwBySpAmY9s3n24H7qurF5RYm\n2ZlkmGR47Nix1Z9Mkt6g5ibwNY4CF48cb+7OjVtzJMkccDbwHHAlcFOSe4FzgN8m+U1VfWHpN6mq\n3cBugMFgUBOYW5I0xiTC8ARwWZJLWAzAzcCfLVmzF9gB/AdwE/Dtqirgj19dkOR24MVxUZAkrZ3e\nYaiqE0luAR4G1gFfqaoDSe4AhlW1F/gy8NUkC8BxFuMhSToDZfEX99kyGAxqOBxOewxJmilJ9lfV\nYLl10775LEk6wxgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElqGAZJUsMwSJIahkGS\n1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEYJEkNwyBJ\nahgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpMZEwJNmW5GCShSS7xlxfn+SB7vrjSea789cm2Z/k\n+93H905iHknSyvUOQ5J1wBeBG4CtwIeSbF2y7CPA81V1KXAfcE93/tfA+6rqD4AdwFf7ziNJ6mcS\nrxiuABaq6nBVvQLcD2xfsmY7sKd7/iBwdZJU1feq6ufd+QPAW5Ksn8BMkqQVmkQYLgKeGTk+0p0b\nu6aqTgAvAOcuWfNB4MmqenkCM0mSVmhu2gMAJLmcxbeXrnuNNTuBnQBbtmxZo8kk6Y1nEq8YjgIX\njxxv7s6NXZNkDjgbeK473gx8A/hwVf3kZN+kqnZX1aCqBps2bZrA2JKkcSYRhieAy5JckuTNwM3A\n3iVr9rJ4cxngJuDbVVVJzgEeAnZV1b9PYBZJUk+9w9DdM7gFeBj4IfD1qjqQ5I4k7++WfRk4N8kC\n8Ang1T9pvQW4FPibJE91j/P7ziRJWrlU1bRnOG2DwaCGw+G0x5CkmZJkf1UNllvnv3yWJDUMgySp\nYRgkSQ3DIElqGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLU\nMAySpIZhkCQ1DIMkqWEYJEkNwyBJahgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElq\nGAZJUsMwSJIaEwlDkm1JDiZZSLJrzPX1SR7orj+eZH7k2qe78weTXD+JeSRJK9c7DEnWAV8EbgC2\nAh9KsnXJso8Az1fVpcB9wD3d524FbgYuB7YB/9B9PUnSlEziFcMVwEJVHa6qV4D7ge1L1mwH9nTP\nHwSuTpLu/P1V9XJVPQ0sdF9PkjQlkwjDRcAzI8dHunNj11TVCeAF4NxT/FxJ0hqamZvPSXYmGSYZ\nHjt2bNrjSNLr1iTCcBS4eOR4c3du7Jokc8DZwHOn+LkAVNXuqhpU1WDTpk0TGFuSNM4kwvAEcFmS\nS5K8mcWbyXuXrNkL7Oie3wR8u6qqO39z91dLlwCXAd+dwEySpBWa6/sFqupEkluAh4F1wFeq6kCS\nO4BhVe0Fvgx8NckCcJzFeNCt+zrwX8AJ4GNV9T99Z5IkrVwWf3GfLYPBoIbD4bTHkKSZkmR/VQ2W\nWzczN58lSWvDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElqGAZJUsMwSJIahkGS1DAMkqSG\nYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEYJEkNwyBJahgGSVLD\nMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSY1eYUiyMcm+JIe6jxtOsm5Ht+ZQkh3dubcmeSjJj5Ic\nSHJ3n1kkSZPR9xXDLuDRqroMeLQ7biTZCNwGXAlcAdw2EpC/q6rfA94N/GGSG3rOI0nqqW8YtgN7\nuud7gBvHrLke2FdVx6vqeWAfsK2qXqqq7wBU1SvAk8DmnvNIknrqG4YLqurZ7vkvgAvGrLkIeGbk\n+Eh37v8kOQd4H4uvOiRJUzS33IIkjwBvG3Pp1tGDqqokdboDJJkDvgZ8vqoOv8a6ncBOgC1btpzu\nt5EknaJlw1BV15zsWpJfJrmwqp5NciHwqzHLjgJXjRxvBh4bOd4NHKqqzy0zx+5uLYPB4LQDJEk6\nNX3fStoL7Oie7wC+OWbNw8B1STZ0N52v686R5C7gbOCves4hSZqQvmG4G7g2ySHgmu6YJIMkXwKo\nquPAncAT3eOOqjqeZDOLb0dtBZ5M8lSSj/acR5LUU6pm712ZwWBQw+Fw2mNI0kxJsr+qBsut818+\nS5IahkGS1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1DIMkqWEY\nJEkNwyBJahgGSVLDMEiSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElqGAZJUsMwSJIahkGS1DAM\nkqSGYZAkNQyDJKlhGCRJjV5hSLIxyb4kh7qPG06ybke35lCSHWOu703ygz6zSJImo+8rhl3Ao1V1\nGfBod9xIshG4DbgSuAK4bTQgST4AvNhzDknShPQNw3ZgT/d8D3DjmDXXA/uq6nhVPQ/sA7YBJDkL\n+ARwV885JEkT0jcMF1TVs93zXwAXjFlzEfDMyPGR7hzAncBngZd6ziFJmpC55RYkeQR425hLt44e\nVFUlqVP9xkneBbyzqj6eZP4U1u8EdgJs2bLlVL+NJOk0LRuGqrrmZNeS/DLJhVX1bJILgV+NWXYU\nuGrkeDPwGPAeYJDkp90c5yd5rKquYoyq2g3sBhgMBqccIEnS6en7VtJe4NW/MtoBfHPMmoeB65Js\n6G46Xwc8XFX/WFVvr6p54I+AH58sCpKktdM3DHcD1yY5BFzTHZNkkORLAFV1nMV7CU90jzu6c5Kk\nM1CqZu9dmcFgUMPhcNpjSNJMSbK/qgbLrfNfPkuSGoZBktQwDJKkhmGQJDUMgySpYRgkSQ3DIElq\nGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJDcMgSWoYBklSwzBIkhqGQZLUMAySpIZhkCQ1\nDIMkqWEYJEkNwyBJahgGSVLDMEiSGqmqac9w2pIcA3427TlO03nAr6c9xBpzz28M7nl2/G5VbVpu\n0UyGYRYlGVbVYNpzrCX3/Mbgnl9/fCtJktQwDJKkhmFYO7unPcAUuOc3Bvf8OuM9BklSw1cMkqSG\nYZigJBuT7EtyqPu44STrdnRrDiXZMeb63iQ/WP2J++uz5yRvTfJQkh8lOZDk7rWd/vQk2ZbkYJKF\nJLvGXF+f5IHu+uNJ5keufbo7fzDJ9Ws5dx8r3XOSa5PsT/L97uN713r2lejzM+6ub0nyYpJPrtXM\nq6KqfEzoAdwL7Oqe7wLuGbNmI3C4+7ihe75h5PoHgH8GfjDt/az2noG3An/SrXkz8G/ADdPe00n2\nuQ74CfCObtb/BLYuWfOXwD91z28GHuieb+3Wrwcu6b7OumnvaZX3/G7g7d3z3weOTns/q7nfkesP\nAv8CfHLa++nz8BXDZG0H9nTP9wA3jllzPbCvqo5X1fPAPmAbQJKzgE8Ad63BrJOy4j1X1UtV9R2A\nqnoFeBLYvAYzr8QVwEJVHe5mvZ/FvY8a/W/xIHB1knTn76+ql6vqaWCh+3pnuhXvuaq+V1U/784f\nAN6SZP2aTL1yfX7GJLkReJrF/c40wzBZF1TVs93zXwAXjFlzEfDMyPGR7hzAncBngZdWbcLJ67tn\nAJKcA7wPeHQ1hpyAZfcwuqaqTgAvAOee4ueeifrsedQHgSer6uVVmnNSVrzf7pe6TwGfWYM5V93c\ntAeYNUkeAd425tKtowdVVUlO+U++krwLeGdVfXzp+5bTtlp7Hvn6c8DXgM9X1eGVTakzUZLLgXuA\n66Y9yyq7Hbivql7sXkDMNMNwmqrqmpNdS/LLJBdW1bNJLgR+NWbZUeCqkePNwGPAe4BBkp+y+HM5\nP8ljVXUVU7aKe37VbuBQVX1uAuOulqPAxSPHm7tz49Yc6WJ3NvDcKX7umajPnkmyGfgG8OGq+snq\nj9tbn/1eCdyU5F7gHOC3SX5TVV9Y/bFXwbRvcryeHsDf0t6IvXfMmo0svg+5oXs8DWxcsmae2bn5\n3GvPLN5P+VfgTdPeyzL7nGPxpvkl/P+NycuXrPkY7Y3Jr3fPL6e9+XyY2bj53GfP53TrPzDtfazF\nfpesuZ0Zv/k89QFeTw8W31t9FDgEPDLyP78B8KWRdX/B4g3IBeDPx3ydWQrDivfM4m9kBfwQeKp7\nfHTae3qNvf4p8GMW/3Ll1u7cHcD7u+e/w+JfpCwA3wXeMfK5t3afd5Az9C+vJrln4K+B/x75uT4F\nnD/t/azmz3jka8x8GPyXz5Kkhn+VJElqGAZJUsMwSJIahkGS1DAMkqSGYZAkNQyDJKlhGCRJjf8F\nFDYZsBaypoYAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10842c470>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import matplotlib.ticker as ticker\n",
"%matplotlib inline\n",
"\n",
"plt.figure()\n",
"plt.plot(all_losses)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment