@alierkan
Forked from ceshine/birnn.ipynb
Created April 11, 2023 11:18
Figuring How Bidirectional RNN works in Pytorch
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Figuring How Bidirectional RNN works in Pytorch"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import torch, torch.nn as nn\n",
"from torch.autograd import Variable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize Input Sequence Randomly\n",
"For demonstration purpose, we are going to feed RNNs only one sequence of length 5 with only one dimension."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
"-0.1308\n",
"-0.4986\n",
"-0.2581\n",
" 1.7486\n",
" 1.4340\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"random_input = Variable(torch.FloatTensor(5, 1, 1).normal_(), requires_grad=False)\n",
"random_input[:, 0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize a Bidirectional GRU Layer"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"bi_grus = torch.nn.GRU(input_size=1, hidden_size=1, num_layers=1, batch_first=False, bidirectional=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize a GRU Layer ( for Feeding the Sequence Reversely)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"reverse_gru = torch.nn.GRU(input_size=1, hidden_size=1, num_layers=1, batch_first=False, bidirectional=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now make sure the weights of the reverse gru layer match ones of the (reversed) bidirectional's:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"reverse_gru.weight_ih_l0 = bi_grus.weight_ih_l0_reverse\n",
"reverse_gru.weight_hh_l0 = bi_grus.weight_hh_l0_reverse\n",
"reverse_gru.bias_ih_l0 = bi_grus.bias_ih_l0_reverse\n",
"reverse_gru.bias_hh_l0 = bi_grus.bias_hh_l0_reverse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feed Input Sequence into Both Networks"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"bi_output, bi_hidden = bi_grus(random_input)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"reverse_output, reverse_hidden = reverse_gru(random_input[np.arange(4, -1, -1), :, :])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check Outputs"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.7001\n",
" 0.8531\n",
" 0.4716\n",
" 0.4065\n",
" 0.4960\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reverse_output[:, 0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The outputs of the reverse GRUs sit in the [latter half of the output](https://discuss.pytorch.org/t/get-forward-and-backward-output-seperately-from-bidirectional-rnn/2523)(in the last dimension):"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.4960\n",
" 0.4065\n",
" 0.4716\n",
" 0.8531\n",
" 0.7001\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bi_output[:, 0, 1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check Hidden States"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
"(0 ,.,.) = \n",
" 0.4960\n",
"[torch.FloatTensor of size 1x1x1]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reverse_hidden"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hidden states of the reversed GRUs sits in [the odd indices in the first dimension](https://discuss.pytorch.org/t/how-can-i-know-which-part-of-h-n-of-bidirectional-rnn-is-for-backward-process/3883/4)."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.4960\n",
"[torch.FloatTensor of size 1x1]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bi_hidden[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"1. The returned outputs of bidirectional RNN at timestep t is just the output after feeding input to both the reverse and normal RNN unit at timestep t. (where normal RNN has seen inputs 1...t and reverse RNN has seen inputs t...n, n being the length of the sequence)\n",
"2. The returned hidden state of bidirectional RNN is the hidden state after the whole sequence is consume. For normal RNN it's after timestep n; for reverse RNN it's after timestep 1."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
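The notebook above uses the old `Variable` API. As a usage note, the same experiment can be reproduced in current PyTorch (a sketch, assuming torch >= 1.0: `Variable` is gone, and `torch.flip` replaces the manual reversed-index trick; names mirror the notebook, exact numbers differ since the weights are random):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One sequence of length 5, batch size 1, one feature: (seq_len, batch, input_size)
random_input = torch.randn(5, 1, 1)

bi_gru = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bidirectional=True)
reverse_gru = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bidirectional=False)

# Copy the bidirectional layer's reverse-direction weights into the plain GRU,
# exactly as the notebook does.
reverse_gru.weight_ih_l0 = bi_gru.weight_ih_l0_reverse
reverse_gru.weight_hh_l0 = bi_gru.weight_hh_l0_reverse
reverse_gru.bias_ih_l0 = bi_gru.bias_ih_l0_reverse
reverse_gru.bias_hh_l0 = bi_gru.bias_hh_l0_reverse

bi_output, bi_hidden = bi_gru(random_input)
# Feed the sequence in reverse time order to the plain GRU.
reverse_output, reverse_hidden = reverse_gru(torch.flip(random_input, dims=[0]))

# The backward direction's outputs are the second half of the last dimension,
# in reversed time order; its final hidden state sits at odd indices of dim 0.
assert torch.allclose(bi_output[:, 0, 1],
                      torch.flip(reverse_output[:, 0, 0], dims=[0]))
assert torch.allclose(bi_hidden[1], reverse_hidden[0])
```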