@alierkan
Forked from ceshine/birnn.ipynb
Created April 11, 2023 11:18
Figuring How Bidirectional RNN works in Pytorch
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Figuring How Bidirectional RNN works in Pytorch"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import torch, torch.nn as nn\n",
"from torch.autograd import Variable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize Input Sequence Randomly\n",
"For demonstration purpose, we are going to feed RNNs only one sequence of length 5 with only one dimension."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
"-0.1308\n",
"-0.4986\n",
"-0.2581\n",
" 1.7486\n",
" 1.4340\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"random_input = Variable(torch.FloatTensor(5, 1, 1).normal_(), requires_grad=False)\n",
"random_input[:, 0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize a Bidirectional GRU Layer"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"bi_grus = torch.nn.GRU(input_size=1, hidden_size=1, num_layers=1, batch_first=False, bidirectional=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize a GRU Layer ( for Feeding the Sequence Reversely)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"reverse_gru = torch.nn.GRU(input_size=1, hidden_size=1, num_layers=1, batch_first=False, bidirectional=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now make sure the weights of the reverse gru layer match ones of the (reversed) bidirectional's:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"reverse_gru.weight_ih_l0 = bi_grus.weight_ih_l0_reverse\n",
"reverse_gru.weight_hh_l0 = bi_grus.weight_hh_l0_reverse\n",
"reverse_gru.bias_ih_l0 = bi_grus.bias_ih_l0_reverse\n",
"reverse_gru.bias_hh_l0 = bi_grus.bias_hh_l0_reverse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feed Input Sequence into Both Networks"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"bi_output, bi_hidden = bi_grus(random_input)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"reverse_output, reverse_hidden = reverse_gru(random_input[np.arange(4, -1, -1), :, :])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check Outputs"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.7001\n",
" 0.8531\n",
" 0.4716\n",
" 0.4065\n",
" 0.4960\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reverse_output[:, 0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The outputs of the reverse GRUs sit in the [latter half of the output](https://discuss.pytorch.org/t/get-forward-and-backward-output-seperately-from-bidirectional-rnn/2523)(in the last dimension):"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.4960\n",
" 0.4065\n",
" 0.4716\n",
" 0.8531\n",
" 0.7001\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bi_output[:, 0, 1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check Hidden States"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
"(0 ,.,.) = \n",
" 0.4960\n",
"[torch.FloatTensor of size 1x1x1]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reverse_hidden"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hidden states of the reversed GRUs sits in [the odd indices in the first dimension](https://discuss.pytorch.org/t/how-can-i-know-which-part-of-h-n-of-bidirectional-rnn-is-for-backward-process/3883/4)."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.4960\n",
"[torch.FloatTensor of size 1x1]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bi_hidden[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"1. The returned outputs of bidirectional RNN at timestep t is just the output after feeding input to both the reverse and normal RNN unit at timestep t. (where normal RNN has seen inputs 1...t and reverse RNN has seen inputs t...n, n being the length of the sequence)\n",
"2. The returned hidden state of bidirectional RNN is the hidden state after the whole sequence is consume. For normal RNN it's after timestep n; for reverse RNN it's after timestep 1."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
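The notebook above uses the old `Variable` API. As a usage note, the same experiment can be reproduced in current PyTorch (a sketch, assuming torch >= 1.0: `Variable` is gone, and `torch.flip` replaces the manual reversed-index trick; names mirror the notebook, exact numbers differ since the weights are random):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One sequence of length 5, batch size 1, one feature: (seq_len, batch, input_size)
random_input = torch.randn(5, 1, 1)

bi_gru = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bidirectional=True)
reverse_gru = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bidirectional=False)

# Copy the bidirectional layer's reverse-direction weights into the plain GRU,
# exactly as the notebook does.
reverse_gru.weight_ih_l0 = bi_gru.weight_ih_l0_reverse
reverse_gru.weight_hh_l0 = bi_gru.weight_hh_l0_reverse
reverse_gru.bias_ih_l0 = bi_gru.bias_ih_l0_reverse
reverse_gru.bias_hh_l0 = bi_gru.bias_hh_l0_reverse

bi_output, bi_hidden = bi_gru(random_input)
# Feed the sequence in reverse time order to the plain GRU.
reverse_output, reverse_hidden = reverse_gru(torch.flip(random_input, dims=[0]))

# The backward direction's outputs are the second half of the last dimension,
# in reversed time order; its final hidden state sits at odd indices of dim 0.
assert torch.allclose(bi_output[:, 0, 1],
                      torch.flip(reverse_output[:, 0, 0], dims=[0]))
assert torch.allclose(bi_hidden[1], reverse_hidden[0])
```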