PyTorch implementation of an autoencoder.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Implementing an Autoencoder in PyTorch\n",
"===\n",
"\n",
"This is the PyTorch equivalent of my previous article on implementing an autoencoder in TensorFlow 2.0, which you may read [here](https://towardsdatascience.com/implementing-an-autoencoder-in-tensorflow-2-0-5e86126e9f7).\n",
"\n",
"First, install PyTorch with the following pip command:\n",
"\n",
"```\n",
"$ pip install torch torchvision\n",
"```\n",
"\n",
"The `torchvision` package provides image datasets that are ready for use in PyTorch.\n",
"\n",
"For more details on installation, see [this guide](https://pytorch.org/get-started/locally/) from [pytorch.org](https://pytorch.org)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"We begin by importing our dependencies."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"import torch\n",
"import torch.nn as nn\n",
"import torch.optim as optim\n",
"import torchvision"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We set the random seed and configure cuDNN for reproducibility."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"seed = 42\n",
"torch.manual_seed(seed)\n",
"torch.backends.cudnn.benchmark = False\n",
"torch.backends.cudnn.deterministic = True"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We set the batch size, the number of training epochs, and the learning rate."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"batch_size = 512\n",
"epochs = 20\n",
"learning_rate = 1e-3"
]
},
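{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dataset\n",
"\n",
"We load the MNIST training set using the `torchvision` package. The `ToTensor()` transform converts each image to a tensor with pixel values scaled to the range 0 to 1."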
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor()])\n",
"\n",
"train_dataset = torchvision.datasets.MNIST(\n",
"    root=\"~/torch_datasets\", train=True, transform=transform, download=True\n",
")\n",
"\n",
"train_loader = torch.utils.data.DataLoader(\n",
"    train_dataset, batch_size=batch_size, shuffle=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Autoencoder\n",
"\n",
"An autoencoder is a type of neural network trained to map its input features x back to themselves, i.e. to reproduce x at its output. This objective is known as reconstruction, and an autoencoder accomplishes it in two steps: (1) an encoder learns a lower-dimensional representation of the data, extracting its most salient features, and (2) a decoder learns to reconstruct the original data from the representation learned by the encoder.\n",
"\n",
"We define our autoencoder class with fully connected layers for both its encoder and decoder components. The encoder maps the input to a 128-dimensional code, and the decoder maps that code back to the input dimension."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"class AE(nn.Module):\n",
"    def __init__(self, **kwargs):\n",
"        super().__init__()\n",
"        self.encoder_hidden_layer = nn.Linear(\n",
"            in_features=kwargs[\"input_shape\"], out_features=128\n",
"        )\n",
"        self.encoder_output_layer = nn.Linear(\n",
"            in_features=128, out_features=128\n",
"        )\n",
"        self.decoder_hidden_layer = nn.Linear(\n",
"            in_features=128, out_features=128\n",
"        )\n",
"        self.decoder_output_layer = nn.Linear(\n",
"            in_features=128, out_features=kwargs[\"input_shape\"]\n",
"        )\n",
"\n",
"    def forward(self, features):\n",
"        activation = self.encoder_hidden_layer(features)\n",
"        activation = torch.relu(activation)\n",
"        code = self.encoder_output_layer(activation)\n",
"        code = torch.sigmoid(code)\n",
"        activation = self.decoder_hidden_layer(code)\n",
"        activation = torch.relu(activation)\n",
"        activation = self.decoder_output_layer(activation)\n",
"        reconstructed = torch.sigmoid(activation)\n",
"        return reconstructed"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before using our defined autoencoder class, we have the following things to do:\n",
" 1. We configure which device we want to run on.\n",
" 2. We instantiate an `AE` object.\n",
" 3. We define our optimizer.\n",
" 4. We define our reconstruction loss."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# use gpu if available\n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"# create a model from `AE` autoencoder class\n",
"# load it to the specified device, either gpu or cpu\n",
"model = AE(input_shape=784).to(device)\n",
"\n",
"# create an optimizer object\n",
"# Adam optimizer with learning rate 1e-3\n",
"optimizer = optim.Adam(model.parameters(), lr=learning_rate)\n",
"\n",
"# mean-squared error loss\n",
"criterion = nn.MSELoss()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We train our autoencoder for our specified number of epochs."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch : 1/20, recon loss = 0.08389361\n",
"epoch : 2/20, recon loss = 0.06247949\n",
"epoch : 3/20, recon loss = 0.05559724\n",
"epoch : 4/20, recon loss = 0.04590550\n",
"epoch : 5/20, recon loss = 0.04079516\n",
"epoch : 6/20, recon loss = 0.03859508\n",
"epoch : 7/20, recon loss = 0.03622651\n",
"epoch : 8/20, recon loss = 0.03278850\n",
"epoch : 9/20, recon loss = 0.03084057\n",
"epoch : 10/20, recon loss = 0.02930079\n",
"epoch : 11/20, recon loss = 0.02798331\n",
"epoch : 12/20, recon loss = 0.02642137\n",
"epoch : 13/20, recon loss = 0.02503968\n",
"epoch : 14/20, recon loss = 0.02353912\n",
"epoch : 15/20, recon loss = 0.02222095\n",
"epoch : 16/20, recon loss = 0.02105061\n",
"epoch : 17/20, recon loss = 0.02026297\n",
"epoch : 18/20, recon loss = 0.01937818\n",
"epoch : 19/20, recon loss = 0.01851325\n",
"epoch : 20/20, recon loss = 0.01769270\n"
]
}
],
"source": [
"for epoch in range(epochs):\n",
"    loss = 0\n",
"    for batch_features, _ in train_loader:\n",
"        # reshape mini-batch data to [N, 784] matrix\n",
"        # load it to the active device\n",
"        batch_features = batch_features.view(-1, 784).to(device)\n",
"\n",
"        # reset the gradients back to zero\n",
"        # PyTorch accumulates gradients on subsequent backward passes\n",
"        optimizer.zero_grad()\n",
"\n",
"        # compute reconstructions\n",
"        outputs = model(batch_features)\n",
"\n",
"        # compute training reconstruction loss\n",
"        train_loss = criterion(outputs, batch_features)\n",
"\n",
"        # compute the gradients of the loss by backpropagation\n",
"        train_loss.backward()\n",
"\n",
"        # perform parameter update based on current gradients\n",
"        optimizer.step()\n",
"\n",
"        # add the mini-batch training loss to epoch loss\n",
"        loss += train_loss.item()\n",
"\n",
"    # compute the epoch training loss (mean over mini-batches)\n",
"    loss = loss / len(train_loader)\n",
"\n",
"    # display the epoch training loss\n",
"    print(\"epoch : {}/{}, recon loss = {:.8f}\".format(epoch + 1, epochs, loss))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's extract some test examples to reconstruct using our trained autoencoder."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"test_dataset = torchvision.datasets.MNIST(\n",
"    root=\"~/torch_datasets\", train=False, transform=transform, download=True\n",
")\n",
"\n",
"test_loader = torch.utils.data.DataLoader(\n",
"    test_dataset, batch_size=10, shuffle=False\n",
")\n",
"\n",
"test_examples = None\n",
"\n",
"with torch.no_grad():\n",
"    # take only the first batch of 10 test images; the labels are not needed\n",
"    for batch_features, _ in test_loader:\n",
"        test_examples = batch_features.view(-1, 784)\n",
"        # run the model on its own device, then move the output back to the cpu for plotting\n",
"        reconstruction = model(test_examples.to(device)).cpu()\n",
"        break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize Results\n",
"\n",
"We display the original test images (top row) alongside their reconstructions from our trained autoencoder (bottom row)."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1440x288 with 20 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"with torch.no_grad():\n",
"    number = 10\n",
"    plt.figure(figsize=(20, 4))\n",
"    for index in range(number):\n",
"        # display original\n",
"        ax = plt.subplot(2, number, index + 1)\n",
"        plt.imshow(test_examples[index].cpu().numpy().reshape(28, 28))\n",
"        plt.gray()\n",
"        ax.get_xaxis().set_visible(False)\n",
"        ax.get_yaxis().set_visible(False)\n",
"\n",
"        # display reconstruction\n",
"        ax = plt.subplot(2, number, index + 1 + number)\n",
"        plt.imshow(reconstruction[index].cpu().numpy().reshape(28, 28))\n",
"        plt.gray()\n",
"        ax.get_xaxis().set_visible(False)\n",
"        ax.get_yaxis().set_visible(False)\n",
"    plt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}