@dvoils
Last active June 2, 2024 16:32
build-diffuser-model.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"machine_shape": "hm",
"gpuType": "A100",
"authorship_tag": "ABX9TyOgXmmaUCVbYdtokIlBILBY",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/dvoils/b717937f9b24ed0a54f6d909c16136b5/build-diffuser-model.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# CNNs for Denoising using Autoencoders\n",
"\n",
"Convolutional Neural Networks (CNNs) can be effectively employed for image denoising through an architecture known as a convolutional autoencoder. This type of neural network learns to transform noisy images into clean ones by compressing and then reconstructing the image data. The autoencoder consists of two main components: the encoder and the decoder.\n",
"\n",
"The encoder compresses the input image into a lower-dimensional representation, capturing essential features while reducing noise. This compression process allows the network to focus on the most important aspects of the image, discarding the irrelevant noise. The decoder then reconstructs the denoised image from this compressed representation, aiming to restore the image to its original, noise-free state. The decoder effectively learns to upsample and refine the compressed data back into a clear and detailed image.\n",
"\n",
"The training process involves feeding the autoencoder pairs of noisy and clean images. The network learns to minimize the difference between its output (the denoised image) and the clean image. This process typically involves creating a dataset of noisy and corresponding clean images. During training, the noisy images are used as input, and the clean images serve as targets. The autoencoder iteratively adjusts its parameters to reduce the error between its predictions and the actual clean images, learning to map noisy inputs to clean outputs.\n",
"\n",
"Once trained, the autoencoder can denoise new noisy images it has not seen before. Because convolutional layers capture local image structure, the network can preserve important features while removing noise, which makes the approach applicable to a range of image-enhancement tasks.\n",
"\n",
"---"
],
"metadata": {
"id": "IQ3gI-z-XdF4"
}
},
{
"cell_type": "code",
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torch.optim as optim\n",
"import torchvision.transforms as transforms\n",
"from torchvision.datasets import MNIST\n",
"from torch.utils.data import DataLoader\n",
"import matplotlib.pyplot as plt"
],
"metadata": {
"id": "InQCT8VOW0tx"
},
"execution_count": 2,
"outputs": []
},
{
"cell_type": "code",
"source": [
"def add_noise(img):\n",
"    # Zero-mean Gaussian noise with standard deviation 0.3, same shape as img.\n",
"    # The result is not clamped, so pixel values can fall outside [0, 1].\n",
"    noise = torch.randn_like(img) * 0.3\n",
"    noisy_img = img + noise\n",
"    return noisy_img\n",
"\n",
"transform = transforms.Compose([\n",
"    transforms.ToTensor(),\n",
"    transforms.Lambda(add_noise)\n",
"])\n",
"\n",
"# Load the MNIST training set; the transform draws fresh noise each time a sample is read\n",
"train_set = MNIST(root='./data', download=True, train=True, transform=transform)\n",
"\n",
"# Normal transform for clean images\n",
"test_transform = transforms.Compose([\n",
"    transforms.ToTensor()\n",
"])\n",
"\n",
"# Load the MNIST test set without noise for validation\n",
"test_set = MNIST(root='./data', download=True, train=False, transform=test_transform)\n",
"\n",
"# DataLoader\n",
"train_loader = DataLoader(train_set, batch_size=64, shuffle=True)\n",
"test_loader = DataLoader(test_set, batch_size=64, shuffle=False)\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "7z4oYszcW_ba",
"outputId": "8f71664d-db51-49dc-fe45-f74a4d23cf7a"
},
"execution_count": 5,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\n",
"Failed to download (trying next):\n",
"HTTP Error 403: Forbidden\n",
"\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 9912422/9912422 [00:08<00:00, 1136413.92it/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw\n",
"\n",
"Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\n",
"Failed to download (trying next):\n",
"HTTP Error 403: Forbidden\n",
"\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 28881/28881 [00:00<00:00, 57784.10it/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw\n",
"\n",
"Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\n",
"Failed to download (trying next):\n",
"HTTP Error 403: Forbidden\n",
"\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 1648877/1648877 [00:01<00:00, 1255868.85it/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw\n",
"\n",
"Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\n",
"Failed to download (trying next):\n",
"HTTP Error 403: Forbidden\n",
"\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz\n",
"Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 4542/4542 [00:00<00:00, 6539831.37it/s]"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw\n",
"\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# Convolutional Neural Networks\n",
"### 1. **Convolution Operation**\n",
"\n",
"The convolution operation is used in convolutional layers to extract features from input data such as images. The convolution involves sliding a smaller matrix (filter or kernel) over the input matrix (image) to produce a feature map.\n",
"\n",
"Given an input image matrix $X$ and a filter matrix $K$, the convolution ($\\ast$) operation is defined for each element $(i, j)$ of the output feature map $Y$ as:\n",
"\n",
"$$\n",
"Y(i, j) = (K \\ast X)(i, j) = \\sum_m \\sum_n K(m, n) \\cdot X(i-m, j-n)\n",
"$$\n",
"\n",
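"- $m, n$ are the row and column indices that run over the kernel.\n",
"- $i, j$ are the coordinates in the output feature map.\n",
"\n",
"Deep learning libraries, including PyTorch's `nn.Conv2d`, actually compute the cross-correlation $\\sum_m \\sum_n K(m, n) \\cdot X(i+m, j+n)$, which skips the kernel flip. Because the kernel is learned, the distinction does not change what the layer can represent.\n",
"\n",
"**Edge Handling**: Typically, the input is padded with zeros (zero-padding) to control the spatial size of the output feature map.\n",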
"\n",
"### 2. **Activation Functions**\n",
"\n",
"After convolution, an activation function is applied element-wise to introduce non-linearities into the model, allowing it to learn more complex patterns. A common activation function is the Rectified Linear Unit (ReLU), defined as:\n",
"\n",
"$$\n",
"\\text{ReLU}(x) = \\max(0, x)\n",
"$$\n",
"\n",
"This function replaces negative values in the output of the convolution with zero while maintaining non-negative values as they are.\n",
"\n",
"### 3. **Pooling Layers**\n",
"\n",
"Pooling (or subsampling) reduces the spatial dimensions (width and height) of the input volume for the next convolutional layer. It helps reduce the computational cost and control overfitting by providing an abstracted form of the representation.\n",
"\n",
"The most common form is max pooling, where the maximum element is selected from each region of the feature map covered by the pooling window. For a 2x2 window with stride 2 it is defined as:\n",
"\n",
"$$\n",
"Y(i, j) = \\max_{\\substack{a \\in \\{2i,\\, 2i+1\\} \\\\ b \\in \\{2j,\\, 2j+1\\}}} X(a, b)\n",
"$$\n",
"\n",
"With this setting the width and height of the feature map are halved, while the number of channels is unchanged.\n",
"\n",
"### 4. **Fully Connected Layers**\n",
"\n",
"After several convolutional and pooling layers, the high-level reasoning in the neural network is done via fully connected layers. Neurons in a fully connected layer have connections to all activations in the previous layer. Their outputs are computed with a matrix multiplication followed by a bias offset:\n",
"\n",
"$$\n",
"Y = WX + b\n",
"$$\n",
"\n",
"- $W$ is the weight matrix,\n",
"- $X$ is the input vector to the fully connected layer,\n",
"- $b$ is the bias vector.\n",
"\n",
"### Example in a Simple CNN:\n",
"\n",
"Consider a simple CNN architecture as an example:\n",
"1. **Input Layer**: Assume an input image of size 32x32x3 (width x height x channels).\n",
"2. **Convolutional Layer**: Use ten 5x5 filters without padding, stride = 1.\n",
"   - Output size: $(32 - 5 + 1) \\times (32 - 5 + 1) \\times 10 = 28 \\times 28 \\times 10$\n",
"3. **Activation Layer**: Apply ReLU.\n",
"4. **Pooling Layer**: Apply 2x2 max pooling with stride = 2.\n",
"   - Output size: $14 \\times 14 \\times 10$\n",
"5. **Fully Connected Layer**: Flatten the output and connect to a fully connected layer with 100 units.\n",
"   - Flatten size: $14 \\times 14 \\times 10 = 1960$\n",
"   - Output size: 100 (from the FC layer)\n",
"\n",
"These operations transform the raw input into a compact feature vector suitable for classification or other downstream tasks. The denoising autoencoder defined in the next cell uses strided convolutions instead of pooling and has no fully connected layers.\n"
],
"metadata": {
"id": "Bgmh_oTnuZTR"
}
},
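{
"cell_type": "markdown",
"source": [
"The layer sizes in the example above can be checked with a minimal sketch. It assumes a random batch holding one 32x32 RGB image; the variable names are illustrative and are not part of the denoising model built below."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import torch\n",
"import torch.nn as nn\n",
"\n",
"x = torch.randn(1, 3, 32, 32)                  # [batch, channels, height, width]\n",
"conv = nn.Conv2d(3, 10, kernel_size=5)         # ten 5x5 filters, no padding, stride 1\n",
"pool = nn.MaxPool2d(kernel_size=2, stride=2)   # 2x2 max pooling, stride 2\n",
"fc = nn.Linear(14 * 14 * 10, 100)              # fully connected layer with 100 units\n",
"\n",
"h = torch.relu(conv(x))\n",
"print(h.shape)                                 # torch.Size([1, 10, 28, 28])\n",
"h = pool(h)\n",
"print(h.shape)                                 # torch.Size([1, 10, 14, 14])\n",
"h = h.flatten(start_dim=1)\n",
"print(h.shape)                                 # torch.Size([1, 1960])\n",
"print(fc(h).shape)                             # torch.Size([1, 100])"
],
"metadata": {},
"execution_count": null,
"outputs": []
},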
{
"cell_type": "code",
"source": [
"class DenoiseCNN(nn.Module):\n",
"    def __init__(self):\n",
"        super().__init__()\n",
"        # Encoder: 1x28x28 image -> 64-channel 1x1 bottleneck\n",
"        self.encoder = nn.Sequential(\n",
"            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # output: [16, 14, 14]\n",
"            nn.ReLU(),\n",
"            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # output: [32, 7, 7]\n",
"            nn.ReLU(),\n",
"            nn.Conv2d(32, 64, 7)                        # output: [64, 1, 1]\n",
"        )\n",
"        # Decoder: mirrors the encoder with transposed convolutions\n",
"        self.decoder = nn.Sequential(\n",
"            nn.ConvTranspose2d(64, 32, 7),              # output: [32, 7, 7]\n",
"            nn.ReLU(),\n",
"            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),  # output: [16, 14, 14]\n",
"            nn.ReLU(),\n",
"            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),   # output: [1, 28, 28]\n",
"            nn.Sigmoid()                                # squashes pixel values into (0, 1)\n",
"        )\n",
"\n",
"    def forward(self, x):\n",
"        x = self.encoder(x)\n",
"        x = self.decoder(x)\n",
"        return x\n"
],
"metadata": {
"id": "9zEvKJ87XiQP"
},
"execution_count": 6,
"outputs": []
},
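{
"cell_type": "markdown",
"source": [
"The shape comments in the model can be verified by pushing a dummy image through the layers one at a time. This is a minimal sketch that assumes the `DenoiseCNN` class from the previous cell and the imports from the first code cell; `probe` is an illustrative name for a throwaway, untrained instance."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"probe = DenoiseCNN()                  # throwaway instance, weights are untrained\n",
"x = torch.zeros(1, 1, 28, 28)         # one MNIST-sized image: [batch, channels, height, width]\n",
"\n",
"# Print the output shape after every layer of the encoder and the decoder\n",
"for layer in list(probe.encoder) + list(probe.decoder):\n",
"    x = layer(x)\n",
"    print(type(layer).__name__, tuple(x.shape))"
],
"metadata": {},
"execution_count": null,
"outputs": []
},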
{
"cell_type": "code",
"source": [
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"model = DenoiseCNN().to(device)"
],
"metadata": {
"id": "bcFiPXKCdjm-"
},
"execution_count": 8,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# The Training Loop\n",
"\n",
"The training loop iterates over batches of data provided by a DataLoader. Each batch of data is used to compute the loss and update the model's parameters.\n",
"\n",
"Before each backward pass, the gradients stored from the previous batch must be zeroed with `optimizer.zero_grad()`. PyTorch accumulates gradients across `backward()` calls by default, which is useful for techniques such as gradient accumulation but would mix gradients from different batches in a standard training loop.\n",
"\n",
"## Forward Pass\n",
"In a neural network, the operation `output = model(data)` involves passing the input data through various layers of the network. Each layer computes a weighted sum of its inputs and applies an activation function. The forward pass through the network can be mathematically represented as follows for each layer $i$:\n",
"\n",
"### Linear Combination\n",
"The input to each layer $i$, denoted as $x_i$ (where $x_0$ is the initial data), is transformed by a linear combination of weights $W_i$ and biases $b_i$. This is given by the equation:\n",
"\n",
"$$\n",
" z_i = W_i \\cdot x_i + b_i\n",
"$$\n",
"\n",
"Here, $z_i$ is the result of the linear combination at layer $i$.\n",
"\n",
"### Activation\n",
"The linear combination output $z_i$ is then passed through an activation function $f_i$, which introduces non-linearity into the model and helps to capture complex patterns in the data. The output of this activation function is:\n",
"\n",
"$$\n",
" x_{i+1} = f_i(z_i)\n",
"$$\n",
"\n",
" - $x_{i+1}$ becomes the input for the next layer $i+1$.\n",
"\n",
"The process is repeated for each layer until the final output layer is reached. The output of the network, after processing through all layers, is:\n",
"\n",
"$$\n",
"\\text{output} = x_n = f_{n-1}(W_{n-1} \\cdot x_{n-1} + b_{n-1})\n",
"$$\n",
"\n",
"This represents the network's output based on the initial input data after passing through all the transformations and activations of the layers.\n"
],
"metadata": {
"id": "VH6n7hAAFJMH"
}
},
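{
"cell_type": "markdown",
"source": [
"The layer-by-layer recursion above can be written out directly. The following is a minimal sketch that assumes two small fully connected layers with ReLU and sigmoid activations; these sizes and names are chosen for illustration and are unrelated to `DenoiseCNN`. It compares the hand-written loop with calling the modules themselves."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import torch\n",
"import torch.nn as nn\n",
"\n",
"torch.manual_seed(0)\n",
"layers = [nn.Linear(4, 3), nn.Linear(3, 2)]    # hold W_0, b_0 and W_1, b_1\n",
"activations = [torch.relu, torch.sigmoid]      # f_0 and f_1\n",
"\n",
"x0 = torch.randn(4)\n",
"x = x0\n",
"for layer, f in zip(layers, activations):\n",
"    z = layer.weight @ x + layer.bias          # z_i = W_i x_i + b_i\n",
"    x = f(z)                                   # x_{i+1} = f_i(z_i)\n",
"\n",
"# Same computation using the modules directly\n",
"reference = torch.sigmoid(layers[1](torch.relu(layers[0](x0))))\n",
"print(torch.allclose(x, reference))            # True"
],
"metadata": {},
"execution_count": null,
"outputs": []
},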
{
"cell_type": "markdown",
"source": [
"# Loss function\n",
"\n",
"In the training of neural networks, especially in tasks like regression or autoencoders, measuring how well the model's predictions match the actual data is crucial. This is done using a loss function, which quantifies the difference between the predicted values and the actual values. In this case, we use the Mean Squared Error (MSE) loss, which is commonly used for these types of problems.\n",
"\n",
"The MSE loss is calculated as follows:\n",
"\n",
"- **Mean Squared Error Loss**:\n",
"  - The MSE loss measures the average squared difference between the predicted outputs and the target values. It is given by the equation:\n",
"  $$\n",
"  \\text{MSE} = \\frac{1}{n} \\sum_{i=1}^n (\\text{output}_i - \\text{data}_i)^2\n",
"  $$\n",
"  - Here, $\\text{output}_i$ is the model's prediction for the $i$-th element and $\\text{data}_i$ is the corresponding target value. The squared differences are averaged over all $n$ elements of the batch, that is, every pixel of every image in it.\n",
"\n",
"The code `loss = nn.MSELoss()(output, data)` calculates this loss using PyTorch's `nn.MSELoss`, which automatically handles the computation of the squared differences and the averaging:\n",
"\n",
"- `nn.MSELoss()`: Creates an instance of the MSE loss function.\n",
"- `nn.MSELoss()(output, data)`: Computes the MSE loss between `output` and `data`.\n",
"\n",
"Note that in this notebook `data` is the noisy batch produced by the training transform, and it serves as both the model input and the target. The narrow bottleneck limits how much of the per-pixel noise can pass through the network, so the reconstruction still comes out smoother than the input. The logged training loss levels off near 0.09, which is consistent with the noise variance $0.3^2$ that remains in the target.\n",
"\n",
"This loss value is then used during the training process to update the model's weights with the goal of minimizing the loss, thereby improving the model's accuracy in predicting the data.\n"
],
"metadata": {
"id": "urz4tnOON7dG"
}
},
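{
"cell_type": "markdown",
"source": [
"As a quick check of the formula, the sketch below computes the MSE by hand and with `nn.MSELoss` on two small made-up tensors. The values are arbitrary and chosen so that the result is easy to verify."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import torch\n",
"import torch.nn as nn\n",
"\n",
"output = torch.tensor([[0.0, 0.5], [1.0, 0.25]])   # made-up predictions\n",
"target = torch.tensor([[0.0, 1.0], [0.5, 0.25]])   # made-up targets\n",
"\n",
"# Squared differences are 0, 0.25, 0.25, 0, so the mean is 0.125\n",
"manual = ((output - target) ** 2).mean()\n",
"builtin = nn.MSELoss()(output, target)\n",
"print(manual.item(), builtin.item())               # 0.125 0.125"
],
"metadata": {},
"execution_count": null,
"outputs": []
},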
{
"cell_type": "markdown",
"source": [
"# Back Propagation\n",
"\n",
"When training a neural network, `loss.backward()` initiates backpropagation: it computes the gradients of the loss function with respect to the network's parameters, which the optimizer then uses to update them.\n",
"\n",
"### Mathematical Background of `loss.backward()`\n",
"\n",
"- **Gradient Computation**:\n",
"  - Consider a neural network where the output depends on the input data through a series of transformations (layers). Let the loss function be denoted by $L$. For each parameter $\\theta$ in the network (including weights and biases), the gradient of $L$ with respect to $\\theta$ is represented as $\\frac{\\partial L}{\\partial \\theta}$.\n",
"\n",
"  - The gradient $\\frac{\\partial L}{\\partial \\theta}$ indicates the direction and magnitude of change in $L$ when $\\theta$ is altered slightly. Backpropagation computes these gradients using the chain rule of calculus, applying it layer by layer from the output back to the input.\n",
"\n",
"- **Chain Rule Application**:\n",
"  - If the network output $y$ is a function of the weights $W$ (and biases $b$), and the loss $L$ depends on $y$, then by the chain rule, the gradient of $L$ with respect to each weight $w_{ij}$ in $W$ is:\n",
"\n",
"  $$\n",
"  \\frac{\\partial L}{\\partial w_{ij}} = \\frac{\\partial L}{\\partial y} \\cdot \\frac{\\partial y}{\\partial w_{ij}}\n",
"  $$\n",
"\n",
"  - This process runs backwards through the network (hence 'backpropagation'), starting from the output layer and moving towards the input, computing the gradient at each layer from the gradients already computed for the layers after it.\n",
"\n",
"- **Efficiency of Backpropagation**:\n",
"  - This approach is efficient because reverse-mode automatic differentiation reuses intermediate results: a single pass from the output back to the input yields the gradients for all parameters, instead of one pass per parameter.\n",
"\n",
"`loss.backward()` only computes and stores these gradients; the subsequent `optimizer.step()` uses them to adjust every parameter so as to reduce the loss.\n"
],
"metadata": {
"id": "f1gpL1-wGXNw"
}
},
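{
"cell_type": "markdown",
"source": [
"The chain rule can be checked on a one-parameter example. This minimal sketch assumes a scalar model $y = w \\cdot x$ with squared-error loss; the numbers are made up so that the gradient can be worked out by hand and compared with what `loss.backward()` stores in `w.grad`."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import torch\n",
"\n",
"w = torch.tensor(2.0, requires_grad=True)   # the single trainable parameter\n",
"x = torch.tensor(3.0)\n",
"target = torch.tensor(5.0)\n",
"\n",
"y = w * x                                   # y = 6\n",
"loss = (y - target) ** 2                    # L = (6 - 5)^2 = 1\n",
"\n",
"loss.backward()\n",
"# Chain rule: dL/dw = dL/dy * dy/dw = 2 * (y - target) * x = 2 * 1 * 3 = 6\n",
"print(w.grad.item())                        # 6.0"
],
"metadata": {},
"execution_count": null,
"outputs": []
},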
{
"cell_type": "code",
"source": [
"def train(model, device, train_loader, optimizer, epoch):\n",
"    model.train()\n",
"    criterion = nn.MSELoss()\n",
"    for batch_idx, (data, _) in enumerate(train_loader):\n",
"        # `data` is already noisy: the training transform adds noise per sample\n",
"        data = data.to(device)\n",
"        optimizer.zero_grad()            # clear gradients from the previous batch\n",
"        output = model(data)             # forward pass\n",
"        loss = criterion(output, data)   # the noisy batch is also the target\n",
"        loss.backward()                  # backpropagate\n",
"        optimizer.step()                 # update the parameters\n",
"        if batch_idx % 100 == 0:\n",
"            print('Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
"                epoch, batch_idx * len(data), len(train_loader.dataset),\n",
"                100. * batch_idx / len(train_loader), loss.item()))\n"
],
"metadata": {
"id": "ffmRc0QxXo1l"
},
"execution_count": 9,
"outputs": []
},
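{
"cell_type": "markdown",
"source": [
"The introduction describes training on pairs of noisy inputs and clean targets, whereas `train` above uses the noisy batch for both. A variant that follows the noisy-to-clean description might look like the sketch below. It is an assumption for illustration, not the notebook's method: `train_clean_target`, `clean_loader` and `noise_std` are hypothetical names, and the loader is assumed to yield clean images (for example MNIST loaded with `test_transform`)."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"def train_clean_target(model, device, clean_loader, optimizer, noise_std=0.3):\n",
"    # Hypothetical variant of train(): expects a loader that yields CLEAN images\n",
"    model.train()\n",
"    criterion = nn.MSELoss()\n",
"    for clean, _ in clean_loader:\n",
"        clean = clean.to(device)\n",
"        noisy = clean + torch.randn_like(clean) * noise_std   # corrupt the input only\n",
"        optimizer.zero_grad()\n",
"        loss = criterion(model(noisy), clean)                 # compare against the clean image\n",
"        loss.backward()\n",
"        optimizer.step()\n",
"    return loss.item()                                        # loss of the last batch\n",
"\n",
"# Possible usage (assumed names, left commented out so this cell does not start a training run):\n",
"# clean_train_set = MNIST(root='./data', download=True, train=True, transform=test_transform)\n",
"# clean_loader = DataLoader(clean_train_set, batch_size=64, shuffle=True)\n",
"# train_clean_target(model, device, clean_loader, optimizer)"
],
"metadata": {},
"execution_count": null,
"outputs": []
},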
{
"cell_type": "code",
"source": [
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"optimizer = optim.Adam(model.parameters(), lr=1e-3)\n",
"\n",
"for epoch in range(1, 6):\n",
"    train(model, device, train_loader, optimizer, epoch)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "qztHURcQXwr1",
"outputId": "5723054b-2fab-4ed0-9865-7efe124a2acc"
},
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Train Epoch: 1 [0/60000 (0%)]\tLoss: 0.327067\n",
"Train Epoch: 1 [6400/60000 (11%)]\tLoss: 0.159962\n",
"Train Epoch: 1 [12800/60000 (21%)]\tLoss: 0.160823\n",
"Train Epoch: 1 [19200/60000 (32%)]\tLoss: 0.154230\n",
"Train Epoch: 1 [25600/60000 (43%)]\tLoss: 0.140378\n",
"Train Epoch: 1 [32000/60000 (53%)]\tLoss: 0.118569\n",
"Train Epoch: 1 [38400/60000 (64%)]\tLoss: 0.109507\n",
"Train Epoch: 1 [44800/60000 (75%)]\tLoss: 0.103985\n",
"Train Epoch: 1 [51200/60000 (85%)]\tLoss: 0.100498\n",
"Train Epoch: 1 [57600/60000 (96%)]\tLoss: 0.098651\n",
"Train Epoch: 2 [0/60000 (0%)]\tLoss: 0.098425\n",
"Train Epoch: 2 [6400/60000 (11%)]\tLoss: 0.099546\n",
"Train Epoch: 2 [12800/60000 (21%)]\tLoss: 0.095994\n",
"Train Epoch: 2 [19200/60000 (32%)]\tLoss: 0.098282\n",
"Train Epoch: 2 [25600/60000 (43%)]\tLoss: 0.096209\n",
"Train Epoch: 2 [32000/60000 (53%)]\tLoss: 0.095288\n",
"Train Epoch: 2 [38400/60000 (64%)]\tLoss: 0.096000\n",
"Train Epoch: 2 [44800/60000 (75%)]\tLoss: 0.094006\n",
"Train Epoch: 2 [51200/60000 (85%)]\tLoss: 0.094433\n",
"Train Epoch: 2 [57600/60000 (96%)]\tLoss: 0.094849\n",
"Train Epoch: 3 [0/60000 (0%)]\tLoss: 0.092818\n",
"Train Epoch: 3 [6400/60000 (11%)]\tLoss: 0.093319\n",
"Train Epoch: 3 [12800/60000 (21%)]\tLoss: 0.092815\n",
"Train Epoch: 3 [19200/60000 (32%)]\tLoss: 0.092602\n",
"Train Epoch: 3 [25600/60000 (43%)]\tLoss: 0.093454\n",
"Train Epoch: 3 [32000/60000 (53%)]\tLoss: 0.092860\n",
"Train Epoch: 3 [38400/60000 (64%)]\tLoss: 0.093124\n",
"Train Epoch: 3 [44800/60000 (75%)]\tLoss: 0.092819\n",
"Train Epoch: 3 [51200/60000 (85%)]\tLoss: 0.093516\n",
"Train Epoch: 3 [57600/60000 (96%)]\tLoss: 0.091468\n",
"Train Epoch: 4 [0/60000 (0%)]\tLoss: 0.092095\n",
"Train Epoch: 4 [6400/60000 (11%)]\tLoss: 0.090435\n",
"Train Epoch: 4 [12800/60000 (21%)]\tLoss: 0.091019\n",
"Train Epoch: 4 [19200/60000 (32%)]\tLoss: 0.090567\n",
"Train Epoch: 4 [25600/60000 (43%)]\tLoss: 0.091290\n",
"Train Epoch: 4 [32000/60000 (53%)]\tLoss: 0.090119\n",
"Train Epoch: 4 [38400/60000 (64%)]\tLoss: 0.090886\n",
"Train Epoch: 4 [44800/60000 (75%)]\tLoss: 0.091868\n",
"Train Epoch: 4 [51200/60000 (85%)]\tLoss: 0.090584\n",
"Train Epoch: 4 [57600/60000 (96%)]\tLoss: 0.090982\n",
"Train Epoch: 5 [0/60000 (0%)]\tLoss: 0.091345\n",
"Train Epoch: 5 [6400/60000 (11%)]\tLoss: 0.090041\n",
"Train Epoch: 5 [12800/60000 (21%)]\tLoss: 0.090048\n",
"Train Epoch: 5 [19200/60000 (32%)]\tLoss: 0.090627\n",
"Train Epoch: 5 [25600/60000 (43%)]\tLoss: 0.089962\n",
"Train Epoch: 5 [32000/60000 (53%)]\tLoss: 0.091128\n",
"Train Epoch: 5 [38400/60000 (64%)]\tLoss: 0.089472\n",
"Train Epoch: 5 [44800/60000 (75%)]\tLoss: 0.089823\n",
"Train Epoch: 5 [51200/60000 (85%)]\tLoss: 0.090318\n",
"Train Epoch: 5 [57600/60000 (96%)]\tLoss: 0.090904\n"
]
}
]
},
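{
"cell_type": "markdown",
"source": [
"The training loss is measured against noisy targets, so it does not directly say how close the output is to the clean image. One way to estimate that is sketched below, assuming the `model`, `device` and `test_loader` defined earlier; `evaluate` is a hypothetical helper that adds noise with the same standard deviation of 0.3 to the clean test images and reports the per-pixel MSE between the denoised output and the clean image."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"def evaluate(model, device, data_loader, noise_std=0.3):\n",
"    # Hypothetical helper: per-pixel MSE between denoised output and clean image\n",
"    model.eval()\n",
"    criterion = nn.MSELoss(reduction='sum')   # sum here, divide by the pixel count below\n",
"    total, count = 0.0, 0\n",
"    with torch.no_grad():\n",
"        for clean, _ in data_loader:\n",
"            clean = clean.to(device)\n",
"            noisy = clean + torch.randn_like(clean) * noise_std\n",
"            total += criterion(model(noisy), clean).item()\n",
"            count += clean.numel()\n",
"    return total / count\n",
"\n",
"print('Test MSE against clean images:', evaluate(model, device, test_loader))"
],
"metadata": {},
"execution_count": null,
"outputs": []
},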
{
"cell_type": "code",
"source": [
"def plot_examples(model, device, data_loader, num_images=6):\n",
"    model.eval()\n",
"    # One batch is enough; num_images must not exceed the batch size\n",
"    data, _ = next(iter(data_loader))\n",
"    noisy_data = add_noise(data)  # Add noise to the original data for visualization\n",
"    with torch.no_grad():\n",
"        output = model(noisy_data.to(device))  # Denoise the noisy data\n",
"\n",
"    figure, ax = plt.subplots(nrows=num_images, ncols=3, figsize=(9, num_images * 3))\n",
"    columns = [('Original', data), ('Noisy', noisy_data), ('Denoised', output)]\n",
"    for i in range(num_images):\n",
"        for j, (title, images) in enumerate(columns):\n",
"            ax[i, j].imshow(images[i].cpu().squeeze().numpy(), cmap='gray')\n",
"            ax[i, j].set_title(title)\n",
"            ax[i, j].axis('off')\n",
"\n",
"    plt.tight_layout()\n",
"    plt.show()\n",
"\n",
"# Assuming model, device, and test_loader are already defined\n",
"plot_examples(model, device, test_loader)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "FQEXjUvTbNOS",
"outputId": "d3673e00-64d9-4b49-c8ea-3e132a6f4fb8"
},
"execution_count": 11,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 900x1800 with 18 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
}
]
}