@dvgodoy
Last active January 9, 2024 08:54
torch.manual_seed(42)
x_tensor = torch.from_numpy(x).float()
y_tensor = torch.from_numpy(y).float()
# Builds dataset with ALL data
dataset = TensorDataset(x_tensor, y_tensor)
# Splits randomly into train and validation datasets
train_dataset, val_dataset = random_split(dataset, [80, 20])
# Builds a loader for each dataset to perform mini-batch gradient descent
train_loader = DataLoader(dataset=train_dataset, batch_size=16)
val_loader = DataLoader(dataset=val_dataset, batch_size=20)
# Builds a simple sequential model
model = nn.Sequential(nn.Linear(1, 1)).to(device)
print(model.state_dict())
# Sets hyper-parameters
lr = 1e-1
n_epochs = 150
# Defines loss function and optimizer
loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=lr)
losses = []
val_losses = []
# Creates function to perform train step from model, loss and optimizer
train_step = make_train_step(model, loss_fn, optimizer)
# Training loop
for epoch in range(n_epochs):
    # Uses loader to fetch one mini-batch for training
    for x_batch, y_batch in train_loader:
        # NOW, sends the mini-batch data to the device
        # so it matches location of the MODEL
        x_batch = x_batch.to(device)
        y_batch = y_batch.to(device)
        # One step of training
        loss = train_step(x_batch, y_batch)
        losses.append(loss)
    # After finishing training steps for all mini-batches,
    # it is time for evaluation!
    # We tell PyTorch to NOT use autograd...
    # Do you remember why?
    with torch.no_grad():
        # Uses loader to fetch one mini-batch for validation
        for x_val, y_val in val_loader:
            # Again, sends data to same device as model
            x_val = x_val.to(device)
            y_val = y_val.to(device)
            # What is that?!
            model.eval()
            # Makes predictions
            yhat = model(x_val)
            # Computes validation loss
            val_loss = loss_fn(y_val, yhat)
            val_losses.append(val_loss.item())
print(model.state_dict())
print(np.mean(losses))
print(np.mean(val_losses))
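
The two questions left open in the validation loop ("Do you remember why?" and "What is that?!") can be checked directly. The following is a minimal sketch, assuming a throwaway model and random inputs that are not part of the gist. It shows that `torch.no_grad()` stops autograd from recording the forward pass (no backward pass is needed for validation, so building the graph would only cost memory and time), and that `model.eval()` flips the module's `training` flag. For a single `nn.Linear` layer the flag does not change the numbers; it matters for layers such as dropout and batch normalization.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
model = nn.Sequential(nn.Linear(1, 1))  # illustrative model, same shape as the gist's
x = torch.rand(4, 1)                    # illustrative inputs

# Training mode: the forward pass is recorded so backward() can run later
model.train()
tracked = model(x)

# Validation mode: eval() sets model.training to False, and
# no_grad() keeps autograd from recording the forward pass
model.eval()
with torch.no_grad():
    untracked = model(x)

print(model.training)           # False after model.eval()
print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```

Both calls are needed for a clean validation pass because they control different things: `eval()` changes layer behavior, `no_grad()` changes gradient bookkeeping.
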
@filipre

filipre commented Aug 14, 2019

Thank you for the useful tutorial! In the following code, I put everything together from the tutorial and also modified the training loop a bit to print out the mean loss per epoch as a progress indicator:

import numpy as np
import torch
import torch.optim as optim
import torch.nn as nn
from torch.utils.data import Dataset, TensorDataset, DataLoader, random_split

device = 'cuda' if torch.cuda.is_available() else 'cpu'

np.random.seed(42)
x = np.random.rand(100, 1)
true_a, true_b = 1, 2
y = true_a + true_b*x + 0.1*np.random.randn(100, 1)

x_tensor = torch.from_numpy(x).float()
y_tensor = torch.from_numpy(y).float()

class CustomDataset(Dataset):
    def __init__(self, x_tensor, y_tensor):
        self.x = x_tensor
        self.y = y_tensor

    def __getitem__(self, index):
        return (self.x[index], self.y[index])

    def __len__(self):
        return len(self.x)

dataset = TensorDataset(x_tensor, y_tensor) # dataset = CustomDataset(x_tensor, y_tensor)

train_dataset, val_dataset = random_split(dataset, [80, 20])

train_loader = DataLoader(dataset=train_dataset, batch_size=16)
val_loader = DataLoader(dataset=val_dataset, batch_size=20)

class ManualLinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

def make_train_step(model, loss_fn, optimizer):
    def train_step(x, y):
        model.train()
        yhat = model(x)
        loss = loss_fn(y, yhat)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()
    return train_step

# Estimate a and b
torch.manual_seed(42)

model = ManualLinearRegression().to(device) # model = nn.Sequential(nn.Linear(1, 1)).to(device)
loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=1e-1)
train_step = make_train_step(model, loss_fn, optimizer)

n_epochs = 100
training_losses = []
validation_losses = []
print(model.state_dict())

for epoch in range(n_epochs):
    batch_losses = []
    for x_batch, y_batch in train_loader:
        x_batch = x_batch.to(device)
        y_batch = y_batch.to(device)
        loss = train_step(x_batch, y_batch)
        batch_losses.append(loss)
    training_loss = np.mean(batch_losses)
    training_losses.append(training_loss)

    with torch.no_grad():
        val_losses = []
        for x_val, y_val in val_loader:
            x_val = x_val.to(device)
            y_val = y_val.to(device)
            model.eval()
            yhat = model(x_val)
            val_loss = loss_fn(y_val, yhat).item()
            val_losses.append(val_loss)
        validation_loss = np.mean(val_losses)
        validation_losses.append(validation_loss)

    print(f"[{epoch+1}] Training loss: {training_loss:.3f}\t Validation loss: {validation_loss:.3f}")

print(model.state_dict())

Output:

OrderedDict([('linear.weight', tensor([[0.7645]])), ('linear.bias', tensor([0.8300]))])
[1] Training loss: 0.346	 Validation loss: 0.117
[2] Training loss: 0.081	 Validation loss: 0.066
[3] Training loss: 0.056	 Validation loss: 0.056
[4] Training loss: 0.048	 Validation loss: 0.050
[5] Training loss: 0.042	 Validation loss: 0.045
[6] Training loss: 0.037	 Validation loss: 0.041
[7] Training loss: 0.033	 Validation loss: 0.037
[8] Training loss: 0.030	 Validation loss: 0.034
[9] Training loss: 0.026	 Validation loss: 0.031
[10] Training loss: 0.024	 Validation loss: 0.029
[11] Training loss: 0.021	 Validation loss: 0.027
[12] Training loss: 0.020	 Validation loss: 0.025
...
[96] Training loss: 0.007	 Validation loss: 0.012
[97] Training loss: 0.007	 Validation loss: 0.012
[98] Training loss: 0.007	 Validation loss: 0.012
[99] Training loss: 0.007	 Validation loss: 0.012
[100] Training loss: 0.007	 Validation loss: 0.012
OrderedDict([('linear.weight', tensor([[1.9388]])), ('linear.bias', tensor([1.0227]))])
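
As a sanity check on the fitted parameters, the same synthetic data can be fed to an ordinary least-squares fit. This is only a sketch: it regenerates `x` and `y` with the seed used above and fits all 100 points with NumPy's `polyfit`, whereas the model was trained on a random 80-point split, so the numbers should be close to the trained weight and bias but not identical.

```python
import numpy as np

# Same synthetic data as in the script above
np.random.seed(42)
x = np.random.rand(100, 1)
true_a, true_b = 1, 2
y = true_a + true_b * x + 0.1 * np.random.randn(100, 1)

# Degree-1 polynomial fit; polyfit returns coefficients from the
# highest degree down, i.e. [slope, intercept]
b_hat, a_hat = np.polyfit(x.ravel(), y.ravel(), deg=1)
print(f"intercept a = {a_hat:.4f}, slope b = {b_hat:.4f}")
```
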

@IamSoo

IamSoo commented Apr 19, 2020

This was really helpful.

@tongcezhou

This is really helpful!

@slimbs15

Thank you for this really complete and helpful tutorial!

@passwortknacker

This was actually very helpful, thanks!

@ccyccxcl

ccyccxcl commented Aug 3, 2021

Thank you!

@ianhill60

How would you implement a standard scaler in this workflow? You want to scale after the train-test split, because the scaler should be fit only on training data, but would you do that for each individual batch? PyTorch must have a standard scaler for Dataset objects, no?
Thanks for the tutorial!
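
One possible answer, sketched under assumptions: as far as I know, `torch.utils.data` does not ship a scaler, so the statistics are computed by hand here. The mean and standard deviation are fit once on the training rows (reached through the `indices` of the `Subset` returned by `random_split`) and then applied unchanged to every training and validation batch, rather than refit per batch. The `scale` helper and the synthetic data are illustrative, not part of the tutorial; scikit-learn's `StandardScaler` fitted on the training rows would be an alternative.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

np.random.seed(42)
torch.manual_seed(42)
x = torch.from_numpy(np.random.rand(100, 1)).float()
y = 1 + 2 * x + 0.1 * torch.randn(100, 1)

dataset = TensorDataset(x, y)
train_dataset, val_dataset = random_split(dataset, [80, 20])

# Fit the statistics ONCE, on the training rows only (not per batch)
x_train = x[train_dataset.indices]
mean, std = x_train.mean(dim=0), x_train.std(dim=0)

def scale(features):
    # Hypothetical helper: applies the training-set statistics
    return (features - mean) / std

train_loader = DataLoader(train_dataset, batch_size=16)
val_loader = DataLoader(val_dataset, batch_size=20)

for x_batch, y_batch in train_loader:
    x_batch = scale(x_batch)  # same transform for every training batch
    # ... training step goes here ...

for x_val, y_val in val_loader:
    x_val = scale(x_val)      # validation reuses the training statistics
```

Scaling inside the loop keeps the stored dataset untouched; scaling the training and validation tensors up front, before building the loaders, would work as well.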

@FreeRealEstate221

This is the best PyTorch 101 I have encountered so far. Thank you very much!

@petrov826

Thank you so much for your great tutorial!!🤗

I used to live in the TensorFlow world, and moved to the fastai one 2 years ago.
These days, I've been wondering what's going on inside fastai...
After reading your blog post, I got a super solid foundation in PyTorch!

@Piyushi-0

Thanks a lot for the amazing blog.

@eswarijayakumar

Thanks for the great tutorial about the basics of PyTorch.

@elif2022

This is such a good, concise tutorial! It helped a lot. Thank you!
