@Mdhvince Mdhvince/imgrt.png
Last active Apr 25, 2019

Visualizing Training and Validation Loss During the training process using PyTorch and Bokeh

Visualizing Training and Validation Loss in real-time using PyTorch and Bokeh

While training a neural network, I like to keep an eye on a few outputs: the current epoch, the training loss and the validation loss. This gives me an idea of the direction in which the algorithm is moving and helps answer questions such as: Should I choose a bigger or smaller learning rate? Should I use learning-rate decay? Should I stop the training, or maybe reduce the number of epochs?

Many of these questions can be answered by a package such as early_stopping. But I find it interesting to be able to visualize these values in real time, that is, during the training process.
And you know what? Here is a quick tutorial on how to do this using the wonderful deep learning framework PyTorch and the sublime Bokeh library for plotting.

Step 1: Install dependencies

bokeh==1.1.0
cycler==0.10.0
Jinja2==2.10.1
kiwisolver==1.1.0
MarkupSafe==1.1.1
matplotlib==3.0.3
numpy==1.16.3
opencv-python==4.1.0.25
packaging==19.0
pandas==0.24.2
Pillow==6.0.0
pyparsing==2.4.0
python-dateutil==2.8.0
pytz==2019.1
PyYAML==5.1
six==1.12.0
torch==1.0.1.post2
torchvision==0.2.2.post3
tornado==6.0.2

Step 2: Import the necessary modules

# PyTorch
import torch
...

# Bokeh
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

from functools import partial
from threading import Thread
from tornado import gen

Step 3: Prepare the plot

First, we have to define an object called ColumnDataSource, which holds a dictionary of the variables that you want to include in the plot, with initial values if you want. Here I have no initial values, so every column starts as an empty list.

source = ColumnDataSource(data={'epochs': [],
                                'trainlosses': [],
                                'vallosses': [] }
)

Then create the figure by calling figure() and add the training and validation losses as line plots. Both lines read their columns from the same source.

plot = figure()
plot.line(x='epochs', y='trainlosses',
          color='green', alpha=0.8, legend='Train loss', line_width=2,
          source=source)

plot.line(x='epochs', y='vallosses',
          color='red', alpha=0.8, legend='Val loss', line_width=2,
          source=source)

Finally, we get the document that will be displayed by calling the curdoc() function. Here it is important to save a local copy of curdoc() in the doc variable so that all threads have access to the same document.

doc = curdoc()
# Add the plot to the current document
doc.add_root(plot)

Step 4: Update the plot

Here is a function that takes as input a dictionary with the same keys as the data dictionary declared in step 3. It is responsible for streaming the new losses and the current epoch, sent from the training loop defined in step 5, into the plot's data source.

@gen.coroutine
def update(new_data):
    source.stream(new_data)
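
To make concrete what streaming does to the columns, here is a minimal pure-Python sketch of the append behaviour. Note that stream_columns is a hypothetical stand-in written for illustration, not Bokeh's implementation; it only mirrors the rule that an update should cover every existing column with the same number of new values.

```python
def stream_columns(data, new_data):
    """Append the values in new_data to the matching columns of data, in place."""
    if set(new_data) != set(data):
        raise ValueError("new_data must provide exactly the existing columns")
    lengths = {len(values) for values in new_data.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must receive the same number of values")
    for name, values in new_data.items():
        data[name].extend(values)
    return data


# Same column names as the data dictionary from step 3; the losses are made up.
columns = {'epochs': [], 'trainlosses': [], 'vallosses': []}
stream_columns(columns, {'epochs': [1], 'trainlosses': [0.9], 'vallosses': [1.1]})
stream_columns(columns, {'epochs': [2], 'trainlosses': [0.7], 'vallosses': [0.8]})
print(columns)
```

Each call adds one point per line, which is why the values sent from the training loop are wrapped in single-element lists.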

Step 5: Process data and write your training loop as usual

Here I assume that you know how to train a neural net using PyTorch, so I'll focus only on the parts of the code that matter for the plot.

def train(n_epochs):
    model = Net()
    ...
    for epoch in range(1, n_epochs+1):
        # Keep track of training and validation loss
        train_loss = 0.0
        valid_loss = 0.0

        model.train()  # back to training mode at the start of every epoch
        for data in train_loader:
            ...
            # compute your training loss as usual
            train_loss += loss.item()*images.size(0)

        model.eval()  # evaluation mode for the validation pass
        for data in valid_loader:
            ...
            # compute your validation loss as usual
            valid_loss += loss.item()*images.size(0)

        # calculate the average loss per sample (each batch loss was
        # weighted by its batch size, so divide by the number of samples)
        train_loss = train_loss/len(train_loader.dataset)
        valid_loss = valid_loss/len(valid_loader.dataset)

Up to this point, nothing has changed compared to what we usually do when training a neural network. The only thing to add after the last line, still inside the for loop over epochs, is the following lines. We build the new data dictionary and then update the plot using the update function defined in step 4.

        new_data = {'epochs': [epoch],
                    'trainlosses': [train_loss],
                    'vallosses': [valid_loss] }

        doc.add_next_tick_callback(partial(update, new_data))
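
A note on the averaging step that produces the plotted values: because each batch loss is multiplied by images.size(0), the running total is a sum over samples, so the divisor that gives a per-sample average is the number of samples rather than the number of batches. The sketch below checks this with made-up numbers; batch_losses and batch_sizes are hypothetical values, not results from a real run.

```python
# Hypothetical mean losses reported for three batches, and the batch sizes.
batch_losses = [0.9, 0.6, 0.3]
batch_sizes = [4, 4, 2]  # the last batch is smaller

# Same accumulation as in the training loop: loss.item() * images.size(0)
running_total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))

per_sample = running_total / sum(batch_sizes)  # divide by the number of samples
per_batch = running_total / len(batch_sizes)   # divide by the number of batches

print(round(per_sample, 4))
print(round(per_batch, 4))
```

With these numbers the per-sample average is 0.66, while dividing by the number of batches gives 2.2, which is larger than any single batch loss. The shape of the curve is the same either way, but only the first value is directly comparable to the loss of one batch.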

So the train() function should look like this:

def train(n_epochs):
    model = Net()
    ...
    for epoch in range(1, n_epochs+1):
        ...
        model.train()  # back to training mode at the start of every epoch
        for data in train_loader:
            ...
            # compute your training loss as usual
            train_loss += loss.item()*images.size(0)

        model.eval()  # evaluation mode for the validation pass
        for data in valid_loader:
            ...
            # compute your validation loss as usual
            valid_loss += loss.item()*images.size(0)

        # calculate the average loss per sample (each batch loss was
        # weighted by its batch size, so divide by the number of samples)
        train_loss = train_loss/len(train_loader.dataset)
        valid_loss = valid_loss/len(valid_loader.dataset)

        new_data = {'epochs': [epoch],
                    'trainlosses': [train_loss],
                    'vallosses': [valid_loss] }

        doc.add_next_tick_callback(partial(update, new_data))
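
The imports in step 2 include Thread, and step 3 mentions that all threads share doc, but the snippets never start a thread. A common pattern, assumed here rather than taken from the original script, is to run train() in a background thread so that the Bokeh server stays free to run the callbacks queued with add_next_tick_callback. The sketch below shows that handoff using only the standard library: QueueDoc is a hypothetical stand-in for the object returned by curdoc(), the update function appends to a plain dictionary instead of calling source.stream, and the losses are synthetic.

```python
from functools import partial
from threading import Thread


class QueueDoc:
    """Hypothetical stand-in for the Bokeh document: it only queues callbacks."""

    def __init__(self):
        self.pending = []

    def add_next_tick_callback(self, callback):
        self.pending.append(callback)

    def run_pending(self):
        # In a served script, the server's event loop would run these instead.
        while self.pending:
            self.pending.pop(0)()


doc = QueueDoc()
plotted = {'epochs': [], 'trainlosses': [], 'vallosses': []}


def update(new_data):
    # Stand-in for source.stream(new_data)
    for name, values in new_data.items():
        plotted[name].extend(values)


def train(n_epochs):
    for epoch in range(1, n_epochs + 1):
        # Synthetic losses in place of a real training and validation pass
        train_loss, valid_loss = 1.0 / epoch, 1.2 / epoch
        new_data = {'epochs': [epoch],
                    'trainlosses': [train_loss],
                    'vallosses': [valid_loss]}
        doc.add_next_tick_callback(partial(update, new_data))


# Start training in the background instead of calling train() directly
thread = Thread(target=train, args=(3,))
thread.start()
thread.join()      # only for this demo; a served script would not block here
doc.run_pending()
print(plotted['epochs'])
```

In training.py itself, the equivalent would presumably be to end the script with thread = Thread(target=train, args=(n_epochs,)) followed by thread.start(), using the real doc = curdoc() from step 3 and without the join() and run_pending() calls.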

Step 6: Display the result via the terminal

If your file is named training.py, instead of running it with the python command, we have to launch the Bokeh server and let it execute the script by typing in the terminal: bokeh serve --show training.py

The --show option opens the page in the browser, where we can watch the result.

I hope you enjoyed this tutorial; I tried my best to keep the explanations simple.

Thank you!
