TensorBoard is a visualization toolkit that allows the user to track and visualize metrics such as loss and accuracy, visualize the model graph, view histograms, project embeddings, and much more.
Although TensorBoard was created by the TensorFlow team, it can also be used with PyTorch. This gist shows how to do this.
PyTorch is used to develop neural network models. To quickly install it, one can run the following command. NOTE: The best way to install the PyTorch library is to follow the official installation instructions.
conda install pytorch -c pytorch
To install Tensorboard simply run the following command in your conda environment.
conda install tensorboard
With tensorboard one can:
- Inspect the model's architecture
- Track model training
- Assess the trained model
In this section we will cover all of these points.
Let's say that we have a model (net), a data batch (batch), a criterion (criterion), and an optimizer (optimizer) already set up.
To write the information in a TensorBoard-readable format, we must first initialize a SummaryWriter, which can be imported from torch.utils.tensorboard.
from torch.utils.tensorboard import SummaryWriter
# the default directory where tensorboard writes stuff is "./runs"
# we can set the directory by providing it as the input; in this example
# we will write the information into the "./logs/Net" folder - with this
# convention we will know that the information is associated with our Net
# model
writer = SummaryWriter("./logs/Net")
To visualize the model architecture, one adds a graph of the model to TensorBoard.
# write a new graph to tensorboard
writer.add_graph(net, batch)
# close the writer to flush the information into the file
writer.close()
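Since the snippet above assumes that net and batch already exist, the following is a minimal end-to-end sketch of the same step. The TinyNet class, its layer sizes, the random batch, and the temporary log directory are all hypothetical choices made for illustration; any model and any matching input batch should work in the same way.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter


class TinyNet(nn.Module):
    """Hypothetical stand-in for `net`: a small fully connected classifier."""

    def __init__(self, n_inputs: int = 4, n_hidden: int = 8, n_classes: int = 3):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_inputs, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)


net = TinyNet()
# a random batch of 16 examples with 4 features each (illustrative data only)
batch = torch.randn(16, 4)

# write into a temporary folder so the sketch does not touch "./logs"
log_dir = Path(tempfile.mkdtemp()) / "logs" / "TinyNet"
writer = SummaryWriter(str(log_dir))

# trace the model with the example batch and store the resulting graph
writer.add_graph(net, batch)
# close the writer to flush the information into the event file
writer.close()

event_files = sorted(log_dir.glob("events.out.tfevents.*"))
print(f"wrote {len(event_files)} event file(s) to {log_dir}")
```

The batch is only used to trace the model, so its values do not matter; its shape just has to match what the model expects.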
When training the model we are interested in how the training loss changes through iterations.
We can write the training loss using the add_scalar method.
We will illustrate how one can do this on a fictional example:
# set the hyperparameters
n_epochs = 1
n_steps = 100
running_loss = 0.0
for epoch in range(n_epochs):  # loop over the training dataset `n_epochs` times
    # iterate through the training examples
    for i, example in enumerate(training_data, 0):
        # each example is an array containing the inputs and labels: [inputs, labels]
        # NOTE: you can have a different example structure
        inputs, labels = example
        # using the optimizer we set all model gradients to zero
        optimizer.zero_grad()
        # do a forward pass
        outputs = net(inputs)
        # calculate the loss
        loss = criterion(outputs, labels)
        # do a backward pass
        loss.backward()
        # update the model using the optimizer
        optimizer.step()
        # add the loss value to the running_loss
        # NOTE: important to use the .item() method; otherwise you will store tensors
        # which in the CUDA setting would mean unnecessarily filling GPU space
        running_loss += loss.item()
        if (i + 1) % n_steps == 0:
            # after every n_steps iterations log the average running loss
            # arguments:
            # - the scalar label (tensorboard is able to plot multiple graphs for the same scalar label)
            # - the average loss; consider this as the y-axis value
            # - the iteration number; consider this as the x-axis value
            writer.add_scalar("training loss", running_loss / n_steps, epoch * len(training_data) + i)
            # set the running loss back to zero
            running_loss = 0.0
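To make the logging cadence easier to follow, here is a small sketch that replays only the bookkeeping part of the loop, without a model. RecordingWriter is a hypothetical in-memory stand-in for SummaryWriter, the loss values are made up, and the sketch assumes that a value is logged once every n_steps iterations and that the number of iterations per epoch is a multiple of n_steps.

```python
class RecordingWriter:
    """Hypothetical stand-in for SummaryWriter that keeps scalars in memory."""

    def __init__(self):
        self.scalars = []

    def add_scalar(self, tag, scalar_value, global_step):
        self.scalars.append((tag, scalar_value, global_step))


def log_running_loss(losses_per_epoch, n_steps, writer):
    """Log the average loss after every `n_steps` iterations."""
    running_loss = 0.0
    for epoch, losses in enumerate(losses_per_epoch):
        for i, loss in enumerate(losses):
            running_loss += loss
            if (i + 1) % n_steps == 0:
                # x-axis value: number of iterations in earlier epochs
                # plus the index of the current iteration
                global_step = epoch * len(losses) + i
                writer.add_scalar("training loss", running_loss / n_steps, global_step)
                running_loss = 0.0


writer = RecordingWriter()
# made-up loss values: two epochs with four iterations each
fake_losses = [[1.0, 3.0, 2.0, 4.0], [0.5, 1.5, 1.0, 2.0]]
log_running_loss(fake_losses, n_steps=2, writer=writer)

for tag, value, step in writer.scalars:
    print(tag, value, step)
# training loss 2.0 1
# training loss 3.0 3
# training loss 1.0 5
# training loss 1.5 7
```

Because the step keeps growing across epochs, the points of later epochs continue the same curve instead of being drawn over the first epoch.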
Once we have some information stored we can run tensorboard as a web service. To do this execute:
tensorboard --logdir=./logs --port 6006
The arguments provided are:
- logdir. The path to the folder containing the tensorboard information
- port (optional). The port on which we wish to access tensorboard. Default: 6006
Once TensorBoard is running, we can open our favorite browser and visit localhost:6006 or, if we changed the port number, localhost:{port}, where {port} is the chosen port number.
After that, we can access all of the stored information, view the model's architecture, and inspect all of our scalar values.
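The convention of one subfolder per model also makes it possible to compare models. The sketch below assumes two hypothetical runs, NetA and NetB, and writes synthetic, exponentially decaying values (not real training results) under the same "training loss" label into sibling folders of a temporary log root.

```python
import math
import tempfile
from pathlib import Path

from torch.utils.tensorboard import SummaryWriter

# temporary root folder standing in for "./logs"
log_root = Path(tempfile.mkdtemp()) / "logs"

# hypothetical run names mapped to made-up decay rates
decay_rates = {"NetA": 0.05, "NetB": 0.10}

for run_name, rate in decay_rates.items():
    # one subfolder (and therefore one run) per model
    writer = SummaryWriter(str(log_root / run_name))
    for step in range(100):
        # synthetic curve used only to have something to plot
        writer.add_scalar("training loss", math.exp(-rate * step), step)
    writer.close()

run_dirs = sorted(p.name for p in log_root.iterdir() if p.is_dir())
print(run_dirs)  # ['NetA', 'NetB']
print(f"tensorboard --logdir={log_root}")
```

Pointing --logdir at the printed root folder should list both subfolders as separate runs and draw their "training loss" curves in the same chart.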