Logging with Weights & Biases

Monitoring your neural network’s training made easy

Abdur Rahman Kalim
Towards Data Science

--

Photo by Isaac Smith on Unsplash

Logging loss and accuracy curves has never been an easy job. We often save these values in arrays or lists and then plot them at the end of training. Sharing the graphs is even harder; sending screenshots often seems like the only option. In this tutorial, we will address this problem with Weights & Biases.

Weights & Biases (WandB) is a Python package that allows us to monitor our training in real time. It integrates easily with popular deep learning frameworks such as PyTorch, TensorFlow, and Keras. Additionally, it lets us organize our Runs into Projects, where we can easily compare them and identify the best-performing model. In this guide, we will learn how to use WandB for logging.

Let’s get started

First, let’s create a free WandB account here. The free account comes with 200GB of storage, where we can log graphs, images, videos, and much more.

Install WandB

Running the code snippet below installs WandB in our Colab Notebook instance. After installing and importing WandB, we will be prompted to authenticate, much like we do when mounting Google Drive to the notebook instance.
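A minimal sketch of such a cell might look like this (assuming a Colab environment; wandb.login() prompts for the API key from your account page):

# Install the wandb package in the Colab notebook
!pip install -q wandb

# Import and authenticate; wandb.login() asks for the API key
import wandb
wandb.login()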

Dataset

In this tutorial, we will train a Convolutional Neural Network on the Fashion-MNIST dataset, which is available in PyTorch’s torchvision.datasets. We will split the dataset into batches and shuffle them before feeding them to our neural network. For simplicity, we will not use any data augmentation beyond converting the images and labels into torch tensors.
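A sketch of that dataset setup might look like the following (the batch size of 64 is an assumption, not taken from the original notebook):

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Only convert images to tensors; no further augmentation
transform = transforms.ToTensor()

train_set = datasets.FashionMNIST(root="./data", train=True, download=True, transform=transform)
valid_set = datasets.FashionMNIST(root="./data", train=False, download=True, transform=transform)

# Batch and shuffle the data before feeding it to the network
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
valid_loader = DataLoader(valid_set, batch_size=64, shuffle=False)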

Neural Network

We will use a simple Convolutional Neural Network with 2 Conv layers and 3 Linear layers. In the input Conv layer, we will keep the input channel count as 1 so the network accepts grayscale images. Similarly, in the final Linear layer, the output feature count should be 10 so the model outputs a score for each of the 10 classes.
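One possible implementation of such a network is sketched below; the channel widths, kernel sizes, and hidden-layer sizes are assumptions, and only the single input channel and the 10 output scores follow from the description above.

import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Input channel count is 1 for grayscale Fashion-MNIST images
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 64)
        # Final layer outputs a score for each of the 10 classes
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 28x28 -> 14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # 14x14 -> 7x7
        x = x.flatten(1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)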

Training the Neural Network

Initializing WandB

Before training, we must run wandb.init(). This initializes a new Run in the WandB database. wandb.init() takes the following parameters (a minimal call is sketched after the list):

  • Name of the Run (string)
  • Name of the Project where this Run should be created (string)
  • Notes about this Run (string)[optional]
  • Tags to associate with this Run (list of strings) [optional]
  • Entity, the username or team name of our WandB account (string)
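A minimal wandb.init() call using these parameters might look like this; the project, run name, notes, tags, and entity values below are placeholders, not the ones used in the original notebook.

run = wandb.init(
    project="fashion-mnist-cnn",       # placeholder Project name
    name="baseline-cnn",               # placeholder Run name
    notes="2 conv + 3 linear layers",  # optional
    tags=["baseline", "cnn"],          # optional
    entity="my-wandb-username",        # replace with your own account or team
)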

Logging Hyperparameters (optional)

Any hyperparameter that we want to log can be defined as an attribute of wandb.config. Note that wandb.config should be used to log only those values which do not change during training. Here, we have logged the learning rate using this method.
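For example, logging the learning rate this way (along with a couple of other fixed values; the numbers are placeholders) could look like:

# Hyperparameters that stay fixed during training go into wandb.config
wandb.config.learning_rate = 0.01
wandb.config.batch_size = 64
wandb.config.epochs = 10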

Logging Network Weight Histograms (optional)

Logging weight histograms is as simple as calling wandb.watch() and passing the network object as the argument.
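A sketch of that call might be the following; passing log="all" is an assumption that also records parameter histograms, since the default setting only tracks gradients.

model = CNN()

# Track weight (and gradient) histograms for the network
wandb.watch(model, log="all")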

Logging Loss and Accuracy

At the end of each epoch, we can log the loss and accuracy values using wandb.log(). This method takes as its argument a dictionary mapping names (strings) to the corresponding values.

wandb.log({"Epoch": epoch,
           "Train Loss": loss_train,
           "Train Acc": acc_train,
           "Valid Loss": loss_valid,
           "Valid Acc": acc_valid})

The complete training code is given below. Please note that train and validate are two helper functions that are defined in the Colab Notebook.
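Since the full code lives in the Colab Notebook, the loop below is only a sketch: train and validate are assumed to return (loss, accuracy) pairs, and the optimizer choice is a placeholder.

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=wandb.config.learning_rate)

for epoch in range(wandb.config.epochs):
    # Helper functions defined in the notebook; signatures here are assumed
    loss_train, acc_train = train(model, train_loader, criterion, optimizer)
    loss_valid, acc_valid = validate(model, valid_loader, criterion)

    # Log the metrics for this epoch to the active Run
    wandb.log({"Epoch": epoch,
               "Train Loss": loss_train,
               "Train Acc": acc_train,
               "Valid Loss": loss_valid,
               "Valid Acc": acc_valid})

wandb.finish()  # mark the Run as finished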

Monitor the training

Now that our model is training, we can view the training graphs in real time. Before training starts, the Run Link and Project Link are printed in the cell output. Click on the Run Link to be redirected to the Run page, where you can see the live graphs (they keep updating as the model trains). Note that the graphs are interactive: we can merge multiple graphs, smooth them, change the colors or legends, and much more.

Recall that we are logging the network weight histograms as well. These can be viewed under the histogram section.

Compare multiple Runs

We can easily compare multiple Runs in the same Project using a Parallel Coordinates Chart. It relates the model’s performance, in terms of minimum loss or maximum accuracy, to the neural network’s hyperparameters. This tool is very powerful when dealing with a large number of Runs, as it gives deeper insight into how each hyperparameter affects the model’s performance.

We can also go to the project page and compare the accuracy and loss curves of selected or all the Runs within that project. The screenshot below shows one of my projects.

Share your Runs

Sharing your model’s training graphs with WandB is as simple as sharing the Run Link or Project Link; just make sure the Project is public so others can view it. Anyone with the Project Link or Run Link can view the live graphs too. Isn’t that cool?

That’s all folks!

This guide taught us how to use Weights & Biases to monitor our model’s training and how to share our Runs with others.

Want more?

To explore WandB further, check out the links below.
