How to Calculate Triplet Loss in PyTorch?

The PyTorch framework is used to build and train deep learning models for problems such as face recognition, recommender systems, and image retrieval. These models are trained on observed data, and a loss function measures how far the model's output is from the desired result. Many loss functions are available for this purpose, such as Focal, NLL, and CTC loss.

Quick Outline

This guide explains the following sections:

What is Triplet Loss
How to Calculate Triplet Loss in PyTorch
Method 1: Calculate Triplet Loss to Optimize the Trained Model
Method 2: Calculate Triplet Loss Using TripletMarginLoss() Function
Method 3: Calculate Triplet Loss Using Distance Function
Method 4: Calculate Triplet Loss Using Custom Function
Conclusion

What is Triplet Loss?

Many common loss functions scale poorly when a dataset contains a very large number of classes, for example more than a million. Triplet loss, introduced for face recognition in 2015, addresses this by comparing three inputs at a time instead of predicting a class. The first input is the anchor, which is the reference sample. The second is the positive, a sample that is similar to the anchor (same class). The third is the negative, a sample that is different from the anchor (different class).

The triplet loss function is minimized by pulling the anchor closer to the positive point than to the negative point by at least a margin. The mathematical formula for the triplet loss is:

$$L(A, P, N) = \max\bigl(D(A, P) - D(A, N) + \text{margin},\ 0\bigr)$$

Here:

A: Anchor or the actual input data

P: Positive which is similar to the anchor value

N: Negative value which is dissimilar to the anchor point

D(A, P): Distance between the anchor and positive points

D(A, N): Distance between the anchor and negative values

margin: The minimum amount by which D(A, N) should exceed D(A, P) before the loss drops to zero
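As a minimal sketch of the definitions above, the snippet below evaluates the commonly used form max(D(A, P) - D(A, N) + margin, 0) for a single hand-picked triplet. It assumes the Euclidean distance for D and a margin of 1.0; the tensor values are illustrative and do not come from any dataset.

```python
import torch

# Hypothetical 2-D embeddings chosen so the distances are easy to check by hand
anchor = torch.tensor([[0.0, 0.0]])
positive = torch.tensor([[3.0, 4.0]])   # D(A, P) = 5.0
negative = torch.tensor([[0.0, 5.5]])   # D(A, N) = 5.5
margin = 1.0                            # assumed margin

# Euclidean distance between each pair of rows
d_ap = (anchor - positive).pow(2).sum(dim=1).sqrt()
d_an = (anchor - negative).pow(2).sum(dim=1).sqrt()

# max(D(A, P) - D(A, N) + margin, 0), averaged over the batch
loss = torch.clamp(d_ap - d_an + margin, min=0.0).mean()
print(loss.item())  # 5.0 - 5.5 + 1.0 = 0.5
```

The negative is farther from the anchor than the positive here, but only by 0.5, which is less than the margin, so the loss is still positive.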

How to Calculate Triplet Loss in PyTorch

PyTorch offers built-in classes such as TripletMarginLoss() and TripletMarginWithDistanceLoss() to calculate the triplet loss. The user can also write a custom loss for a deep learning model by extending nn.Module. To learn the process of calculating the triplet loss in PyTorch, simply follow this guide:

Note: The Python code for calculating the triplet loss can be accessed from here

Method 1: Calculate Triplet Loss to Optimize the Trained Model

The following method trains the deep learning model and calculates the triplet loss to optimize its performance using multiple iterations of backpropagation. Start the process by executing the following steps:

Step 1: Install Modules

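In the Python notebook, use the “pip” package manager to install the torch framework together with the other libraries imported in the next step (most of them are already available on Colab):

pip install torch torchvision numpy pandas matplotlib tqdm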

Step 2: Import Libraries

The next step is to import the libraries required for loading the data, training the model, and plotting the results:

import time
import torch
import random
import numpy as np
import pandas as pd
import torch.nn as nn
import torch.optim as optim
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt
from torchvision import transforms
from torch.utils.data import DataLoader, Dataset

Step 3: Upload the Dataset

The next step is to upload the dataset to the Colab notebook. If you don’t have the dataset, simply download it from Kaggle:

Now, click on the folder icon in the left panel of the Colab notebook and then click on the upload icon to select the files from the local system:

Step 4: Load the Data to Train the Model

Before loading the data, seed the random number generators of torch, numpy, and Python's random module so that the same data produces the same results on every run. Then select the GPU if one is available:

torch.manual_seed(2020)
np.random.seed(2020)
random.seed(2020)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

if device.type == "cuda":
    # Print the GPU name; a bare expression inside an if block is not displayed
    print(torch.cuda.get_device_name())
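To illustrate what seeding buys, the sketch below reseeds the generators and checks that the same random tensor comes out twice. The helper name seed_everything() is hypothetical and not part of PyTorch.

```python
import random

import numpy as np
import torch


def seed_everything(seed):
    """Hypothetical helper: seed Python, NumPy, and PyTorch generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


seed_everything(2020)
first = torch.rand(3)

seed_everything(2020)
second = torch.rand(3)

# Same seed, same sequence of random numbers
print(torch.equal(first, second))  # True
```

Note that seeding alone may not make GPU training fully deterministic, since some CUDA operations are nondeterministic by default.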

Set the hyperparameters: a 2-dimensional embedding (so the results can be plotted directly), a batch size of 32, and 5 training epochs:

embedding_dims = 2
batch_size = 32
epochs = 5

Now, load the training and testing data using the read_csv() method from the “pandas” library before printing the first 5 rows of the data:

train_df = pd.read_csv('/content/train.csv')
test_df = pd.read_csv('/content/test.csv')

train_df.head()

Step 5: Design MNIST Class

Create the MNIST dataset class, which reshapes each row of the data frame into a 28x28 image and, for training data, builds the triplets needed by the loss:

class MNIST(Dataset):
    def __init__(self, df, train=True, transform=None):
        self.is_train = train
        self.transform = transform
        self.to_pil = transforms.ToPILImage()
       
        if self.is_train:           
            self.images = df.iloc[:, 1:].values.astype(np.uint8)
            self.labels = df.iloc[:, 0].values
            self.index = df.index.values
        else:
            self.images = df.values.astype(np.uint8)
       
    def __len__(self):
        return len(self.images)
   
    def __getitem__(self, item):
        anchor_img = self.images[item].reshape(28, 28, 1)
       
        if self.is_train:
            anchor_label = self.labels[item]

            positive_list = self.index[self.index!=item][self.labels[self.index!=item]==anchor_label]

            positive_item = random.choice(positive_list)
            positive_img = self.images[positive_item].reshape(28, 28, 1)
           
            negative_list = self.index[self.index!=item][self.labels[self.index!=item]!=anchor_label]
            negative_item = random.choice(negative_list)
            negative_img = self.images[negative_item].reshape(28, 28, 1)
           
            if self.transform:
                anchor_img = self.transform(self.to_pil(anchor_img))
                positive_img = self.transform(self.to_pil(positive_img))
                negative_img = self.transform(self.to_pil(negative_img))
           
            return anchor_img, positive_img, negative_img, anchor_label
       
        else:
            if self.transform:
                anchor_img = self.transform(self.to_pil(anchor_img))
            return anchor_img

The MNIST class extends Dataset and wraps the data frames (df) loaded in the previous step.
The constructor separates the labels (first column) from the pixel values (remaining columns) when train equals True, and transform holds the transformations to apply to the images.
The __len__(self) method returns the total number of images in the dataset.
The __getitem__() method returns only the anchor image for test data, and returns the anchor, positive, and negative images together with the anchor label for training data.
The positive is picked at random from the other images that share the anchor's label, and the negative from the images with a different label. These triplets are used to calculate the triplet loss of the model.
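The index filtering used to pick the positive and negative samples can be hard to read, so here is a small sketch of the same logic on a made-up label array. The labels are hypothetical, and the code assumes the data frame has a default 0..N-1 index, as the class above does.

```python
import random

import numpy as np

# Hypothetical labels for six images; index plays the role of df.index.values
labels = np.array([3, 7, 3, 1, 7, 3])
index = np.arange(len(labels))

item = 0                       # position of the anchor
anchor_label = labels[item]    # 3

# Mask that excludes the anchor itself
others = index != item

# Candidates with the same label (positives) and a different label (negatives)
positive_list = index[others][labels[others] == anchor_label]
negative_list = index[others][labels[others] != anchor_label]

print(positive_list)  # [2 5]
print(negative_list)  # [1 3 4]

# One positive and one negative are drawn at random for each anchor
positive_item = random.choice(positive_list)
negative_item = random.choice(negative_list)
```

Because the choice is random on every call, the same anchor is paired with different positives and negatives across epochs.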

Create the train_ds variable by instantiating the MNIST class on the training data with a transform that converts each image to a tensor. Then wrap it with DataLoader() and store the result in the train_loader variable:

train_ds = MNIST(train_df,
                train=True,
                transform=transforms.Compose([
                    transforms.ToTensor()
                ]))
train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True, num_workers=4)

Store the test dataset in the test_ds variable in the same way, and wrap it with DataLoader() in the test_loader variable:

test_ds = MNIST(test_df, train=False, transform=transforms.ToTensor())
test_loader = DataLoader(test_ds, batch_size=batch_size, shuffle=False, num_workers=4)

Step 6: Design Triplet Loss Class

To calculate the triplet loss of the model, define a custom loss module whose forward() method returns the value used for backpropagation after each batch. The TripletLoss class extends nn.Module from the torch library:

class TripletLoss(nn.Module):
    def __init__(self, margin=1.0):
        super(TripletLoss, self).__init__()
        self.margin = margin
       
    def calc_euclidean(self, x1, x2):
        return (x1 - x2).pow(2).sum(1)
   
    def forward(self, anchor: torch.Tensor, positive: torch.Tensor, negative: torch.Tensor) -> torch.Tensor:
        distance_positive = self.calc_euclidean(anchor, positive)
        distance_negative = self.calc_euclidean(anchor, negative)
        losses = torch.relu(distance_positive - distance_negative + self.margin)

        return losses.mean()

The constructor takes the margin (1.0 by default) and stores it on the instance.
The calc_euclidean() method returns the squared Euclidean distance between two batches of embeddings, that is, the sum of squared differences without the square root.
The distance_positive variable stores the distance from the anchor to the positive, and distance_negative stores the distance from the anchor to the negative.
The losses variable applies relu() to distance_positive - distance_negative + margin, so a triplet contributes zero loss once the negative is farther away than the positive by at least the margin. The mean over the batch is returned.
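One detail worth noting is that calc_euclidean() returns the squared distance. The sketch below, using made-up tensors, suggests how this can give a different value from the built-in nn.TripletMarginLoss, which uses the unsquared p-norm. It is only an illustration of the difference, not a recommendation of one form over the other.

```python
import torch
import torch.nn as nn

# Hypothetical embeddings with easy distances
anchor = torch.tensor([[0.0, 0.0]])
positive = torch.tensor([[0.0, 2.0]])   # distance 2.0, squared 4.0
negative = torch.tensor([[0.0, 2.5]])   # distance 2.5, squared 6.25
margin = 1.0

# Squared-distance version, mirroring the custom TripletLoss class
sq_pos = (anchor - positive).pow(2).sum(1)
sq_neg = (anchor - negative).pow(2).sum(1)
squared_loss = torch.relu(sq_pos - sq_neg + margin).mean()

# Built-in version with the plain Euclidean norm (p=2)
builtin_loss = nn.TripletMarginLoss(margin=margin, p=2)(anchor, positive, negative)

print(squared_loss.item())   # 4.0 - 6.25 + 1.0 < 0, so 0.0
print(builtin_loss.item())   # approximately 2.0 - 2.5 + 1.0 = 0.5
```

With squared distances the same margin value corresponds to a different geometric gap, so margins are not directly comparable between the two versions.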

Now, design the structure of the neural network with its dimensions and the layers containing the neurons connected to the next layers. The complete network is responsible for learning the hidden patterns from the datasets throughout multiple iterations called epochs:

class Network(nn.Module):
    def __init__(self, emb_dim=128):
        super(Network, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 5),
            nn.PReLU(),
            nn.MaxPool2d(2, stride=2),
            nn.Dropout(0.3),
            nn.Conv2d(32, 64, 5),
            nn.PReLU(),
            nn.MaxPool2d(2, stride=2),
            nn.Dropout(0.3)
        )
       
        self.fc = nn.Sequential(
            nn.Linear(64*4*4, 512),
            nn.PReLU(),
            nn.Linear(512, emb_dim)
        )
       
    def forward(self, x):
        x = self.conv(x)
        x = x.view(-1, 64*4*4)
        x = self.fc(x)
        # x = nn.functional.normalize(x)
        return x
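The 64*4*4 input size of the first linear layer follows from the layer settings. As a rough check, the sketch below traces the spatial size of a 28x28 input through the two convolution and pooling stages using the standard output-size formula with no padding. The helper conv_out() is a hypothetical name introduced here for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28
size = conv_out(size, kernel=5)             # Conv2d(1, 32, 5):      28 -> 24
size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2, stride=2): 24 -> 12
size = conv_out(size, kernel=5)             # Conv2d(32, 64, 5):     12 -> 8
size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2, stride=2): 8 -> 4

# 64 output channels, each a 4x4 feature map
flat_features = 64 * size * size
print(size, flat_features)  # 4 1024
```

If the input resolution or kernel sizes change, this number changes too, and both the view() call and the first nn.Linear() layer would need to be updated.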

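Define the weight initialization to apply to the network. The init_weights() function applies Kaiming (He) normal initialization to every Conv2d layer, which gives the model a reasonable starting point for training, while the linear layers keep their default initialization: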

def init_weights(m):
    if isinstance(m, nn.Conv2d):
        torch.nn.init.kaiming_normal_(m.weight)
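For intuition about what kaiming_normal_() does, the sketch below initializes one convolution layer and compares the standard deviation of its weights with sqrt(2 / fan_in), the value expected under the function's default arguments. The layer shape matches the second convolution of the network; the seed is arbitrary.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the printed numbers stable

conv = nn.Conv2d(32, 64, 5)
nn.init.kaiming_normal_(conv.weight)  # defaults: mode="fan_in", a=0

# fan_in = input channels * kernel height * kernel width
fan_in = 32 * 5 * 5
expected_std = math.sqrt(2.0 / fan_in)
actual_std = conv.weight.std().item()

print(expected_std)  # 0.05
print(actual_std)    # close to 0.05
```

Scaling the weights by the fan-in keeps the variance of the activations roughly stable from layer to layer at the start of training.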

Step 7: Building & Training Model

Build the model by integrating all the components like Network(), weights, TripletLoss(), and others that are configured or customized previously:

model = Network(embedding_dims)
model.apply(init_weights)
model = torch.jit.script(model).to(device)

optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = torch.jit.script(TripletLoss())

Now, start the training process. The outer for loop runs once per epoch, and the inner loop iterates over the batches of triplets. For each batch, the model embeds the anchor, positive, and negative images, the triplet loss is calculated, and backpropagation updates the weights. The mean loss is printed at the end of each epoch so the improvement can be tracked:

model.train()
for epoch in tqdm(range(epochs), desc="Epochs"):
    running_loss = []
    for step, (anchor_img, positive_img, negative_img, anchor_label) in enumerate(tqdm(train_loader, desc="Training", leave=False)):
        anchor_img = anchor_img.to(device)
        positive_img = positive_img.to(device)
        negative_img = negative_img.to(device)

        optimizer.zero_grad()
        anchor_out = model(anchor_img)
        positive_out = model(positive_img)
        negative_out = model(negative_img)

        loss = criterion(anchor_out, positive_out, negative_out)
        loss.backward()
        optimizer.step()

        running_loss.append(loss.cpu().detach().numpy())
    print("Epoch: {}/{} - Loss: {:.4f}".format(epoch+1, epochs, np.mean(running_loss)))

The following screenshot displays the number of iterations used to train the model and the loss value for each epoch:

Step 8: Save the Trained Model

Once the model is trained successfully, save the state dictionaries of the model and the optimizer to a file so that they can be reloaded later, for example for visualization or further training:

torch.save({"model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict()
          }, "trained_model.pth")
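Restoring a checkpoint follows the reverse path. The sketch below is self-contained: it uses a small stand-in nn.Linear model and a temporary file rather than the network and file name from this guide, and the dictionary keys are simply the ones chosen in this example. When loading a real checkpoint, the keys must match whatever was used when saving.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

# Stand-in model and optimizer (assumption: not the Network class from this guide)
model = nn.Linear(4, 2)
optimizer = optim.Adam(model.parameters(), lr=0.001)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_checkpoint.pth")

    # Save both state dictionaries under explicit keys
    torch.save({"model_state_dict": model.state_dict(),
                "optimizer_state_dict": optimizer.state_dict()}, path)

    # Recreate the objects, then load the saved state into them
    restored_model = nn.Linear(4, 2)
    restored_optimizer = optim.Adam(restored_model.parameters(), lr=0.001)

    checkpoint = torch.load(path, map_location="cpu")
    restored_model.load_state_dict(checkpoint["model_state_dict"])
    restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])

restored_model.eval()  # switch to evaluation mode before inference
print(torch.equal(restored_model.weight, model.weight))  # True
```

A state dictionary stores only parameters, so the model class has to be constructed with the same architecture before load_state_dict() is called.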

To inspect what the model has learned, compute the embedding of every training image and collect the matching labels. The model is switched to evaluation mode and gradient tracking is disabled, since no training happens in this step:

train_results = []
labels = []

model.eval()
with torch.no_grad():
    for img, _, _, label in tqdm(train_loader):
        train_results.append(model(img.to(device)).cpu().numpy())
        labels.append(label)

train_results = np.concatenate(train_results)
labels = np.concatenate(labels)
train_results.shape

Step 9: Plot the Results

A plot is a convenient way to check how the model has performed after training. Because the embedding has two dimensions, the following code draws a scatter plot of the embeddings with a different color for each class:

plt.figure(figsize=(15, 10), facecolor="azure")
for label in np.unique(labels):
    tmp = train_results[labels==label]
    plt.scatter(tmp[:, 0], tmp[:, 1], label=label)

plt.legend()
plt.show()

The model has performed pretty well: the embeddings of each class form clearly visible, separate clusters, as displayed in the following picture:

That’s all about calculating the triplet loss in the deep learning model and optimizing it with each iteration. The next sections or methods explain the basic built-in functions offered by the PyTorch environment that can be used by importing the torch library.

Method 2: Calculate Triplet Loss Using TripletMarginLoss() Function

The TripletMarginLoss() class computes the triplet loss for anchor, positive, and negative tensors using the p-norm distance between them. It measures relative similarity: the anchor should be closer to the positive than to the negative by at least the margin. To learn how to implement it in the PyTorch environment, follow these steps:

Step 1: Import Libraries

Import the torch library to set up the environment for calculating the triplet loss:

import torch

Print the version of the torch framework to verify that the session is ready for using torch functions:

print(torch.__version__)

Step 2: Implementing the Code

Once the torch library is set, simply use the following code to implement the TripletMarginLoss() method to find the loss value:

import torch.nn as nn

triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2, eps=1e-7)
anchor = torch.randn(100, 128, requires_grad=True)
pos = torch.randn(100, 128, requires_grad=True)
neg = torch.randn(100, 128, requires_grad=True)
result = triplet_loss(anchor, pos, neg)
result.backward()
result

The above code:

Imports the neural network module from the torch library under the alias nn.
Creates the triplet_loss variable and initializes it with TripletMarginLoss(), passing the margin, the norm degree p (2 for the Euclidean distance), and eps, a small constant for numerical stability.
Defines three tensors with random values to act as the anchor, positive, and negative inputs when calling the function.
Computes the loss, applies backpropagation to the result variable using the backward() method, and displays the loss value on the screen.
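To see what the built-in class computes, the sketch below reproduces its result by hand with F.pairwise_distance() and the max(d(a, p) - d(a, n) + margin, 0) formula. It is an illustration under the default settings (p=2, default eps, mean reduction, no swap); the random inputs and the seed are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed for repeatable random inputs

anchor = torch.randn(100, 128)
pos = torch.randn(100, 128)
neg = torch.randn(100, 128)
margin = 1.0

# Built-in loss with its default reduction ("mean")
builtin = nn.TripletMarginLoss(margin=margin, p=2)(anchor, pos, neg)

# Manual version: per-row Euclidean distances, then the hinge and the mean
d_ap = F.pairwise_distance(anchor, pos, p=2)
d_an = F.pairwise_distance(anchor, neg, p=2)
manual = torch.clamp(d_ap - d_an + margin, min=0.0).mean()

# The two values should agree up to floating-point error
print(builtin.item(), manual.item())
```

With random inputs the anchor is about equally far from the positive and the negative, so the loss tends to sit near the margin value.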

Method 3: Calculate Triplet Loss Using Distance Function

This method uses the TripletMarginWithDistanceLoss() class, which accepts a distance_function argument: any nonnegative, real-valued function that measures how far apart two tensors are. The loss compares the anchor-positive distance with the anchor-negative distance. The following code reuses the torch and nn imports from Method 2 and passes nn.PairwiseDistance() as the distance function:

embedding = nn.Embedding(1000, 128)
anchor_ids = torch.randint(0, 1000, (1,))
pos_ids = torch.randint(0, 1000, (1,))
neg_ids = torch.randint(0, 1000, (1,))
anchor = embedding(anchor_ids)
pos = embedding(pos_ids)
neg = embedding(neg_ids)

triplet_loss = \
    nn.TripletMarginWithDistanceLoss(distance_function=nn.PairwiseDistance())
result = triplet_loss(anchor, pos, neg)
result

The code:

Creates an nn.Embedding() layer with 1000 entries of 128 dimensions each, then looks up randomly chosen IDs to build the anchor, positive, and negative tensors.
Invokes the TripletMarginWithDistanceLoss() class with nn.PairwiseDistance() to measure the distance between the embedded pairs.
Evaluates the result variable to display the loss value computed from the three tensors.
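Any callable that maps two tensors to a nonnegative distance can be plugged in. As one hedged illustration, the sketch below uses a cosine distance, 1 - cosine similarity, on hand-picked vectors. The function name cosine_distance() and the margin of 1.5 are choices made for this example only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def cosine_distance(x, y):
    """Illustrative distance: 0 for identical directions, 1 for orthogonal ones."""
    return 1.0 - F.cosine_similarity(x, y)


# Hypothetical embeddings
anchor = torch.tensor([[1.0, 0.0]])
pos = torch.tensor([[1.0, 0.0]])   # same direction as the anchor: distance 0
neg = torch.tensor([[0.0, 1.0]])   # orthogonal to the anchor: distance 1

loss_fn = nn.TripletMarginWithDistanceLoss(
    distance_function=cosine_distance, margin=1.5)
loss = loss_fn(anchor, pos, neg)

print(loss.item())  # approximately 0 - 1 + 1.5 = 0.5
```

Cosine distance ignores the length of the embeddings, which can be useful when only their direction carries meaning.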

Method 4: Calculate Triplet Loss Using Custom Function

The user can simply build their custom triplet loss method in the torch environment according to their model to optimize its performance:

def l_infinity(x1, x2):
    return torch.max(torch.abs(x1 - x2), dim=1).values
triplet_loss = (
    nn.TripletMarginWithDistanceLoss(distance_function=l_infinity, margin=1.5))
result = triplet_loss(anchor, pos, neg)
result

The code:

Defines the l_infinity() function, which takes two tensors and applies torch.abs() inside torch.max() to return the largest absolute difference along dimension 1 (the L-infinity distance).
Passes l_infinity as the distance_function argument of TripletMarginWithDistanceLoss(), with a margin of 1.5 as the minimum gap between the two distances.
Reuses the anchor, pos, and neg tensors from Method 3 to compute the loss and display its value on the screen.
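The custom distance is easy to verify by hand on small inputs. The sketch below repeats the l_infinity() definition so that it runs on its own, and uses made-up tensors in place of the embedding lookups.

```python
import torch
import torch.nn as nn


def l_infinity(x1, x2):
    # Largest absolute coordinate difference in each row
    return torch.max(torch.abs(x1 - x2), dim=1).values


# Hypothetical embeddings
anchor = torch.tensor([[0.0, 0.0]])
pos = torch.tensor([[1.0, 2.0]])   # max(|1|, |2|) = 2
neg = torch.tensor([[3.0, 1.0]])   # max(|3|, |1|) = 3

d_ap = l_infinity(anchor, pos)
d_an = l_infinity(anchor, neg)

loss_fn = nn.TripletMarginWithDistanceLoss(
    distance_function=l_infinity, margin=1.5)
loss = loss_fn(anchor, pos, neg)

print(d_ap.item(), d_an.item(), loss.item())  # 2.0 3.0 0.5
```

The loss is max(2 - 3 + 1.5, 0) = 0.5, since the negative is only 1 unit farther than the positive while the margin asks for 1.5.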

That’s all about the process of calculating the triplet loss in PyTorch.

Conclusion

To sum up, the triplet loss compares three points: the anchor (the reference sample), the positive (a sample similar to the anchor), and the negative (a sample dissimilar to the anchor). It pulls the anchor toward the positive and pushes it away from the negative by at least the margin. PyTorch offers the TripletMarginLoss() class and the TripletMarginWithDistanceLoss() class, which accepts a custom distance function. This guide has also trained a model with a custom triplet loss class to optimize its performance and plotted the resulting embeddings.