Table of Contents
Training and Making Predictions with Siamese Networks and Triplet Loss
Configuring Your Development Environment
Having Problems Configuring Your Development Environment?
Project Structure
Training Our Siamese Network Model with Triplet Loss
Making Predictions with Our Siamese Network Based Face Recognition Model
Training and Making Predictions with Siamese Networks and Triplet Loss
In this tutorial, we will learn to train our Siamese network based face recognition application using Keras and TensorFlow. Furthermore, we will discuss how we can use our model to make predictions in real-time.
In the previous tutorial of this series, we tried to understand the formulation of triplet loss. We discussed how it can be used to learn an embedding space where “similar faces” (i.e., from the same person) reside close to each other and “dissimilar faces” (i.e., from different people) reside farther apart. Additionally, we discussed a typical Siamese network pipeline and how it can be used to build our face recognition model.
Furthermore, we implemented the triplet loss and developed our Siamese network based face recognition pipeline in Keras and TensorFlow.
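As a quick refresher, given an anchor image a, a positive image p (from the same person), and a negative image n (from a different person), the triplet loss takes the form max(d(a, p) - d(a, n) + margin, 0), where d(·, ·) is the distance between the corresponding embeddings and margin is the minimum separation we want to enforce between the positive and negative pairs.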
In this tutorial, we will take this further and learn how to train our face recognition model using Keras and TensorFlow. Once our model is trained, we will use it to predict new unseen faces in real-time.
This lesson is the 4th in a 5-part series on Siamese networks and their application in face recognition:
1. Face Recognition with Siamese Networks, Keras, and TensorFlow
2. Building a Dataset for Triplet Loss with Keras and TensorFlow
3. Triplet Loss with Keras and TensorFlow
4. Training and Making Predictions with Siamese Networks and Triplet Loss (this tutorial)
5. Evaluating Siamese Network Accuracy (ROC, Precision, and Recall) with Keras and TensorFlow
To learn how to train and make predictions with Siamese networks and triplet loss, just keep reading.
Training and Making Predictions with Siamese Networks and Triplet Loss
In the second part of this series, we developed the modules required to build the data pipeline for our face recognition application. Furthermore, in the previous tutorial, we developed modules to build our Siamese Model and triplet loss function. In this tutorial, we will put everything together and build our end-to-end face recognition application using the modules that we built previously. Additionally, we will learn to train our end-to-end face recognition model and discuss how we can make predictions using it in real-time.
For this tutorial, we will use Keras and TensorFlow, as we have done in the previous parts of this series. Keras and TensorFlow provide various functionalities that allow us to elegantly put all the modules together and develop our end-to-end pipeline. This provides a simple and intuitive way in which the different parts of our application can communicate and work in tandem to create an efficient and effective face recognition application. Furthermore, Keras provides a simple API that helps us implement, compile, train, and save our model with minimal code.
Figure 1 depicts the overview of our face recognition pipeline and shows how the modules we built in the previous parts of this series work together to develop our final end-to-end application.
Let us revisit our modules and understand the structure and flow of our application.
First, we develop the data pipeline, as shown in Figure 1. Next, the Data Generator (created using the TripletGenerator Class) is used to create our training and validation Dataset with the help of the tf.data.Dataset functionality provided by TensorFlow. This dataset is then used to create our DataLoaders, allowing us to apply pre-processing transformations using the MapFunction Class and generate batches of data samples. Finally, the data pipeline returns two data loaders (i.e., trainDs and valDs for training and validation, respectively).
Next, we develop our Siamese network pipeline. We create our embedding module with the help of the get_embedding_module() function, which we had defined in earlier tutorials. Then, we use the embedding module to embed the anchor, positive, and negative images to build our Siamese network using the get_siamese_network() function. Finally, we pass our Siamese network to the SiameseModel Class which implements the triplet loss and training and test step code.
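To make this concrete, the following is a condensed, illustrative sketch of how such a Siamese network can be assembled; the canonical get_siamese_network() implementation lives in model.py from the previous tutorial, so treat the exact layer wiring and distance computation below as an assumption rather than the definitive version:

# illustrative sketch only: the canonical get_siamese_network() is
# defined in pyimagesearch/model.py from the previous tutorial
import tensorflow as tf
from tensorflow import keras

def build_siamese_network_sketch(imageSize, embeddingModel):
    # three image inputs (assuming imageSize is a (height, width) tuple)
    anchorInput = keras.Input(shape=imageSize + (3,), name="anchor")
    positiveInput = keras.Input(shape=imageSize + (3,), name="positive")
    negativeInput = keras.Input(shape=imageSize + (3,), name="negative")
    # embed all three images with the *shared* embedding module
    anchorEmbedding = embeddingModel(anchorInput)
    positiveEmbedding = embeddingModel(positiveInput)
    negativeEmbedding = embeddingModel(negativeInput)
    # squared euclidean distances between the anchor-positive and
    # anchor-negative embedding pairs
    apDistance = tf.reduce_sum(
        tf.square(anchorEmbedding - positiveEmbedding), axis=-1)
    anDistance = tf.reduce_sum(
        tf.square(anchorEmbedding - negativeEmbedding), axis=-1)
    # the network maps a triplet of images to the two distances
    return keras.Model(
        inputs=[anchorInput, positiveInput, negativeInput],
        outputs=[apDistance, anDistance])

Note that the same embedding module processes all three inputs, which is precisely what makes the network “Siamese”: the anchor, positive, and negative images are all mapped into the same embedding space by shared weights.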
In the end, we compile and train our model using Keras and finally save our trained model so we can use it in the inference phase for making real-time predictions.
Now that we have discussed the overview of our pipeline, let us dive into the code to train and make predictions with our Siamese network based face recognition application.
Configuring Your Development Environment
To follow this guide, you need to have the TensorFlow and OpenCV libraries installed on your system.
Luckily, both TensorFlow and OpenCV are pip-installable:
$ pip install tensorflow
$ pip install opencv-contrib-python
If you need help configuring your development environment for OpenCV, we highly recommend that you read our pip install OpenCV guide — it will have you up and running in a matter of minutes.
Having Problems Configuring Your Development Environment?
All that said, are you:
Short on time?
Learning on your employer’s administratively locked system?
Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
Ready to run the code right now on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides that are pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
Project Structure
We first need to review our project directory structure.
Start by accessing the “Downloads” section of this tutorial to retrieve the source code and example images.
├── crop_faces.py
├── face_crop_model
│   ├── deploy.prototxt.txt
│   └── res10_300x300_ssd_iter_140000.caffemodel
├── inference.py
├── pyimagesearch
│   ├── config.py
│   ├── dataset.py
│   └── model.py
└── train.py
In the previous tutorial, we presented a step-by-step walkthrough of our model.py from the pyimagesearch folder, which allows us to implement the triplet loss function and build our Siamese network model.
In this tutorial, we will discuss in detail the train.py file, which implements the code to train our face recognition pipeline, and the inference.py file, which will help us make predictions using our Siamese network based face recognition application.
Training Our Siamese Network Model with Triplet Loss
Now that we have discussed the overview of our face recognition pipeline and the functions performed by the modules we have built, let us put everything together and train our Siamese network based face recognition pipeline using Keras and TensorFlow.
We open our train.py file and get started.
# USAGE
# python train.py

# import the necessary packages
from pyimagesearch.dataset import TripletGenerator
from pyimagesearch.model import get_embedding_module
from pyimagesearch.model import get_siamese_network
from pyimagesearch.model import SiameseModel
from pyimagesearch.dataset import MapFunction
from pyimagesearch import config
from tensorflow import keras
import tensorflow as tf
import os

# create the data input pipeline for train and val dataset
print("[INFO] building the train and validation generators...")
trainTripletGenerator = TripletGenerator(
    datasetPath=config.TRAIN_DATASET)
valTripletGenerator = TripletGenerator(
    datasetPath=config.TRAIN_DATASET)
print("[INFO] building the train and validation `tf.data` dataset...")
trainTfDataset = tf.data.Dataset.from_generator(
    generator=trainTripletGenerator.get_next_element,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)
valTfDataset = tf.data.Dataset.from_generator(
    generator=valTripletGenerator.get_next_element,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)
First, we import the important modules we built earlier to train our face recognition model on Lines 5-13. In previous tutorials, we noted that the pyimagesearch folder contains the code for the dataset module (dataset.py), the model definition (model.py), and the configuration file (config.py), which we discussed in detail. We will now use these modules to train our face recognition application.
We start by importing the TripletGenerator class from the dataset module (Line 5), along with the get_embedding_module and get_siamese_network functions and the SiameseModel class from the model definition (Lines 6-8). We also import the MapFunction class and the config file on Lines 9 and 10, respectively. Finally, we import the keras, tensorflow, and os packages on Lines 11-13.
Next, we develop our data pipeline to allow us to sample batches for training and validation. We use the TripletGenerator class (that we developed earlier) to define the training data generator (i.e., trainTripletGenerator) and validation data generator (i.e., valTripletGenerator) on Lines 17-19. The TripletGenerator class takes as input the path to the respective dataset (i.e., config.TRAIN_DATASET) as shown on Line 20.
Now that we have defined our data generators, we use the tf.data.Dataset.from_generator functionality to define our training and validation datasets on Lines 22-37. First, we define our training dataset (i.e., trainTfDataset) on Lines 22-29. Note that tf.data.Dataset.from_generator takes as input a callable generator function (i.e., trainTripletGenerator.get_next_element), whose outputs must be compatible with the output format defined by the output_signature argument.
Similarly, we create the validation dataset (i.e., valTfDataset) using tf.data.Dataset.from_generator on Lines 30-37.
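To make the output_signature requirement concrete, here is a rough sketch of what a compatible get_next_element generator could look like; the actual implementation lives in dataset.py (covered earlier in this series), so the directory-walking logic below is only an assumption for illustration:

# illustrative sketch: a generator compatible with the output_signature
# above, yielding triplets of image *file paths* (three scalar strings);
# decoding and resizing happen later in MapFunction
import os
import random

def get_next_element_sketch(datasetPath):
    # assume datasetPath contains one subdirectory per person, each
    # holding at least two face images
    peopleNames = os.listdir(datasetPath)
    while True:
        # choose one person for the anchor/positive pair and a
        # different person for the negative sample
        anchorName = random.choice(peopleNames)
        negativeName = random.choice(
            [name for name in peopleNames if name != anchorName])
        anchorDir = os.path.join(datasetPath, anchorName)
        negativeDir = os.path.join(datasetPath, negativeName)
        (anchorImage, positiveImage) = random.sample(
            os.listdir(anchorDir), 2)
        negativeImage = random.choice(os.listdir(negativeDir))
        yield (
            os.path.join(anchorDir, anchorImage),
            os.path.join(anchorDir, positiveImage),
            os.path.join(negativeDir, negativeImage),
        )

Back in train.py, we next pre-process the sampled triplets and assemble our model.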
# preprocess the images
mapFunction = MapFunction(imageSize=config.IMAGE_SIZE)
print("[INFO] building the train and validation `tf.data` pipeline...")
trainDs = (trainTfDataset
    .map(mapFunction)
    .shuffle(config.BUFFER_SIZE)
    .batch(config.BATCH_SIZE)
    .prefetch(config.AUTO)
)
valDs = (valTfDataset
    .map(mapFunction)
    .batch(config.BATCH_SIZE)
    .prefetch(config.AUTO)
)

# build the embedding module and the siamese network
print("[INFO] building the siamese model...")
embeddingModule = get_embedding_module(imageSize=config.IMAGE_SIZE)
siameseNetwork = get_siamese_network(
    imageSize=config.IMAGE_SIZE,
    embeddingModel=embeddingModule,
)
siameseModel = SiameseModel(
    siameseNetwork=siameseNetwork,
    margin=0.5,
    lossTracker=keras.metrics.Mean(name="loss"),
)

# compile the siamese model
siameseModel.compile(
    optimizer=keras.optimizers.Adam(config.LEARNING_RATE)
)

# train and validate the siamese model
print("[INFO] training the siamese model...")
siameseModel.fit(
    trainDs,
    steps_per_epoch=config.STEPS_PER_EPOCH,
    validation_data=valDs,
    validation_steps=config.VALIDATION_STEPS,
    epochs=config.EPOCHS,
)

# check if the output directory exists, if it doesn't, then
# create it
if not os.path.exists(config.OUTPUT_PATH):
    os.makedirs(config.OUTPUT_PATH)

# save the siamese network to disk
modelPath = config.MODEL_PATH
print(f"[INFO] saving the siamese network to {modelPath}...")
keras.models.save_model(
    model=siameseModel.siameseNetwork,
    filepath=modelPath,
    include_optimizer=False,
)
On Line 40, we define the pre-processing operations that we want to apply to our data samples using the MapFunction, which takes the config.IMAGE_SIZE parameter as an argument.
Finally, on Lines 42-52, we use the training and validation dataset (i.e., trainTfDataset and valTfDataset) to define our training and validation data loaders (i.e., trainDs and valDs). Note that TensorFlow allows us to apply different functionalities to our generated data samples.
The map functionality (which takes as its argument our mapFunction class) applies the pre-processing transformations to our data samples.
The shuffle functionality (which takes as its argument config.BUFFER_SIZE) randomly samples elements from a buffer of config.BUFFER_SIZE elements.
The batch functionality (which takes as its argument config.BATCH_SIZE) allows us to sample batches of data, with the number of elements per batch defined by config.BATCH_SIZE.
The prefetch functionality directs TensorFlow to prepare later elements while the current elements are being processed.
Now that we have created the data pipeline, we can define our Siamese Model. We first build our embedding module using the get_embedding_module function, which takes as input the imageSize (Line 56).
Next, we use the get_siamese_network function, which takes as an argument the imageSize and embeddingModule to build and return our siameseNetwork (Lines 57-60).
Now that we have our siameseNetwork, we use the SiameseModel class to build our Siamese network based face recognition model (Lines 61-65). The SiameseModel class takes as arguments the siameseNetwork, the margin distance we discussed earlier, and a keras.metrics.Mean(name="loss") metric to track the training loss.
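As a reminder of what happens inside SiameseModel, the following is a condensed sketch of its training step; the full class was implemented in the previous tutorial, so the exact method names and bookkeeping below are illustrative assumptions rather than the canonical code:

# condensed, illustrative sketch of the SiameseModel training logic;
# the canonical class lives in pyimagesearch/model.py
import tensorflow as tf
from tensorflow import keras

class SiameseModelSketch(keras.Model):
    def __init__(self, siameseNetwork, margin, lossTracker):
        super().__init__()
        self.siameseNetwork = siameseNetwork
        self.margin = margin
        self.lossTracker = lossTracker

    def _compute_loss(self, inputs):
        # the wrapped network returns the anchor-positive and
        # anchor-negative embedding distances for the batch
        (apDistance, anDistance) = self.siameseNetwork(inputs)
        # triplet loss: push apDistance below anDistance by at least
        # `margin`, clipping the loss at zero
        loss = tf.maximum(apDistance - anDistance + self.margin, 0.0)
        return tf.reduce_mean(loss)

    def train_step(self, inputs):
        with tf.GradientTape() as tape:
            loss = self._compute_loss(inputs)
        # backpropagate through the shared embedding module
        gradients = tape.gradient(
            loss, self.siameseNetwork.trainable_variables)
        self.optimizer.apply_gradients(
            zip(gradients, self.siameseNetwork.trainable_variables))
        self.lossTracker.update_state(loss)
        return {"loss": self.lossTracker.result()}

    @property
    def metrics(self):
        # ensures the loss tracker is reset at the start of each epoch
        return [self.lossTracker]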
We can now compile our model using siameseModel.compile and use the Adam optimizer with the learning rate equal to config.LEARNING_RATE, as shown on Lines 68-70.
Finally, we use the fit functionality to train our Siamese network based face recognition model (Lines 74-80). This takes as input the training data loader (i.e., trainDs), the steps_per_epoch, the validation data loader (i.e., valDs), the number of validation steps (i.e., config.VALIDATION_STEPS), and the total number of epochs (i.e., config.EPOCHS).
Next, we prepare to save our model. We check if the output directory exists, and if it does not, we create it (Lines 84 and 85). On Line 88, we define the modelPath, and on Lines 90-94, we use the keras.models.save_model function to save our trained model.
Making Predictions with Our Siamese Network Based Face Recognition Model
Now that we have discussed the code required to train our model, let us implement the code to make predictions in real-time with our trained Siamese network based face recognition application.
We start by opening the inference.py file.
# USAGE
# python inference.py

# import the necessary packages
from pyimagesearch.dataset import TripletGenerator
from pyimagesearch.dataset import MapFunction
from pyimagesearch.model import SiameseModel
from matplotlib import pyplot as plt
from pyimagesearch import config
from tensorflow import keras
import tensorflow as tf
import os

# create the data input pipeline for test dataset
print("[INFO] building the test generator...")
testTripletGenerator = TripletGenerator(
    datasetPath=config.TEST_DATASET)
print("[INFO] building the test `tf.data` dataset...")
testTfDataset = tf.data.Dataset.from_generator(
    generator=testTripletGenerator.get_next_element,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)

mapFunction = MapFunction(imageSize=config.IMAGE_SIZE)
testDs = (testTfDataset
    .map(mapFunction)
    .batch(4)
    .prefetch(config.AUTO)
)
As always, we first import the necessary modules like TripletGenerator, MapFunction, and SiameseModel on Lines 5-7. Additionally, we import the necessary packages like pyplot (from matplotlib), config, keras, tensorflow, and os, as shown on Lines 8-12.
We start by using the TripletGenerator class to define the test data generator (i.e., testTripletGenerator), which takes as input the path to the test dataset (i.e., config.TEST_DATASET), as shown on Lines 16 and 17.
Then, we use the tf.data.Dataset.from_generator function to define our test dataset (i.e., testTfDataset). Similar to what we discussed for the training pipeline, this function takes as inputs the test data generator function testTripletGenerator.get_next_element and the output_signature, as shown on Lines 19-26.
Similar to what we did in the training phase, we now use the MapFunction class to define the pre-processing transformations for the test set (Line 27) and define our test data loader testDs (Lines 28-32). Here, we use the map functionality to apply the pre-processing transformations and use a batch size of 4, as shown.
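For reference, the pre-processing performed by MapFunction essentially boils down to reading each file path in the triplet, decoding the image, and resizing it; a rough sketch (illustrative only, the real class lives in dataset.py) might look like this:

# illustrative sketch of the triplet pre-processing; the canonical
# MapFunction is defined in pyimagesearch/dataset.py
import tensorflow as tf

class MapFunctionSketch:
    def __init__(self, imageSize):
        self.imageSize = imageSize

    def decode_and_resize(self, imagePath):
        # read the raw bytes, decode (assuming JPEG inputs), scale
        # pixels to [0, 1], and resize to the target size
        image = tf.io.read_file(imagePath)
        image = tf.image.decode_jpeg(image, channels=3)
        image = tf.image.convert_image_dtype(image, tf.float32)
        image = tf.image.resize(image, self.imageSize)
        return image

    def __call__(self, anchor, positive, negative):
        # apply the same transformation to all three file paths
        return (
            self.decode_and_resize(anchor),
            self.decode_and_resize(positive),
            self.decode_and_resize(negative),
        )

With that refresher in mind, let us load our trained model and run inference.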
# load the siamese network from disk and build the siamese model
modelPath = config.MODEL_PATH
print(f"[INFO] loading the siamese network from {modelPath}...")
siameseNetwork = keras.models.load_model(filepath=modelPath)
siameseModel = SiameseModel(
    siameseNetwork=siameseNetwork,
    margin=0.5,
    lossTracker=keras.metrics.Mean(name="loss"),
)

# load the test data
(anchor, positive, negative) = next(iter(testDs))
(apDistance, anDistance) = siameseModel((anchor, positive, negative))

plt.figure(figsize=(10, 10))
rows = 4
for row in range(rows):
    plt.subplot(rows, 3, row * 3 + 1)
    plt.imshow(anchor[row])
    plt.axis("off")
    plt.title("Anchor image")
    plt.subplot(rows, 3, row * 3 + 2)
    plt.imshow(positive[row])
    plt.axis("off")
    plt.title(f"Positive distance: {apDistance[row]:0.2f}")
    plt.subplot(rows, 3, row * 3 + 3)
    plt.imshow(negative[row])
    plt.axis("off")
    plt.title(f"Negative distance: {anDistance[row]:0.2f}")

# check if the output directory exists, if it doesn't, then
# create it
if not os.path.exists(config.OUTPUT_PATH):
    os.makedirs(config.OUTPUT_PATH)

# save the inference image to disk
outputImagePath = config.OUTPUT_IMAGE_PATH
print(f"[INFO] saving the inference image to {outputImagePath}...")
plt.savefig(fname=outputImagePath)
Next, we define the modelPath, where our trained Siamese model is stored (Line 35), and use the keras.models.load_model function to load our model (i.e., siameseNetwork) (Line 37). Now that we have our pre-trained siameseNetwork, we use it to build our siameseModel using the SiameseModel class (Lines 38-42).
Now that we have defined our test data pipeline, we can sample from the test set and see our face recognition model in action.
We use the iter() method to convert our testDs to an iterator and then use the next() method to sample a batch of test data. We store the output as a tuple (i.e., (anchor, positive, negative)) (Line 45).
We then pass these samples as input to the siameseModel and get the distance between the anchor and positive samples and the distance between the anchor and negative samples (i.e., (apDistance, anDistance)) (Line 46).
We can now create a plot using matplotlib to visualize our samples. We start by defining our figure with figsize=(10, 10) using the plt.figure() function, as shown on Line 47. We also define the number of rows in our plot, which is equal to the batch size of our test loader (Line 48).
Next, we iterate over the different rows, and for each row, we create a subplot for the anchor image, positive image, and the negative image, as shown on Lines 49-61.
Finally, we prepare to save our inference images. We check if the output directory where we want to store our inference output exists, and if it doesn’t, then we create it (Lines 65 and 66).
We then define the outputImagePath (Line 69) and use the plt.savefig function to save our inference image (Line 71).
What’s next? I recommend PyImageSearch University.
69 total classes • 73 hours of on-demand code walkthrough videos • Last updated: March 2023
★★★★★ 4.84 (128 Ratings) • 15,800+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That’s not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that’s exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you’re serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you’ll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you’ll find:
✓ 74 courses on essential computer vision, deep learning, and OpenCV topics
✓ 74 Certificates of Completion
✓ 84 hours of on-demand video
✓ Brand new courses released regularly, ensuring you can keep up with state-of-the-art techniques
✓ Pre-configured Jupyter Notebooks in Google Colab
✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
✓ Access to centralized code repos for all 500+ tutorials on PyImageSearch
✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
✓ Access on mobile, laptop, desktop, etc.
Summary
In this tutorial, we discussed how to train our Siamese network based face recognition model using Keras and TensorFlow. Specifically, we tried to understand how the modules we built in the previous parts of this series come together to form our face recognition application.
Furthermore, we discussed and implemented the code to predict new unseen face images in real-time using our trained face recognition model.
In the upcoming tutorials of this series, we will evaluate the performance of our face recognition model using different metrics.
Citation Information
Chandhok, S. “Training and Making Predictions with Siamese Networks and Triplet Loss,” PyImageSearch, P. Chugh, A. R. Gosthipaty, S. Huot, K. Kidriavsteva, R. Raha, and A. Thanki, eds., 2023, https://pyimg.co/avjyi
@incollection{Chandhok_2023_training_and_making,
author = {Shivam Chandhok},
title = {Training and Making Predictions with Siamese Networks and Triplet Loss},
booktitle = {PyImageSearch},
editor = {Puneet Chugh and Aritra Roy Gosthipaty and Susan Huot and Kseniia Kidriavsteva and Ritwik Raha and Abhishek Thanki},
year = {2023},
url = {https://pyimg.co/avjyi},
}
Want free GPU credits to train models?
We used Jarvislabs.ai, a GPU cloud, for all the experiments.
We are proud to offer PyImageSearch University students $20 worth of Jarvislabs.ai GPU cloud credits. Join PyImageSearch University and claim your $20 credit here.
In Deep Learning, we need to train Neural Networks. These Neural Networks can be trained on a CPU but take a lot of time. Moreover, sometimes these networks do not even fit (run) on a CPU.
To overcome this problem, we use GPUs, which can train your Neural Network quickly. The problem is that GPUs are expensive and become outdated quickly, so you don’t want to buy one and use it only occasionally. Cloud GPUs let you use a GPU and only pay for the time you are running it. It’s a brilliant idea that saves you money.
JarvisLabs provides the best-in-class GPUs, and PyImageSearch University students get between 10-50 hours on a world-class GPU (time depends on the specific GPU you select).
This gives you a chance to test-drive a monstrously powerful GPU on any of our tutorials in a jiffy. So join PyImageSearch University today and try it for yourself.
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), simply enter your email address in the form below!
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you’ll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!