Face recognition is pervasive in our society. These systems can help many organizations improve their work environments and even their working methods.
In cases where the individual cooperates with the face recognition system, good results are easy to obtain, since all the system variables are controlled. However, in situations where the individual does not cooperate with the system, several factors degrade the images and consequently lead to poor results. For example, low-resolution photos and the use of glasses make it more difficult to recognize individuals. Therefore, in this article, we will evaluate whether face alignment contributes to better results in this type of environment.
After this article, you will be able to build a face recognition system and measure the impact of face alignment on its results.
Theoretically, if the images are aligned, training the model will be more successful because the centre of the face will be in the same position across all the images in a dataset.
Also, with aligned inputs, the vector of face embeddings, the output of our model, captures the parts of the face at consistent positions. For example, if a face is oval, it may produce embedding values in places where a rounder face does not. This way, when calculating the distance or similarity between two faces, we get a more accurate answer to whether they are the same person, as sketched below.
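As an illustration, here is a minimal sketch of this comparison step, assuming emb_a and emb_b are the embedding vectors the model produces for two face images (the 512-dimensional size and the 0.5 threshold are placeholders, not values from the article):

import numpy as np

def cosine_similarity(emb_a, emb_b):
    # Similarity of two embeddings; 1.0 means identical direction
    return np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))

def euclidean_distance(emb_a, emb_b):
    # Distance between two embeddings; 0.0 means identical vectors
    return np.linalg.norm(emb_a - emb_b)

# Placeholder embeddings; in practice these come from the trained model
emb_a = np.random.rand(512)
emb_b = np.random.rand(512)

# Declare the two faces the same person when the similarity exceeds a
# threshold tuned on a validation set (0.5 here is only illustrative)
same_person = cosine_similarity(emb_a, emb_b) > 0.5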
For training, we are going to use CASIA-WebFace. CASIA-WebFace contains 494,414 images spread across 10,575 real identities. It was also necessary to split the dataset into two sets: 80% for training and 20% for validation, as sketched below.
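A minimal sketch of this split, assuming the dataset is stored as one sub-directory per identity (the path below is hypothetical):

import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "casia-webface/",           # hypothetical dataset path
    validation_split=0.2, subset="training", seed=42,
    image_size=(112, 112), batch_size=64)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "casia-webface/",
    validation_split=0.2, subset="validation", seed=42,
    image_size=(112, 112), batch_size=64)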
For testing, we will use Labeled Faces in the Wild (LFW), available here. LFW contains 13,233 images of 5,749 identities and is commonly used to benchmark face recognition models. In addition, this dataset provides a list of pairs that will be very useful for performing face verification in the different types of tests.
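For illustration, here is a minimal sketch of parsing that pairs list. The standard LFW pairs.txt starts with a header line, followed by matched pairs (name idx1 idx2) and mismatched pairs (name1 idx1 name2 idx2); the file path below is hypothetical.

def read_pairs(path):
    pairs = []
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.strip().split()
            if len(fields) == 3:    # same person: two images of one identity
                pairs.append((fields[0], int(fields[1]), fields[0], int(fields[2]), True))
            elif len(fields) == 4:  # different people
                pairs.append((fields[0], int(fields[1]), fields[2], int(fields[3]), False))
    return pairs

pairs = read_pairs("lfw/pairs.txt")  # hypothetical path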
To test this theory, we need to train one model with an aligned dataset and another model with an unaligned dataset. For both trainings, we will use the CASIA-WebFace dataset.
To do the first training, we align the entire dataset and send the aligned faces as input to the CNN, as shown in the figure below:
For the second training, the pipeline is the same as in the figure above, except that the images passed as input to the CNN are not aligned. We use the same dataset as in the previous training.
To test our model, we want to determine the ROC Curve and accuracy and analyze these results to see the impact of face alignment.
For the first experiment, we did not align the test dataset; the model used in this test was trained on the unaligned dataset.
For the second experiment, we aligned the entire test dataset and then passed each element of the pairs list as input to the CNN. With this, it is possible to determine the ROC curve and the accuracy, as sketched below.
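Here is a minimal sketch of this evaluation step, using scikit-learn to compute the ROC curve and picking the verification threshold that maximizes TPR - FPR. The similarities and labels arrays are placeholders that, in practice, come from comparing the embeddings of each LFW pair:

import numpy as np
from sklearn.metrics import roc_curve, auc

# Placeholder scores and ground truth; in practice, one similarity score per
# LFW pair and a label of 1 for same-person pairs, 0 otherwise
similarities = np.random.rand(6000)
labels = np.random.randint(0, 2, 6000)

fpr, tpr, thresholds = roc_curve(labels, similarities)
roc_auc = auc(fpr, tpr)

# Accuracy at the threshold that best separates the two classes
best = np.argmax(tpr - fpr)
predictions = (similarities >= thresholds[best]).astype(int)
accuracy = np.mean(predictions == labels)
print(f"AUC: {roc_auc:.4f}, accuracy: {accuracy:.4f}")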
Pipeline for testing with the aligned dataset
For this experiment, we will use a ResNet50 as the backbone with a softmax output layer to train the model. The model was implemented using TensorFlow and Keras, as shown below:
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import BatchNormalization, GlobalAveragePooling2D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras import regularizers

# IMAGE_SIZE = size of the input images, in this case [112, 112]
IMAGE_SIZE = [112, 112]
numClasses = 10575  # number of identities in CASIA-WebFace

# ResNet50 backbone pre-trained on ImageNet, without its classification top
backbone = ResNet50(input_shape=IMAGE_SIZE + [3], weights='imagenet', include_top=False)

# Make all backbone layers trainable
for layer in backbone.layers:
    layer.trainable = True
backbone.summary()

# Classification head with a softmax over the identities
x = BatchNormalization()(backbone.output)
x = GlobalAveragePooling2D()(x)
x = Dropout(rate=0.5)(x)
x = Flatten()(x)
prediction = Dense(numClasses, activation='softmax', activity_regularizer=regularizers.l2(0.01))(x)

# Create the model object
model = Model(inputs=backbone.input, outputs=prediction)
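For completeness, here is a minimal sketch of how this model could be compiled and trained. The article does not state the optimizer, loss configuration, or number of epochs, so the values below are illustrative assumptions; train_ds and val_ds are the training and validation splits described earlier.

# Illustrative training setup; the optimizer and hyperparameters are
# assumptions, not values taken from the article
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=30)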
The full code for training and testing is available on my GitHub page.
To align the dataset, we use the script available on this GitHub page. This script uses a Multi-task Cascaded Convolutional Network (MTCNN).
This MTCNN has 3 stages: a Proposal Network (P-Net) that generates candidate face windows, a Refine Network (R-Net) that rejects false candidates and refines the remaining boxes, and an Output Network (O-Net) that produces the final bounding box and the positions of 5 facial landmarks.
After these stages, it is possible to align the faces using the 5 facial landmarks, as sketched below.
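For illustration, here is a minimal sketch of this landmark-based alignment. It uses the mtcnn pip package for detection rather than the facenet script itself, and the REFERENCE coordinates are a commonly used landmark template for 112x112 crops; the image path is hypothetical.

import numpy as np
import cv2
from mtcnn import MTCNN
from skimage import transform

# Reference positions for (left eye, right eye, nose, left mouth corner,
# right mouth corner) in a 112x112 crop -- a commonly used template
REFERENCE = np.array([[38.2946, 51.6963], [73.5318, 51.5014],
                      [56.0252, 71.7366], [41.5493, 92.3655],
                      [70.7299, 92.2041]], dtype=np.float32)

def align_face(image):
    # Detect the face and its 5 landmarks with MTCNN (first face only)
    detection = MTCNN().detect_faces(image)[0]
    k = detection['keypoints']
    landmarks = np.array([k['left_eye'], k['right_eye'], k['nose'],
                          k['mouth_left'], k['mouth_right']], dtype=np.float32)
    # Estimate the similarity transform (rotation + scale + translation)
    # that maps the detected landmarks onto the reference template
    tform = transform.SimilarityTransform()
    tform.estimate(landmarks, REFERENCE)
    return cv2.warpAffine(image, tform.params[:2], (112, 112))

image = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical path
aligned = align_face(image)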
Unfortunately, this script has some outdated libraries, so it is necessary to make some changes:
In align_dataset_mtcnn.py:
remove .value, so that the line becomes feed_in, dim = (inp, input_shape[-1])
add the following imports:
import tensorflow.compat.v1 as tf  # ignore the red warning
import imageio  # to replace misc.imread(), e.g. imageio.imread(os.path.join(emb_dir, i))
from PIL import Image  # to replace misc.imresize(), e.g. scaled = np.array(Image.fromarray(cropped).resize((args.image_size, args.image_size), Image.BILINEAR))
In detect_face.py:
On line 85, add allow_pickle=True so the weights file loads under newer NumPy versions: data_dict = np.load(data_path, encoding='latin1', allow_pickle=True).item()
To run the alignment process more efficiently, we split it into 4 parallel subprocesses by creating and running the following script:
for N in {1..4}; do python3 ../facenet/src/align/align_dataset_mtcnn.py pathToDataset pathToSaveAlignDataset --image_size 224 --margin 32 --random_order --gpu_memory_fraction 0.24 & done
After the faces are aligned, we will use the test script available in my GitHub project to test our theory.
In summary, for the first test, the model was trained using an unaligned CASIA-WebFace dataset and tested using an unaligned LFW dataset. For the second test, the model was trained using aligned CASIA-WebFace and tested using aligned LFW.
After running the first test, we obtain this ROC Curve for the unaligned LFW dataset:
The accuracy of this test was 87.63%. Analyzing the ROC curve and the accuracy, we see that we obtain a good value for the Area Under the ROC Curve (AUC), but both the AUC and the accuracy could improve if the faces were aligned.
To improve the results, we performed the second test with the aligned LFW dataset. After running this test, we obtain this ROC Curve (orange curve):
This figure contains the two tests that were performed. The orange curve refers to the second test and the blue curve refers to the first test.
Comparing the two curves, it can be seen that aligning the faces produced a ROC curve closer to the top-left corner and, consequently, a higher AUC than the blue curve. The accuracy also improved significantly, reaching 94.52% in the second test.
Given these results, we can conclude that the face alignment theory holds for face recognition: whenever alignment was applied, we obtained better results than when it was not. In summary, face alignment effectively contributes to better results in face recognition systems.
Face recognition systems are present in our society in many forms and are applied in many domains. In these systems, aligning the faces improves the AUC significantly and helps you obtain better results in cases where the variables of the system are not controlled.
In summary, we can conclude that face alignment has an important role in face recognition systems.