Face recognition is pervasive in our society. These systems can help many organizations improve their work environments and even their working methods.
In cases where the individual cooperates with the face recognition system, good results are easy to obtain, since all the system variables are controlled. However, in situations where the individual does not cooperate with the system, several factors degrade the images and consequently lead to poor results. For example, low-resolution photos and the use of glasses make it more difficult to recognize individuals. Therefore, in this article, we will evaluate whether face alignment contributes to better results in this type of environment.
After this article, you will be able to build a face recognition system and measure the impact of face alignment on its results.
Theoretically, if the images are aligned, training the model will be more successful because the centre of the face will be in the same position across all the images in a dataset.
Also, with aligned inputs, the vector of face embeddings, the output of our model, captures the parts of the face at consistent positions. For example, if a face is oval, it may produce embedding values in places where a rounder face does not. This way, when calculating the distance or similarity between two faces, we get a more accurate answer to whether they are the same person, as sketched below.
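As an illustration, here is a minimal sketch of this comparison step, assuming emb_a and emb_b are the embedding vectors the model produces for two face images (the 512-dimensional size and the 0.5 threshold are placeholders, not values from the article):

import numpy as np

def cosine_similarity(emb_a, emb_b):
    # Similarity of two embeddings; 1.0 means identical direction
    return np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))

def euclidean_distance(emb_a, emb_b):
    # Distance between two embeddings; 0.0 means identical vectors
    return np.linalg.norm(emb_a - emb_b)

# Placeholder embeddings; in practice these come from the trained model
emb_a = np.random.rand(512)
emb_b = np.random.rand(512)

# Declare the two faces the same person when the similarity exceeds a
# threshold tuned on a validation set (0.5 here is only illustrative)
same_person = cosine_similarity(emb_a, emb_b) > 0.5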
For training, we are going to use CASIA-WebFace. CASIA-WebFace contains 494,414 images spread across 10,575 real identities. It was also necessary to split the dataset into two sets: 80% for training and 20% for validation, as sketched below.
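A minimal sketch of this split, assuming the dataset is stored as one sub-directory per identity (the path below is hypothetical):

import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "casia-webface/",           # hypothetical dataset path
    validation_split=0.2, subset="training", seed=42,
    image_size=(112, 112), batch_size=64)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "casia-webface/",
    validation_split=0.2, subset="validation", seed=42,
    image_size=(112, 112), batch_size=64)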
For testing, we will use Labeled Faces in the Wild (LFW), available here. LFW contains 13,233 images of 5,749 identities and is commonly used to benchmark face recognition models. In addition, this dataset provides a list of pairs that will be very useful for performing face verification in the different types of tests.
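For illustration, here is a minimal sketch of parsing that pairs list. The standard LFW pairs.txt starts with a header line, followed by matched pairs (name idx1 idx2) and mismatched pairs (name1 idx1 name2 idx2); the file path below is hypothetical.

def read_pairs(path):
    pairs = []
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.strip().split()
            if len(fields) == 3:    # same person: two images of one identity
                pairs.append((fields[0], int(fields[1]), fields[0], int(fields[2]), True))
            elif len(fields) == 4:  # different people
                pairs.append((fields[0], int(fields[1]), fields[2], int(fields[3]), False))
    return pairs

pairs = read_pairs("lfw/pairs.txt")  # hypothetical path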
To test this theory, we need to train one model with an aligned dataset and another model with an unaligned dataset. For both trainings, we will use the CASIA-WebFace dataset.
To do the first training, we align the entire dataset and send the aligned faces as input to the CNN, as shown in the figure below:
For the second training, the pipeline is the same as in the figure above, except that the images passed as input to the CNN are not aligned. We use the same dataset as in the previous training.
To test our model, we want to determine the ROC Curve and accuracy and analyze these results to see the impact of face alignment.
For the first experiment, we did not align the test dataset; the model used in this test was trained on the unaligned dataset.
For the second experiment, we aligned the entire test dataset and then passed each element of the pairs list as input to the CNN. With this, it is possible to determine the ROC curve and the accuracy, as sketched below.
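Here is a minimal sketch of this evaluation step, using scikit-learn to compute the ROC curve and picking the verification threshold that maximizes TPR - FPR. The similarities and labels arrays are placeholders that, in practice, come from comparing the embeddings of each LFW pair:

import numpy as np
from sklearn.metrics import roc_curve, auc

# Placeholder scores and ground truth; in practice, one similarity score per
# LFW pair and a label of 1 for same-person pairs, 0 otherwise
similarities = np.random.rand(6000)
labels = np.random.randint(0, 2, 6000)

fpr, tpr, thresholds = roc_curve(labels, similarities)
roc_auc = auc(fpr, tpr)

# Accuracy at the threshold that best separates the two classes
best = np.argmax(tpr - fpr)
predictions = (similarities >= thresholds[best]).astype(int)
accuracy = np.mean(predictions == labels)
print(f"AUC: {roc_auc:.4f}, accuracy: {accuracy:.4f}")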
Pipeline for testing with the aligned dataset
For this experiment, we will use a ResNet50 as the backbone with a softmax output layer to train the model. The model was implemented using TensorFlow and Keras, as shown below:
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import BatchNormalization, GlobalAveragePooling2D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras import regularizers

# IMAGE_SIZE = size of the input images, in this case [112, 112]
IMAGE_SIZE = [112, 112]
numClasses = 10575  # number of identities in CASIA-WebFace

# ResNet50 backbone pre-trained on ImageNet, without its classification top
backbone = ResNet50(input_shape=IMAGE_SIZE + [3], weights='imagenet', include_top=False)

# Make all backbone layers trainable
for layer in backbone.layers:
    layer.trainable = True
backbone.summary()

# Classification head with a softmax over the identities
x = BatchNormalization()(backbone.output)
x = GlobalAveragePooling2D()(x)
x = Dropout(rate=0.5)(x)
x = Flatten()(x)
prediction = Dense(numClasses, activation='softmax', activity_regularizer=regularizers.l2(0.01))(x)

# Create the model object
model = Model(inputs=backbone.input, outputs=prediction)
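For completeness, here is a minimal sketch of how this model could be compiled and trained. The article does not state the optimizer, loss configuration, or number of epochs, so the values below are illustrative assumptions; train_ds and val_ds are the training and validation splits described earlier.

# Illustrative training setup; the optimizer and hyperparameters are
# assumptions, not values taken from the article
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=30)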
The full code for training and testing is available on my GitHub page.
To align the dataset, we use the script available on this GitHub page. This script uses a Multi-task Cascaded Convolutional Network (MTCNN).
This MTCNN has 3 stages: a Proposal Network (P-Net) that generates candidate face windows, a Refine Network (R-Net) that rejects false candidates and refines the remaining boxes, and an Output Network (O-Net) that produces the final bounding box and the positions of 5 facial landmarks.
After these stages, it is possible to align the faces using the 5 facial landmarks, as sketched below.
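For illustration, here is a minimal sketch of this landmark-based alignment. It uses the mtcnn pip package for detection rather than the facenet script itself, and the REFERENCE coordinates are a commonly used landmark template for 112x112 crops; the image path is hypothetical.

import numpy as np
import cv2
from mtcnn import MTCNN
from skimage import transform

# Reference positions for (left eye, right eye, nose, left mouth corner,
# right mouth corner) in a 112x112 crop -- a commonly used template
REFERENCE = np.array([[38.2946, 51.6963], [73.5318, 51.5014],
                      [56.0252, 71.7366], [41.5493, 92.3655],
                      [70.7299, 92.2041]], dtype=np.float32)

def align_face(image):
    # Detect the face and its 5 landmarks with MTCNN (first face only)
    detection = MTCNN().detect_faces(image)[0]
    k = detection['keypoints']
    landmarks = np.array([k['left_eye'], k['right_eye'], k['nose'],
                          k['mouth_left'], k['mouth_right']], dtype=np.float32)
    # Estimate the similarity transform (rotation + scale + translation)
    # that maps the detected landmarks onto the reference template
    tform = transform.SimilarityTransform()
    tform.estimate(landmarks, REFERENCE)
    return cv2.warpAffine(image, tform.params[:2], (112, 112))

image = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical path
aligned = align_face(image)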
Unfortunately, this script has some outdated libraries, so it is necessary to make some changes:
In align_dataset_mtcnn.py:
remove .value, so that the line becomes feed_in, dim = (inp, input_shape[-1])
add the following imports:
import tensorflow.compat.v1 as tf  # ignore the red warning
import imageio  # to replace misc.imread(), e.g. imageio.imread(os.path.join(emb_dir, i))
from PIL import Image  # to replace misc.imresize(), e.g. scaled = np.array(Image.fromarray(cropped).resize((args.image_size, args.image_size), Image.BILINEAR))
In detect_face.py:
On line 85, add allow_pickle=True so the weights file loads under newer NumPy versions: data_dict = np.load(data_path, encoding='latin1', allow_pickle=True).item()
To run the alignment process more efficiently, we split it into 4 parallel subprocesses by creating and running the following script:
for N in {1..4}; do python3 ../facenet/src/align/align_dataset_mtcnn.py pathToDataset pathToSaveAlignDataset --image_size 224 --margin 32 --random_order --gpu_memory_fraction 0.24 & done
After the faces are aligned, we will use the test script available in my GitHub project to test our theory.
In summary, for the first test, the model was trained using an unaligned CASIA-WebFace dataset and tested using an unaligned LFW dataset. For the second test, the model was trained using aligned CASIA-WebFace and tested using aligned LFW.
After running the first test, we obtain this ROC Curve for the unaligned LFW dataset:
The accuracy of this test was 87.63%. Analyzing the ROC curve and the accuracy, we see that we obtain a good value for the Area Under the ROC Curve (AUC), but both the AUC and the accuracy could improve if the faces were aligned.
To improve the results, we performed the second test with the aligned LFW dataset. After running this test, we obtain this ROC Curve (orange curve):
This figure contains the two tests that were performed. The orange curve refers to the second test and the blue curve refers to the first test.
Comparing the two curves, it can be seen that aligning the faces produced a ROC curve closer to the top-left corner and, consequently, a higher AUC than the blue curve. The accuracy also improved significantly, reaching 94.52% in the second test.
Given these results, we can conclude that the face alignment theory holds for face recognition: whenever alignment was applied, we obtained better results than when it was not. In summary, face alignment effectively contributes to better results in face recognition systems.
Face recognition systems are present in our society in many forms and are applied in many domains. In these systems, aligning the faces improves the AUC significantly and helps you obtain better results in cases where the variables of the system are not controlled.
In summary, we can conclude that face alignment has an important role in face recognition systems.