GhostFaceNets is a family of lightweight face recognition models that relies on computationally cheap operations without compromising accuracy, drawing on ideas from attention-based architectures. This blog post explores GhostFaceNets through illustrations and code walkthroughs, aiming to explain how the models work and spark ideas for your own projects. Join us as we take a closer look at what GhostFaceNets makes possible.
In today’s era of ubiquitous computing and the Internet of Things (IoT), face recognition (FR) technology plays an important role in applications such as seamless user authentication, personalized experiences, and stronger security. However, traditional face recognition systems consume substantial computational resources, which makes them unsuitable for deployment on devices with limited compute. This is where GhostFaceNets comes in, promising to change how we approach and implement face recognition technology.
As the demand for edge computing and real-time applications soared, the need for efficient and lightweight models became paramount. Researchers and engineers alike sought to strike a delicate balance between model complexity and performance, giving rise to a plethora of lightweight architectures tailored for specific tasks, including face recognition.
Deep learning algorithms such as Convolutional Neural Networks (CNNs) have transformed face recognition research, improving accuracy over traditional methods. However, these models often struggle to balance performance and complexity, especially for real-world applications on resource-constrained devices. The Labeled Faces in the Wild (LFW) dataset is the standard benchmark for evaluating new FR models, and Light CNN architectures reduce parameters and computational complexity; despite these advancements, their best reported LFW accuracy is 99.33%.
ShiftFaceNet introduced a “Shift” operation to reduce the number of parameters in image classification models, at the cost of roughly a 2% drop in accuracy. Other models built on image classification backbones, such as MobileFaceNets, ShuffleFaceNet, VarGFaceNet, and MixFaceNets, have shown improved trade-offs between performance and complexity. MobileFaceNets achieved 99.55% LFW accuracy with 1M parameters, while ShuffleFaceNet achieved 99.67% LFW accuracy with 2.6M parameters and 557.5 MFLOPs.
VarGFaceNet, which builds on VarGNet, achieved 99.85% LFW accuracy with 5M parameters and 1.022 GFLOPs, and MixFaceNets achieved 99.68% LFW accuracy with 3.95M parameters and 626.1 MFLOPs. Other notable models include AirFace (99.27% LFW accuracy at 1 GFLOPs), QuantFace (99.43% with 1.1M parameters), and PocketNets (99.58% with 0.925M parameters and 587.11 MFLOPs).
Building upon the efficient GhostNets architectures (GhostNetV1 and GhostNetV2), the authors propose GhostFaceNets, a new set of lightweight architectures tailored for face recognition and face verification. Several key modifications were made:
The authors designed a set of GhostFaceNets models by varying the training dataset, the width of the GhostNets architectures, and the stride of the first convolution layer (the stem). The resulting models outperform most lightweight SOTA models on various benchmarks, as discussed in the following sections.
GhostNetV1, the backbone architecture of GhostFaceNets, employs a novel building block called the Ghost module: a certain fraction (denoted as x%) of the feature maps is generated with ordinary convolution, while the remaining feature maps are produced by a low-cost linear operation, depthwise convolution (DWConv).
In a traditional convolutional layer, a 2D filter (kernel) is applied to a 2D channel of the input tensor to produce a 2D channel of the output tensor, directly generating an output tensor with C′ feature-map channels from an input tensor with C channels. Ghost modules take a different approach.
The Ghost module generates the first x% of the output tensor channels with a sequential block of three layers: an ordinary convolution, batch normalization, and a nonlinear activation (ReLU by default). That intermediate output is then passed to a second block consisting of a depthwise convolution, batch normalization, and ReLU, and the final output tensor is formed by concatenating the outputs of the two blocks along the channel axis.
As shown in Figure 1, there are clearly similar and redundant feature map pairs (ghosts) that can be generated using linear operations, reducing computational complexity without decreasing performance. The authors of GhostNetV1 exploit this observation by generating these similar and redundant features using cheap operations, rather than discarding them.
By employing Ghost modules, GhostNetV1 generates the same number of feature maps as an ordinary convolutional layer with a significant reduction in parameters and FLOPs. This allows Ghost modules to be dropped into existing neural network architectures to reduce computational complexity.
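To make the savings concrete, here is a rough, illustrative parameter count (the channel sizes are example values, not figures from the paper) comparing a standard 3x3 convolution with a Ghost module built the same way as the ghost_module function below: a 1x1 primary convolution for half of the output channels plus a 3x3 depthwise convolution that produces the other half.
# Illustrative parameter-count comparison (BatchNorm parameters ignored)
c_in, c_out = 64, 128
# Standard 3x3 convolution: every output channel mixes all input channels
standard_conv_params = 3 * 3 * c_in * c_out                      # 73,728
# Ghost module: 1x1 primary conv for half the channels, then a cheap 3x3
# depthwise conv generates the remaining "ghost" channels from them
primary_params = 1 * 1 * c_in * (c_out // 2)                     # 4,096
cheap_params = 3 * 3 * (c_out // 2)                              # 576
ghost_module_params = primary_params + cheap_params              # 4,672 (~16x fewer)
print(standard_conv_params, ghost_module_params)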
The code below is from the ghost_model.py module in the backbones folder.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (
Activation,
Add,
BatchNormalization,
Concatenate,
Conv2D,
DepthwiseConv2D,
GlobalAveragePooling2D,
Input,
PReLU,
Reshape,
Multiply,
)
import math
CONV_KERNEL_INITIALIZER = keras.initializers.VarianceScaling(scale=2.0, mode="fan_out", distribution="truncated_normal")
def _make_divisible(v, divisor=4, min_value=None):
"""
This function is taken from the original tf repo.
It ensures that all layers have a channel number that is divisible by the divisor (4 by default)
It can be seen here:
https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
"""
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
if new_v < 0.9 * v:
new_v += divisor
return new_v
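# Illustrative examples (not from the original source): _make_divisible(16 * 1.3)
# returns 20 and _make_divisible(72 * 1.3) returns 92, matching the channel
# rounding applied when width=1.3 in the GhostNet function below.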
def activation(inputs):
return Activation("relu")(inputs)
def se_module(inputs, se_ratio=0.25):
#get the channel axis
channel_axis = 1 if K.image_data_format() == "channels_first" else -1
#filters = channel axis shape
filters = inputs.shape[channel_axis]
reduction = _make_divisible(filters * se_ratio)
#from None x H x W x C to None x C
se = GlobalAveragePooling2D()(inputs)
#Reshape None x C to None 1 x 1 x C
se = Reshape((1, 1, filters))(se)
#Squeeze by using C*se_ratio. The size will be 1 x 1 x C*se_ratio
se = Conv2D(reduction, kernel_size=1, use_bias=True, kernel_initializer=CONV_KERNEL_INITIALIZER)(se)
# se = PReLU(shared_axes=[1, 2])(se)
se = Activation("relu")(se)
#Excitation using C filters. The size will be 1 x 1 x C
se = Conv2D(filters, kernel_size=1, use_bias=True, kernel_initializer=CONV_KERNEL_INITIALIZER)(se)
se = Activation("hard_sigmoid")(se)
return Multiply()([inputs, se])
def ghost_module(inputs, out, convkernel=1, dwkernel=3, add_activation=True):
# conv_out_channel = math.ceil(out * 1.0 / 2)
conv_out_channel = out // 2
# tf.print("[ghost_module] out:", out, "conv_out_channel:", conv_out_channel)
cc = Conv2D(conv_out_channel, convkernel, use_bias=False, strides=(1, 1), padding="same", kernel_initializer=CONV_KERNEL_INITIALIZER)(
inputs
) # padding=kernel_size//2
cc = BatchNormalization(axis=-1)(cc)
if add_activation:
cc = activation(cc)
channel = int(out - conv_out_channel)
nn = DepthwiseConv2D(dwkernel, 1, padding="same", use_bias=False, depthwise_initializer=CONV_KERNEL_INITIALIZER)(cc) # padding=dw_size//2
nn = BatchNormalization(axis=-1)(nn)
if add_activation:
nn = activation(nn)
return Concatenate()([cc, nn])
def ghost_bottleneck(inputs, dwkernel, strides, exp, out, se_ratio=0, shortcut=True):
nn = ghost_module(inputs, exp, add_activation=True) # ghost1 = GhostModule(in_chs, exp, relu=True)
if strides > 1:
# Extra depth conv if strides higher than 1
nn = DepthwiseConv2D(dwkernel, strides, padding="same", use_bias=False, depthwise_initializer=CONV_KERNEL_INITIALIZER)(nn)
nn = BatchNormalization(axis=-1)(nn)
# nn = Activation('relu')(nn)
if se_ratio > 0:
# Squeeze and excite
nn = se_module(nn, se_ratio) # se = SqueezeExcite(exp, se_ratio=se_ratio)
# Point-wise linear projection
nn = ghost_module(nn, out, add_activation=False) # ghost2 = GhostModule(exp, out, relu=False)
# nn = BatchNormalization(axis=-1)(nn)
if shortcut:
xx = DepthwiseConv2D(dwkernel, strides, padding="same", use_bias=False, depthwise_initializer=CONV_KERNEL_INITIALIZER)(
inputs
) # padding=(dw_kernel_size-1)//2
xx = BatchNormalization(axis=-1)(xx)
xx = Conv2D(out, (1, 1), strides=(1, 1), padding="valid", use_bias=False, kernel_initializer=CONV_KERNEL_INITIALIZER)(xx) # padding=0
xx = BatchNormalization(axis=-1)(xx)
else:
xx = inputs
return Add()([xx, nn])
#1.3 is the width of the GhostNet as in the paper (Table 7)
def GhostNet(input_shape=(224, 224, 3), include_top=True, classes=0, width=1.3, strides=2, name="GhostNet"):
inputs = Input(shape=input_shape)
out_channel = _make_divisible(16 * width, 4)
nn = Conv2D(out_channel, (3, 3), strides=strides, padding="same", use_bias=False, kernel_initializer=CONV_KERNEL_INITIALIZER)(inputs) # padding=1
nn = BatchNormalization(axis=-1)(nn)
nn = activation(nn)
dwkernels = [3, 3, 3, 5, 5, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5]
exps = [16, 48, 72, 72, 120, 240, 200, 184, 184, 480, 672, 672, 960, 960, 960, 512]
outs = [16, 24, 24, 40, 40, 80, 80, 80, 80, 112, 112, 160, 160, 160, 160, 160]
use_ses = [0, 0, 0, 0.25, 0.25, 0, 0, 0, 0, 0.25, 0.25, 0.25, 0, 0.25, 0, 0.25]
strides = [1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1]
pre_out = out_channel
for dwk, stride, exp, out, se in zip(dwkernels, strides, exps, outs, use_ses):
out = _make_divisible(out * width, 4) # [ 20 32 32 52 52 104 104 104 104 144 144 208 208 208 208 208 ]
exp = _make_divisible(exp * width, 4) # [ 20 64 92 92 156 312 260 240 240 624 872 872 1248 1248 1248 664 ]
shortcut = False if out == pre_out and stride == 1 else True
nn = ghost_bottleneck(nn, dwk, stride, exp, out, se, shortcut)
pre_out = out # [ 20 32 32 52 52 104 104 104 104 144 144 208 208 208 208 208 ]
out = _make_divisible(exps[-1] * width, 4) #664
nn = Conv2D(out, (1, 1), strides=(1, 1), padding="valid", use_bias=False, kernel_initializer=CONV_KERNEL_INITIALIZER)(nn) # padding=0
nn = BatchNormalization(axis=-1)(nn)
nn = activation(nn)
if include_top:
nn = GlobalAveragePooling2D()(nn)
nn = Reshape((1, 1, int(nn.shape[1])))(nn)
nn = Conv2D(1280, (1, 1), strides=(1, 1), padding="same", use_bias=False, kernel_initializer=CONV_KERNEL_INITIALIZER)(nn)
nn = BatchNormalization(axis=-1)(nn)
nn = activation(nn)
nn = Conv2D(classes, (1, 1), strides=(1, 1), padding="same", use_bias=False, kernel_initializer=CONV_KERNEL_INITIALIZER)(nn)
nn = K.squeeze(nn, 1)
nn = Activation("softmax")(nn)
return Model(inputs=inputs, outputs=nn, name=name)
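As a quick sanity check, here is a minimal usage sketch. It assumes the code above is saved as ghost_model.py inside the backbones folder, and builds the backbone the way GhostFaceNets does later in this post (no classification head, stride-1 stem for 112 x 112 face crops).
from backbones.ghost_model import GhostNet
# include_top=False drops the classification head so the model outputs feature maps
backbone = GhostNet(input_shape=(112, 112, 3), include_top=False, classes=0, width=1.3, strides=1)
backbone.summary()  # final feature maps are later fed to the recognition head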
GhostNetV2 introduces significant improvements to the Ghost module of GhostNetV1, aiming to capture long-range dependencies more effectively. The key innovation is the incorporation of a novel attention-based layer called the DFC attention branch, designed to generate attention maps with global receptive fields using convolutions. Unlike traditional self-attention layers, the DFC attention branch achieves high efficiency while capturing dependencies between pixels across different spatial locations. This efficiency is crucial for hardware compatibility and inference speed, as many prior attention modules relied on computationally intensive tensor operations.
GhostNetV2’s architecture features a new bottleneck structure in which the Ghost module and the DFC attention branch operate in parallel: the attention branch gathers information from distant spatial positions and its output is combined with the Ghost module output through an element-wise product, ensuring the final features cover dependencies across different patches of the input.
The DFC attention branch consists of five operations: downsampling, convolution, horizontal and vertical fully connected (FC) layers, and a sigmoid activation (see the figure above). To mitigate computational overhead, it uses average pooling for downsampling and bilinear interpolation for upsampling. Decomposing the FC layer into horizontal and vertical components reduces complexity while still capturing long-range dependencies along both dimensions.
Overall, GhostNetV2 represents a significant advancement among attention-augmented lightweight models, offering improved efficiency while effectively capturing long-range dependencies.
The code below is from the ghostv2.py module in the backbones folder.
!pip install keras_cv_attention_models
import tensorflow as tf
from tensorflow import keras
from keras_cv_attention_models.attention_layers import (
activation_by_name,
batchnorm_with_activation,
conv2d_no_bias,
depthwise_conv2d_no_bias,
make_divisible,
se_module,
add_pre_post_process,
)
from keras_cv_attention_models.download_and_load import reload_model_weights
PRETRAINED_DICT = {
"ghostnetv2_1x": {"imagenet": "4f28597d5f72731ed4ef4f69ec9c1799"},
"ghostnet_1x": {"imagenet": "df1de036084541c5b8bd36b179c74577"},
}
def ghost_module(inputs, out_channel, activation="relu", name=""):
ratio = 2
hidden_channel = int(tf.math.ceil(float(out_channel) / ratio))
primary_conv = conv2d_no_bias(inputs, hidden_channel, name=name + "prim_")
primary_conv = batchnorm_with_activation(primary_conv, activation=activation, name=name + "prim_")
cheap_conv = depthwise_conv2d_no_bias(primary_conv, kernel_size=3, padding="SAME", name=name + "cheap_")
cheap_conv = batchnorm_with_activation(cheap_conv, activation=activation, name=name + "cheap_")
return keras.layers.Concatenate()([primary_conv, cheap_conv])
def ghost_module_multiply(inputs, out_channel, activation="relu", name=""):
nn = ghost_module(inputs, out_channel, activation=activation, name=name)
# shortcut = keras.layers.AvgPool2D(pool_size=2, strides=2, padding="SAME")(inputs)
shortcut = keras.layers.AvgPool2D(pool_size=2, strides=2)(inputs)
shortcut = conv2d_no_bias(shortcut, out_channel, name=name + "short_1_")
shortcut = batchnorm_with_activation(shortcut, activation=None, name=name + "short_1_")
shortcut = depthwise_conv2d_no_bias(shortcut, (1, 5), padding="SAME", name=name + "short_2_")
shortcut = batchnorm_with_activation(shortcut, activation=None, name=name + "short_2_")
shortcut = depthwise_conv2d_no_bias(shortcut, (5, 1), padding="SAME", name=name + "short_3_")
shortcut = batchnorm_with_activation(shortcut, activation=None, name=name + "short_3_")
shortcut = activation_by_name(shortcut, "sigmoid", name=name + "short_")
shortcut = tf.image.resize(shortcut, tf.shape(inputs)[1:-1], antialias=False, method="bilinear")
return keras.layers.Multiply()([shortcut, nn])
def ghost_bottleneck(
inputs, out_channel, first_ghost_channel, kernel_size=3, strides=1, se_ratio=0, shortcut=True, use_ghost_module_multiply=False, activation="relu", name=""
):
if shortcut:
shortcut = depthwise_conv2d_no_bias(inputs, kernel_size, strides, padding="same", name=name + "short_1_")
shortcut = batchnorm_with_activation(shortcut, activation=None, name=name + "short_1_")
shortcut = conv2d_no_bias(shortcut, out_channel, name=name + "short_2_")
shortcut = batchnorm_with_activation(shortcut, activation=None, name=name + "short_2_")
else:
shortcut = inputs
if use_ghost_module_multiply:
nn = ghost_module_multiply(inputs, first_ghost_channel, activation=activation, name=name + "ghost_1_")
else:
nn = ghost_module(inputs, first_ghost_channel, activation=activation, name=name + "ghost_1_")
if strides > 1:
nn = depthwise_conv2d_no_bias(nn, kernel_size, strides, padding="same", name=name + "down_")
nn = batchnorm_with_activation(nn, activation=None, name=name + "down_")
if se_ratio > 0:
nn = se_module(nn, se_ratio=se_ratio, divisor=4, activation=("relu", "hard_sigmoid_torch"), name=name + "se_")
nn = ghost_module(nn, out_channel, activation=None, name=name + "ghost_2_")
return keras.layers.Add(name=name + "output")([shortcut, nn])
def GhostNetV2(
stem_width=16,
stem_strides=2,
width_mul=1.0,
num_ghost_module_v1_stacks=2, # num of `ghost_module` stacks on the head, others are `ghost_module_multiply`, set `-1` for all using `ghost_module`
input_shape=(224, 224, 3),
num_classes=1000,
activation="relu",
classifier_activation="softmax",
dropout=0,
pretrained=None,
model_name="ghostnetv2",
kwargs=None,
):
inputs = keras.layers.Input(input_shape)
stem_width = make_divisible(stem_width * width_mul, divisor=4)
nn = conv2d_no_bias(inputs, stem_width, 3, strides=stem_strides, padding="same", name="stem_")
nn = batchnorm_with_activation(nn, activation=activation, name="stem_")
""" stages """
kernel_sizes = [3, 3, 3, 5, 5, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5]
first_ghost_channels = [16, 48, 72, 72, 120, 240, 200, 184, 184, 480, 672, 672, 960, 960, 960, 960]
out_channels = [16, 24, 24, 40, 40, 80, 80, 80, 80, 112, 112, 160, 160, 160, 160, 160]
se_ratios = [0, 0, 0, 0.25, 0.25, 0, 0, 0, 0, 0.25, 0.25, 0.25, 0, 0.25, 0, 0.25]
strides = [1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1]
for stack_id, (kernel, stride, first_ghost, out_channel, se_ratio) in enumerate(zip(kernel_sizes, strides, first_ghost_channels, out_channels, se_ratios)):
stack_name = "stack{}_".format(stack_id + 1)
out_channel = make_divisible(out_channel * width_mul, 4)
first_ghost_channel = make_divisible(first_ghost * width_mul, 4)
shortcut = False if out_channel == nn.shape[-1] and stride == 1 else True
use_ghost_module_multiply = True if num_ghost_module_v1_stacks >= 0 and stack_id >= num_ghost_module_v1_stacks else False
nn = ghost_bottleneck(
nn, out_channel, first_ghost_channel, kernel, stride, se_ratio, shortcut, use_ghost_module_multiply, activation=activation, name=stack_name
)
nn = conv2d_no_bias(nn, make_divisible(first_ghost_channels[-1] * width_mul, 4), 1, strides=1, name="pre_")
nn = batchnorm_with_activation(nn, activation=activation, name="pre_")
if num_classes > 0:
nn = keras.layers.GlobalAveragePooling2D(keepdims=True)(nn)
nn = conv2d_no_bias(nn, 1280, 1, strides=1, use_bias=True, name="features_")
nn = activation_by_name(nn, activation, name="features_")
nn = keras.layers.Flatten()(nn)
if dropout > 0 and dropout < 1:
nn = keras.layers.Dropout(dropout)(nn)
nn = keras.layers.Dense(num_classes, dtype="float32", activation=classifier_activation, name="head")(nn)
model = keras.models.Model(inputs, nn, name=model_name)
add_pre_post_process(model, rescale_mode="torch")
reload_model_weights(model, PRETRAINED_DICT, "ghostnetv2", pretrained)
return model
def GhostNetV2_1X(input_shape=(224, 224, 3), num_classes=1000, activation="relu", classifier_activation="softmax", pretrained="imagenet", **kwargs):
return GhostNetV2(**locals(), model_name="ghostnetv2_1x", **kwargs)
""" GhostNet V1 """
def GhostNet(
stem_width=16,
stem_strides=2,
width_mul=1.0,
num_ghost_module_v1_stacks=-1, # num of `ghost_module` stacks on the head, others are `ghost_module_multiply`, set `-1` for all using `ghost_module`
input_shape=(224, 224, 3),
num_classes=1000,
activation="relu",
classifier_activation="softmax",
dropout=0,
pretrained=None,
model_name="ghostnet",
kwargs=None,
):
return GhostNetV2(**locals())
def GhostNet_1X(input_shape=(224, 224, 3), num_classes=1000, activation="relu", classifier_activation="softmax", pretrained="imagenet", **kwargs):
return GhostNet(**locals(), model_name="ghostnet_1x", **kwargs)
Note that the Ghost module in GhostNetV1 does not include the DFC attention branch, while GhostNetV2 employs it.
Building upon the GhostNetV1 architecture, the authors of GhostFaceNets made several key modifications to tailor the model for face recognition and face verification tasks.
GhostFaceNets represent a significant advancement in lightweight face recognition and face verification models, incorporating several key modifications to improve performance and efficiency. One notable change is a modified Global Depthwise Convolution (GDC) recognition head, replacing the Global Average Pooling layer used in image classification models. This allows the network to learn different weights for different units of the final feature map, enhancing discriminative power and performance.
GhostFaceNets use the Parametric Rectified Linear Unit (PReLU) activation function instead of ReLU, allowing negative activations and thus more expressive nonlinear functions, which improves performance on face recognition tasks. In the Squeeze-and-Excitation (SE) modules, convolutions replace the conventional FC layers.
GhostFaceNets also refine the attention mechanism within the SE modules, modeling channel interdependencies at minimal computational cost. The mechanism adjusts per-channel weights to emphasize important features and suppress less relevant ones, and offers flexibility in the choice of downsampling strategy.
GhostFaceNets variants are designed with configurable backbones, width multipliers, and stride parameters for generalization and adaptability. The authors experiment with hyperparameters and training datasets, including MS1MV2 and MS1MV3, and optimize performance with the ArcFace training loss, which minimizes the intra-class gap while enhancing inter-class separation.
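Since ArcFace drives the training, here is a compact sketch of the additive angular margin idea. This is an illustrative implementation using the commonly cited scale s=64 and margin m=0.5, not the exact loss code from the GhostFaceNets repository.
import tensorflow as tf
def arcface_logits(embeddings, class_weights, labels, s=64.0, m=0.5):
    # L2-normalize embeddings and class-center weights so their dot product is cos(theta)
    emb = tf.math.l2_normalize(embeddings, axis=1)           # (batch, emb_dim)
    w = tf.math.l2_normalize(class_weights, axis=0)          # (emb_dim, num_classes)
    cos_theta = tf.matmul(emb, w)
    # Add the angular margin m to the target-class angle only
    theta = tf.acos(tf.clip_by_value(cos_theta, -1.0 + 1e-7, 1.0 - 1e-7))
    cos_theta_margin = tf.cos(theta + m)
    one_hot = tf.one_hot(labels, depth=class_weights.shape[-1])
    logits = s * tf.where(tf.cast(one_hot, tf.bool), cos_theta_margin, cos_theta)
    return logits  # feed into softmax cross-entropy during training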
Please use the following requirements to run the code (Python version 3.9.12):
The code below is from a module in the main folder.
import tensorflow as tf
from tensorflow import keras
import tensorflow.keras.backend as K
def __init_model_from_name__(name, input_shape=(112, 112, 3), weights="imagenet", **kwargs):
name_lower = name.lower()
""" Basic model """
if name_lower == "ghostnetv1":
from backbones import ghost_model
xx = ghost_model.GhostNet(input_shape=input_shape, include_top=False, width=1, **kwargs)
elif name_lower == "ghostnetv2":
from backbones import ghostv2
xx = ghostv2.GhostNetV2(stem_width=16,
stem_strides=1,
width_mul=1.3,
num_ghost_module_v1_stacks=2, # num of `ghost_module` stacks on the head, others are `ghost_module_multiply`, set `-1` for all using `ghost_module`
input_shape=(112, 112, 3),
num_classes=0,
activation="prelu",
classifier_activation=None,
dropout=0,
pretrained=None,
model_name="ghostnetv2",
**kwargs)
else:
return None
xx.trainable = True
return xx
def buildin_models(
stem_model,
dropout=1,
emb_shape=512,
input_shape=(112, 112, 3),
output_layer="GDC",
bn_momentum=0.99,
bn_epsilon=0.001,
add_pointwise_conv=False,
pointwise_conv_act="relu",
use_bias=False,
scale=True,
weights="imagenet",
**kwargs
):
if isinstance(stem_model, str):
xx = __init_model_from_name__(stem_model, input_shape, weights, **kwargs)
name = stem_model
else:
name = stem_model.name
xx = stem_model
if bn_momentum != 0.99 or bn_epsilon != 0.001:
print(">>>> Change BatchNormalization momentum and epsilon default value.")
for ii in xx.layers:
if isinstance(ii, keras.layers.BatchNormalization):
ii.momentum, ii.epsilon = bn_momentum, bn_epsilon
xx = keras.models.clone_model(xx)
inputs = xx.inputs[0]
nn = xx.outputs[0]
if add_pointwise_conv: # Model using `pointwise_conv + GDC` / `pointwise_conv + E` is smaller than `E`
filters = nn.shape[-1] // 2 if add_pointwise_conv == -1 else 512 # Compitable with previous models...
nn = keras.layers.Conv2D(filters, 1, use_bias=False, padding="valid", name="pw_conv")(nn)
nn = keras.layers.BatchNormalization(momentum=bn_momentum, epsilon=bn_epsilon, name="pw_bn")(nn)
if pointwise_conv_act.lower() == "prelu":
nn = keras.layers.PReLU(shared_axes=[1, 2], name="pw_" + pointwise_conv_act)(nn)
else:
nn = keras.layers.Activation(pointwise_conv_act, name="pw_" + pointwise_conv_act)(nn)
""" GDC """
nn = keras.layers.DepthwiseConv2D(nn.shape[1], use_bias=False, name="GDC_dw")(nn)
nn = keras.layers.BatchNormalization(momentum=bn_momentum, epsilon=bn_epsilon, name="GDC_batchnorm")(nn)
if dropout > 0 and dropout < 1:
nn = keras.layers.Dropout(dropout)(nn)
nn = keras.layers.Conv2D(emb_shape, 1, use_bias=use_bias, kernel_initializer="glorot_normal", name="GDC_conv")(nn)
nn = keras.layers.Flatten(name="GDC_flatten")(nn)
embedding = keras.layers.BatchNormalization(momentum=bn_momentum, epsilon=bn_epsilon, scale=scale, name="pre_embedding")(nn)
embedding_fp32 = keras.layers.Activation("linear", dtype="float32", name="embedding")(embedding)
basic_model = keras.models.Model(inputs, embedding_fp32, name=xx.name)
return basic_model
def add_l2_regularizer_2_model(model, weight_decay, custom_objects={}, apply_to_batch_normal=False, apply_to_bias=False):
# https://github.com/keras-team/keras/issues/2717#issuecomment-456254176
if 0:
regularizers_type = {}
for layer in model.layers:
rrs = [kk for kk in layer.__dict__.keys() if "regularizer" in kk and not kk.startswith("_")]
if len(rrs) != 0:
# print(layer.name, layer.__class__.__name__, rrs)
if layer.__class__.__name__ not in regularizers_type:
regularizers_type[layer.__class__.__name__] = rrs
print(regularizers_type)
for layer in model.layers:
attrs = []
if isinstance(layer, keras.layers.Dense) or isinstance(layer, keras.layers.Conv2D):
# print(">>>> Dense or Conv2D", layer.name, "use_bias:", layer.use_bias)
attrs = ["kernel_regularizer"]
if apply_to_bias and layer.use_bias:
attrs.append("bias_regularizer")
elif isinstance(layer, keras.layers.DepthwiseConv2D):
# print(">>>> DepthwiseConv2D", layer.name, "use_bias:", layer.use_bias)
attrs = ["depthwise_regularizer"]
if apply_to_bias and layer.use_bias:
attrs.append("bias_regularizer")
elif isinstance(layer, keras.layers.SeparableConv2D):
attrs = ["pointwise_regularizer", "depthwise_regularizer"]
if apply_to_bias and layer.use_bias:
attrs.append("bias_regularizer")
elif apply_to_batch_normal and isinstance(layer, keras.layers.BatchNormalization):
if layer.center:
attrs.append("beta_regularizer")
if layer.scale:
attrs.append("gamma_regularizer")
elif apply_to_batch_normal and isinstance(layer, keras.layers.PReLU):
attrs = ["alpha_regularizer"]
for attr in attrs:
if hasattr(layer, attr) and layer.trainable:
setattr(layer, attr, keras.regularizers.L2(weight_decay / 2))
return keras.models.clone_model(model)
def replace_ReLU_with_PReLU(model, target_activation="PReLU", **kwargs):
from tensorflow.keras.layers import ReLU, PReLU, Activation
def convert_ReLU(layer):
# print(layer.name)
if isinstance(layer, ReLU) or (isinstance(layer, Activation) and layer.activation == keras.activations.relu):
if target_activation == "PReLU":
layer_name = layer.name.replace("_relu", "_prelu")
print(">>>> Convert ReLU:", layer.name, "-->", layer_name)
# Default initial value in mxnet and pytorch is 0.25
return PReLU(shared_axes=[1, 2], alpha_initializer=tf.initializers.Constant(0.25), name=layer_name, **kwargs)
elif isinstance(target_activation, str):
layer_name = layer.name.replace("_relu", "_" + target_activation)
print(">>>> Convert ReLU:", layer.name, "-->", layer_name)
return Activation(activation=target_activation, name=layer_name, **kwargs)
else:
act_class_name = target_activation.__name__
layer_name = layer.name.replace("_relu", "_" + act_class_name)
print(">>>> Convert ReLU:", layer.name, "-->", layer_name)
return target_activation(**kwargs)
return layer
input_tensors = keras.layers.Input(model.input_shape[1:])
return keras.models.clone_model(model, input_tensors=input_tensors, clone_function=convert_ReLU)
def convert_to_mixed_float16(model, convert_batch_norm=False):
policy = keras.mixed_precision.Policy("mixed_float16")
policy_config = keras.utils.serialize_keras_object(policy)
from tensorflow.keras.layers import InputLayer, Activation
from tensorflow.keras.activations import linear, softmax
def do_convert_to_mixed_float16(layer):
if not convert_batch_norm and isinstance(layer, keras.layers.BatchNormalization):
return layer
if isinstance(layer, InputLayer):
return layer
if isinstance(layer, Activation) and layer.activation == softmax:
return layer
if isinstance(layer, Activation) and layer.activation == linear:
return layer
aa = layer.get_config()
aa.update({"dtype": policy_config})
bb = layer.__class__.from_config(aa)
bb.build(layer.input_shape)
bb.set_weights(layer.get_weights())
return bb
input_tensors = keras.layers.Input(model.input_shape[1:])
mm = keras.models.clone_model(model, input_tensors=input_tensors, clone_function=do_convert_to_mixed_float16)
if model.built:
mm.compile(optimizer=model.optimizer, loss=model.compiled_loss, metrics=model.compiled_metrics)
# mm.optimizer, mm.compiled_loss, mm.compiled_metrics = model.optimizer, model.compiled_loss, model.compiled_metrics
# mm.built = True
return mm
def convert_mixed_float16_to_float32(model):
from tensorflow.keras.layers import InputLayer, Activation
from tensorflow.keras.activations import linear
def do_convert_to_mixed_float16(layer):
if not isinstance(layer, InputLayer) and not (isinstance(layer, Activation) and layer.activation == linear):
aa = layer.get_config()
aa.update({"dtype": "float32"})
bb = layer.__class__.from_config(aa)
bb.build(layer.input_shape)
bb.set_weights(layer.get_weights())
return bb
return layer
input_tensors = keras.layers.Input(model.input_shape[1:])
return keras.models.clone_model(model, input_tensors=input_tensors, clone_function=do_convert_to_mixed_float16)
def convert_to_batch_renorm(model):
def do_convert_to_batch_renorm(layer):
if isinstance(layer, keras.layers.BatchNormalization):
aa = layer.get_config()
aa.update({"renorm": True, "renorm_clipping": {}, "renorm_momentum": aa["momentum"]})
bb = layer.__class__.from_config(aa)
bb.build(layer.input_shape)
bb.set_weights(layer.get_weights() + bb.get_weights()[-3:])
return bb
return layer
input_tensors = keras.layers.Input(model.input_shape[1:])
return keras.models.clone_model(model, input_tensors=input_tensors, clone_function=do_convert_to_batch_renorm)
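Putting the pieces together, here is a minimal sketch of how these helpers can be combined into a GhostFaceNet-style embedding model. The argument values are illustrative; the training scripts in the repository may configure things differently.
# Build a GhostNetV1-based embedding model with the GDC recognition head
basic_model = buildin_models("ghostnetv1", dropout=0, emb_shape=512, input_shape=(112, 112, 3), output_layer="GDC")
# Apply weight decay as L2 regularization and swap ReLU activations for PReLU,
# following the GhostFaceNets modifications described above
basic_model = add_l2_regularizer_2_model(basic_model, weight_decay=5e-4)
basic_model = replace_ReLU_with_PReLU(basic_model)
print(basic_model.output_shape)  # (None, 512) face embedding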
The authors of GhostFaceNets rigorously tested the model's performance on several benchmark datasets, including the widely used Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) datasets. GhostFaceNets achieved state-of-the-art performance while maintaining a considerably smaller model size and lower computational complexity than existing face recognition models.
GhostFaceNets opens up a wide range of possibilities.
As the demand for edge computing and real-time face recognition continues to grow, GhostFaceNets represents a major step forward in the field, paving the way for future advancements and innovations. Researchers and engineers can build upon this groundbreaking work, exploring new architectures, optimization techniques, and applications to further push the boundaries of efficient and accurate face recognition.
GhostFaceNets is a noteworthy engineering achievement that uses deep learning techniques and efficient design to create lightweight face recognition models. Its Ghost modules deliver accurate and robust recognition while maintaining a small computational footprint. As the world embraces ubiquitous computing and the Internet of Things, GhostFaceNets shows how face recognition can be integrated into daily life, improving experiences and security without sacrificing performance or efficiency.
A. GhostFaceNets achieves efficiency through innovative architectural enhancements, leveraging Ghost modules, modified GDC recognition heads, and attention-based mechanisms like the DFC attention branch. These optimizations reduce computational complexity while maintaining accuracy.
A. GhostFaceNets distinguishes itself by balancing efficiency and accuracy. Unlike traditional models requiring substantial computational resources, GhostFaceNets uses lightweight architectures and attention mechanisms to achieve high performance on edge devices.
A. GhostFaceNets architecture includes Ghost modules for efficient feature map generation and modified GDC recognition heads for discriminative feature vectors. It also employs PReLU activation and attention-based mechanisms like the DFC attention branch for capturing dependencies.
A. GhostFaceNets excelled on LFW and YTF, showing better performance with smaller sizes and less complexity.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.