Alleviation of COVID by means of Social Distancing & Face Mask Detection Using YOLO V4

Yash Last Updated : 27 Aug, 2021

This article was published as a part of the Data Science Blogathon.

Abstract

This article covers social distancing and face mask detection in the context of the coronavirus pandemic. The spread of the virus can be mitigated by keeping a safe distance and wearing a face mask. Covid-19 had a huge impact on different sectors in many countries and caused problems for people around the world. The small steps of wearing a face mask and following social distancing could save many lives by slowing the spread of the virus.

YOLO stands for You Only Look Once. The algorithm is used for object detection and, together with a tracker, for object tracking. This research uses YOLO object detection to measure social distancing and to identify face masks on people's faces, while object tracking counts the faces and people in a frame and keeps a record of each object in the following frames. The minimum distance to keep while adhering to social distancing is 6 feet; with this as the baseline for the distance calculation, the model was trained and used for object detection as well as object tracking.

Many object detection algorithms are available, and YOLO stands out among them for its combination of speed and accuracy. Custom datasets were used to train the model to recognize face masks for detection and tracking. To evaluate the trained model, mAP (Mean Average Precision) was calculated for both use cases (social distancing and face mask detection). It works by comparing the ground-truth bounding boxes with the detected boxes and returns a score. The higher the mAP score, the better the model is at detecting objects.

INTRODUCTION

Computer Vision is a subfield of Artificial Intelligence that uses computing power to extract meaningful information from visual data such as images and videos. It can be applied to many different use cases. Artificial Intelligence can be seen as the umbrella that covers areas such as Machine Learning, Deep Learning and Computer Vision.

This research on face mask detection and social distancing uses computer vision to analyze images or video frames that are provided as input to the algorithms. The basic idea is to find the bounding boxes belonging to each class; the classes can be anything from a dog to a car, depending on the training dataset.

Coronavirus had a great impact on many sectors, including industry, transportation and agriculture. It brought activity in these sectors to a halt and led to strict restrictions that made social distancing and wearing a face mask a priority. The impact of Covid-19 on different sectors can be observed in Fig 1 below.

Fig 1. Pie-Chart for Impact Distribution- Image by Author

The highest impact was on the Restaurants sector at 20 %, followed by Real Estate at 16 %. The lowest impact was on the Agriculture sector at 3 %. The total number of coronavirus cases globally can be observed in Fig 2, which plots the number of people affected by Covid over time.

Fig 2. Total Covid Cases Globally- Image by Author

The cases started around 22nd Jan 2020 and the curve rose steeply day by day, from 0 cases to around 111 million cases by 9th Feb 2021. The rise affected every country, with different figures at the individual level. Such huge numbers were devastating and turned the epidemic into a pandemic.

METHODOLOGY

This section describes the algorithm used for object detection and object tracking.

1. YOLO Architecture

YOLO stands for You Only Look Once. It is a state-of-the-art, real-time algorithm built on deep learning for solving object detection and object tracking problems. The architecture of YOLO can be observed in Fig 3 below.

Fig 3. YOLO Architecture- Image by Author

As the figure shows, the architecture starts with the input layer, which takes the input image and passes it on to the following layers; the input can be any image, depending on the use case. After the input layer comes the Darknet architecture. Darknet is an open-source neural network framework written in C and CUDA, and it features YOLO for object detection and object tracking.

Further, the architecture contains convolutional layers and a flattened layer, densely connected so that data passes from each node to the other nodes. The result is passed to the output layer, which predicts the bounding box as four values (x, y, w, h) along with the object detection score and the probability of the predicted class. YOLO belongs to the one-shot object detector family, which is accurate and fast; there is also a two-shot object detector family.
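To make the output format concrete, here is a minimal sketch of how one prediction of the form (x, y, w, h, object score, class probabilities) could be turned into a pixel-space box and a class score. It assumes that x and y are the box center, that w and h are the box size, that all four are normalized to the range 0 to 1, and that the final score is the object score multiplied by the best class probability. The function name `decode_prediction` and the sample numbers are illustrative and are not taken from the code behind this article.

```python
def decode_prediction(pred, img_w, img_h):
    """Convert one YOLO-style prediction into a corner box and a class score.

    pred = [x, y, w, h, objectness, p_class0, p_class1, ...] with the box
    given as a normalized center and size (an assumption for this sketch).
    """
    x, y, w, h, objectness = pred[:5]
    class_probs = pred[5:]

    # Center/size -> top-left and bottom-right corners in pixels.
    left = (x - w / 2) * img_w
    top = (y - h / 2) * img_h
    right = (x + w / 2) * img_w
    bottom = (y + h / 2) * img_h

    # Pick the most likely class and combine it with the objectness score.
    class_id = max(range(len(class_probs)), key=lambda i: class_probs[i])
    score = objectness * class_probs[class_id]
    return (left, top, right, bottom), class_id, score


# Illustrative values only: a box centered in a 640x480 image.
box, class_id, score = decode_prediction(
    [0.5, 0.5, 0.2, 0.4, 0.9, 0.1, 0.8, 0.1], img_w=640, img_h=480
)
print(box, class_id, round(score, 2))
```

In a full detector this step would be followed by filtering on the score and removing overlapping boxes, which is left out here to keep the sketch short.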

Popular two-shot object detectors are R-CNN, Fast R-CNN and Faster R-CNN. These algorithms give accurate results for certain use cases but are slow compared to YOLO. You Only Look Once looks at the image in a single pass and, based on that look, predicts the bounding boxes for the classes, which can be anything from a dog to a car or from a gun to a tank. This single-pass design makes YOLO stand out from the others. The different types of object detectors can be observed in Fig 4 below.

Fig 4. Different Types of Detector- Image by Author

The above figure shows the four components of an object detector:

Input: The input to the detector can be an image or a video, depending on the use case.

Backbone: The backbone of the object detector is a feature-extraction model such as ResNet, DenseNet or VGG.

Neck: The neck is a set of extra layers placed between the backbone and the head that collects feature maps from different stages.

Head: The head is the network in charge of detecting objects and predicting their bounding boxes.

EXPERIMENTAL RESULTS

This section details the results obtained from the experiments. The project focuses on social distancing detection and face mask detection for Covid-19. Fig 5 shows the architecture for calculating the distance between objects and the flow by which the output is generated using YOLO version 4.

Fig 5. YOLO Darknet Architecture- Image by Author

Fig 6 below shows the architecture for face mask analysis. The objects here are people, and detection is performed with the help of a custom dataset. The custom dataset is annotated with 3 categories (Good, None and Bad), and the trained model draws bounding boxes labeled with the respective class. The difference between object detection and object tracking is the use of a tracker (DeepSort in this YOLO setup), which keeps track of an object across frames by assigning it an ID.

Fig 6. YOLO Deepsort Architecture- Image by Author
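The idea of keeping an ID for each object across frames can be illustrated with a deliberately simplified tracker. The sketch below is not DeepSort: it only matches each new detection to the nearest centroid from the previous frame, whereas DeepSort also uses motion prediction and appearance features. The class name `CentroidTracker` and the `max_distance` value are hypothetical choices for this illustration.

```python
import math


class CentroidTracker:
    """Toy tracker: keeps an ID for each object by matching nearest centroids.

    This is a simplified stand-in for illustration, not DeepSort itself.
    """

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # hypothetical matching radius in pixels
        self.next_id = 0
        self.tracks = {}  # track ID -> last known centroid (x, y)

    def update(self, centroids):
        """Assign an ID to every centroid detected in the current frame."""
        assigned = {}
        unmatched = dict(self.tracks)
        for c in centroids:
            # Find the closest existing track that has not been used yet.
            best_id, best_dist = None, self.max_distance
            for track_id, prev in unmatched.items():
                dist = math.dist(c, prev)
                if dist <= best_dist:
                    best_id, best_dist = track_id, dist
            if best_id is None:
                # No track close enough: treat it as a new object.
                best_id = self.next_id
                self.next_id += 1
            else:
                del unmatched[best_id]
            assigned[best_id] = c
        # Tracks without a match are dropped in this simple version.
        self.tracks = assigned
        return assigned


# Two objects that move slightly and swap order between frames.
tracker = CentroidTracker()
frame1 = tracker.update([(100, 100), (300, 200)])
frame2 = tracker.update([(305, 204), (104, 98)])
print(frame1)
print(frame2)
```

Even though the detections arrive in a different order in the second frame, each object keeps the ID it received in the first frame, which is what makes counting and record keeping across frames possible.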

Below are examples of the datasets used for training. Fig 7 shows the detection of a person based on the COCO dataset, which contains a large number of classes ranging from cat to car to person and so on.

Fig 7. COCO Dataset Sample- Image by Author

Similarly, Fig 8 below shows the custom dataset used for face mask detection. It contains 600 images, with annotations for every object present in the frame. A custom dataset was needed because the COCO dataset doesn't contain classes for face mask detection.

Fig 8. Custom Dataset Sample

Based on the above figure, annotations were created for the classes present in the frame. Fig 9 shows an annotation file that contains 2 different classes (0 and 2). The classes used for face mask detection are 0 for Good, 1 for None and 2 for Bad.

Fig 9. Annotations for Objects based on Images- Image by Author
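As a rough illustration of how such an annotation file can be read, the sketch below assumes the common Darknet YOLO label layout, in which each line holds `class_id x_center y_center width height` with coordinates normalized by the image size. The class names follow the mapping given above; the function name `parse_annotation_line` and the sample lines are made up for this example and are not taken from the article's dataset.

```python
CLASS_NAMES = {0: "Good", 1: "None", 2: "Bad"}  # class IDs as listed in the article


def parse_annotation_line(line, img_w, img_h):
    """Parse one 'class x_center y_center width height' line into pixel corners."""
    class_id, x, y, w, h = line.split()
    class_id = int(class_id)
    x, y, w, h = float(x), float(y), float(w), float(h)
    left = round((x - w / 2) * img_w)
    top = round((y - h / 2) * img_h)
    right = round((x + w / 2) * img_w)
    bottom = round((y + h / 2) * img_h)
    return CLASS_NAMES[class_id], (left, top, right, bottom)


# Made-up sample lines for an 800x600 image, not taken from the article's dataset.
sample = ["0 0.25 0.5 0.1 0.2", "2 0.75 0.5 0.1 0.2"]
for line in sample:
    print(parse_annotation_line(line, img_w=800, img_h=600))
```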

Similarly, another annotation file was created for person detection, with bounding boxes for the people detected in the frame. Fig 10 below shows such a file, which contains a single class (0 for Person). The goal for social distancing is to detect each person in a frame and then measure the distance to the other people. The distance between objects is calculated with the Euclidean distance formula.

Fig 10. Annotations of Objects based on Images- Image by Author
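The Euclidean distance between two points (x1, y1) and (x2, y2) is sqrt((x2 - x1)^2 + (y2 - y1)^2). A minimal sketch of the distance check could look as follows, assuming the distance is measured between the centers of the person boxes. Converting pixels to feet needs a calibration step that depends on the camera, so `pixels_per_foot` is a hypothetical constant here, and the boxes are made-up values.

```python
import math
from itertools import combinations

MIN_DISTANCE_FEET = 6.0  # minimum distance stated in the article


def centroid(box):
    """Center point of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def find_violations(boxes, pixels_per_foot):
    """Return index pairs of people standing closer than the minimum distance.

    pixels_per_foot is a hypothetical calibration constant; a real setup
    would have to estimate it from the camera geometry.
    """
    violations = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        (x1, y1), (x2, y2) = centroid(a), centroid(b)
        # Euclidean distance between the two centroids, in pixels.
        pixel_dist = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
        if pixel_dist / pixels_per_foot < MIN_DISTANCE_FEET:
            violations.append((i, j))
    return violations


# Made-up person boxes for illustration.
people = [(100, 100, 140, 200), (160, 100, 200, 200), (500, 100, 540, 200)]
print(find_violations(people, pixels_per_foot=20))
```

With these values the first two people are 60 pixels (3 feet) apart and are flagged, while the third person is far enough from both.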

Below is the training graph for the custom face mask dataset, which was trained for 4000 epochs. As Fig 11 shows, the loss decreased until around 1200 epochs and then remained nearly constant through the last epoch. Since the training loss stopped improving after 1200 epochs, the number of training epochs could have been set to around 2000, because more training iterations require more computing power without improving the result.

Fig 11. Iteration Graph for Training Custom Data- Image by Author

The results for social distancing are shown in Fig 12. The results are arranged as a grid of 2 images; the left image shows the output with bounding boxes based on the distance calculation.

Similarly, the result for face mask detection is shown in Fig 13 below, with bounding boxes drawn around the detected objects. The goal is to detect whether the object (a face) is wearing a mask or not; based on that, the model draws a bounding box in a class-specific color and displays the associated class name. The color used is Green for No Mask and Purple for Mask or None.

Fig 13. Face Mask Detection on Crowded Place

Another example of face mask detection using Darknet is shown in Fig 14 below. The Darknet implementation performs object detection only, without tracking the objects across frames. The model detected objects wearing no mask but still assigned some of them to Good or Bad, and one No Mask object was misclassified as Good. These false positive results will be explained in Table II.

Fig 14. Another Example for Face Mask Detection- Image by Author

Finally, the trained model was evaluated with mAP (Mean Average Precision). mAP is the mean of the Average Precision (AP) values computed for each class in the training data at a given IoU (Intersection over Union) threshold. Table I below shows the Average Precision for each category along with the True Positive and False Positive counts. The AP was calculated at a threshold of 0.25 with 101 recall points.

From the above table, it can be observed that the AP was above 90 % for each category.
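To show what goes into these numbers, here is a minimal sketch of the two building blocks: the IoU between a detected box and a ground-truth box, and an AP value sampled at 101 evenly spaced recall points. It assumes boxes in (left, top, right, bottom) form and detections already sorted by descending confidence. The exact procedure used by the evaluation tool may differ in its details, and the function names and sample inputs are illustrative only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (left, top, right, bottom) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(is_true_positive, num_ground_truth, points=101):
    """AP for one class from detections sorted by descending confidence.

    is_true_positive[i] is True when detection i matched a ground-truth box
    (IoU at or above the chosen threshold), otherwise it is a false positive.
    """
    precisions, recalls = [], []
    tp = fp = 0
    for hit in is_true_positive:
        tp += hit
        fp += not hit
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Sample the best precision reachable at each evenly spaced recall level.
    total = 0.0
    for k in range(points):
        level = k / (points - 1)
        candidates = [p for p, r in zip(precisions, recalls) if r >= level]
        total += max(candidates, default=0.0)
    return total / points


# Illustrative inputs: two half-overlapping boxes and four ranked detections.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
print(average_precision([True, True, False, True], num_ground_truth=4))
```

mAP is then the mean of the per-class AP values, so a single weak class lowers the overall score.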

CONCLUSION

The aim of this research was to understand social distancing and face mask detection for Covid-19. Object detection for social distancing was based on persons, and face mask detection was based on faces; both were done using YOLO. Object detection with YOLO v4 was carried out with Darknet, and object tracking was carried out with DeepSort.

The quality of the model's predictions was measured by calculating mAP, which showed that the average precision was around 90 % and above at a threshold of 0.25, and around 88 % and above at a threshold of 0.50. The outputs for social distancing were produced on different video datasets; to increase the difficulty of detection, crowded places were also taken into consideration.

The face mask tracking model showed the percentage accuracy for each detected object. This approach could be deployed in larger industries with real-time detection, which would require more computational power.

About the Author

My name is Yash Indulkar. I completed my undergraduate studies at Thakur College of Science & Commerce (TCSC). On the theoretical side, my research areas are Convolutional Neural Networks, Bayesian Deep Learning and Computational Linguistics; on the application side, they are Natural Language Processing, Object Detection and Object Tracking.

For more details about me, please see the links below:

LinkedIn https://www.linkedin.com/in/yashindulkar/

Github https://github.com/yashindulkar

Medium https://yashindulkar.medium.com/

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
