People Count in a Retail Store

Srijita Last Updated : 21 May, 2021

This article was published as a part of the Data Science Blogathon

Aim

Crowd counting is one of the more interesting problems in the computer vision domain. Counting people in and around retail stores is useful in many applications, such as dwell time monitoring, staffing changes, and queue management.

In this blog, we will discuss popular people-counting methods, along with some video processing steps that improve the results. Algorithms such as Haar Cascade, HOG, and OpenCV background subtraction are commonly used for people detection. Once we understand these methods and their advantages, we can apply them to the people-counting use case discussed below.

Our aim is to find the number of people inside the store at a particular hour (dwell time) and the number of people in various sections (groceries, beverages, etc.) of the retail store using CCTV footage. This requires CCTV videos from the entry point and from the different sections inside the store.

The video below shows a typical CCTV footage of a retail store, having various sections of the store in the field of view.

Algorithms

Let’s discuss some of the people detection algorithms along with the approach used in this blog:

1. Haar Cascade people detection algorithm- It is an ML-based approach in which a cascade function is trained on a large number of positive and negative images. Pre-trained cascades are then used for detection. Learn more about this method here: cascade.

Below is the code for it:

import cv2

# Create our body classifier
body_classifier = cv2.CascadeClassifier('haarcascade_fullbody.xml')

# Initiate video capture for video file
cap = cv2.VideoCapture('/moskva.mov')

# Loop once video is successfully loaded
while cap.isOpened():
    # Read the next frame; stop when the video ends
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Pass frame to our body classifier
    bodies = body_classifier.detectMultiScale(gray, 1.1, 3)

    # Extract bounding boxes for any bodies identified
    for (x, y, w, h) in bodies:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 255), 2)
    cv2.imshow('Pedestrians', frame)

    if cv2.waitKey(1) == 13:  # 13 is the Enter key
        break

cap.release()
cv2.destroyAllWindows()

2. Simple HOG detection- HOG (Histogram of Oriented Gradients) is a type of “feature descriptor”. The technique counts occurrences of gradient orientations in localized portions of an image, and therefore of each frame in a video. Learn more about this method here: hog.

Below is the code for it:

import cv2
import imutils

# Initialising HOG person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Reading the image
image = cv2.imread('img.png')

# Resizing the image
image = imutils.resize(image, width=min(400, image.shape[1]))

# Detecting all the regions in the image that contain pedestrians
(regions, _) = hog.detectMultiScale(image, winStride=(4, 4), padding=(4, 4), scale=1.05)

# Drawing the regions on the image
for (x, y, w, h) in regions:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)

# Showing the output image
cv2.imshow("Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
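To make the idea of "counting occurrences of gradient orientation" more concrete, here is a minimal sketch of the histogram for a single cell, written in plain Python. It is a simplification and not what OpenCV does internally: the real descriptor also interpolates between bins and normalizes over blocks of cells. The function name `orientation_histogram` and the 8x8 toy cell are assumptions made for illustration.

```python
import math

def orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees)."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    # Skip the border so the central differences stay inside the cell
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the direction into 0-180 degrees (unsigned orientation)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# Toy cell with a vertical edge: dark left half, bright right half
cell = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(8)]
hist = orientation_histogram(cell)
```

For this toy cell all of the gradient energy falls into the first bin, because a vertical edge produces purely horizontal gradients.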

3. OpenCV background subtraction- Background subtraction is a major preprocessing step in many vision-based applications. Examples include a visitor counter, where a static camera counts the visitors entering or leaving a room, or a traffic camera that extracts information about vehicles. In all these cases, you first need to isolate the people or vehicles, which technically means extracting the moving foreground from the static background. It is a relatively fast method for real-time people detection. OpenCV has implemented three such algorithms:

BackgroundSubtractorMOG
BackgroundSubtractorMOG2
BackgroundSubtractorGMG

Learn more about these here: opencv

Below is the implementation of OpenCV background subtraction using BackgroundSubtractorMOG2:

import cv2

cap = cv2.VideoCapture('vtest.avi')
fgbg = cv2.createBackgroundSubtractorMOG2()

while True:
    ret, frame = cap.read()
    # Stop when the video ends or a frame cannot be read
    if not ret:
        break
    # Update the background model and get the foreground mask
    fgmask = fgbg.apply(frame)
    cv2.imshow('frame', fgmask)
    # Exit when the Esc key (code 27) is pressed
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()

Source : https://docs.opencv.org/3.4/d1/dc5/tutorial_background_subtraction.html

The second image shows the OpenCV background subtraction results on the first image.

Our approach uses this method. Contour detection and morphological transformations are applied to the foreground mask to count people more accurately.

4. HOG with linear SVM algorithm- The accuracy of the HOG detector (discussed in the Simple HOG detection method) can be further improved by using an SVM classifier to classify positive and negative features from sample images.

HOG features are extracted from collected positive and negative image samples and used to train the SVM model. Of the methods discussed here, this one counts traffic most accurately, and the algorithm can be customized: negative images (background images of the retail store) can be generated for any new store to increase the accuracy.

Approach

Comparison of the above-mentioned algorithms :

Source : self project work

Let’s look at the approach used in this blog, based on the above observations and keeping in mind the different types of videos that we get from the retail store:

Splitting of videos

The store layout video is split so that traffic can be counted effectively for several categories from a single camera view. One camera's footage might cover 2-3 categories, such as the beverage and grocery sections. Splitting helps to get an accurate people count for each section of the store.

Source : self project work

As can be seen in the image above, CCTV videos are available at a bay level, so to measure the traffic at a category level, the video coverage area is split area-wise into categories.
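One possible way to express this split in code is to describe each category as a rectangle in frame coordinates and assign every detection to the rectangle that contains its centre. The rectangle coordinates and the function name below are hypothetical; real regions would be drawn by hand for each camera view.

```python
def count_per_section(boxes, regions):
    """Count detections per section.

    boxes: iterable of (x, y, w, h) bounding boxes from any detector.
    regions: dict mapping section name -> (x0, y0, x1, y1) rectangle.
    A box belongs to the first region that contains its centre point.
    """
    counts = {name: 0 for name in regions}
    for (x, y, w, h) in boxes:
        cx, cy = x + w / 2, y + h / 2
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= cx < x1 and y0 <= cy < y1:
                counts[name] += 1
                break
    return counts

# Hypothetical layout: a 640x480 frame split into two side-by-side sections
regions = {"grocery": (0, 0, 320, 480), "beverage": (320, 0, 640, 480)}
boxes = [(10, 100, 50, 120), (300, 100, 60, 120), (500, 50, 40, 100)]
section_counts = count_per_section(boxes, regions)
```

Using the box centre rather than the whole box avoids counting a person twice when they stand on the boundary between two sections.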

Results

In this use case, our main task is to estimate the number of people inside the store (and in its various sections) in order to analyze dwell time. Having discussed the algorithms and approaches suitable for this case, let’s look at the results:

Entrance/Exit Camera Video

The algorithm used:- OpenCV background subtractor

Reason:- Fast detection is needed because people usually enter at a relatively fast speed (compared to their slow movement inside the store). People are detected when they cross the camera view.
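The crossing logic is not spelled out in the article, so the following is only a sketch of one common way to do it: follow each detected person's centroid over time and count a crossing when it passes a virtual line in the frame. The tracker that produces the per-person tracks, the line position, and the direction convention (moving down the image means entering) are all assumptions.

```python
def count_line_crossings(tracks, line_y):
    """Count entries and exits from centroid tracks.

    tracks: dict mapping a track id -> list of centroid y-coordinates over time.
    Moving from above the line to on or below it counts as an entry;
    the opposite direction counts as an exit.
    """
    entries = exits = 0
    for ys in tracks.values():
        for prev, curr in zip(ys, ys[1:]):
            if prev < line_y <= curr:
                entries += 1
            elif prev >= line_y > curr:
                exits += 1
    return entries, exits

# Hypothetical tracks: one person enters, one leaves, one never reaches the line
tracks = {1: [100, 150, 210, 260], 2: [300, 240, 190], 3: [50, 80, 120]}
entries, exits = count_line_crossings(tracks, line_y=200)
occupancy_change = entries - exits
```

Summing `entries - exits` over an hour gives the net change in the number of people inside the store for that hour.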

Result:-

Camera videos above various sections inside the store

The algorithm used:- HOG (linear SVM classifier)

Reason:- Accurate detection is needed since people usually walk with trolleys or children. Of the algorithms compared, this one is best suited to that scenario.

Result:-

Grocery section people count in every frame :

Beverage section people count in every frame :

Let us know in the comments in case of any approach that may further enhance the results.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
