Your Social Distancing Detection Tool: How to Build One using your Deep Learning Skills

Aravind Pai Last Updated : 04 Jan, 2021

10 min read

Overview

Learn how to build your own Social Distancing Tool using your Deep Learning and Computer Vision skills
Understand the State-of-the-Art architectures (SOTA) for Object Detection
Hands-on with Detectron 2 – FAIR library for Object Detection and Segmentation – required to build the social distancing tool

Introduction

Social Distancing – the term that has taken the world by storm and is transforming the way we live. Social distancing has become a mantra around the world, transcending languages and cultures.

This way of living has been forced upon us by the fastest growing pandemic the world has ever seen – COVID-19. As per the World Health Organization (WHO), COVID-19 has so far infected almost 4 million people and claimed over 230K lives globally. Around 213 countries have been affected so far by the deadly virus.

The biggest cause of concern is that COVID-19 spreads from person to person through contact or if you’re within close proximity of an infected person. Given how densely populated some areas are, this has been quite a challenge.

The only way to prevent the spread of COVID-19 is Social Distancing. Keeping a safe distance from each other is the ultimate way to prevent the spread of this disease (at least until a vaccine is found).

So this got me thinking – I want to build a tool that can potentially detect where each person is in real-time, and return a bounding box that turns red if the distance between two people is dangerously close. This can be used by governments to analyze the movement of people and alert them if the situation turns serious.

Here’s a taste of the social distancing detection tool we’ll be building:

I would recommend going through the below articles and courses if you need a refresher:

Overview of Object Detection and Tracking
Evolution of SOTA for Object Detection
1. Sliding Window
2. R-CNN
3. Fast R-CNN
4. Faster R-CNN
Social Distancing Tool – An Use Case of Object Detection and Tracking
What’s Next?

Overview of Object Detection and Tracking

I can vividly recall my initial days learning computer vision. I often got confused between these two terms – Image Classification and Object Detection. I used both of these terms interchangeably assuming the idea behind them was similar. And, as a result, I kept getting confused between deep learning projects. Not ideal!

So, I will kickstart the article by answering this perplexing question – are image classification and object detection one and the same?

Think about it – objects are everywhere! That’s why Object Detection and Image Classification are very popular tasks in computer vision. They have a wide range of applications in defense, healthcare, sports, and the space industry.

The fundamental difference between these two tasks is that image classification identifies an object in an image whereas object detection identifies the object as well as its location in an image. Here’s a classic example to understand this difference:

Well, then how is Object Tracking different from Object Detection?

Object Tracking and Object Detection are similar in terms of functionality. These two tasks involve identifying the object and its location. But, the only difference between them is the type of data that you are using. Object Detection deals with images whereas Object Tracking deals with videos.

Object Detection applied on each and every frame of a video turns into an Object Tracking problem.

As a video is a collection of fast-moving frames, Object Tracking identifies an object and its location from each and every frame of a video.

Evolution of State-of-the-Art (SOTA) for Object Detection

Object Detection is one of the most challenging problems in computer vision. Having said that, there has been an immense improvement over the past 20 years in this field. We can broadly divide this into two generations – before and after deep learning:

History of Object Detection (Reference: Object Detection in 20 Years: A Survey)

Now, I will discuss some of the popular and widely used techniques for Object Detection.

Sliding Window for Object Detection

The simplest approach to build an Object Detection model is through a Sliding Window approach. As the name suggests, an image is divided into regions of a particular size and then every region is classified into the respective classes.

Remember that the regions can be overlapping and varying in size as well. It all depends on the way you want to formulate the problem.

Model Workflow

Consider an image
Divide an image into regions (assume 10 * 10 region)
For each region:
1. Pass a region to Convolutional Neural Network (CNN)
2. Extract features from CNN
3. Pass features to a classifier & regressor

This method is really simple and efficient. But it’s a time-consuming process as it considers the huge number of regions for classification. Now, we will see how we can reduce the number of regions for classification in the next approach.

R-CNN for Object Detection

So how do we make this a non-time consuming task? This can be brought down by discarding the regions that are not likely to contain the object. This process of extracting the regions that are likely to contain the object is known as Region Proposals.

Region proposals have a higher probability of containing an object

Many Region Proposal algorithms have been proposed to select a Region of Interest (ROI). Some of the popular ones are objectness, selective search, category-independent object proposals, etc. So, R-CNN was proposed with an idea of using the exterior region proposal algorithm.

R-CNN stands for Region-based Convolutional Neural Network. It uses one of the external region proposal algorithms to select the region of interest (ROI).

Model Workflow

Consider an image
Select ROI using exterior region proposal algorithm
For each region:
1. Pass a region to CNN
2. Extract features from CNN
3. Pass features to a classifier & regressor

The predicted regions can be overlapping and varying in size as well. So, Maximum Non Suppression is used to ignore the bounding boxes depending upon the Intersection Over Union (IOU) score:

Certainly, R-CNN’s architecture was the State of the Art (SOTA) at the time of the proposal. But it consumes nearly 50 seconds for every test image during inference because of the number of forward passes to a CNN for feature extraction. As you can observe under the model workflow, every region proposal is passed to a CNN for feature extraction.

For example, if an image has 2000 regions of proposals, then the number of forward passes to the CNN is around 2000. This inevitably led to another model architecture known as Fast R-CNN.

Fast R-CNN for Object Detection

In order to reduce the inference speed, a slight change in the R-CNN workflow was made and proposed, known as Fast R-CNN. The modification was done in the feature extraction of region proposals.

In R-CNN, feature extraction takes place for each region proposal whereas, in Fast R-CNN, feature extraction takes place only once for an original image. Then the relevant ROI features are chosen based on the location of the region proposals. These region proposals are constructed before passing an image to CNN.

Remember that the input to CNN is the actual image without any ROI:

Model Workflow

Consider an image
Select Regions of Interest (ROI) using exterior region proposal algorithm
Pass an image to the CNN
Extract the features of an image
Choose relevant ROI features using the location of ROI
For each ROI feature, pass features to a classifier & regressor

During inference, Fast R-CNN consumes nearly 2 seconds for each test image and is about 25 times faster than R-CNN. The reason being the change in the feature extraction of ROI. For example, if an image has a 2000 region of proposals, then the number of forward passes to the CNN is around 1.

Can we still bring down the inference speed? Yes! It’s possible. This led to Faster R-CNN, a SOTA model for object detection tasks.

Faster R-CNN for Object Detection

Faster R-CNN replaces the exterior region proposal algorithm with a Region Proposal Network (RPN). RPN learns to propose the region of interests which in turn saves a lot of time and computation as compared to a Fast R-CNN.

Faster R-CNN = Fast R-CNN + RPN

Model Workflow

Consider an image
Pass an image to CNN
Extract the features of an image
Select ROI features using Region Proposal Network (RPN)
For each ROI feature, pass features to a classifier & regressor

Faster R-CNN takes close to 0.2 seconds for every test image during inference and is about 250 times faster than Fast R-CNN.

Your Social Distancing Tool – A Use Case of Object Detection & Tracking

Social Distancing is the only way to prevent the spread of COVID-19 right now. Recently, Andrew Ng’s Landing AI team created a Social Distancing Tool using the concepts of Computer Vision. This project is inspired by their work. You can download the video from here.

Time to power up your coding skills!

Note: The code is developed on Google Colab. I would recommend using the same. Change the runtime to GPU prior to installing libraries.

Understanding Detectron 2

Detectron 2 is an open-source library for object detection and segmentation created by the Facebook AI Research team, popularly known as FAIR. Detectron 2 implements state of the art architectures like Faster R CNN, Mask R CNN, and RetinaNet for solving different computer vision tasks, such as:

Object Detection
Instance Segmentation
Keypoint Detection
Panoptic Segmentation

The baseline models of Faster R-CNN and Mask R-CNN are available with 3 different backbone combinations. Please refer to this Detectron-2 GitHub repository for additional details.

Let’s begin!

Install Dependencies

	# install dependencies: (use cu101 because colab has CUDA 10.1)
	!pip install cython pyyaml==5.1

	# install detectron2:
	!pip install detectron2==0.1.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html

view raw 12_0.py hosted with ❤ by GitHub

Import Libraries

	# You may need to restart your runtime prior to this, to let your installation take effect
	# Some basic setup:
	# Setup detectron2 logger
	import detectron2
	from detectron2.utils.logger import setup_logger
	setup_logger()

	# import some common libraries
	import numpy as np
	import cv2
	import random
	from google.colab.patches import cv2_imshow
	import matplotlib.pyplot as plt

	# import some common detectron2 utilities
	from detectron2 import model_zoo
	from detectron2.engine import DefaultPredictor
	from detectron2.config import get_cfg
	from detectron2.utils.visualizer import Visualizer
	from detectron2.data import MetadataCatalog

view raw 12_1.py hosted with ❤ by GitHub

Reading a video

Read a video and save frames to a folder:

	%%time
	!rm -r frames/*
	!mkdir frames/

	#specify path to video
	video = "sample.mp4"

	#capture video
	cap = cv2.VideoCapture(video)
	cnt=0

	# Check if video file is opened successfully
	if (cap.isOpened()== False):
	print("Error opening video stream or file")

	ret,first_frame = cap.read()

	#Read until video is completed
	while(cap.isOpened()):

	# Capture frame-by-frame
	ret, frame = cap.read()

	if ret == True:

	#save each frame to folder
	cv2.imwrite('frames/'+str(cnt)+'.png', frame)
	cnt=cnt+1
	if(cnt==750):
	break

	# Break the loop
	else:
	break

view raw 12_2.py hosted with ❤ by GitHub

Check the frame rate of a video:

	#frame rate of a video
	FPS=cap.get(cv2.CAP_PROP_FPS)
	print(FPS)

view raw 10_23.py hosted with ❤ by GitHub

Output: 25.0

Download the pre-trained model for object detection from Detectron 2’s model zoo and then the model is ready for inference:

	cfg = get_cfg()

	# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
	cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_C4_3x.yaml"))
	cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.9 # set threshold for this model

	# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
	cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_C4_3x.yaml")
	predictor = DefaultPredictor(cfg)

view raw 12_3.py hosted with ❤ by GitHub

Read an image and pass it to the model for predictions:

	#read an image
	img = cv2.imread("frames/30.png")

	#pass to the model
	outputs = predictor(img)

view raw 10_22.py hosted with ❤ by GitHub

Can you guess the output of the model? Yes, Objects and Locations, since its an object detection model. We can use Visualizer to draw the predictions on the image:

	# Use `Visualizer` to draw the predictions on the image.
	v = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
	v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
	cv2_imshow(v.get_image()[:, :, ::-1])

view raw 12_4.py hosted with ❤ by GitHub

As you can see here, multiple objects are present in an image, like a person, bicycle, and so on. We are well on our way to building the social distancing detection tool!

Next, understand the objects present in an image:

	classes=outputs['instances'].pred_classes.cpu().numpy()
	print(classes)

view raw 10_24.py hosted with ❤ by GitHub

Have a glance at the bounding boxes of an object:

	bbox=outputs['instances'].pred_boxes.tensor.cpu().numpy()
	print(bbox)

view raw 10_25.py hosted with ❤ by GitHub

As different objects are present in an image, let’s identify classes and bounding boxes related to only the people:

	#identity only persons
	ind = np.where(classes==0)[0]

	#identify bounding box of only persons
	person=bbox[ind]

	#total no. of persons
	num= len(person)

view raw 12_5.py hosted with ❤ by GitHub

Understand the format of the bounding box:

	x1,y1,x2,y2 = person[0]
	print(x1,y1,x2,y2)

view raw 10_26.py hosted with ❤ by GitHub

bounding box

Draw a bounding box for one of the people:

	img = cv2.imread('frames/30.png')
	_ = cv2.rectangle(img, (x1, y1), (x2, y2), (255,0,0), 2)

	plt.figure(figsize=(20,10))
	plt.imshow(img)

view raw 12_6.py hosted with ❤ by GitHub

Our ultimate goal is to compute the distance between two people in an image. Once we know the bounding box for each person, we can easily compute the distance between any two people. But the challenge here is to select the right coordinate for representing a person as a bounding box is in the form of a rectangle.

I have chosen the bottom center of a rectangle for representing each person to measure the distance accurately and also this measure is invariant of the height of a person:

	#compute center
	x_center = int((x1+x2)/2)
	y_center = int(y2)

	center = (x_center, y_center)

	_ = cv2.circle(img, center, 5, (255, 0, 0), -1)
	plt.figure(figsize=(20,10))
	plt.imshow(img)

view raw 12_7.py hosted with ❤ by GitHub

Define a function that returns the bottom center of every bounding box:

	#define a function which return the bottom center of every bbox
	def mid_point(img,person,idx):
	#get the coordinates
	x1,y1,x2,y2 = person[idx]
	_ = cv2.rectangle(img, (x1, y1), (x2, y2), (0,0,255), 2)

	#compute bottom center of bbox
	x_mid = int((x1+x2)/2)
	y_mid = int(y2)
	mid = (x_mid,y_mid)

	_ = cv2.circle(img, mid, 5, (0, 0, 255), -1)
	cv2.putText(img, str(idx), mid, cv2.FONT_HERSHEY_SIMPLEX,1, (255, 255, 255), 2, cv2.LINE_AA)

	return mid

view raw 12_8.py hosted with ❤ by GitHub

Compute the bottom center for every bounding box and draw the points on the image:

	#call the function
	midpoints = [mid_point(img,person,i) for i in range(len(person))]

	#visualize image
	plt.figure(figsize=(20,10))
	plt.imshow(img)

view raw 12_9.py hosted with ❤ by GitHub

Define a function to compute the Euclidean distance between every two points in an image:

	%%time
	from scipy.spatial import distance
	def compute_distance(midpoints,num):
	dist = np.zeros((num,num))
	for i in range(num):
	for j in range(i+1,num):
	if i!=j:
	dst = distance.euclidean(midpoints[i], midpoints[j])
	dist[i][j]=dst
	return dist

view raw 12_19.py hosted with ❤ by GitHub

Compute the distance between every pair of points:

dist= compute_distance(midpoints,num)

view raw 10_26.py hosted with ❤ by GitHub

Define a function that returns the closest people based on the given proximity distance. Here, proximity distance refers to the minimum distance between two people:

	%%time
	def find_closest(dist,num,thresh):
	p1=[]
	p2=[]
	d=[]
	for i in range(num):
	for j in range(i,num):
	if( (i!=j) & (dist[i][j]<=thresh)):
	p1.append(i)
	p2.append(j)
	d.append(dist[i][j])
	return p1,p2,d

view raw 12_12.py hosted with ❤ by GitHub

Set the threshold for the proximity distance. Here, I have chosen that to be 100. Let’s find the people who are within the proximity distance:

	import pandas as pd

	thresh=100
	p1,p2,d=find_closest(dist,num,thresh)
	df = pd.DataFrame({"p1":p1,"p2":p2,"dist":d})
	df

view raw 12_14.py hosted with ❤ by GitHub

pandas

From the output, we can observe that 4 people come under the red zone as the distance between them is less than the proximity threshold.

Define a function to change the color of the closest people to red:

	def change_2_red(img,person,p1,p2):
	risky = np.unique(p1+p2)
	for i in risky:
	x1,y1,x2,y2 = person[i]
	_ = cv2.rectangle(img, (x1, y1), (x2, y2), (255,0,0), 2)
	return img

view raw 12_15.py hosted with ❤ by GitHub

Let’s change the color of the closest people to red:

	img = change_2_red(img,person,p1,p2)

	plt.figure(figsize=(20,10))
	plt.imshow(img)

view raw 12_16.py hosted with ❤ by GitHub

So far, we have seen a step by step procedure on how to apply object detection using Detectron-2, compute the distance between every pair of people, and then finally identify the closest people. We will carry out similar steps on each and every frame of the video now:

	import os
	import re

	names=os.listdir('frames/')
	names.sort(key=lambda f: int(re.sub('\D', '', f)))

view raw 12_17.py hosted with ❤ by GitHub

Define a function that performs all the steps we covered on each and every frame of the video:

	def find_closest_people(name,thresh):

	img = cv2.imread('frames/'+name)
	outputs = predictor(img)
	classes=outputs['instances'].pred_classes.cpu().numpy()
	bbox=outputs['instances'].pred_boxes.tensor.cpu().numpy()
	ind = np.where(classes==0)[0]
	person=bbox[ind]
	midpoints = [mid_point(img,person,i) for i in range(len(person))]
	num = len(midpoints)
	dist= compute_distance(midpoints,num)
	p1,p2,d=find_closest(dist,num,thresh)
	img = change_2_red(img,person,p1,p2)
	cv2.imwrite('frames/'+name,img)
	return 0

view raw 10_18.py hosted with ❤ by GitHub

Identify the closest people in each frame and change the color to red:

	from tqdm import tqdm
	thresh=100
	_ = [find_closest_people(names[i],thresh) for i in tqdm(range(len(names))) ]

view raw 10_20.py hosted with ❤ by GitHub

After identifying the closest people in each frame, convert the frames back to a video. That’s it!

	%%time
	frames = os.listdir('frames/')
	frames.sort(key=lambda f: int(re.sub('\D', '', f)))

	frame_array=[]

	for i in range(len(frames)):

	#reading each files
	img = cv2.imread('frames/'+frames[i])
	img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)

	height, width, layers = img.shape
	size = (width,height)

	#inserting the frames into an image array
	frame_array.append(img)

	out = cv2.VideoWriter('sample_output.mp4',cv2.VideoWriter_fourcc(*'DIVX'), 25, size)

	for i in range(len(frame_array)):
	# writing to a image array
	out.write(frame_array[i])
	out.release()

view raw 10_21.py hosted with ❤ by GitHub

How cool is that?

What’s Next?

Keep in mind that the projection of the camera also matters a lot while computing the distance between the objects in an image.

In our case, I have not taken into account the projection of the camera since the impact of the camera’s projection on the estimated distance is minimum. However, the universal approach is to convert a video into a top view or birds’ eye view and then compute the distance between two objects in an image. This task is known as Camera Calibration.

Keep that in mind if you want to explore this further and customize your own social distancing detection tool.

End Notes

This brings us to the end of our tutorial on how to build your own social distancing tool using computer vision. I hope you have enjoyed the tutorial and found it useful. If you have any comments/queries, kindly leave them in the comments section below and I will reach out to you. And remember:

Stay SAFE (Stay Away From Everyone) to prevent the spread of the COVID-19 pandemic

Aravind Pai

Aravind Pai is passionate about building data-driven products for the sports domain. He strongly believes that Sports Analytics is a Game Changer.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Jean

The code is not working in Google Colab. Error: -------------------------------------------------------------------------- ImportError Traceback (most recent call last) in () 14 15 # import some common detectron2 utilities ---> 16 from detectron2 import model_zoo 17 from detectron2.engine import DefaultPredictor 18 from detectron2.config import get_cfg 4 frames /usr/local/lib/python3.6/dist-packages/detectron2/layers/deform_conv.py in () 8 from torch.nn.modules.utils import _pair 9 ---> 10 from detectron2 import _C 11 12 from .wrappers import _NewEmptyTensorOp ImportError: libtorch_cpu.so: cannot open shared object file: No such file or directory --------------------------------------------------------------------------- NOTE: If your import is failing due to a missing package, you can manually install dependencies using either !pip or !apt. To view examples of installing some common dependencies, click the "Open Examples" button below.

Show 1 reply

Please install torch 1.5 to resolve the error

Karanbir singh

Great work!! You are using euclidean distance for measuring distance but doesnot variation due to depth of two persons affect it.Two people who are standing one infront of other will appear as they are close but actually they can be wide apart or viceversa. How to take that into account ? Is there a true need for taking that into account?

Maddula Ravi Prakash

Hi Aravind, I am trying to run to do installation, packages get installed but when try to import detectron i get the below error. any idea what's this issue. 8 from torch.nn.modules.utils import _pair 9 ---> 10 from detectron2 import _C 11 12 from .wrappers import _NewEmptyTensorOp ImportError: libtorch_cpu.so: cannot open shared object file: No such file or directory

Hi, installing torch 1.5 resolves the error

Reading list

Introduction to Computer Vision

Getting Started with Image Data

Introduction to CNN and Implementation

Introduction to CNN and implementation

Introduction to Transfer Learning

CNN Visualization

Overview of Pretrained Models

Inception

ResNets

DenseNets

CSRNet

Introduction to Object Detection

Region Based Convolutional Neural Network

Single Stage Networks

Transformed Based Object Detection Models

Face Detection

Object Tracking

Pose Estimation

Introduction to Image Segmentation

Understanding Deep Learning Architectures for Image Segmentation

Video Classification

Introduction to Image Generation

Experiments with Generative Adversarial Networks

Zero and Few Shot Learning

Model Deployment

Your Social Distancing Detection Tool: How to Build One using your Deep Learning Skills

Overview

Introduction

Table of Contents

Overview of Object Detection and Tracking

Evolution of State-of-the-Art (SOTA) for Object Detection

Sliding Window for Object Detection

R-CNN for Object Detection

Fast R-CNN for Object Detection

Faster R-CNN for Object Detection

Your Social Distancing Tool – A Use Case of Object Detection & Tracking

What’s Next?

End Notes

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC

G_ENABLED_IDPS

test_cookie

Webengage (2)

_we_us

WebKlipperAuth

LinkedIn (16)

ln_or

JSESSIONID

li_rm

AnalyticsSyncHistory