A Classic Computer Vision Project – How to Add an Image Behind Objects in a Video

Prateek Joshi, Last Updated: 15 Jun, 2020

Overview

Adding an image behind a moving object is a classic computer vision project
Learn how to add a logo in a video using traditional computer vision techniques

Introduction

I was thrown a challenge by one of my colleagues: build a computer vision model that could insert any image in a video without distorting the moving object. This, as you can imagine, was quite an intriguing project and I had a blast working on it.

Working with videos is notoriously difficult because of their dynamic nature. Unlike images, we don’t have static objects that we can easily identify and track. The complexity goes up several notches, and that is where a firm grasp of image processing and computer vision techniques comes to the fore.

I decided to go with a logo in the background. The challenge, which I will elaborate on later, was to insert a logo in a way that wouldn’t impede the dynamic nature of the object in any given video. I used Python and OpenCV to build this computer vision system – and have shared my approach in this article.

We will be using image processing concepts and OpenCV in this article.

Table of Contents

Understanding the Problem Statement
Getting the Data for this Project
Setting the Blueprint for our Computer Vision Project
Implementing the Technique in Python – Let’s Add the Logo!

Understanding the Problem Statement

This is going to be quite an uncommon use case of computer vision. We will be embedding a logo in a video. Now you must be thinking – what’s the big deal in that? We can simply paste the logo on top of the video, right?

However, that logo might just hide some interesting action in the video. What if the logo blocks the moving object in front? That doesn’t make a lot of sense and makes the editing look amateurish.

Therefore, we have to figure out how we can add the logo somewhere in the background such that it doesn’t block the main action going on in the video. Check out the video below – the left half is the original video and the right half has the logo appearing on the wall behind the dancer:

This is the idea we’ll be implementing in this article.

Getting the Data for this Project

I have taken this video from pexels.com, a website for free stock videos. As I mentioned earlier, our objective is to put a logo in the video such that it should appear behind a certain moving object. So, for the time being, we will use the logo of OpenCV itself. You can use any logo you want (perhaps your favorite sports team?).

[Image: the OpenCV logo]

You can download both the video and the logo from here.

Setting the Blueprint for our Computer Vision Project

Let’s first understand the approach before we implement this project. To perform this task, we will take the help of image masking. Let me show you some illustrations to understand the technique.

Let’s say we want to put a rectangle (fig 1) in an image (fig 2) in such a manner that the circle in the second image should appear on top of the rectangle:

[Figure: Fig 1 (the rectangle) and Fig 2 (the image containing the pink circle)]

So, the desired outcome should look like this:

[Figure: the desired outcome, with the circle in front of the rectangle]

However, it is not that straightforward. When we take the rectangle from Fig 1 and insert it in Fig 2, it will appear on top of the pink circle:

[Figure: the rectangle pasted on top of the pink circle]

This is not what we want. The circle should have been in front of the rectangle. So, let’s understand how we can solve this problem.

These images are essentially arrays. The values of these arrays are the pixel values and every color has its own pixel value. So, we would somehow set the pixel values of the rectangle to 1 where it is supposed to be overlapping with the circle (in Fig 5), while leaving the rest of the pixel values of the rectangle as they are.

In Fig 6, the region enclosed by blue-dotted lines is the region where we would put the rectangle. Let’s denote this region by R. We would set all the pixel values of R to 1 as well. However, we would leave the pixel values of the entire pink circle unchanged:

[Figure: Fig 6, with region R enclosed by blue dotted lines]

Our next step is to multiply the pixel values of the rectangle with the pixel values of R. Since multiplying any number by 1 gives that number itself, all the pixels of R that are 1 will be replaced by the pixels of the rectangle. Similarly, the pixels of the rectangle that are 1 will be replaced by the pixels of Fig 6. The final output will turn out to be something like this:

[Figure: the final masked output]
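To make the multiply-by-1 idea concrete, here is a minimal NumPy sketch on toy grayscale arrays. The function name `composite_behind`, the array sizes, and the pixel values are all made up for illustration; this is one way to express the trick, not the exact code used for the video.

```python
import numpy as np

def composite_behind(image, overlay, top, left, front_mask):
    """Paste `overlay` into `image` at (top, left), keeping the pixels where
    `front_mask` is True (the object that must stay in front)."""
    h, w = overlay.shape[:2]
    front = front_mask[top:top + h, left:left + w]

    # Work in a wider integer type so the products cannot overflow uint8.
    region = image[top:top + h, left:left + w].astype(np.int64)  # this is R
    patch = overlay.astype(np.int64)                             # a copy of the overlay

    patch[front] = 1     # the overlay gives way where the object is in front
    region[~front] = 1   # R gives way to the overlay everywhere else

    out = image.copy()
    out[top:top + h, left:left + w] = (patch * region).astype(image.dtype)
    return out

# Toy data: a background of 50s, a square "circle" of 200s, a rectangle of 90s.
image = np.full((6, 6), 50, dtype=np.uint8)
circle = np.zeros((6, 6), dtype=bool)
circle[2:4, 2:4] = True
image[circle] = 200
rectangle = np.full((3, 3), 90, dtype=np.uint8)

result = composite_behind(image, rectangle, top=1, left=1, front_mask=circle)
print(result)
```

Inside the pasted region, the result holds the rectangle's value (90) everywhere except where the circle sits, and there the circle's value (200) survives.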

This is the technique we are going to use to embed the OpenCV logo behind the dancing guy in the video. Let’s do it!

Implementing the Technique in Python – Let’s Add the Logo!

You can use a Jupyter Notebook or any IDE of your choice and follow along. We will first import the necessary libraries.

Import Libraries

Note: The version of the OpenCV library used for this tutorial is 4.0.0.

Load Images

Next, we will specify the path to the working directory where the logo and video are kept. Please note that you are supposed to specify the “path” in the code snippet below:

So, we have loaded the logo image and the first frame of the video. Now let’s look at the shape of these images or arrays:

logo.shape, frame.shape

Output: ((240, 195, 3), (1080, 1920, 3))

Both the outputs are 3-dimensional. The first dimension is the height of the image, the second dimension is the width of the image and the third dimension is the number of channels in the image, i.e., blue, green, and red.

Now, let’s plot and see the logo and the first frame of the video:

plt.imshow(logo)
plt.show()

plt.imshow(cv2.cvtColor(frame,cv2.COLOR_BGR2RGB))
plt.show()

Technique to Create Image Mask

The frame size is much bigger than the logo. Therefore, we can place the logo at a number of places. However, placing the logo at the center of the frame seems perfect to me as most of the action will happen around that region in the video. So, we will put the logo in the frame as shown below:

Don’t worry about the black background in the logo. We will set the pixel values in the black region to 1 later in the code. Now the problem we have to solve is that of dealing with the moving object appearing in the same region where we have placed the logo.

As discussed earlier, we need the moving object to occlude the logo wherever the two overlap.
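The centered placement can be worked out directly from the two shapes we printed earlier. A small sketch, where `centered_roi` is a helper name I made up for this example:

```python
def centered_roi(frame_shape, logo_shape):
    """Return (top, bottom, left, right) of a logo-sized box centered in the frame."""
    fh, fw = frame_shape[:2]
    lh, lw = logo_shape[:2]
    top = (fh - lh) // 2
    left = (fw - lw) // 2
    return top, top + lh, left, left + lw

top, bottom, left, right = centered_roi((1080, 1920, 3), (240, 195, 3))
print(top, bottom, left, right)  # 420 660 862 1057
```

With these indices, `frame[top:bottom, left:right]` has exactly the same shape as the logo, which is what lets us combine the two arrays pixel by pixel.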

Right now, the area where we will put the logo has a wide range of pixel values. Ideally, all the pixel values should be the same in this area. So how can we do that?

We will have to make the pixels of the wall enclosed by the green dotted box have the same value. We can do this with the help of HSV (hue, saturation, value) colorspace:

Our frame is in OpenCV’s default BGR colorspace. We will convert it into an HSV image. The image below is the HSV version:

The next step is to find the range of the HSV values of only the part that is inside the green dotted box. It turns out that most of the pixels in the box range from [6, 10, 68] to [30, 36, 122]. These are the lower and upper HSV ranges, respectively.
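The text above does not say how this range was found. One possible way, and this is purely my assumption, is to take low and high percentiles of each HSV channel inside the box so that a few outlier pixels are ignored. The helper name `hsv_range_in_box` and the 5th/95th percentile defaults are hypothetical.

```python
import numpy as np

def hsv_range_in_box(hsv, top, bottom, left, right, low_pct=5, high_pct=95):
    """Estimate lower and upper HSV bounds that cover most pixels in a box."""
    box = hsv[top:bottom, left:right].reshape(-1, 3)   # one row per pixel
    lower = np.percentile(box, low_pct, axis=0).astype(np.uint8)
    upper = np.percentile(box, high_pct, axis=0).astype(np.uint8)
    return lower, upper

# Toy HSV image: the box is filled with one made-up wall color.
hsv = np.zeros((10, 10, 3), dtype=np.uint8)
hsv[2:8, 2:8] = (20, 30, 100)
lower, upper = hsv_range_in_box(hsv, 2, 8, 2, 8)
print(lower, upper)
```

On a real frame you would pass the HSV image and the coordinates of the logo region, then widen or narrow the percentiles until the mask covers the wall cleanly.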

Now, using this range of HSV values, we can create a binary mask. This mask is nothing but an image with pixel values of either 0 or 255. The pixels falling between the lower and upper HSV bounds will be equal to 255 and the rest of the pixels will be 0.

Given below is the mask prepared from the HSV image. All the pixels in the yellow region have a pixel value of 255 and the rest have a pixel value of 0:

Now we can easily set the pixel values inside the green dotted box to 1 as and when required. Let’s go back to the code:

The code snippet above will load the frames from the video, pre-process them, create the HSV images and masks, and finally insert the logo into the video. And there you have it!

End Notes

In this article, we covered a very interesting use case of computer vision and implemented it from scratch. In the process, we also learned about working with image arrays and how to create masks from these arrays.

This is something that would help you when you work on other computer vision tasks. Feel free to reach out to me if you have any doubts or feedback to share. I would be glad to help you.

Prateek Joshi

Data Scientist at Analytics Vidhya with multidisciplinary academic background. Experienced in machine learning, NLP, graphs & networks. Passionate about learning and applying data science to solve real world problems.

Mahtab

It's a very grateful article I have learn something new thank you

Prateek Joshi

You're welcome Mahtab!

Nisha

This article is very useful for me.... So keep sharing articles related to CV.

Thanks Nisha!

Rajesh

Great article. If you are using Google colab notebook, need to change cv2.imshow to cv2_imshow for the code to work. Thanks for great details.

Thanks Rajesh for your input.
