A Basic Introduction to OpenCV in Deep Learning

Pranshu Sharma Last Updated : 12 Mar, 2024


Introduction

OpenCV is a large open-source library for computer vision, machine learning, and image processing, and it plays a critical role in real-time applications, which are fundamental to today’s systems. It is used to detect objects, faces, diseases, lesions, number plates, and even handwriting in images and videos. In deep learning workflows, images are represented as numerical arrays (vectors), and mathematical operations on these arrays are used to identify visual patterns and their features.

OpenCV in Deep Learning — Source: Credera.com

This article was published as a part of the Data Science Blogathon.

What is Computer Vision?
Installing and Importing the OpenCV Image Preprocessing Package
Reading an Input Image
Image Data Type
Image Resolution
Image Pixel Values
Viewing the Images
Image Operations Using OpenCV and Python
OpenCV Applications
Functionality of OpenCV
Frequently Asked Questions

What is Computer Vision?

Computer vision is the field concerned with how images and videos are stored, and with how to manipulate them and extract information from them. It is a major branch of Artificial Intelligence. Self-driving cars, robotics, and picture editing apps all rely heavily on computer vision.

Computer vision resembles human vision. Humans learn from life experience and use it to distinguish objects, judge the distance between them, and estimate their relative positions.

With cameras, data, and algorithms, computer vision trains machines to accomplish these jobs in much less time.

Computer vision allows computers and systems to extract useful data from digital images and video inputs.

Installing and Importing the OpenCV Image Preprocessing Package

OpenCV is an important part of many machine learning pipelines. It is an open-source library (package) for computer vision, machine learning, and image processing applications. The standard Python package runs on the CPU, although OpenCV also offers optional GPU acceleration (for example through CUDA builds). It works with many different programming languages, including Python. It can be installed with a single command, as shown below:

pip install opencv-python

A package in Python is a collection of modules that contain pre-written code. You can import a whole package or individual modules from it. To use OpenCV, import the “cv2” module as shown below:

import cv2

Reading an Input Image

Colour, grayscale, binary, and multispectral images are all examples of digital images. In a colour image, each pixel carries its own colour information. Binary images have only two colours, usually black and white, and grayscale images contain only shades of grey. Multispectral images capture image data within specific wavelength ranges across the electromagnetic spectrum.

To read the image, we use the “imread” method from the cv2 package, where the first parameter is the image’s path, including filename and extension, and the second parameter is a flag that determines how to read in the image.

By changing the path, you can read any image stored on your local computer. (cv2.imread only reads local files, so an image from the internet must be downloaded first.) If the image is already in your current working directory, you only need to specify the file name and extension. Set the second parameter to 0 to read the image as grayscale, 1 to read it as a colour image, or -1 to read it unchanged (keeping the alpha, or transparency, channel if it exists).


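The following code reads a picture to be used as input and displays it:

import cv2
# Read the image from disk with cv2.imread (returns None if the file cannot be read)
img = cv2.imread("pythonlogo.png", cv2.IMREAD_COLOR)
# Create a GUI window to display the image on screen
cv2.imshow("Cute Kittens", img)
# Keep the window open until a key is pressed, then close it
cv2.waitKey(0)
cv2.destroyAllWindows()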


Image Data Type

To find the image’s data type, use the “dtype” attribute. It tells us how the visual data and the pixel values are represented.

The image itself is a NumPy array: a multidimensional container for items of the same type and size.

Pixel values for the image

An image can be thought of as a collection of small samples. These samples are referred to as pixels. To get a better understanding of an image, try zooming in as much as possible: you will see it divided into many small squares. These are pixels, and combined they form the image. One of the simplest ways to represent an image is as a matrix.

Code:

print("The data type of the image is", img.dtype)

Output:
The data type of the image is uint8

uint8 means that each pixel value is an unsigned 8-bit integer. This data type ranges from 0 to 255.
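One practical consequence of the 0 to 255 range is that plain uint8 arithmetic wraps around instead of stopping at 255. The sketch below uses NumPy only, with two made-up pixel values, to show the range and one way to avoid the wrap-around by widening the type before adding.

```python
import numpy as np

# Smallest and largest values a uint8 pixel can hold.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255

a = np.array([250], dtype=np.uint8)
b = np.array([10], dtype=np.uint8)

# uint8 arithmetic wraps around: 250 + 10 = 260, stored as 260 - 256 = 4.
wrapped = a + b

# Widen to a larger type first, then clip back into the valid range.
clipped = np.clip(a.astype(np.int16) + b, 0, 255).astype(np.uint8)

print(wrapped, clipped)  # [4] [255]
```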

Image Resolution

Image resolution is defined as the number of pixels in an image. As the number of pixels rises, the image can hold more detail, so image quality generally improves. The image’s shape gives the number of rows and columns. Common resolutions include:

320 x 240 pixels (mostly suitable for small screen devices)
1024 x 768 pixels (appropriate for viewing on standard computer monitors)
720 x 576 pixels (good for viewing on standard definition TV sets with 4:3 aspect ratio)
1280 x 720 pixels (for viewing on widescreen monitors)
1280 x 1024 pixels (for viewing on full-screen monitors)
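Resolution is usually quoted as width x height, while an image array's shape is ordered as (rows, columns, channels), that is, height first. A minimal sketch, assuming a blank 320 x 240 colour image created with NumPy in place of a real file:

```python
import numpy as np

# Hypothetical blank colour image at 320 x 240 resolution (width x height).
width, height = 320, 240
img = np.zeros((height, width, 3), dtype=np.uint8)

# shape is (rows, columns, channels): rows = height, columns = width.
rows, cols, channels = img.shape
total_pixels = rows * cols

print(f"{cols} x {rows} pixels, {total_pixels} pixels in total")
```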


Image Pixel Values

An image can be thought of as a collection of small samples called pixels. To see this, try zooming in on a picture as much as possible: it breaks up into many small squares. These are the pixels that, when combined, make up the image.

The quality of an image generally improves as the number of pixels in the image increases. The image’s shape determines the number of rows and columns.
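Because an image is stored as a matrix, individual pixel values can be read and changed by row and column index. The sketch below uses small made-up arrays in place of a loaded image. In a colour image loaded by OpenCV, each pixel holds three values in blue, green, red order.

```python
import numpy as np

# Hypothetical 2 x 3 grayscale image written out as a matrix.
gray = np.array([[0, 128, 255],
                 [64, 32, 16]], dtype=np.uint8)

value = gray[1, 0]  # pixel at row 1, column 0
gray[0, 1] = 200    # overwrite the pixel at row 0, column 1

# Hypothetical 2 x 3 colour image: each pixel has three channel values.
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 2] = (255, 0, 0)  # pure blue in OpenCV's B, G, R order
blue = color[0, 2, 0]      # channel 0 of that pixel

print(value, gray[0, 1], blue)  # 64 200 255
```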

Viewing the Images

Let’s look at how to make the image appear in a window. To do so, we need to create a graphical user interface (GUI) window. The cv2.imshow() method displays the image in a pop-up window; its first parameter is the window title, which must be a string, and its second is the image. On its own, however, the window may not display properly or may become unresponsive when you try to close it. We can use the “waitKey” method to avoid this.

Passing 0 to “waitKey” keeps the window open until a key is pressed. (You can pass a time in milliseconds instead of 0 to set how long it should wait.)

import cv2
# Read the image from disk with cv2.imread
img = cv2.imread("python logo.png", cv2.IMREAD_COLOR)
# Create a GUI window to display the image on screen
# First parameter is the window title (must be a string)
# Second parameter is the image array
cv2.imshow("The Logo", img)
# To hold the window on screen, we use cv2.waitKey.
# If 0 is passed as the parameter, it waits
# until the user presses a key.
cv2.waitKey(0)
# Remove the created GUI window from the screen
# and from memory
cv2.destroyAllWindows()

Output:

Output: GUI Window, Source: Author

Extracting and Reconstructing Image Bit Planes

An image can be divided into bit planes. An 8-bit image splits into eight planes (bit 0 to bit 7), with the highest-order planes containing the majority of the visually significant data.
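Bit-plane extraction and reconstruction can be sketched with NumPy bit operations alone. The tiny array below is a made-up stand-in for a grayscale image. Plane k holds bit k of every pixel, and weighting each plane by 2 to the power k and summing gives back the original values.

```python
import numpy as np

# A tiny synthetic 8-bit grayscale image (values chosen for illustration).
img = np.array([[0, 37, 128],
                [200, 255, 64]], dtype=np.uint8)

# Extract the eight bit planes: plane k holds bit k of every pixel (0 or 1).
planes = [(img >> k) & 1 for k in range(8)]

# Reconstruct the full image by shifting each plane back into position.
reconstructed = sum(planes[k] << k for k in range(8))

# Keep only the four highest-order planes (bits 4 to 7).
top4 = sum(planes[k] << k for k in range(4, 8))

print(np.array_equal(reconstructed, img))  # True
print(top4)
```

Keeping only the top four planes is the same as masking each pixel with 0xF0, which shows how much of the image survives when the low-order bits are dropped.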

Image Operations Using OpenCV and Python

Checking Properties of the Input Image

Input Image:

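import matplotlib.pyplot as plt
# Read the image into a NumPy array (matplotlib returns the channels in RGB order)
img = plt.imread("my pic.jpg")
plt.imshow(img)
plt.show()
print(img.shape)  # (rows, columns, channels)
print(img.size)   # total number of values: rows * columns * channels
print(img.dtype)  # data type of each value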

Output:

(1921, 1921, 3)
11070723
uint8
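The three outputs are related: size is the product of the numbers in shape. A short check using only the figures printed above:

```python
import math

# Shape reported above: (rows, columns, channels).
shape = (1921, 1921, 3)

# size counts every stored value, so it is rows * columns * channels.
size = math.prod(shape)
print(size)  # 11070723

# With uint8 (1 byte per value), this is also the memory used in bytes.
megabytes = size / (1024 * 1024)
print(round(megabytes, 2))
```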

Basic Image Processing

Input Image:

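import matplotlib.pyplot as plt
import cv2
# Read the image; OpenCV returns the channels in BGR order
image = cv2.imread("baby yoda.jpg")
# cv2.imshow("Example - Show image in window", image)
# Convert BGR to RGB so that matplotlib displays the colours correctly
img2 = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(img2)
plt.show()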


Dilation and Erosion of the Input Image

Input Image:

import cv2
import numpy as np
import matplotlib.pyplot as plt
img = plt.imread("baby yoda.jpg")
# Use a 5x5 matrix of ones as the kernel
kernel = np.ones((5, 5), np.uint8)
# The first parameter is the original image,
# the second is the kernel (structuring element) applied to the image,
# and the third is the number of iterations, which determines how much
# you want to erode/dilate the image.
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)
# Show each result in its own figure so that one does not overwrite another
for title, result in [("Original", img), ("Erosion", img_erosion), ("Dilation", img_dilation)]:
    plt.figure()
    plt.title(title)
    plt.imshow(result)
plt.show()

Output:

Normal image — Source: PopularMechanic.com

OpenCV Applications

Face recognition
Counting the number of people (foot traffic in a mall, for example)
Counting the number of automobiles on motorways and measuring their speeds
Interaction-based art installations
Detecting anomalies (defects) during the production process (the odd defective product)
Street view image stitching
Video/image search and retrieval
Robot and autonomous car navigation and control
Object recognition
Medical image analysis
Movies – 3D structure from motion

Functionality of OpenCV

I/O, processing and display of images and videos
Detection of objects and features
Computer vision based on geometry
Computational photography

Conclusion

In this article, we covered a basic introduction to the OpenCV library and its applications in real-time scenarios. We also covered key terminology and the field in which OpenCV is deployed in deep learning (computer vision), and implemented Python code for some basic image operations (dilation, erosion, and changing image colours) with the help of the OpenCV library. Beyond these examples, OpenCV finds application in a wide variety of industries.

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.

Frequently Asked Questions

Q1. What is OpenCV and what are its main applications?

A. OpenCV stands for Open Source Computer Vision. It is a vast open-source library utilized in fields such as computer vision, machine learning, and image processing. Its applications include object detection, facial recognition, medical image analysis, and more.

Q2. How does OpenCV contribute to real-time operations?

A. OpenCV plays a critical role in real-time systems by providing algorithms and tools for processing images and videos swiftly. It enables tasks such as object detection, face recognition, and handwriting recognition in real-time scenarios.

Q3. How does Computer Vision relate to human vision?

A. Computer vision mimics human vision by interpreting visual data from images and videos. Similar to how humans learn from experiences to recognize objects and estimate distances, computer vision uses algorithms to analyze visual data and extract useful information.

Q4. What programming languages can be used with OpenCV?

A. OpenCV is compatible with various programming languages, including Python, C++, and Java. However, Python is widely used due to its simplicity and ease of integration with other libraries.

Q5. What are some key features of OpenCV’s Python image processing capabilities?

A. OpenCV provides functionalities for reading and manipulating images, including reading different image types (color, grayscale, binary), extracting pixel values, viewing images in graphical user interfaces, and performing basic image processing operations like dilation and erosion.

Pranshu Sharma

Aspiring Data Scientist | M.TECH, CSE at NIT DURGAPUR
