OpenCV is a large open-source library for computer vision, machine learning, and image processing, and it plays a critical role in real-time operations, which are fundamental in today’s systems. It is used to detect objects, faces, diseases, lesions, number plates, and even handwriting in images and videos. In deep learning workflows, OpenCV is typically used to represent images as numerical arrays and to perform mathematical operations on them in order to identify visual patterns and their features.
This article was published as a part of the Data Science Blogathon.
Computer vision is the field concerned with how images and videos are stored, and with how to manipulate them and extract information from them. It is an important branch of artificial intelligence. Self-driving cars, robotics, and photo-editing apps all rely heavily on computer vision.
Computer vision resembles human vision. Humans learn from experience to distinguish objects, judge the distance between them, and estimate their relative positions.
With cameras, data, and algorithms, computer vision trains machines to accomplish these jobs in much less time.
Computer vision allows computers and systems to extract useful data from digital images and video inputs.
OpenCV is an important building block in many machine learning pipelines. It is an open-source library (package) for computer vision, machine learning, and image processing applications. The prebuilt Python package runs on the CPU by default; GPU acceleration (for example through CUDA) is available in custom builds. OpenCV works with many different programming languages, including Python. The Python package can be installed with a single command, as shown below:
pip install opencv-python
A package in Python is a collection of modules that contain pre-written programmes. Packages allow you to import modules individually or as a whole. Using OpenCV is as simple as importing the “cv2” module, as seen below:
import cv2 as cv
Colour, grayscale, binary, and multispectral images are all examples of digital images. In a colour image, each pixel carries its own colour information. Binary images have only two pixel values, usually black and white, and grayscale images contain only shades of grey. Multispectral images capture image data in specific wavelength ranges across the electromagnetic spectrum.
To read the image, we use the “imread” method from the cv2 package, where the first parameter is the image’s path, including filename and extension, and the second parameter is a flag that determines how to read in the image.
By changing the path, you can read any image stored on your local computer. (cv2.imread does not accept URLs, so an image from the internet has to be downloaded first.) If the image is already in your current working directory, you only need to specify the file name and extension. Set the second parameter to 0 to read the image as grayscale, to 1 to read it as a colour image, or to -1 to read it unmodified, which keeps the alpha (transparency) channel if one exists.
The features of a picture that is being utilised as an input
import cv2
# Read the image from disk with the cv2.imread function
img = cv2.imread("pythonlogo.png", cv2.IMREAD_COLOR)
# Create a GUI window to display the image on screen
cv2.imshow("Cute Kittens", img)
Output:
To discover the data type of the image, use the “dtype” attribute. It tells us how the visual data and the pixel values are represented. The image itself is stored as a NumPy array, a multidimensional container for items of the same type and size.
An image can be thought of as a collection of small samples. These samples are referred to as pixels. To get a better understanding of an image, try zooming in as much as possible: you will see that it is divided into many small squares. These are the pixels, and when all of them are combined, they form the image. One of the simplest ways to represent an image is as a matrix.
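The matrix view can be illustrated with a tiny hand-made image. This is only a sketch: the pixel values are arbitrary, and NumPy is used directly because OpenCV images in Python are NumPy arrays.

```python
import numpy as np

# A 2 x 3 grayscale image: one number per pixel (0 = black, 255 = white).
gray = np.array([[0, 64, 128],
                 [192, 255, 32]], dtype=np.uint8)

print(gray.shape)  # (rows, columns)
print(gray[1, 1])  # pixel in row 1, column 1

# A colour image adds a third axis with one value per channel.
colour = np.zeros((2, 3, 3), dtype=np.uint8)
colour[0, 2] = (255, 0, 0)  # pure blue in OpenCV's B, G, R order

print(colour.shape)  # (rows, columns, channels)
print(colour[0, 2])  # the three channel values of one pixel
```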
Code:
print("The data type of the image is", img.dtype)
Output:
The data type of the image is uint8
uint8 means that each pixel value is an unsigned integer of 8 bits. This data type covers the range 0 to 255.
Image resolution is defined as the number of pixels in an image. As the number of pixels rises, the image can hold more detail. The image’s shape gives the number of rows and columns. Common image resolutions include 320 x 240 pixels (mostly suitable for small-screen devices), 720 x 576 pixels (good for viewing on standard-definition TV sets with a 4:3 aspect ratio), 1024 x 768 pixels (appropriate for viewing on standard computer monitors), 1280 x 720 pixels (for viewing on widescreen monitors), and 1280 x 1024 pixels (for viewing on full-screen monitors).
To recap, an image is a collection of small samples called pixels, which become visible as small squares when you zoom in far enough. The more pixels an image has, the more detail it can hold, and the image’s shape gives the number of rows and columns.
Let’s have a look at how to make the image appear in a window. To do so, we need to create a graphical user interface (GUI) window. The cv2.imshow() method displays the image in a pop-up window; its first parameter is the window title, which must be a string, and its second parameter is the image. On its own, however, cv2.imshow() is not enough: without a following call to the “waitKey” method, the window may freeze or disappear immediately. The “waitKey” method keeps it on screen.
Passing 0 to “waitKey” keeps the window open until a key is pressed. (You can pass a time in milliseconds instead of 0 to indicate how long it should wait for a key press.)
import cv2
# Read the image from disk with the cv2.imread function
img = cv2.imread("pythonlogo.png", cv2.IMREAD_COLOR)
# Create a GUI window to display the image on screen
# First parameter is the window title (must be a string)
# Second parameter is the image array
cv2.imshow("The Logo", img)
# To hold the window on screen, we use the cv2.waitKey method.
# If 0 is passed as the parameter, it will
# hold the window until the user presses a key.
cv2.waitKey(0)
# Remove the created GUI window from the screen
# and from memory
cv2.destroyAllWindows()
Output: GUI Window, Source: Author
Extracting and Reconstructing Image Bit Planes
An image can be divided into several bit planes. An 8-bit image splits into eight planes (0-7), with the highest-order planes containing the majority of the image’s visually significant data.
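Bit-plane slicing can be sketched with plain NumPy bit operations. The snippet below is an illustration under assumptions, not the article's original code: it uses a small random array in place of a real grayscale photo, extracts the eight planes, rebuilds the image from all of them, and then rebuilds an approximation from the four highest planes only.

```python
import numpy as np

# Stand-in for a grayscale photo: a small random 8-bit image (fixed seed).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)

# Plane k holds bit k of every pixel (k = 0 is the least significant bit).
planes = [(img >> k) & 1 for k in range(8)]

# Reconstruct by shifting each plane back to its position and combining.
reconstructed = np.zeros_like(img)
for k, plane in enumerate(planes):
    reconstructed |= plane << k

# Keeping only planes 4-7 gives a coarse approximation of the image.
top_four = np.zeros_like(img)
for k in range(4, 8):
    top_four |= planes[k] << k

print(np.array_equal(reconstructed, img))
print(np.abs(img.astype(int) - top_four.astype(int)).max())  # largest error
```

Dropping the four lowest planes changes each pixel by at most 15, which is why the high-order planes carry most of what the eye sees.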
Checking Properties of the Input Image
Input Image:
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = plt.imread("my pic.jpg")
plt.imshow(img)
print(img.shape)
print(img.size)
print(img.dtype)
Output:
(1921, 1921, 3)
11070723
uint8
Input Image:
import matplotlib.pyplot as plt
import cv2
import numpy as np
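image = cv2.imread("baby yoda.jpg")
# cv2.imshow("Example - Show image in window", image)
# OpenCV loads images in BGR order; convert to RGB so matplotlib shows the right colours
img2 = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(img2)
plt.show()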
Output:
Input Image:
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = plt.imread("baby yoda.jpg")
# Use a 5x5 matrix of ones as the kernel
kernel = np.ones((5, 5), np.uint8)
# The first parameter is the original image,
# the second is the kernel with which the image is convolved,
# and the third is the number of iterations, which determines how strongly
# the image is eroded/dilated.
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)
# Show each result in its own figure so one plot does not overwrite another
for result in (img, img_erosion, img_dilation):
    plt.figure()
    plt.imshow(result)
plt.show()
Output:
In this article, we covered a basic introduction to the OpenCV library and its applications in real-time scenarios. We also covered key terminology and the field in which OpenCV is deployed (computer vision), and implemented Python code for some basic image operations (dilation, erosion, and changing image colours) with the help of the OpenCV library. Beyond these basics, OpenCV finds application in a variety of industries.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
A. OpenCV stands for Open Source Computer Vision. It is a vast open-source library utilized in fields such as computer vision, machine learning, and image processing. Its applications include object detection, facial recognition, medical image analysis, and more.
A. OpenCV plays a critical role in real-time systems by providing algorithms and tools for processing images and videos swiftly. It enables tasks such as object detection, face recognition, and handwriting recognition in real-time scenarios.
A. Computer vision mimics human vision by interpreting visual data from images and videos. Similar to how humans learn from experiences to recognize objects and estimate distances, computer vision uses algorithms to analyze visual data and extract useful information.
A. OpenCV is compatible with various programming languages, including Python, C++, and Java. However, Python is widely used due to its simplicity and ease of integration with other libraries.
A. OpenCV provides functionalities for reading and manipulating images, including reading different image types (color, grayscale, binary), extracting pixel values, viewing images in graphical user interfaces, and performing basic image processing operations like dilation and erosion.