Computer vision is one of the hottest research fields in the data science world, and it has also become a part of our personal lives. Knowingly or unknowingly, we all use features that have computer vision techniques running behind the scenes.
For example, computer vision is used in security cameras to identify people and in self-driving cars to detect the objects in front of the car.
If we try to define computer vision formally:
Computer vision is the artificial recreation of human vision, that is, of how we perceive and understand images.
Now that we have a clear idea of what computer vision is, let's look at some real-world problems that can be solved with it.
Suppose we have an image, and the question we are trying to answer is: what object is present in the picture? Here the object of importance is the dog. This looks like a simple problem, right? We can formulate it as an image classification problem.
Now, instead of a single object, the image can contain multiple objects. Our question changes accordingly: what objects are present in the image? The answer is a dog and a cat. Because one image now carries more than one label, we can formulate this as a multi-label classification problem.
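The difference between predicting one label for an image and predicting several can be sketched in a few lines. This is only an illustration: the scores are made-up numbers standing in for a model's output, and the function names and the 0.5 threshold are assumptions chosen for the example.

```python
# Hypothetical per-class scores for one image (made-up values, not real model output).
scores = {"dog": 0.91, "cat": 0.78, "car": 0.04}


def single_label(scores):
    """Image classification: return the one class with the highest score."""
    return max(scores, key=scores.get)


def multiple_labels(scores, threshold=0.5):
    """Several objects per image: return every class whose score passes the threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


print(single_label(scores))      # one answer for the whole image
print(multiple_labels(scores))   # every object judged to be present
```

With a single answer the cat would be lost, which is why the second formulation is needed once an image can contain more than one object.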
Now suppose we also care about where the objects are, not just what they are. We can identify the objects present in the image and also draw a bounding box around each of them to specify its location.
For example, we say here is a cat and we draw a red box around it. Similarly, we say that there is a dog and draw a blue box around it. This problem can be specified as an object detection task.
Further, suppose we ask the same question about the same image, what the objects are and where they are, but this time we want the exact location of each object.
We can give the exact location by identifying every pixel that belongs to the object. In simple words, instead of drawing a rough rectangular box around the object, we trace its outline and color every pixel inside it, as can be seen here. This can be formulated as an image segmentation problem.
Formally speaking,
Image segmentation is the task of partitioning an image based on the objects present and their semantic importance.
This makes the image a whole lot easier to analyze, because instead of an approximate location from a rectangular box, we get the exact pixel-wise location of the objects.
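To make the difference between a box and a pixel-wise location concrete, here is a minimal sketch on a toy 6x6 binary mask. The mask values and the helper name `bounding_box` are assumptions made for this example; a real mask would come from a segmentation model or from manual annotation.

```python
# Toy binary mask: 1 marks a pixel that belongs to the object, 0 is background.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the tightest box around the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


top, left, bottom, right = bounding_box(mask)
box_area = (bottom - top + 1) * (right - left + 1)   # pixels covered by the box
object_area = sum(sum(row) for row in mask)          # pixels that are really the object

print("box:", (top, left, bottom, right))
print("pixels inside the box:", box_area)
print("pixels in the object:", object_area)
```

The box covers more pixels than the object itself, and the extra pixels are background that a detector cannot tell apart from the object. The mask keeps only the pixels that matter.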
There are many applications where image segmentation can be applied. Let's discuss a few interesting ones.
Image segmentation can be applied to medical imaging tasks such as cancer cell segmentation, where it is of utmost importance that we identify the exact location of tumors or cancerous cells.
Another application is self-driving systems, for lane segmentation or pedestrian identification. By precisely predicting the location of important objects such as the road or a person, a self-driving system can take appropriate steps in the downstream task, like applying the brakes or slowing down the car.
A third application is in the remote sensing domain, where we can identify the composition of the ground, for example how much forest cover a region has, or detect events such as illegal mining or forest fires.
Even within image segmentation, there are a few types of problems that you should get acquainted with. First is semantic segmentation, then comes instance segmentation, and the third is panoptic segmentation. Let's look at each of them.
Semantic segmentation is the process of associating each pixel with a class label. Here we only care about which class a pixel belongs to, not which individual object it is part of. In the example you can see all the cars represented in blue, the pedestrian in red, and the street in a light pink.
So there is no clear distinction between the cars, which means all the cars are colored blue. This is the simplest way to define an image segmentation problem.
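A semantic segmentation result can be thought of as a grid with one class ID per pixel. The sketch below is a toy illustration of that idea: the class IDs, the 4x5 label map, and the RGB palette are assumptions chosen to mirror the blue cars, red pedestrian, and pink street described above.

```python
# Assumed class IDs and display colors for this example only.
CLASS_NAMES = {0: "street", 1: "car", 2: "pedestrian"}
PALETTE = {0: (255, 192, 203), 1: (0, 0, 255), 2: (255, 0, 0)}  # pink, blue, red

# One class ID per pixel: two cars at the top, one pedestrian at the bottom.
label_map = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0],
]


def colorize(label_map, palette):
    """Turn the class-ID grid into an RGB image by looking up each pixel's class color."""
    return [[palette[class_id] for class_id in row] for row in label_map]


def pixels_per_class(label_map):
    """Count how many pixels were assigned to each class."""
    counts = {}
    for row in label_map:
        for class_id in row:
            counts[class_id] = counts.get(class_id, 0) + 1
    return counts


colored = colorize(label_map, PALETTE)
for class_id, count in sorted(pixels_per_class(label_map).items()):
    print(CLASS_NAMES[class_id], count)
```

Both cars share class ID 1, so they receive the same color, and nothing in the label map says where one car ends and the other begins.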
Now, if you want to go a bit further and represent each instance of a class differently, the problem is formally known as instance segmentation.
Unlike semantic segmentation, in instance segmentation we mask each instance of an object in the image independently. This means that we first focus on the objects of importance and then identify each instance of those objects separately.
You can see that all the objects of interest in the image, the cars and the persons, are highlighted, and each one has a different color. This is what instance segmentation is.
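One simple way to see what "a separate mask per instance" means is to split a semantic label map into connected regions. The sketch below does this with a breadth-first flood fill. It is a toy stand-in, not how instance segmentation models work in general: it only separates objects that do not touch each other, and the label map, class ID, and function name are assumptions made for the example.

```python
from collections import deque

# Toy semantic label map: 0 = street, 1 = car, 2 = pedestrian (assumed IDs).
label_map = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0],
]


def label_instances(label_map, target_class):
    """Give every 4-connected region of target_class its own instance ID (1, 2, ...)."""
    height, width = len(label_map), len(label_map[0])
    instance_map = [[0] * width for _ in range(height)]  # 0 means "not this class"
    next_id = 0
    for r in range(height):
        for c in range(width):
            if label_map[r][c] != target_class or instance_map[r][c] != 0:
                continue
            # Found an unvisited pixel of the class: start a new instance here.
            next_id += 1
            instance_map[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and label_map[ny][nx] == target_class and instance_map[ny][nx] == 0:
                        instance_map[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_map, next_id


car_instances, car_count = label_instances(label_map, target_class=1)
print("cars found:", car_count)
for row in car_instances:
    print(row)
```

The two cars now carry different IDs, so each can be drawn in its own color, which is the distinction that semantic segmentation alone does not provide.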
When you combine semantic segmentation and instance segmentation, you get panoptic segmentation. It is a relatively recent research area in which every pixel in the image is associated with a semantic class label and, for classes that have countable objects, with the particular instance it belongs to.
This makes it a more complex image segmentation problem than either of the other two.
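The combination can be sketched by pairing a class ID with an instance ID at every pixel. The two toy grids below are assumptions made for illustration. Background classes such as the street, often called "stuff" in the panoptic segmentation literature, get instance ID 0 here because they are not counted as separate objects; that convention is also an assumption of this example.

```python
# Assumed class IDs: 0 = street, 1 = car, 2 = pedestrian.
semantic = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0],
]

# Instance IDs, numbered separately within each class; 0 for the uncounted street.
instances = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]


def to_panoptic(semantic, instances):
    """Pair each pixel's class ID with its instance ID: (class_id, instance_id)."""
    return [
        [(class_id, instance_id) for class_id, instance_id in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instances)
    ]


panoptic = to_panoptic(semantic, instances)

# Every distinct (class, instance) pair is one segment of the panoptic result.
segments = sorted({pair for row in panoptic for pair in row})
print(segments)
```

Every pixel now answers both questions at once: which class it belongs to, and which individual object of that class it is part of.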
To summarize, in this article we looked at different computer vision tasks and discussed image segmentation in detail. We also saw a few applications of image segmentation: medical imaging, self-driving systems, and remote sensing.
Finally, we discussed the different types of image segmentation: semantic segmentation, instance segmentation, and panoptic segmentation.
If you have any queries, let me know in the comments below!