I started using Facebook 10 years ago. If you have been using it that long too, you must remember tagging photographs manually. We no longer have to do that: Facebook recognizes most of the people in an uploaded picture and suggests tags for them. Similarly, you must have seen those hilarious Snapchat filters, such as the dog filter that puts a dog face over your own. Have you ever wondered how all of this is possible? How is our phone able to detect our face and add filters over it? These are some of the applications of computer vision.
Computer vision is one of the hottest research fields in the data science world. Moreover, it has become a part of our personal lives. Knowingly or unknowingly, we all use features that have computer vision techniques running in the backend. For instance, we use face unlock on our smartphones. The image below illustrates how face detection works.
Source: Interest
I chose face detection to start this article because it is the one application of computer vision that we have all seen. But trust me, computer vision is not limited to this. In this article, you will explore more interesting applications of computer vision.
If you are looking to master computer vision, check out our course Computer Vision using Deep Learning 2.0
Before entering the world of computer vision applications, let’s first understand what computer vision is. In short, computer vision is a multidisciplinary branch of artificial intelligence that tries to replicate the powerful capabilities of human vision.
If we go through the formal definition,
“Computer vision is a utility that makes useful decisions about real physical objects and scenes based on sensed images” (Stockman & Shapiro, 2001)
Computer vision works through visual recognition techniques such as image classification, object detection, image segmentation, object tracking, optical character recognition, and image captioning. I know these are a lot of technical terms, but understanding them is not tough. Just look at the image below and you will understand many of them.
Source: Oreilly
Let’s start with the first image. If I ask you what is in the picture, your answer will be: it’s a cat. This is classification: labelling the image based on what it contains. Here the class is ‘Cat’.
Now you know the class of the image. The next question is where the object is located in the image. When we identify the location of the object in the frame and draw a bounding box around it, it is known as localization. In the second image, we have identified the location of the object and labeled it as a cat.
The next term is object detection. In the previous two cases we had a single object in the image, but what if multiple objects are present? Here we identify every instance present and its location via bounding boxes.
In object detection, we use a bounding box that is square or rectangular, so it tells us nothing about the shape of the objects. Instance segmentation creates a pixel-wise mask around each object. Hence, instance segmentation gives a deeper understanding of the image.
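To make the difference between a bounding box and a pixel-wise mask concrete, here is a minimal sketch in plain Python. The tiny 5x5 mask and the helper name mask_to_bbox are made up for illustration; a real segmentation model would output a much larger mask.

```python
def mask_to_bbox(mask):
    """Return (row_min, col_min, row_max, col_max) of the tight box around nonzero pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


# Hypothetical instance mask: 1 marks a pixel that belongs to the object.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

r0, c0, r1, c1 = mask_to_bbox(mask)
box_area = (r1 - r0 + 1) * (c1 - c0 + 1)   # pixels covered by the box
mask_area = sum(sum(row) for row in mask)  # pixels that are really the object

print("bounding box:", (r0, c0, r1, c1))
print("box pixels:", box_area, "object pixels:", mask_area)
```

The box covers more pixels than the object itself, and that extra shape information is exactly what the mask keeps and the box throws away.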
Check the following resources if you want to know more about Computer Vision-
Recent developments in deep learning and advancements in technology have tremendously increased the capabilities of visual recognition systems. As a result, companies have rapidly adopted computer vision. Successful use cases can be seen across industrial sectors, which widens the range of applications and increases the demand for computer vision tools.
Now without losing more time, let’s jump into the 5 exciting applications of computer vision.
Human pose estimation is an interesting application of computer vision. You must have heard of PoseNet, an open-source model for human pose estimation. In brief, pose estimation is a computer vision technique for inferring the pose of a person or object present in an image or video.
Before discussing how pose estimation works, let us first understand the ‘human pose skeleton’. It is the set of coordinates that defines the pose of a person. A connected pair of coordinates is known as a limb. Pose estimation is performed by identifying, locating, and tracking the key points of the human pose skeleton in an image or video.
Source: Researchgate
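The pose skeleton described above can be sketched as a small data structure. The key point names and pixel coordinates below are invented for illustration; a real model such as PoseNet defines its own key point set and predicts the coordinates for you.

```python
import math

# Hypothetical key points of one arm, as (x, y) pixel coordinates.
keypoints = {
    "left_shoulder": (100.0, 50.0),
    "left_elbow": (100.0, 90.0),
    "left_wrist": (130.0, 130.0),
}

# A limb is a connected pair of key points.
limbs = [
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
]


def limb_lengths(keypoints, limbs):
    """Return the pixel length of every limb in the skeleton."""
    lengths = {}
    for start, end in limbs:
        (xs, ys), (xe, ye) = keypoints[start], keypoints[end]
        lengths[(start, end)] = math.hypot(xe - xs, ye - ys)
    return lengths


for limb, length in limb_lengths(keypoints, limbs).items():
    print(limb, length)
```

Tracking how these coordinates change from frame to frame is what turns a single skeleton into pose tracking in a video.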
The following are some of the applications of Human Pose Estimation-
The following are some datasets if you want to develop a pose estimation model by yourself-
I found DeepPose by Google to be a very interesting research paper that uses deep learning models for pose estimation. To dig deeper, you can go through the many research papers available on pose estimation.
FaceApp is a very interesting and popular application. It is an image manipulation tool that transforms the input image using filters, such as the aging filter or the more recent gender swap filter.
Source: Comicbook
Look at the above image. Funny, right? A few months ago it was a hot topic on the internet. People were sharing images after swapping their gender. But what is the technology behind such apps? Yes, you guessed it correctly: it’s computer vision or, to be more specific, deep convolutional generative adversarial networks.
Generative adversarial networks, popularly known as GANs, are an exciting innovation in the field of computer vision. Although the underlying concept is older, GANs in their present form were proposed by Ian Goodfellow in 2014. Since then they have seen a lot of development.
Training a GAN involves two neural networks, a generator and a discriminator, playing against each other in order to generate new data that follows the distribution of the given training data. Although originally proposed as an unsupervised learning mechanism, GANs have also proven to be good candidates for supervised and semi-supervised learning.
To know more about how GANs work, check out the article below.
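To get a feel for how the two networks play against each other, here is a minimal sketch of the two loss values for a single real and a single generated sample. It assumes the standard binary cross-entropy formulation with the commonly used non-saturating generator loss; the function names are my own, and no actual networks are trained here.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Loss of the discriminator.

    d_real: probability the discriminator assigns to a real sample being real.
    d_fake: probability it assigns to a generated sample being real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -math.log(d_fake)


# A confident, correct discriminator: good for it, bad for the generator.
print(discriminator_loss(0.9, 0.1), generator_loss(0.1))

# A discriminator that cannot tell real from generated (outputs 0.5 for both).
print(discriminator_loss(0.5, 0.5), generator_loss(0.5))
```

When the discriminator gets better, its own loss falls while the generator's loss rises, and the other way round. That tug of war is what drives GAN training.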
The following are some must-read research papers on GANs that I personally recommend-
The following are some datasets that will help you get hands-on experience with GANs-
Images generated using GANs have many applications. The following are some of them-
If you find something more interesting, let me know in the comment section.
For the last few months, the world has been suffering from the COVID-19 pandemic. Until a vaccine for the disease is available, we all must take precautionary measures: using hand sanitizers, wearing face masks and, most importantly, following social distancing.
Computer vision technology can play a vital role in this crucial scenario. It can be used to track people on a premises or in a particular area to check whether they are following social distancing norms.
The social distancing tool is an application of real-time object detection and tracking. To check for social distancing violations, we detect each person present in the video using a bounding box. We then track the movement of each box in the frame and calculate the distance between them. If the tool detects a violation of the social distancing norm, it highlights the bounding boxes involved.
Further, to make these tools more advanced and accurate, you may use transfer learning. Various pre-trained object detection models, such as YOLO or Mask R-CNN, are available for this.
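The distance check on detected boxes can be sketched as follows. This is a simplified illustration, not a complete tool: it assumes a detector has already produced the boxes as (x1, y1, x2, y2) pixel coordinates, it measures distance in pixels between box centres, and the threshold value is arbitrary. A real system would also need to correct for camera perspective.

```python
import math
from itertools import combinations


def centroid(box):
    """Centre point of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def find_violations(boxes, min_distance):
    """Return sorted indices of boxes closer than min_distance to another box."""
    violators = set()
    for i, j in combinations(range(len(boxes)), 2):
        (xi, yi), (xj, yj) = centroid(boxes[i]), centroid(boxes[j])
        if math.hypot(xi - xj, yi - yj) < min_distance:
            violators.update((i, j))
    return sorted(violators)


# Hypothetical detections for one video frame.
boxes = [(0, 0, 40, 100), (50, 0, 90, 100), (400, 0, 440, 100)]
print(find_violations(boxes, min_distance=75))
```

The returned indices are the boxes you would highlight in the frame, for example by drawing them in red.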
The following article helps you create a social distancing tool by yourself –
Here is another very interesting application of computer vision: converting 2-dimensional images into 3D models. For instance, imagine you have a photograph from your old collection and are able to transform it into a 3D model and inspect it as if you were there.
Source: Petapixel
The researchers at DeepMind have come up with an AI system that works along similar lines. It is known as the Generative Query Network, and it can perceive images from different angles, as humans do.
Also, Nvidia has developed an AI architecture that can predict 3D properties from an image. Facebook AI offers a similar tool known as the 3D Photo feature.
The following are some relevant datasets available for you to experiment with-
Also, check these interesting papers to know more about the application.
Now you must be thinking about the use cases of this technology. The following are its applications –
Computer-supported medical images, such as CT scans and X-rays, have long been used for diagnosis. Furthermore, recent developments in computer vision allow doctors to understand them better by converting them into interactive 3D models, which makes their interpretation easier.
The most recent use case of computer vision is detecting COVID-19 cases from chest X-rays. Moreover, according to a study at the Department of Radiology, Wuhan, deep learning methods can be used efficiently to distinguish COVID-19 from community-acquired pneumonia.
Check out the COVID-19 chest X-ray dataset on Kaggle and get your hands dirty with the implementation.
In the meantime, if you want to work on another dataset, CT medical images are also available on Kaggle. In addition, if you want to know more about medical image processing and its applications in healthcare, read these research papers and their implementations.
To summarize, computer vision is a fascinating field of artificial intelligence. You name the field and you will get an application of CV there. In this article, I discussed a few of them I found interesting. But this is just the tip of the iceberg.
In case you are interested in building a career in Computer Vision, read the following-
Now it’s your turn to start implementing computer vision on your own. Don’t forget to share your favourite machine vision application in the comment box.
A. Real-life examples of computer vision projects include facial recognition systems used for security, medical imaging applications like MRI and CT scans, quality control in manufacturing processes, and autonomous vehicles for object detection and navigation.
A. Applications of computer vision in traffic management include traffic monitoring systems for congestion detection, automated license plate recognition for law enforcement, pedestrian detection for safer crossings, and traffic flow analysis for optimizing signal timings.
A. Early applications of computer vision were primarily in industrial automation, such as inspecting manufactured parts for defects, sorting items on assembly lines, and reading handwritten characters for mail sorting. Other early uses included medical image analysis and military reconnaissance.
A. In robotics, computer vision is applied for tasks such as object recognition and manipulation, autonomous navigation, and mapping environments. Robots equipped with computer vision can perform activities like picking and placing objects in warehouses, assisting in surgical procedures, and exploring hazardous environments like disaster zones.