Image super-resolution (SR) is the process of recovering high-resolution (HR) images from low-resolution (LR) images. It is an important class of image processing techniques in computer vision and enjoys a wide range of real-world applications, such as medical imaging, satellite imaging, surveillance and security, and astronomical imaging, among others.
With the advancement of deep learning techniques in recent years, deep learning-based SR models have been actively explored and often achieve state-of-the-art performance on various SR benchmarks. A variety of deep learning methods have been applied to SR tasks, ranging from early Convolutional Neural Network (CNN) based methods to recent, promising Generative Adversarial Network (GAN) based approaches.
The image super-resolution problem, particularly single image super-resolution (SISR), has gained a lot of attention in the research community. SISR aims to reconstruct a high-resolution image I^HR from a single low-resolution image I^LR. Generally, the relationship between I^LR and the original high-resolution image I^HR can vary depending on the situation. Many studies assume that I^LR is a bicubic downsampled version of I^HR, but other degrading factors such as blur, decimation, or noise can also be considered for practical applications.
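To make the commonly assumed degradation model concrete, here is a minimal PyTorch sketch (PyTorch is simply this article's choice for examples, and the ×4 scale factor is illustrative) that produces an LR image by bicubic downsampling of an HR tensor; in practice, blur or noise could be added on top.

```python
import torch
import torch.nn.functional as F

def bicubic_degrade(hr: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """Simulate the common SISR degradation: I^LR = bicubic_downsample(I^HR).

    hr: batch of HR images with shape (N, C, H, W), values in [0, 1].
    Returns an LR batch of shape (N, C, H/scale, W/scale).
    """
    return F.interpolate(hr, scale_factor=1.0 / scale,
                         mode="bicubic", align_corners=False)

# Example: a 1x3x128x128 HR image becomes a 1x3x32x32 LR image at scale 4.
hr = torch.rand(1, 3, 128, 128)
lr = bicubic_degrade(hr, scale=4)
print(lr.shape)  # torch.Size([1, 3, 32, 32])
```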
In this article, we will focus on supervised learning methods for super-resolution. By using HR images as targets and LR images as inputs, we can treat super-resolution as a supervised learning problem.
Before going through the rest of the theory behind super-resolution, we need to understand upsampling (increasing the spatial resolution of an image, i.e., increasing the number of pixel rows and/or columns) and its various methods.
1. Interpolation-based methods – Image interpolation (image scaling) refers to resizing digital images and is widely used by image-related applications. The traditional methods include nearest-neighbor, linear, bilinear, and bicubic interpolation (see the sketch after this list).
Nearest-neighbor interpolation with a scale factor of 2
Shortcomings – Interpolation-based methods often introduce side effects such as computational complexity, noise amplification, and blurred results.
2. Learning-based upsampling – To overcome the shortcomings of interpolation-based methods and learn upsampling in an end-to-end manner, the transposed convolution layer and the sub-pixel layer have been introduced into the SR field.
Transposed convolution layer – The blue boxes denote the input, and the green boxes indicate the kernel and the convolution output.
Sub-pixel layer – The blue boxes denote the input and the boxes with other colors indicate different convolution operations and different output feature maps.
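To make these upsampling options concrete, here is a minimal PyTorch sketch (the channel counts and kernel sizes are illustrative assumptions, not taken from any particular paper): interpolation has no learnable parameters, while the transposed convolution and sub-pixel (PixelShuffle) layers are learned end-to-end.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

lr = torch.rand(1, 64, 32, 32)  # a batch of LR feature maps

# 1. Interpolation-based upsampling (no learnable parameters).
up_nearest = F.interpolate(lr, scale_factor=2, mode="nearest")
up_bicubic = F.interpolate(lr, scale_factor=2, mode="bicubic", align_corners=False)

# 2a. Transposed convolution: kernel 4, stride 2, padding 1 doubles H and W.
deconv = nn.ConvTranspose2d(64, 64, kernel_size=4, stride=2, padding=1)
up_deconv = deconv(lr)

# 2b. Sub-pixel layer: a convolution produces C * r^2 channels,
#     then PixelShuffle rearranges them into an r-times larger feature map.
r = 2
subpixel = nn.Sequential(
    nn.Conv2d(64, 64 * r ** 2, kernel_size=3, padding=1),
    nn.PixelShuffle(r),
)
up_subpixel = subpixel(lr)

print(up_nearest.shape, up_deconv.shape, up_subpixel.shape)
# all three: torch.Size([1, 64, 64, 64])
```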
Since image super-resolution is an ill-posed problem, how to perform upsampling (i.e., generating HR output from LR input) is the key question. There are mainly four model frameworks, distinguished by the upsampling operations they employ and where those operations are placed in the model; two of them are discussed below.
1. Pre-upsampling Super-resolution –
Learning a direct mapping from LR images to HR images is considered difficult, so a straightforward solution is to use a traditional upsampling algorithm to obtain a coarse higher-resolution image and then refine it with a deep neural network. For example, the LR image is first upsampled to a coarse HR image of the desired size using bicubic interpolation, and a deep CNN is then applied to this image to reconstruct a high-quality result.
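Below is a minimal sketch of a pre-upsampling model in the spirit of SRCNN (the exact layer widths and kernel sizes are illustrative assumptions): the LR input is first upsampled with bicubic interpolation, and the CNN only refines the coarse HR image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreUpsamplingSR(nn.Module):
    """Pre-upsampling SR: bicubic upsampling followed by a refining CNN."""
    def __init__(self, scale: int = 4, channels: int = 3):
        super().__init__()
        self.scale = scale
        self.refine = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),
        )

    def forward(self, lr: torch.Tensor) -> torch.Tensor:
        # Upsample first, so every convolution runs at the HR resolution.
        coarse = F.interpolate(lr, scale_factor=self.scale,
                               mode="bicubic", align_corners=False)
        return self.refine(coarse)

sr = PreUpsamplingSR(scale=4)(torch.rand(1, 3, 32, 32))
print(sr.shape)  # torch.Size([1, 3, 128, 128])
```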
2. Post-upsampling Super-resolution –
To improve computational efficiency and make full use of deep learning to increase resolution, researchers proposed performing most of the computation in the low-dimensional space by replacing the predefined upsampling with end-to-end learnable layers integrated at the end of the model. In the pioneering works of this framework, namely post-upsampling SR, the LR input images are fed into deep CNNs without increasing their resolution, and end-to-end learnable upsampling layers are applied at the end of the network.
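For contrast, here is a minimal post-upsampling sketch in the spirit of ESPCN (again with illustrative layer sizes): all convolutions operate at the cheap LR resolution, and a learnable sub-pixel layer performs the upsampling only at the very end.

```python
import torch
import torch.nn as nn

class PostUpsamplingSR(nn.Module):
    """Post-upsampling SR: feature extraction in LR space, sub-pixel upsampling at the end."""
    def __init__(self, scale: int = 4, channels: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        # The last conv produces channels * scale^2 maps; PixelShuffle rearranges them.
        self.upsample = nn.Sequential(
            nn.Conv2d(32, channels * scale ** 2, kernel_size=3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr: torch.Tensor) -> torch.Tensor:
        return self.upsample(self.body(lr))

sr = PostUpsamplingSR(scale=4)(torch.rand(1, 3, 32, 32))
print(sr.shape)  # torch.Size([1, 3, 128, 128])
```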
In the super-resolution field, loss functions are used to measure the reconstruction error and guide model optimization. Early work usually employed the pixel-wise L2 loss (mean squared error), but it was later found that this loss cannot measure reconstruction quality very accurately. Therefore, a variety of loss functions (e.g., content loss, adversarial loss) have been adopted to better measure the reconstruction error and produce more realistic, higher-quality results.
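The sketch below illustrates these loss terms in PyTorch; the chosen VGG layer, the loss weights, and the helper names are illustrative assumptions rather than the exact configuration of any particular paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

# Pixel-wise loss: L2 (MSE); L1 is another common choice.
pixel_loss = nn.MSELoss()

# Content (perceptual) loss: compare deep features of a pretrained network.
# Inputs are assumed to be in [0, 1]; in practice they are usually normalized
# with ImageNet statistics before being fed to VGG.
vgg_features = torchvision.models.vgg19(weights="DEFAULT").features[:36].eval()
for p in vgg_features.parameters():
    p.requires_grad = False

def content_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    return F.mse_loss(vgg_features(sr), vgg_features(hr))

# Adversarial loss: the generator is rewarded when the discriminator
# classifies its output as real.
bce = nn.BCEWithLogitsLoss()

def adversarial_loss(disc_logits_on_sr: torch.Tensor) -> torch.Tensor:
    return bce(disc_logits_on_sr, torch.ones_like(disc_logits_on_sr))

def total_generator_loss(sr, hr, disc_logits_on_sr):
    # The weights (1.0, 0.006, 1e-3) are illustrative, not canonical.
    return (1.0 * pixel_loss(sr, hr)
            + 0.006 * content_loss(sr, hr)
            + 1e-3 * adversarial_loss(disc_logits_on_sr))
```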
In the MSE formula, I is a noise-free m×n monochrome image (the ground truth) and K is the generated image (its noisy approximation). In the PSNR formula, MAX_I is the maximum possible pixel value of the image (255 for 8-bit images).
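In standard form, the two metrics are defined as:

```latex
\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \bigl[ I(i,j) - K(i,j) \bigr]^2

\mathrm{PSNR} = 10 \cdot \log_{10}\!\left( \frac{\mathrm{MAX}_I^2}{\mathrm{MSE}} \right)
             = 20 \cdot \log_{10}(\mathrm{MAX}_I) - 10 \cdot \log_{10}(\mathrm{MSE})
```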
Various network designs in super-resolution architecture
Enough of the basics! Let’s discuss some of the state-of-the-art super-resolution methods –
Super-Resolution Generative Adversarial Network (SRGAN) – Applies the idea of GANs to the super-resolution task: the generator produces a super-resolved image from an LR input (rather than from noise, as in a classic GAN), and the discriminator judges whether an image is a generated SR image or a real HR image. Both are trained together so that the generator learns to produce SR images that the discriminator cannot distinguish from real HR images.
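Putting that idea into code, here is a highly condensed sketch of one SRGAN-style training step (the generator, discriminator, optimizers, and content_loss are assumed to exist elsewhere, and the 1e-3 adversarial weight is illustrative).

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def train_step(generator, discriminator, g_opt, d_opt,
               lr_batch, hr_batch, content_loss):
    # --- Discriminator update: real HR -> 1, generated SR -> 0 ---
    sr = generator(lr_batch).detach()
    d_real = discriminator(hr_batch)
    d_fake = discriminator(sr)
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # --- Generator update: content loss + adversarial loss ---
    sr = generator(lr_batch)
    d_fake = discriminator(sr)
    g_loss = content_loss(sr, hr_batch) + 1e-3 * bce(d_fake, torch.ones_like(d_fake))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    return d_loss.item(), g_loss.item()
```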
Architecture of Generative Adversarial Network
Check the original papers for detailed information.
Steps –
Network architecture of SRGAN
Key features of the method –
EDSR, MDSR – Residual learning techniques yield improved super-resolution performance with deep convolutional neural networks (DCNNs). The Enhanced Deep Super-Resolution network (EDSR) is a single-scale architecture that handles a specific super-resolution scale, while the Multi-scale Deep Super-Resolution system (MDSR) reconstructs high-resolution images at various scales within a single model. The significant performance improvement of these models comes from optimizing the architecture by removing unnecessary modules from conventional residual networks.
Check the original papers for detailed information.
Some of the key features of the methods –
Comparison of the residual blocks
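For intuition, here is a minimal sketch of an EDSR-style residual block reflecting that comparison: batch normalization is removed, and a small residual scaling factor keeps training stable (0.1 and the channel count of 64 are illustrative choices).

```python
import torch
import torch.nn as nn

class EDSRResidualBlock(nn.Module):
    """Residual block without batch normalization (EDSR-style)."""
    def __init__(self, channels: int = 64, res_scale: float = 0.1):
        super().__init__()
        # A conventional residual block would insert BatchNorm after each conv;
        # EDSR removes it to save memory and avoid range-limiting artifacts.
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.res_scale = res_scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.res_scale * self.body(x)

x = torch.rand(1, 64, 32, 32)
print(EDSRResidualBlock()(x).shape)  # torch.Size([1, 64, 32, 32])
```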
So now we have come to the end of the blog! To learn about super-resolution, refer to these survey papers.
Kindly share your feedback about the blog in the comment section. Happy Learning 🙂