Master Image Augmentation with 3 Widely Used Python Libraries

Devashree Last Updated : 21 Feb, 2024

10 min read

In my previous article, we explored image augmentation using AugLy, a recently introduced library from Facebook. Now, let’s delve into three popular image augmentation libraries in Python.


An image classifier generally performs better when trained on a larger and more diverse dataset. However, gathering diverse data can be time-consuming and expensive. Image data augmentation addresses this problem by generating varied training images from the data you already have.

Common augmentation techniques include:

Geometric transformations such as flipping, cropping, rotating, and zooming.
Color transformations such as adjusting brightness, darkness, sharpness, and saturation.
Random erasing, mixing images, and similar techniques.
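Random erasing and image mixing can be sketched in a few lines of NumPy. This is only an illustration of the two ideas: the function names `random_erase` and `mix_images`, the patch size, and the blend weight are assumptions made for this example and do not come from any of the three libraries.

```python
import numpy as np

def random_erase(image, patch_h, patch_w, rng):
    """Return a copy of `image` with one randomly placed patch set to zero."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - patch_h + 1)   # upper bound is exclusive
    left = rng.integers(0, w - patch_w + 1)
    out = image.copy()
    out[top:top + patch_h, left:left + patch_w] = 0
    return out

def mix_images(image_a, image_b, weight):
    """Blend two same-sized uint8 images as weight*a + (1 - weight)*b."""
    mixed = (weight * image_a.astype(np.float32)
             + (1.0 - weight) * image_b.astype(np.float32))
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
white = np.full((8, 8, 3), 255, dtype=np.uint8)
black = np.zeros((8, 8, 3), dtype=np.uint8)

erased = random_erase(white, patch_h=2, patch_w=3, rng=rng)  # 2x3 black patch
mixed = mix_images(white, black, weight=0.5)                 # uniform mid-gray
```

When images are mixed for classification, the labels are usually blended with the same weight; that step is left out of this sketch.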

This article was published as a part of the Data Science Blogathon.

Learning Objectives

Understand the importance of image augmentation in machine learning tasks.
Learn about different Python libraries for image augmentation: Imgaug, Albumentations, and SOLT.
Gain familiarity with various image augmentation techniques and how to implement them using different libraries.
Learn to define augmentation pipelines for efficient data augmentation.
Understand the role of augmentation in improving model robustness and generalization capabilities.

Imgaug Tutorial

Imgaug is an open-source Python package for augmenting images in machine learning experiments. It has a simple yet powerful interface and can augment images, landmarks (keypoints), bounding boxes, heatmaps, and segmentation maps.

It supports simple techniques such as flipping, rotation, and cropping, as well as more complex ones such as adding Gaussian noise or blurring the images.

Let’s start by installing this library first using pip from PyPI.

pip install imgaug

Next, we will install the python package named ‘IPyPlot’ in the command prompt using the pip command:

pip install ipyplot

IPyPlot is a Python tool that allows for the fast and efficient display of images within Python Notebook cells. This package combines IPython with HTML to provide a quicker, richer, and more interactive way to show images. This package’s ‘plot_images’ command will be used to plot all of the images in a grid-like structure.

Also, we will import all the necessary packages needed to augment the data.

import imageio
import imgaug as ia
import imgaug.augmenters as iaa
import ipyplot

The image path for augmentation is defined here. We’ll use a bird image as an example.

input_img = imageio.imread('../input/image-bird/bird.jpg')

Image Flipping

We can flip the image horizontally or vertically with the augmenters shown below. ‘Fliplr’ flips the image horizontally (left to right), and ‘Flipud’ flips it vertically (upside down).

Image flipping is a simple yet effective technique used in data augmentation. It helps the model generalize better by providing it with ‘new’ images that are flipped versions of the original images in the dataset.

#Horizontal Flip
hflip= iaa.Fliplr(p=1.0)
input_hf= hflip.augment_image(input_img)

#Vertical Flip
vflip= iaa.Flipud(p=1.0) 
input_vf= vflip.augment_image(input_img)
images_list=[input_img, input_hf, input_vf]
labels = ['Original', 'Horizontally flipped', 'Vertically flipped']
ipyplot.plot_images(images_list,labels=labels,img_width=180)

The parameter p is the probability that each image is flipped. In older imgaug releases p defaults to 0.0, so Fliplr() and Flipud() without arguments leave the image unchanged. Newer releases may use a different default, so it is safest to pass p explicitly: Fliplr(1.0) always flips the image horizontally, and Flipud(1.0) always flips it vertically.
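To make the role of p concrete, here is a minimal NumPy sketch of a probabilistic flip. It is not imgaug's implementation; `maybe_flip_lr` and `flip_ud` are hypothetical helper names used only for this illustration.

```python
import numpy as np

def maybe_flip_lr(image, p, rng):
    """Flip `image` left-right with probability `p`, else return it unchanged."""
    if rng.random() < p:          # rng.random() lies in [0, 1)
        return image[:, ::-1]     # reverse the column (width) axis
    return image

def flip_ud(image):
    """Flip `image` upside down by reversing the row (height) axis."""
    return image[::-1]

rng = np.random.default_rng(0)
image = np.arange(12).reshape(2, 2, 3)          # tiny 2x2 image with 3 channels

always = maybe_flip_lr(image, p=1.0, rng=rng)   # flipped on every call
never = maybe_flip_lr(image, p=0.0, rng=rng)    # never flipped
```

With a value such as p=0.5, roughly half of the images passing through the function would be flipped.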

Image Rotation

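We can rotate the image by passing an angle in degrees to iaa.Affine. When rotate is given as a tuple such as (-50, 20), a random angle is sampled from that range for each image.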

Image rotation is another common technique in data augmentation. By rotating the images at various angles, we can increase the diversity of our training data and help our model become more robust.

rot1 = iaa.Affine(rotate=(-50,20))
input_rot1 = rot1.augment_image(input_img)
images_list=[input_img, input_rot1]
labels = ['Original', 'Rotated Image']
ipyplot.plot_images(images_list,labels=labels,img_width=180)
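The geometry behind a rotation can be sketched with the standard 2D rotation formula. The helper `rotate_point` and the example image size are assumptions for illustration. The sketch makes no claim about which direction imgaug treats as a positive angle.

```python
import math
import random

def rotate_point(x, y, angle_deg, cx, cy):
    """Rotate the point (x, y) around the centre (cx, cy) by angle_deg degrees."""
    theta = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    new_x = cx + dx * math.cos(theta) - dy * math.sin(theta)
    new_y = cy + dx * math.sin(theta) + dy * math.cos(theta)
    return new_x, new_y

random.seed(0)
angle = random.uniform(-50, 20)   # one random angle per image, as with rotate=(-50, 20)

# Where the top-left corner of a hypothetical 180x120 image ends up
# when the image is rotated around its centre (90, 60).
corner = rotate_point(0, 0, angle, cx=90, cy=60)
```

A rotation augmenter applies this mapping to every pixel position and fills the uncovered corners with a constant value or an interpolated border.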

Image Cropping

Image cropping is used to focus on specific parts of an image. This is particularly useful in tasks like object detection where we want our model to recognize objects regardless of their position in the image. Cropping images includes removing columns or rows of pixels from the image’s sides. This augmenter enables the extraction of smaller-sized subimages from full-sized input images. The number of pixels to be removed can be specified in absolute numbers or as a fraction of the image size.

In this case, we crop each side of the image by a random fraction sampled uniformly from the continuous interval [0.0, 0.3], once per image and side. For example, if the fraction sampled for the top side is 0.3, the top of the image is cropped by 0.3*H, where H is the height of the input image.

crop1 = iaa.Crop(percent=(0, 0.3)) 
input_crop1 = crop1.augment_image(input_img)
images_list=[input_img, input_crop1]
labels = ['Original', 'Cropped Image']
ipyplot.plot_images(images_list,labels=labels,img_width=180)
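The arithmetic of fractional cropping can be sketched in NumPy. `crop_by_fraction` is a hypothetical helper, and the sketch does not resize the result back to the input size, which a library may do depending on its settings.

```python
import numpy as np

def crop_by_fraction(image, top, right, bottom, left):
    """Remove the given fraction of rows/columns from each side of `image`."""
    h, w = image.shape[:2]
    t, b = int(round(top * h)), int(round(bottom * h))
    l, r = int(round(left * w)), int(round(right * w))
    return image[t:h - b, l:w - r]

image = np.zeros((100, 200, 3), dtype=np.uint8)   # H=100, W=200

# Fixed fractions: 0.3*H = 30 rows off the top, 0.1*W = 20 columns off the left.
fixed = crop_by_fraction(image, top=0.3, right=0.0, bottom=0.0, left=0.1)

# Random fractions: one independent sample per side from [0.0, 0.3].
rng = np.random.default_rng(0)
fractions = rng.uniform(0.0, 0.3, size=4)
random_crop = crop_by_fraction(image, *fractions)
```

Here `fixed` has shape (70, 180, 3), while `random_crop` keeps at least 40% of the rows and columns.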

Adding Noise to Images

Adding noise to images is a technique used to make the model more robust. It involves adding random variations to the pixel values of the images.

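This augmenter adds Gaussian noise to the input image. Its first argument, loc, is the mean of the normal distribution that generates the noise, and its second argument, scale, is the standard deviation. The call below therefore draws noise with mean 10 and standard deviation 40.

noise=iaa.AdditiveGaussianNoise(loc=10, scale=40)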
input_noise=noise.augment_image(input_img)
images_list=[input_img, input_noise]
labels = ['Original', 'Gaussian Noise Image']
ipyplot.plot_images(images_list,labels=labels,img_width=180)
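As a rough illustration of what additive Gaussian noise does to pixel values, the following NumPy sketch draws noise from a normal distribution, adds it, and clips the result back to the valid 8-bit range. `add_gaussian_noise` is a hypothetical helper, not the imgaug implementation.

```python
import numpy as np

def add_gaussian_noise(image, loc, scale, rng):
    """Add noise drawn from N(loc, scale**2) to a uint8 image."""
    noise = rng.normal(loc=loc, scale=scale, size=image.shape)
    noisy = image.astype(np.float32) + noise
    # Round and clip so the result is again a valid uint8 image.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
gray = np.full((64, 64, 3), 128, dtype=np.uint8)   # flat mid-gray test image

noisy = add_gaussian_noise(gray, loc=0.0, scale=10.0, rng=rng)
```

A non-zero loc shifts the overall brightness, while a larger scale makes the grain stronger.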

Image Shearing

Image shearing involves shifting one part of an image to a direction while keeping the other parts fixed. It’s a useful technique for training neural networks as it provides a different perspective of the data.

This augmenter shears the image by random amounts ranging from -40 to 40 degrees.

shear = iaa.Affine(shear=(-40,40))
input_shear=shear.augment_image(input_img)
images_list=[input_img, input_shear]
labels = ['Original', 'Image Shearing']
ipyplot.plot_images(images_list,labels=labels,img_width=180)
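A shear along the x-axis can be written as a small matrix that shifts each point sideways in proportion to its y coordinate. The sketch below uses x' = x + tan(angle) * y, which is one common convention and an assumption here. `shear_x_matrix` is a hypothetical helper.

```python
import math
import numpy as np

def shear_x_matrix(angle_deg):
    """2x2 matrix for the mapping x' = x + tan(angle) * y, y' = y."""
    return np.array([[1.0, math.tan(math.radians(angle_deg))],
                     [0.0, 1.0]])

# Corners of a 100x50 rectangle as (x, y) rows.
corners = np.array([[0, 0], [100, 0], [100, 50], [0, 50]], dtype=float)

# Row vectors are transformed with the transposed matrix.
sheared = corners @ shear_x_matrix(45).T
```

With a 45 degree angle, the two corners at y = 50 move 50 pixels to the right and the corners at y = 0 stay in place, which turns the rectangle into a parallelogram.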

Image Contrast

Adjusting the image contrast can highlight or obscure certain features in the image. This can be beneficial in tasks like object detection or image recognition.

The three augmenters below adjust image contrast using gamma, sigmoid, and linear mappings of the pixel values.

contrast=iaa.GammaContrast((0.5, 2.0))
contrast_sig = iaa.SigmoidContrast(gain=(5, 10), cutoff=(0.4, 0.6))
contrast_lin = iaa.LinearContrast((0.4, 0.6))
input_contrast = contrast.augment_image(input_img)
sigmoid_contrast = contrast_sig.augment_image(input_img)
linear_contrast = contrast_lin.augment_image(input_img)
images_list=[input_img, input_contrast,sigmoid_contrast,linear_contrast]
labels = ['Original', 'Gamma Contrast','SigmoidContrast','LinearContrast']
ipyplot.plot_images(images_list,labels=labels,img_width=180)

GammaContrast adjusts image contrast using the formula 255*((v/255)**gamma), where v is a pixel value and gamma is sampled uniformly from the range [0.5, 2.0]. SigmoidContrast uses the formula 255*1/(1+exp(gain*(cutoff-v/255))), where v is a pixel value, gain is sampled uniformly from the interval [5, 10] (once per image), and cutoff is sampled uniformly from the interval [0.4, 0.6]. LinearContrast uses the formula 127 + alpha*(v-127), where v is a pixel value and alpha is sampled uniformly from the range [0.4, 0.6].
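The three formulas can be checked directly in NumPy. The functions below transcribe the formulas quoted above. They are not imgaug code, and rounding and clipping back to uint8 are omitted.

```python
import numpy as np

def gamma_contrast(v, gamma):
    """255 * (v / 255) ** gamma"""
    return 255.0 * (v / 255.0) ** gamma

def sigmoid_contrast(v, gain, cutoff):
    """255 * 1 / (1 + exp(gain * (cutoff - v / 255)))"""
    return 255.0 / (1.0 + np.exp(gain * (cutoff - v / 255.0)))

def linear_contrast(v, alpha):
    """127 + alpha * (v - 127)"""
    return 127.0 + alpha * (v - 127.0)

pixels = np.array([0.0, 64.0, 127.0, 255.0])

darker_midtones = gamma_contrast(pixels, gamma=2.0)      # gamma > 1 darkens mid-tones
s_curve = sigmoid_contrast(pixels, gain=7.0, cutoff=0.5)
flattened = linear_contrast(pixels, alpha=0.5)           # alpha < 1 pulls values toward 127
```

With gamma = 1 or alpha = 1 the gamma and linear mappings leave the pixels unchanged, which is a quick sanity check on both formulas.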

Image Transformations

Image transformations involve changing the appearance of an image using operations like translation, rotation, scaling, etc. These transformations can help improve the performance of the model by providing it with a more diverse dataset.

The ‘Elastic Transformation’ augmenter transforms images by shifting pixels around locally using displacement fields. The augmenter’s parameters are alpha and sigma. The strength of the displacement is controlled by alpha, wherein greater values indicate that pixels are shifted further. The smoothness of the displacement is controlled by sigma, in which larger values result in smoother patterns.

elastic = iaa.ElasticTransformation(alpha=60.0, sigma=4.0)
polar = iaa.WithPolarWarping(iaa.CropAndPad(percent=(-0.2, 0.7)))
jigsaw = iaa.Jigsaw(nb_rows=20, nb_cols=15, max_steps=(3, 7))
input_elastic = elastic.augment_image(input_img)
input_polar = polar.augment_image(input_img)
input_jigsaw = jigsaw.augment_image(input_img)
images_list=[input_img, input_elastic,input_polar,input_jigsaw]
labels = ['Original', 'elastic','polar','jigsaw']
ipyplot.plot_images(images_list,labels=labels,img_width=180)

The ‘Polar Warping’ augmenter first applies cropping and padding in polar representation, then warps the image back to Cartesian representation. It can add pixels to the image, which it fills with black. The ‘Jigsaw’ augmenter moves cells within the image in a manner similar to jigsaw pieces.
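The idea of moving cells around can be sketched in NumPy. Note the simplification: this sketch shuffles all grid cells with a single random permutation, whereas the Jigsaw augmenter is parameterised by a number of movement steps. `shuffle_cells` is a hypothetical helper and should be read as a loose illustration only.

```python
import numpy as np

def shuffle_cells(image, nb_rows, nb_cols, rng):
    """Split `image` into a grid and put the cells back in random order."""
    h, w = image.shape[:2]
    cell_h, cell_w = h // nb_rows, w // nb_cols   # leftover border pixels stay in place
    cells = [image[r * cell_h:(r + 1) * cell_h, c * cell_w:(c + 1) * cell_w]
             for r in range(nb_rows) for c in range(nb_cols)]
    order = rng.permutation(len(cells))
    out = image.copy()
    for dst, src in enumerate(order):
        r, c = divmod(dst, nb_cols)
        out[r * cell_h:(r + 1) * cell_h, c * cell_w:(c + 1) * cell_w] = cells[src]
    return out

rng = np.random.default_rng(0)
image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
shuffled = shuffle_cells(image, nb_rows=2, nb_cols=3, rng=rng)
```

The output contains exactly the same pixels as the input, only rearranged in blocks.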

Bounding Box on Image

In object detection tasks, bounding boxes denote the location of the object in the image. Typically, they are created during the preprocessing stage before feeding the images into the model.

imgaug also supports bounding boxes. When an image is transformed during augmentation, for example rotated, the library can apply the same transformation to all bounding boxes on that image.

from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage
bbs = BoundingBoxesOnImage([
 BoundingBox(x1=40, x2=550, y1=40, y2=780)
], shape=input_img.shape)
ia.imshow(bbs.draw_on_image(input_img))
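To see why bounding boxes must be transformed together with the image, consider a horizontal flip. The sketch below mirrors the box from the example above by hand. The image width of 600 is a made-up value, `flip_box_horizontally` is a hypothetical helper, and the sketch assumes continuous coordinates where a point x maps to width - x.

```python
def flip_box_horizontally(x1, y1, x2, y2, image_width):
    """Mirror a box (x1, y1, x2, y2) around the vertical centre line."""
    # The old right edge becomes the new left edge, and vice versa.
    return image_width - x2, y1, image_width - x1, y2

box = (40, 40, 550, 780)                                  # x1, y1, x2, y2
flipped = flip_box_horizontally(*box, image_width=600)    # width is hypothetical
```

The y coordinates are unchanged and the box keeps its width of 510 pixels, but it now sits at the opposite side of the image.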

Albumentations

Albumentations is a fast and flexible image augmentation library. It offers a wide range of augmentation techniques and is optimized for high performance, making it suitable for tasks that require heavy image processing. It integrates with popular deep learning frameworks such as PyTorch and TensorFlow, and is also part of the PyTorch ecosystem.

Albumentations supports all typical computer vision tasks, including classification, semantic segmentation, instance segmentation, object detection, and pose estimation. The library includes over 70 different augmentations for creating new training samples from existing data. It is widely used in industry, deep learning research, machine learning competitions, and open-source projects.

Let’s start by installing the library first using the pip command.

pip install albumentations

We will import all the necessary packages needed for augmenting data with Albumentations:

import random
import albumentations as A
import cv2
import matplotlib.pyplot as plt

In addition to the Albumentations package, we use OpenCV, an open-source computer vision library that supports a wide range of image formats. Albumentations depends on OpenCV, so installing Albumentations installs OpenCV as well.

The augmentations in the following sections help in preprocessing images, enlarging the image dataset, and improving model performance. You can also implement them with other tools such as Keras or NumPy. The goal of all of them is to make your model more robust and better able to generalize from the training data to images it has never seen before.

Image Flipping

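The ‘A.HorizontalFlip’ and ‘A.VerticalFlip’ transforms flip the image horizontally and vertically. Most augmentations support the parameter ‘p’, which controls the probability that the augmentation is applied. With p=0.5, as in the first example below, the image is flipped in only about half of the calls, so the output may be identical to the input.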

#HorizontalFlip
transform = A.HorizontalFlip(p=0.5)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)

#VerticalFlip
transform = A.VerticalFlip(p=1)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)

Image Scale and Rotate

This augmenter applies a random affine transformation that translates, scales, and rotates the input image.

transform = A.ShiftScaleRotate(p=0.5)
random.seed(7) 
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)

Image ChannelShuffle

This augmenter randomly rearranges the RGB channels of the input image.

# ChannelShuffle is available at the package top level via the alias A
transform = A.ChannelShuffle(p=1.0)
random.seed(7) 
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)
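Channel shuffling amounts to reordering the last axis of the image array. Here is a minimal NumPy sketch of that operation. `shuffle_channels` is a hypothetical helper, not the Albumentations implementation.

```python
import numpy as np

def shuffle_channels(image, rng):
    """Return `image` with its channels in a random order, plus that order."""
    order = rng.permutation(image.shape[-1])
    return image[..., order], order

rng = np.random.default_rng(7)

# 2x2 image whose R, G, B channels hold the constant values 10, 20, 30.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[..., 0], image[..., 1], image[..., 2] = 10, 20, 30

shuffled, order = shuffle_channels(image, rng)
```

Every pixel keeps the same three values, but they may be assigned to different colour channels, which changes the hues while leaving the shapes intact.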

Image Solarize

This augmenter inverts all pixel values greater than a certain threshold in the input image.

transform = A.Solarize(threshold=200, p=1.0)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)
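The solarize rule described above can be written as a single NumPy expression. This sketch assumes 8-bit images and a strict "greater than" comparison. How a particular library treats pixels exactly equal to the threshold is not something the sketch asserts.

```python
import numpy as np

def solarize(image, threshold):
    """Invert (255 - v) every pixel value strictly above `threshold`."""
    return np.where(image > threshold, 255 - image, image).astype(np.uint8)

pixels = np.array([0, 100, 200, 201, 255], dtype=np.uint8)
result = solarize(pixels, threshold=200)
```

Only the two values above 200 change: 201 becomes 54 and 255 becomes 0, while the darker pixels are left as they are.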

Invert Image

By subtracting pixel values from 255, this augmenter inverts the input image.

transform = A.InvertImg(p=1.0)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)

Augmentation pipeline using Compose

To define an augmentation pipeline, first create a Compose instance, passing a list of augmentations as its argument. In this example, we use a variety of augmentations such as transposition, blur, and distortion.

Compose returns a transform function that applies the augmentations to an image.

transform = A.Compose([
    A.RandomRotate90(),
    A.Transpose(),
    A.ShiftScaleRotate(shift_limit=0.08, scale_limit=0.5, rotate_limit=5, p=.8),
    A.Blur(blur_limit=7),
    A.GridDistortion(),
])
random.seed(2) 
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.axis('off')
plt.imshow(augmented_image)
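Conceptually, a pipeline like this walks through its list of transforms and applies each one with its own probability. The toy class below sketches that control flow in plain Python on nested lists. `Pipeline`, `hflip`, and `transpose` are hypothetical names, and this is not how Albumentations is implemented internally.

```python
import random

class Pipeline:
    """Apply (function, probability) steps in order to an image."""

    def __init__(self, steps, seed=None):
        self.steps = steps
        self.rng = random.Random(seed)

    def __call__(self, image):
        for fn, p in self.steps:
            if self.rng.random() < p:   # apply this step with probability p
                image = fn(image)
        return image

def hflip(image):
    """Reverse every row of a nested-list image."""
    return [row[::-1] for row in image]

def transpose(image):
    """Swap rows and columns of a nested-list image."""
    return [list(col) for col in zip(*image)]

image = [[1, 2],
         [3, 4]]

always = Pipeline([(hflip, 1.0), (transpose, 1.0)])(image)   # both steps run
never = Pipeline([(hflip, 0.0), (transpose, 0.0)])(image)    # no step runs
```

Because each step is decided independently, the same pipeline can produce many different variants of one input image.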

SOLT

SOLT is a Deep Learning data augmentation library that supports images, segmentation masks, labels, and key points. SOLT is also fast and has OpenCV in its backend. Complete auto-generated documentation and examples can be found here: https://mipt-oulu.github.io/solt/.

We will start with the installation of SOLT by using the pip command –

pip install solt

Then we will import all the necessary packages of SOLT required for augmenting the image data.

import solt
import solt.transforms as slt
h, w, c = input_img.shape
# Keep at most the top w rows, so a portrait image becomes square
img = input_img[:w]

Here we create a Stream instance for an augmentation pipeline, passing a list of augmentations as an argument to the Stream class.

stream = solt.Stream([
    slt.Rotate(angle_range=(-90, 90), p=1, padding='r'),
    slt.Flip(axis=1, p=0.5),
    slt.Flip(axis=0, p=0.5),
    slt.Shear(range_x=0.3, range_y=0.8, p=0.5, padding='r'),
    slt.Scale(range_x=(0.8, 1.3), padding='r', range_y=(0.8, 1.3), same=False, p=0.5),
    slt.Pad((w, h), 'r'),
    slt.Crop((w, w), 'r'),
    slt.Blur(k_size=7, blur_type='m'),
    solt.SelectiveStream([
        slt.CutOut(40, p=1),
        slt.CutOut(50, p=1),
        slt.CutOut(10, p=1),
        solt.Stream(),
        solt.Stream(),
    ], n=3),
], ignore_fast_mode=True)
fig = plt.figure(figsize=(17,17))
n_augs = 10
random.seed(2)
for i in range(n_augs):
    img_aug = stream({'image': img}, return_torch=False, ).data[0].squeeze()
    ax = fig.add_subplot(1,n_augs,i+1)
    if i == 0:
        ax.imshow(img)
    else:
        ax.imshow(img_aug)
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
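The nested SelectiveStream in the pipeline above picks a subset of its transforms on each call, and the empty Stream() entries act as "do nothing" options. The following plain-Python sketch illustrates that idea under two assumptions: the selection is made without replacement, and the chosen transforms run in the order they were drawn. `select_and_apply` and the toy transforms are hypothetical and operate on a number instead of an image.

```python
import random

def select_and_apply(value, transforms, n, rng):
    """Pick `n` entries from `transforms` at random and apply them in turn."""
    chosen = rng.sample(transforms, n)   # positions are drawn without replacement
    for transform in chosen:
        value = transform(value)
    return value, chosen

def add_one(v):
    return v + 1

def double(v):
    return v * 2

def negate(v):
    return -v

def identity(v):
    return v   # plays the role of an empty stream

transforms = [add_one, double, negate, identity, identity]
result, chosen = select_and_apply(5, transforms, n=3, rng=random.Random(2))
```

Including identity entries in the list means that some calls apply fewer than n effective changes, which adds variety to the augmented samples.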

Conclusion

Image augmentation helps enlarge and diversify an existing dataset. Several Python libraries are currently available for image augmentation. In this article, we explored different image augmentation techniques using three of them: Imgaug, Albumentations, and SOLT.

I hope you enjoyed reading this article! The next time you train a machine learning or a deep learning model, do try one of these three libraries and the techniques shared in this article to generate additional image data quickly.

Key Takeaways

Image augmentation is vital for improving machine learning model performance by enhancing dataset diversity.
Python libraries like Imgaug, Albumentations, and SOLT offer powerful tools for image augmentation.
Techniques include geometric transformations, color adjustments, noise addition, and more.
You can efficiently define augmentation pipelines using libraries like Albumentations in Python.
The goal is to make models more robust and capable of generalizing to unseen data.

Frequently Asked Questions

Q1. What is image augmentation in Python?

A. In Python, we use image augmentation to artificially increase the dataset size by creating modified versions of existing images. This involves applying various transformations such as flipping, rotating, zooming, or shifting the images.

Q2. What is Imgaug in Python?

A. Imgaug is an open-source Python library used for image augmentation in machine learning experiments. It supports a variety of augmentation techniques and can augment images, landmarks, bounding boxes, heatmaps, and segmentation maps.

Q3. How do you augment an image?

A. Image augmentation involves applying various transformations to the original images to generate new ones. These transformations can include geometric changes like flipping, cropping, rotating, zooming, and color transformations like adjusting brightness, darkness, sharpness, saturation, etc.

Q4. What does augment mean in Python?

A. In core Python, ‘augment’ usually refers to augmented assignment operators such as += or |=, which combine an arithmetic or bitwise operation with an assignment. This usage is unrelated to image augmentation.

Q5. What is data augmentation?

A. Data augmentation is a technique used in machine learning to artificially increase the amount of data by creating new data points from existing data. This includes making small changes to the data or using deep learning models to generate new data points.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Devashree

Devashree has an M.Eng degree in Information Technology from Germany and a Data Science background. As an Engineer, she enjoys working with numbers and uncovering hidden insights in diverse datasets from different sectors to build beautiful visualizations to try and solve interesting real-world machine learning problems.

In her spare time, she loves to cook, read & write, discover new Python-Machine Learning libraries or participate in coding competitions.
