Get ready for a thrilling adventure in the realm of computer vision! Our beginner-friendly project involves training a Convolutional Neural Network (CNN) to distinguish between cats and dogs in images, using a dataset containing images of both animals as our training data. Along the way, we'll augment images with Keras's ImageDataGenerator, build the network layer by layer with convolution, pooling, flattening, and dense layers, and use the trained model to classify new images. Let's embark on this exciting journey of image classification together!
A Convolutional Neural Network (CNN) operates by applying convolutional layers, which convolve learned filters (kernels) with the input images. Each filter's weights and biases are learned during training and help the network extract features from different aspects of the image. During training, batches of labeled images are fed into the network, and the predicted class probabilities are compared with the ground-truth labels through a loss function (for multi-class problems, argmax picks the class with the highest predicted probability). Techniques such as batch normalization can be added to stabilize learning by normalizing activations across each batch. The network parameters are adjusted iteratively to minimize the gap between predictions and labels, and this process repeats for each batch, gradually improving the network's prediction capabilities.
This tutorial aims to create a system capable of recognizing cat and dog images. It analyzes input images of cats and images of dogs to make predictions. The implemented model is adaptable for websites or mobile devices. The Dogs vs Cats dataset, available on Kaggle, comprises images for the model to learn distinctive features. After training, the classification model distinguishes between cat and dog images.
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
from os import listdir
from sklearn import metrics
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
A CNN processes images with the help of matrices of weights known as filters. In the early layers, these filters detect low-level features such as vertical and horizontal edges; layer by layer, they combine into filters that recognize higher-level features.
We first initialize the CNN:
#initializing the cnn
classifier=Sequential()
For compiling the CNN, we use the Adam optimizer.
Adaptive Moment Estimation (Adam) is a method that computes an individual adaptive learning rate for each parameter. For the loss function, we use binary cross-entropy, which compares each predicted probability to the actual class label and penalizes the prediction according to its distance from the expected value.
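To make the loss concrete, here is a small illustrative calculation of binary cross-entropy with NumPy; the labels and predicted probabilities below are made up purely for demonstration.
#illustrative binary cross-entropy calculation (made-up labels and probabilities)
import numpy as np
y_true = np.array([1, 0, 1, 0])          # 1 = dog, 0 = cat
y_pred = np.array([0.9, 0.2, 0.6, 0.4])  # predicted probabilities of "dog"
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(bce)  # smaller values mean the predictions are closer to the true labels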
Image augmentation is a method of applying different kinds of transformation to original images resulting in multiple transformed copies of the same image. The images are different from each other in certain aspects because of shifting, rotating, flipping techniques. So, we are using the Keras ImageDataGenerator class to augment our images.
#part2-fitting the cnn to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
We need a way to turn our images into batches of data arrays in memory so that they can be fed to the network during training. ImageDataGenerator can readily be used for this purpose. So, we import this class and create an instance of the generator. We are using Keras to retrieve images from the disk with the flow_from_directory method of the ImageDataGenerator class.
# Generating images for the Test set
test_datagen = ImageDataGenerator(rescale = 1./255)
# Creating training set
training_set = train_datagen.flow_from_directory('C:/Users/khushi shah/AndroidStudioProjects/catanddog/dataset/training_set',
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 class_mode='binary')
# Creating the Test set
test_set = test_datagen.flow_from_directory('C:/Users/khushi shah/AndroidStudioProjects/catanddog/dataset/test_set',
                                            target_size=(64, 64),
                                            batch_size=32,
                                            class_mode='binary')
Convolution is a linear operation in which a set of weights is multiplied with the input. The multiplication takes place between an array of input data and a 2D array of weights called a filter or kernel. The filter is always smaller than the input data, and a dot product is computed between the filter and each patch of the input it covers.
We add the activation function to assist the Artificial Neural Network (ANN) in learning complex patterns within the data. The primary purpose of the activation function is to introduce non-linearity into the neural network.
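To illustrate the dot product between a filter and an image patch, and the ReLU non-linearity, here is a toy NumPy sketch; the 5x5 "image" and the vertical-edge filter are made up purely for demonstration.
#toy example of convolution (dot product of a filter with image patches) followed by ReLU
import numpy as np
img = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)   # tiny "image" containing a vertical edge
kernel = np.array([[-1, 0, 1]] * 3, dtype=float)      # 3x3 vertical-edge filter
out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(img[i:i+3, j:j+3] * kernel)  # dot product over each 3x3 patch
relu_out = np.maximum(out, 0)   # ReLU keeps positive responses and zeroes out the rest
print(relu_out)                 # strongest responses appear along the edge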
The pooling operation provides a degree of spatial invariance, making the system capable of recognizing an object even when its appearance varies somewhat. It slides a 2D filter over each channel of the feature map and summarizes the features lying in the region covered by the filter.
So, pooling basically helps reduce the number of parameters and computations in the network. It progressively reduces the spatial size of the feature maps and thus helps control overfitting. There are two common operations in this layer: average pooling and max pooling. Here, we use max pooling, which, as the name suggests, keeps only the maximum value from each pool: as the filter slides over the input, the maximum value at each stride is taken and the rest are dropped.
Unlike the convolution layer, the pooling layer does not modify the depth of the network.
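As a small illustration, here is 2x2 max pooling with stride 2 applied to a made-up 4x4 feature map using NumPy.
#toy example of 2x2 max pooling with stride 2 on a made-up feature map
import numpy as np
feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 2],
                        [7, 2, 9, 4],
                        [3, 1, 8, 6]], dtype=float)
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))  # keep the max of every 2x2 block
print(pooled)  # [[6. 5.]
               #  [7. 9.]]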
The fully connected layer receives the flattened output from the final pooling layer.
The full connection process practically works as follows: each neuron in the fully connected layer detects a certain feature and preserves its value, then passes that value on to the dog and cat output classes, which weigh the feature and decide how relevant it is to their own prediction.
#step1-convolution
classifier.add(Convolution2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
#step2-maxpooling
classifier.add(MaxPooling2D(pool_size=(2,2)))
#step3-flattening
classifier.add(Flatten())
#step4-fullconnection
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dense(units=1, activation='sigmoid'))
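The compile step referred to earlier is not shown in the original snippet; a minimal sketch using the Adam optimizer and binary cross-entropy discussed above would look like this.
#compiling the cnn (assumed step, not shown in the original code)
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])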
We are fitting our model to the training set. It will take some time for this to finish.
classifier.fit(training_set,
               steps_per_epoch=8000 // 32,   # 8000 training images with a batch size of 32
               epochs=25,
               validation_data=test_set,
               validation_steps=2000 // 32)  # 2000 test images with a batch size of 32
We obtain an accuracy of about 0.8115 on the training set.
We can predict new images with our model using the predict_image function below, where we provide the path of a new image and call the model's predict method. If the predicted probability is 0.5 or higher, the image is classified as a dog; otherwise, it is classified as a cat.
#to predict new images
from keras.preprocessing import image

def predict_image(imagepath, classifier):
    # load the image and resize it to the input size used during training
    predict = image.load_img(imagepath, target_size=(64, 64))
    predict_modified = image.img_to_array(predict)
    predict_modified = predict_modified / 255
    predict_modified = np.expand_dims(predict_modified, axis=0)
    result = classifier.predict(predict_modified)
    if result[0][0] >= 0.5:
        prediction = 'dog'
        probability = result[0][0]
        print("probability = " + str(probability))
    else:
        prediction = 'cat'
        probability = 1 - result[0][0]
        print("probability = " + str(probability))
    print("Prediction = " + prediction)
In this exhilarating journey through the realm of image classification, we delved into the marvels of Convolutional Neural Networks (CNNs). From discerning between cats and dogs to importing the essential Python packages, we've left no stone unturned. This beginner-friendly project provides invaluable insights and sets the stage for exploring diverse applications. With a solid understanding of CNN fundamentals, you're now ready to embark on your own image classification escapades! You can extend the model with techniques like softmax activation for multi-class problems and use model.predict on new images, and don't overlook key metrics like validation loss (val_loss) when assessing model performance.
Frequently Asked Questions
Q1. Why is the Adam optimizer so popular in deep learning?
A. Adam is popular in deep learning due to its adaptive learning rate and momentum features, improving optimization efficiency.
Q2. What is cat and dog classification using CNN?
A. Cat and Dog Classification using CNN involves training a convolutional neural network on labeled cat and dog image data to differentiate between the two classes.
Q3. How does transfer learning work for image classification?
A. In transfer learning, practitioners transfer knowledge from a pre-trained model to a new model, usually achieved by retraining the output layer on new data.
Q4. How are Class Activation Maps generated?
A. Generating Class Activation Maps involves visualizing which parts of an image are important for classification, often done by appending a global average pooling layer and visualizing activations.
Q5. How can I predict images in the test1 dataset?
A. To predict images in the test1 dataset, use a trained model on the test data, typically resizing images to match the training image size and then generating predictions, often with libraries like PyTorch. Detailed tutorials are available on platforms like GitHub.