Neural Network For Classification with Tensorflow

Aytan Last Updated : 14 Oct, 2024

Introduction

In this article, I am going to build artificial neural network models with TensorFlow to solve a classification problem. Let’s explore together how we can approach a classification problem in TensorFlow. But first, I would like to emphasize that a foundational understanding of classification in machine learning will help as we delve into the intricacies of artificial neural networks.

It’s worth keeping in mind that logistic regression is a powerful machine learning method that is widely applied to classification tasks. Even though this article mostly discusses artificial neural networks, recognizing the versatility of methods like logistic regression contributes to a well-rounded understanding of classification techniques.

Learning Objectives

Grasp the core concepts of classification tasks in machine learning, including the definition of classification, types of classification problems, and the role of logistic regression and artificial neural networks in classification.
Gain hands-on experience in building neural network models for classification using TensorFlow, from importing necessary libraries to creating datasets and training models.
Learn strategies for evaluating model performance, identifying model shortcomings through visualization, and implementing optimization techniques to enhance model accuracy and generalization.
Explore different activation functions used in neural networks, such as ReLU and sigmoid, understanding their impact on model performance and their role in introducing non-linearity to the model for better classification of complex data.

This article was published as a part of the Data Science Blogathon!

What is a Neural Network?
What is Classification?
Types of Classification
Importing Python Libraries
Creating a Dataset
Data Visualization
Steps in Modeling Neural Network For Classification with Tensorflow
Improving the Neural Network For Classification model with Tensorflow
Visualize the Neural Network model
Activation Functions for Neural Networks
Evaluate the Model

Frequently Asked Questions

What is a Neural Network?

The main purpose of a neural network is to find the relationships between features in a data set. It consists of a set of learning algorithms loosely modeled on how the human brain works. A “neuron” in a neural network is a mathematical function that collects and classifies information according to a specific architecture.
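To make the idea of a neuron as a mathematical function concrete, here is a minimal sketch in plain Python. The function name `neuron`, the sigmoid squashing step, and all the numbers are assumptions chosen for illustration; they are not taken from TensorFlow.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values chosen only for illustration.
output = neuron(inputs=[0.5, -1.0], weights=[2.0, 1.0], bias=0.0)
print(output)  # 0.5, because the weighted sum is exactly 0
```

A layer is simply many of these functions applied to the same inputs, each with its own weights and bias.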

What is Classification?

A classification problem involves predicting whether something belongs to one class or another. In other words, we try to decide whether something is one thing or another.

Types of Classification

Suppose that you want to predict whether a person has diabetes or not. In this kind of situation there are only two possibilities, right? That is called Binary Classification.
Suppose that you want to identify whether a photo is of a toy, a person, or a cat. This is called Multi-class Classification because there are more than two options.
Suppose that you want to decide which categories should be assigned to an article. This is called Multi-label Classification, because one article can have more than one category assigned. Take this article as an example: we may assign categories like “Deep Learning, TensorFlow, Classification” etc. to it.
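The three problem types differ mainly in how the labels look. The sketch below shows one common way to encode each of them with NumPy; the arrays and the names `class_names` and `tag_names` are made up for illustration.

```python
import numpy as np

# Binary: one 0/1 value per sample (e.g. diabetes: no / yes).
binary_labels = np.array([0, 1, 1, 0])

# Multi-class: exactly one class per sample, shown here one-hot encoded.
class_names = ["toy", "person", "cat"]
class_ids = np.array([2, 0, 1])              # cat, toy, person
multi_class_labels = np.eye(len(class_names), dtype=int)[class_ids]

# Multi-label: any number of tags per sample (multi-hot encoding).
tag_names = ["Deep Learning", "TensorFlow", "Classification"]
multi_label_labels = np.array([[1, 1, 1],    # an article with all three tags
                               [0, 0, 1]])   # an article with one tag

print(multi_class_labels.sum(axis=1))   # every row sums to 1
print(multi_label_labels.sum(axis=1))   # rows may sum to more than 1
```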

Now we can move forward because we have a common understanding of the problem we will be working on. So, it is time for coding. I hope you are writing the code along with me, because the only way to get better and make fewer mistakes is to write more code.

Importing Python Libraries

We are starting with importing Python libraries that we will be using:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
print(tf.__version__)

Creating a Dataset

It is time for creating a dataset to work on:

from sklearn.datasets import make_circles

samples = 1000
X, y = make_circles(samples,
                    noise = 0.03,
                    random_state = 42)

print('X : ', X)
print('\n')
print('y : ', y)

We have created some data. Let’s get more information about it.
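It helps to know what kind of geometry is behind these numbers. The sketch below is a hand-written imitation of two noisy concentric circles, assuming an outer radius of 1 and an inner radius of 0.8 (the default `factor` of `make_circles`). The helper `toy_circles` is hypothetical and is not scikit-learn's implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

def toy_circles(n_samples=1000, noise=0.03, factor=0.8):
    """Hand-written imitation of two concentric circles (not scikit-learn's code)."""
    n_outer = n_samples // 2
    n_inner = n_samples - n_outer
    angles_outer = np.linspace(0, 2 * np.pi, n_outer, endpoint=False)
    angles_inner = np.linspace(0, 2 * np.pi, n_inner, endpoint=False)
    outer = np.c_[np.cos(angles_outer), np.sin(angles_outer)]           # radius 1
    inner = factor * np.c_[np.cos(angles_inner), np.sin(angles_inner)]  # radius 0.8
    points = np.vstack([outer, inner]) + rng.normal(scale=noise, size=(n_samples, 2))
    labels = np.r_[np.zeros(n_outer, dtype=int), np.ones(n_inner, dtype=int)]
    return points, labels

X_toy, y_toy = toy_circles()
radius = np.linalg.norm(X_toy, axis=1)
print(radius[y_toy == 0].mean(), radius[y_toy == 1].mean())  # close to 1.0 and 0.8
```

In data like this, the label depends on the distance from the origin rather than on either coordinate alone.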

Data Visualization

Okay, we have printed our dataset, but the raw numbers still don’t tell us much, right? That is why an important step here is to become one with the data, and visualization is the best way to do this.

circle = pd.DataFrame({ 'X0' : X[:, 0], 'X1' : X[:, 1], 'label' : y})
circle.head()

Here one question arises: what kind of labels are we dealing with?

circle.label.value_counts()
>> 1    500
   0    500
   Name: label, dtype: int64

Looks like we are dealing with a binary classification problem, because we have 2 labels (0 and 1).

plt.scatter(X[:,0], X[:,1], c = y, cmap = plt.cm.RdYlBu)

As I mentioned above, the best way of becoming one with the data is visualization. Now the plot itself tells us what kind of model we need to build: one that is able to distinguish the blue dots from the red dots.

Before building any neural network model, we must check the shapes of our input and output features. They must contain the same number of samples!

print(X.shape, y.shape)
print(len(X), len(y))
>> (1000, 2) (1000,)
   1000 1000

We have the same number of samples in X and y, but the shape of X is different. Why? Let’s check it out.

X[0], y[0]
>> (array([0.75424625, 0.23148074]), 1)

Okay, we have 2 X features for 1 y. So we can move forward without any problem.

Steps in Modeling Neural Network For Classification with Tensorflow

In TensorFlow there are fixed stages for creating a model:

Creating a model – piece together the layers of a Neural Network using the Functional or Sequential API
Compiling a model – defining how a model’s performance should be measured, and how it should improve (loss function and optimizer)
Fitting a model – letting a model find patterns in the data

We will be using the Sequential API. So, let’s get started.

tf.random.set_seed(42)

model_1 = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# We use binary cross-entropy as the loss function because we are working with 2 classes.
# SGD stands for Stochastic Gradient Descent.
model_1.compile(loss = tf.keras.losses.BinaryCrossentropy(),
                optimizer = tf.keras.optimizers.SGD(),
                metrics = ['accuracy'])

model_1.fit(X, y, epochs = 5)

>> Epoch 1/5 32/32 [==============================] - 1s 1ms/step - loss: 2.8544 - accuracy: 0.4600 
   Epoch 2/5 32/32 [==============================] - 0s 2ms/step - loss: 0.7131 - accuracy: 0.5430 
   Epoch 3/5 32/32 [==============================] - 0s 2ms/step - loss: 0.6973 - accuracy: 0.5090 
   Epoch 4/5 32/32 [==============================] - 0s 2ms/step - loss: 0.6950 - accuracy: 0.5010 
   Epoch 5/5 32/32 [==============================] - 0s 1ms/step - loss: 0.6942 - accuracy: 0.4830

The model’s accuracy is approximately 50%, which basically means the model is just guessing. Let’s try to train it longer.

# We set verbose = 0 to hide the training log.
model_1.fit(X, y, epochs = 200, verbose = 0)
model_1.evaluate(X, y)

>> 32/32 [==============================] - 0s 1ms/step - loss: 0.6935 - accuracy: 0.5000 
   [0.6934829950332642, 0.5]

Even after 200 epochs, it still performs like it is guessing. The next step is adding more layers and training for longer.
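The loss value itself also hints at guessing. For a single sample, binary cross-entropy is -(y·log(p) + (1 - y)·log(1 - p)), and a model that always predicts p = 0.5 scores ln(2) ≈ 0.6931 whatever the label is, which is very close to the 0.6935 reported above. The snippet is a plain-Python restatement of that formula, not the Keras implementation.

```python
import math

def binary_cross_entropy(y_true, p):
    """Loss for a single sample with true label y_true and predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A model that always answers "50/50" gets the same loss for either label.
loss_for_guessing = binary_cross_entropy(1, 0.5)
print(round(loss_for_guessing, 4))  # 0.6931, i.e. ln(2)
```

Whenever a binary classifier's loss settles near 0.693, it is worth checking whether it has learned anything at all.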

tf.random.set_seed(42)

model_2 = tf.keras.Sequential([
  tf.keras.layers.Dense(1),
  tf.keras.layers.Dense(1)
])

model_2.compile(loss = tf.keras.losses.BinaryCrossentropy(),
                optimizer = tf.keras.optimizers.SGD(),
                metrics = ['accuracy'])

model_2.fit(X, y, epochs = 100, verbose = 0)

model_2.evaluate(X, y)

>> 32/32 [==============================] - 0s 1ms/step - loss: 0.6933 - accuracy: 0.5000 
   [0.6933314800262451, 0.5]

Still, there is not even a little change. It seems like something is wrong.

Improving the Neural Network For Classification model with Tensorflow

There are different ways of improving a model at different stages:

Creating a model – add more layers, increase the number of hidden units(neurons), change the activation functions of each layer
Compiling a model – try different optimization functions, for example use Adam() instead of SGD().
Fitting a model – we could increase the number of epochs

Let’s try adding more neurons and the Adam optimizer.

tf.random.set_seed(42)

model_3 = tf.keras.Sequential([
  tf.keras.layers.Dense(100), # add 100 dense neurons
  tf.keras.layers.Dense(10), # add another layer with 10 neurons
  tf.keras.layers.Dense(1)
])

model_3.compile(loss=tf.keras.losses.BinaryCrossentropy(),
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy'])

model_3.fit(X, y, epochs=100, verbose=0)

model_3.evaluate(X, y)
>> 32/32 [==============================] - 0s 1ms/step - loss: 0.6980 - accuracy: 0.5080
   [0.6980254650115967, 0.5080000162124634]

Still not getting better! Let’s visualize the model’s predictions to see what is going wrong.

Visualize the Neural Network model

To visualize our model’s predictions we’re going to create a function plot_decision_boundary() which:

Takes in a trained model, features, and labels
Creates a meshgrid of the different X values.
Makes predictions across the meshgrid.
Plots the predictions as colored regions, with the decision boundary where the regions meet.

Note: This function has been adapted from CS231n and Made With ML basics.
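The meshgrid step is easier to follow on a tiny grid first. The sketch below uses only NumPy, with made-up grid sizes, and shows how the two coordinate grids are flattened into one (x, y) row per grid point, which is the input shape a model's predict method expects.

```python
import numpy as np

# Three x values and two y values give a 2 x 3 grid.
xx, yy = np.meshgrid(np.linspace(0, 1, 3), np.linspace(0, 1, 2))

# Flatten both grids and pair them up: one (x, y) row per grid point.
grid_points = np.c_[xx.ravel(), yy.ravel()]

print(xx.shape)           # (2, 3)
print(grid_points.shape)  # (6, 2)
```

Predictions made on `grid_points` can then be reshaped back to `xx.shape`, which is the layout `plt.contourf` needs.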

def plot_decision_boundary(model, X, y):
  # Define the axis boundaries of the plot and create a meshgrid
  x_min, x_max = X[:, 0].min() - 0.1, X[:, 0].max() + 0.1
  y_min, y_max = X[:, 1].min() - 0.1, X[:, 1].max() + 0.1
  xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100),
                       np.linspace(y_min, y_max, 100))
  # Create X values (we're going to predict on all of these)
  x_in = np.c_[xx.ravel(), yy.ravel()]
  # Make predictions using the trained model
  y_pred = model.predict(x_in)
  # Check for multi-class (more than one output value per sample)
  if len(y_pred[0]) > 1:
    print("doing multiclass classification...")
    # We have to reshape our predictions to get them ready for plotting
    y_pred = np.argmax(y_pred, axis=1).reshape(xx.shape)
  else:
    print("doing binary classification...")
    y_pred = np.round(y_pred).reshape(xx.shape)
  # Plot decision boundary
  plt.contourf(xx, yy, y_pred, cmap=plt.cm.RdYlBu, alpha=0.7)
  plt.scatter(X[:, 0], X[:, 1], c=y, s=40, cmap=plt.cm.RdYlBu)
  plt.xlim(xx.min(), xx.max())
  plt.ylim(yy.min(), yy.max())

plot_decision_boundary(model_3, X, y)

Here it is! Again, visualization shows us what is wrong and what to do. Our model is trying to draw a straight line through the data, but our data is not separable by a straight line. Something is missing from our model. What is it?

It is non-linearity! We need some non-linear lines. If you are thinking that you have never seen that kind of function before, you are wrong, because you have. Let’s see them visually. Visualization always works better!
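This also explains why adding layers and neurons did not help earlier. Without an activation function, stacked dense layers collapse into a single linear layer, so the decision boundary stays a straight line. The NumPy sketch below checks this with made-up random weights; the shapes (2 inputs, 10 hidden units, 1 output) are an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "Dense" layers without activations, with made-up weights (2 -> 10 -> 1).
W1, b1 = rng.normal(size=(2, 10)), rng.normal(size=10)
W2, b2 = rng.normal(size=(10, 1)), rng.normal(size=1)

x = rng.normal(size=(5, 2))            # five hypothetical input points
two_layers = (x @ W1 + b1) @ W2 + b2   # layer 1 followed by layer 2

# The same mapping written as ONE linear layer.
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = x @ W + b

print(np.allclose(two_layers, one_layer))  # True
```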

There are activation functions in neural networks that we can use for this, like ReLU and sigmoid. Let’s create a little toy tensor and check those functions on it.

Activation Functions for Neural Networks

A = tf.cast(tf.range(-12,12), tf.float32)
print(A)
>> tf.Tensor(
   [-12. -11. -10.  -9.  -8.  -7.  -6.  -5.  -4.  -3.  -2.  -1.   0.   1.
      2.   3.   4.   5.   6.   7.   8.   9.  10.  11.], shape=(24,), dtype=float32)

Let’s see what our toy tensor looks like.

plt.plot(A)

It looks like this, a straight line!

Now let’s recreate the activation functions to see what they do to our tensor.

Sigmoid:

def sigmoid(x):
  return 1 / (1 + tf.exp(-x))
sigmoid(A)
plt.plot(sigmoid(A))

A non-straight line!

ReLU:

Now let’s check what ReLU does. ReLU turns all negative values to 0 and leaves positive values unchanged.

def relu(x):
  return tf.maximum(0,x)
plt.plot(relu(A))

Another non-straight line!

Now you have seen non-linear activation functions, and these are what will work for us: a model cannot learn the pattern in a non-linear dataset with linear activation functions alone! Now that we have learned this, it is time to divide our data into training and test sets and build a stronger model.

X_train, y_train = X[:800], y[:800]
X_test, y_test = X[800:], y[800:]
X_train.shape, X_test.shape
>>((800, 2), (200, 2))
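A plain slice works here because make_circles shuffles the samples by default. For data that is stored in sorted order, the indices should be shuffled before splitting. The sketch below shows one way to do that with NumPy; the helper `shuffled_split` and the stand-in arrays are hypothetical and only illustrate the idea.

```python
import numpy as np

def shuffled_split(features, labels, test_fraction=0.2, seed=42):
    """Hypothetical helper: shuffle row indices, then cut off a test portion."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(len(features) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], labels[train_idx], features[test_idx], labels[test_idx]

# Stand-in data sorted by label, which a plain slice would split badly.
demo_X = np.arange(20, dtype=float).reshape(10, 2)
demo_y = np.array([0] * 5 + [1] * 5)

X_tr, y_tr, X_te, y_te = shuffled_split(demo_X, demo_y)
print(X_tr.shape, X_te.shape)  # (8, 2) (2, 2)
```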

Great, now we’ve got training and test sets. Let’s model the training data and evaluate what our model has learned on the test set.

tf.random.set_seed(42)

model_4 = tf.keras.Sequential([
  tf.keras.layers.Dense(4, activation = 'relu'), # we may write it as "tf.keras.activations.relu" too
  tf.keras.layers.Dense(4, activation = 'relu'),
  tf.keras.layers.Dense(1, activation = 'sigmoid')
])

model_4.compile(loss = tf.keras.losses.binary_crossentropy,
                optimizer = tf.keras.optimizers.Adam(learning_rate = 0.01), # "lr" is deprecated in recent TensorFlow releases
                metrics = ['accuracy'])

model_4.fit(X_train, y_train, epochs = 25, verbose = 0)

Evaluate the Model

loss, accuracy = model_4.evaluate(X_test, y_test)
print(f' Model loss on the test set: {loss}')
print(f' Model accuracy on the test set: {100*accuracy}')

>> 7/7 [==============================] - 0s 2ms/step - loss: 0.1247 - accuracy: 1.0000
   Model loss on the test set: 0.1246885135769844
   Model accuracy on the test set: 100.0

Voila! 100% accuracy! Let’s see this result visually.

plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title("Train")
plot_decision_boundary(model_4, X=X_train, y=y_train)
plt.subplot(1, 2, 2)
plt.title("Test")
plot_decision_boundary(model_4, X=X_test, y=y_test)
plt.show()

With just a few tweaks our model is now predicting the blue and red circles almost perfectly.
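It can also help to see how an accuracy number comes out of sigmoid outputs. The sketch below thresholds hypothetical probabilities at 0.5, which is an assumption about the cut-off, and then counts the two kinds of mistakes. The numbers are invented and are not the outputs of model_4.

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels, for illustration only.
probabilities = np.array([0.92, 0.15, 0.58, 0.40, 0.81])
true_labels = np.array([1, 0, 1, 1, 0])

predicted = (probabilities > 0.5).astype(int)    # threshold at 0.5
accuracy = (predicted == true_labels).mean()

false_positives = int(((predicted == 1) & (true_labels == 0)).sum())
false_negatives = int(((predicted == 0) & (true_labels == 1)).sum())

print(predicted)                         # [1 0 1 0 1]
print(accuracy)                          # 0.6
print(false_positives, false_negatives)  # 1 1
```

Counting false positives and false negatives separately shows which kind of error a model makes, something a single accuracy figure hides.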

Conclusion

Let’s take a brief look back at what we covered in this article. Together we looked at how to approach a classification task with a neural network in TensorFlow. We created 3 models in the first way that came to mind, and with the help of visualization we realized where we were wrong. We explored linearity and non-linearity, and finally we managed to build a model that generalizes to the test set. What I was trying to show with all this code and the steps I followed is that nothing is 100 percent accurate or fixed; everything continues to change every day. To guess which problem you are likely to face with which kind of data, and to see which combinations lead to a better result, all you need is to write a lot more code and gain experience.

I hope the article was a little helpful to you and made some contributions!

Key Takeaways

Gain insight into the fundamentals of classification tasks in machine learning, focusing on the application of logistic regression and artificial neural networks.
Recognize the different types of classification problems, including binary, multi-class, and multi-label classification, each with its own unique characteristics and applications.
Learn the essential steps involved in building a neural network model for classification using TensorFlow, including model creation, compilation, and training.
Explore methods for improving model performance, such as adding more layers, increasing the number of neurons, changing activation functions, and utilizing different optimization algorithms like Adam.
Understand the importance of visualizing data and model predictions to diagnose issues and refine the model, ultimately achieving better accuracy and generalization.

Frequently Asked Questions

Q1. What is the best neural network for data classification?

A. There’s no one-size-fits-all answer. The choice depends on the specific characteristics of the data and the problem. Convolutional Neural Networks (CNNs) are often used for image classification, while Recurrent Neural Networks (RNNs) are suitable for sequential data.

Q2. What is the neural network structure for classification?

A. A typical structure involves an input layer, one or more hidden layers with activation functions, and an output layer with a softmax activation for multi-class classification. The number of neurons and layers can vary based on the complexity of the task.
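As a small illustration of the softmax output mentioned in this answer, the NumPy sketch below turns raw scores into class probabilities. The scores are hypothetical, and the function is a textbook version rather than TensorFlow's own implementation.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical raw scores for three classes: toy, person, cat.
scores = np.array([1.0, 2.0, 3.0])
probs = softmax(scores)
print(probs.round(3))         # approximately 0.090, 0.245, 0.665
print(int(np.argmax(probs)))  # 2, the index of the most likely class
```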

Q3. Why is Adam the most popular optimizer in Deep Learning?

A. Adam is popular due to its adaptive learning rate mechanism, which dynamically adjusts the learning rate for each parameter during training. This helps it converge faster and deal effectively with different types of data and architectures, making it widely used in deep learning applications.
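To see what a per-parameter learning rate means in practice, here is a minimal NumPy sketch of a single Adam update, following the commonly published update rule. The function `adam_step` and the hyperparameter values are assumptions for illustration, not the Keras implementation.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a parameter array; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2    # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Two parameters whose gradients differ by a factor of 1000.
w = np.array([1.0, 1.0])
g = np.array([0.001, 1.0])
w_new, m, v = adam_step(w, g, m=np.zeros(2), v=np.zeros(2), t=1)
print(w - w_new)  # both entries are close to lr = 0.01
```

Even though the gradients differ by a factor of 1000, both parameters move by about the same amount, because each update is scaled by that parameter's own gradient history.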

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
