Guide on AdaBoost Algorithm

Anshul Last Updated : 04 Apr, 2025

10 min read

AdaBoost algorithm, introduced by Freund and Schapire in 1997, revolutionized ensemble modeling. Since its inception, AdaBoost has become a widely adopted technique for addressing binary classification challenges. This powerful algorithm enhances prediction accuracy by transforming a multitude of weak learners into robust, strong learners

The main idea behind AdaBoosting algorithms is to start by creating a model on the training dataset. Then, we build another model to correct the errors made by the first model. This process continues until the errors are minimized, and the dataset is accurately predicted. AdaBoosting works by combining multiple models, known as weak learners, to create a final, stronger model, referred to as a strong learner.

In this article, you will understand what AdaBoost is, how AdaBosting works, the AdaBoost algorithm in machine learning, and the AdaBoost classifier. AdaBoost, short for Adaptive Boosting, is an ensemble learning technique that combines multiple weak learners to create a strong classifier, improving the accuracy of machine learning models.

Learning Objectives

To understand what the AdaBoost algorithm is and how it works.
To understand what stumps are.
To find out how boosting algorithms help increase the accuracy of ML models.

This article was published as a part of the Data Science Blogathon

What Is the AdaBoost Algorithm?
Understanding the Working of the AdaBoost Algorithm
Python implementation of AdaBoost
- Building AdaBoost from Scratch
- Using Scikit-learn
Frequently Asked Questions

What Is the AdaBoost Algorithm?

There are many machine learning algorithms to choose from for your problem statements. One of these algorithms for predictive modeling is called AdaBoost.

AdaBoost algorithm, short for Adaptive Boosting, is a Boosting technique used as an Ensemble Method in Machine Learning. It is called Adaptive Boosting as the weights are re-assigned to each instance, with higher weights assigned to incorrectly classified instances.

What this algorithm does is that it builds a model and gives equal weights to all the data points. It then assigns higher weights to points that are wrongly classified. Now all the points with higher weights are given more importance in the next model. It will keep training models until and unless a lower error is received.

Let’s take an example to understand this, suppose you built a decision tree algorithm on the Titanic dataset, and from there, you get an accuracy of 80%. After this, you apply a different algorithm and check the accuracy, and it comes out to be 75% for KNN and 70% for Linear Regression.

When building different models on the same dataset, we observe variations in accuracy. However, leveraging the power of AdaBoost classifier, we can combine these algorithms to enhance the final predictions. By averaging the results from diverse models, Adaboost allows us to achieve higher accuracy and bolster predictive capabilities effectively.

Here we will be more focused on mathematics intuition.

There is another ensemble learning algorithm called the gradient ada boosting algorithm. In this algorithm, we try to reduce the error instead of wights, as in AdaBoost. But in this article, we will only be focussing on the mathematical intuition of Adaptive Boosting.

Checkout this article about AdaBoost Algorithm with Python

Understanding the Working of the AdaBoost Algorithm

Let’s understand what and how this algorithm works under the hood with the following tutorial.

Step 1: Assigning Weights

The Image shown below is the actual representation of our dataset. Since the target column is binary, it is a classification problem. First of all, these data points will be assigned some weights. Initially, all the weights will be equal.

The formula to calculate the sample weights is:

Where N is the total number of data points

Here since we have 5 data points, the sample weights assigned will be 1/5.

Step 2: Classify the Samples

We start by seeing how well “Gender” classifies the samples and will see how the variables (Age, Income) classify the samples.

We’ll create a decision stump for each of the features and then calculate the Gini Index of each tree. The tree with the lowest Gini Index will be our first stump.

Here in our dataset, let’s say Gender has the lowest gini index, so it will be our first stump.

Step 3: Calculate the Influence

We’ll now calculate the “Amount of Say” or “Importance” or “Influence” for this classifier in classifying the data points using this formula:

The total error is nothing but the summation of all the sample weights of misclassified data points.

Here in our dataset, let’s assume there is 1 wrong output, so our total error will be 1/5, and the alpha (performance of the stump) will be:

Note: Total error will always be between 0 and 1.

0 Indicates perfect stump, and 1 indicates horrible stump.

From the graph above, we can see that when there is no misclassification, then we have no error (Total Error = 0), so the “amount of say (alpha)” will be a large number.

When the classifier predicts half right and half wrong, then the Total Error = 0.5, and the importance (amount of say) of the classifier will be 0.

If all the samples have been incorrectly classified, then the error will be very high (approx. to 1), and hence our alpha value will be a negative integer.

If you are Beginner and want to know AdaBoost you can go with this article

Step 4: Calculate TE and Performance

You might be wondering about the significance of calculating the Total Error (TE) and performance of an Adaboost stump. The reason is straightforward – updating the weights is crucial. If identical weights are maintained for the subsequent model, the output will mirror what was obtained in the initial model.

The wrong predictions will be given more weight, whereas the correct predictions weights will be decreased. Now when we build our next model after updating the weights, more preference will be given to the points with higher weights.

After finding the importance of the classifier and total error, we need to finally update the weights, and for this, we use the following formula:

The amount of, say (alpha) will be negative when the sample is correctly classified.

The amount of, say (alpha) will be positive when the sample is miss-classified.

There are four correctly classified samples and 1 wrong. Here, the sample weight of that datapoint is 1/5, and the amount of say/performance of the stump of Gender is 0.69.

New weights for correctly classified samples are:

correctly classified | AdaBoost Algorithm

For wrongly classified samples, the updated weights will be:

Note

See the sign of alpha when I am putting the values, the alpha is negative when the data point is correctly classified, and this decreases the sample weight from 0.2 to 0.1004. It is positive when there is misclassification, and this will increase the sample weight from 0.2 to 0.3988

We know that the total sum of the sample weights must be equal to 1, but here if we sum up all the new sample weights, we will get 0.8004. To bring this sum equal to 1, we will normalize these weights by dividing all the weights by the total sum of updated weights, which is 0.8004. So, after normalizing the sample weights, we get this dataset, and now the sum is equal to 1.

Step 5: Decrease Errors

Now, we need to make a new dataset to see if the errors decreased or not. For this, we will remove the “sample weights” and “new sample weights” columns and then, based on the “new sample weights,” divide our data points into buckets.

Step 6: New Dataset

We are almost done. Now, what the algorithm does is selects random numbers from 0-1. Since incorrectly classified records have higher sample weights, the probability of selecting those records is very high.

Suppose the 5 random numbers our algorithm take is 0.38,0.26,0.98,0.40,0.55.

Now we will see where these random numbers fall in the bucket, and according to it, we’ll make our new dataset shown below.

This comes out to be our new dataset, and we see the data point, which was wrongly classified, has been selected 3 times because it has a higher weight.

ReadMore about the Gradient Boosting Algorithm for Beginners

Step 7: Repeat Previous Steps

Now this act as our new dataset, and we need to repeat all the above steps i.e.

Assign equal weights to all the data points.
Find the stump that does the best job classifying the new collection of samples by finding their Gini Index and selecting the one with the lowest Gini index.
Calculate the “Amount of Say” and “Total error” to update the previous sample weights.
Normalize the new sample weights.

Iterate through these steps until and unless a low training error is achieved.

Suppose, with respect to our dataset, we have constructed 3 decision trees (DT1, DT2, DT3) in a sequential manner. If we send our test data now, it will pass through all the decision trees, and finally, we will see which class has the majority, and based on that, we will do predictions
for our test dataset.

Python implementation of AdaBoost

To implement the AdaBoost algorithm in Python, you can either build it from scratch or use libraries like Scikit-learn.

Building AdaBoost from Scratch

Here’s a simple implementation of the AdaBoost algorithm using only NumPy:

python

import numpy as np

class DecisionStump:
    def __init__(self):
        self.polarity = 1
        self.feature_idx = None
        self.threshold = None
        self.alpha = None

    def predict(self, X):
        n_samples = X.shape[0]
        predictions = np.ones(n_samples)
        feature_column = X[:, self.feature_idx]

        if self.polarity == 1:
            predictions[feature_column < self.threshold] = -1
        else:
            predictions[feature_column > self.threshold] = -1

        return predictions

class AdaBoost:
    def __init__(self, n_clf=5):
        self.n_clf = n_clf
        self.clfs = []

    def fit(self, X, y):
        n_samples, n_features = X.shape
        w = np.full(n_samples, (1 / n_samples))

        for _ in range(self.n_clf):
            clf = DecisionStump()
            min_error = float('inf')

            for feature_i in range(n_features):
                X_column = X[:, feature_i]
                thresholds = np.unique(X_column)

                for threshold in thresholds:
                    predictions = np.ones(n_samples)
                    predictions[X_column < threshold] = -1

                    error = sum(w[y != predictions])

                    if error > 0.5:
                        error = 1 - error
                        p = -1
                    else:
                        p = 1

                    if error < min_error:
                        clf.polarity = p
                        clf.threshold = threshold
                        clf.feature_idx = feature_i
                        min_error = error

            EPS = 1e-10
            clf.alpha = 0.5 * np.log((1.0 - min_error + EPS) / (min_error + EPS))
            predictions = clf.predict(X)
            w *= np.exp(-clf.alpha * y * predictions)
            w /= np.sum(w)
            self.clfs.append(clf)

    def predict(self, X):
        clf_preds = [clf.alpha * clf.predict(X) for clf in self.clfs]
        y_pred = np.sum(clf_preds, axis=0)
        return np.sign(y_pred)

Using Scikit-learn

If you prefer a more straightforward approach, you can use the Scikit-learn library, which has a built-in AdaBoost classifier. Here’s how to do it:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import pandas as pd

# Load dataset
data = pd.read_csv("Iris.csv")  # Adjust the file path as necessary
X = data.iloc[:, :-1].values  # Features
y = data.iloc[:, -1].values  # Target

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the AdaBoost classifier
abc = AdaBoostClassifier(base_estimator=DecisionTreeClassifier(max_depth=1), n_estimators=50)

# Fit the model
abc.fit(X_train, y_train)

# Predict and evaluate
y_pred = abc.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

Here You Can Clear your Doubts with Linear Regression With Python Implementation

Conclusion

You have finally mastered this algorithm if you understand each and every line of this article.

We started by introducing you to what Boosting is and what are its various types to make sure that you understand the Adaboost classifier and where AdaBoost algorithm falls exactly. We then applied straightforward math and saw how every part of the formula worked.

Hope you like the article! The AdaBoost algorithm, also known as the Ada boosting algorithm, enhances the performance of weak classifiers by combining their predictions. This powerful technique, often referred to as Ada boost, improves accuracy significantly.

In the next article, I will explain Gradient Descent and Xtreme Gradient Descent algorithm, which are a few more important Boosting techniques to enhance the prediction power.

If you want to know about the python implementation for beginners of the AdaBoost classifier machine learning model from scratch, then visit this complete guide from analytics vidhya. This article mentions the difference between bagging and ada boosting, as well as the advantages and disadvantages of the AdaBoost algorithm.

Checkout this article about AdaBoost with Introduction to Ensemble learning

Key Takeaways:

In this article, we understood how ada boosting works.
We understood the maths behind adaboost.
We learned how weak learners are used as estimators to increase accuracy.

Frequently Asked Questions

Q1. Is the AdaBoost algorithm supervised or unsupervised?

A. Adaboost falls under the supervised learning branch of machine learning. This means that the training data must have a target variable. Using the adaboost learning technique, we can solve both classification and regression problems.

Q2. What are the advantages of the AdaBoost algorithm?

A. Lesser preprocessing is required, as you do not need to scale the independent variables. Each iteration in the AdaBoost algorithm uses decision stumps as individual models, so the preprocessing required is the same as decision trees. AdaBoost is less prone to overfitting as well. In addition to ada boost weak learners, we can also fine-tune hyperparameters(learning_rate, for example) in these ensemble techniques to get even better accuracy.

Q3. How do you use the AdaBoost algorithm?

A. Much like random forests, decision trees, logistic regression, and svm classifiers, AdaBoost also requires the training data to have a target variable. This target variable could be either categorical or continuous. The scikit-learn library contains the Adaboost classifiers and regressors; hence we can use sklearn in python to create an adaboost model.

Q4.What is the difference between AdaBoost and boosting?

Boosting: Makes a strong learner from many weak ones. Focuses on improving past mistakes with each learner.

AdaBoost: A type of boosting. Focuses on hard-to-learn examples by giving them more weight during training.

Anshul

I have recently graduated with a Bachelor's degree in Statistics and am passionate about pursuing a career in the field of data science, machine learning, and artificial intelligence. Throughout my academic journey, I thoroughly enjoyed exploring data to uncover valuable insights and trends.

I am eager to continue learning and expanding my knowledge in the field of data science. I am particularly interested in exploring deep learning and natural language processing, and I am constantly seeking out new challenges to improve my skills. My ultimate goal is to use my expertise to help businesses and organizations make data-driven decisions and drive growth and success.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Basics of Machine Learning

Machine Learning Lifecycle

Importance of Stats and EDA

Understanding Data

Probability

Exploring Continuous Variable

Exploring Categorical Variables

Missing Values and Outliers

Central Limit theorem

Bivariate Analysis Introduction

Continuous - Continuous Variables

Continuous Categorical

Categorical Categorical

Multivariate Analysis

Different tasks in Machine Learning

Build Your First Predictive Model

Evaluation Metrics

Preprocessing Data

Linear Models

KNN

Selecting the Right Model

Feature Selection Techniques

Decision Tree

Feature Engineering

Naive Bayes

Multiclass and Multilabel

Basics of Ensemble Techniques

Advance Ensemble Techniques

Hyperparameter Tuning

Support Vector Machine

Advance Dimensionality Reduction

Unsupervised Machine Learning Methods

Recommendation Engines

Improving ML models

Working with Large Datasets

Interpretability of Machine Learning Models

Automated Machine Learning

Model Deployment

Deploying ML Models

Embedded Devices

Guide on AdaBoost Algorithm

Learning Objectives

Table of contents

What Is the AdaBoost Algorithm?

Understanding the Working of the AdaBoost Algorithm

Step 1: Assigning Weights

Step 2: Classify the Samples

Step 3: Calculate the Influence

Step 4: Calculate TE and Performance

Note

Step 5: Decrease Errors

Step 6: New Dataset

Step 7: Repeat Previous Steps

Python implementation of AdaBoost

Building AdaBoost from Scratch

Using Scikit-learn

Conclusion

Key Takeaways:

Frequently Asked Questions

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck