What is Decision Tree?

Anshul Last Updated : 24 Mar, 2025


Decision trees are a simple machine learning tool used for classification and regression tasks. They break complex decisions into smaller steps, making them easy to understand and implement. This article explains what decision trees are, how they work, and their advantages, disadvantages, and applications.

This article was published as a part of the Data Science Blogathon!

Understanding Decision Tree
Types of Decision Tree
Decision Tree Terminologies
How Decision Tree Algorithms Work?
Decision Tree Assumptions
Advantages of Decision Trees
Disadvantages of Decision Trees
How do Decision Trees use Entropy?
Applications of Decision Trees
Frequently Asked Questions

Understanding Decision Tree

A decision tree, which has a hierarchical structure made up of root, branches, internal, and leaf nodes, is a non-parametric supervised learning approach used for classification and regression applications.

Decision trees have applications spanning several different areas. As the name suggests, a decision tree uses a flowchart-like tree structure to show the predictions that result from a series of feature-based splits. It starts with a root node and ends with a decision made by the leaves.

Types of Decision Tree

ID3 : This algorithm measures how mixed up the data is at a node using something called entropy. It then chooses the feature that helps to clarify the data the most.

C4.5 : This is an improved version of ID3 that can handle missing data and continuous attributes.

CART : This algorithm uses a different measure called Gini impurity to decide how to split the data. It can be used for both classification (sorting data into categories) and regression (predicting continuous values) tasks.
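
To make the two impurity measures concrete, here is a minimal Python sketch that computes entropy and Gini impurity for the class labels in a node. The function names `entropy` and `gini` and the example labels are illustrative assumptions, not taken from any particular library.

```python
import math
from collections import Counter

def entropy(labels):
    """Base-2 entropy of a list of class labels (the measure ID3 relies on)."""
    total = len(labels)
    # p * log2(1/p) is the same as -p * log2(p), written to avoid a "-0.0" result.
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels (the measure CART relies on)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

mixed = ["yes", "yes", "no", "no"]    # evenly mixed node: most impure case for two classes
pure = ["yes", "yes", "yes", "yes"]   # pure node: no uncertainty left

print(entropy(mixed), gini(mixed))  # 1.0 0.5
print(entropy(pure), gini(pure))    # 0.0 0.0
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why either one can be used to score candidate splits.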

Decision Tree Terminologies

Before learning more about decision trees let’s get familiar with some of the terminologies:

Root Node: The initial node at the beginning of a decision tree, where the entire population or dataset starts dividing based on various features or conditions.
Decision Nodes: Nodes resulting from the splitting of root nodes are known as decision nodes. These nodes represent intermediate decisions or conditions within the tree.
Leaf Nodes: Nodes where further splitting is not possible, often indicating the final classification or outcome. Leaf nodes are also referred to as terminal nodes.
Sub-Tree: Just as a subsection of a graph is called a sub-graph, a subsection of a decision tree is referred to as a sub-tree. It represents a specific portion of the decision tree.
Pruning: The process of removing or cutting down specific nodes in a tree to prevent overfitting and simplify the model.
Branch: A branch is a subsection of the entire tree that represents a specific path of decisions and outcomes within the tree.
Parent and Child Node: In a decision tree, a node that is divided into sub-nodes is known as a parent node, and the sub-nodes emerging from it are referred to as child nodes. The parent node represents a decision or condition, while the child nodes represent the potential outcomes or further decisions based on that condition.


Example of Decision Tree

Let’s understand decision trees with the help of an example:

Decision trees are drawn upside down, which means the root is at the top and is then split into several nodes. In layman’s terms, they are nothing but a bunch of if-else statements: the tree checks whether a condition is true and, if it is, moves to the next node attached to that decision.

In the diagram below, the tree first asks: what is the weather? Is it sunny, cloudy, or rainy? Depending on the answer, it moves to the next feature, which is humidity or wind. It then checks, for example, whether the wind is strong or weak; if the wind is weak and it’s rainy, the person may go and play.

Did you notice anything in the above flowchart? We see that if the weather is cloudy, then we always go to play. Why didn’t it split more? Why did it stop there?
To answer this question, we need to know about a few more concepts like entropy, information gain, and the Gini index. But in simple terms, the output in the training dataset is always “yes” for cloudy weather; since there is no disorder here, we don’t need to split the node further.
The goal of machine learning is to decrease the uncertainty, or disorder, in the dataset, and we use these trees to do that.
Now you must be wondering: how do I know what the root node should be? What should the decision nodes be? When should I stop splitting? To decide this, there is a metric called “Entropy”, which measures the amount of uncertainty in the dataset.
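
Since a decision tree is essentially a set of nested if-else statements, the weather example can be sketched as a small Python function. This is only an illustration: the cloudy branch and the rainy, weak-wind branch follow the description above, while the remaining outcomes (and the function name `should_play`) are assumptions borrowed from the classic "play tennis" toy dataset.

```python
def should_play(weather, humidity=None, wind=None):
    """Hand-written if-else version of the weather tree.

    Only the cloudy and rainy/weak-wind outcomes come from the article;
    every other outcome is an ASSUMPTION made for illustration.
    """
    if weather == "cloudy":
        return True                     # training data is always "yes" here, so no further split
    if weather == "rainy":
        return wind == "weak"           # weak wind -> play; strong wind -> assumed "no"
    if weather == "sunny":
        return humidity == "normal"     # ASSUMPTION: normal humidity -> play, high -> no
    raise ValueError(f"unknown weather: {weather!r}")

print(should_play("cloudy"))                # True
print(should_play("rainy", wind="weak"))    # True
print(should_play("rainy", wind="strong"))  # False
```

A learned tree has the same shape; the difference is that the algorithm chooses the questions and their order from the data instead of having them written by hand.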

How Decision Tree Algorithms Work?

The decision tree algorithm works in a few simple steps:

Starting at the Root: The algorithm begins at the top, called the “root node,” representing the entire dataset.
Asking the Best Questions: It looks for the most important feature or question that splits the data into the most distinct groups. This is like asking a question at a fork in the tree.
Branching Out: Based on the answer to that question, it divides the data into smaller subsets, creating new branches. Each branch represents a possible route through the tree.
Repeating the Process: The algorithm continues asking questions and splitting the data at each branch until it reaches the final “leaf nodes,” representing the predicted outcomes or classifications.
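
The four steps above can be sketched in plain Python. This is a simplified, ID3-style illustration for categorical features only, not a production implementation; the helper names (`best_feature`, `build_tree`, `predict`) and the tiny dataset are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())

def best_feature(rows, labels, features):
    """Asking the best question: pick the feature with the lowest weighted entropy."""
    def weighted_entropy(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(features, key=weighted_entropy)

def build_tree(rows, labels, features):
    # Leaf node: the subset is pure, or there are no features left to split on.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    # Branching out: one child per observed value of the chosen feature.
    for value in sorted({row[feature] for row in rows}):
        subset = [(r, lab) for r, lab in zip(rows, labels) if r[feature] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [lab for _, lab in subset]
        # Repeating the process on the smaller subset.
        branches[value] = build_tree(sub_rows, sub_labels, remaining)
    return {feature: branches}

def predict(tree, row):
    """Walk from the root to a leaf by following the row's feature values."""
    while isinstance(tree, dict):
        feature, branches = next(iter(tree.items()))
        tree = branches[row[feature]]
    return tree

# Made-up training data, for illustration only.
rows = [
    {"weather": "sunny", "wind": "weak"},
    {"weather": "sunny", "wind": "strong"},
    {"weather": "cloudy", "wind": "weak"},
    {"weather": "cloudy", "wind": "strong"},
    {"weather": "rainy", "wind": "weak"},
    {"weather": "rainy", "wind": "strong"},
]
labels = ["no", "no", "yes", "yes", "yes", "no"]

tree = build_tree(rows, labels, ["weather", "wind"])  # starting at the root
print(tree)
print(predict(tree, {"weather": "rainy", "wind": "weak"}))  # yes
```

On this data the sketch picks `weather` as the root, turns the cloudy and sunny branches into leaves immediately, and only splits the rainy branch further on `wind`. Note that `predict` raises a `KeyError` for a feature value that never appeared in training; real libraries handle such cases more carefully.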


Decision Tree Assumptions

Several assumptions are made to build effective models when creating decision trees. These assumptions help guide the tree’s construction and impact its performance. Here are some common assumptions and considerations when creating decision trees:

Binary Splits

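Many decision tree implementations, such as CART, make binary splits, meaning each node divides the data into two subsets based on a single feature or condition. This assumes that each decision can be represented as a binary choice. Other algorithms, such as ID3, can instead create one branch per value of a categorical feature.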

Recursive Partitioning

Decision trees use a recursive partitioning process, where each node is divided into child nodes, and this process continues until a stopping criterion is met. This assumes that data can be effectively subdivided into smaller, more manageable subsets.

Feature Independence

These trees often assume that the features used for splitting nodes are independent. In practice, feature independence may not hold, but decision trees can still perform well when features are correlated.

Homogeneity

They aim to create homogeneous subgroups in each node, meaning that the samples within a node are as similar as possible regarding the target variable. This assumption helps in achieving clear decision boundaries.

Top-Down Greedy Approach

They are constructed using a top-down, greedy approach, where each split is chosen to maximize information gain or minimize impurity at the current node. This may not always result in the globally optimal tree.
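
As a rough illustration of one greedy step, the sketch below searches for the binary threshold on a single numeric feature that minimizes the weighted Gini impurity at the current node, in the spirit of CART. The function name `best_threshold` and the humidity readings are hypothetical and chosen only for demonstration.

```python
from collections import Counter

def gini(labels):
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_threshold(values, labels):
    """Greedy search for the split `value <= t` with the lowest weighted Gini impurity."""
    best_t, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Made-up humidity readings and outcomes, for illustration only.
humidity = [65, 70, 75, 80, 85, 90]
play = ["yes", "yes", "yes", "no", "no", "no"]

print(best_threshold(humidity, play))  # (77.5, 0.0)
```

The search only looks at the current node. It never reconsiders an earlier split, which is exactly why the resulting tree may not be globally optimal.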

Advantages of Decision Trees

Easy to Understand: They are simple to visualize and interpret, making them easy to understand even for non-experts.
Handles Both Numerical and Categorical Data: They can work with both types of data without needing much preprocessing.
No Need for Data Scaling: These trees do not require normalization or scaling of data.
Automated Feature Selection: They automatically identify the most important features for decision-making.
Handles Non-Linear Relationships: They can capture non-linear patterns in the data effectively.

Disadvantages of Decision Trees

Overfitting Risk: They can easily overfit the training data, especially if they are too deep.
Unstable with Small Changes: Small changes in data can lead to completely different trees.
Biased with Imbalanced Data: They tend to be biased if one class dominates the dataset.
Limited to Axis-Parallel Splits: They struggle with diagonal or complex decision boundaries.
Can Become Complex: Large trees can become hard to interpret and may lose their simplicity.

How do Decision Trees use Entropy?

Now that we know what entropy is, we need to see exactly how it is used in this algorithm.
Entropy basically measures the impurity of a node. Impurity is the degree of randomness; it tells how random our data is. A pure sub-split means that you get either all “yes” or all “no”.
Suppose a feature has 8 “yes” and 4 “no” initially; after the first split, the left node gets 5 “yes” and 2 “no”, whereas the right node gets 3 “yes” and 2 “no”.
We see here that the split is not pure. Why? Because we can still see some negative classes in both nodes. To build a decision tree, we need to calculate the impurity of each split, and when the purity is 100%, we make the node a leaf node.

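To check the impurity of the two child nodes (feature 2 and feature 3), we use the entropy formula:

$$E(S) = -\sum_{i} p_i \log_2 p_i$$

where $p_i$ is the fraction of samples in the node that belong to class $i$.

For feature 2 (the left node, with 5 “yes” and 2 “no”),

$$E = -\tfrac{5}{7}\log_2\tfrac{5}{7} - \tfrac{2}{7}\log_2\tfrac{2}{7} \approx 0.863$$

For feature 3 (the right node, with 3 “yes” and 2 “no”),

$$E = -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971$$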

We can clearly see from the tree itself that the left node has lower entropy, or more purity, than the right node, since the left node has a greater proportion of “yes” and it is easier to decide here.
Always remember: the higher the entropy, the lower the purity and the higher the impurity.
As mentioned earlier, the goal of machine learning is to decrease the uncertainty or impurity in the dataset. Entropy gives us the impurity of a particular node, but it does not tell us whether the entropy has decreased relative to the parent node.
For this, we bring a new metric called “Information gain” which tells us how much the parent entropy has decreased after splitting it with some feature.
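
As a worked check of the numbers in this section, the following Python sketch computes the entropy of the parent node (8 “yes”, 4 “no”) and of the two child nodes (5/2 and 3/2), then the information gain of the split. It assumes the standard definition of information gain, parent entropy minus the size-weighted average of the child entropies; the helper name `entropy` is illustrative.

```python
import math

def entropy(yes, no):
    """Base-2 entropy of a node holding `yes` positive and `no` negative samples."""
    total = yes + no
    # Classes with zero samples contribute nothing, so they are skipped.
    return sum((c / total) * math.log2(total / c) for c in (yes, no) if c > 0)

parent = entropy(8, 4)   # node before the split
left = entropy(5, 2)     # left child
right = entropy(3, 2)    # right child

# Information gain = parent entropy minus the size-weighted child entropies.
weighted_children = (7 / 12) * left + (5 / 12) * right
gain = parent - weighted_children

print(f"parent={parent:.3f} left={left:.3f} right={right:.3f} gain={gain:.3f}")
# parent=0.918 left=0.863 right=0.971 gain=0.010
```

The gain of roughly 0.010 is very small, which says that this particular split barely reduces the uncertainty; a feature producing purer children would score higher and be preferred.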


Applications of Decision Trees

Healthcare
- Diagnosing diseases based on patient symptoms: Decision trees help doctors analyze symptoms and medical history to identify potential illnesses. For example, they can determine if a patient has diabetes or heart disease by evaluating factors like age, weight, and test results.
- Predicting patient outcomes and treatment effectiveness: Decision trees can predict how a patient might respond to a specific treatment, helping doctors choose the best course of action.
- Identifying risk factors for specific health conditions: They can analyze data to find patterns, such as lifestyle habits or genetic factors, that increase the risk of diseases like cancer or diabetes.
Finance
- Assessing credit risk for loan approvals: Decision trees evaluate an applicant’s credit history, income, and other factors to decide whether to approve or reject a loan application.
- Detecting fraudulent transactions: By analyzing transaction patterns, decision trees can flag unusual or suspicious activities, helping banks prevent fraud.
- Predicting stock market trends and investment risks: They analyze historical data to forecast market trends, helping investors make informed decisions.
Marketing
- Segmenting customers for targeted campaigns: Decision trees group customers based on their behavior, preferences, or demographics, allowing businesses to create personalized marketing strategies.
- Predicting customer churn and retention: They analyze customer data to identify those likely to stop using a service, enabling companies to take proactive steps to retain them.
- Recommending products based on customer preferences: Decision trees suggest products or services to customers based on their past purchases or browsing history.
Education
- Predicting student performance and outcomes: They analyze factors like attendance, grades, and study habits to predict how well a student might perform in exams or courses.
- Identifying factors affecting student dropout rates: They help schools understand why students drop out, such as financial issues or academic struggles, so they can intervene early.
- Personalizing learning paths for students: They can recommend tailored learning materials or courses based on a student’s strengths and weaknesses.

Conclusion

To summarize, in this article we learned about decision trees: on what basis the tree splits the nodes, how overfitting can be prevented, and why linear regression doesn’t work for classification problems. To check out the full implementation, please refer to my GitHub repository. We hope you liked this article and now have a clear understanding of the decision tree algorithm and its examples.

Frequently Asked Questions

Q1. Why is it called a decision tree?

A. A decision tree is a tree-like structure that represents a series of decisions and their possible consequences. It is used in machine learning for classification and regression tasks. An example of a decision tree is a flowchart that helps a person decide what to wear based on the weather conditions.

Q2. What are the three types of decision trees?

A. The three main types are:
Classification Trees: Used to predict categories (e.g., yes/no, spam/not spam).
Regression Trees: Used to predict numerical values (e.g., house prices, temperature).
CART (Classification and Regression Trees): An algorithm that can build both classification and regression trees.

Q3. What are the 4 types of decision tree?

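A. Lists of four types usually name the classification tree, the regression tree, the cost-complexity pruned tree, and the reduced-error pruned tree. The first two differ in the kind of target they predict; the last two are trees simplified with a particular pruning technique rather than distinct tree types.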

Q4. What is a decision tree algorithm?

A. A decision tree algorithm is a machine learning algorithm that uses a decision tree to make predictions. It follows a tree-like model of decisions and their possible consequences. The algorithm works by recursively splitting the data into subsets based on the most significant feature at each node of the tree.

Q5. What is an example of a decision tree?

A. A decision tree is like a flowchart that helps make decisions. For example, imagine deciding whether to play outside or stay indoors. The tree might ask, “Is it raining?” If yes, you stay indoors. If no, it might ask, “Is it too hot?” and so on, until you reach a decision.



