Elbow Method for Optimal Cluster Number in K-Means

Basil Saji Last Updated : 10 Mar, 2025

9 min read

Clustering is an unsupervised machine-learning technique. It is the process of division of the dataset into groups in which the members in the same group possess similarities in features. The commonly used clustering techniques are K-Means clustering, Hierarchical clustering, Density-based clustering, Model-based clustering, etc. It can even handle large datasets. We can implement the K-Means clustering machine learning algorithm in the elbow method using the scikit-learn library in Python.

Learning Objectives

Learn the basics of clustering in unsupervised machine learning and its significance in data analysis.
Grasp the steps and processes involved in the K-Means clustering algorithm.
Apply the Elbow method to determine the optimal number of clusters in K-Means using Python.
Assess the quality of K-Means clusters using Within-Cluster Sum of Squares (WCSS).
Recognize the drawbacks of the Elbow method and K-Means, and explore alternative techniques for cluster determination.

This article was published as a part of the Data Science Blogathon

What is the Elbow Method ?
What is the Elbow Method in K-Means Clustering?
K Means Clustering Using the Elbow Method
Elbow method k means
Implementation of the Elbow Method
Full Code
Elbow Method Drawbacks
Conclusion
Frequently Asked Questions

Quiz Time

Are you up for a challenge in the realm of the Elbow Method for Finding the Optimal Number of Clusters in K-Means? Challenge yourself here!

What is the Elbow Method ?

The Elbow Method is a technique used in data analysis and machine learning for determining the optimal number of clusters in a dataset. It involves plotting the variance explained by different numbers of clusters and identifying the “elbow” point, where the rate of variance decreases sharply levels off, suggesting an appropriate cluster count for analysis or model training.

This method is a visual technique used to determine the best K value for a k-means clustering algorithm. In this method, a graph known as the elbow graph plots the within-cluster-sum-of-square (WCSS) values against various K values. The optimal K value is identified at the point where the graph bends like an elbow.

What is the Elbow Method in K-Means Clustering?

The elbow method is a graphical representation of finding the optimal ‘K’ in a K-means clustering. It works by finding WCSS (Within-Cluster Sum of Square) i.e. the sum of the square distance between points in a cluster and the cluster centroid.

Let’s go through the steps involved in K-means clustering for a better understanding:

Select the number of clusters for the dataset (K)
Select the K number of centroids randomly from the dataset.
Now we will use Euclidean distance or Manhattan distance as the metric to calculate the distance of the points from the nearest centroid and assign the points to that nearest cluster centroid, thus creating K clusters.
Now we find the new centroid of the clusters thus formed.
Again reassign the whole data point based on this new centroid, then repeat step 4. We will continue this for a given number of iterations until the position of the centroid doesn’t change, i.e., there is no more convergence.

Finding the optimal number of clusters is an important part of this algorithm. A commonly used method for finding the optimum K value is Elbow Method.

Checkout this article how K means Clustering Works!

K Means Clustering Using the Elbow Method

In the Elbow method, we are actually varying the number of clusters (K) from 1 – 10. For each value of K, we are calculating WCSS (Within-Cluster Sum of Square). WCSS is the sum of the squared distance between each point and the centroid in a cluster. When we plot the WCSS with the K value, the plot looks like an Elbow. As the number of clusters increases, the WCSS value will start to decrease. WCSS value is largest when K = 1. When we analyze the graph, we can see that the graph will rapidly change at a point and thus creating an elbow shape. From this point, the graph moves almost parallel to the X-axis. The K value corresponding to this point is the optimal value of K or an optimal number of clusters.

K-Means Clustering number of clusters | | elbow method

Now let’s implement K-Means clustering using Python.

Elbow method k means

The elbow method is a common technique used to determine the optimal number of clusters (k) in k-means clustering. It’s a graphical approach that relies on the idea that as you increase the number of clusters, the sum of squared distances between points and their cluster centers (WCSS) will continue to decrease. This is because you’re essentially splitting the data into increasingly finer groups.

Here’s how it works:

Calculate WCSS for different K values: You iterate through a range of possible k values (number of clusters). For each k, you perform k-means clustering and calculate the WCSS. This represents the total variance within each cluster.
Plot the WCSS vs. K: You plot the WCSS values on the y-axis and the number of clusters (k) on the x-axis.
Identify the elbow point: The ideal k value is identified as the “elbow” of the curve. This is the point where the WCSS starts to decrease at a much slower rate. In essence, you’re looking for the point where adding more clusters isn’t significantly improving the fit within the clusters.

Here are some things to keep in mind about the elbow method:

It can be subjective: The “elbow” might not always be a very sharp bend, and it can be difficult to determine the exact point objectively.
Not always the best method: The elbow method might not be suitable for all datasets, especially for those with high dimensionality or clusters of irregular shapes.
Consider other factors: While the elbow method is a helpful starting point, it’s important to consider other factors like the interpretability of the clusters and the domain knowledge of the data when choosing the final k value.pen_spark

tunesharemore_vert

Read More about this article how to choose the value if K in KNN Algorithm

Implementation of the Elbow Method

Sample Dataset

The dataset we are using here is the Mall Customers data (Download here). It’s unlabeled data that contains the details of customers in a mall (features like genre, age, annual income(k$), and spending score). Our aim is to cluster the customers based on the relevant features of annual income and spending score.

Importing Libraries

First of all, we have to import the essential libraries.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd 
import sklearn

Now let’s import the given dataset and slice the important features.

dataset = pd.read_csv('Mall_Customers.csv')
X = dataset.iloc[:, [3, 4]].values

We have to find the optimal K value for clustering the data. Now we are using the Elbow Method to find the optimal K value.

from sklearn.cluster import KMeans
wcss = [] for i in range(1, 11): 
    kmeans = KMeans(n_clusters = i, init = 'k-means++', random_state = 42)
    kmeans.fit(X) 
    wcss.append(kmeans.inertia_)

The “init” argument is the method for initializing the centroid. We calculated the WCSS value for each K value. Now we have to plot the WCSS with the K value.

The graph will be like this:

Training the Model

The point at which the elbow shape is created is 5; that is, our K value or an optimal number of clusters is 5. Now let’s train the model on the input data with a number of clusters 5.

kmeans = KMeans(n_clusters = 5, init = "k-means++", random_state = 42)
y_kmeans = kmeans.fit_predict(X)

y_kmeans will be:

array([3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0,
       3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 1,
       3, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 2, 1, 2, 4, 2, 4, 2,
       1, 2, 4, 2, 4, 2, 4, 2, 4, 2, 1, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2,
       4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2,
       4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2,
       4, 2])

y_kmeans gives us different clusters corresponding to X. Now, let’s plot all the clusters using matplotlib.

plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 60, c = 'red', label = 'Cluster1')
plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 60, c = 'blue', label = 'Cluster2')
plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 60, c = 'green', label = 'Cluster3)
plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 60, c = 'violet', label = 'Cluster4')
plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 60, c = 'yellow', label = 'Cluster5') 
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 100, c = 'black', label = 'Centroids')
plt.xlabel('Annual Income (k$)') plt.ylabel('Spending Score (1-100)') plt.legend() 

plt.show()

Graph:

Now we will visualize the clusters using the scatter plot. As you can see, there are 5 clusters in total that are visualized in different colors, and the centroid of each cluster is visualized in black color.

Full Code

# Importing the libraries
import numpy as np 
import matplotlib.pyplot as plt
import pandas as pd # Importing the dataset 

X = dataset.iloc[:, [3, 4]].values
dataset = pd.read_csv('Mall_Customers.csv') 

from sklearn.cluster import KMeans

# Using the elbow method to find the optimal number of clusters wcss = [] for i in range(1, 11): 
 wcss.append(kmeans.inertia_)
 kmeans = KMeans(n_clusters = i, init = 'k-means++', random_state = 42) kmeans.fit(X) plt.plot(range(1, 11), wcss) plt.xlabel('Number of clusters') 

y_kmeans = kmeans.fit_predict(X)

plt.ylabel('WCSS') plt.show() # Training the K-Means model on the dataset kmeans = KMeans(n_clusters = 5, init = 'k-means++', random_state = 42) y_kmeans = kmeans.fit_predict(X)

# Visualising the clusters
plt.scatter( X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 60, c = 'blue', label = 'Cluster2')
plt.scatter( X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 60, c = 'red', label = 'Cluster1') plt.scatter( X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 60, c = 'green', label = 'Cluster3') 
plt.scatter( kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 100, c = 'black', label = 'Centroids')
plt.scatter( X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 60, c = 'violet', label = 'Cluster4') plt.scatter( X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 60, c = 'yellow', label = 'Cluster5') plt.xlabel('Annual Income (k$)') plt.ylabel('Spending Score (1-100)') plt.legend() 

plt.show()

Elbow Method Drawbacks

The elbow method, while a useful tool for determining the optimal number of clusters in K-means clustering, has some drawbacks:

Subjectivity: The choice of the “elbow point” can be subjective and might vary between individuals analyzing the same data.
Non-Gaussian Data: It assumes that clusters are spherical and equally sized, which may not hold for complex datasets with irregularly shaped or differently sized clusters.
Sensitivity to Initialization: K-means itself is sensitive to initial cluster centroids, which can affect the WCSS values and, consequently, the choice of the optimal K.
Inefficient for Large Datasets: For large datasets, calculating WCSS for a range of K values can be computationally expensive and time-consuming.
Unsuitable for All Distributions: The elbow method is not suitable for all data distributions, especially when clusters have varying densities or are non-convex.
Limited to K-means: It specifically applies to K-means clustering and may not be suitable for other clustering algorithms with different objectives.

Despite these drawbacks, the elbow method remains a valuable starting point for selecting the number of clusters, and it often provides useful insights into the data’s underlying structure. However, it’s essential to complement it with other validation techniques when working with more complex datasets or different clustering algorithms.

Conclusion

In this article, we covered the basic concepts of the K-Means Clustering algorithm in Machine Learning. We used the Elbow method to find the optimal K value for clustering the data in our sample data set. We then used the matplotlib Python library to visualize the clusters as a scatterplot graph. In the upcoming articles, we can learn more about different ML Algorithms.

Key Takeaways

K-Means is a popular unsupervised machine-learning algorithm widely used by Data Scientists on unlabeled data.
The k-Means Elbow method is used to find the optimal value of the K in the K-Means algorithm.

Frequently Asked Questions

Q1. How does elbow method work?

A. The elbow method is a technique used in clustering analysis to determine the optimal number of clusters. It involves plotting the within-cluster sum of squares (WCSS) for different cluster numbers and identifying the “elbow” point where WCSS starts to level off.

Q2. What is the elbow method of clustering?

A. The elbow method is a clustering validation technique. It helps find the optimal number of clusters by plotting WCSS for a range of cluster counts and selecting the point where the plot forms an “elbow” or a significant change in the rate of decrease.

Q3. What is the elbow method in math?

A. The elbow method finds the best number of clusters by plotting the variance explained against the number of clusters. The elbow point, where the slope changes, indicates the optimal number.

Q4. Are you still using the elbow method?

A. It’s a valuable technique in machine learning for selecting the appropriate number of clusters in various applications.

Basil Saji

Python Developer, ML Enthusiast, Blogger and an Electronics and Communication Engineering aspirant determined and motivated to finish tasks with atmost sincerity and dedication.I'am a good learner who ready to accept challenges to bring up my best even in the worst. Wish for a world with enough advancements and opportunities for the people.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Basics of Machine Learning

Machine Learning Lifecycle

Importance of Stats and EDA

Understanding Data

Probability

Exploring Continuous Variable

Exploring Categorical Variables

Missing Values and Outliers

Central Limit theorem

Bivariate Analysis Introduction

Continuous - Continuous Variables

Continuous Categorical

Categorical Categorical

Multivariate Analysis

Different tasks in Machine Learning

Build Your First Predictive Model

Evaluation Metrics

Preprocessing Data

Linear Models

KNN

Selecting the Right Model

Feature Selection Techniques

Decision Tree

Feature Engineering

Naive Bayes

Multiclass and Multilabel

Basics of Ensemble Techniques

Advance Ensemble Techniques

Hyperparameter Tuning

Support Vector Machine

Advance Dimensionality Reduction

Unsupervised Machine Learning Methods

Recommendation Engines

Improving ML models

Working with Large Datasets

Interpretability of Machine Learning Models

Automated Machine Learning

Model Deployment

Deploying ML Models

Embedded Devices

Elbow Method for Optimal Cluster Number in K-Means

Learning Objectives

Table of contents

What is the Elbow Method ?

What is the Elbow Method in K-Means Clustering?

K Means Clustering Using the Elbow Method

Elbow method k means

Implementation of the Elbow Method

Sample Dataset

Importing Libraries

Training the Model

Full Code

Elbow Method Drawbacks

Conclusion

Frequently Asked Questions

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID