Know the Best Evaluation Metrics for Your Regression Model!

Raghav Agrawal Last Updated : 04 Apr, 2025


Evaluation metrics are essential for assessing the performance of regression models. They measure how well a model predicts continuous outcomes. Common regression metrics include Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R-squared (Coefficient of Determination), and Mean Absolute Percentage Error (MAPE). Using these metrics, data scientists and machine learning engineers can evaluate how accurate and effective their regression models are at making predictions.

Regression is a type of supervised machine learning. In this tutorial, we will discuss various metrics for evaluating regression models and how to implement them using the scikit-learn library.

Learning Objectives:

Understand the importance of evaluation metrics in assessing regression model performance
Learn about various regression evaluation metrics like MAE, MSE, RMSE, R-squared, etc.
Gain knowledge on implementing these metrics using Python’s scikit-learn library
See how regression metrics quantify the accuracy of predictive models by measuring the difference between predicted and actual values

This article was published as a part of the Data Science Blogathon

Regression
Why We Require Evaluation Metrics?
Types of Regression Metrics
Example of Using Regression Metrics on Different Dataset
Conclusion
Frequently Asked Questions

Regression

Regression is a type of machine learning that helps in finding the relationship between independent and dependent variables.

In simple words, regression can be defined as a machine learning problem where we have to predict continuous values like price, rating, fees, etc.

Why We Require Evaluation Metrics?

Many beginners, and even some practitioners, do not pay enough attention to model performance. The goal is to build a well-generalized model. A machine learning model that scores 100 per cent on its training data has most likely memorized that data rather than learned the underlying pattern, which brings in the concepts of overfitting and underfitting.

It is necessary to check the accuracy on training data, but it is just as important to get a genuine, approximately correct result on unseen data; otherwise the model is of no use.

So, to build and deploy a generalized model, we need to evaluate it on different regression metrics. These metrics help us optimize the performance, fine-tune the model, and obtain a better result.
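To make the point about training data versus unseen data concrete, here is a minimal sketch that scores the same model on both splits. The synthetic data, the variable names, and the noise level are assumptions chosen purely for illustration; they are not part of the placement dataset used later.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): a linear trend plus noise
rng = np.random.default_rng(0)
X_demo = rng.uniform(5.0, 9.0, size=(100, 1))
y_demo = 0.6 * X_demo[:, 0] - 1.0 + rng.normal(0.0, 0.3, size=100)

# Keep 20% of the rows aside as unseen data
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=2
)

model = LinearRegression().fit(X_tr, y_tr)

# Score the same model on the data it was fitted to and on the held-out data
train_mae = mean_absolute_error(y_tr, model.predict(X_tr))
test_mae = mean_absolute_error(y_te, model.predict(X_te))
print("Train MAE:", train_mae)
print("Test MAE:", test_mae)
```

If the test error is much larger than the training error, that is usually a sign of overfitting; if both are high, the model is probably underfitting.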

If one metric were perfect, there would be no need for multiple metrics. Each regression metric has its own benefits and disadvantages, and different metrics suit different datasets, so it pays to understand several of them.

Now, I hope you see the importance of evaluation metrics. Let’s start understanding the various evaluation metrics used for regression tasks.


Dataset

To demonstrate each evaluation metric using the scikit-learn library, we will use the placement dataset, a simple linear dataset with two columns, cgpa and package.

We will first fit a linear regression model on this dataset, and then study each evaluation metric and check it on that model.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Placement data: cgpa is the input feature, package is the target
cgpa = [6.89, 5.12, 7.82, 7.42, 6.94, 7.89, 6.73, 6.75, 6.09]
package = [3.26, 1.98, 3.25, 3.67, 3.57, 2.99, 2.6, 2.48, 2.31]
df = pd.DataFrame({'cgpa': cgpa, 'package': package})
y = df['package']
X = df.drop('package', axis=1)

# Hold out 20% of the rows as unseen test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

# Fit on the training split, then predict on the test split
lr = LinearRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)
print(y_pred)

Let’s start exploring the various evaluation metrics for regression.


Types of Regression Metrics

Mean Absolute Error(MAE)
Mean Squared Error(MSE)
Root Mean Squared Error(RMSE)
Root Mean Squared Log Error(RMSLE)
R Squared (R2)
Adjusted R Squared

Mean Absolute Error(MAE)

MAE is a very simple metric: it is the average of the absolute differences between the actual and predicted values.

To understand it better, take an example: you have input data and output data, and you use linear regression, which draws a best-fit line.

Now you have to find the MAE of your model. The mistake the model makes on an observation is known as its error. The difference between the actual value and the predicted value, taken without its sign, is the absolute error of that observation, but we need the mean absolute error over the complete dataset.

So, sum all the absolute errors and divide by the total number of observations; this is the MAE. We aim for the lowest MAE possible because it is a loss.
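The steps above can be sketched by hand with NumPy and compared against scikit-learn. The numbers and the names y_true and y_hat are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted values, chosen only for illustration
y_true = np.array([3.0, 2.5, 4.0, 3.5])
y_hat = np.array([2.8, 2.9, 3.7, 3.5])

# Absolute error of each observation: 0.2, 0.4, 0.3, 0.0
abs_errors = np.abs(y_true - y_hat)

# Sum of absolute errors divided by the number of observations
mae_manual = abs_errors.sum() / len(y_true)

# scikit-learn's implementation should agree with the manual calculation
mae_sklearn = mean_absolute_error(y_true, y_hat)
print("MAE (manual):", mae_manual)
print("MAE (sklearn):", mae_sklearn)
```

Here the absolute errors add up to 0.9 over four observations, so the MAE is about 0.225, in the same unit as the target.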

Advantages of MAE

The MAE you get is in the same unit as the output variable.
It is more robust to outliers than MSE and RMSE.

Disadvantages of MAE

The MAE function is not differentiable at zero, so gradient-based optimizers such as gradient descent cannot be applied to it directly and need a workaround such as subgradients.

from sklearn.metrics import mean_absolute_error
print("MAE",mean_absolute_error(y_test,y_pred))

To overcome this disadvantage of MAE, the next metric is MSE.

Mean Squared Error(MSE)

MSE is one of the most widely used metrics and is only a small change from mean absolute error. Mean squared error is the average of the squared differences between the actual and predicted values.

So, above we were finding the absolute difference, and here we are finding the squared difference.

What does MSE actually represent? It represents the average squared distance between actual and predicted values. We square the differences so that positive and negative errors do not cancel each other out.

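$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.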

Advantages of MSE

The graph of MSE is differentiable, so you can easily use it as a loss function.

Disadvantages of MSE

The value you get after calculating MSE is in squared units of the output. For example, if the output variable is in metres (m), the MSE is in metres squared.
If you have outliers in the dataset, MSE penalizes them the most, and the calculated MSE becomes larger. In short, it is not robust to outliers, which was an advantage of MAE.
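The difference in outlier sensitivity is easy to see with a small experiment. The following sketch uses made-up numbers (all values are assumptions for illustration): every prediction is off by 1, and then a single prediction is pushed to be off by 10.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical values chosen only for illustration
actual = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
pred_clean = np.array([11.0, 11.0, 12.0, 12.0, 13.0])  # every error is exactly 1

# Same predictions, but the last one is now off by 10 (an outlier)
pred_outlier = pred_clean.copy()
pred_outlier[-1] = 22.0

mae_clean = mean_absolute_error(actual, pred_clean)
mse_clean = mean_squared_error(actual, pred_clean)
mae_outlier = mean_absolute_error(actual, pred_outlier)
mse_outlier = mean_squared_error(actual, pred_outlier)

print("MAE without / with outlier:", mae_clean, mae_outlier)
print("MSE without / with outlier:", mse_clean, mse_outlier)
```

With these numbers a single bad prediction raises MAE from 1.0 to about 2.8, while MSE jumps from 1.0 to about 20.8, because the error of 10 contributes 100 once it is squared.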

from sklearn.metrics import mean_squared_error
print("MSE",mean_squared_error(y_test,y_pred))

Root Mean Squared Error(RMSE)

As the name makes clear, RMSE is simply the square root of the mean squared error.

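$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.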

Advantages of RMSE

The output value you get is in the same unit as the required output variable which makes interpretation of loss easy.

Disadvantages of RMSE

It is not as robust to outliers as MAE.

To compute RMSE, we apply the NumPy square root function to the MSE.

import numpy as np
print("RMSE",np.sqrt(mean_squared_error(y_test,y_pred)))

RMSE is a very common choice of evaluation metric, and it is often preferred when working with deep learning techniques.

Root Mean Squared Log Error(RMSLE)

RMSLE applies RMSE to the logarithms of the actual and predicted values instead of to the raw values, which compresses the scale of the error. The metric is very helpful when you are developing a model without scaling the target and the output varies over a large range.

In that situation, RMSE is dominated by the largest values. To control this, we take the log of the actual and predicted values (adding 1 first so that zeros are handled) and compute RMSE on the logged values; the result is RMSLE.
To compute RMSLE, we apply the NumPy square root function to scikit-learn's mean_squared_log_error.

import numpy as np
from sklearn.metrics import mean_squared_log_error
print("RMSLE",np.sqrt(mean_squared_log_error(y_test,y_pred)))

It is a simple metric that is used by many Machine Learning competitions.
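RMSLE is commonly defined as RMSE computed on log(1 + value) of the actual and predicted values, which is also what scikit-learn's mean_squared_log_error is based on. The sketch below writes that definition out by hand; the helper name rmsle and the sample numbers are assumptions for illustration. It shows that the same relative miss (50 per cent too high) gives nearly the same RMSLE at very different scales, whereas the raw error grows a hundredfold.

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error


def rmsle(y_true, y_pred):
    """RMSE computed on log(1 + value); inputs must be non-negative."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    log_diff = np.log1p(y_pred) - np.log1p(y_true)
    return np.sqrt(np.mean(log_diff ** 2))


# Hypothetical targets at two very different scales, both over-predicted by 50%
small_true, small_pred = [100.0], [150.0]      # raw error: 50
large_true, large_pred = [10000.0], [15000.0]  # raw error: 5000

rmsle_small = rmsle(small_true, small_pred)
rmsle_large = rmsle(large_true, large_pred)
print("RMSLE (small scale):", rmsle_small)
print("RMSLE (large scale):", rmsle_large)

# Cross-check the hand-written helper against scikit-learn
rmsle_sklearn = np.sqrt(mean_squared_log_error(large_true, large_pred))
print("RMSLE (sklearn, large scale):", rmsle_sklearn)
```

Because the metric works on logarithms, it measures relative rather than absolute error, which is why it suits targets that span several orders of magnitude.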

R Squared (R2)

The R2 score is a metric that tells you how well your model performed, rather than the loss in an absolute sense.

In contrast, MAE and MSE depend on the context, as we have seen, whereas the R2 score is independent of context.

So, with the help of R squared we have a baseline model to compare against, which none of the other metrics provides. It is similar to classification problems, where we have a threshold that is commonly fixed at 0.5. Basically, R squared calculates how much better the regression line is than a mean line.

Hence, R squared is also known as the Coefficient of Determination, or sometimes as Goodness of fit.

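$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}$$

where $SS_{res}$ is the squared error of the regression line and $SS_{tot}$ is the squared error of the mean line.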

Now, how will you interpret the R2 score? Suppose the R2 score is zero. Then the squared error of the regression line divided by the squared error of the mean line equals 1, so 1 - 1 is zero. In this case both lines overlap, which means the model is no better than always predicting the mean; it is not able to take advantage of the input columns. The score can even be negative if the model does worse than the mean line.

The second case is when the R2 score is 1. This means the division term is zero, which happens when the regression line makes no mistakes at all; it is perfect. In the real world, this almost never happens.

So we can conclude that as our regression line moves towards perfection, the R2 score moves towards one, and the model performance improves.

The normal case is when the R2 score is between zero and one, like 0.8, which means your model is able to explain 80 per cent of the variance of the data.
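The comparison between the regression line and the mean line can be written out directly. This is a minimal sketch with hypothetical values (y_obs and y_fit are names and numbers made up for illustration), checked against scikit-learn's r2_score.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual values and model predictions, for illustration only
y_obs = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
y_fit = np.array([2.2, 2.9, 4.1, 4.8, 6.0])

# Squared error of the regression line
ss_res = np.sum((y_obs - y_fit) ** 2)
# Squared error of the mean line (always predicting the average)
ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot
print("R2 (manual):", r2_manual)
print("R2 (sklearn):", r2_score(y_obs, y_fit))

# A "model" that always predicts the mean scores exactly zero
mean_line = np.full_like(y_obs, y_obs.mean())
r2_mean_line = r2_score(y_obs, mean_line)
print("R2 of the mean line:", r2_mean_line)
```

Here the regression line has a squared error of about 0.1 against 10 for the mean line, so the score is about 0.99.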

from sklearn.metrics import r2_score
r2 = r2_score(y_test,y_pred)
print(r2)

Adjusted R Squared

The disadvantage of the R2 score is that when new features are added to the data, it either increases or stays constant but never decreases. For a linear regression evaluated on the data it was fitted to, an extra feature can never make the fit worse.

The problem is that when we add an irrelevant feature to the dataset, R2 can still increase slightly, which is misleading.

Hence, to control this situation, Adjusted R Squared came into existence. It is computed as 1 - ((1 - R2) * (n - 1) / (n - k - 1)), where n is the number of observations and k is the number of independent features.

Now, as k increases by adding features, the denominator n - k - 1 decreases while n - 1 remains constant. If the added feature is irrelevant, the R2 score remains constant or increases only slightly, so the complete fraction increases, and when we subtract it from one the resulting score decreases.

If we add a relevant feature, the R2 score increases, so 1 - R2 decreases heavily. The denominator also decreases, but the complete term still decreases, and on subtracting it from one the score increases.
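The two cases can be checked with plain arithmetic. The sketch below wraps the adjusted R squared formula, 1 - (1 - R2)(n - 1)/(n - k - 1), in a small helper; the function name adjusted_r2 and all the R2 values are hypothetical and chosen only to illustrate the behaviour.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k independent features."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n_obs = 40  # hypothetical number of observations

# Baseline: two features, R2 of 0.80
base = adjusted_r2(0.80, n_obs, 2)

# Irrelevant third feature: R2 barely moves (0.80 -> 0.801)
irrelevant = adjusted_r2(0.801, n_obs, 3)

# Relevant third feature: R2 improves clearly (0.80 -> 0.90)
relevant = adjusted_r2(0.90, n_obs, 3)

print("Baseline adjusted R2:", base)
print("After an irrelevant feature:", irrelevant)
print("After a relevant feature:", relevant)
```

With these numbers the adjusted score drops from roughly 0.789 to roughly 0.784 for the irrelevant feature, even though plain R2 went up, and rises to roughly 0.892 for the relevant one.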

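n=40  # number of observations the R2 score was computed on (illustrative value)
k=2  # number of independent features (illustrative value)
adj_r2_score = 1 - ((1-r2)*(n-1)/(n-k-1))
print(adj_r2_score)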

Hence, this metric becomes one of the most important metrics to use during the evaluation of the model.

Example of Using Regression Metrics on Different Dataset

Here are a few example scenarios that show where regression metrics are used:

1. Predictive Modeling in Real Estate

You are building a regression model to predict house prices based on features like square footage, number of bedrooms, location, and age of the property. After training the model, you need to evaluate its performance. You can write about how you used regression metrics such as:

Mean Absolute Error (MAE): To measure the average absolute difference between predicted and actual house prices.
Mean Squared Error (MSE): To penalize larger errors more heavily, which is useful if you want to avoid significant over- or under-predictions.
R-squared (R²): To determine how well the model explains the variance in house prices.

You might conclude that while the model has a low MAE, the R² value is moderate, indicating that additional features or more data might be needed to improve the model.

2. Sales Forecasting for a Retail Business

You are tasked with predicting monthly sales for a retail store using historical sales data, marketing spend, and seasonal trends. After training a regression model, you evaluate its performance using metrics like:

Root Mean Squared Error (RMSE): To understand the typical error in your sales predictions in the same units as the sales data.
Mean Absolute Percentage Error (MAPE): To express the error as a percentage of the actual sales, which is useful for communicating the model’s accuracy to stakeholders.

You might find that the model performs well during non-holiday seasons but struggles during peak shopping periods, indicating a need to incorporate more granular seasonal data.
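For a scenario like this one, a minimal sketch of computing RMSE and MAPE with scikit-learn might look as follows. The sales figures are invented for illustration, and mean_absolute_percentage_error requires a reasonably recent scikit-learn release (0.24 or later).

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error

# Hypothetical monthly sales (in thousands) and model predictions
sales_actual = np.array([200.0, 250.0, 400.0, 500.0])
sales_pred = np.array([220.0, 225.0, 380.0, 550.0])

# RMSE: typical error in the same unit as the sales figures
rmse_sales = np.sqrt(mean_squared_error(sales_actual, sales_pred))

# MAPE: scikit-learn returns a fraction, so multiply by 100 for a percentage
mape_sales = mean_absolute_percentage_error(sales_actual, sales_pred)

print("RMSE:", rmse_sales)
print("MAPE (%):", mape_sales * 100)
```

The individual percentage errors here are 10, 10, 5 and 10 per cent, so the MAPE works out to about 8.75 per cent, a figure that is easy to communicate to stakeholders.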

3. Energy Consumption Prediction

You are working on a project to predict energy consumption for a manufacturing plant based on factors like production volume, weather conditions, and time of day. You use regression metrics to assess the model:

Explained Variance Score: To measure how much of the variability in energy consumption is explained by the model.
Median Absolute Error: To evaluate the model’s robustness to outliers, such as unexpected spikes in energy usage.

You might discover that the model performs well overall but struggles to predict extreme values, suggesting the need for outlier detection or a more robust algorithm.
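As a rough illustration of these two metrics, the sketch below uses scikit-learn's explained_variance_score and median_absolute_error on invented readings in which the last value is an unexpected spike. All numbers and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    median_absolute_error,
)

# Hypothetical energy readings; the last one is an unexpected spike
energy_actual = np.array([100.0, 102.0, 98.0, 101.0, 150.0])
energy_pred = np.array([101.0, 100.0, 99.0, 100.0, 105.0])

# Absolute errors are 1, 2, 1, 1 and 45
medae = median_absolute_error(energy_actual, energy_pred)
mae_energy = mean_absolute_error(energy_actual, energy_pred)
evs = explained_variance_score(energy_actual, energy_pred)

print("Median absolute error:", medae)
print("Mean absolute error:", mae_energy)
print("Explained variance score:", evs)
```

The median absolute error stays at 1.0 because it ignores the spike, while the mean absolute error is pulled up to 10.0; a large gap between the two is a hint that a few extreme values dominate the error.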

4. Student Performance Prediction

You are developing a model to predict students’ final exam scores based on factors like attendance, homework grades, and midterm scores. You use regression metrics to evaluate the model:

R-squared (R²): To determine how well the model explains the variance in exam scores.
Mean Squared Error (MSE): To assess the average squared difference between predicted and actual scores.

You might find that the model performs well for students with average scores but struggles to predict high or low performers, indicating a potential need for stratified sampling or additional features.

5. Stock Price Prediction

You are building a regression model to predict the future price of a stock based on historical prices, trading volume, and macroeconomic indicators. You evaluate the model using:

Mean Absolute Error (MAE): To measure the average error in price predictions.
Root Mean Squared Error (RMSE): To emphasize larger errors, which are critical in financial applications.
R-squared (R²): To assess how well the model captures the variability in stock prices.

You might conclude that while the model performs reasonably well, the high volatility of stock prices makes it challenging to achieve high accuracy, and you might explore additional features like news sentiment analysis.

Conclusion

Evaluating regression models using appropriate metrics is crucial for assessing their performance and making informed decisions. By understanding and utilizing metrics like MAE, MSE, RMSE, R-squared, and others, data scientists can quantify the accuracy, goodness of fit, and overall effectiveness of their models. Ultimately, these regression evaluation metrics serve as valuable tools for model selection, optimization, and deployment in real-world regression problems.

Key Takeaways:

Evaluation metrics quantify how well a regression model performs on unseen data
Different metrics capture different aspects of model performance (error, variance explained, etc.)
Interpreting multiple metrics provides a comprehensive understanding of a model’s strengths and limitations
Regression metrics are essential tools for evaluating the performance of predictive models, helping to quantify accuracy and guide improvements.

Frequently Asked Questions

Q1. What is the evaluation metric for regression?

A. The evaluation metric for regression includes Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²).

Q2. What are the performance metrics for regression models?

A. Performance metrics for regression models are MSE, RMSE, MAE, R², and Mean Absolute Percentage Error (MAPE).

Q3. What is the R2 metric of regression?

A. The R² metric, or coefficient of determination, measures the proportion of variance in the dependent variable predictable from the independent variables.

Q4. How to measure performance of regression?

A. Measuring Regression Performance
Error Metrics: MSE, RMSE, MAE (penalize errors differently)
Goodness-of-Fit: R², Adjusted R² (explain variance)
Other: MAPE (percentage error)
Choose metric based on: outliers, interpretability, business impact, model comparison.

Raghav Agrawal

I am a software Engineer with a keen passion towards data science. I love to learn and explore different data-related techniques and technologies. Writing articles provide me with the skill of research and the ability to make others understand what I learned. I aspire to grow as a prominent data architect through my profession and technical content writing as a passion.

Geoffrey

Thank you for the article, interesting especially if the importance of metrics is overshadowed. Just 2 or 3 things : 1/ You said " Regression can be defined as a Machine learning problem where we have to predict discrete values like price, Rating, Fees, etc." It is not discrete, it is continuous values. 2/ I did not fully understand the very last part about adjusted R squared. "It assumes that while adding more data variance of data increases" is that always the case and if so, why ? I would have said that if you add many datapoints with the same "y-value / target-value" the variance will on the contrary decrease ? 3/ I thought that having (linearly ?) dependant features was bad in any case, but at the beginning of the article you seem to say that linear regression is OK with dependant and independant features. Is that the case ? Linked to my question 2, having a redundant feature isn't it almost the same as adding an irrelevant feature or at least it can artificially increase the R2 score while the information used is redundant and we did not really increased the performance of our model Thank you very much ! Geoffrey

Anjana

Hey hi. "In simple words, Regression can be defined as a Machine learning problem where we have to predict discrete values like price, Rating, Fees, etc." Shouldnt it be "continous values"?
