Data Science 101: Introduction to Cost Function

Himanshi Singh Last Updated : 17 Oct, 2024

6 min read

Introduction

While dealing with Linear Regression we can have multiple lines for different values of slopes and intercepts. But the main question that arises is which of those lines actually represents the right relationship between the X and Y and in order to find that we can use the Mean Squared Error or MSE as the parameter. For linear regression, this MSE is nothing but the Cost Function.

Mean Squared Error is the sum of the squared differences between the prediction and true value. And the output is a single number representing the cost. So the line with the minimum cost function or MSE represents the relationship between X and Y in the best possible manner. And once we have the slope and intercept of the line which gives the least error, we can use that line to predict Y.

So this article is all about calculating the errors/cost for various lines and then finding the cost function, which can be used for prediction.

New Feature

Get Personalized Learning Path! Set your goal and timeline. Get a path—under 2 mins.

Note: If you are more interested in learning concepts in an Audio-Visual format, We have this entire article explained in the video below. If not, you may continue reading.

As we know that any line can be represented by two parameters- slope(β) and intercept(b).

And once we have a line we can always calculate the errors(also known as cost or loss) which this line would have from the underlying data point and the idea is to find the line which gives us the least error. This basically becomes an optimization problem. Let’s go through an exercise where you’ll see what is the error for various values of β and B and then the question is how do we find the most optimum values of these two parameters.

Let’s begin!

Importing Libraries

I’ll start by importing the necessary libraries which are matplotlib, pandas and scikit learn to calculate the error:

#import the libraries

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import mean_squared_error as mse

Creating sample Data

And I have created a data set for Experience and Salary. For that, I’ve created a list and then just simply converted it to a Pandas Dataframe using pd.DataFrame():

Python code:

import pandas as pd
# creating the sample dataset 
experience = [1.2,1.5,1.9,2.2,2.4,2.5,2.8,3.1,3.3,3.7,4.2,4.4] 
salary = [1.7,2.4,2.3,3.1,3.7,4.2,4.4,6.1,5.4,5.7,6.4,6.2] 
data = pd.DataFrame({ "salary" : salary, "experience" : experience }) 
print(data.head())

You can see the first five rows of our dataset.

Plotting the data

Let us now explore the dataset by exploring the relationship between salary and experience. Let’s plot this using Matplotlib:

# plotting the data

plt.scatter(data.experience, data.salary, color = 'red', label = 'data points')
plt.xlim(1,4.5)
plt.ylim(1,7)
plt.xlabel('Experience')
plt.ylabel('Salary')
plt.legend()

You can see a linear relationship between experience and salary. And this is what we expected. So let’s now start by plotting the lines by using various values of β and b.

Starting the Line using small values of parameters(beta and b)

To start with let me take β = 0.1 and b = 1.1 and what I’m going to do is create a line with these two parameters and plot it over the scatter plot. Now, I take values and then apply this relationship to create each of these lines and then I plot this over the above-created scatter plot.

# making lines for different Values of Beta 0.1, 0.8, 1.5
beta = 0.1

# keeping intercept constant
b = 1.1

# to store predicted points
line1 = []

# generating predictions for every data point
for i in range(len(data)):
    line1.append(data.experience[i]*beta + b)

# Plotting the line
plt.scatter(data.experience, data.salary, color = 'red')
plt.plot(data.experience, line1, color = 'black', label = 'line')
plt.xlim(1,4.5)
plt.ylim(1,7)
plt.xlabel('Experience')
plt.ylabel('Salary')
plt.legend()

MSE = mse(data.salary, line1)

plt.title("Beta value "+str(beta)+" with MSE "+ str(MSE))

MSE = mse(data.salary, line1)

Starting the Line using small values of parameters(beta and b)

We have this line for beta = 0.1 and b = 1.1 and the MSE for this line is 2.69. Now as you can see the slope of the line is very less so I would want to actually try out a higher slope so instead of beta = 0.1 let me change this to beta = 1.5 :

This is the line for beta = 1.5 and b = 1.1 and the MSE for this line is 6.40. You can see a better slope but it is probably more than what we actually want. So the right value might be somewhere in between 0.1 and1.5, so let’s try beta = 0.8:

This is the line with beta = 0.8 and b = 1.1 and we can see that the mean squared error has come down to 0.336

We have tried 3 different lines with each of these having different MSE; the first one had 2.69, the second had 6.4 and the third one had 0.33. Now we can go on and try this for various values of beta. So now we can try this with various values of Beta and see what is the relationship between beta and mean squared error(MSE), for a fixed value intercept i.e b. So we’re not changing b.

Computing Cost Function over a range of values of Beta

So let’s create a function which I am calling as Error and what this function does is for a given value beta it is basically giving me what is the MSE for these data points. Here b is fixed and I am trying different values of Beta.

# function to calculate error

def Error(Beta, data):

# b is constant
    b = 1.1

    salary = []
    experience = data.experience

# Loop to calculate predict salary variables
    for i inrange(len(data.experience)):
        tmp = data.experience[i] * Beta + b
        salary.append(tmp)

MSE = mse(data.salary, salary)
return MSE

Now I am trying all values of Beta between 0 to 1.5 with an increment of 0.01 and I’ll then append everything to a list:

# Range of slopes from 0 to 1.5 with increment of 0.01

slope = [i/100 for i in range(0,150)]
Cost = []
for i in slope:
    cost = Error( Beta = i, data = data)
    Cost.append(cost)

Visualizing cost with respect to Beta

When this is done, just convert this into this is the Data Frame:

# Arranging in DataFrame

Cost_table = pd.DataFrame({
'Beta' : slope,
'Cost' : Cost
})

Cost_table.head()

What this data frame is showing that for a value of Beta which is 0.00 the cost or MSE we’re getting is 3.72, similarly for beta = 0.04, we are getting cost = 3.29. Let’s quickly visualize this:

# plotting the cost values corresponding to every value of Beta

plt.plot(Cost_table.Beta, Cost_table.Cost, color = 'blue', label = 'Cost Function Curve')
plt.xlabel('Value of Beta')
plt.ylabel('Cost')
plt.legend()

This is the plot which we get. So as you can see the value of cost at 0 was around 3.72, so that is the starting value. After that the error goes down with the increasing value of Beta, reaches a minimum, and then it starts increasing.

Now the question is given that we know this relationship, what are the values of beta and b for which we can find out this particular location where my cost is minimum.

Now what happens when you have two parameters, I mean right now we assumed b to be 0.1 but actually what we want to do is change it with respect to b as well as beta and in this particular situation this is the curve which we get:

So again, this is just a 3D plot, so you have two dimensions but it will again have a minimum value and the idea would be to find the minimum value. But you cannot do this iteratively in the way we did so far. So what we need to find is a smarter way in which we can find this minimum value and that is where the Gradient Descent technique comes into play.

End Notes

In this article, you learned how to calculate the error for various lines and how to find the optimum line.

If you are looking to kick start your Data Science Journey and want every topic under one roof, your search stops here. Check out Analytics Vidhya’s Certified AI & ML BlackBelt Plus Program

This article concluded with the introduction of a new term Gradient Descent, which will be covered in upcoming articles. So, stay tuned!

Himanshi Singh

I’m a data lover who enjoys finding hidden patterns and turning them into useful insights. As the Manager - Content and Growth at Analytics Vidhya, I help data enthusiasts learn, share, and grow together.

Thanks for stopping by my profile - hope you found something you liked :)

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

Augusto Acioli Pinho Vanderley

Really lovely article. Simple, well explained and to the point. Looking forward for more.

Really lovely article. Simple, well explained and to the point. I'm looking forward for more of it.

Show 1 reply

Thanks for the appreciation, Augusto:)

MikeFin

Plotted values do not match the actual results! Is this a typo? I re-ran the exact python commands, and confirmed the same in Excel/Matlab.

Reading list

Introduction to Deep Learning

Feed Forward Networks

Gradient Descent

Loss Function

Activation Functions

Introduction to Neural networks

Forward and Backward Propagation

Optimizers

Learning Rate Schedulers

NN on Structured Data

Improving the Deep Learning Model

Deep Learning Model Optimization

Unsupervised Deep Learning

AutoDL

Model Deployment

Introduction to PyTorch

Data Science 101: Introduction to Cost Function

Introduction

Get Personalized Learning Path! Set your goal and timeline. Get a path—under 2 mins.

Importing Libraries

Creating sample Data

Plotting the data

Starting the Line using small values of parameters(beta and b)

Computing Cost Function over a range of values of Beta

Visualizing cost with respect to Beta

End Notes

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC

G_ENABLED_IDPS

test_cookie

Webengage (2)

_we_us

WebKlipperAuth

LinkedIn (16)

ln_or

JSESSIONID

li_rm

AnalyticsSyncHistory

lms_analytics

liap

visit

li_at

s_plt

lang

s_tp

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

s_pltp

s_tslv

li_theme

li_theme_set