Statistics 101: Beginners Guide to Continuous Probability Distributions

Priyanka Last Updated : 09 Feb, 2021

This article was published as a part of the Data Science Blogathon.

Introduction

In the previous post, we defined probability distributions and briefly discussed several discrete probability distributions. In this post, we continue with continuous probability distributions.

Definition

If you recall from our previous discussion, a continuous random variable can take infinitely many values over a given interval. For example, the interval [2, 3] contains infinitely many values between 2 and 3. Continuous distributions are described by a Probability Density Function (PDF) instead of a Probability Mass Function. The probability that a continuous random variable equals any exact value is always zero, so probabilities are defined over intervals. For instance, P(X = 3) = 0, but P(2.99 < X < 3.01) can be calculated by integrating the PDF over the interval [2.99, 3.01].
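To make the interval idea concrete, here is a minimal Python sketch. It assumes, purely for illustration, that X follows a normal distribution with mean 3 and standard deviation 1 (the text does not specify a distribution), and it approximates the integral of the PDF over [2.99, 3.01] with a midpoint sum. The helper name `integrate_pdf` is hypothetical.

```python
from statistics import NormalDist

# Assumption for illustration only: X ~ Normal(mean=3, sd=1).
X = NormalDist(mu=3.0, sigma=1.0)

def integrate_pdf(pdf, lower, upper, steps=1000):
    """Approximate the integral of pdf over [lower, upper] with a midpoint sum."""
    width = (upper - lower) / steps
    return sum(pdf(lower + (i + 0.5) * width) for i in range(steps)) * width

# Probability of an interval: area under the PDF between 2.99 and 3.01.
p_interval = integrate_pdf(X.pdf, 2.99, 3.01)

# The same probability from the cumulative distribution function.
p_from_cdf = X.cdf(3.01) - X.cdf(2.99)

# Probability of a single point: an interval of zero width has zero area.
p_point = integrate_pdf(X.pdf, 3.0, 3.0)

print(p_interval, p_from_cdf, p_point)
```

The two interval probabilities should agree closely, while the single-point probability is exactly zero because the interval has no width.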

List of Continuous Probability Distributions

We discuss the most commonly used continuous probability distributions below:

1. Continuous Uniform Distribution

The uniform distribution has both continuous and discrete forms. Here, we discuss the continuous one. It describes a random variable that is equally likely to fall anywhere in an interval [a, b]. The most common example of the discrete form is rolling a fair die, where all 6 outcomes are equally likely. In the continuous form, the same idea applies to every value in the interval, so the probability density is constant between a and b.

For example, with a = 10 and b = 20, the distribution looks like this:

[Figure: continuous uniform distribution with a = 10 and b = 20]

The PDF is given by,

$$f(x) = \frac{1}{b - a}, \quad a \le x \le b$$

and f(x) = 0 outside this interval, where a is the minimum value and b is the maximum value.
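As a small sketch of the uniform case with a = 10 and b = 20, the constant density 1/(b - a) means the probability of any sub-interval is just its length divided by b - a. The function names `uniform_pdf` and `uniform_prob` are illustrative, not from any library.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lower, upper, a, b):
    """P(lower < X < upper): constant density times the overlap with [a, b]."""
    overlap = max(0.0, min(upper, b) - max(lower, a))
    return overlap / (b - a)

a, b = 10, 20
density_inside = uniform_pdf(15, a, b)    # inside [10, 20]
density_outside = uniform_pdf(25, a, b)   # outside [10, 20]
p_12_to_15 = uniform_prob(12, 15, a, b)   # length 3 out of 10
print(density_inside, density_outside, p_12_to_15)
```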

2. Normal Distribution

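This is the most commonly discussed distribution and the one most often found in the real world. By the central limit theorem, the distribution of the mean of many independent observations approaches a normal distribution as the sample size grows, whatever the shape of the underlying distribution (provided it has finite variance). The normal distribution has two parameters: the mean and the standard deviation.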
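The tendency towards normality with larger samples can be checked with a rough simulation. The sketch below assumes draws from a Uniform(10, 20) distribution and a sample size of 30, both chosen only for illustration, and looks at how the sample means are distributed.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n, a=10.0, b=20.0):
    """Mean of n independent draws from Uniform(a, b)."""
    return mean(random.uniform(a, b) for _ in range(n))

n = 30
means = [sample_mean(n) for _ in range(5000)]

centre = mean(means)   # should be close to (a + b) / 2 = 15
spread = stdev(means)  # should be close to (b - a) / sqrt(12 * n)

# Share of sample means within one standard deviation of the centre;
# for a normal distribution this is about 68%.
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)
print(centre, spread, within_one_sd)
```

Although each individual draw is uniform, the share of sample means within one standard deviation should come out near the 68% expected for a normal distribution.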

This distribution has many interesting properties. The density is highest at the mean, and the remaining values are distributed symmetrically on either side of it. The standard normal distribution is a special case where the mean is 0 and the standard deviation is 1.

[Figure: normal distribution]

It also follows the empirical rule: about 68% of the values lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations. This property is very useful when designing hypothesis tests (https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/).
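The 68 / 95 / 99.7 figures can be reproduced with Python's standard library. This is a minimal sketch using `statistics.NormalDist`; the helper `share_within` is a hypothetical name introduced here.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} sd: {share:.4f}")
```

Because any normal distribution can be rescaled to the standard normal, the same shares hold for every mean and standard deviation.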

The PDF is given by,

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

where μ is the mean of the random variable X and σ is the standard deviation.

3. Log-normal Distribution

This distribution describes a random variable whose logarithm follows a normal distribution. Consider two random variables X and Y with Y = ln(X), where ln denotes the natural logarithm. If Y is normally distributed, then X follows a log-normal distribution. Because the logarithm is only defined for positive numbers, X takes only positive values.

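The PDF is given by,

$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0$$

where μ is the mean of Y and σ is the standard deviation of Y.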
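The relationship between X and Y = ln(X) can be sketched by simulation. The parameter values below (μ = 0.5, σ = 0.25) are hypothetical, and `lognormal_pdf` is an illustrative helper that writes out the standard log-normal density.

```python
import math
import random
from statistics import mean, stdev

random.seed(1)
mu, sigma = 0.5, 0.25  # hypothetical parameters of Y = ln(X)

# Draw X by exponentiating normal draws of Y.
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(10000)]

# Taking logs recovers a sample whose mean and sd are close to mu and sigma.
logs = [math.log(x) for x in xs]
log_mean = mean(logs)
log_sd = stdev(logs)
all_positive = all(x > 0 for x in xs)

def lognormal_pdf(x, mu, sigma):
    """Log-normal density for x > 0; zero elsewhere."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

print(log_mean, log_sd, all_positive, lognormal_pdf(1.5, mu, sigma))
```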

4. Student’s T Distribution

Student’s t distribution is similar to the normal distribution. The difference is that its tails are thicker. It is used when the sample size is small and the population variance is not known. The distribution is defined by its degrees of freedom (p), calculated as the sample size minus 1 (n - 1).

As the sample size increases, the degrees of freedom increase and the t-distribution approaches the normal distribution: the tails become thinner and more of the probability concentrates around the mean. This distribution is commonly used to test estimates of the population mean when the sample size is less than 30 and the population variance is unknown. The sample standard deviation is used to calculate the t-value.

[Figure: t-distribution]

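The PDF is given by,

$$f(t) = \frac{\Gamma\left(\frac{p + 1}{2}\right)}{\sqrt{p\pi}\,\Gamma\left(\frac{p}{2}\right)} \left(1 + \frac{t^2}{p}\right)^{-\frac{p + 1}{2}}$$

where p is the degrees of freedom and Γ is the gamma function, which extends the factorial to non-integer arguments.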
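To see the thicker tails and the approach to the normal distribution numerically, here is a sketch that evaluates the standard Student's t density, Γ((p+1)/2) / (√(pπ) Γ(p/2)) · (1 + t²/p)^(-(p+1)/2), with the standard library. Working with `math.lgamma` instead of the gamma function directly is a choice made here to avoid overflow for large p; `t_pdf` is a hypothetical name.

```python
import math
from statistics import NormalDist

def t_pdf(t, p):
    """Student's t density with p degrees of freedom.

    Uses log-gamma so that large p does not overflow.
    """
    log_norm = (math.lgamma((p + 1) / 2) - math.lgamma(p / 2)
                - 0.5 * math.log(p * math.pi))
    return math.exp(log_norm - (p + 1) / 2 * math.log1p(t * t / p))

normal_pdf = NormalDist().pdf

# Compare the tail density at t = 3 with the standard normal density.
for p in (2, 10, 30, 1000):
    print(p, t_pdf(3.0, p), normal_pdf(3.0))
```

For small p the density at t = 3 is noticeably larger than the normal density, and the gap shrinks as p grows.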

The t-statistic used in hypothesis testing is calculated as follows,

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where x̄ is the sample mean, μ is the population mean, s is the sample standard deviation, and n is the sample size.
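A worked example may help here. The sketch below computes the one-sample t-statistic as (x̄ - μ) / (s / √n), where s is the sample standard deviation and n the sample size. The five measurements and the hypothesised mean of 10 are made up for illustration.

```python
import math
from statistics import mean, stdev

def t_statistic(sample, population_mean):
    """One-sample t-statistic: (x_bar - mu) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - population_mean) / (s / math.sqrt(n))

# Hypothetical measurements and hypothesised population mean.
sample = [9.8, 10.2, 10.4, 9.9, 10.7]
t_value = t_statistic(sample, population_mean=10.0)
degrees_of_freedom = len(sample) - 1
print(t_value, degrees_of_freedom)
```

The resulting value would then be compared against a t-distribution with n - 1 = 4 degrees of freedom.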

5. Chi-square Distribution

This is the distribution of the sum of the squares of p independent standard normal random variables, where p is the number of degrees of freedom. Like the t-distribution, it gradually approaches the normal distribution as the degrees of freedom increase. Below is a chi-square distribution with three degrees of freedom.

[Figure: chi-square distribution with three degrees of freedom]

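The PDF is given by,

$$f(x) = \frac{1}{2^{p/2}\,\Gamma\left(\frac{p}{2}\right)} x^{\frac{p}{2} - 1} e^{-\frac{x}{2}}, \quad x > 0$$

where p is the degrees of freedom and Γ is the gamma function.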
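The sum-of-squares construction can be sketched by simulation. The code below assumes p = 3 independent standard normal variables per draw (matching the three degrees of freedom mentioned above) and compares the simulated mean and variance with the known theoretical values p and 2p.

```python
import random
from statistics import mean, variance

random.seed(2)

def chi_square_draw(p):
    """One chi-square draw: sum of squares of p independent standard normals."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(p))

p = 3
draws = [chi_square_draw(p) for _ in range(20000)]

sim_mean = mean(draws)      # theory: p
sim_var = variance(draws)   # theory: 2 * p
print(sim_mean, sim_var)
```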

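The chi-square value is calculated as follows:

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where O_i is the observed value and E_i is the expected value in category i. This statistic is used in goodness-of-fit and independence tests. The chi-square distribution is also used in hypothesis testing to draw inferences about the population variance of normal distributions.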
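As a small illustration of the observed-versus-expected calculation, the sketch below sums (O - E)² / E over the categories. The die-roll counts are invented for the example, and `chi_square_statistic` is a hypothetical helper.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 60 rolls of a die; a fair die expects 10 per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
chi2 = chi_square_statistic(observed, expected)
dof = len(observed) - 1  # categories minus 1 for a goodness-of-fit test
print(chi2, dof)
```

A small value such as this one indicates that the observed counts are close to what a fair die would produce.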

6. Exponential Distribution

Recall the Poisson distribution from the Discrete Probability post, where we took the example of calls received by a customer care center and considered the average number of calls per hour. The exponential distribution instead describes the time between successive calls.

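The exponential distribution can be seen as the counterpart of the Poisson distribution: if events occur independently at a constant average rate, the number of events in a fixed interval follows a Poisson distribution, while the waiting time between successive events follows an exponential distribution.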

The PDF is given by,

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$

where λ is the rate parameter, λ = 1/(average time between events).
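Continuing the call-center example, here is a sketch that assumes an average of 12 calls per hour (a made-up figure), so the average gap is 5 minutes and λ = 1/5 per minute. It uses the exponential density λe^(-λx) and the tail probability P(T > t) = e^(-λt), then checks both against simulated gaps from `random.expovariate`.

```python
import math
import random
from statistics import mean

def exponential_pdf(x, lam):
    """Density lam * exp(-lam * x) for x >= 0; zero for negative x."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def prob_wait_longer_than(t, lam):
    """P(T > t) = exp(-lam * t), from integrating the PDF from t to infinity."""
    return math.exp(-lam * t)

# Hypothetical call centre: 12 calls per hour on average.
mean_gap_minutes = 60 / 12          # 5 minutes between calls
lam = 1 / mean_gap_minutes          # rate per minute

p_over_10 = prob_wait_longer_than(10, lam)

random.seed(3)
gaps = [random.expovariate(lam) for _ in range(20000)]
sim_mean_gap = mean(gaps)
sim_p_over_10 = sum(g > 10 for g in gaps) / len(gaps)
print(p_over_10, sim_mean_gap, sim_p_over_10)
```

The simulated average gap should be close to 5 minutes, and the simulated share of gaps longer than 10 minutes should be close to e^(-2).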

To conclude, we have very briefly discussed different continuous probability distributions in this post. Feel free to add any comments or suggestions below.

About Me

I am Priyanka Madiraju, a former software engineer, working on transitioning into Data Science. I am a master’s student in Data Science. Please feel free to connect with me on https://www.linkedin.com/in/priyanka-madiraju

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Priyanka

I am a former software engineer with 6 years of work experience. I am a Master's student in Data Science at TU Dortmund. I write about my areas of interest regularly on LinkedIn and Medium. Follow me for more technical content.


Andrew James

Hi, I also read your previous post and very much impressed with the knowledge. I really thank you to share both articles. Regards

GIDEON OGIDI-GH

God richly bless you Brother for such a brief explanation. but please if you could add real life examples to each distribution type for more better understanding. Thank you Snr
