Probability for Machine Learning: Probability Distribution Function

A probability distribution function is a mathematical function that describes how likely a random variable is to take each of its possible values. A random variable is a quantity that can take different values depending on some random process, such as flipping a coin, rolling a die, or measuring the height of a person.

There are two main types of probability distribution functions: discrete and continuous. Discrete probability distribution functions are used for random variables that take values from a countable set (finite or countably infinite), such as 0 or 1, heads or tails, red or blue, or the number of coin flips needed to get the first head. Continuous probability distribution functions are used for random variables that can take any value in a continuous range, such as height, weight, or temperature.

In this blog post, we will explore some of the most common probability distribution functions and their formulae.

Discrete Probability Distribution Functions

Probability Mass Function

The probability mass function (PMF) is the function that gives the probability of each possible value of a discrete random variable. The PMF satisfies two properties:

  • The probability of each value is between 0 and 1: 0 ≤ P(X = x) ≤ 1 for all x

  • The sum of all probabilities is equal to 1: ∑_x P(X = x) = 1

The PMF can be represented as a table, a graph, or a formula.

Example: PMF of a fair coin flip

Let X be the random variable that represents the outcome of a fair coin flip, where X = 0 for heads and X = 1 for tails. The PMF of X is:

x    P(X = x)
0    0.5
1    0.5

The PMF can also be written as:

$$P(X = x) = \begin{cases} 0.5 & \text{if } x = 0 \\ 0.5 & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases}$$
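To make the two PMF properties concrete, here is a minimal Python sketch of the coin-flip PMF. The names `coin_pmf`, `pmf`, and `is_valid_pmf` are illustrative choices for this post, not part of any library.

```python
import math

# PMF of a fair coin flip, stored as {value: probability}.
# X = 0 means heads and X = 1 means tails, as in the example above.
coin_pmf = {0: 0.5, 1: 0.5}


def pmf(x, table=coin_pmf):
    """Return P(X = x); any value not in the table has probability 0."""
    return table.get(x, 0.0)


def is_valid_pmf(table, tol=1e-9):
    """Check both PMF properties: each probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in table.values())
    sums_to_one = math.isclose(sum(table.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one


print(pmf(0))                  # 0.5
print(pmf(2))                  # 0.0 (the "otherwise" case)
print(is_valid_pmf(coin_pmf))  # True
```

The same dictionary layout would work for any discrete variable with a finite set of values, such as a fair die with six entries of 1/6 each.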

Cumulative Distribution Function

The cumulative distribution function (CDF) is the function that gives the probability of a discrete random variable being less than or equal to a given value. The CDF satisfies three properties:

  • The value of the CDF is always between 0 and 1: 0 ≤ P(X ≤ x) ≤ 1 for all x

  • The CDF is non-decreasing: P(X ≤ x) ≤ P(X ≤ y) for all x ≤ y

  • The CDF approaches 0 as x approaches negative infinity and approaches 1 as x approaches positive infinity: lim_{x→−∞} P(X ≤ x) = 0 and lim_{x→+∞} P(X ≤ x) = 1

The CDF can be represented as a graph or a formula.

Example: CDF of a fair coin flip

Let X be the random variable that represents the outcome of a fair coin flip, where X = 0 for heads and X = 1 for tails. The CDF of X is:

$$P(X \leq x) = \begin{cases} 0 & \text{if } x < 0 \\ 0.5 & \text{if } 0 \leq x < 1 \\ 1 & \text{if } x \geq 1 \end{cases}$$
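The piecewise formula above can be recovered by summing the PMF over every value that does not exceed x. The sketch below assumes the coin-flip PMF is stored as a dictionary; `coin_pmf` and `cdf` are names made up for this illustration.

```python
# PMF of a fair coin flip: X = 0 for heads, X = 1 for tails.
coin_pmf = {0: 0.5, 1: 0.5}


def cdf(x, table=coin_pmf):
    """Return P(X <= x) by adding up the PMF over all values that are <= x."""
    return sum((p for value, p in table.items() if value <= x), 0.0)


# Evaluate the CDF on both sides of each jump.
for x in (-1, 0, 0.5, 1, 2):
    print(x, cdf(x))
```

Evaluating at points such as -1, 0.5, and 2 shows the step shape: the CDF stays flat between possible values and jumps by P(X = x) at each of them.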

Continuous Probability Distribution Functions

Probability Density Function

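The probability density function (PDF) is the function that gives the relative likelihood of different values of a continuous random variable. Unlike a PMF, the density f(x) is not itself a probability: for a continuous random variable, P(X = x) = 0 for any single value x, and probabilities are obtained by integrating f over an interval. The PDF satisfies two properties: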

  • The probability density is non-negative: f(x)≥0 for all x.

  • The area under the curve is equal to 1: ∫_{−∞}^{+∞} f(x) dx = 1

The PDF can be represented as a graph or a formula.

Example: PDF of a standard normal distribution

Let Z be the random variable that follows a standard normal distribution, which has mean μ=0 and standard deviation σ=1. The PDF of Z is:

$$f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$$
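As a quick sanity check, the formula above can be coded directly and integrated numerically to confirm that the area under the curve is close to 1. This is only a sketch: the helper `trapezoid_area`, the integration range [-8, 8], and the step count are assumptions chosen for illustration, relying on the fact that the density is negligible outside that range.

```python
import math


def standard_normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def trapezoid_area(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


print(standard_normal_pdf(0.0))                        # about 0.3989, the peak of the curve
print(trapezoid_area(standard_normal_pdf, -8.0, 8.0))  # very close to 1
```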

Cumulative Distribution Function

The cumulative distribution function (CDF) is the function that gives the probability of a continuous random variable being less than or equal to a given value. The CDF satisfies three properties:

  • The value of the CDF is always between 0 and 1: 0 ≤ F(x) ≤ 1 for all x

  • The CDF is non-decreasing: F(x) ≤ F(y) for all x ≤ y

  • The CDF approaches 0 as x approaches negative infinity and approaches 1 as x approaches positive infinity: lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1

The CDF can be represented as a graph or a formula.
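It may help to write out how the CDF relates to the PDF from the previous section. Here f is the density, F is the CDF, t is a dummy integration variable, and a < b are any two points (symbols introduced only for this illustration):

$$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\, dt, \qquad P(a < X \leq b) = F(b) - F(a) = \int_{a}^{b} f(t)\, dt$$

In other words, the CDF accumulates the area under the density curve up to x, and the probability of landing in an interval is the difference of two CDF values.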

Example: CDF of a standard normal distribution

Let Z be the random variable that follows a standard normal distribution, which has mean μ=0 and standard deviation σ=1. The CDF of Z is:

$$F(z) = P(Z \leq z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{x^2}{2}} dx$$
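This integral has no closed form in terms of elementary functions, so in practice it is usually evaluated through the error function, using the identity F(z) = (1 + erf(z / √2)) / 2. Below is a minimal sketch using `math.erf` from Python's standard library; the function name `standard_normal_cdf` is an illustrative choice.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(standard_normal_cdf(0.0))   # 0.5, half the mass lies below the mean
print(standard_normal_cdf(1.96))  # about 0.975

# Probability of falling within one standard deviation of the mean.
print(standard_normal_cdf(1.0) - standard_normal_cdf(-1.0))  # about 0.6827
```

The last line uses the fact that the probability of an interval is the difference of two CDF values.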

Conclusion

In this blog post, we have learned about the main kinds of probability distribution functions (the PMF, the PDF, and the CDF) and their formulae. We have seen how they can help us describe and model different types of random variables and their probabilities. We have also seen how they can be represented using tables, graphs, or formulae.

Probability distribution functions are useful tools for understanding and analyzing data and making predictions using machine learning models. They can help us quantify and handle uncertainty, design and train models, evaluate and compare performance, and model complex relationships among variables.
