Last modified: May 11, 2025


Introduction to Distributions

A distribution is a function that describes how probability is spread across the possible values of a random variable. It helps reveal the underlying patterns and characteristics of a dataset. Distributions are widely used in statistics, data analysis, and machine learning for tasks such as hypothesis testing, constructing confidence intervals, and predictive modeling.

Random Variables

A random variable assigns a numerical value to each outcome of a random process. Random variables can be discrete (taking values from a countable set, such as 0, 1, 2, and so on) or continuous (taking any value within a range).
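To make the idea of "assigning a number to an outcome" concrete, here is a minimal sketch. The two-coin experiment and the name `num_heads` are illustrative assumptions, not part of the original text. The random variable is simply a function from outcomes to numbers, and its distribution follows from counting outcomes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin (hypothetical illustration).
outcomes = list(product("HT", repeat=2))  # ('H','H'), ('H','T'), ...

def num_heads(outcome):
    """Random variable X: maps an outcome to its number of heads."""
    return outcome.count("H")

# Each outcome is equally likely, so P(X = x) is the share of
# outcomes that X maps to x.
counts = Counter(num_heads(o) for o in outcomes)
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

Here X is discrete: it can only take the values 0, 1, or 2.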

Example: Drawing a Card from a Deck

Example: Weather Forecast

Probability Calculations

Types of Probability Distributions

Example: Probability Distribution of a Discrete Random Variable

Consider a discrete random variable X with the following probability distribution:

Value of X    Probability p_X(x)
1             0.05
2             0.10
3             0.15
4             0.20
5             0.15
6             0.10
7             0.08
8             0.07
9             0.05
10            0.05

This table can be visualized using a bar graph, with the height of each bar representing the likelihood of each outcome.
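As a rough sketch of how such a table might be checked and drawn, the snippet below stores the distribution in a dictionary (the name `pmf` is an assumption for illustration), confirms that the probabilities are non-negative and sum to 1, and prints a crude text bar chart in place of a plotted one.

```python
import math

# The distribution of X from the table above.
pmf = {1: 0.05, 2: 0.10, 3: 0.15, 4: 0.20, 5: 0.15,
       6: 0.10, 7: 0.08, 8: 0.07, 9: 0.05, 10: 0.05}

# A valid distribution has non-negative probabilities that sum to 1.
# isclose is used because the values are floating-point numbers.
total = sum(pmf.values())
is_valid = all(p >= 0 for p in pmf.values()) and math.isclose(total, 1.0)
print("valid distribution:", is_valid)

# The most likely value is the one with the tallest bar.
mode = max(pmf, key=pmf.get)
print("most likely value:", mode)

# Text bar chart: one '#' per percentage point.
for x, p in pmf.items():
    print(f"{x:>2} | {'#' * round(p * 100)} {p:.2f}")
```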


Example: Roll a Six-Sided Die Until a 6 Appears

Roll a fair six-sided die repeatedly until the die shows a 6.

Number of Rolls    Probability
1                  1/6 ≈ 0.1667
2                  (5/6) * (1/6) ≈ 0.1389
3                  (5/6)^2 * (1/6) ≈ 0.1157
4                  (5/6)^3 * (1/6) ≈ 0.0964
5                  (5/6)^4 * (1/6) ≈ 0.0803
6                  (5/6)^5 * (1/6) ≈ 0.0669


Find the probability that the first 6 appears:

  1. On the third roll.
  2. On the third or fourth roll.
  3. In fewer than five rolls.
  4. In no more than three rolls.
  5. After three rolls.
  6. In at least three rolls.

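Let X be the number of the roll on which the first 6 appears, and write P(k) for P(X = k). The calculations are:

  1. P(3) = (5/6)^2 * (1/6) ≈ 0.1157
  2. P(3 or 4) = P(3) + P(4) ≈ 0.1157 + 0.0964 ≈ 0.2121
  3. P(X < 5) = P(1) + P(2) + P(3) + P(4) ≈ 0.1667 + 0.1389 + 0.1157 + 0.0964 ≈ 0.5177
  4. P(X ≤ 3) = P(1) + P(2) + P(3) ≈ 0.1667 + 0.1389 + 0.1157 ≈ 0.4213
  5. P(X > 3) = 1 − P(X ≤ 3) ≈ 1 − 0.4213 ≈ 0.5787
  6. P(X ≥ 3) = P(3) + P(4) + P(5) + P(6) + ... Because this sum has infinitely many terms, it is easier to use the complement: P(X ≥ 3) = 1 − P(X ≤ 2) = 1 − (P(1) + P(2)) ≈ 1 − (0.1667 + 0.1389) ≈ 0.6944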
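The six answers can be cross-checked with exact fractions. This is a minimal sketch, assuming the rolls are independent, so that P(X = k) = (5/6)^(k−1) * (1/6) and P(X ≤ k) = 1 − (5/6)^k. The helper names `pmf` and `cdf` are illustrative, not from the original.

```python
from fractions import Fraction

p = Fraction(1, 6)  # probability of rolling a 6 on any single roll

def pmf(k):
    """P(X = k): k - 1 non-sixes followed by one six."""
    return (1 - p) ** (k - 1) * p

def cdf(k):
    """P(X <= k): one minus the chance of k non-sixes in a row."""
    return 1 - (1 - p) ** k

answers = {
    "1. on the third roll": pmf(3),
    "2. on the third or fourth roll": pmf(3) + pmf(4),
    "3. in fewer than five rolls": cdf(4),
    "4. in no more than three rolls": cdf(3),
    "5. after three rolls": 1 - cdf(3),
    "6. in at least three rolls": 1 - cdf(2),
}

for label, value in answers.items():
    print(f"{label}: {value} ≈ {float(value):.4f}")
```

Because the fractions are exact, the last printed digit can differ by one from figures obtained by adding already-rounded table entries.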

Example: Number of Pets Owned by Individuals

Consider the following probability distribution for the number of pets P owned by individuals.

Number of pets P    Probability P(P)
0                   0.28
1                   0.35
2                   0.22
3                   0.10
4                   0.04
5                   0.01


Find the probability that an individual owns:

  1. Less than 2 pets. P(P < 2) = P(P = 0) + P(P = 1) = 0.28 + 0.35 = 0.63
  2. More than 3 pets. P(P > 3) = P(P = 4) + P(P = 5) = 0.04 + 0.01 = 0.05
  3. 1 or 4 pets. P(P = 1 or P = 4) = P(P = 1) + P(P = 4) = 0.35 + 0.04 = 0.39
  4. At most 3 pets. P(P ≤ 3) = P(P = 0) + P(P = 1) + P(P = 2) + P(P = 3) = 0.28 + 0.35 + 0.22 + 0.10 = 0.95
  5. 2 or fewer, or more than 4 pets. P(P ≤ 2 or P > 4) = P(P ≤ 2) + P(P > 4) = (0.28 + 0.35 + 0.22) + 0.01 = 0.86
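Each of these answers adds up the probabilities of the values that satisfy a condition. A small sketch of that pattern follows. The dictionary `pets` and the helper `prob` are hypothetical names, and `Fraction` is used only to keep the decimal arithmetic exact.

```python
from fractions import Fraction

# Distribution of the number of pets, taken from the table above.
pets = {0: Fraction("0.28"), 1: Fraction("0.35"), 2: Fraction("0.22"),
        3: Fraction("0.10"), 4: Fraction("0.04"), 5: Fraction("0.01")}

def prob(event):
    """Sum the probabilities of all pet counts k for which event(k) is true."""
    return sum(p for k, p in pets.items() if event(k))

results = {
    "less than 2": prob(lambda k: k < 2),
    "more than 3": prob(lambda k: k > 3),
    "1 or 4": prob(lambda k: k in (1, 4)),
    "at most 3": prob(lambda k: k <= 3),
    "2 or fewer, or more than 4": prob(lambda k: k <= 2 or k > 4),
}

for label, value in results.items():
    print(f"{label}: {float(value):.2f}")
```

Adding probabilities this way is valid because the individual values 0 through 5 are mutually exclusive outcomes.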

Expected Value

Calculating Expected Value

Example: Expected Value in a Dice Roll

Consider a single roll of a fair six-sided die. Each side, numbered from 1 to 6, is equally likely to appear, so the probability of each outcome is 1/6.

Step-by-Step Calculation of Expected Value:

I. List All Possible Outcomes and Their Probabilities.

Outcome (X)    Probability P(X)
1              1/6
2              1/6
3              1/6
4              1/6
5              1/6
6              1/6

II. Multiply Each Outcome by Its Probability.

III. Sum Up the Products.

E(X) = (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6)

E(X) = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 21/6

E(X)=3.5

IV. Interpretation:
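A standard reading of E(X) = 3.5 is as a long-run average: no single roll shows 3.5, but the mean of many rolls tends toward it. The sketch below is a minimal illustration, assuming Python's standard `random` module with an arbitrary fixed seed and a hypothetical sample size of 100,000 rolls. It computes the exact value with `fractions.Fraction` and compares it with a simulated average.

```python
import random
from fractions import Fraction

faces = range(1, 7)

# Exact expected value: multiply each outcome by 1/6 and sum the products.
expected = sum(x * Fraction(1, 6) for x in faces)
print("exact E(X):", expected, "=", float(expected))

# Simulated long-run average (seed and sample size are arbitrary choices).
rng = random.Random(0)
n_rolls = 100_000
sample_mean = sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls
print("average of simulated rolls:", sample_mean)
```

The simulated average will not equal 3.5 exactly, but with this many rolls it should land very close to it.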

Probability Density Function (PDF) - Continuous Variables

For continuous random variables, the PDF provides the probability density at a specific point x. The area under the curve between two points on the PDF represents the probability of the variable falling within that range.

f_X(x)

Properties:

  1. Non-negative: f_X(x) ≥ 0 for all x.
  2. Normalization: The total area under the curve of the PDF is 1.

Joint PDF for Multiple Variables: For two continuous random variables X and Y, the joint PDF f_{X,Y}(x, y) gives the density at a particular point (x, y).
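The "area under the curve" idea for a single variable can be checked numerically. The sketch below is only an illustration: it assumes the standard normal density as the example PDF and uses a simple trapezoidal rule (the helper `integrate` is a hypothetical name) to approximate both a probability over an interval and the total area.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution (example PDF)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(f, a, b, n=10_000):
    """Approximate the area under f between a and b (trapezoidal rule)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# P(-1 <= X <= 1) is the area under the PDF between -1 and 1.
prob_within_one = integrate(normal_pdf, -1, 1)

# Normalization: the area over a wide range should be very close to 1.
total_area = integrate(normal_pdf, -10, 10)

print(f"P(-1 <= X <= 1) ≈ {prob_within_one:.4f}")
print(f"total area ≈ {total_area:.4f}")
```

Note that `normal_pdf(0)` is a density, not a probability; for a continuous variable only areas over intervals are probabilities.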


Probability Mass Function (PMF) - Discrete Variables

For discrete random variables, the PMF specifies the probability of the variable taking a particular value x. It gives the probability of each specific outcome directly.

p_X(x) = P(X = x)

Properties:

  1. Non-negative: p_X(x) ≥ 0 for all x.
  2. Sum to One: The sum of all probabilities for all possible values of X is 1.

Joint PMF for Multiple Variables: For two discrete random variables X and Y, the joint PMF p_{X,Y}(x, y) gives the probability of X and Y simultaneously taking values x and y, respectively.
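A joint PMF can be represented as a table keyed by (x, y) pairs. As a hedged example, the sketch below assumes two independent fair dice (an illustrative choice, not from the original) and shows how a marginal PMF and the probability of an event are obtained by summing entries of the joint table.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two independent fair dice: every pair has probability 1/36.
joint = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

# Property check: all entries of the joint PMF sum to 1.
joint_total = sum(joint.values())

# Marginal PMF of X: sum the joint PMF over every value of Y.
marginal_x = {
    x: sum(p for (a, _), p in joint.items() if a == x)
    for x in range(1, 7)
}

# Probability of an event defined on both variables, here X + Y = 7.
p_sum_7 = sum(p for (x, y), p in joint.items() if x + y == 7)

print("sum of joint PMF:", joint_total)
print("marginal P(X = 3):", marginal_x[3])
print("P(X + Y = 7):", p_sum_7)
```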


Cumulative Distribution Function (CDF) - Both Continuous and Discrete Variables

The CDF gives the probability that a random variable is less than or equal to a specific value x. It is used to calculate the probability that the variable falls at or below a given threshold.

F_X(x) = P(X ≤ x)

Properties:

  1. Non-decreasing function.
  2. lim_{x→∞} F_X(x) = 1.
  3. lim_{x→−∞} F_X(x) = 0.
  4. Right-continuous: For any x and any decreasing sequence x_n converging to x, lim_{x_n→x+} F_X(x_n) = F_X(x).

Joint CDF: For two variables X and Y, F(a, b) = P(X ≤ a, Y ≤ b). To derive the marginal distribution of X: F_X(a) = lim_{b→∞} F(a, b).
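For a discrete variable, the CDF is a running total of the PMF. The following sketch assumes a fair six-sided die as the example, with illustrative names `values`, `pmf`, and `cdf`. It builds the CDF with `itertools.accumulate` and uses it to compute the probability of an interval.

```python
from fractions import Fraction
from itertools import accumulate

values = list(range(1, 7))         # faces of a fair die
pmf = [Fraction(1, 6)] * 6         # P(X = v) for each face
cumulative = list(accumulate(pmf)) # running totals: 1/6, 2/6, ..., 1

def cdf(x):
    """F_X(x) = P(X <= x); a step function defined for any real x."""
    result = Fraction(0)
    for v, c in zip(values, cumulative):
        if v <= x:
            result = c
    return result

print("F(0)   =", cdf(0))     # below every possible value
print("F(3)   =", cdf(3))
print("F(3.5) =", cdf(3.5))   # flat between jumps
print("F(100) =", cdf(100))   # above every possible value

# P(a < X <= b) = F(b) - F(a)
p_between = cdf(5) - cdf(2)
print("P(2 < X <= 5) =", p_between)
```

The values at 0 and 100 illustrate the two limit properties, and the equality of F(3) and F(3.5) shows the flat steps between jumps.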

