Last modified: July 28, 2024

Covariance

Covariance is a fundamental statistical measure of how two random variables vary together. Its sign indicates the direction of the linear relationship between the variables: positive when they tend to move in the same direction, negative when they tend to move in opposite directions.

Definition

The covariance between two random variables X and Y is defined as the expected value (mean) of the product of their deviations from their respective means:

$$\operatorname{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right]$$

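Where $\mu_X = E[X]$ and $\mu_Y = E[Y]$ are the means of $X$ and $Y$, and $E[\cdot]$ denotes the expected value.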

Alternative Expression

By expanding the definition and applying the linearity properties of expectation, covariance can also be expressed as:

$$\operatorname{Cov}(X, Y) = E[XY] - E[X]\,E[Y]$$

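Derivation:

  1. Start with the definition:

$$\operatorname{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right]$$

  2. Expand the product inside the expectation:

$$\operatorname{Cov}(X, Y) = E\left[XY - X\mu_Y - \mu_X Y + \mu_X \mu_Y\right]$$

  3. Use the linearity of expectation:

$$\operatorname{Cov}(X, Y) = E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X \mu_Y$$

  4. Recognize that $\mu_X = E[X]$ and $\mu_Y = E[Y]$:

$$\operatorname{Cov}(X, Y) = E[XY] - \mu_Y \mu_X - \mu_X \mu_Y + \mu_X \mu_Y = E[XY] - \mu_X \mu_Y$$

Thus, we arrive at:

$$\operatorname{Cov}(X, Y) = E[XY] - E[X]\,E[Y]$$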
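As a quick numerical check of this identity, the sketch below computes the covariance both ways for a small set of equally likely outcomes. The outcome values and the helper name `expect` are hypothetical, chosen only for illustration; exact fractions are used so the two results can be compared with `==`.

```python
from fractions import Fraction

# Hypothetical, equally likely outcomes (x, y); not taken from the article.
outcomes = [(1, 3), (2, 1), (4, 5), (5, 2)]

def expect(values):
    """Expected value of equally likely outcomes, as an exact fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))

mu_x = expect(x for x, _ in outcomes)
mu_y = expect(y for _, y in outcomes)

# Definition: E[(X - mu_X)(Y - mu_Y)]
cov_definition = expect((x - mu_x) * (y - mu_y) for x, y in outcomes)

# Alternative expression: E[XY] - E[X]E[Y]
cov_shortcut = expect(x * y for x, y in outcomes) - mu_x * mu_y

print(cov_definition, cov_shortcut)  # both print 1/2
```

Both expressions agree, as the derivation predicts.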

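Interpretation

A positive covariance means that X and Y tend to be above (or below) their means at the same time. A negative covariance means that when one variable is above its mean, the other tends to be below its mean. A covariance of zero means there is no linear relationship between the two variables.

Important Note: The magnitude of the covariance depends on the units of X and Y, so it does not by itself indicate how strong the relationship is. A covariance of zero also does not imply that the variables are independent.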

Properties of Covariance

I. Symmetry:

$$\operatorname{Cov}(X, Y) = \operatorname{Cov}(Y, X)$$

II. Linearity in Each Argument:

For constants a and b, and random variables X, Y, and Z:

$$\operatorname{Cov}(aX + bY, Z) = a\operatorname{Cov}(X, Z) + b\operatorname{Cov}(Y, Z)$$

III. Covariance with Itself (Variance Relation):

The covariance of a variable with itself is the variance of that variable:

$$\operatorname{Cov}(X, X) = \operatorname{Var}(X)$$

IV. Scaling:

If a and b are constants:

$$\operatorname{Cov}(aX, bY) = ab\operatorname{Cov}(X, Y)$$

V. Addition of Constants:

Adding a constant to a variable does not affect the covariance:

$$\operatorname{Cov}(X + c, Y) = \operatorname{Cov}(X, Y)$$

VI. Relationship with Correlation:

Covariance is related to the correlation coefficient $\rho_{XY}$:

$$\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

Where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$, respectively.
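The properties above can be spot-checked numerically. The following is a minimal sketch, assuming small hypothetical datasets `X`, `Y`, `Z` and arbitrary constants `a`, `b`, `c`; the helper `cov` is an illustrative implementation that divides by n - 1, and exact fractions are used so the identities hold without rounding error.

```python
from fractions import Fraction

def cov(xs, ys):
    """Covariance of two equal-length sequences (divides by n - 1), exact."""
    n = len(xs)
    x_bar = Fraction(sum(xs), n)
    y_bar = Fraction(sum(ys), n)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data and constants, chosen only for illustration.
X = [1, 4, 2, 7]
Y = [3, 0, 5, 2]
Z = [2, 2, 6, 1]
a, b, c = 3, -2, 10

x_bar = Fraction(sum(X), len(X))
var_x = sum((x - x_bar) ** 2 for x in X) / (len(X) - 1)

checks = {
    "I. symmetry": cov(X, Y) == cov(Y, X),
    "II. linearity": cov([a * x + b * y for x, y in zip(X, Y)], Z)
    == a * cov(X, Z) + b * cov(Y, Z),
    "III. variance relation": cov(X, X) == var_x,
    "IV. scaling": cov([a * x for x in X], [b * y for y in Y]) == a * b * cov(X, Y),
    "V. adding a constant": cov([x + c for x in X], Y) == cov(X, Y),
}

for name, holds in checks.items():
    print(name, holds)
```

A check like this only confirms the identities for one dataset; it is not a proof, but it can catch sign or scaling mistakes when applying the properties.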

Sample Covariance

When working with sample data, the sample covariance between two variables X and Y is calculated as:

$$s_{XY} = \operatorname{Cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

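Where $n$ is the number of observations, $X_i$ and $Y_i$ are the $i$-th observations, and $\bar{X}$ and $\bar{Y}$ are the sample means of $X$ and $Y$.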

Note: The denominator $n - 1$ (rather than $n$) makes the sample covariance an unbiased estimator of the population covariance.
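The role of the $n - 1$ denominator can be illustrated by exact enumeration. The sketch below assumes a tiny hypothetical population of three equally likely (x, y) pairs, lists every ordered sample of size 2 drawn with replacement, and averages the sample covariance over all of them. The parameter name `ddof` is an illustrative choice (the amount subtracted from n in the denominator).

```python
from fractions import Fraction
from itertools import product

# Hypothetical population of equally likely (x, y) pairs.
population = [(1, 2), (2, 4), (3, 7)]

def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))

def cov(pairs, ddof):
    """Covariance of (x, y) pairs with denominator n - ddof."""
    n = len(pairs)
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    total = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return total / (n - ddof)

# Population covariance (denominator n).
population_cov = cov(population, ddof=0)

# Every ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_with_n_minus_1 = mean(cov(s, ddof=1) for s in samples)
avg_with_n = mean(cov(s, ddof=0) for s in samples)

print(population_cov)      # 5/3
print(avg_with_n_minus_1)  # 5/3, matches the population value
print(avg_with_n)          # 5/6, too small by a factor of (n - 1)/n
```

Averaged over all possible samples, the $n - 1$ version recovers the population covariance, while dividing by $n$ underestimates it.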

Example: Calculating Covariance Step by Step

Let's calculate the covariance between two variables X and Y using the following dataset:

| Observation ($i$) | $X_i$ | $Y_i$ |
|---|---|---|
| 1 | 1 | 2 |
| 2 | 2 | 4 |
| 3 | 3 | 6 |

Step 1: Calculate the Sample Means

Compute the mean of X and Y:

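$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{1 + 2 + 3}{3} = \frac{6}{3} = 2$$

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i = \frac{2 + 4 + 6}{3} = \frac{12}{3} = 4$$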

Step 2: Compute the Deviations from the Mean

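Calculate $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$:

| $i$ | $X_i$ | $Y_i$ | $X_i - \bar{X}$ | $Y_i - \bar{Y}$ |
|---|---|---|---|---|
| 1 | 1 | 2 | $1 - 2 = -1$ | $2 - 4 = -2$ |
| 2 | 2 | 4 | $2 - 2 = 0$ | $4 - 4 = 0$ |
| 3 | 3 | 6 | $3 - 2 = 1$ | $6 - 4 = 2$ |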

Step 3: Calculate the Product of Deviations

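Compute $(X_i - \bar{X})(Y_i - \bar{Y})$:

| $i$ | $X_i - \bar{X}$ | $Y_i - \bar{Y}$ | $(X_i - \bar{X})(Y_i - \bar{Y})$ |
|---|---|---|---|
| 1 | -1 | -2 | $(-1)(-2) = 2$ |
| 2 | 0 | 0 | $(0)(0) = 0$ |
| 3 | 1 | 2 | $(1)(2) = 2$ |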

Step 4: Sum the Products of Deviations

Compute the sum:

$$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = 2 + 0 + 2 = 4$$

Step 5: Calculate the Sample Covariance

Use the sample covariance formula:

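$$s_{XY} = \operatorname{Cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

Since $n = 3$:

$$s_{XY} = \frac{1}{3 - 1} \times 4 = \frac{1}{2} \times 4 = 2$$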
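Steps 1 through 5 can be reproduced in a few lines of plain Python. This is a minimal sketch using the dataset from the example; the variable names are illustrative.

```python
# Dataset from the worked example.
X = [1, 2, 3]
Y = [2, 4, 6]
n = len(X)

# Step 1: sample means
x_bar = sum(X) / n  # 2.0
y_bar = sum(Y) / n  # 4.0

# Step 2: deviations from the mean
dev_x = [x - x_bar for x in X]  # [-1.0, 0.0, 1.0]
dev_y = [y - y_bar for y in Y]  # [-2.0, 0.0, 2.0]

# Step 3: products of deviations
products = [dx * dy for dx, dy in zip(dev_x, dev_y)]  # [2.0, 0.0, 2.0]

# Step 4: sum of the products
total = sum(products)  # 4.0

# Step 5: sample covariance with the n - 1 denominator
s_xy = total / (n - 1)
print(s_xy)  # 2.0
```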

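Interpretation: The sample covariance is positive ($s_{XY} = 2$), so $X$ and $Y$ tend to increase together in this dataset.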

Step 6: Calculate the Variances (Optional)

For completeness, calculate the variances of X and Y:

Variance of X

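$$s_{XX} = \operatorname{Var}(X) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

Compute $(X_i - \bar{X})^2$:

| $i$ | $X_i - \bar{X}$ | $(X_i - \bar{X})^2$ |
|---|---|---|
| 1 | -1 | $(-1)^2 = 1$ |
| 2 | 0 | $(0)^2 = 0$ |
| 3 | 1 | $(1)^2 = 1$ |

Sum:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = 1 + 0 + 1 = 2$$

Compute variance:

$$s_{XX} = \frac{1}{2} \times 2 = 1$$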

Variance of Y

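Similarly, compute $(Y_i - \bar{Y})^2$:

| $i$ | $Y_i - \bar{Y}$ | $(Y_i - \bar{Y})^2$ |
|---|---|---|
| 1 | -2 | $(-2)^2 = 4$ |
| 2 | 0 | $(0)^2 = 0$ |
| 3 | 2 | $(2)^2 = 4$ |

Sum:

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = 4 + 0 + 4 = 8$$

Compute variance:

$$s_{YY} = \operatorname{Var}(Y) = \frac{1}{2} \times 8 = 4$$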

Step 7: Calculate the Correlation Coefficient (Optional)

The correlation coefficient $r_{XY}$ standardizes the covariance, providing a dimensionless measure of the strength and direction of the linear relationship:

$$r_{XY} = \frac{s_{XY}}{\sqrt{s_{XX} \times s_{YY}}} = \frac{2}{\sqrt{1 \times 4}} = \frac{2}{2} = 1$$
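The variances and the correlation coefficient for the example dataset can be checked the same way. In this sketch, `sample_cov` is an illustrative helper (not a library function) that uses the $n - 1$ denominator; the variance of a variable is obtained as its covariance with itself.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Dataset from the worked example.
X = [1, 2, 3]
Y = [2, 4, 6]

s_xy = sample_cov(X, Y)  # 2.0
s_xx = sample_cov(X, X)  # 1.0, the sample variance of X
s_yy = sample_cov(Y, Y)  # 4.0, the sample variance of Y

# Correlation: covariance divided by the product of standard deviations.
r_xy = s_xy / math.sqrt(s_xx * s_yy)
print(r_xy)  # 1.0
```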

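Interpretation: $r_{XY} = 1$ indicates a perfect positive linear relationship, which is expected here because $Y_i = 2X_i$ for every observation.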


Limitations of Covariance

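I. Scale Dependence:

The value of the covariance depends on the units of X and Y; rescaling either variable rescales the covariance by the same factor.

II. Comparison Difficulties:

Because of this unit dependence, covariances computed for different pairs of variables or different datasets cannot be compared directly.

III. Not a Measure of Strength:

The magnitude of the covariance alone does not indicate how strong the relationship is; the correlation coefficient is needed for that.

IV. Linear Relationships Only:

Covariance captures only linear association, so variables with a strong nonlinear dependence can still have a covariance of zero.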
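Two of these limitations are easy to see numerically. The sketch below uses hypothetical height and weight values (not from the article) to show that changing units changes the covariance but not the correlation, and a hypothetical quadratic relationship to show that a fully dependent pair of variables can have zero covariance. The helpers `sample_cov` and `sample_corr` are illustrative.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation: covariance over the product of standard deviations."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Scale dependence: the same hypothetical heights in centimeters and in meters.
heights_cm = [150, 160, 170, 180]
weights_kg = [55, 62, 70, 81]
heights_m = [h / 100 for h in heights_cm]

cov_cm = sample_cov(heights_cm, weights_kg)
cov_m = sample_cov(heights_m, weights_kg)
corr_cm = sample_corr(heights_cm, weights_kg)
corr_m = sample_corr(heights_m, weights_kg)

print(cov_cm, cov_m)    # differ by a factor of 100
print(corr_cm, corr_m)  # agree up to floating-point rounding

# Linear relationships only: Y = X^2 on values symmetric around zero.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
cov_nonlinear = sample_cov(xs, ys)
print(cov_nonlinear)  # 0.0, even though ys is fully determined by xs
```

The unit change follows directly from the scaling property Cov(aX, bY) = ab Cov(X, Y) with a = 1/100 and b = 1.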
