Last modified: September 16, 2024

This article is written in: 🇺🇸

Autocovariance functions and coefficients

Autocovariance functions describe how values of a time series relate to their lagged counterparts, measuring the joint variability between a series at time $t$ and its value at a previous time $t-k$ (where $k$ is the lag). In autoregressive models, these relationships are expressed through coefficients, which quantify the influence of past values on future values. The autocovariance function helps in estimating these coefficients by analyzing the strength and pattern of correlations at different lags. Higher autocovariance at a specific lag suggests a stronger influence of past values on the present, aiding in model selection and parameter estimation for time series models like AR, MA, and ARIMA.

Random Variables (r.v.)

A random variable (r.v.) is a mapping from a set of outcomes in a probability space to a set of real numbers. We can distinguish between:

I. Discrete random variables take on countable values. For example, let:

$$ X \in \{45, 36, 27, \dots\} $$

II. Continuous random variables take on any value in a continuous range. For instance:

$$ Y \in (10, 60) $$

A realization is a specific observed value of a random variable. For instance:

$$ X = 20 \quad \text{and} \quad Y = 30.29 $$
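To make realizations concrete, here is a minimal Python sketch that draws one realization of a discrete and one of a continuous random variable with NumPy; the particular distributions and the seed are illustrative assumptions, not prescribed above:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # reproducible generator

# Discrete r.v.: one draw from a countable set of values
x = rng.choice([45, 36, 27, 18, 9])   # a realization of X

# Continuous r.v.: one draw from the open interval (10, 60)
y = rng.uniform(10, 60)               # a realization of Y

print(f"Realization of X: {x}")
print(f"Realization of Y: {y:.2f}")
```

Running the sketch again with a different seed produces different realizations of the same random variables.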

Covariance

The covariance between two random variables $X$ and $Y$ measures the linear relationship between them. It is defined as:

$$ \text{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right] $$

where $E$ denotes the expectation operator, and $\mu_X = E[X]$ and $\mu_Y = E[Y]$ are the means of $X$ and $Y$.

The covariance is symmetric:

$$ \text{Cov}(X, Y) = \text{Cov}(Y, X) $$

Interpretation: a positive covariance means $X$ and $Y$ tend to move in the same direction, a negative covariance means they tend to move in opposite directions, and a covariance near zero indicates little or no linear relationship between them.

Estimation of Covariance

To estimate the covariance from a paired dataset $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$, we use the sample covariance formula:

$$ s_{xy} = \frac{1}{N - 1} \sum_{t=1}^{N} (x_t - \bar{x})(y_t - \bar{y}) $$

where $\bar{x}$ and $\bar{y}$ are the sample means of the two variables and $N$ is the number of paired observations. The $N - 1$ denominator makes the estimator unbiased.
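A minimal NumPy sketch of this estimator, using made-up paired data; the manual computation is cross-checked against np.cov, which also uses the $N - 1$ denominator by default:

```python
import numpy as np

x = np.array([2.1, 2.9, 3.2, 4.0, 4.8])
y = np.array([1.0, 1.8, 2.5, 2.9, 3.9])

n = len(x)
s_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)  # sample covariance

print(s_xy)                # manual estimate
print(np.cov(x, y)[0, 1])  # NumPy's estimate; should match
```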

Stochastic Processes

A stochastic process is a collection of random variables indexed by time, denoted as:

$$ \{X_t : t \in T\} $$

where $T$ is the index set (often time or space).

Each $X_t$ follows some distribution with mean $\mu_t$ and variance $\sigma_t^2$, which in general may depend on $t$:

$$ X_t \sim \text{Distribution}(\mu_t, \sigma_t^2) $$

Example: A time series is a single realization of a stochastic process. Consider the sequence of random variables

$$ X_1, X_2, X_3, \dots $$

realized, for instance, as:

$$ 30, 29, 57, \dots $$
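The distinction between a process and its realization can be seen by simulating. The sketch below draws one realization of a Gaussian white-noise process; the mean, standard deviation, and length are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

mu, sigma = 30.0, 15.0  # assumed mean and standard deviation
T = 10                  # number of time points

# One realization of {X_t}: each X_t ~ Normal(mu, sigma^2), independent here
realization = rng.normal(mu, sigma, size=T)
print(np.round(realization, 1))  # one observed time series
```

Re-running with a different seed yields a different time series from the same underlying process.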

Autocovariance Function

The autocovariance function measures the covariance between two values of the time series at different times $s$ and $t$:

$$ \gamma(s, t) = \text{Cov}(X_s, X_t) = E\left[(X_s - \mu_s)(X_t - \mu_t)\right] $$

where $\mu_s = E[X_s]$ and $\mu_t = E[X_t]$ are the means of the series at times $s$ and $t$.

Variance as a special case:

When $s = t$, the autocovariance function simplifies to the variance of the series at time $t$:

$$ \gamma(t, t) = E\left[(X_t - \mu_t)^2\right] = \text{Var}(X_t) = \sigma_t^2 $$
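This identity is easy to check numerically: with the $1/N$ convention used for the sample autocovariance introduced below, the lag-0 estimate coincides with NumPy's population variance. A small sketch with made-up data:

```python
import numpy as np

x = np.array([30.0, 29.0, 57.0, 41.0, 36.0, 48.0])
xbar = x.mean()

c0 = np.sum((x - xbar) ** 2) / len(x)  # lag-0 sample autocovariance
print(c0)         # equals the variance below
print(np.var(x))  # np.var defaults to the same 1/N denominator (ddof=0)
```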

Lagged Autocovariance

The lagged autocovariance function measures the covariance between values of the series at times $t$ and $t+k$, where $k$ is the lag:

$$ \gamma_k = \gamma(t, t+k) = E\left[(X_t - \mu)(X_{t+k} - \mu)\right] $$

For a (weakly) stationary process, the autocovariance function depends only on the lag $k$, not on the specific times $t$ and $t+k$:

$$ \gamma(t, t+k) = \gamma_k \quad \text{for all } t $$

That is, the autocovariance is the same at any pair of time points separated by the same lag $k$.

Autocovariance Coefficients

Autocovariance measures the covariance of a time series with itself at different time lags. For a time series $\{X_t\}$, the autocovariance at lag $k$ is defined as:

$$ \gamma_k = \text{Cov}(X_t, X_{t+k}) = E\left[(X_t - \mu)(X_{t+k} - \mu)\right] $$

where $\mu = E[X_t]$ is the (assumed constant) mean of the series and $k$ is the lag.

The sample estimate of the autocovariance coefficient $\gamma_k$ is denoted $c_k$. For a time series with $N$ observations, the estimator is:

$$ c_k = \frac{1}{N} \sum_{t=1}^{N-k} (x_t - \bar{x})(x_{t+k} - \bar{x}) $$

where $\bar{x}$ is the sample mean of the series, $N$ is the number of observations, and the sum runs over the $N - k$ pairs of observations separated by lag $k$.
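A direct translation of this estimator into Python, assuming the series is stored in a NumPy array; the cross-check against statsmodels' acovf (which, with adjusted=False, uses the same $1/N$ denominator) is commented out in case the library is not installed:

```python
import numpy as np

def sample_autocov(x, k):
    """Sample autocovariance c_k with the 1/N denominator."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    return np.sum((x[: n - k] - xbar) * (x[k:] - xbar)) / n

rng = np.random.default_rng(seed=1)
series = rng.normal(0, 1, size=200)  # illustrative white-noise series

for k in range(4):
    print(f"c_{k} = {sample_autocov(series, k):+.4f}")

# Cross-check (requires statsmodels):
# from statsmodels.tsa.stattools import acovf
# print(acovf(series, nlag=3, adjusted=False))
```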

Assumption of Weak Stationarity

For weakly stationary processes, the mean $\mu$ is constant, and the autocovariance $\gamma_k$ depends only on the lag $k$, not on the actual time points $t$ and $t+k$. Therefore, the autocovariance function becomes:

$$ \gamma_k = E\left[(X_t - \mu)(X_{t+k} - \mu)\right] = \text{Cov}(X_t, X_{t+k}) $$

Under this assumption, the sample autocovariance $c_k$ defined above is a natural estimator of $\gamma_k$: every pair of observations separated by lag $k$ carries information about the same quantity. This allows us to estimate the strength of the linear relationship between $X_t$ and $X_{t+k}$ at different lags $k$.
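As a closing sketch, one can simulate a weakly stationary AR(1) process, for which the theoretical autocovariance $\gamma_k = \sigma^2 \phi^k / (1 - \phi^2)$ is known, and verify that the sample coefficients $c_k$ approach it; the parameter values here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=7)
phi, sigma, n = 0.6, 1.0, 5000  # assumed AR(1) parameters and sample size

# Simulate X_t = phi * X_{t-1} + eps_t, with eps_t ~ Normal(0, sigma^2)
x = np.zeros(n)
eps = rng.normal(0, sigma, size=n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

xbar = x.mean()
for k in range(4):
    c_k = np.sum((x[: n - k] - xbar) * (x[k:] - xbar)) / n
    gamma_k = sigma**2 * phi**k / (1 - phi**2)  # theoretical autocovariance
    print(f"lag {k}: c_k = {c_k:.3f}, gamma_k = {gamma_k:.3f}")
```

With a sample this long, the estimates should agree with the theoretical values to within sampling error.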
