
Time Series Modeling

Time series modeling involves analyzing data points collected or recorded at specific time intervals to understand underlying structures and make forecasts. Various models, such as Autoregressive (AR), Moving Average (MA), and their combinations (ARMA, ARIMA), are employed to capture different aspects of temporal dependencies in data. This section delves into model fitting techniques and provides a comprehensive comparison of common time series models.

Model Fitting

Fitting a time series model involves estimating the model's coefficients to best capture the underlying patterns in the data. For models like AR, MA, or ARMA, coefficients are typically estimated using Maximum Likelihood Estimation (MLE) or Least Squares Estimation (LSE). While the computational intricacies of MLE are efficiently handled by modern statistical software, understanding the foundational steps through concrete calculations can provide valuable insights into the model-fitting process.

The primary objective in model fitting is to determine the coefficients that minimize the cumulative squared errors (white noise terms), formally expressed as:

$$\text{Minimize} \sum_{t=1}^{n} w_t^2$$

where $w_t$ represents the residual or error term at time $t$.

Below, we expand on this by providing concrete calculations for fitting an AR(2) and an MA(2) model using Least Squares Estimation. We'll use a small synthetic dataset for illustration purposes.

Synthetic Dataset

Consider the following time series data for $Y_t$ over $t = 1$ to $t = 5$:

| $t$ | $Y_t$ |
|-----|-------|
| 1   | 2.0   |
| 2   | 2.5   |
| 3   | 3.0   |
| 4   | 3.5   |
| 5   | 4.0   |

For simplicity, we'll assume that the series starts at $t = 1$, and initial lag values ($Y_0$ and $Y_{-1}$) are known or set to zero.
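The calculations below are small enough to check by hand, but it helps to have the series available in code. A minimal setup (NumPy is our choice here; the variable name is our own):

```python
import numpy as np

# Synthetic series Y_t for t = 1..5
Y = np.array([2.0, 2.5, 3.0, 3.5, 4.0])

print(Y.mean())   # 3.0 -- reused later as the MA(2) mean estimate mu
```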

Fitting an AR(2) Model

An Autoregressive model of order 2 (AR(2)) is defined as:

$$Y_t = B_0 + B_1 Y_{t-1} + B_2 Y_{t-2} + w_t$$

Our goal is to estimate the coefficients $B_0$, $B_1$, and $B_2$ that minimize the sum of squared residuals (starting at $t = 3$, the first point with two available lags):

$$\text{Minimize} \sum_{t=3}^{5} w_t^2 = \sum_{t=3}^{5} \left(Y_t - B_0 - B_1 Y_{t-1} - B_2 Y_{t-2}\right)^2$$

Step-by-Step Calculation

I. Construct the Equations:

Since $Y_t$ depends on its two previous values, we can construct equations from $t = 3$ to $t = 5$:

For $t = 3$:

$$3.0 = B_0 + B_1 \cdot 2.5 + B_2 \cdot 2.0 + w_3$$

For $t = 4$:

$$3.5 = B_0 + B_1 \cdot 3.0 + B_2 \cdot 2.5 + w_4$$

For $t = 5$:

$$4.0 = B_0 + B_1 \cdot 3.5 + B_2 \cdot 3.0 + w_5$$

II. Set Up the System of Equations:

Ignoring the error terms for the purpose of least squares estimation, we have:

$$3.0 = B_0 + 2.5B_1 + 2.0B_2$$

$$3.5 = B_0 + 3.0B_1 + 2.5B_2$$

$$4.0 = B_0 + 3.5B_1 + 3.0B_2$$

III. Matrix Representation:

Represent the system in matrix form $Y = XB + w$:

$$\begin{bmatrix} 3.0 \\ 3.5 \\ 4.0 \end{bmatrix} = \begin{bmatrix} 1 & 2.5 & 2.0 \\ 1 & 3.0 & 2.5 \\ 1 & 3.5 & 3.0 \end{bmatrix} \begin{bmatrix} B_0 \\ B_1 \\ B_2 \end{bmatrix} + \begin{bmatrix} w_3 \\ w_4 \\ w_5 \end{bmatrix}$$

IV. Apply Least Squares Estimation:

The least squares solution, provided $X^\top X$ is invertible, is given by:

$$B = (X^\top X)^{-1} X^\top Y$$

Let's compute each component step by step.

Compute $X^\top X$:

$$X^\top X = \begin{bmatrix} 1 & 1 & 1 \\ 2.5 & 3.0 & 3.5 \\ 2.0 & 2.5 & 3.0 \end{bmatrix} \begin{bmatrix} 1 & 2.5 & 2.0 \\ 1 & 3.0 & 2.5 \\ 1 & 3.5 & 3.0 \end{bmatrix} = \begin{bmatrix} 3 & 9 & 7.5 \\ 9 & 27.5 & 23 \\ 7.5 & 23 & 19.25 \end{bmatrix}$$

Compute $X^\top Y$:

$$X^\top Y = \begin{bmatrix} 1 & 1 & 1 \\ 2.5 & 3.0 & 3.5 \\ 2.0 & 2.5 & 3.0 \end{bmatrix} \begin{bmatrix} 3.0 \\ 3.5 \\ 4.0 \end{bmatrix} = \begin{bmatrix} 10.5 \\ 32 \\ 26.75 \end{bmatrix}$$

Check whether $X^\top X$ is invertible:

Here the textbook formula runs into a snag: $X^\top X$ is singular, so its inverse does not exist. The synthetic series is a perfect straight line (it rises by exactly 0.5 each step), which makes the columns of $X$ linearly dependent; the $Y_{t-1}$ column minus the $Y_{t-2}$ column equals $0.5$ times the intercept column. As a result, the least squares solution is not unique: every coefficient vector satisfying

$$B_1 + B_2 = 1, \qquad B_0 = 1 - 0.5B_1$$

fits all three equations exactly.

Compute $B$ via the pseudoinverse:

A standard way to select one solution in this degenerate case is the Moore-Penrose pseudoinverse, which returns the minimum-norm solution:

$$B = X^{+} Y = \begin{bmatrix} 2/3 \\ 2/3 \\ 1/3 \end{bmatrix} \approx \begin{bmatrix} 0.667 \\ 0.667 \\ 0.333 \end{bmatrix}$$

V. Estimated Coefficients:

$$B_0 \approx 0.667, \quad B_1 \approx 0.667, \quad B_2 \approx 0.333$$

(one of infinitely many exact solutions for this particular dataset).

VI. Model Interpretation:

The fitted AR(2) model, using the minimum-norm solution, is:

$$Y_t = 0.667 + 0.667\,Y_{t-1} + 0.333\,Y_{t-2} + w_t$$

VII. Validation:

To validate, plug the estimated coefficients back into the equations and compute the residuals $w_t$:

For $t = 3$:

$$3.0 = 0.667 + 0.667 \cdot 2.5 + 0.333 \cdot 2.0 + w_3 \approx 3.0 + w_3 \;\Rightarrow\; w_3 \approx 0.0$$

For $t = 4$:

$$3.5 = 0.667 + 0.667 \cdot 3.0 + 0.333 \cdot 2.5 + w_4 \approx 3.5 + w_4 \;\Rightarrow\; w_4 \approx 0.0$$

For $t = 5$:

$$4.0 = 0.667 + 0.667 \cdot 3.5 + 0.333 \cdot 3.0 + w_5 \approx 4.0 + w_5 \;\Rightarrow\; w_5 \approx 0.0$$

All residuals are zero (up to rounding), which is exactly what the singularity predicted: a perfectly linear series is reproduced without error by any AR(2) with $B_1 + B_2 = 1$. With real, noisy data the design matrix is almost always full rank, the least squares solution is unique, and the residuals are nonzero.
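These hand calculations can be reproduced in a few lines of NumPy. The sketch below (variable names are our own) builds the design matrix, confirms it is rank-deficient, and recovers the minimum-norm solution with `np.linalg.lstsq`, which handles singular systems via the pseudoinverse:

```python
import numpy as np

Y = np.array([2.0, 2.5, 3.0, 3.5, 4.0])

# Rows for t = 3..5: [1, Y_{t-1}, Y_{t-2}]
X = np.column_stack([np.ones(3), Y[1:4], Y[0:3]])
y = Y[2:5]

print(np.linalg.matrix_rank(X))   # 2 -- the lag columns are collinear

# Minimum-norm least squares solution (equivalent to np.linalg.pinv(X) @ y)
B, *_ = np.linalg.lstsq(X, y, rcond=None)
print(B)            # approx [0.667, 0.667, 0.333]
print(y - X @ B)    # residuals: approx [0, 0, 0]

# One-step-ahead forecast for t = 6
print(B @ [1.0, Y[-1], Y[-2]])   # approx 4.5, continuing the linear trend
```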

Fitting an MA(2) Model

A Moving Average model of order 2 (MA(2)) is defined as:

$$Y_t = \mu + w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2}$$

Fitting an MA model is inherently more complex than fitting an AR model because the error terms $w_t$ are part of the model equations. Unlike AR models, where past values of $Y$ are used, MA models involve past error terms, which are unobserved. Therefore, estimating the coefficients typically requires iterative methods such as Maximum Likelihood Estimation (MLE) or the Method of Moments.

However, for illustrative purposes, let's attempt a simplified approach using a small dataset and assuming initial error terms are zero.

Step-by-Step Calculation

I. Assumptions:

The pre-sample errors are zero ($w_0 = w_{-1} = 0$), and the mean $\mu$ is estimated by the sample average of the series.

II. Calculate $\mu$:

$$\mu = \frac{2.0 + 2.5 + 3.0 + 3.5 + 4.0}{5} = \frac{15.0}{5} = 3.0$$

III. Construct the Equations:

The MA(2) model can be rewritten for each time point as:

$$Y_t - \mu = w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2}$$

Substituting the known values and assumptions:

For $t = 1$:

$$2.0 - 3.0 = w_1 + \theta_1 \cdot 0 + \theta_2 \cdot 0$$

$$-1.0 = w_1$$

For $t = 2$:

$$2.5 - 3.0 = w_2 + \theta_1 w_1 + \theta_2 \cdot 0$$

$$-0.5 = w_2 + \theta_1(-1.0)$$

For $t = 3$:

$$3.0 - 3.0 = w_3 + \theta_1 w_2 + \theta_2 w_1$$

$$0.0 = w_3 + \theta_1 w_2 + \theta_2(-1.0)$$

For $t = 4$:

$$3.5 - 3.0 = w_4 + \theta_1 w_3 + \theta_2 w_2$$

$$0.5 = w_4 + \theta_1 w_3 + \theta_2 w_2$$

For $t = 5$:

$$4.0 - 3.0 = w_5 + \theta_1 w_4 + \theta_2 w_3$$

$$1.0 = w_5 + \theta_1 w_4 + \theta_2 w_3$$

IV. Solving the Equations:

The system involves both the coefficients $\theta_1, \theta_2$ and the error terms $w_t$. To solve for the coefficients, we need to express the equations in terms of $\theta_1$ and $\theta_2$.

From $t = 1$:

$$w_1 = -1.0$$

From $t = 2$:

$$-0.5 = w_2 - \theta_1 \;\Rightarrow\; w_2 = -0.5 + \theta_1$$

From $t = 3$:

$$0 = w_3 + \theta_1 w_2 - \theta_2 \;\Rightarrow\; w_3 = -\theta_1 w_2 + \theta_2$$

Substituting $w_2$:

$$w_3 = -\theta_1(-0.5 + \theta_1) + \theta_2 = 0.5\theta_1 - \theta_1^2 + \theta_2$$

From $t = 4$:

$$0.5 = w_4 + \theta_1 w_3 + \theta_2 w_2$$

Substituting $w_3$ and $w_2$:

$$0.5 = w_4 + \theta_1\left(0.5\theta_1 - \theta_1^2 + \theta_2\right) + \theta_2\left(-0.5 + \theta_1\right)$$

From $t = 5$:

$$1.0 = w_5 + \theta_1 w_4 + \theta_2 w_3$$

This system is nonlinear and interdependent, making it challenging to solve analytically. Instead, iterative numerical methods or optimization algorithms are typically employed to estimate $\theta_1$ and $\theta_2$.

V. Simplified Approach:

Given the complexity, we'll adopt a simplified approach: make initial guesses for $\theta_1$ and $\theta_2$, then iteratively refine them to reduce the sum of squared residuals.

Initial Guesses:

$$\theta_1 = 0.0, \quad \theta_2 = 0.0$$

Iteration 1:

$$w_1 = -1.0$$

$$w_2 = -0.5 + 0.0 = -0.5$$

$$w_3 = 0.5 \cdot 0.0 - 0.0^2 + 0.0 = 0.0$$

$$0.5 = w_4 + 0.0 \cdot 0.0 + 0.0 \cdot (-0.5) \;\Rightarrow\; w_4 = 0.5$$

$$1.0 = w_5 + 0.0 \cdot 0.5 + 0.0 \cdot 0.0 \;\Rightarrow\; w_5 = 1.0$$

Sum of Squared Residuals (SSR):

$$\text{SSR} = (-1.0)^2 + (-0.5)^2 + 0.0^2 + 0.5^2 + 1.0^2 = 1.0 + 0.25 + 0.0 + 0.25 + 1.0 = 2.5$$

Iteration 2:

Suppose we adjust $\theta_1$ and $\theta_2$ to reduce the SSR. For instance, set $\theta_1 = 0.1$, $\theta_2 = 0.05$.

Recompute residuals with new coefficients:

$$w_2 = -0.5 + 0.1 = -0.4$$

$$w_3 = 0.5 \cdot 0.1 - (0.1)^2 + 0.05 = 0.05 - 0.01 + 0.05 = 0.09$$

$$0.5 = w_4 + 0.1 \cdot 0.09 + 0.05 \cdot (-0.4) = w_4 + 0.009 - 0.02$$

$$w_4 = 0.5 - 0.009 + 0.02 = 0.511$$

$$1.0 = w_5 + 0.1 \cdot 0.511 + 0.05 \cdot 0.09 = w_5 + 0.0511 + 0.0045$$

$$w_5 = 1.0 - 0.0556 = 0.9444$$

New SSR:

$$\text{SSR} = (-1.0)^2 + (-0.4)^2 + 0.09^2 + 0.511^2 + 0.9444^2 \approx 1.0 + 0.16 + 0.0081 + 0.2611 + 0.8919 = 2.3211$$

The SSR has decreased from 2.5 to roughly 2.32, so this adjustment improved the fit. Refining $\theta_1$ and $\theta_2$ further by hand quickly becomes tedious, which is why a systematic optimization approach, such as gradient descent or a numerical optimizer, is used in practice.
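The manual iterations above translate directly into code. Below is a minimal sketch (the helper name `ma2_ssr` is our own; pre-sample errors are assumed zero, as in the derivation) that computes the residuals recursively for given $\theta_1, \theta_2$ and hands the SSR to `scipy.optimize.minimize`:

```python
import numpy as np
from scipy.optimize import minimize

Y = np.array([2.0, 2.5, 3.0, 3.5, 4.0])
mu = Y.mean()  # 3.0

def ma2_ssr(theta, y=Y, mu=mu):
    """Conditional sum of squares for an MA(2): residuals are
    computed recursively with pre-sample errors set to zero."""
    theta1, theta2 = theta
    w = np.zeros(len(y))
    for t in range(len(y)):
        w_lag1 = w[t - 1] if t >= 1 else 0.0
        w_lag2 = w[t - 2] if t >= 2 else 0.0
        w[t] = (y[t] - mu) - theta1 * w_lag1 - theta2 * w_lag2
    return np.sum(w ** 2)

print(ma2_ssr([0.0, 0.0]))   # 2.5    -- matches iteration 1
print(ma2_ssr([0.1, 0.05]))  # ~2.321 -- matches iteration 2

result = minimize(ma2_ssr, x0=[0.0, 0.0])
print(result.x, result.fun)  # conditional-SS estimates of theta1, theta2
```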

The above example illustrates that fitting an MA(2) model manually involves solving a system of nonlinear equations, which is not straightforward. In practice, statistical software packages (e.g., R's stats package, Python's statsmodels) implement sophisticated algorithms to estimate MA model parameters efficiently using MLE or other optimization techniques.
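For comparison, the same model can be estimated by MLE in statsmodels. This is only a syntax sketch: with five observations the estimates are not meaningful, and the optimizer may warn about convergence:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

Y = np.array([2.0, 2.5, 3.0, 3.5, 4.0])

# order=(p, d, q) = (0, 0, 2): an MA(2) with a constant mean
result = ARIMA(Y, order=(0, 0, 2)).fit()
print(result.params)   # const (mu), ma.L1 (theta1), ma.L2 (theta2), sigma2
```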

Comparison of Common Models

Selecting the appropriate time series model is pivotal for accurate forecasting and analysis. Below is a summary table comparing commonly used time series models, highlighting their components, use cases, assumptions, strengths, and limitations.

| Model | Components | Use Case | Key Assumptions | Strengths | Limitations |
|-------|------------|----------|-----------------|-----------|-------------|
| AR (Autoregressive) | Past values of the series ($Y_{t-1}, Y_{t-2}, \dots$) | Captures relationships between past values of the series. | Stationarity (constant mean/variance over time). | Simple to interpret; effective for stationary data. | Ineffective for non-stationary data or irregular patterns. |
| MA (Moving Average) | Past forecast errors ($\varepsilon_{t-1}, \varepsilon_{t-2}, \dots$) | Models influence of random shocks (errors) on the series. | Stationarity; residuals are white noise. | Captures short-term dependencies caused by noise. | Requires accurate identification of significant error lags. |
| ARMA (AR + MA) | Combines AR and MA components ($p, q$) | Models both past values and past forecast errors. | Stationarity; linear relationships in data. | Balances modeling of past values and shocks. | Struggles with data exhibiting trends or seasonality. |
| ARIMA (Autoregressive Integrated Moving Average) | AR + MA + differencing ($p, d, q$) | Handles non-stationary data by differencing. | Differencing converts the data to stationary. | Versatile; applicable to a wide range of stationary and non-stationary series. | Selecting appropriate $p, d, q$ can be challenging. |
| SARIMA (Seasonal ARIMA) | ARIMA + seasonal terms ($P, D, Q, m$) | Models seasonal patterns in addition to trends and noise. | Seasonality is stable and periodic (fixed frequency). | Ideal for seasonal data with trends. | Computationally intensive; requires specification of seasonal terms. |
| SES (Simple Exponential Smoothing) | Weighted average of past observations | Forecasts data without trends or seasonality (level only). | Data has no trend or seasonality; relies on exponential weighting. | Easy to use; effective for flat, stationary series. | Ineffective for data with trends or seasonality. |
| Holt's Linear | SES + trend component | Models level and trend for forecasting. | Additive linear trend (no seasonality). | Suitable for data with trends but no seasonality. | Fails if seasonality is present. |
| Holt-Winters | SES + trend + seasonality components | Models level, trend, and seasonality. | Additive or multiplicative seasonality; periodic patterns are consistent over time. | Captures complex patterns in data. | Requires stable seasonal structure. |
| ETS (Error-Trend-Seasonality) | Exponential smoothing framework | Flexible model for level, trend, and seasonality. | Error, trend, and seasonality are modeled explicitly. | Automatically selects the best smoothing model. | Less interpretable than ARIMA-type models. |
| VAR (Vector Autoregression) | Multivariate time series (relationships between multiple series) | Models relationships between two or more time series. | All series must be stationary; interdependence is linear. | Handles interdependent series; suitable for causal analysis. | Complex; requires all series to be stationary and interrelated. |
| ARCH (Autoregressive Conditional Heteroskedasticity) | Variance of errors depends on past squared errors. | Models volatility clustering in financial/economic data. | Errors exhibit changing variance (heteroskedasticity). | Excellent for analyzing volatility in returns or prices. | Assumes specific forms of variance dynamics. |
| GARCH (Generalized ARCH) | Extends ARCH with lagged variance terms. | Captures long-term and short-term volatility in data. | Errors have heteroskedasticity and correlations in variance. | Flexible; captures complex volatility patterns. | Requires careful parameter tuning. |
| TBATS (Exponential Smoothing State Space Model) | Exponential smoothing + trend + seasonality + Box-Cox transformation | Models complex seasonal patterns (e.g., multiple seasonalities). | Handles irregular and multiple seasonalities. | Flexible for advanced forecasting scenarios. | Computationally intensive. |
| Prophet (Facebook) | Trend + seasonality + holidays | Forecasts with irregular data and explicit handling of external events. | Assumes linear or logistic growth; holidays/events are known and well-defined. | User-friendly; handles missing data and holidays. | Less precise for short-term, high-frequency data. |
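As a rough orientation, the sketch below shows where several of the models in the table live in statsmodels (the toy series and hyperparameters are placeholders, not recommendations):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Toy monthly-style series: upward trend plus a 12-period seasonal cycle
t = np.arange(48)
y = 0.1 * t + np.sin(2 * np.pi * t / 12) + np.random.default_rng(0).normal(0, 0.1, 48)

arima = ARIMA(y, order=(1, 1, 1)).fit()                                   # ARIMA(p, d, q)
sarima = SARIMAX(y, order=(1, 1, 1),
                 seasonal_order=(1, 1, 1, 12)).fit(disp=False)            # SARIMA
holt_winters = ExponentialSmoothing(y, trend="add", seasonal="add",
                                    seasonal_periods=12).fit()            # Holt-Winters

print(arima.forecast(6), sarima.forecast(6), holt_winters.forecast(6))
```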
