Last modified: August 08, 2019
Moving Average (MA) models are a fundamental class of univariate time series models used for forecasting and understanding temporal data. Unlike Autoregressive (AR) models, which rely on past values of the series itself, MA models utilize past forecast errors to model the current value of the series. This approach is particularly effective for capturing short-term dependencies and abrupt changes in the data.
An MA model expresses the current value of the time series as a linear combination of past error terms and a constant mean. This method can be visualized similarly to a low-pass filter in signal processing, where high-frequency noise is smoothed out to reveal the underlying trend.
A Moving Average model of order $q$, denoted as MA($q$), is defined by the following equation:
$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$
Alternatively, using summation notation:
$$Y_t = \mu + \sum_{i=0}^{q} \theta_i \varepsilon_{t-i}, \quad \theta_0 = 1$$
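where:
- $Y_t$ is the value of the series at time $t$.
- $\mu$ is the mean of the series.
- $\varepsilon_t$ is the white-noise error term at time $t$, with mean zero and constant variance $\sigma^2$.
- $\theta_1, \dots, \theta_q$ are the moving average coefficients.
- $q$ is the order of the model, that is, the number of lagged error terms.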
An MA(1) model incorporates the current error term and the immediately preceding error term:
$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1}$$
An MA(2) model includes the current error term and the two most recent error terms:
$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}$$
A $q$-th order MA model extends this concept by including $q$ lagged error terms:
$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$
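To make the definition concrete, here is a minimal simulation sketch of an MA($q$) process using only the Python standard library. The function name `simulate_ma`, the Gaussian choice for the errors, and the default `sigma=1.0` are assumptions made for illustration; the definition itself only requires white-noise errors.

```python
import random

def simulate_ma(mu, thetas, n, sigma=1.0, seed=0):
    """Simulate n values of Y_t = mu + eps_t + sum_i thetas[i-1] * eps_{t-i}.

    Assumption: the errors are Gaussian white noise with standard deviation sigma.
    """
    rng = random.Random(seed)
    q = len(thetas)
    # Draw q extra shocks so the first observation already has q lagged errors.
    eps = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    series = []
    for t in range(q, n + q):
        # thetas[i] is theta_{i+1}, which multiplies eps_{t-(i+1)}.
        lagged = sum(thetas[i] * eps[t - 1 - i] for i in range(q))
        series.append(mu + eps[t] + lagged)
    return series

# Example: an MA(1) process with mu = 10 and theta_1 = 0.5.
y = simulate_ma(mu=10.0, thetas=[0.5], n=5000)
sample_mean = sum(y) / len(y)
print(round(sample_mean, 2))  # should be close to mu = 10
```

Passing a longer `thetas` list gives a higher-order model, for example `thetas=[0.5, 0.25]` for an MA(2).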
Understanding the theoretical properties of MA models is crucial for effective modeling and forecasting. Below, we outline the key properties, particularly focusing on the MA(1) model as an example.
For an MA(1) model defined as:
$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1}$$
the following properties hold:
I. Mean:
$$\mathbb{E}[Y_t] = \mu$$
II. Variance:
$$\text{Var}(Y_t) = \sigma^2 (1 + \theta_1^2)$$
III. Autocorrelation Function (ACF):
The ACF measures the correlation between $Y_t$ and $Y_{t-h}$ for different lags $h$.
Lag 1 ($h = 1$):
$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2}$$
Lags $h \geq 2$:
$$\rho_h = 0$$
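These results follow from the white-noise assumptions alone. A short derivation sketch, assuming the errors are mutually uncorrelated with mean zero and common variance $\sigma^2$, and writing $\gamma_h = \text{Cov}(Y_t, Y_{t-h})$ for the autocovariance at lag $h$:

$$\gamma_0 = \text{Var}(\varepsilon_t + \theta_1 \varepsilon_{t-1}) = \sigma^2 + \theta_1^2 \sigma^2 = \sigma^2 (1 + \theta_1^2)$$

$$\gamma_1 = \text{Cov}(\varepsilon_t + \theta_1 \varepsilon_{t-1},\ \varepsilon_{t-1} + \theta_1 \varepsilon_{t-2}) = \theta_1 \text{Var}(\varepsilon_{t-1}) = \theta_1 \sigma^2$$

$$\gamma_h = 0 \quad \text{for} \quad h \geq 2$$

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\theta_1 \sigma^2}{\sigma^2 (1 + \theta_1^2)} = \frac{\theta_1}{1 + \theta_1^2}$$

For $h \geq 2$, $Y_t$ and $Y_{t-h}$ share no error term, which is why the autocovariance vanishes.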
The ACF of an MA($q$) model can be non-zero up to lag $q$ and is exactly zero beyond it:
$$\rho_h = 0 \quad \text{for} \quad h > q$$
This truncation property of the ACF is instrumental in determining the appropriate order $q$ for an MA model.
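The cutoff can be checked numerically. The sketch below computes the theoretical ACF of an MA($q$) model from its coefficients using the standard result $\rho_h = \sum_{i=0}^{q-h} \theta_i \theta_{i+h} / \sum_{i=0}^{q} \theta_i^2$ with $\theta_0 = 1$. The function name `ma_theoretical_acf` is illustrative and not taken from any library.

```python
def ma_theoretical_acf(thetas, max_lag):
    """Return [rho_0, ..., rho_max_lag] for an MA(q) model with coefficients thetas."""
    psi = [1.0] + list(thetas)          # psi[0] = 1 is the weight on eps_t
    q = len(thetas)
    gamma0 = sum(p * p for p in psi)    # Var(Y_t) divided by sigma^2
    acf = []
    for h in range(max_lag + 1):
        if h > q:
            acf.append(0.0)             # no shared error terms beyond lag q
        else:
            cov = sum(psi[i] * psi[i + h] for i in range(q - h + 1))
            acf.append(cov / gamma0)
    return acf

ma1 = ma_theoretical_acf([0.5], max_lag=3)         # lag 1 equals 0.5 / 1.25
ma2 = ma_theoretical_acf([0.5, 0.25], max_lag=4)   # non-zero at lags 1 and 2 only
print(ma1)
print(ma2)
```

For the MA(1) case this reproduces $\rho_1 = \theta_1 / (1 + \theta_1^2)$, and every lag above $q$ is returned as zero.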
Selecting the correct order $q$ is essential for building an effective MA model. The process involves analyzing the ACF and, in some cases, the Partial Autocorrelation Function (PACF).
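I. Plot the Autocorrelation Function (ACF):
Identify the lag beyond which the sample autocorrelations are no longer significant; that lag is a candidate for the order $q$.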
II. Estimate Models with Varying Orders:
Fit MA models with different orders $q$ (e.g., MA(1), MA(2), MA(3), etc.).
III. Compare Model Fits Using Information Criteria:
Akaike Information Criterion (AIC):
$$\text{AIC} = -2 \ln(L) + 2k$$
Bayesian Information Criterion (BIC):
$$\text{BIC} = -2 \ln(L) + k \ln(n)$$
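where:
- $L$ is the maximized value of the model's likelihood function.
- $k$ is the number of estimated parameters.
- $n$ is the number of observations.
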
Lower AIC or BIC values indicate a better balance between model fit and complexity.
IV. Select the Model with the Lowest AIC/BIC:
Choose the MA($q$) model that minimizes the chosen information criterion.
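The two criteria are simple to compute once a fitted model reports its maximized log-likelihood. The following sketch implements the formulas above directly; the log-likelihood values in the usage lines are hypothetical numbers chosen for illustration, not results from a real fit.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion: -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * k

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: -2 ln(L) + k ln(n)."""
    return -2.0 * log_likelihood + k * math.log(n)

# Hypothetical example: two models fitted to n = 100 observations.
# Model A has 3 parameters, model B has 4 and a slightly higher log-likelihood.
n_obs = 100
aic_a, bic_a = aic(-100.0, 3), bic(-100.0, 3, n_obs)
aic_b, bic_b = aic(-99.5, 4), bic(-99.5, 4, n_obs)
print(aic_a, aic_b)
print(round(bic_a, 2), round(bic_b, 2))
```

Because $\ln(n) > 2$ once $n \geq 8$, BIC penalizes each extra parameter more heavily than AIC for all but very short series.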
Suppose the ACF plot of a time series shows significant autocorrelations at lags 1 and 2, with autocorrelations near zero for lags $h \geq 3$. This pattern suggests considering an MA(2) model.
After fitting MA models of orders 1, 2, and 3, you obtain the following information criteria:
| Model | AIC | BIC |
|-------|-----|-----|
| MA(1) | 200 | 205 |
| MA(2) | 190 | 195 |
| MA(3) | 192 | 200 |
Both AIC and BIC are minimized at MA(2), indicating that an MA(2) model is the most appropriate choice for the data.
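The same comparison can be done mechanically. A minimal sketch that stores the values from the table in a dictionary (the variable names are illustrative) and picks the model that minimizes each criterion:

```python
# Information criteria from the example table.
criteria = {
    "MA(1)": {"AIC": 200, "BIC": 205},
    "MA(2)": {"AIC": 190, "BIC": 195},
    "MA(3)": {"AIC": 192, "BIC": 200},
}

best_by_aic = min(criteria, key=lambda model: criteria[model]["AIC"])
best_by_bic = min(criteria, key=lambda model: criteria[model]["BIC"])
print(best_by_aic, best_by_bic)  # MA(2) MA(2)
```

When the two criteria disagree, BIC tends to favor the smaller model because of its heavier penalty on the number of parameters.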
To illustrate the concepts of Moving Average (MA) models, consider the following example of an MA(1) model.
Suppose we have an MA(1) model defined as:
$$Y_t = \mu + w_t + \theta_1 w_{t-1}$$
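Where:
- $w_t$ is the white-noise error term at time $t$.
- $\mu$ is the mean of the series.
- $\theta_1$ is the moving average coefficient.

Given Parameters:
- $\mu = 10$
- $\theta_1 = 0.5$
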
Thus, the MA(1) model becomes:
$$Y_t = 10 + w_t + 0.5w_{t-1}$$
The Autocorrelation Function (ACF) for an MA($q$) model has non-zero autocorrelations up to lag $q$ and zero autocorrelations beyond that. Specifically, for an MA(1) model:
I. Lag 1 ($h = 1$):
$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2} = \frac{0.5}{1 + (0.5)^2} = \frac{0.5}{1.25} = 0.4$$
II. Lags $h \geq 2$:
$$\rho_h = 0 \quad \text{for} \quad h \geq 2$$
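Visual Representation: a theoretical ACF plot for this model shows a bar of height 0.4 at lag 1 and bars of height zero at all higher lags.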
Note: In practice, sample ACF plots may not perfectly align with theoretical expectations due to randomness and finite sample sizes. However, the characteristic pattern—significant autocorrelation at lag 1 and near-zero autocorrelations at higher lags—serves as a strong indicator for identifying the order of an MA model.
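The gap between theoretical and sample ACF can be seen in a small experiment. The sketch below simulates the example model $Y_t = 10 + w_t + 0.5 w_{t-1}$ and computes its sample ACF with the standard library only. It assumes standard normal $w_t$ (the example does not fix the noise variance), and `sample_acf` is an illustrative helper rather than a library function.

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag using the usual biased estimator."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - h] for t in range(h, n)) / c0
        for h in range(max_lag + 1)
    ]

rng = random.Random(42)
n = 20000
# Assumption: w_t is standard normal white noise.
w = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
y = [10 + w[t] + 0.5 * w[t - 1] for t in range(1, n + 1)]

acf = sample_acf(y, max_lag=4)
band = 1.96 / n ** 0.5  # approximate 95% band for a white-noise series
for h, r in enumerate(acf):
    print(h, round(r, 3), "significant" if h > 0 and abs(r) > band else "")
```

The lag-1 estimate should land near the theoretical value of 0.4, while the higher lags should scatter around zero, occasionally crossing the band by chance.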
The Simple Moving Average (SMA) is the unweighted mean of the previous $k$ data points. It is used to smooth a data series and identify trends over time. The formula for SMA is:
$$ SMA_t = \frac{1}{k} \sum_{i=0}^{k-1} y_{t-i} $$
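where:
- $y_{t-i}$ is the observation at time $t - i$.
- $k$ is the number of data points in the averaging window.

SMA reduces the noise in the data so that the underlying trend is easier to see.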
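A minimal sketch of the SMA formula in Python follows. It keeps a running window sum so each new value costs constant time. The function name and the choice to return values only once a full window of $k$ points is available are assumptions made for this illustration.

```python
def simple_moving_average(values, k):
    """Return the k-point SMA for every position that has a full window."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    averages = []
    window_sum = 0.0
    for t, value in enumerate(values):
        window_sum += value
        if t >= k:
            window_sum -= values[t - k]   # drop the point that left the window
        if t >= k - 1:
            averages.append(window_sum / k)
    return averages

sma = simple_moving_average([1, 2, 3, 4, 5], k=3)
print(sma)  # [2.0, 3.0, 4.0]
```

A larger $k$ gives a smoother but slower-reacting line, since each new observation contributes only $1/k$ of the average.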
The Exponential Moving Average (EMA) places a greater weight on more recent data points, making it more responsive to new information. The formula for EMA is:
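$$ EMA_{t} = \alpha \, y_t + (1 - \alpha) \cdot {EMA}_{t-1} $$
where:
- $y_t$ is the observation at time $t$.
- $EMA_{t-1}$ is the EMA at time $t - 1$.
- $\alpha$ is the smoothing factor, with $0 < \alpha \leq 1$.

A higher $\alpha$ places more weight on recent observations, so the EMA tracks the latest changes more closely.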
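The recursion can be sketched in a few lines of Python. This version uses the convention in which $\alpha$ multiplies the newest observation, so that a larger $\alpha$ follows recent data more closely. Seeding the recursion with the first observation is an assumption; other initializations (such as an initial SMA) are also used in practice.

```python
def exponential_moving_average(values, alpha):
    """EMA with EMA_t = alpha * y_t + (1 - alpha) * EMA_{t-1}.

    Assumption: the recursion is seeded with the first observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ema = []
    for y in values:
        if not ema:
            ema.append(float(y))
        else:
            ema.append(alpha * y + (1.0 - alpha) * ema[-1])
    return ema

smooth = exponential_moving_average([1, 2, 3], alpha=0.5)
print(smooth)  # [1.0, 1.5, 2.25]
```

With `alpha=1.0` the EMA simply reproduces the input, and as `alpha` approaches zero it barely moves away from its starting value.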
When analyzing stock prices, technical indicators like the Simple Moving Average (SMA) and the Exponential Moving Average (EMA) are frequently employed. These methods help to smooth out price data over a specified period and can be crucial in identifying trends.
The Simple Moving Average (SMA) is a calculation that takes the arithmetic mean of a given set of prices over a specific number of days in the past; for instance, over the previous 20 days.
$$\text{SMA} = \frac{P_1 + P_2 + \dots + P_{20}}{20}$$
Here, $P_1$, $P_2$, ..., $P_{20}$ represent the stock prices for each of the 20 days.
The 20-day SMA helps smooth out short-term fluctuations in stock prices, providing a clearer view of the overall price trend.
The Exponential Moving Average (EMA) gives more weight to more recent prices. This sensitivity to newer prices makes the EMA more responsive to price changes. Unlike the SMA, the EMA applies a weighting factor to each day's price depending on its recency.
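$$\text{EMA}(T) = \text{Price}(T) \cdot k + \text{EMA}(Y) \cdot (1 - k)$$
Where:
- $\text{Price}(T)$ is today's price.
- $\text{EMA}(Y)$ is yesterday's EMA.
- $k$ is the weighting factor, between 0 and 1; a common convention for an $N$-day EMA is $k = 2 / (N + 1)$.
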
The EMA is valuable for capturing more recent trends and is often used for shorter time frames.
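To see the difference in responsiveness, the sketch below applies both indicators to a hypothetical price series that jumps from 100 to 110. The prices are invented for illustration, the window is shortened to $N = 5$ days to keep the example small, and the weighting factor $k = 2 / (N + 1)$ is a widely used convention assumed here rather than something the formula above requires.

```python
# Hypothetical closing prices: flat at 100, then a jump to 110 for three days.
prices = [100.0] * 10 + [110.0] * 3

N = 5                      # window length in days (shortened for illustration)
k = 2 / (N + 1)            # assumed convention for the EMA weighting factor

# SMA over the most recent N prices.
sma = sum(prices[-N:]) / N

# EMA updated day by day, seeded with the first price (an assumption).
ema = prices[0]
for price in prices[1:]:
    ema = price * k + ema * (1 - k)

print(round(sma, 2), round(ema, 2))
```

Three days after the jump the EMA has moved further toward the new price level than the SMA, because the SMA still gives the two older 100-level prices in its window the same weight as the newest ones.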