*Last modified: September 16, 2024*

*This article is written in: 🇺🇸*

## Stationarity in Time Series

Stationarity is a fundamental concept in time series analysis. A time series is considered **stationary** if its statistical properties, such as mean, variance, and autocovariance, remain constant over time. Stationary processes are crucial in time series modeling because many methods, such as ARMA models (and ARIMA models once the series has been differenced), assume stationarity.

Stationarity can be classified into two types:

- Strict stationarity implies that the **entire distribution of the process remains the same over time**.
- Weak stationarity, also known as second-order stationarity, requires only that the **mean, variance, and autocovariance remain time-invariant**.

### Intuition for Stationary Time Series

A **stationary time series** behaves similarly over time, meaning:

- The mean of the series shows **no trend** and does not systematically change over time.
- The variability around the mean has a **constant variance**, remaining stable throughout.
- There are **no periodic fluctuations** such as seasonality or cyclic behavior, unless explicitly modeled.

This means that the statistical properties of one segment of the series are similar to those of any other segment, allowing us to predict future behavior based on past data.
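As a rough illustration of this segment-comparison idea, the sketch below compares the sample means of the two halves of a series. The series, the trend slope of 0.001, and the helper name `half_means` are assumptions chosen for the example, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Stationary example: Gaussian white noise with mean 0 and variance 1.
noise = rng.normal(loc=0.0, scale=1.0, size=n)

# Non-stationary example: the same noise with a linear trend added.
trended = noise + 0.001 * np.arange(n)


def half_means(x):
    """Return the sample means of the first and second half of x."""
    mid = len(x) // 2
    return x[:mid].mean(), x[mid:].mean()


noise_first, noise_second = half_means(noise)
trend_first, trend_second = half_means(trended)

print(f"white noise halves: {noise_first:.3f}, {noise_second:.3f}")
print(f"trended halves:     {trend_first:.3f}, {trend_second:.3f}")
```

For the white noise, the two halves should have nearly the same mean, while the trended series shows a clear shift between halves. Similar segment statistics are only an informal check and do not prove stationarity.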

### Strict Stationarity

A process is said to be **strictly stationary** if the joint distribution of any subset of observations $X_{t_1}, X_{t_2}, \dots, X_{t_k}$ is the same as the distribution of $X_{t_1 + \tau}, X_{t_2 + \tau}, \dots, X_{t_k + \tau}$ for all $\tau$.

In simple terms, the process looks the same no matter how we shift it in time. Strict stationarity implies that:

- The distribution of $X_t$ does not change over time.
- All moments of the distribution that exist (mean, variance, higher moments) are constant over time.

### Weak (Second-Order) Stationarity

Weak stationarity, also known as **second-order stationarity**, requires only that the **first two moments** (mean and variance) are constant over time and that the **autocovariance** depends solely on the lag between observations, not on time itself.

A time series $\{X_t\}$ is weakly stationary if:

- The **mean** of the series is constant: $E[X_t] = \mu$ for all $t$.
- The **variance** is constant: $\text{Var}(X_t) = \sigma^2$ for all $t$.
- The **autocovariance** between $X_t$ and $X_{t+k}$ depends only on the lag $k$, not on $t$:

$$ \text{Cov}(X_t, X_{t+k}) = \gamma(k) $$

Weak stationarity is sufficient for most time series models: it ensures that the mean and variance remain stable over time and that the dependence between observations is governed only by their lag, which makes the process easier to model and analyze.
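In practice $\gamma(k)$ is estimated from a single observed series. The following is a minimal sketch of one common estimator, which divides by $n$ rather than $n - k$. The function name `sample_autocovariance` and the test series are assumptions made for illustration.

```python
import numpy as np


def sample_autocovariance(x, k):
    """Estimate gamma(k) = Cov(X_t, X_{t+k}) from one realization x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = abs(k)  # gamma(-k) = gamma(k) for a weakly stationary process
    if k >= n:
        raise ValueError("lag must be smaller than the series length")
    centered = x - x.mean()
    # Dividing by n (not n - k) is a common convention; it is slightly biased.
    return np.dot(centered[: n - k], centered[k:]) / n


rng = np.random.default_rng(1)
x = rng.normal(0.0, 2.0, size=20_000)  # white noise with sigma^2 = 4

gamma0 = sample_autocovariance(x, 0)  # estimates the variance
gamma1 = sample_autocovariance(x, 1)  # should be close to zero here

print(f"gamma(0) ~ {gamma0:.3f}, gamma(1) ~ {gamma1:.3f}")
```

The estimate only makes sense if the series is at least approximately weakly stationary, because it pools pairs $(X_t, X_{t+k})$ from all time points $t$.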

### Properties of Stationary Processes

#### Mean, Variance, and Autocovariance Functions

To analyze a stationary process, we focus on three key functions:

- The **mean function** $\mu(t) = E[X_t]$ represents the expected value of the process at time $t$, and for a stationary process, this should remain constant.
- The **variance function** $\sigma^2(t) = \text{Var}(X_t)$ gives the variance at time $t$, which must also be constant for stationarity.
- The **autocovariance function** $\gamma(k) = \text{Cov}(X_t, X_{t+k})$ measures how the process correlates with itself at different time lags $k$, and for a stationary process, it depends only on the lag $k$, not on time $t$.

#### Autocorrelation and Bounds

For a weakly stationary process, the **autocorrelation function** $\rho(k)$, which measures the correlation between two points in the series separated by lag $k$, is bounded by -1 and 1:

$$ -1 \leq \rho(k) \leq 1 $$

This bound follows from the Cauchy-Schwarz inequality applied to the centered random variables $X_t - \mu$ and $X_{t+k} - \mu$.
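Spelled out, a short derivation sketch looks as follows, writing $\rho(k) = \gamma(k) / \gamma(0)$ and assuming $\gamma(0) > 0$:

$$ |\gamma(k)| = \left| E\big[(X_t - \mu)(X_{t+k} - \mu)\big] \right| \leq \sqrt{E\big[(X_t - \mu)^2\big] \, E\big[(X_{t+k} - \mu)^2\big]} = \sqrt{\gamma(0) \, \gamma(0)} = \gamma(0) $$

Dividing both sides by $\gamma(0)$ gives $|\rho(k)| \leq 1$.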

### Examples of Stationary Processes

#### White Noise

White noise is the simplest example of a stationary process. It is defined as a sequence of uncorrelated, identically distributed random variables:

$$ X_t \sim \mathcal{N}(0, \sigma^2) $$

Properties of white noise:

- The **mean** is constant: $E[X_t] = 0$.
- The **variance** is constant: $\text{Var}(X_t) = \sigma^2$.
- The **autocovariance** function is:

$$ \gamma(k) = \begin{cases} \sigma^2 & \text{if } k = 0 \\ 0 & \text{if } k \neq 0 \end{cases} $$

- The **autocorrelation** function is:

$$ \rho(k) = \begin{cases} 1 & \text{if } k = 0 \\ 0 & \text{if } k \neq 0 \end{cases} $$

Thus, white noise is a stationary process because its mean and variance are constant, and its autocovariance depends only on the lag.
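These properties can be checked approximately by simulation. The sketch below draws Gaussian white noise and estimates $\rho(k)$ for the first few lags. The value $\sigma = 1.5$, the sample size, and the helper name `sample_autocorrelation` are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.5
n = 50_000
x = rng.normal(0.0, sigma, size=n)  # Gaussian white noise


def sample_autocorrelation(x, k):
    """Estimate rho(k) = gamma(k) / gamma(0) from one realization x."""
    centered = x - x.mean()
    gamma0 = np.dot(centered, centered) / len(x)
    gamma_k = np.dot(centered[: len(x) - k], centered[k:]) / len(x)
    return gamma_k / gamma0


rho = [sample_autocorrelation(x, k) for k in range(6)]

print("sample mean:    ", round(x.mean(), 3))
print("sample variance:", round(x.var(), 3))
print("sample rho(0..5):", np.round(rho, 3))
```

The estimate at lag 0 is exactly 1 by construction. The estimates at the other lags should be small but not exactly zero, since they are computed from a finite sample.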

#### Moving Average (MA) Process

A **moving average (MA) process** of order $q$, denoted as MA(q), is another example of a weakly stationary process. It is defined as:

$$ X_t = \beta_0 Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q} $$

where $Z_t \sim \mathcal{N}(0, \sigma_Z^2)$ are independent white noise terms.

For an MA(q) process:

- The **mean** is zero: $E[X_t] = 0$.
- The **variance** is constant:

$$ \text{Var}(X_t) = \sigma_Z^2 \sum_{i=0}^{q} \beta_i^2 $$

- The **autocovariance** function $\gamma(k)$ depends only on the lag $k$. For $k \geq 0$ (with $\gamma(-k) = \gamma(k)$):

$$ \gamma(k) = \begin{cases} \sigma_Z^2 \sum_{i=0}^{q-k} \beta_i \beta_{i+k} & \text{if } 0 \leq k \leq q \\ 0 & \text{if } k > q \end{cases} $$

The autocorrelation function $\rho(k)$ is obtained by normalizing the autocovariance by the variance:

$$ \rho(k) = \frac{\gamma(k)}{\gamma(0)} $$

The MA(q) process is weakly stationary because its mean and variance are constant, and the autocovariance depends only on the lag.
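To see the autocovariance formula at work, the sketch below compares the theoretical $\gamma(k)$ with sample estimates from a simulated MA(2) series. The coefficients `beta = [1.0, 0.6, 0.3]` and the helper name `ma_autocovariance` are hypothetical choices made for illustration.

```python
import numpy as np


def ma_autocovariance(beta, sigma_z2, k):
    """Theoretical gamma(k) for X_t = sum_i beta[i] * Z_{t-i}."""
    beta = np.asarray(beta, dtype=float)
    q = len(beta) - 1
    k = abs(k)
    if k > q:
        return 0.0
    # sum_{i=0}^{q-k} beta_i * beta_{i+k}
    return sigma_z2 * float(np.dot(beta[: q - k + 1], beta[k:]))


beta = [1.0, 0.6, 0.3]  # hypothetical coefficients beta_0, beta_1, beta_2
sigma_z2 = 1.0

rng = np.random.default_rng(3)
n = 200_000
z = rng.normal(0.0, np.sqrt(sigma_z2), size=n + len(beta) - 1)

# "valid" convolution gives X_t = beta_0 Z_t + beta_1 Z_{t-1} + beta_2 Z_{t-2}
x = np.convolve(z, beta, mode="valid")

centered = x - x.mean()
theory = [ma_autocovariance(beta, sigma_z2, k) for k in range(4)]
samples = [np.dot(centered[: n - k], centered[k:]) / n for k in range(4)]

for k in range(4):
    print(f"k={k}: theory={theory[k]:.3f}, sample={samples[k]:.3f}")
```

With these coefficients the formula gives $\gamma(0) = 1.45$, $\gamma(1) = 0.78$, $\gamma(2) = 0.30$, and $\gamma(3) = 0$. The sample values should be close to these but will not match exactly.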

### Non-Stationary Processes

#### Random Walk

A **random walk** is an example of a non-stationary process. A random walk can be written as:

$$ X_t = X_{t-1} + Z_t $$

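where the $Z_t$ are independent noise terms with mean $\mu$ and variance $\sigma^2$, and the walk starts at $X_0 = 0$. With $\mu = 0$ the increments are white noise as defined above; with $\mu \neq 0$ the process is a random walk with drift.

For a random walk:

The **mean** grows over time whenever $\mu \neq 0$:

$$ E[X_t] = t \cdot \mu $$

The **variance** increases with time:

$$ \text{Var}(X_t) = t \cdot \sigma^2 $$

Since the variance (and, with drift, the mean) depends on time, the random walk is **not stationary**. However, applying a **difference operator** can transform a random walk into a stationary series.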
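The growth of the variance can be checked by simulating many independent random walks and measuring the spread across paths at each time step. This is a sketch under the assumptions of zero-mean Gaussian steps, $X_0 = 0$, and $\sigma = 1$. The number of paths and steps is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, n_steps = 5_000, 200
sigma = 1.0

# Each row is one random walk started at X_0 = 0 with zero-mean steps.
steps = rng.normal(0.0, sigma, size=(n_paths, n_steps))
walks = np.cumsum(steps, axis=1)

# Variance across paths at each time t = 1, ..., n_steps.
var_by_time = walks.var(axis=0)

for t in (10, 50, 200):
    print(
        f"t={t}: sample variance={var_by_time[t - 1]:.1f}, "
        f"t*sigma^2={t * sigma**2:.1f}"
    )
```

The sample variance across paths should track $t \cdot \sigma^2$, which is exactly the time dependence that rules out stationarity.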

#### Differencing to Remove Non-Stationarity

To handle non-stationary processes like random walks, we can apply the **difference operator** $\Delta$, which removes trends and transforms the process into a stationary one.

The difference operator is defined as $\Delta X_t = X_t - X_{t-1}$. For the random walk above, this gives:

$$ \Delta X_t = X_t - X_{t-1} = Z_t $$

By differencing the series, we convert a random walk into white noise, which is stationary. This technique is essential for models like ARIMA that require the data to be stationary before modeling.
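A quick numerical check of this identity, assuming a random walk built with `np.cumsum` from Gaussian noise (the seed and length are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
z = rng.normal(0.0, 1.0, size=1_000)

# Random walk with X_0 = 0, so the array holds X_1, ..., X_n.
x = np.cumsum(z)

# np.diff computes X_t - X_{t-1} for consecutive stored values. The first
# noise term is not recovered because X_0 is not stored in the array.
dx = np.diff(x)

# Up to floating-point rounding, the differences equal the remaining noise.
recovered = np.allclose(dx, z[1:])
print("differences match the noise terms:", recovered)
```

Note that differencing shortens the series by one observation, which matters when aligning the differenced series with other data.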

### Dealing with Non-Stationary Time Series

In real-world applications, many time series are non-stationary. To apply statistical models that require stationarity, we often use **transformations** such as:

- Applying **differencing** helps remove trends and makes the series stationary.
- Using **logarithmic transformations** can stabilize the variance in the series.
- The process of **detrending** removes long-term trends, allowing a focus on short-term fluctuations.
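The three transformations can be sketched on a synthetic series. The example assumes a hypothetical series with exponential growth and multiplicative noise, and it detrends by fitting a straight line with `np.polyfit`. Real data may call for a different trend model.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
t = np.arange(n)

# Hypothetical series: exponential growth with multiplicative noise, so both
# the level and the spread of y increase over time.
y = np.exp(0.01 * t + rng.normal(0.0, 0.1, size=n))

# 1. Log transform: multiplicative noise becomes additive with stable spread.
log_y = np.log(y)

# 2. Detrending: fit a straight line to the log series and subtract it.
slope, intercept = np.polyfit(t, log_y, deg=1)
detrended = log_y - (slope * t + intercept)

# 3. Differencing: first differences of the log series remove the trend too.
diff_log_y = np.diff(log_y)

print(f"fitted slope:           {slope:.4f}")
print(f"detrended std:          {detrended.std():.3f}")
print(f"mean of log difference: {diff_log_y.mean():.4f}")
```

Here the fitted slope should be close to the growth rate of 0.01 used to build the series, and the detrended residuals should have a spread close to the noise level of 0.1.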

#### Example of Differencing in Python

We can use Python to difference a non-stationary series like a random walk:

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulate a random walk
np.random.seed(42)
N = 1000
Z = np.random.normal(0, 1, N)
X = np.cumsum(Z)  # Random walk as cumulative sum of white noise

# Apply differencing to make it stationary
diff_X = np.diff(X)

# Plot the original random walk and the differenced series
plt.figure(figsize=(10, 6))

plt.subplot(2, 1, 1)
plt.plot(X, label='Random Walk')
plt.title('Random Walk (Non-Stationary)')
plt.grid(True)

plt.subplot(2, 1, 2)
plt.plot(diff_X, label='Differenced Series')
plt.title('Differenced Series (Stationary)')
plt.grid(True)

plt.tight_layout()
plt.show()
```

I. Simulating a Random Walk:

- A random walk is generated by taking the cumulative sum of normally distributed random numbers. This produces a series where each value depends on the previous one plus some random noise.
- The random walk is non-stationary because its variance grows over time and its level drifts unpredictably instead of returning to a fixed mean.

II. Differencing:

- Differencing transforms the non-stationary series into a stationary one by subtracting the previous observation from the current one. This removes any trend or long-term structure in the data.
- In Python, this is done using `np.diff()`, which takes the difference between consecutive elements of the series.

In the resulting plot, the upper panel shows the random walk (non-stationary), while the lower panel shows the differenced series (stationary). Differencing removes the trend from the original series, making it easier to model and predict future values.