ARIMA(p, d, q) models

Last time we combined the AR(p) and MA(q) models into the ARMA(p, q) model. We are getting quite close to understanding what ARFIMA processes are, but there are still a few things to look at first. This time let's take into account the letter I (which stands for "integrated") and look at ARIMA models.

Although we have seen in earlier posts that ARMA(p, q) models can produce non-stationary time series, fitting these models to non-stationary data is quite problematic. Therefore, in practice, ARMA(p, q) models are used to study and forecast only stationary time series [1].

In an earlier post we used differencing to transform a non-stationary random walk into an uncorrelated stationary time series (white noise). Not surprisingly, differencing is also behind ARIMA models.

In ARIMA(p, d, q) models, ARMA(p, q) is used to forecast not the time series itself (\( x_t \)), but the \( d \)-th difference of \( x_t \). If we are considering the second difference (\( d = 2 \)) of \( x_t \), then we have to use ARMA on the time series \( z_t \):

\begin{equation} z_t = y_t - y_{t-1} = ( x_t - x_{t-1} ) - ( x_{t-1} - x_{t-2} ) = x_t - 2 x_{t-1} + x_{t-2} . \end{equation}

In the above \( y_t = x_t - x_{t-1} \) is the first difference of \( x_t \).
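
As a quick check of the second-difference formula, here is a minimal NumPy sketch. The example series `x` is made up purely for illustration; the point is only that differencing twice agrees with the expanded form \( x_t - 2 x_{t-1} + x_{t-2} \).

```python
import numpy as np

# Hypothetical example series (any numeric series would do).
rng = np.random.default_rng(0)
x = np.cumsum(np.cumsum(rng.normal(size=200)))

y = np.diff(x)  # first difference:  y_t = x_t - x_{t-1}
z = np.diff(y)  # second difference: z_t = y_t - y_{t-1}

# The same thing written out term by term, and via np.diff with n=2.
z_formula = x[2:] - 2.0 * x[1:-1] + x[:-2]
z_direct = np.diff(x, n=2)
```

Note that each round of differencing shortens the series by one point, so `z` has two fewer values than `x`.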

Excellent, but what if you have \( z_t \) (e.g., you have forecasted values into the future) and you want to obtain \( x_t \)? Simply apply the cumulative sum \( d \) times (\( d = 2 \) in our example), starting from the known initial values \( y_0 \) and \( x_0 \):

\begin{equation} y_t = \sum_{i=1}^{t} z_i + y_0 , \quad x_t = \sum_{i=1}^{t} y_i + x_0 . \end{equation}
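
The inversion above can be sketched in NumPy as follows. The helper name `integrate` is an assumption made for this illustration, and the first retained value of each series plays the role of the initial values \( y_0 \) and \( x_0 \).

```python
import numpy as np

def integrate(diffs, initial):
    """Undo one round of differencing: prepend `initial`, then accumulate."""
    return np.concatenate(([initial], initial + np.cumsum(diffs)))

# Hypothetical example series, differenced twice (d = 2).
rng = np.random.default_rng(0)
x = np.cumsum(np.cumsum(rng.normal(size=100)))
y = np.diff(x)
z = np.diff(y)

# Apply the cumulative sum d = 2 times, supplying one initial value each time.
y_rec = integrate(z, y[0])
x_rec = integrate(y_rec, x[0])
```

Without the initial values the original series can only be recovered up to an unknown constant after the first summation and an unknown linear trend after the second.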

Forecasting is beyond the scope and interest of this blog, but you can use the app below to explore time series generated by the ARIMA model for given parameter values.
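
For readers who prefer code, a similar exploration can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `simulate_arima` is hypothetical, the noise is standard Gaussian, all pre-sample values are taken to be zero, and the MA terms enter with a plus sign (some texts use the opposite sign convention).

```python
import numpy as np

def simulate_arima(phi, theta, d, n, rng):
    """Generate n points: an ARMA(p, q) series summed cumulatively d times."""
    phi = np.asarray(phi, dtype=float)      # AR coefficients, length p
    theta = np.asarray(theta, dtype=float)  # MA coefficients, length q
    p, q = len(phi), len(theta)
    xi = rng.normal(size=n)                 # white noise
    z = np.zeros(n)
    for t in range(n):
        # Terms reaching before the start of the series are skipped (treated as zero).
        ar = sum(phi[i] * z[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * xi[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        z[t] = ar + xi[t] + ma
    x = z
    for _ in range(d):
        x = np.cumsum(x)                    # integrate once per order of differencing
    return x, xi

# Example: ARIMA(1, 1, 1) with illustrative parameter values.
series, noise = simulate_arima([0.5], [0.3], d=1, n=500, rng=np.random.default_rng(1))
```

Setting `d=0` returns the plain ARMA series, which makes it easy to compare the stationary and integrated versions generated from the same noise.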

For example, notice that ARIMA(1, 0, 0) and ARIMA(0, 1, 0) can both model a random walk. ARIMA(1, 0, 0) is equivalent to AR(1), and we have already seen that it models a random walk when its autoregressive coefficient is equal to 1. For ARIMA(0, 1, 0), ARMA(0, 0) is used to model the first differences of \( x_t \):

\begin{equation} y_t = \xi_t , \end{equation}

\begin{equation} x_t = \sum_{i=1}^t y_i + x_0 = \sum_{i=1}^t \xi_i + x_0 . \end{equation}

In the above we see just a cumulative sum (a discrete integral) of white noise, \( \xi_t \), which is in fact the definition of a random walk.
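
The two views of the random walk can be compared numerically. The sketch below assumes Gaussian white noise and \( x_0 = 0 \), and builds the same series once as a cumulative sum of noise (the ARIMA(0, 1, 0) view) and once through the AR(1) recursion with the coefficient set to 1 (the ARIMA(1, 0, 0) view).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
xi = rng.normal(size=n)  # white noise
x0 = 0.0                 # assumed initial value

# ARIMA(0, 1, 0): the first differences are white noise, so x_t is their cumulative sum.
x_integrated = x0 + np.cumsum(xi)

# ARIMA(1, 0, 0) with coefficient 1: x_t = 1 * x_{t-1} + xi_t.
x_ar = np.empty(n)
prev = x0
for t in range(n):
    prev = 1.0 * prev + xi[t]
    x_ar[t] = prev
```

Both constructions use the same noise, so the two series coincide, and differencing either one gives back the white noise.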


[1] E. E. Holmes, M. D. Scheuerell, E. J. Ward. Applied Time Series Analysis for Fisheries and Environmental Sciences. Edition 2021.