Tuesday, July 26, 2016

The Forecasting Performance of Models for Cointegrated Data

Here's an interesting practical question that arises when you're considering different forms of econometric models for forecasting time-series data:
"Which type of model will perform best when the data are non-stationary, and perhaps cointegrated?"
To answer this question we have to think about the alternative models that are available to us; and we also have to decide on what we mean by 'best'. In other words, we have to agree on some sort of loss function or performance criterion for measuring forecast quality.

Notice that the question I've posed above allows for the possibility that the data that we're using are integrated, and the various series we're working with may or may not be cointegrated. This scenario covers a wide range of commonly encountered situations in econometrics.

In an earlier post I discussed some of the basic "mechanics" of forecasting from an Error Correction Model. This type of model is used in the case where our data are non-stationary and cointegrated, and we want to focus on the short-run dynamics of the relationship that we're modelling. However, in that post I deliberately didn't take up the issue of whether or not such a model will out-perform other competing models when it comes to forecasting.

Let's look at that issue here.

We'll consider several types of econometric models that might be used with non-stationary time-series data, and we'll see what evidence there is in the literature regarding their relative merits with regard to forecasting performance.

(Keep in mind that forecasting is just one reason for estimating these models. A model type that performs best at forecasting need not necessarily be the optimal one for measuring marginal effects, for instance.)

In the following discussion I'll consider various regression models, with a dependent variable that we wish to forecast, and one or more regressors. We'll also look at single-variable time-series models in the ARIMA family, where only the past history of the data for the series to be predicted (say Yt) is used.

With regard to the latter, I'll consider two broad situations:
  • Yt is a stationary series.
  • Yt is non-stationary, but is difference-stationary. That is, it is I(d), for some d > 0.
In the case of regression models, three situations will be considered:
  1. All of the data are stationary.
  2. Some (or all) of the series are non-stationary, but are not cointegrated.
  3. The time-series are non-stationary, and are cointegrated.

In the case of the single-variable ARIMA-type models, the situation is straightforward. If Yt is stationary, then we can proceed with a standard Box-Jenkins ARIMA analysis, using the level of the Yt data. If Yt is integrated (of order one) then we just first-difference the data, and apply the usual ARIMA modelling to ΔYt. If Yt is seasonally integrated, then we just seasonally difference the data and go from there, and so on.
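To make the differencing step concrete, here is a minimal Python sketch. The helper name `difference` and the toy numbers are made up purely for illustration, and the seasonal lag of 4 simply assumes quarterly data.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for t = lag, ..., len(series) - 1."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy data (hypothetical): a repeating period-4 pattern plus an upward drift.
y = [10, 12, 15, 19, 14, 16, 19, 23]

first_diff = difference(y)             # ΔYt = Yt - Yt-1
seasonal_diff = difference(y, lag=4)   # Yt - Yt-4, assuming quarterly data

print(first_diff)
print(seasonal_diff)
```

The usual ARMA modelling would then be applied to whichever differenced series is stationary.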

Getting a forecast of Yt is trivial if the data are stationary. It's also straightforward if ΔYt is stationary - if ΔYt* is the forecast of ΔYt, then the static forecast of Yt itself is Yt* = ΔYt* + Yt-1; and the dynamic forecast of Yt is Yt* = ΔYt* + Yt-1*.
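The two recovery formulas can be sketched in a few lines of Python. The function name `recover_levels` and all of the numbers are hypothetical. The only difference between the two cases is whether the previous level comes from the actual data (static) or from the previous forecast (dynamic).

```python
def recover_levels(y_last, dy_forecasts, y_actual=None):
    """Turn forecasts of ΔY into forecasts of the level of Y.

    y_last       : last observed level of Y before the forecast period
    dy_forecasts : forecasts of ΔY, one per forecast period
    y_actual     : actual levels of Y over the forecast period; if supplied,
                   static forecasts are returned, otherwise dynamic ones
    """
    levels = []
    prev = y_last
    for h, dy in enumerate(dy_forecasts):
        y_hat = dy + prev              # Yt* = ΔYt* + (previous level)
        levels.append(y_hat)
        # Static: roll forward with the actual value of Y.
        # Dynamic: roll forward with the forecast we just made.
        prev = y_actual[h] if y_actual is not None else y_hat
    return levels


dy_star = [1.0, 2.0, -0.5]             # hypothetical forecasts of ΔY
dynamic = recover_levels(100.0, dy_star)
static = recover_levels(100.0, dy_star, y_actual=[102.0, 105.0, 104.0])

print(dynamic)
print(static)
```

The first forecast is the same in both cases, because the last observed level is known either way. The two sets of forecasts diverge only from the second period onwards.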

(For a discussion of "static" and "dynamic" forecasts, see this earlier post.)

Now, what about forecasts that are based on a regression model? To simplify the exposition, and without any loss of generality, suppose that the model is one that "explains" Yt in terms of a single regressor, Xt, and the relationship is linear in the parameters:

                Yt = α + β Xt + εt .                                   (1)

(We'll also assume that the error term, εt, is fully "well-behaved".)  

Under case 1 above, where all of the data are stationary, we would just proceed to estimate (1) and predict future values of Y in the usual way. If values of X are not known in the forecast period(s) they can (for example) be predicted using an appropriate ARMA model.

In case 2, either or both of the Y and X variables will have to be differenced (an appropriate number of times) before the equation of interest is estimated. For instance, if both Y and X are I(1), but not cointegrated, the model that we'd estimate would be of the form

             ΔYt = α + β ΔXt + ε't .                                          (2)

This would allow us to forecast  ΔYt, and then either static or dynamic forecasts of Yt itself can be "recovered" in exactly the way that was described above for ARIMA models.

Case 3 is probably the really interesting case. Now Y and X are cointegrated, so there are two (or sometimes three) different single-equation models for Y that we can consider. These are:

(i)  Long-term equilibrium model:

                Yt = α + β Xt + εt .                                         (3)

(ii) Error-correction model (ECM):

               ΔYt = α + β ΔXt + γ et-1 + ut ,                           (4)

where et = (Yt - a - bXt) is the OLS residual from equation (3), and its lagged value is the so-called "error-correction term". (In (4), and (5) below, lagged values of  ΔYt and ΔXt may be included as additional regressors.)
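A rough two-step sketch of this ECM in Python, using NumPy and simulated data, might look as follows. Everything about the data-generating process (the parameter values, the AR(1) equilibrium error, the seed) is an assumption made for illustration, as are the helper name `ols` and the choice to set next period's ΔX to zero when forecasting. No extra lags of ΔYt or ΔXt are included.

```python
import numpy as np

# Simulated (hypothetical) cointegrated data.
rng = np.random.default_rng(0)
n = 500
x = np.cumsum(rng.normal(size=n))           # X is I(1): a random walk
u = np.zeros(n)
shocks = rng.normal(scale=0.5, size=n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + shocks[t]       # stationary AR(1) equilibrium error
y = 2.0 + 0.5 * x + u                       # so Y and X are cointegrated


def ols(X, y):
    """Ordinary least squares coefficients via NumPy's least-squares solver."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Step 1: long-run levels regression, equation (3), and its residuals.
a, b = ols(np.column_stack([np.ones(n), x]), y)
e = y - a - b * x

# Step 2: the ECM, equation (4): regress ΔYt on a constant, ΔXt and et-1.
dy, dx = np.diff(y), np.diff(x)
alpha, beta, gamma = ols(np.column_stack([np.ones(n - 1), dx, e[:-1]]), dy)

# One-step-ahead forecast of Y, ASSUMING next period's ΔX is zero.
dx_next = 0.0
dy_next = alpha + beta * dx_next + gamma * e[-1]
y_next = y[-1] + dy_next

print(b, gamma, y_next)
```

With data generated this way, the estimated coefficient on the error-correction term should come out negative, which is what we need if deviations from the long-run relationship are to be corrected over time.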

(iii) Restricted ECM:

Sometimes an exact cointegrating relationship between X and Y is suggested (or dictated) by the underlying economic theory. In that case the error-correction term doesn't need to be estimated, and so the ECM in (4) is modified to become:

             ΔYt = α + β ΔXt + δ (θ'Wt-1) + vt ,                      (5)

where Wt' = (Yt , Xt), θ is a known vector, and now (θ'Wt-1) is the error-correction term.

One example of this situation would be the permanent income hypothesis. Another would be the Fisher hypothesis.
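As a hedged illustration of the restricted case, the sketch below imposes a known vector θ instead of estimating the long-run relationship first. The choice θ = (1, -1)', the simulated data, and the variable names are all assumptions for the example. They are not taken from any of the studies cited in this post.

```python
import numpy as np

# Simulated (hypothetical) data in which Y - X is stationary by construction.
rng = np.random.default_rng(1)
n = 400
x = np.cumsum(rng.normal(size=n))           # X is I(1): a random walk
u = np.zeros(n)
shocks = rng.normal(scale=0.5, size=n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + shocks[t]       # stationary deviation from equilibrium
y = x + u

theta = np.array([1.0, -1.0])               # ASSUMED known cointegrating vector
W = np.column_stack([y, x])                 # rows are Wt' = (Yt, Xt)
ect = W @ theta                             # θ'Wt: no first-stage estimation needed

# Regress ΔYt on a constant, ΔXt and the lagged error-correction term θ'Wt-1.
dy, dx = np.diff(y), np.diff(x)
R = np.column_stack([np.ones(n - 1), dx, ect[:-1]])
coef = np.linalg.lstsq(R, dy, rcond=None)[0]
alpha, beta, delta = coef

print(alpha, beta, delta)
```

The only difference from the unrestricted ECM is that the error-correction term is computed directly from θ, rather than from the residuals of an estimated levels equation.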

(I'm limiting the discussion to the case where we are interested in modelling and forecasting only Y, so VAR models aren't considered here. Useful references for studies that do include VAR models for forecasting cointegrated data include Fanchon and Wendel, 1992; Hall et al., 1992; and Duy and Thoma, 1998.)

So, an interesting question that arises here is: "Which of models (i) to (iii) will perform best when our objective is to predict future values of Y?"

Not surprisingly, there's a literature that addresses this question. What does that literature tell us?

Here's a really quick summary, based largely on the study by Duy and Thoma (1998), with forecast "quality" measured in terms of predictive mean squared error:

  • Models of type (iii) involve the imposition of exact restrictions on the parameters. This might be expected to be helpful if the restrictions are indeed correct, but risky (from a statistical perspective), with adverse implications for forecasting performance, if these restrictions are false.
  • Notwithstanding this potential trade-off, type (iii) models generally out-forecast type (ii) models, and both out-perform type (i) levels models.
  • These rankings become more pronounced as the forecast horizon increases.
  • These results can be sensitive to the particular time-series data that are being considered.

Of course, although these findings are based on studies that are carefully executed, they aren't definitive. In addition, they're limited to the use of a quadratic loss function when evaluating forecasting performance.
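For readers who want to run this sort of comparison on their own forecasts, predictive mean squared error is easy to compute. The sketch below is only illustrative: the function name `predictive_mse`, the model labels, and the numbers are hypothetical, and they are not results from Duy and Thoma or any other study.

```python
def predictive_mse(actual, forecast):
    """Mean of the squared forecast errors over the forecast period."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical out-of-sample values and two competing sets of forecasts.
actual = [10.0, 11.0, 12.0, 13.0]
forecasts = {
    "model A": [10.5, 11.5, 12.5, 13.5],
    "model B": [10.0, 11.0, 13.0, 15.0],
}

scores = {name: predictive_mse(actual, f) for name, f in forecasts.items()}
best = min(scores, key=scores.get)          # smallest MSE wins under quadratic loss

print(scores, best)
```

A different loss function (mean absolute error, say) could be swapped in by changing one line, and the ranking of the models need not stay the same.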

If this is a topic that interests you, then I urge you to take a careful look at the Duy and Thoma paper (and the other papers that they cite) to get a more complete picture.


Duy, T. A. and M. A. Thoma, 1998. Modeling and forecasting cointegrated variables: Some practical experience. Journal of Economics and Business, 50, 291–307.

Engle, R. F., and B-S. Yoo, 1987. Forecasting and testing in co-integrated systems. Journal of Econometrics, 35, 143–159. 

Fanchon, P., and J. Wendel, 1992. Estimating VAR models under non-stationarity and cointegration: Alternative approaches for forecasting cattle prices. Applied Economics, 24, 207–217.

Hall, A. D., H. M. Anderson, and C. W. J. Granger, 1992. A cointegration analysis of Treasury Bill yields. Review of Economics and Statistics, 74, 116–126.

© 2016, David E. Giles
