Econometrics Beat: Dave Giles' Blog: The Forecasting Performance of Models for Cointegrated Data

Tuesday, July 26, 2016

The Forecasting Performance of Models for Cointegrated Data

Here's an interesting practical question that arises when you're considering different forms of econometric models for forecasting time-series data:

"Which type of model will perform best when the data are non-stationary, and perhaps cointegrated?"

To answer this question we have to think about the alternative models that are available to us; and we also have to decide on what we mean by 'best'. In other words, we have to agree on some sort of loss function or performance criterion for measuring forecast quality.

Notice that the question I've posed above allows for the possibility that the data that we're using are integrated, and the various series we're working with may or may not be cointegrated. This scenario covers a wide range of commonly encountered situations in econometrics.

In an earlier post I discussed some of the basic "mechanics" of forecasting from an Error Correction Model. This type of model is used in the case where our data are non-stationary and cointegrated, and we want to focus on the short-run dynamics of the relationship that we're modelling. However, in that post I deliberately didn't take up the issue of whether or not such a model will out-perform other competing models when it comes to forecasting.

Let's look at that issue here.

We'll consider several types of econometric models that might be used with non-stationary time-series data, and we'll see what evidence there is in the literature regarding their relative merits with regard to forecasting performance.

(Keep in mind that forecasting is just one reason for estimating these models. A model type that performs best at forecasting need not necessarily be the optimal one for measuring marginal effects, for instance.)

In the following discussion I'll consider various regression models, with a dependent variable that we wish to forecast, and one or more regressors. We'll also look at single-variable time-series models in the ARIMA family, where only the past history of the data for the series to be predicted (say Y_t) is used.

With regard to the latter, I'll consider two broad situations:

Y_t is a stationary series.
Y_t is non-stationary, but is difference-stationary. That is, it is I(d), for some d > 0.

In the case of regression models, three situations will be considered:

1. All of the data are stationary.
2. Some (or all) of the series are non-stationary, but are not cointegrated.
3. The time-series are non-stationary, and are cointegrated.

In the case of the single-variable ARIMA-type models, the situation is straightforward. If Y_t is stationary, then we can proceed with a standard Box-Jenkins ARIMA analysis, using the level of the Y_t data. If Y_t is integrated (of order one) then we just first-difference the data, and apply the usual ARIMA modelling to ΔY_t. If Y_t is seasonally integrated, then we just seasonally difference the data and go from there, etc.
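To make the differencing step concrete, here's a minimal sketch in plain Python. The helper name `difference` and the toy quarterly numbers are made up for illustration; in practice you'd let your time-series package do this for you.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for t = lag, ..., len(series) - 1."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical quarterly observations, invented for this example.
y = [10.0, 12.0, 11.0, 15.0, 13.0, 16.0, 14.0, 19.0]

dy = difference(y)          # first difference, for an I(1) series
d4y = difference(y, lag=4)  # seasonal difference for quarterly data
```

The ARMA modelling would then be applied to `dy` (or `d4y`) rather than to `y` itself.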

Getting a forecast of Y_t is trivial if the data are stationary. It's also straightforward if ΔY_t is stationary: if ΔY_t* is the forecast of ΔY_t, then the static forecast of Y_t itself is Y_t* = ΔY_t* + Y_{t-1}; and the dynamic forecast of Y_t is Y_t* = ΔY_t* + Y_{t-1}*.

(For a discussion of "static" and "dynamic" forecasts, see this earlier post.)
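The two recovery formulas can be sketched as follows. The function names and the numbers are hypothetical, and the forecasts of ΔY_t are simply taken as given (they would come from whatever model was fitted to the differenced data).

```python
def static_level_forecasts(lagged_actual_levels, diff_forecasts):
    """Static forecasts: Y*_t = dY*_t + Y_{t-1}, with Y_{t-1} observed.

    lagged_actual_levels[i] is the observed level in the period just
    before the one that diff_forecasts[i] refers to.
    """
    return [dy + y_prev
            for dy, y_prev in zip(diff_forecasts, lagged_actual_levels)]


def dynamic_level_forecasts(last_observed_level, diff_forecasts):
    """Dynamic forecasts: Y*_t = dY*_t + Y*_{t-1}, chaining the forecasts."""
    forecasts = []
    previous = last_observed_level
    for dy in diff_forecasts:
        previous = previous + dy  # the last forecast feeds the next one
        forecasts.append(previous)
    return forecasts


# Hypothetical numbers: three forecasts of the first difference, and the
# levels that were actually observed one period before each forecast.
diff_forecasts = [1.0, 2.0, 0.5]
lagged_actual_levels = [100.0, 102.0, 103.0]

static = static_level_forecasts(lagged_actual_levels, diff_forecasts)
dynamic = dynamic_level_forecasts(lagged_actual_levels[0], diff_forecasts)
```

The two sets of forecasts agree in the first period and can drift apart afterwards, because the dynamic forecast never gets to see the later actual values.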

Now, what about forecasts that are based on a regression model? To simplify the exposition, and without any loss of generality, suppose that the model is one that "explains" Y_t in terms of a single regressor, X_t, and the relationship is linear in the parameters:

Y_t = α + β X_t + ε_t . (1)

(We'll also assume that the error term, ε_t, is fully "well-behaved".)

Under case 1 above, where all of the data are stationary, we would just proceed to estimate (1) and predict future values of Y in the usual way. If values of X are not known in the forecast period(s) they can (for example) be predicted using an appropriate ARMA model.

In case 2, either or both of the Y and X variables will have to be differenced (an appropriate number of times) before the equation of interest is estimated. For instance, if both Y and X are I(1), but not cointegrated, the model that we'd estimate would be of the form

ΔY_t = α + β ΔX_t + ε'_t . (2)

This would allow us to forecast ΔY_t, and then either static or dynamic forecasts of Y_t itself can be "recovered" in exactly the way that was described above for ARIMA models.
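Here's a small sketch of that procedure for equation (2), using plain Python and the textbook formulas for simple-regression OLS. The data are invented and contain no noise (so ΔY_t = 1 + 2ΔX_t holds exactly and the estimates can be checked by eye), and the value used for the forecast of ΔX is an assumption.

```python
def ols_simple(x, y):
    """OLS intercept and slope for a regression of y on a constant and x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented levels data, constructed so that dY_t = 1 + 2 dX_t exactly.
x_levels = [0.0, 1.0, 3.0, 4.0, 7.0, 8.0]
y_levels = [5.0, 8.0, 13.0, 16.0, 23.0, 26.0]

# First-difference both series, then estimate equation (2).
dx = [x_levels[t] - x_levels[t - 1] for t in range(1, len(x_levels))]
dy = [y_levels[t] - y_levels[t - 1] for t in range(1, len(y_levels))]
alpha_hat, beta_hat = ols_simple(dx, dy)

# One-step-ahead forecast. The forecast of dX (2.0) is assumed here; in
# practice it would come from a separate model for X.
dx_forecast = 2.0
dy_forecast = alpha_hat + beta_hat * dx_forecast
y_forecast = y_levels[-1] + dy_forecast  # recover the level of Y
```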

Case 3 is probably the really interesting case. Now Y and X are cointegrated, so there are two (or sometimes three) different single-equation models for Y that we can consider. These are:

(i) Long-term equilibrium model:

Y_t = α + β X_t + ε_t . (3)

(ii) Error-correction model (ECM):

ΔY_t = α + β ΔX_t + γ e_{t-1} + u_t , (4)

where e_t = (Y_t - a - bX_t) is the OLS residual from equation (3), and its lagged value is the so-called "error-correction term". (In (4), and (5) below, lagged values of ΔY_t and ΔX_t may be included as additional regressors.)
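The two-step construction of (3) and (4) can be sketched as below, in plain Python with a small hand-rolled least-squares routine. Everything about the data is an assumption made for illustration: X is simulated as a random walk, Y = 2 + 0.5X + u with u a stationary AR(1) process, and the forecast of ΔX is set to zero. No lagged differences are included, and no cointegration test is carried out.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Simulated data (an assumption for illustration only).
random.seed(12345)
n = 300
x, u = [0.0], [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0.0, 1.0))        # random walk
    u.append(0.5 * u[-1] + random.gauss(0.0, 1.0))  # stationary AR(1)
y = [2.0 + 0.5 * xt + ut for xt, ut in zip(x, u)]

# Step 1: long-run equation (3) by OLS, keeping the residuals e_t.
a, b = ols([[1.0, xt] for xt in x], y)
e = [yt - a - b * xt for xt, yt in zip(x, y)]

# Step 2: ECM (4), regressing dY_t on a constant, dX_t and e_{t-1}.
dy = [y[t] - y[t - 1] for t in range(1, n)]
rows = [[1.0, x[t] - x[t - 1], e[t - 1]] for t in range(1, n)]
alpha, beta, gamma = ols(rows, dy)

# One-step-ahead forecast of Y, assuming (for this sketch only) that the
# forecast of dX is zero, as it would be for a driftless random walk.
dx_forecast = 0.0
dy_forecast = alpha + beta * dx_forecast + gamma * e[-1]
y_forecast = y[-1] + dy_forecast
```

With this particular data-generating process, ΔY_t = 0.5ΔX_t - 0.5u_{t-1} + (white noise), so the estimated coefficient on the error-correction term should come out negative, pulling Y back towards the long-run relationship.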

(iii) Restricted ECM:

Sometimes an exact cointegrating relationship between X and Y is suggested (or dictated) by the underlying economic theory. In that case the error-correction term doesn't need to be estimated, and so the ECM in (4) is modified to become:

ΔY_t = α + β ΔX_t + δ (θ'W_{t-1}) + v_t , (5)

where W_t' = (Y_t , X_t), θ is a known vector, and now (θ'W_{t-1}) is the error-correction term.

One example of this situation would be the permanent income hypothesis. Another would be the Fisher hypothesis.
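As a rough sketch of how the regressors for (5) are put together when θ is known, the snippet below builds the error-correction term θ'W_{t-1} directly from the data, with no first-stage regression. The choice θ' = (1, -1), the function names and the numbers are all hypothetical; the appropriate θ depends on the theory being imposed.

```python
def error_correction_term(y, x, theta):
    """Return theta'W_t for each t, where W_t = (Y_t, X_t)'."""
    theta_y, theta_x = theta
    return [theta_y * yt + theta_x * xt for yt, xt in zip(y, x)]


def restricted_ecm_data(y, x, theta):
    """Build dY_t and the regressor rows [1, dX_t, theta'W_{t-1}] for (5)."""
    ect = error_correction_term(y, x, theta)
    targets, regressors = [], []
    for t in range(1, len(y)):
        targets.append(y[t] - y[t - 1])
        regressors.append([1.0, x[t] - x[t - 1], ect[t - 1]])
    return targets, regressors


# Hypothetical data, and a hypothetical known vector theta' = (1, -1),
# i.e. a theory saying that Y - X is stationary.
y = [10.0, 11.0, 13.0, 12.0]
x = [9.0, 11.0, 12.0, 12.5]
theta = (1.0, -1.0)

ect = error_correction_term(y, x, theta)
targets, regressors = restricted_ecm_data(y, x, theta)
```

Equation (5) would then be estimated by regressing `targets` on `regressors` using OLS.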

(I'm limiting the discussion to the case where we are interested in modelling and forecasting only Y, so VAR models aren't considered here. Useful references for studies that do include VAR models for forecasting cointegrated data include Fanchon and Wendel, 1992; Hall et al., 1992; and Duy and Thoma, 1998.)

So, an interesting question that arises here is: "Which of models (i) to (iii) will perform best when our objective is to predict future values of Y?"

Not surprisingly, there's a literature that addresses this question. What does that literature tell us?

Here's a really quick summary, based largely on the study by Duy and Thoma (1998), with forecast "quality" measured in terms of predictive mean squared error:

Models of type (iii) involve the imposition of exact restrictions on the parameters. This might be expected to be helpful if the restrictions are indeed correct, but risky (from a statistical perspective), with adverse implications for forecasting performance, if these restrictions are false.
Notwithstanding this potential trade-off, type (iii) models generally out-forecast type (ii) models, and both out-perform type (i) levels models.
These rankings become more pronounced as the forecast horizon increases.
These results can be sensitive to the particular time-series data that are being considered.

Of course, although these findings are based on studies that are carefully executed, they aren't definitive. In addition, they're limited to the use of a quadratic loss function when evaluating forecasting performance.
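For concreteness, here's a minimal sketch of how competing sets of forecasts can be ranked under quadratic loss. The model labels and all of the numbers are invented purely to show the calculation; they are not results from any of the studies mentioned.

```python
def mean_squared_error(actual, forecast):
    """Predictive mean squared error over a hold-out period."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Invented hold-out data and forecasts from two hypothetical models.
actual = [100.0, 102.0, 101.0, 104.0]
forecasts = {
    "model_a": [98.0, 105.0, 103.0, 100.0],
    "model_b": [99.0, 103.0, 102.0, 102.0],
}

mse = {name: mean_squared_error(actual, f) for name, f in forecasts.items()}
ranking = sorted(mse, key=mse.get)  # best (smallest MSE) first
```

A different loss function (mean absolute error, say) could be swapped in for `mean_squared_error`, and the ranking need not stay the same.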

If this is a topic that interests you, then I urge you to take a careful look at the Duy and Thoma paper (and the other papers that they cite) to get a more complete picture.

References

Duy, T. A. and M. A. Thoma, 1998. Modeling and forecasting cointegrated variables: Some practical experience. Journal of Economics and Business, 50, 291–307.

Engle, R. F., and B-S. Yoo, 1987. Forecasting and testing in co-integrated systems. Journal of Econometrics, 35, 143–159.

Fanchon, P., and J. Wendel, 1992. Estimating VAR models under non-stationarity and cointegration: Alternative approaches for forecasting cattle prices. Applied Economics, 24, 207–217.

Hall, A. D., H. M. Anderson, and C. W. J. Granger, 1992. A cointegration analysis of Treasury Bill yields. Review of Economics and Statistics, 74, 116–126.

3 comments:

Ecointelligency, May 12, 2019 at 9:31 AM
Hello my Professor
Sorry Sir, in case of a mixture of variables I(2), I(1) and I(0), how to do the cointegration test?
Cordially.
Ecointelligency, May 14, 2019 at 8:54 PM
Hello my Professor
Thank you very much for your reply and your beautiful patience.
Cordially