Econometrics Beat: Dave Giles' Blog: Forecasting From an Error Correction Model

Saturday, May 28, 2016

Forecasting From an Error Correction Model

Recently, a reader asked about generating forecasts from an estimated Error Correction Model (ECM). Really, the issues that arise are no different from those associated with any dynamic regression model. I talked about the latter in a previous post in 2013.

Anyway, let's take a look at the specifics.........

For simplicity, suppose that we have just two variables, Y and X, and a single-equation ECM, with Y as the variable that we want to model. The following discussion extends trivially if we have additional variables.

Recall that an ECM is used when all of the variables are I(1), and cointegrated. We'll assume that both of these features of the data have been established by previous testing. For instance, the non-stationarity of the series may have been determined by applying augmented Dickey-Fuller tests; and the presence of cointegration may have been determined by using the Engle-Granger two-step procedure.

Because we have just two variables, we can't have more than one cointegrating relationship between them; and any cointegrating relationship is unique. (This situation will change if there are more than two I(1) variables.)

The purpose of an ECM is to enable us to model the short-run dynamics between X and Y. The cointegrating equation measures the long-run relationship.

It will be helpful to think of the construction of the ECM in the following way.

The first step in the Engle-Granger cointegration testing procedure involves estimating the following "cointegrating regression" relating Y and X, using OLS:

Y_t = a + bX_t + u_t (1)

The lagged residual from (1) is Z_t-1 = (Y_t-1 - a* - b*X_t-1), where a* and b* are the OLS estimates of a and b. Z_t-1 is the so-called "error correction" term.

The ECM is then formulated as

ΔY_t = α + βΔX_t + γZ_t-1 + ε_t (2)

or,

ΔY_t = α + βΔX_t + γ (Y_t-1 - a* - b*X_t-1) + ε_t (3)

(In fact, we may also have lags of either or both of ΔX_t and ΔY_t as additional regressors in (3). Only the latter lags will have any effect on the following discussion, and this will be taken up below.)

Suppose that we estimate the ECM, (3), by OLS, yielding parameter estimates α*, β*, and γ*.
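To make the two OLS steps concrete, here is a minimal sketch in Python using only NumPy. The data are simulated purely for illustration, and the names `ols`, `fit_ecm`, and `params` are my own hypothetical helpers, not anything from a particular package. It estimates the cointegrating regression (1), forms the lagged residual Z_t-1, and then estimates the ECM (3).

```python
import numpy as np

def ols(y, X):
    """Return the OLS coefficients from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def fit_ecm(y, x):
    """Two OLS steps: cointegrating regression (1), then the ECM (3)."""
    n = len(y)
    # Step 1: Y_t = a + b X_t + u_t
    a_hat, b_hat = ols(y, np.column_stack([np.ones(n), x]))
    z = y - a_hat - b_hat * x          # residuals from (1)
    # Step 2: dY_t = alpha + beta dX_t + gamma Z_{t-1} + e_t
    dy, dx = np.diff(y), np.diff(x)
    regressors = np.column_stack([np.ones(n - 1), dx, z[:-1]])
    alpha_hat, beta_hat, gamma_hat = ols(dy, regressors)
    return {"a": a_hat, "b": b_hat,
            "alpha": alpha_hat, "beta": beta_hat, "gamma": gamma_hat}

# Simulated (hypothetical) data: X is a random walk, and Y = 1 + 2X + u,
# where u is a stationary AR(1) process, so Y and X are cointegrated.
rng = np.random.default_rng(0)
n = 500
x = np.cumsum(rng.normal(size=n))
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 2.0 * x + u

params = fit_ecm(y, x)
print(params)
```

With this particular data-generating process the implied ECM has β = 2 and γ = -0.5, so the estimates should land in that neighbourhood. In practice you would, of course, test for unit roots and cointegration before going down this path.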

Re-arranging the estimated equation (3), we have:

Y_t = (α* - a*γ*) + β*ΔX_t - γ*b*X_t-1 + (1 + γ*)Y_t-1 + residual (4)

This equation is a "dynamic" regression - it predicts Y_t, but Y_t-1 appears as a regressor on the RHS. (In addition, certain restrictions apply to the estimated coefficients as a result of the inclusion of the error correction term in the ECM. However, that's not the important point here.)
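For anyone who wants the intermediate step, the re-arrangement that takes us from (3) to (4) just uses ΔY_t = Y_t - Y_t-1 and then collects terms. Written out in LaTeX, with the starred estimates as defined above:

```latex
\begin{aligned}
Y_t - Y_{t-1} &= \alpha^* + \beta^* \Delta X_t
                 + \gamma^* \left( Y_{t-1} - a^* - b^* X_{t-1} \right) + \text{residual} \\
Y_t &= (\alpha^* - a^* \gamma^*) + \beta^* \Delta X_t
       - \gamma^* b^* X_{t-1} + (1 + \gamma^*)\, Y_{t-1} + \text{residual}
\end{aligned}
```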

To use (4) to obtain a forecast, Y*_t, for Y_t, we would set the residual to zero and use the estimated coefficients and the data for ΔX_t, X_t-1, and Y_t-1. (The latter value is known at time t.) However, when it comes to forecasting Y_t+1, we have to distinguish between "static" and "dynamic" forecasting. If these terms aren't familiar, this is the time to read my earlier post.

To forecast Y_t+1 we can use (4), with a shift of one time-period, in one of two ways.

We can use the actual value for Y_t on the RHS:

Y*_t+1 = (α* - a*γ*) + β*ΔX_t+1 - γ*b*X_t + (1 + γ*)Y_t (5)

or, we can use the previous forecast value, Y*_t on the RHS:

Y*_t+1 = (α* - a*γ*) + β*ΔX_t+1 - γ*b*X_t + (1 + γ*)Y*_t (6)

Equation (5) generates "static" forecasts; while equation (6) generates "dynamic" forecasts.
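The difference between (5) and (6) is easy to see in a few lines of code. What follows is only a sketch: the function name `ecm_forecasts`, the coefficient values, and the tiny data set are all hypothetical, chosen so the arithmetic can be checked by hand. It assumes that the X values are available over the whole forecast period.

```python
import numpy as np

def ecm_forecasts(y, x, a, b, alpha, beta, gamma, start, dynamic):
    """Forecast Y from index `start` onward using the re-arranged ECM.

    dynamic=False uses the actual lagged Y, as in equation (5);
    dynamic=True feeds the previous forecast back in, as in equation (6).
    The first forecast always uses the observed y[start - 1].
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    forecasts = np.empty(len(x) - start)
    y_lag = y[start - 1]
    for i, t in enumerate(range(start, len(x))):
        if not dynamic:
            y_lag = y[t - 1]           # static: actual lagged value
        forecasts[i] = ((alpha - a * gamma)
                        + beta * (x[t] - x[t - 1])
                        - gamma * b * x[t - 1]
                        + (1 + gamma) * y_lag)
        if dynamic:
            y_lag = forecasts[i]       # dynamic: previous forecast
    return forecasts

# Hypothetical estimates and data, for illustration only
coefs = dict(a=1.0, b=2.0, alpha=0.0, beta=2.0, gamma=-0.5)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.5, 6.5, 9.5, 11.0]

static = ecm_forecasts(y, x, start=2, dynamic=False, **coefs)
dynamic = ecm_forecasts(y, x, start=2, dynamic=True, **coefs)
print(static)
print(dynamic)
```

The two sets of forecasts agree in the first forecast period, because the lagged value of Y is observed in both cases. They part company from the second period onwards.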

When we are doing genuine ex ante forecasting more than one period into the future, we have to use dynamic forecasting, because the actual lagged values of Y have not yet been observed. My earlier post illustrated all of this, using EViews.

If our ECM includes lags of ΔY_t as regressors, as will often be the case, the story changes in a pretty obvious way. For instance, suppose that (2) is generalized to:

ΔY_t = α + βΔX_t + γZ_t-1 + δΔY_t-1 + ε_t (7)

Then the forecasting equation, (4), becomes:

Y_t = (α* - a*γ*) + β*ΔX_t - γ*b*X_t-1 + (1 + γ* + δ*)Y_t-1 - δ*Y_t-2 + residual (8)

Again, we have a (restricted) dynamic model - this time there are two lags of Y on the RHS. We can again distinguish between static and dynamic forecasts, as above.
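Here is a sketch of dynamic forecasting from (8), where two lags of Y have to be carried forward. As before, the function name `dynamic_forecast_with_lag` and all of the numbers are hypothetical, and the future X values are simply assumed to be available.

```python
import numpy as np

def dynamic_forecast_with_lag(y_hist, x, a, b, alpha, beta, gamma, delta):
    """Dynamic forecasts from equation (8).

    y_hist : observed Y up to the forecast origin (at least two values)
    x      : X values; x[:len(y_hist)] lines up with y_hist, and the
             remaining entries are the values assumed for the forecast periods
    """
    path = [float(v) for v in y_hist]
    for t in range(len(y_hist), len(x)):
        y_next = ((alpha - a * gamma)
                  + beta * (x[t] - x[t - 1])
                  - gamma * b * x[t - 1]
                  + (1 + gamma + delta) * path[t - 1]
                  - delta * path[t - 2])
        path.append(y_next)            # forecast becomes next period's lag
    return np.array(path[len(y_hist):])

# Hypothetical estimates and data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hist = [3.0, 5.5]
with_lag = dynamic_forecast_with_lag(y_hist, x, a=1.0, b=2.0, alpha=0.0,
                                     beta=2.0, gamma=-0.5, delta=0.2)
no_lag = dynamic_forecast_with_lag(y_hist, x, a=1.0, b=2.0, alpha=0.0,
                                   beta=2.0, gamma=-0.5, delta=0.0)
print(with_lag)
print(no_lag)
```

Setting δ* to zero collapses (8) back to (4), which is a handy check on the algebra.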

That's all that there is to it.

[Postscript: Can you see where an example of a "pre-testing" problem arises in the discussion above?]

10 comments:

GustavoWoltmann, June 1, 2016 at 4:16 AM
Hi there, Great blog you have there, really. I learned so much from your posts already so please just keep up the good work! :)
Anonymous, June 2, 2016 at 8:12 AM
I seem to have trouble reconciling the Johansen test for cointegration with the residuals of long-term relationships. For example, using FRED, USA payroll series, the residuals log_PAYEMS to Log-NPPTL have a unit root using data from 2010 to 2016, an indication of no cointegration, but if I use the Johansen cointegration test there appears to be a cointegration relationship under category 2.
Anonymous, July 25, 2016 at 7:09 AM
Dear Dave,

Thanks for the insightful explanation! Can you elaborate on ways of obtaining values for the X variables in the forecasting process, other than a "guess"?

Lee
Unknown, December 7, 2018 at 8:59 AM
Dear Dave,

For the error correction equation, is it appropriate to include other predictors which are stationary, but don't have an impact on the dependent variable on a levels basis, i.e., no long-term impact?

Thanks