Wednesday, March 6, 2013

ARDL Models - Part I

I've been promising, for far too long, to provide a post on ARDL models and bounds testing. Well, I've finally got around to it!

"ARDL" stands for "Autoregressive-Distributed Lag". Regression models of this type have been in use for decades, but in more recent times they have been shown to provide a very valuable vehicle for testing for the presence of long-run relationships between economic time-series.

I'm going to break my discussion of ARDL models into two parts. Here, I'm going to describe, very briefly, what we mean by an ARDL model. This will then provide the background for a second post that will discuss and illustrate how such models can be used to test for cointegration, and estimate long-run and short-run dynamics, even when the variables in question may include a mixture of stationary and non-stationary time-series.

In its basic form, an ARDL regression model looks like this:

  y_t = β_0 + β_1 y_{t-1} + ... + β_p y_{t-p} + α_0 x_t + α_1 x_{t-1} + α_2 x_{t-2} + ... + α_q x_{t-q} + ε_t

where εt is a random "disturbance" term.

The model is "autoregressive", in the sense that yt is "explained" (in part) by lagged values of itself. It also has a "distributed lag" component, in the form of successive lags of the "x" explanatory variable. Sometimes, the current value of xt itself is excluded from the distributed lag part of the model's structure.

Let's describe the model above as being one that is ARDL(p,q), for obvious reasons.
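To make the ARDL(p,q) structure concrete, here is a minimal sketch of how the regressors could be assembled and the model fitted by OLS. It assumes a single "x" variable and uses NumPy; the function names ardl_design and fit_ardl are hypothetical, chosen for illustration only.

```python
import numpy as np

def ardl_design(y, x, p, q):
    """Return (target, X) for an ARDL(p, q) regression with an intercept."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    m = max(p, q)                                       # observations lost to lagging
    cols = [np.ones(n - m)]                             # intercept (beta_0)
    cols += [y[m - i:n - i] for i in range(1, p + 1)]   # y_{t-1}, ..., y_{t-p}
    cols += [x[m - j:n - j] for j in range(0, q + 1)]   # x_t, x_{t-1}, ..., x_{t-q}
    return y[m:], np.column_stack(cols)

def fit_ardl(y, x, p, q):
    """OLS coefficients ordered as [beta_0, beta_1..beta_p, alpha_0..alpha_q]."""
    target, X = ardl_design(y, x, p, q)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef
```

The regressor matrix has 1 + p + (q + 1) columns, and the first max(p, q) observations are dropped because their lags are not available.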

Given the presence of lagged values of the dependent variable as regressors, OLS estimation of an ARDL model will yield biased coefficient estimates in finite samples. If the disturbance term, εt, is autocorrelated, OLS will also be an inconsistent estimator, and in this case Instrumental Variables estimation was generally used in applications of this model.
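The finite-sample bias is easy to see in a small simulation. The sketch below is only illustrative: it assumes the simplest autoregressive case (no intercept, no "x" variable, i.i.d. Normal errors, a stationary starting value), and the function name ar1_ols_bias and its default settings are hypothetical.

```python
import numpy as np

def ar1_ols_bias(b=0.5, n=25, reps=5000, seed=0):
    """Monte Carlo estimate of E[b_hat] - b for OLS in y_t = b*y_{t-1} + e_t."""
    rng = np.random.default_rng(seed)
    e = rng.normal(size=(reps, n + 1))
    y = np.zeros((reps, n + 1))
    y[:, 0] = e[:, 0] / np.sqrt(1.0 - b ** 2)    # draw y_0 from the stationary distribution
    for t in range(1, n + 1):
        y[:, t] = b * y[:, t - 1] + e[:, t]
    lag, cur = y[:, :-1], y[:, 1:]
    # OLS slope without an intercept, computed separately for each replication
    b_hat = (lag * cur).sum(axis=1) / (lag ** 2).sum(axis=1)
    return b_hat.mean() - b
```

Calling ar1_ols_bias(n=25) and then ar1_ols_bias(n=400) should show a negative bias that shrinks towards zero as the sample size grows, which is what we expect when the errors are serially independent.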

In the 1960's and 1970's we used distributed lag (DL(q), or ARDL(0,q)) models a lot. To avoid the adverse effects of the multicollinearity associated with including many lags of "x" as regressors, it was common to reduce the number of parameters by imposing restrictions on the pattern (or "distribution") of values that the α coefficients could take.

Perhaps the best known set of restrictions was that associated with Koyck (1954) for the estimation of a DL(∞) model. These restrictions imposed a geometric rate of decay on the α coefficients. This enabled the model to be manipulated into a new one that was autoregressive, but with an error term that followed a moving average process. Today, we'd call this an ARMAX model. Again, Instrumental Variables estimation was often used to obtain consistent estimates of the model's parameters.
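For readers who haven't seen it, the manipulation can be sketched in a few lines. The notation here is my own choice for illustration: μ is an intercept, and the lag weights are assumed to decline geometrically at rate λ.

```latex
\begin{align*}
y_t &= \mu + \sum_{j=0}^{\infty} \alpha_j x_{t-j} + \varepsilon_t,
  \qquad \alpha_j = \alpha_0 \lambda^{j}, \quad 0 < \lambda < 1, \\
\lambda y_{t-1} &= \lambda\mu + \sum_{j=0}^{\infty} \alpha_0 \lambda^{j+1} x_{t-1-j}
  + \lambda \varepsilon_{t-1}, \\
y_t - \lambda y_{t-1} &= (1-\lambda)\mu + \alpha_0 x_t
  + \varepsilon_t - \lambda \varepsilon_{t-1}, \\
y_t &= (1-\lambda)\mu + \lambda y_{t-1} + \alpha_0 x_t + u_t,
  \qquad u_t = \varepsilon_t - \lambda \varepsilon_{t-1}.
\end{align*}
```

The infinite lag has collapsed to a single lagged dependent variable, but the new error, u_t, is MA(1) and is correlated with y_{t-1}. That is why OLS is inconsistent here.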

Franses and van Oest (2004) provide an interesting perspective on the Koyck model, and the associated "Koyck transformation", 50 years after its introduction into the literature.

Shirley Almon popularized another set of restrictions (Almon, 1965) for the coefficients in a DL(q) model. Her approach was based on Weierstrass's Approximation Theorem, which tells us that any continuous function can be approximated, arbitrarily closely, by a polynomial of some order. The only question is "what is the order", and this had to be chosen by the practitioner.

The Almon estimator could actually be re-written as a restricted least squares estimator. For example, see Schmidt and Waud (1973), and Giles (1975). Surprisingly, though, this isn't how this estimator was usually presented to students and practitioners.
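Here is a minimal sketch of the polynomial restriction in its usual "substitution" form, which amounts to imposing linear restrictions on the unrestricted DL(q) coefficients. It assumes that the weight at lag j is a polynomial in j with no end-point constraints; the function name almon_fit and the symbol delta for the polynomial coefficients are hypothetical, chosen for illustration.

```python
import numpy as np

def almon_fit(y, x, q, degree):
    """Fit a DL(q) model whose lag weights lie on a polynomial of the given degree.

    Returns (intercept, weights), where weights[j] is the coefficient on x_{t-j}.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    # Lag matrix with columns x_t, x_{t-1}, ..., x_{t-q}
    X = np.column_stack([x[q - j:n - j] for j in range(q + 1)])
    # H[j, k] = j**k, so the lag weights are H @ delta
    H = np.vander(np.arange(q + 1), degree + 1, increasing=True).astype(float)
    # Regress y on the transformed regressors X @ H (plus an intercept)
    Z = np.column_stack([np.ones(n - q), X @ H])
    est, *_ = np.linalg.lstsq(Z, y[q:], rcond=None)
    intercept, delta = est[0], est[1:]
    return intercept, H @ delta
```

Only degree + 1 slope parameters are estimated instead of q + 1, which is how the approach eases the multicollinearity problem when q is large.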

Almon's approach allowed restrictions to be placed on the shape of the "decay path" of the α coefficients, as well as on the values and slopes of this decay path at the end-points, t=0 and t=q. Almon's estimator is still included in a number of econometrics packages, including EViews. A Bayesian analysis of the Almon estimator, with an application to New Zealand imports data, can be found in Giles (1977), and Shiller (1973) provides a Bayesian analysis of a different type of distributed lag model.

Dhrymes (1971) provides a thorough and very general discussion of DL models.

So, now we know what an ARDL model is, and where the term "Autoregressive-Distributed Lag" comes from. In the next post on this topic I'll discuss the modern application of such models in the context of non-stationary time-series data, with the emphasis on an illustrative application with real data.


References

Almon, S., 1965. The distributed lag between capital appropriations and net expenditures. Econometrica, 33, 178-196.

Dhrymes, P. J., 1971. Distributed Lags: Problems of Estimation and Formulation. Holden-Day, San Francisco.

Franses, P. H. & R. van Oest, 2004. On the econometrics of the Koyck model. Report 2004-07, Econometric Institute, Erasmus University, Rotterdam.

Giles, D. E. A., 1975. A polynomial approximation for distributed lags. New Zealand Statistician, 10, 22-26.

Giles, D. E. A., 1977. Current payments for New Zealand’s imports: A Bayesian analysis. Applied Economics, 9, 185-201.

Johnston, J., 1984. Econometric Methods, 3rd ed. McGraw-Hill, New York.


Koyck, L. M., 1954. Distributed Lags and Investment Analysis. North-Holland, Amsterdam.

Schmidt, P. & R. N. Waud, 1973. The Almon lag technique and the monetary versus fiscal policy debate. Journal of the American Statistical Association, 68, 1-19.

Shiller, R. J., 1973. A distributed lag estimator derived from smoothness priors. Econometrica, 41, 775-788.


© 2013, David E. Giles

10 comments:

  1. If the disturbance term is not autocorrelated, why would OLS produce biased estimates?

    Replies
    1. Econometrics 101 - e.g., Y(t) = a + bY(t-1) + e(t), where e(t) is i.i.d. and homoskedastic. Clearly, E[e(t).Y(t)] is not zero; and so E[e(t).Y(t+s)] is not zero, for all non-negative s, and for all t. Hence OLS is biased in finite samples, but the bias vanishes asymptotically (as long as the errors are indeed serially independent). In the case of no drift (a = 0), the bias of the OLS estimator of 'b' is -2b/n, to O(n^-2), where 'n' is the sample size. I believe this was first established by J. S. White, in "Biometrika", 1961.

  2. Looking forward to the second part of this and modern uses of ARDL models. I was under the impression that they were relatively old-school models that were put into the dustbin once ARIMA and ARIMAX models became easy to fit.

  3. "Let's describe the model above as being one that is ARDL(p,q), for obvious reasons."

    Sorry to ask the obvious, but why aren't p and q the same? What would be an example of using a different number of lags for the y term and the x term.

    And would any model that includes y (even where p=1) and an x explanatory variable, plus any number of other explanatory variables (z, w, etc.), each with any number of lags, be considered an ARDL?

    And finally, the term autoregressive seems descriptive, but the term distributed lag, used to describe the other regressors which may only have a lag of 0 (?), isn't intuitively descriptive to me. Not a central question, to be sure, but it would be helpful to understand the term.

    Thanks!
    Dan

    Replies
    1. Dan - it's the same as in a VAR model - we don't necessarily want the same lag length on all variables. We need to think about the economics of the problem too.

      Yes, we can have additional explanatory variables with their own maximum lag lengths.

      Usually, we'd have several lags of the x (and other explanatory) variables. In this case the "shape" of the distribution of the weights (coefficients) as we go back through the lags is of interest.

  4. Is it possible to study the long-run relationship between three macro-economic variables using an ARDL model? Should the Engle-Granger or Johansen co-integration techniques be preferred over the ARDL model when we are dealing with three variables?

    Replies
    1. Yes, you can certainly use the ARDL methodology with three or more variables, possibly integrated of different orders. The example that's coming in Part II of this post will do just that.

  5. Sir
    Eagerly waiting for Part II
    Abhijit

  6. Hi, so in essence, this method can be used instead of a VECM approach, where the variables show cointegration, but aren't all I(1)?

    Looking forward to part 2!
