## Wednesday, March 6, 2013

### ARDL Models - Part I

I've been promising, for far too long, to provide a post on ARDL models and bounds testing. Well, I've finally got around to it!

"ARDL" stands for "Autoregressive-Distributed Lag". Regression models of this type have been in use for decades, but in more recent times they have been shown to provide a very valuable vehicle for testing for the presence of long-run relationships between economic time-series.

I'm going to break my discussion of ARDL models into two parts. Here, I'm going to describe, very briefly, what we mean by an ARDL model. This will then provide the background for a second post that will discuss and illustrate how such models can be used to test for cointegration, and estimate long-run and short-run dynamics, even when the variables in question may include a mixture of stationary and non-stationary time-series.

In its basic form, an ARDL regression model looks like this:

$$y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_p y_{t-p} + \alpha_0 x_t + \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \dots + \alpha_q x_{t-q} + \varepsilon_t$$

where $\varepsilon_t$ is a random "disturbance" term.

The model is "autoregressive", in the sense that $y_t$ is "explained" (in part) by lagged values of itself. It also has a "distributed lag" component, in the form of successive lags of the "x" explanatory variable. Sometimes, the current value of $x_t$ itself is excluded from the distributed lag part of the model's structure.

Let's describe the model above as being one that is ARDL(p,q), for obvious reasons.
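To make the ARDL(p,q) structure concrete, here is a minimal Python sketch (using NumPy) that builds the regressors for a single-x model and estimates the coefficients by OLS. The function name `ardl_design`, the simulated data, and the "true" parameter values are all illustrative assumptions, not anything taken from an actual application.

```python
import numpy as np

def ardl_design(y, x, p, q):
    """Build the regressor matrix for an ARDL(p, q) model (hypothetical helper).

    Columns: intercept, y_{t-1}, ..., y_{t-p}, x_t, x_{t-1}, ..., x_{t-q}.
    The first max(p, q) observations are lost to lagging.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    m, n = max(p, q), len(y)
    cols = [np.ones(n - m)]
    cols += [y[m - i:n - i] for i in range(1, p + 1)]   # lags of y
    cols += [x[m - j:n - j] for j in range(0, q + 1)]   # x_t and its lags
    return np.column_stack(cols), y[m:]

# Simulate an ARDL(1, 2) process with assumed parameter values.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
eps = rng.normal(scale=0.5, size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 1.0 + 0.5 * y[t - 1] + 0.8 * x[t] + 0.3 * x[t - 1] - 0.2 * x[t - 2] + eps[t]

X, y_dep = ardl_design(y, x, p=1, q=2)
coef, *_ = np.linalg.lstsq(X, y_dep, rcond=None)   # OLS estimates
print(np.round(coef, 3))   # order: beta0, beta1, alpha0, alpha1, alpha2
```

With a sample this large the estimates should sit close to the assumed values; with short samples the lagged dependent variable makes them noticeably biased.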

Given the presence of lagged values of the dependent variable as regressors, OLS estimation of an ARDL model will yield biased coefficient estimates. If the disturbance term, $\varepsilon_t$, is autocorrelated, OLS will also be an inconsistent estimator, and in this case Instrumental Variables estimation was generally used in applications of this model.

In the 1960's and 1970's we used distributed lag (DL(q), or ARDL(0,q)) models a lot. To avoid the adverse effects of the multicollinearity associated with including many lags of "x" as regressors, it was common to reduce the number of parameters by imposing restrictions on the pattern (or "distribution") of values that the α coefficients could take.

Perhaps the best known set of restrictions was that associated with Koyck (1954) for the estimation of a DL(∞) model. These restrictions imposed a geometric rate of decay on the α coefficients. This enabled the model to be manipulated into a new one that was autoregressive, but with an error term that followed a moving average process. Today, we'd call this an ARMAX model. Again, Instrumental Variables estimation was often used to obtain consistent estimates of the model's parameters.
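As a sketch of that manipulation, here is the standard textbook algebra for the Koyck transformation. The symbol $\lambda$ for the decay rate (with $0 < \lambda < 1$) and the label $v_t$ for the new error term are introduced here purely for illustration. Start from the DL(∞) model with geometrically declining weights, lag it one period, multiply by $\lambda$, and subtract:

$$
\begin{aligned}
y_t &= \beta_0 + \sum_{i=0}^{\infty} \alpha_i x_{t-i} + \varepsilon_t, \qquad \alpha_i = \alpha_0 \lambda^i \\
\lambda y_{t-1} &= \lambda\beta_0 + \alpha_0 \sum_{i=1}^{\infty} \lambda^i x_{t-i} + \lambda\varepsilon_{t-1} \\
y_t &= (1-\lambda)\beta_0 + \lambda y_{t-1} + \alpha_0 x_t + v_t, \qquad v_t = \varepsilon_t - \lambda\varepsilon_{t-1}
\end{aligned}
$$

The infinite lag has collapsed to a single lag of $y$ and the current $x$, but the error $v_t$ is now MA(1) and is correlated with $y_{t-1}$, which is why OLS is inconsistent here and Instrumental Variables estimation was used instead.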

Franses and van Oest (2004) provide an interesting perspective on the Koyck model, and the associated "Koyck transformation", 50 years after its introduction into the literature.

Shirley Almon popularized another set of restrictions (Almon, 1965) for the coefficients in a DL(q) model. Her approach was based on Weierstrass's Approximation Theorem, which tells us that any continuous function can be approximated, arbitrarily closely, by a polynomial of some order. The only question is "what is the order", and this had to be chosen by the practitioner.

The Almon estimator could actually be re-written as a restricted least squares estimator. For example, see Schmidt and Waud (1973), and Giles (1975). Surprisingly, though, this isn't how this estimator was usually presented to students and practitioners.
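The restricted least squares view can be sketched in a few lines of Python. Assume the lag weights lie on a polynomial in the lag index, so that α = Hγ, where H holds powers of the lag index and γ holds the polynomial coefficients. Regressing y on XH then gives the restricted estimates. The function name `almon_fit`, the simulated data, and the quadratic "true" weights below are assumptions made for illustration only; the end-point restrictions are not imposed in this sketch.

```python
import numpy as np

def almon_fit(y, x, q, degree):
    """Fit a DL(q) model with lag weights restricted to a polynomial (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    # Lag matrix: column j holds x_{t-j}, for t = q, ..., n-1
    X = np.column_stack([x[q - j:n - j] for j in range(q + 1)])
    # H maps polynomial coefficients (gamma) into lag weights: alpha = H @ gamma
    H = np.vander(np.arange(q + 1), degree + 1, increasing=True).astype(float)
    Z = np.column_stack([np.ones(n - q), X @ H])
    est, *_ = np.linalg.lstsq(Z, y[q:], rcond=None)
    intercept, gamma = est[0], est[1:]
    return intercept, H @ gamma

# Simulated DL(4) data whose weights follow 0.5 + 0.4*i - 0.1*i**2 (an assumed shape).
rng = np.random.default_rng(1)
n, q = 2000, 4
true_alpha = np.array([0.5, 0.8, 0.9, 0.8, 0.5])
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(q, n):
    y[t] = 2.0 + sum(true_alpha[j] * x[t - j] for j in range(q + 1)) + rng.normal()

intercept, alpha_hat = almon_fit(y, x, q=q, degree=2)
print(np.round(alpha_hat, 3))
```

Only three polynomial coefficients are estimated instead of five free lag weights, which is the whole point of the restriction: fewer parameters, at the cost of choosing the polynomial order.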

Almon's approach allowed restrictions to be placed on the shape of the "decay path" of the α coefficients, as well as on the values and slopes of this decay path at the end-points, t=0 and t=q. Almon's estimator is still included in a number of econometrics packages, including EViews. A Bayesian analysis of the Almon estimator, with an application to New Zealand imports data, can be found in Giles (1977), and Shiller (1973) provides a Bayesian analysis of a different type of distributed lag model.

Dhrymes (1971) provides a thorough and very general discussion of DL models.

So, now we know what an ARDL model is, and where the term "Autoregressive-Distributed Lag" comes from. In the next post on this topic I'll discuss the modern application of such models in the context of non-stationary time-series data, with the emphasis on an illustrative application with real data.

[Note: For an important update to this post, relating to EViews 9, see my 2015 post, here.]

References

Almon, S., 1965. The distributed lag between capital appropriations and net expenditures. Econometrica, 33, 178-196.

Dhrymes, P. J., 1971. Distributed Lags: Problems of Estimation and Formulation. Holden-Day, San Francisco.

Franses, P. H. & R. van Oest, 2004. On the econometrics of the Koyck model. Report 2004-07, Econometric Institute, Erasmus University, Rotterdam.

Giles, D. E. A., 1975. A polynomial approximation for distributed lags. New Zealand Statistician, 10, 22-26.

Giles, D. E. A., 1977. Current payments for New Zealand’s imports: A Bayesian analysis. Applied Economics, 9, 185-201.

Johnston, J., 1984. Econometric Methods, 3rd ed. McGraw-Hill, New York.

Koyck, L. M., 1954. Distributed Lags and Investment Analysis. North-Holland, Amsterdam.

Schmidt, P. & R. N. Waud, 1973. The Almon lag technique and the monetary versus fiscal policy debate. Journal of the American Statistical Association, 68, 1-19.

Shiller, R. J., 1973. A distributed lag estimator derived from smoothness priors. Econometrica, 41, 775-788.

1. If the disturbance term is not autocorrelated, why would OLS produce biased estimates?

1. Econometrics 101 - e.g., Y(t) = a + bY(t-1) + e(t), where e(t) is i.i.d. and homoskedastic. Clearly, E[e(t).Y(t)] is not zero; and so E[e(t).Y(t+s)] is not zero, for all non-negative s, and for all t. Hence OLS is biased in finite samples, but the bias vanishes asymptotically (as long as the errors are indeed serially independent). In the case of no drift (a = 0), the bias of the OLS estimator of 'b' is -2b/n, to O(n^-2), where 'n' is the sample size. I believe this was first established by J. S. White, in "Biometrika", 1961.
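A small Monte Carlo experiment makes the finite-sample bias visible. The sketch below assumes Gaussian errors, no intercept, and illustrative values b = 0.5 and n = 25; the function names are hypothetical. It compares the simulated bias with the first-order benchmark of -2b/n.

```python
import random

def ols_ar1_slope(y):
    """OLS estimate of b in y_t = b*y_{t-1} + e_t (no intercept)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def mean_bias(b, n, reps, seed=0, burn=50):
    """Average of (b_hat - b) over `reps` simulated AR(1) samples of length n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        level, y = 0.0, []
        for t in range(n + burn):
            level = b * level + rng.gauss(0.0, 1.0)
            if t >= burn:            # discard burn-in so the start-up value does not matter
                y.append(level)
        total += ols_ar1_slope(y) - b
    return total / reps

b, n = 0.5, 25
bias = mean_bias(b, n, reps=20000)
print(f"simulated bias: {bias:.4f}   first-order benchmark -2b/n: {-2 * b / n:.4f}")
```

Re-running with a larger n should shrink the simulated bias towards zero, which is the sense in which OLS remains consistent when the errors are serially independent.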

2. Looking forward to the second part of this and modern uses of ARDL models. I was under the impression that they were relatively old-school models that were put into the dustbin once ARIMA and ARIMAX models became easy to fit.

3. "Let's describe the model above as being one that is ARDL(p,q), for obvious reasons."

Sorry to ask the obvious, but why aren't p and q the same? What would be an example of using a different number of lags for the y term and the x term?

And would any model that includes lags of y (even where p=1) and an x explanatory variable, but also any number of other explanatory variables (z, w, etc.), each with any number of lags, be considered an ARDL?

And finally, the term "autoregressive" seems descriptive, but the term "distributed lag" to describe the other regressors, which may only have a lag of 0 (?), isn't intuitively descriptive to me. Not a central question, to be sure, but it would be helpful to understand the term.

Thanks!
Dan

1. Dan - it's the same as in a VAR model - we don't necessarily want the same lag length on all variables. We need to think about the economics of the problem too.

Yes, we can have additional explanatory variables with their own maximum lag lengths.

Usually, we'd have several lags of the x (and other explanatory) variables. In this case the "shape" of the distribution of the weights (coefficients) as we go back through the lags is of interest.

4. Is it possible to study the long-run relationship between three macro-economic variables using an ARDL model? Should the Engle-Granger or Johansen cointegration technique be preferred over the ARDL model when we are dealing with three variables?

1. Yes, you can certainly use the ARDL methodology with three or more variables, possibly integrated of different orders. The example that's coming in Part II of this post will do just that.

5. Sir
Eagerly waiting for Part II
Abhijit

6. Hi, so in essence, this method can be used instead of a VECM approach, where the variables show cointegration, but aren't all I(1)?

Looking forward to part 2!

1. Yes. Part 2 really is coming!

7. Sania Ashraf - Doctoral Scholar Islamic Finance, June 12, 2013 at 11:09 PM

Dear Sir,
What is the minimum number of observations required for ARDL estimation?

1. There's no simple answer - it depends on the frequency of your data (monthly, quarterly, annual) and on the number of lags you need to properly specify the model.
DG

2. Thank you. This has given me a better understanding of ARDL, but what is the main difference between this and ARIMA?
Can't wait for Part 2.

3. ARIMA is for a single time-series. For Part II, see http://davegiles.blogspot.ca/2013/06/ardl-models-part-ii-bounds-tests.html

4. Sir,
For ARDL, if the frequency of the data is monthly, what would be the minimum number of observations required?

5. I'd be wanting at least 10 years of data - n>120. See below - the tests are only asymptotically valid.

6. Professor,
For ARDL, if the frequency of the data is yearly, what would be the minimum number of observations required?

8. Can we apply the ARDL approach when the number of observations is 10?

1. The tests have only asymptotic (large n) relevance, so "no". More generally, any time-series model using only n=10 is unlikely to be of much use.

9. Sir, you are very affable and full of knowledge. I have read many of your posts. Please tell me whether we have to compute Durbin's h statistic in an ARDL model to check for serial correlation or not. If not, does the DW statistic detect serial correlation correctly, and what about the LM test? Please comment, sir.

1. The DW test is inappropriate, given the lagged dependent variables. The LM test can be used to test for various orders of AR() and MA() processes in the errors. Keep in mind that this is only an asymptotically valid test, as is the h-test for the AR(1) case.
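For readers who want to see the mechanics, here is a hand-rolled sketch of the LM (Breusch-Godfrey type) statistic for an ARDL(1,0) model: regress the OLS residuals on the original regressors plus lagged residuals, and use n times the R-squared of that auxiliary regression. The function names, the simulated data, and the parameter values (including rho = 0.6) are illustrative assumptions; in practice you would use the version built into your econometrics package.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return beta, y - X @ beta

def breusch_godfrey_lm(X, resid, lags=1):
    """LM statistic: n * R^2 from regressing residuals on X and their own lags."""
    n = len(resid)
    # Lagged residuals, with the missing start-up values set to zero
    lagged = np.column_stack(
        [np.concatenate([np.zeros(k), resid[:-k]]) for k in range(1, lags + 1)]
    )
    Z = np.column_stack([X, lagged])
    fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    r2 = 1.0 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)
    return n * r2

def simulate_lm(rho, n=500, seed=0):
    """ARDL(1,0) data with errors u_t = rho*u_{t-1} + e_t; returns the LM statistic."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = rng.normal(size=n)
    y = np.zeros(n)
    u = 0.0
    for t in range(1, n):
        u = rho * u + e[t]
        y[t] = 1.0 + 0.5 * y[t - 1] + 0.8 * x[t] + u
    X = np.column_stack([np.ones(n - 1), y[:-1], x[1:]])
    _, resid = ols(X, y[1:])
    return breusch_godfrey_lm(X, resid, lags=1)

lm_white = simulate_lm(rho=0.0)
lm_ar1 = simulate_lm(rho=0.6)
print(f"LM, independent errors: {lm_white:.2f}")
print(f"LM, AR(1) errors:       {lm_ar1:.2f}")
# Compare each with the 5% chi-square(1) critical value, about 3.84.
```

As the reply notes, the comparison with the chi-square critical value is only asymptotically justified, so it should be treated with caution in short samples.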

10. Can Y(t) be I(0)? Or do I have to run a unit root test to be sure that Y(t) is not I(0) for ARDL?
Thanks
Marc

1. See an earlier query - it will usually be I(1), but it need not be.

2. I did notice though that some authors are quoting that the dependent variable has to be I(1) (Trade Liberalisation, Financial Development and Economic Growth: Evidence from Pakistan (1980-2009)
Rao Muhammad Atif , Abida Jadoon, Khalid Zaman , Aisha Ismail, Rabia Seemab,
Journal of International Academic Research (2010) Vol.10, No.2.).

I went back and read the referenced Pesaran et al. (2001), article but could not really find that the dependent variable has to be I(1).
(Pesaran, M.H., Y. Shin., and Smith R. (2001) Bounds testing approaches to the analysis of level relationships, Journal of Applied Econometrics, 16, 289-326.)

I am a meteorologist, so I might have missed something in the Pesaran article, but it looks like I have to concur with you too that Yt can be I(0) also.

Regards
Mark

4. I can see nothing in the Pesaran paper on the bounds testing that required the Y variable to be I(1). However, if you proceeded with a Y variable that appeared to be I(0) and then the bounds test gave a clear outcome of cointegration, then you have a conflict. You can't have cointegration unless the variables are non-stationary to begin with.

So, if you are really, really confident in your unit root testing, and you feel that Y is I(0), in that case it would make little sense to do the bounds testing in the first place!

11. Can I use an ARDL model with 21 observations?

1. That's up to you, but in my view that's a very short sample, especially if you are testing for cointegration along the way.

12. Dr. Giles,

Have you encountered any sort of ARDL model of the following form?

y(t) = a + b*y(t-1) + c(t)*x(t) + e(t)

with c(t) = k + p*c(t-1) + i(t) (c is an unobservable random coefficient)
x(t) is an observable covariate. The correlation between x(t) and i(t) may be non-zero, but there is no correlation between e(t) and either i(t) or x(t).

thanks!

1. Nope - can't say that I have. Sorry!

13. Tahir Muhammad Naveed, January 3, 2014 at 3:45 AM

Suppose we have decided the lag length of a distributed lag model (the lag length of the explanatory variable), and the coefficients of the lagged explanatory variable change sign, with some positive and some negative. If we add them up, can we say that the sum is the "net effect" of that explanatory variable?
Thanks

1. No, we can't because they're measured at different points in time. That's the point of the dynamics.

14. Would it be possible to post on nonlinear ARDLs? There is a lot of two-step Engle-Granger stuff on asymmetric adjustment in ECMs but I don't see much on asymmetric adjustment in ARDLs or one-step error correction models.

1. Good suggestion - I'll see what I can do. Just need more hours in the day..... :-)

2. In that spirit...this blog is truly great and many of these intricate posts must take a pretty long time to create. Thanks for sharing all your knowledge, I truly appreciate it.

3. Tom, thanks for the comment and the sentiment. They do take time, but it's fun, and it's nice to "give back".

4. Thank you, Dr. I have a question.
How can we deal with the problem of low degrees of freedom in ARDL models for short time series?

5. You can't - except by getting more data!

15. Thank you, Sir. Can I use the ARDL bounds test even when my data are fractionally integrated, such as I(1) and I(2) or more?

Regards

1. I've already made it clear in the post that you can't use it if any of the data are I(2). (BTW, I(1) and I(2) are different from "fractionally integrated".)

16. Dear Sir, I want to investigate the relation y = f(x,d), where y and x are two series, with y ~ I(2) and x ~ I(1) according to the ADF test, and d is a dummy variable equal to 0 from Jan 2003 to June 2011 and 1 from July 2011 to Dec 2013. I have 132 observations.
I want to examine the long-run equilibrium relationship between y and x, and the impact of the dummy variable on y.
I found no cointegration between y and x, using the cointegration method and an error correction model.
Please, how can I analyze these data using dynamic models such as ARDL (remember, one series is I(2)) or any other model?
Thank you

1. Ahmed - you could second-difference the Y variable, first-difference the X variable, and use OLS.

2. Thank You Dr. Giles...

17. Dear Sir,
I am currently forecasting the demand for petroleum (and its allied products, petrol and diesel) in India. Can this model be used? Eagerly awaiting your inputs.
regards

1. Yes, as long as none of your series are I(2).

2. Thank you sir.

18. Dear Sir
I am working on the determinants of electricity consumption. Can I use the ARDL methodology, and how?

1. See here: http://davegiles.blogspot.ca/2013/06/ardl-models-part-ii-bounds-tests.html

Make sure that none of your variables are I(2).

19. Dear sir,
I am working on income inequality. Can I use ARDL, given that I have only 27 annual observations? Also, does ARDL itself take care of the problem of endogeneity? And if there is multicollinearity among the explanatory variables, can we still use ARDL? Is any EViews code available to run ARDL?
Thank you

21. Dear Dr.Dave,
I am working on time series data and I found that one of my variables is I(1) under the "intercept" and "trend and intercept" specifications of the unit root test, but I(2) under the "none" specification. Is it possible to run ARDL?
Many thanks,

1. You can't use the ARDL bounds test if any of the series are I(2).

2. Thank you so much.

22. Dear sir, I want to investigate the causal relationship between two time series. Apart from my independent and dependent variables, I have to use some control variables as well, but I have no idea how to use them in a Granger causality test. Is there any other method through which I can also include my control variables?

1. You just enter them as additional "exogenous variables" and they won't be included in the Granger causality test.

23. Sir, this is incredible, what you are doing for the world. Thanks again.
I just wanted to find out whether you have, or whether you can run for us, an example with the Nerlove distributed lag model.
I am Ateh Thomson Pepeah, from Cameroon-Africa.

Thanks!

24. You've made it clear that none of the series should be I(2) in order to run ARDL.
But what would be your suggestion if I need to test for cointegration and my Y~I(1) and X~I(2)?

Thanks in advance, I really appreciate what you've done with this blog!

1. In this case they can't be cointegrated. If you have 2 or more I(2) series, and one or more I(1) series, you can have cointegration. In this case it's possible for the I(2) to cointegrate to an I(1) series, and the latter can then cointegrate with the other I(1) data. See http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2145744

2. Thanks for your quick response.
Actually, the issue is simpler: my variables have different orders of integration:

GDP per capita: I(1); Y
Population: I(2); X
Period: 1960-2012

Would Haldrup (1998) still be valid?

3. For the sake of specification, my I(1) variable is GDP per capita and my I(2) variable is Population for Mexico from 1960 to 2012.

Logical sense dictates that both variables could be cointegrated. I read that "hypothesis testing of the I(2) model" and/or running DOLS could be a solution for cointegrating an I(1) variable with an I(2) variable.

Is this correct? Thanks in advance, sir.

4. No. If you have 2 variables, one I(1) and the other I(2), they can't be cointegrated, BY DEFINITION. Also, ARDL bounds testing can't be used with I(2) data, but that's a different point.

5. You need at least 2 I(2) variables for Haldrup's results.

25. Hi,
I would like to study the impact of advertising a product on its sales (weekly data for 5 years).
The final aim is to forecast the impact on sales of a change in advertising spending. I was also considering including other variables (such as competitors' prices, macroeconomic variables, …).
As I know that advertising has a long-lasting effect and that, for example, at time t I would only know prices at t-1, I need to include lags.
Would an ARIMAX model be appropriate? ARDL?

Thank you.

26. Hi,
I would like to study the effect of inflation on the exchange rate in America using the ARDL approach, but the problem is that the inflation and exchange rate series are not stationary and need second differences to be stationary. I would like to ask whether I can use ARDL estimation in this situation.
Thank you.

1. You can't use the bounds testing associated with ARDL models if any of your series are I(2).

27. Hello, I'm working on a paper on testing endogenous growth based on Jones (1995). I have to estimate the following ARDL model:
(1) $g_t = A(L)g_{t-1} + B(L)i_t + e_t$,
where $g_t$ is the GDP growth rate in period t and $i_t$ is the investment rate in period t. The equation could be written as:
(2) $g_t = A(L)g_{t-1} + B(1)i_t + C(L)(i_t - i_{t-1}) + e_t$
Does anyone know how to compute (2) in EViews 8? I'd be very thankful if somebody could help me.

28. Dear Prof. Giles
I am applying the PMG model, which uses the ARDL procedure. I found that when I apply these two forms the results come out identical:
1) Y= L.Y + L.X1 + L.X2
2) Y= L.Y + X1 + X2
Does that mean that ARDL has a "built-in" lag for the independent variables, even when I enter the variables as VAR rather than L.VAR?
Thanks

1. I suggest you check with the folk at EViews - I don't work for them. Why not post your question on their forum at http://forums.eviews.com/

29. Hi Sir
I have four variables: GDP and FDI run from 1980 to 2013, but corruption and political stability run from 1996 to 2013. Sir, can I use the 1980-2013 sample for all variables?

Thanks, Sir

1. No - that won't be possible.

30. Dear Prof. Giles

I have a very short data period, 1990-2006, and I would like to study the causality between five variables. The process is as follows: stationarity testing and then causality testing. For cointegration testing, the sample is very short.

Sir, my question is: is this procedure (stationarity and causality testing, without cointegration) relevant?

Thank you very much, Sir.

1. You won't be able to do any sensible causality testing with 5 variables and that short sample period.

31. Dear Prof. Giles
Thank you very much, Sir, for your time.

32. Hi, dear Professor Giles

Sir, I have four variables and a short sample period, 2000-2014. Sir, can I use OLS (multiple regression analysis is used to find the relationship between the variables)? Thank you, Professor.

1. Yes, you can, but the results will not be very "reliable".

33. I'm working on a paper and handling time series data covering 35 years. Out of 6 variables, a few are stationary at first difference, one in levels and one at second difference. I'm going to employ the ARDL bounds test. Am I doing this right, Sir?

1. You cannot have any I(2) variables when doing the Bounds Testing - this is mentioned in the post.

34. Do we need to make the series stationary, or can we use an ARDL model with non-stationary series?

35. Dear Prof. Giles, I have to analyse the impact of 5 macro variables on a stock index. Out of the 5, 4 are I(1) and 1 is I(0). Can I apply an ARDL model here? Many thanks. Please reply.

37. Sir, I saw some papers in which the number of observations was 14 to 15 and they applied ARDL. I am confused. I have 14 years of annual time-series data. Can I use this? Check this: https://www.researchgate.net/publication/281857078_MACROECONOMICS_INDICATORS_AND_BANK_STABILITY_A_CASE_OF_BANKING_IN_INDONESIA

1. There is no way I would attempt to do any serious ARDL modelling with so few observations.

38. Hello Sir

Although you have been clear about the integration order of the variables, I am still confused about whether I can use the ARDL approach if both of my variables are stationary in levels.
I tried using VAR & Granger, but they give different significant lags, so I got stuck and thought of using ARDL.
Kindly suggest.

Thanks
Parul

1. Yes, you can legitimately estimate an ARDL model in this case, but it's not really the right basis for Granger causality testing.

2. After running the ARDL model, I found that the errors are heteroskedastic. What should be done now?

3. Just do what you would do with any other regression model.

39. Looking forward to the second part of this and to present-day uses of ARDL models. I was under the impression that they were relatively old-school models that were put into the dustbin once ARIMA and ARIMAX models became easy to fit.

1. ARDL models were "revived" because it turned out that they provide a very useful context for testing for long-run relationships when there is ambiguity about the stationarity/non-stationarity of the data.

40. Sir, is a normality test necessary for an ARDL model?

41. Dave Giles, I have 9 independent variables and 1 dependent variable (10 in total). Would it be possible to use an ARDL model?

1. Only if you have lots of observations. It's just a regression model, so you need positive degrees of freedom.

42. Please explain how to check the degrees of freedom in an ARDL model.

1. The same as for any regression model - it's the number of observations minus the number of regressors.
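As a rough sketch of that count, the helper below (a hypothetical function, not part of any package) also subtracts the observations lost to lagging, which the rule above leaves implicit. It assumes an intercept, p lags of y, and the current value plus q lags of each explanatory variable.

```python
def ardl_degrees_of_freedom(n_obs, p, q_list, intercept=True):
    """Residual degrees of freedom for an ARDL(p, q1, ..., qk) estimated by OLS."""
    lost = max([p] + list(q_list))                # observations used up by lagging
    n_params = p + sum(q + 1 for q in q_list) + (1 if intercept else 0)
    return (n_obs - lost) - n_params

# 27 annual observations, three explanatory variables, two lags of everything
print(ardl_degrees_of_freedom(27, 2, [2, 2, 2]))      # 13

# 14 annual observations, four explanatory variables, two lags of everything
print(ardl_degrees_of_freedom(14, 2, [2, 2, 2, 2]))   # -3
```

A negative result means the model simply cannot be estimated, and a small positive one means the estimates and tests will be very unreliable.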

43. Dear Professor Giles,
I would like to study the impact of three variables “X1”, "X2" and “Z” on “Y”. The final aim is to forecast. These series are not stationary. They are all I(1). I would like to ask you if I can use the ARDL model in levels. Is it possible to estimate the following model in levels?
Y(t) = a + b1*Y(t-1) + b2*X1(t-1) + b3*X2(t-1) + b4*Z(t-2)
Best regards,

1. The last equation won't be legitimate unless the variables are cointegrated. You can estimate an ARDL model if you wish, but forecasting isn't its primary purpose. If the variables are NOT cointegrated you'll need to difference them before estimating the last model you mention. Otherwise use an ECM.

2. My principal focus is forecasting. You have written above that, if there is a cointegrating relationship, you can estimate an ARDL.
As I understand it, the ARDL that we estimate is nothing but OLS with lags of the variables involved.

So, my question to you is: can the models in levels with lagged variables be used for forecasting?

This is especially important since the variables are non-stationary and hence, to my knowledge, running OLS won't be valid.

3. If all of the variables are I(1) AND they are cointegrated then you could use this "old fashioned" ARDL model for forecasting legitimately. But not if they AREN'T cointegrated. That's true for ANY OLS regression. Also - see the links at the end of this post to "modern" ARDL modelling.

44. So ARDL is basically a one-equation version of a VAR model?

45. Dear Professor Giles, the ARDL model I set up in EViews does not produce output when the forecast is run, and neither does the OLS model. I changed the data range after I estimated both the ARDL and OLS models, but when the forecast is run no output is provided. Can you kindly help me to solve this issue, please? Thank you so much in advance.

1. Dhanusha - I don't work for EViews - I suggest you contact them through their User Forum at http://forums.eviews.com/

46. Thank you for your post. Could you advise how I can solve the "Singular matrix" error when using EViews to apply ARDL?

Thank you

1. The same way that you would for any other regression model - you need either more observations or fewer lags.

47. Hello Professor Giles, I want to know: if OLS estimation of an ARDL model gives biased results, then how can we rely on the long-run or short-run coefficients given by the ARDL model? When running an ARDL model, does EViews use OLS to estimate it?

1. ARDL models are estimated by OLS (& not just in EViews). OLS will be biased (for small samples) in any model that has lagged values of the dependent variable as regressors, so that includes ARDL models. However, it is a consistent estimator (as long as the errors are independent), so the bias vanishes for large samples. You shouldn't use an ARDL model with a very small sample. And keep in mind that the long-run relationship is, again, just that - it needs plenty of observations in the sample to be meaningful. Finally, if there is a long-run cointegrating relationship, then OLS is a really good choice for estimating its parameters. The reason is this. Under cointegration, OLS is "super-consistent". The estimates converge to the true parameter values at the square of the usual rate as the sample size (n) grows. In standard models this rate of convergence is SQRT(n), but the rate is the same as "n" itself under cointegration.
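The difference in convergence rates can be illustrated with a small simulation. The sketch below assumes a deliberately simple setup that is not from any application: y = 2x + u with no intercept, where x is either a random walk (so y and x are cointegrated) or white noise. Quadrupling the sample size should cut the average estimation error by a factor of roughly 4 in the cointegrated case, but only roughly 2 in the stationary case.

```python
import random

def slope_error(n, cointegrated, rng):
    """Absolute error of the OLS slope in y = 2x + u (true slope 2, no intercept)."""
    x, level = [], 0.0
    for _ in range(n):
        step = rng.gauss(0.0, 1.0)
        level = level + step if cointegrated else step   # random walk vs white noise
        x.append(level)
    y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    b_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return abs(b_hat - 2.0)

def mean_error(n, cointegrated, reps=2000, seed=0):
    """Average absolute slope error over `reps` simulated samples."""
    rng = random.Random(seed)
    return sum(slope_error(n, cointegrated, rng) for _ in range(reps)) / reps

ratio_coint = mean_error(100, True) / mean_error(400, True)
ratio_stat = mean_error(100, False) / mean_error(400, False)
print(f"error ratio, n=100 vs n=400, cointegrated I(1) data: {ratio_coint:.2f}")
print(f"error ratio, n=100 vs n=400, stationary data:        {ratio_stat:.2f}")
```

This only illustrates the rate of convergence of the long-run slope; it says nothing about the finite-sample bias from the lagged dependent variable, which is a separate issue.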

48. Thank you very much, Professor. One more question: I have 60 (quarterly) observations, and after running ARDL with 4 independent variables, using the Schwarz criterion, ARDL(1,0,1,0,0,3) was chosen. The bounds test shows cointegration and there is no serial correlation in the residuals, so can I trust these results, considering that some of my independent variables are I(0)? Thank you again.

1. On the face of it, you should be O.K.

49. Dear Professor,
I have two series of daily futures returns covering about 10 years, which makes about 2500 data points. Both series are stationary. Is it okay to run an ARDL model to measure the short-term and long-term relationships between the returns?