## Tuesday, October 25, 2011

### VAR or VECM When Testing for Granger Causality?

It never ceases to amaze me that my post titled "How Many Weeks are There in a Year?" is at the top of my all-time hits list! Interestingly, the second-placed post is the one I titled "Testing for Granger Causality". Let's call that one the number one serious post. As with many of my posts, I've received quite a lot of direct emails about that piece on Granger causality testing, in addition to the published comments.

One question that has come up a few times relates to the use of a VAR model for the levels of the data as the basis for doing the non-causality testing, even when we believe that the series in question may be cointegrated. Why not use a VECM model as the basis for non-causality testing in this case?

On the face of it, this might seem like a good idea. It's been suggested that as the VECM incorporates the information about the short-run dynamics, tests conducted within that framework may be more powerful than their counterparts within a VAR model. In fact, however, there's a very good reason for not using a VECM for this particular purpose.

First, let's recall the main message from my earlier post. A simple definition of Granger Causality, in the case of two time-series variables, X and Y is:
"X is said to Granger-cause Y if Y can be better predicted using the histories of both X and Y than it can by using the history of Y alone."
We can test for the absence of Granger causality by estimating the following VAR model:

$$Y_t = a_0 + a_1 Y_{t-1} + \dots + a_p Y_{t-p} + b_1 X_{t-1} + \dots + b_p X_{t-p} + u_t \qquad (1)$$

$$X_t = c_0 + c_1 X_{t-1} + \dots + c_p X_{t-p} + d_1 Y_{t-1} + \dots + d_p Y_{t-p} + v_t \qquad (2)$$

Then, testing $H_0: b_1 = b_2 = \dots = b_p = 0$, against $H_A$: 'Not $H_0$', is a test that X does not Granger-cause Y.

Similarly, testing $H_0: d_1 = d_2 = \dots = d_p = 0$, against $H_A$: 'Not $H_0$', is a test that Y does not Granger-cause X. In each case, a rejection of the null implies there is Granger causality.

Now, if any of the variables are non-stationary (whether or not they are cointegrated), the usual Wald test statistic for this testing will not have an asymptotic Chi-Square distribution. An easy way to deal with this is to use the following procedure proposed by Toda and Yamamoto (1995) - more details are provided in my previous post:
1. Test each of the time-series to determine their order of integration.
2. Let the maximum order of integration for the group of time-series be m.
3. Set up a VAR model in the levels (not the differences) of the data, regardless of the orders of integration of the various time-series.
4. Determine the appropriate maximum lag length for the variables in the VAR, say p, using the usual methods.
5. Make sure that the VAR is well-specified.
6. If two or more of the time-series have the same order of integration, at Step 1, then test to see if they are cointegrated.
7. No matter what you conclude about cointegration at Step 6, this is not going to affect what follows. It just provides a possible cross-check on the validity of your results at the very end of the analysis.
8. Take the preferred VAR model and add in m additional lags of each of the variables into each of the equations.
9. Test for Granger non-causality as follows. For expository purposes, suppose that the VAR has two equations, one for X and one for Y. Test the hypothesis that the coefficients of (only) the first p lagged values of X are zero in the Y equation, using a standard Wald test. Then do the same thing for the coefficients of the lagged values of Y in the X equation.
10. Make sure that you don't include the coefficients for the 'extra' m lags when you perform the Wald tests.
11. The Wald test statistics will be asymptotically chi-square distributed with p d.o.f., under the null.
12. Rejection of the null implies a rejection of Granger non-causality.
13. Finally, look back at what you concluded in Step 6 about cointegration:

"If two or more time-series are cointegrated, then there must be Granger causality between them - either one-way or in both directions. However, the converse is not true."
(This last piece of information may provide a cross-check on your overall conclusions.)
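Steps 8 to 11 above can be sketched in a few lines of Python. This is only a minimal illustration, not the EViews implementation from the earlier post: the function names `chi2_sf` and `toda_yamamoto_wald`, the simulated data, and the choices p = 1 and m = 1 are all assumptions made for the example, and the unit root tests, lag selection and diagnostic checks (Steps 1 to 7) are taken as already done.

```python
import math
import numpy as np


def chi2_sf(x, k):
    """Upper-tail probability of a chi-square variate with integer d.o.f. k."""
    if x <= 0:
        return 1.0
    # Start from k = 1 (odd) or k = 2 (even), then add two d.o.f. at a time.
    if k % 2:
        sf, dof = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        sf, dof = math.exp(-x / 2.0), 2
    while dof < k:
        sf += (x / 2.0) ** (dof / 2.0) * math.exp(-x / 2.0) / math.gamma(dof / 2.0 + 1.0)
        dof += 2
    return sf


def toda_yamamoto_wald(y, x, p, m):
    """Wald test that x does not Granger-cause y in a levels VAR(p + m).

    Only the first p lags of x are restricted; the m extra lags are
    estimated but deliberately left out of the test (Steps 9 and 10).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    k = p + m
    n = len(y) - k
    # Regressors: constant, lags 1..k of y, lags 1..k of x (all in levels).
    cols = [np.ones(n)]
    cols += [y[k - j:len(y) - j] for j in range(1, k + 1)]
    cols += [x[k - j:len(x) - j] for j in range(1, k + 1)]
    Z = np.column_stack(cols)
    target = y[k:]
    beta, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ beta
    sigma2 = resid @ resid / (n - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    # The first p lags of x sit right after the constant and the k lags of y.
    idx = np.arange(1 + k, 1 + k + p)
    b = beta[idx]
    wald = float(b @ np.linalg.solve(cov[np.ix_(idx, idx)], b))
    return wald, chi2_sf(wald, p)  # chi-square with p d.o.f. under the null


# Simulated example (hypothetical data): x is a random walk and drives y.
rng = np.random.default_rng(0)
T = 300
x = np.cumsum(rng.standard_normal(T))
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

stat_xy, pval_xy = toda_yamamoto_wald(y, x, p=1, m=1)  # H0: x does not cause y
stat_yx, pval_yx = toda_yamamoto_wald(x, y, p=1, m=1)  # H0: y does not cause x
print(f"x -> y: Wald = {stat_xy:.2f}, p-value = {pval_xy:.4f}")
print(f"y -> x: Wald = {stat_yx:.2f}, p-value = {pval_yx:.4f}")
```

The equation-by-equation OLS used here is just the simplest way to get the coefficient estimates and their covariance matrix for one equation of the VAR; any package that fits the levels VAR and lets you choose which coefficients enter the Wald test would do the same job.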
O.K. - now back to VARs vs. VECMs!

Suppose that at Step 6 we come to the conclusion that the time-series are cointegrated. In general, the presence of cointegration would suggest that we should model the data using a VECM model, rather than using a VAR model. That's modelling the data, though, not testing for Granger non-causality.

Here's the deal.

To get to the point where we are considering using a VECM model as the basis for the causality testing, we had to go through the prior step of testing for cointegration; and only if we rejected the hypothesis of "no cointegration" would we even consider estimating a VECM model. This is a classic example of "preliminary test testing". That is, the framework (model) chosen as the basis for the non-causality test is conditional on the outcome of a previous test - a test for non-cointegration. It's as if the choice between a VAR model and a VECM model (as the framework within which to test for non-causality) is made by flipping a biased coin. (Remember that there are always Type I and Type II errors associated with any classical hypothesis test.)

So, one important question that arises is the following one:

If we first test for non-cointegration, and then (conditional on the outcome of this test) we perform another test, what are the properties of this second test?
You see, the second test (the test for non-causality) will be of one form if we decide to use the VAR, and of a different form if we decide to use a VECM. When we pre-test, the second test is actually a random mixture of two tests. The "actual" test statistic is a weighted sum of the test statistic that would be obtained if we used a VAR model, and the test statistic that would be obtained if we used a VECM model. And the weights are random, with values that depend on the properties of the prior (non-cointegration) test.
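One way to write this mixture down is as follows. The notation is introduced here purely for illustration and does not come from the papers cited in this post: $I$ is an indicator that equals 1 when the pre-test rejects "no cointegration" (so the VECM is chosen), $W_{\text{VECM}}$ and $W_{\text{VAR}}$ are the two candidate non-causality statistics, and $c_{\text{VECM}}$ and $c_{\text{VAR}}$ are their nominal critical values.

$$W_{\text{pre}} = I \cdot W_{\text{VECM}} + (1 - I) \cdot W_{\text{VAR}}$$

$$\Pr(\text{reject non-causality}) = \Pr(I = 1)\,\Pr\left(W_{\text{VECM}} > c_{\text{VECM}} \mid I = 1\right) + \Pr(I = 0)\,\Pr\left(W_{\text{VAR}} > c_{\text{VAR}} \mid I = 0\right)$$

Because $I$ is computed from the same data as the two statistics, the conditional probabilities on the right need not equal the nominal level of either test, even when each test would be correctly sized if it were always the one applied.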

The upshot of all of this is as follows. When we test for no cointegration, then decide on a VAR model or a VECM model, and then apply a Granger non-causality test, the properties of this last test aren't at all what we think they are. They're really messy, and the best way to find out what's going on is to conduct a Monte Carlo experiment.

In particular, there will almost certainly be some distortion in the significance level (and hence the power) of the final test. We may think we're applying the non-causality test at (say) the 5% level, but the true significance level (the actual rate of rejection of the null hypothesis when this hypothesis is true) may be quite different. And this might (should?) bother us.
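To show what "finding the true significance level by Monte Carlo" looks like in practice, here is a bare-bones skeleton in Python. It is emphatically not the pre-test experiment itself: it only estimates the empirical rejection rate of a nominal 5% Wald test when the non-causal null is true, for two independent random walks, with and without one extra (untested) lag. The function names, sample size, number of replications and seed are all arbitrary choices made for this sketch; a full pre-test study would add a cointegration test and a VECM-based test inside the loop.

```python
import numpy as np

CHI2_5PCT_1DF = 3.841  # 5% critical value, chi-square with 1 d.o.f.


def wald_first_lag(y, x, extra_lags):
    """Wald statistic for H0: the coefficient on x[t-1] is zero, in a levels
    regression of y on a constant and lags 1..(1 + extra_lags) of y and x."""
    k = 1 + extra_lags
    n = len(y) - k
    cols = [np.ones(n)]
    cols += [y[k - j:len(y) - j] for j in range(1, k + 1)]
    cols += [x[k - j:len(x) - j] for j in range(1, k + 1)]
    Z = np.column_stack(cols)
    target = y[k:]
    beta, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ beta
    sigma2 = resid @ resid / (n - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    pos = 1 + k  # column holding x[t-1]
    return beta[pos] ** 2 / cov[pos, pos]


def empirical_size(extra_lags, reps=1000, T=200, seed=12345):
    """Share of replications in which the true non-causal null is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        # Two independent random walks: x does not Granger-cause y.
        y = np.cumsum(rng.standard_normal(T))
        x = np.cumsum(rng.standard_normal(T))
        if wald_first_lag(y, x, extra_lags) > CHI2_5PCT_1DF:
            rejections += 1
    return rejections / reps


size_plain = empirical_size(extra_lags=0)  # levels VAR(1), no adjustment
size_ty = empirical_size(extra_lags=1)     # levels VAR(2), extra lag not tested
print(f"Nominal 5% test, no extra lag:  empirical size = {size_plain:.3f}")
print(f"Nominal 5% test, one extra lag: empirical size = {size_ty:.3f}")
```

Both calls use the same seed, so the two tests are compared on identical simulated samples. The gap between each printed rate and 0.05 is the size distortion for that particular design only; nothing here should be read as reproducing the published simulation results.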

[As an aside, if you think that the "size distortion" that can arise from pre-test testing may not be a big deal, then take a look at the results in Table 2 of King and Giles, 1984.]

So, has anyone investigated the issue of the effects of pre-test testing in the case we're interested in here - the case of testing for Granger non-causality, after first testing to see if there is cointegration, so that we effectively randomize the choice of a VAR or VECM model?

Of course they have! You can take a look at the studies by Toda and Phillips (1994), Dolado and Lütkepohl (1996), Zapata and Rambaldi (1997), and Clarke and Mirza (2006) for lots of interesting details. I particularly recommend the last of these papers, by my colleague Judith Clarke and her former student, Sadaf Mirza.

Zapata and Rambaldi (1997, p.294) find that the T-Y Wald test is clearly preferred to the likelihood ratio test used in the context of a VECM model, unless the sample size is extremely small. (Would we really want to be going through all of this with a very small sample, especially when cointegration is a long-run phenomenon?)

The big take-home message from this research is very simple:
"We find that the practice of pretesting for cointegration can result in severe overrejections of the noncausal null, whereas overfitting [that's the T-Y methodology; DG] results in better control of the Type I error probability with often little loss in power." (Clarke & Mirza, 2006, p.207.)

Note: The links to the following references will be helpful only if your computer's IP address gives you access to the electronic versions of the publications in question. That's why a written References section is provided.

References

Clarke, J. A. and S. Mirza (2006). A comparison of some common methods for detecting Granger noncausality. Journal of Statistical Computation and Simulation, 76, 207-231.

Dolado, J. J. and H. Lütkepohl (1996). Making Wald tests work for cointegrated VAR systems. Econometric Reviews, 15, 369-386.

Toda, H. Y. and P. C. B. Phillips (1994). Vector autoregressions and causality: a theoretical overview and simulation study. Econometric Reviews, 13, 259-285.

Toda, H. Y. and T. Yamamoto (1995). Statistical inferences in vector autoregressions with possibly integrated processes. Journal of Econometrics, 66, 225-250.

Zapata, H. O. and A. N. Rambaldi (1997). Monte Carlo evidence on cointegration and causation. Oxford Bulletin of Economics and Statistics, 59, 285-298.

1. I am a bit ashamed to say this, but my coworkers and I have been kicking this post around and we are unclear on the 'why' of step 3 "Set up a VAR model in the levels (not the differences) of the data, regardless of the orders of integration of the various time-series."

We've flipped through our text books from back in grad school and couldn't find the answer. Could you help?

2. Anonymous: Thanks for the comment. Not your fault!
There's a bit more information in my April post (linked in the post above). You won't find anything about this in any of the texts written prior to 1994. Indeed, even recent general grad. econometric texts don't cover it - you'd need to look at something like Helmut Lütkepohl's "Multiple Time Series" text. This is a great example of the books lagging behind the theory (and practice, actually). The point is this. If the data are non-stationary then the usual Wald test (or the LRT for that matter) for testing the restrictions involved in causality doesn't have its usual asymptotic (chi square) distribution. The distribution is non-standard and involves unknown "nuisance" parameters, so it can't be tabulated, and you don't have proper critical values to use - even with an infinite amount of data. Now, there are basically 2 equivalent ways to deal with this, the simpler of which to apply is the Toda-Yamamoto "trick". That's all it is - a trick to "fix up" the distribution of the Wald test statistic so it is asymptotically chi square. You fit the model in the levels (counter-intuitive, I know, if the data are non-stationary). It's the ADDITION of the extra lags (that are NOT included in the formulation of the test) that gets you the result you want.

Two things to note: (1) This will still be OK even if the data are stationary, so you can use the T-Y approach as an insurance policy, if you even "suspect" that one or more of the series may be I(1) or I(2); (2) This model in the levels, with the extra lags, is ONLY for causality testing. It's not to be used for forecasting, impulse response function analysis, or anything else. For those purposes you would still use a VAR in the differences, if the data were I(1) but not cointegrated, or a VECM if the data are in fact cointegrated.

DO take a look at the T-Y paper: even just the abstract, intro. and conclusions. It really will help. I hope that these comments do too.

1. Dear Prof. Giles
In your comment, you stated that "Now, there are basically 2 equivalent ways to deal with this".
My question is: What is the second way to handle this (and can it be used if all time series are I(1))?
Best wishes

2. The second (equivalent) way is discussed in Chapter 7 of Lütkepohl, H. (2006). New Introduction to Multiple Time Series Analysis. Springer, Berlin. It can be used with data that are all non-stationary (as can the T-Y procedure).

3. Thank you, I'll look it up!
Generally speaking, even if you have two I(1) time series, you can carry out the T-Y procedure. That is correct, right?

3. Dear Prof. Giles,

To begin with, thank you very much for your extraordinarily clarifying blog. A blog like yours is, I think, an excellent example of how science and teaching can work in the 21st century.

This and your entry about Granger causality explain the procedure sufficiently, while e.g. in Lütkepohl (it is an excellent book nevertheless) these issues are presented less clearly and are, I think, difficult for many students to understand. Actually, I have already read some (recently) published papers where Granger causality tests were implemented in a questionable way. The preferred procedure (or any other mentioned in Lütkepohl) does not always seem to be known in applied work. Is there some published material that explains testing Granger causality with respect to VEC, VECM, integrated and cointegrated data, etc. in a concise and lucid way like your blog entry? If not, would a clarifying published methodological note not be worthwhile?

By the way, another methodological question. Testing for cointegration first and choosing the model for the causality test conditional on the first test is "preliminary test testing". However, why is testing for the order of integration first, and including additional lags for the causality test conditional on that first test, not basically "preliminary test testing" in a similar way? (Additional lags are supposedly not the same thing as a different model [VECM] as a whole.)

I will keep in touch with your great blog!

Kind regards,
Georg

1. Georg: Thank you for your kind comments. It's good to know that the blog is being helpful.

Regarding your first question, I can't think of an easy-to-read piece of material that's published. It's no doubt something that people would find helpful, though.

Regarding your point about pre-testing: Yes! Absolutely - there are important pre-testing issues when you (i) test for unit roots, and then subsequently test for cointegration; (ii) test for unit roots (and/or cointegration) & then test for Granger non-causality, etc.

I've published a number of papers on pre-testing in the past - see my c.v. at
web.uvic.ca/~dgiles/dgiles_cv.pdf .

I have drafts of a couple of posts on pre-testing in general that I plan to put on the blog before too long: one on pre-test estimation; and one on pre-test testing.

Hopefully, these will be of some interest.

4. Dear Prof Giles,
Aside from the reason you posted in your previous blog entry:"This might occur if your sample size is too small to satisfy the asymptotics that the cointegration and causality tests rely on."
Is there any other reason why there is no Granger causality between two cointegrated variables?
I am investigating oil price benchmarks in real effective exchange rates. In particular, with the Chinese yuan. I have performed the Granger causality test as you have outlined (very clear and helpful by the way) but there is none present. I'm using data from 1994 to the present, seasonal dummy variables are used (monthly and exogenous) and even when I omit the financial crises data from 2008 onwards, there is still no granger causality.

Kindest regards
Anonymous

1. Thanks for your question. Despite the data you have omitted, there could be structural breaks that are affecting either the cointegration testing or the causality testing. You say you have monthly data, so another possibility is that there are seasonal unit roots and/or seasonal cointegration.

5. Dear Prof. Giles,
When you talk about a "VECM model as the basis for non-causality testing" which testing procedure are you referring to? Is it the likelihood ratio test due to Mosconi/Giannini 1992?

Are these Granger causality-tests in a VECM context implemented in any standard econometrics software (I am using stata but I could not find any Granger causality-test in a VECM framework)?

Thanks to you I can see the problem of a pretest bias when conducting tests in a VECM. But - given that we have cointegrated variables - shouldn't these tests be more efficient as we impose correct and more specific restrictions? Is it perhaps that the negative pretest bias is stronger than the effect of imposing valid restrictions?

Thanks for this great blog!
Best regards,
Manuel

1. Manuel: Thanks for the comment.

Yes, I had in mind tests like the Mosconi-Giannini test.

I'm not aware of this test being incorporated in any of the standard econometrics packages, but other readers of this blog may be able to help on this point.

You are right that there is a trade-off between the loss of power arising from the pre-test testing, and the gain in power when we impose correct restrictions. This is a common problem, and at the end of the day the net effect will depend on the particular problem we're looking at.

6. Dear Prof. Giles,

I would like to ask about the coefficient of the ECT. Some researchers say the coefficient of the ECT is considered good if it lies in the range 0-1. What do you think, Prof.? Please advise.
Thanks.

1. Thanks for the question. We want the coefficient of an ECT to be negative, and we'd like it to be statistically significant.

2. Dear Prof. Giles,

Wouldn't the desired sign of the coefficient estimate of the ECT be based on which line of the VECM system we're looking at? For example, if we have the second row of the most simple bivariate VECM:

Delta*x_t = alpha*(y_{t-1}-beta*x_{t-1}) + e_t

then we would want alpha to be positive such that when y_{t-1} gets "too big", the process x will increase over the next period to correct the disequilibrium?
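The sign question raised in this exchange can be worked through in a line or two of algebra. The setup below is an assumption made for illustration: a bivariate VECM with no lagged differences, a cointegrating vector normalised as $(1, -\beta)$, and adjustment coefficients labelled $\alpha_y$ and $\alpha_x$ (labels introduced here, not taken from the post).

$$\Delta y_t = \alpha_y z_{t-1} + e_{y,t}, \qquad \Delta x_t = \alpha_x z_{t-1} + e_{x,t}, \qquad z_t = y_t - \beta x_t$$

$$z_t = z_{t-1} + \Delta y_t - \beta\,\Delta x_t = \left(1 + \alpha_y - \beta\alpha_x\right) z_{t-1} + \left(e_{y,t} - \beta e_{x,t}\right)$$

The disequilibrium $z_t$ is stationary when $|1 + \alpha_y - \beta\alpha_x| < 1$, that is, when $-2 < \alpha_y - \beta\alpha_x < 0$. With $\beta > 0$, a negative $\alpha_y$ and a positive $\alpha_x$ both pull $z_t$ back towards zero, so under this normalisation the "negative coefficient" rule of thumb applies to the equation for the variable whose coefficient in the cointegrating vector is set to one.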

7. Dear Prof Giles,

I am currently researching whether remittances Granger-cause GDP and health expenditure. I have tested and transformed for stationarity (the time series are I(2) processes), found lag selections using AIC etc., and following this I had originally planned to simply model my data using a VAR and then implement the Granger test in Stata. After reading your extremely useful (thank you!) blog posts I feel I need to employ a test for cointegration for each set of variables (i.e. remittances and GDP, remittances and health expenditure) and then decide whether my data must be modelled using a VAR or a VECM, rather than go straight to a VAR. Am I correct in my thinking? Econometrics is not my strong point!

The evidence to hand suggests that it is preferable to test for Granger causality using a levels VAR model (modified as per the Toda-Yamamoto procedure), rather than using a VECM model for causality testing.

If you are using STATA, note that the Granger test there does NOT make the required Toda-Yamamoto adjustment. You will need to include (but not test) 2 extra lags of each variable, as some of your data are I(2).

9. Dear Prof Giles,

Once again, I cannot thank you enough for what can only be described as a truly fantastic institution (your blog).

Having read the TY (1995) paper, and undertaken some tests, I was looking to go further and do some kind of robustness check within a VECM framework, but I am struggling to find any commercial software which tests the restriction, which I believe originates in Mosconi-Giannini, and is what I believe is being tested in the EXCELLENT Clarke and Mirza paper - that is: the product of the two relevant elements of cointegrating vector and error correction mechanism, and the coefficients on the lagged differenced variables are jointly equal to zero.

If this is something which I want to pursue further, am I going to have to write up a MATLAB file or similar? Can this be coded into EViews somehow? I can obviously estimate the VECM, then estimate this as a system equation by equation, and jointly test the \alpha=\differenced coefficients=0... but this is not quite what we're after, as it is a test which is restricting the whole cointegrating relationship, not just one variable.

Do you have any suggestions? Presumably, Clarke and Mirza write up their own proprietary code, but this is something which I would obviously be keen to avoid, if at all possible!

Best wishes, thanks again for all of your hard work that goes into the blog!

I'll talk with Judith Clarke and see what can be done to get you some code, etc.

10. Dear Prof Giles,
thank you very much for this helpful blog.
I am at the first stages of learning econometrics. I am sorry to ask this simple question, but may I know: if the time series data are I(0) and I(1) (mixed orders of integration), can we employ Granger causality testing based on a VECM?

1. The VECM model is only defined when the time-series are cointegrated. For this to be the case the series need to be integrated of the same order. So, the answer to your question is "no".

2. Dear Prof,

Does VECM show the direction of dependence in the long run? e.g. if we found that four stock market indices are co-integrated, Can VECM show which is the dependent market in the long run?

Thank you.

3. No - this is a matter for causality testing.

DG

4. Hello professor,
In the case of variables integrated at I(1) and I(2) what is the appropriate test for cointegration? Can I use johansen test?

5. No - not in its usual form. For more information, see my post here: http://davegiles.blogspot.ca/2012/01/cointegration-analysis-with-i2-i1-data.html

11. Dear Professor Giles,

Pesaran's bound test approach is a way to test cointegration when underlying series are not integrated to the same order (am I right on this point?). If this is the case, is there a way to test causality under this situation? Thanks and regards, Kamrul, Murdoch University, Perth, WA

12. Dear prof,

How to know which coefficient is significant in the VECM output?

1. The estimated coefficients will be asymptotically Normal, so if you have a big enough sample, treat the t-statistics as if they are z-statistics.
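As a small illustration of treating a t-statistic as a z-statistic, the two-sided p-value can be computed from the standard Normal distribution. The function name and the numbers below are made up for the example; in practice the estimate and standard error would come from your VECM output.

```python
import math


def two_sided_p_value(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0, using the Normal approximation."""
    z = estimate / std_error
    # P(|Z| > |z|) for a standard Normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical coefficient of -0.25 with a standard error of 0.10 (z = -2.5).
p_value = two_sided_p_value(-0.25, 0.10)
print(f"z = {-0.25 / 0.10:.2f}, two-sided p-value = {p_value:.4f}")
```

This approximation is only as good as the asymptotics behind it, so it should be read with the "big enough sample" caveat in mind.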

2. Thank you prof.

One more question, If I have four variables (stock indices) and the JJ cointegration shows two cointegrating equations, should I run VECM with two cointegrations, or run it with one cointegrating equation at a time? because I want to know in the long run which index is influenced by the other.

Thanks.

3. With 2 together.

4. Dear Prof Dave Giles,
I found your posts regarding the T-Y approach to Granger non-causality very helpful. Thank you! But my question is: is it for short-run or long-run Granger causality?

13. It's short run - one period. See my response (with a reference) to the same question on the "Testing for Granger Causality" post a couple of days ago.

DG

14. Dear Prof Dave Giles,

If the equation contains only 2 variables (one dependent and only one independent variables) and dependent variable is I(0) while independent variable is I(1), can I test Engle and Granger cointegration based on this kind of data?

And if after the test of cointegration, can I continue to test VEC(if it is cointegrated) and VAR(if it is not cointegrated)?

Thank you very much in advance.

1. No - the whole concept of cointegration is based on variables that are integrated of the same order. So, if the variables are all I(1) it makes sense to test if a linear combination of them is I(0). If such a linear combination exists, we say the original variables are cointegrated. If you have just 2 variables, one I(1) and one I(0), cointegration isn't possible.

15. Dear Prof Dave Giles,

I would like to ask you another question regarding the critical value in unit root process. I use Stata program to run the test of ADF in step of unit root, my data contains 239 observations. In order to determine whether it is I(0), can I compare t value with critical value in table of ADF result directly, or I have to use MacKinnon's Critical Values for ADF integration.

Thank you very much for the help.

1. I'm not a STATA user - you'd need to check if the critical values are the asymptotic ones or exact ones from MacKinnon. In EViews, the exact ones are used, together with p-values. If you have n=239, there won't be much difference between exact and asymptotic values, but if in doubt, use the MacKinnon values. And check the STATA manual or "help" - it's important to know what the package is giving you. :-)

16. Dear Prof. Giles,

I'm currently working on a VAR model with one I(0) variable and one I(1) variable. Is there any theoretical foundation on how to do this? Most papers write about VAR models based on differences or levels. Can I model with a differenced and a non-differenced variable? Thank you very much for your time and very clear explanations on this blog!

Kind regards,
Robin

1. Robin - if it's causality testing that you're interested in, see this post: http://davegiles.blogspot.ca/2011/04/testing-for-granger-causality.html

Putting causality to one side, if you just want to fit the VAR and use it for forecasting or impulse response functions, you have 2 options:
1. Use the level of the I(0) variable & the first-difference of the I(1) variable.

2. Difference both variables. The differenced I(0) variable will still be stationary. There is risk of over-differencing the I(0) variable, but overall I'd prefer to choose this option.

I hope this helps.

17. Dear prof Giles
My research title is "Tourism-led growth hypothesis: case study of Libya", and I would like to investigate the relationship between tourism and economic growth. My data are annual, covering 1995-2010, and my variables are GDP, international receipts, and the unemployment rate. I would also like to investigate the short-run and long-run relationships and the causality between these variables.
I would like to ask: what steps should I follow to investigate the relationship and causality between the variables?
My regards
Nagma

1. Nagma: I have spelled out the steps in detail in my post here:
http://davegiles.blogspot.ca/2011/04/testing-for-granger-causality.html

DG

18. Dear Prof. Giles,

Does it make sense to run a cointegration test on 2 variables which are both I(0) according to a unit root test?

I have run it anyway, and the result shows that they are cointegrated.

But in the prior papers I have found, they use I(1) variables to test for cointegration, not I(0).

I am not sure whether I can use I(0) variables to test for cointegration or not.

1. No - it makes no sense, due to the very definition of the concept of cointegration. Two series are cointegrated if (i) they are both integrated of the same order, but (ii) there exists a linear combination of these variables that is integrated of a lower order. So, both I(1), but a linear combination that is I(0), for example. In the case of just 2 variables, if such a linear combination exists, then it is unique. This need not be the case if there are more than 2 variables.

Your result that you mentioned could arise for several reasons, including a small sample, structural breaks in the data, etc. In any event, you can't have 2 I(0) series that are cointegrated.

I am running the data to test the relationship between spot price and futures price. I intend to follow the step of testing unit root, cointegration, and ECM.

At first, I used the the prices to run unit root test, and found that both spot and futures prices are I(1).

However, I have been told that I need to use the returns, calculated as ln(p_t / p_{t-1}), instead of prices. But when I use the returns, the result changes to I(0) for both variables.

So, does this mean that I cannot go on to the further steps (cointegration and ECM)?

Thank you so much for your help.

3. Your returns data are stationary, so cointegration and an ECM are non-issues. You can just model your data using conventional methods.

19. Sorry for my poor knowledge in this field, what do you mean by "conventional methods."?

Thank you.

1. You can just use OLS regression.

2. Dear Prof.

thank you for the excellent service that you provide to the community (industry and academic) by running this blog. It is very valuable.

Follow up question on the above: what happens if you use, say, 6 times series of returns (stocks, forwards, options, etc) where all of them are stationary? How would you go about testing for granger causality among them?

Thank you

3. Estimate a 6-equation VAR model in the levels of the data and just use the usual Wald test. There is no need to use the modified test.

20. Hi,

Great blog!

I have found that the variables in my model are either level stationary or 1st difference stationary, however tests revealed no cointegration.

As a result I am trying to fit a VAR model. Can I fit the model in first differences or should I use a mixture of level and first difference variables?

1. Fred: Thanks for the comment.

It depends on why you are estimating the VAR. If it's to test for Granger causality, then you should fit in the levels, and follow the TY procedure outlined in the "Testing for Granger Causality" post linked at the beginning of this post.

If you're estimating the model to use it for forecasting or impulse response functions, then from the information you've supplied, I'd difference ALL of the variables. This will make the I(1) variables stationary. The differenced I(0) variables won't be I(0), but they WILL BE stationary.

You may have done some over-differencing, but it doesn't sound as if you have a lot of choice?

Are there any structural breaks in your data that may have "contaminated" the unit root/cointegration tests?

I hope this helps.
DG

I was planning on doing both Grangers causality and impulse response function. Assuming I am dealing with I(1) variables, do I have to fit two VAR models one at levels and one at first difference in order to test for both?

1. Fred - one in the levels for TY testing for causality. Then one in the differences for the IRFs.
DG

2. Unfortunately I'm a Stata user. I don't think I am capable of implementing your EViews steps of the T-Y test.

Could you tell me what the main drawbacks of running the conventional Granger Causality on first difference VAR are?

3. All you have to do is fit a levels VAR, include an extra lag of each variable, and then make sure you DON'T include the coefficients of the extra lags when you do the Wald test. This is easy to implement with any of the usual packages.

If you proceed with a first-difference VAR, the test statistic for Granger causality will not be valid - even asymptotically. It won't have an asymptotic chi-square distribution. Its asymptotic distribution will depend on unknown nuisance parameters. In short - you're sunk!

4. Hello,
What you are saying is that I cannot estimate an ARMAX model with the time series in first difference, in the case that they are cointegrated?

5. That's right.

22. Hello sir,
I have a few questions about the macroeconomic data I am trying to model using a VAR. I have 5 data series, including interest rate, exchange rate, money supply etc. I find that 3 of the series are I(0) and 2 of them have a unit root, with stationarity in first differences. First of all, can I estimate using the first difference of the two series and leave the other three in levels and still use a VAR? Or would I need to take the first difference of all series in the model? Secondly, when I run the VAR with the first difference of the two non-stationary series and the levels of the 3 stationary series, I get a very low R-squared of around 13%, a log likelihood of 230, and a determinant residual covariance of 4.71E-19. Can you tell me if I can go ahead with impulse response testing? I believe the model is a very bad fit based on these. What would be your suggestion to improve it, since I cannot change my data?
Many thanks in advance and regards,
Has

23. I'd suggest you do an ARDL/Bounds test analysis to see if there is a long-run relationship. Then you can decide whether you should be using a VAR or a VECM. It also sounds as if you may have an essentially-singular system.

1. Hello sir, thank you for your response, but how can I solve the problem of having a singular system? Can I accept there is multicollinearity but continue with the model ? Could you provide some suggestions as to how to sort this issue please?
Greatly appreciated,
Regards

2. No, you can't continue - there is something about the way you have set up the equations that is causing the problem. You'll have to re-consider the specification of the equations.

24. Dear Prof. Giles,

I want to test the impact of external shocks (oil, US gdp) on domestic variables (gdp, cpi). Variables like Oil, GDP, CPI are I(1) while US gdp is I(0). When I test the cointegration of all variables at LEVELS, they are cointegrated. However, in order to impose restrictions, I prefer to use an SVAR model to estimate the IRF & FEVD. Is SVAR ok in this case?

In SVAR, I need to take differences of Oil, GDP, CPI. I am confused about differencing US GDP (because it is stationary at level), because the results of the 2 models are quite different.

Could you help me on this issue? Thank you so much :).

1. If US GDP is I(0) it shouldn't be included in the cointegration testing. In addition, if you difference an I(0) variable it is still stationary (though not I(0)). So, you could difference it along with the other variables in the model.
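
A small simulation may help to show what "still stationary (though not I(0))" looks like in practice. The sketch below assumes the simplest possible I(0) series, Gaussian white noise, and compares the lag-1 autocorrelation before and after differencing. The helper `acf1` and the sample size are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
e = rng.standard_normal(20000)   # white noise: a simple I(0) series
d = np.diff(e)                   # first difference: e_t - e_{t-1}

def acf1(z):
    """Sample lag-1 autocorrelation."""
    z = z - z.mean()
    return float(z[1:] @ z[:-1] / (z @ z))

rho_level = acf1(e)   # close to 0 for white noise
rho_diff = acf1(d)    # close to -0.5: the MA(1) pattern left by over-differencing
```

The differenced series has a constant mean and variance, so it is stationary, but the strong negative autocorrelation at lag one is the usual trace of having differenced a series that did not need it.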

25. Hi Giles
My sample is 16 years, because I couldn't find more than 16 observations due to a shortage of data; there are also no daily, weekly, or even monthly data. I would like to investigate whether there is a long-run or short-run relationship between tourism and economic growth using the VECM model, and also the causality if possible. My question is: is it okay to run a VECM, given that all the econometricians say it is impossible to run a time-series analysis with fewer than 30 observations? If not, is it okay to run an ARDL model, because in your post you mentioned that it is the best model for a small sample? Please, I really need your help.
Thanks
NAGMA

1. Nagma - I would be very concerned to see such a short time series being used for this sort of analysis.

2. So, what can I do? I found a paper that used the bounds test for cointegration and T-Y for causality with the same sample size. Please help, with an expanded explanation and justification.

3. There are lots of bad/weak papers out there. You need more data. That's all I can say.

26. Hi Giles,

I have two I(0) and two I(1) series. When I run a cointegration test, I find that there is cointegration between series. I also applied Granger causality test with T-Y procedure. But I also want to see impulse-response function. What should I do? 1- Take differences of all the variables and run a VAR model. 2- Take differences of only I(1) series and run a VAR model. 3- Run a VECM.

Thank you very much in advance :)

1. I'd use a VECM for this.

27. Dear Prof. Giles,

I have estimated a VAR model in the levels to test for Granger causality. It consists of two I(1) and one I(0) time series and p=1, m=1 lags. Running a Wald test showed that the lags of time series 1 together with those of time series 2 do not significantly differ from zero, so that ts1 and ts2 do not Granger-cause ts3. I also ran the Wald test for the other two equations and there is no statistical significance, so there should be no Granger causality if I understood the procedure right.
Estimating a VAR in the differences (and hoping there will not be any over-differencing of the I(0) variable) shows some significant coefficients (even between the time series). Shouldn't the coefficients be zero when there is no Granger causality? Or is one of the models false? I had to estimate the VAR model in the differences with more than p+m lags in order to make sure that this VAR model is well-specified.

Thank you very much :)

1. Thanks for the comment. No, the coefficients need not be zero. By the way, as you have a mixture of I(1) & I(0) variables, did you allow for this properly when you tested for cointegration? For example, see my post at http://davegiles.blogspot.ca/2011/04/testing-for-granger-causality.html

2. So if there is no Granger causality found when testing the VAR model in the levels with the T-Y approach there doesn't exist any Granger causality no matter what the significances and estimated values of the VAR in the differences are?

3. Correct. I presume your sample size is large enough that you can safely appeal to the asymptotics needed for the TY -Wald test.

28. Hello Sir,

I am a bit weak in this field. I want to know what steps proceed after testing for co-integration. I want to test the long run relationship between the US stock market and other stock markets. Firstly, i will do a pairwise cointegration test between the US and the other stock markets. If I want to find the short term relationship, what should I do? What is the difference between Granger causality test and VECM? Why and when should we use them?

1. If you find that your data are cointegrated then you can use the levels of the data in an OLS regression to estimate the long-run relationship between them. You can also estimate an ECM if you are interested in the short-run dynamics.

29. Dear Dave Giles

I'm interested in estimating the long-run relationship between four variables that are I (1) and are also cointegrated (only one long-term relationship). I have the following questions
1. Should the Granger causality test be done with the VAR or the VECM?
2. What does it mean if we do not reject H0 in the Granger causality test in the VAR?

1. 1. I'd use the VAR - see my explanation in this post.
2. If the variables are in fact cointegrated then there HAS to be G-causality in one direction or another. Failure to detect it may be due to the use of a very small sample, or the presence of structural breaks in the data.

30. Dear Dave
I would like to know-
1. When i use a VAR model how can i get impulse response function for a negative shock using Eviews?
2.In case of structural VAR, how can i create a confidence band for impulse response function?

1. 1. Click the "impulse" tab, then "impulse definition", then "user specified". Use HELP to see how to specify your impulse.

2. No idea off hand.

31. Sir,
1) My x & y are both I(1). In fact both are growing. (d(x) & d(y) are both I(0)).
2) d(x) Granger-causes d(y) - and vice versa
3) T-Y tests show Granger-causality from y to x - but NOT vice versa
4) According to Johansen's test y & x are cointegrated. Impulse response functions from VEC show y declining in response to a shock to x and x rising in response to y.
My ECONOMIC conclusion is that "x does NOT drive y". (This is pretty heretical to most economists).
My silly question: Do I need to report (or engage in) steps 2 & 3 at all?
L (an econometrics' autodidact).
P.S. Regarding confidence bands for Impulse response functions from VEC. Eviews gives these bands for VARs. How about transforming VEC into VAR (with VAR in differences, with the exogenous variable defined as the cointegrating eq. specified in VEC?)

1. Go with your economic reasoning first and foremost.

32. Hi sir
I am working on panel data, All my variables are I(1) and cointegrated. I am interested in estimating long-run and short-run causality . How I can do this?
My second question is can we apply Dumitrescu and Hurlin (2012) causality test on multivariate model?

33. Dear Prof Giles,
First, thank you very much for this helpful blog.
I am writing my thesis and I want to be sure that I have understood correctly.
First: Granger causality on a VAR should be implemented with the Toda et al. approach if there is cointegration.
Second: if there is cointegration, there is for sure Granger causality.
Third: is it possible to implement it in Stata or R?
Thank You very much
ale

1. Your understanding is correct in all 3 cases. You can do this in any package that allows you to estimate a VAR and to perform a Wald test - that includes Stata, R, etc.

34. Dear Prof.,

Thank you for all the good job! I've one seemingly simple question: how can I change negative values into logarithm in eviews? It automatically drops them

1. The logarithm of a negative number is not defined, mathematically. So, you're asking for the impossible!

2. Thank you Professor. I understand the logic but I have many variables with negative values in my regression which I have to transform to logs. What shall I do about them?

3. You can't. The fact that there are negative values is telling you that such a specification would be incorrect.

35. Prof.,

If there is co-integration among variables, is it a must that they have long term relationship?

Thank you!

1. Yes - that's what cointegration is.

2. Thank you Prof.

So, that means the error correction term in the vector error correction model must be negative and significant (?)

3. why should be also negative? is significant not enough?

36. Dear prof,
Is it important for the coefficient of the ECT to be less than one? And if we get a coefficient of the ECT equal to 2 or 3, is it wrong?

1. The coefficient of an error-correction term should be negative.

2. Dear prof, what does it mean, if ECT less than -1, in other words, if it equals for example -1.1

3. Then the results make no sense - the error-correction term is "over-correcting" in trying to get back to equilibrium.
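
To make the idea of "over-correcting" concrete, here is a deliberately stripped-down sketch. It assumes a single deterministic error-correction recursion with no shocks and no other dynamics, so the gap from equilibrium evolves as z_t = (1 + alpha) * z_{t-1}. The function `gap_path` and the starting gap of 1.0 are hypothetical, chosen only for illustration.

```python
def gap_path(alpha, z0=1.0, periods=5):
    """Path of the disequilibrium gap when a fraction -alpha is corrected each period."""
    path = [z0]
    for _ in range(periods):
        path.append((1.0 + alpha) * path[-1])
    return path

smooth = gap_path(-0.5)      # half of the gap is closed each period, same sign throughout
overshoot = gap_path(-1.1)   # more than the whole gap is "corrected", so the sign flips
```

With a coefficient between -1 and 0 the gap shrinks towards zero from one side. With a coefficient below -1 the adjustment jumps past the equilibrium every period, which is the over-correction referred to in the reply.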

37. Prof.
I tested for cointegration using the Johansen cointegration test, but found no cointegration. Can I use Granger causality, or what is an option?
Thank you

1. Yes - of course. Cointegration implies there must be Granger causality, but the converse isn't true. You can have G-causality even if there is no cointegration.

38. Dear Prof,

Do you know why VECM model take the first difference as dependent variable while VAR take level as dependent variable? Thanks.

1. This is to take account of the non-stationarity of the data.

39. Dear Professor,

If I test for cointegration between 5 variables using the Johansen cointegration test and a cointegrating vector is found, does it mean that any pair of two variables has a cointegrating relationship?

1. No - this is NOT the correct interpretation. If you use the Johansen method, the precise nature of any cointegrating vectors will be revealed explicitly.

2. Dear Professor,

Thanks. Would you mind to elaborate further "the precise nature of any cointegrating vectors will be revealed explicitly" means?

So for my case, should I run the Johansen Cointegration test for each individual pair separately instead of run all 5 together, in order to find out which pair is exactly cointegrated? Thanks.

3. No - you test all 5 for cointegration using Johansen's method. Any standard package - e.g., EViews - will provide output that shows how many cointegrating vectors there are and what variables appear (with what weights) in any such vectors. That's the whole point of the Johansen methodology.

4. Thanks. Can you advise me how should I do if I want to test which pair of variables have long run relationship? Should I use Engle approach?

5. Anonymous (21 July) - use Johansen's cointegration testing methodology.

40. Dear Prof,

Have you heard about the exclusion test, which is done after we have found a cointegrating vector using the Johansen procedure, to see which variable(s) do not participate in the cointegrating space? If yes, do you know how to run it? Thanks.

1. Yes - this type of testing was discussed in Johansen's original 1988 paper, as well as in several subsequent papers by Johansen and Juselius (among others). You can see an application in their 1992 paper (in the Oxford Bulletin of Economics & Statistics) involving the demand for money. For a good overview of this type of testing, see: http://www.nuff.ox.ac.uk/economics/papers/2003/w10/BoswijkDoornik.pdf
I'll try to do a post on this at some stage.

41. Dear Professor,

Do you know why we can't conduct a unit root test in EViews for panel data with N=20, T=6? It shows an insufficient number of observations. But our total number of observations is 20*6=120, correct? Thanks.

1. What matters is the value of T, and T=6 is insufficient. Questions such as this would be better addressed to the EViews forum, rather than this blog. See http://forums.eviews.com/

42. Dear Prof.,
Thanks for your great work. It's really helping me lot.
I'm Kaleswaran. R doing Ph.D in Pondicherry Central University in the area of "India's Foreign Trade and Its Contributions on Economic Prosperity", in India. Generally, we have to conduct for 'Cointegration' test if two variables are integrated at same order. In my case, two variables are I(1). So, now I have conduct 'cointegration' test. Here, among five equations which one I have to choose. My data are having linear trend. Equation 3 and 4 ('Intercept (no trend) in CE and test VAR' and 'Intercept and Trend in CE-no intercept in VAR') are showing no cointegration relationship exist among the variables. But, equation 2 'Intercept (no trend) in CE - no intercept in VAR' is showing 1 cointegration exist (I'm using Eviews 7). Is it correct to choose this equation 2 when our data have linear trend?

1. I haven't seen your data, but "yes", that sounds correct.

43. Dear Prof.
Great Blog

if we found in the VECM no co-integration ( vector rank zero ) then I should not use VECM. rather I should use regular regression of variables first difference?

and if I found more then rank 3 vector for 2 variable model in VECM, than I can use VECM normally ?

1. 1. Yes. You don't use a VECM if there's no cointegration.
2. I don't understand your second question.

44. Hello sir,
I have read the post, but I still don't understand some things, because I'm really new to this field. I did the Engle-Granger cointegration test and my variables are not cointegrated. The variables are all I(1) in levels and I(0) in first differences. Can I proceed to a Granger causality test (I keep seeing in the post that variables must be cointegrated to have causality)? If I can, is a VAR the appropriate model to use? Do I need to use the variables in first differences or in levels? I'm really sorry, because there was a similar post, but I don't seem to get it. Thanks very much!

1. You have mis-read what I said. If the variables are cointegrated, then there must be Granger causality. However, the converse is NOT true: you can have causality without cointegration. If you are really sure that your variables are all I(1) and not cointegrated you could use a VAR in the first-differences for modelling and impulse response purposes. However, if you want to test for Granger causality in your case, I'd be using a levels VAR model and using the Toda-Yamamoto (MWALD) approach - see my other posts on this.

45. Hello Prof.
If I was using VAR model however there is autocorrelation
should I change the VAR model or is there some steps to fix it or should I just report my findings and keep it the way it is
?
thanks

1. Extending the lag length(s) will usually resolve this - if you have enough degrees of freedom to be able to do so.

46. Hi, I'm studying the effects of macroeconomic variables on stock returns and four of my variables are I(1) and the other is I(0), can i run a johansen test of cointegration?

1. You can only test for cointegration among variables that are non-stationary. However, take a look at my posts on ARDL models.

47. Dear Prof. Giles,
I really appreciate your clear and transparent explanation of the procedure for conducting a T-Y Granger causality test. I have several questions and would be grateful for your reply. 1. If the T-Y procedure fixes the asymptotic distribution of the Wald test within a VAR framework, would the distribution of the LR test statistic or LM test statistic also be the standard one? 2. How large should the sample size be to be considered large enough? 3. If the sample size is "extremely" small, can some other procedure be used? For example, can I bootstrap the Wald chi2 statistic? Thank you very much!

1. Kang: Yes, the same applies to the LM test or LRT. As with any asymptotics, there's no "magic number". In the case of a small sample, bootstrapping the test statistic is the best thing to do.

DG

48. Hi Prof.
I am conducting research on the relationship between two macroeconomic variables. The ADF test gave me the result that the data are non-stationary in levels, but at first difference they became stationary. Then I employed the Johansen cointegration test and found 'there is one co-integrating equation' in both the trace and max eigenvalue tests. Now what can I do? Or what should be my next step?

1. You just estimate a model using the first-differences of the data.

2. That is, a regular VAR in the differences.

49. hello sir. My series are I(0) and I(1). They are cointegrated according to ARDL bounds test. What kind of causality should I use; Standar Granger Causality, VECM Causality or Toda-Yamamoto??

1. Toda and Yamamoto.

50. dear sir,
In my model, the ECT is negative but the p-value is .25. What does it mean, and what next step should be taken? Thanks

51. Dear Prof.,
I would like to know if it's possible to do the ARDL and VECM tests for 10 variables with annual data for 28 years?

I truly appreciate you help.

Best Regards
Ali Alshawaf

1. Ali - no, you won't have enough degrees of freedom to do anything meaningful.

52. Dear Prof and valued members
I employ a time series data that to measure the impact of Time deposit interest rate, M/BV, and a dummy variable on banks liquidity as measured by customer deposit. however, after checking for unit root (ADF); I've found that the variable of time interest rate is stationary at level I(0), M/BV at the second difference I(2), the dummy at level I(0), and the dependent variable "customer deposit" is stationary at the first difference I(1). could you please tell me the best model for achieving that?

53. Dear Professor,
To begin with, I would like to thank you for your extraordinary blog. It especially helps people like me (read: working professionals) who are new to or not comfortable with statistics and have to complete a task in a set time.
I have been working on a model to find the relationship between the market price and future contract prices (3 months) of a stock, e.g. Amazon's stock price (M) with Amazon May (M1), June (M2) and July (M3) future contract prices on particular data.
I have collected daily data for the last 4 years. I checked for stationarity using ADF in R with the default lag. M is non-stationary with p-value '0.6', and M1, M2 and M3 are stationary. But when tested at first difference, all of them are seen as stationary (p=0.01).
I then planned to use a VECM or VAR to find the relation between them in terms of an equation. But from what I am reading, it has been suggested that I go for cointegration testing, then Granger's causality test, and then VECM.
I don't understand why I can't simply apply a VECM or VAR to the time series instead of doing this testing, because as I understand it, these tests don't alter my time series. Is my approach correct? Should I use a VECM or VAR or some other model? Should I be carrying out testing of the time series in levels or first differences?
Please let me know. I am confused and don't know how to progress further.

You can't estimate a VECM model unless you have tested for, and found, cointegration. That's not going to be even possible unless ALL of your data are I(1). So, you should be using a VAR model. If you have a 10% significance level in mind, then ALL of your series are stationary according to the ADF test, so you could use the levels of all of the series in your VAR. If you want to be more cautious, you could cross-check the ADF results by also applying the KPSS test for confirmation. Alternatively, you could play it safe by estimating your VAR model using the differences of all of the series.

54. This comment has been removed by the author.

55. Dear Sir,
I am analysing time series data using cointegration and VECM. All the series were tested for a unit root allowing for structural breaks. The tests reveal that all the series are non-stationary, and also contain structural breaks. This suggests that I will need to account for the breaks in the VECM model. However, the structural breaks in the series have different break dates. As a result, I am not sure how to incorporate the different breaks in the VECM model. I am asking if you can assist me with the right approach for dealing with the different breaks in the VECM.
Many thanks.

1. I would imagine that you need several dummy variables - for each of the breaks.

56. Dear Prof. Giles,

Thank you very much for this blog. It has been of tremendous help for my Master's thesis. I would be grateful if you could confirm whether the TY procedure can be applied in a panel data (Panel VAR context).

Regards,
Ankesh

https://www.researchgate.net/publication/227347359_Testing_for_Granger_causality_in_heterogeneous_mixed_panels

57. Dear Dr. Giles,

I first want to express my sincere thanks for your blog filled with extraordinary knowledge. As far as I understand your comments correctly we cannot use VECM Granger causality unless all variables are I(1) and co-integrated. However, I am wondering what are the reasons why we cannot apply VECM Granger causality to a mix of I(1) and I(0) co-integrated variables.

1. I(0) variables can't be cointegrated, by definition.

58. Dear Prof. Giles,
I'm very thankful for all your interesting and fabulously detailed posts. I find myself in a very uncomfortable situation.
I have a set of 7 variables (I know it is a lot for a VAR), monthly data from 1997 to 2014.
I'm trying to identify how different supply and demand variables affect a commodity price.
Therefore I conducted my study using an SVAR approach which delivered interesting IRFs.
After leaving that project on a side for more than a year I found this post of yours and wonder if the SVAR analysis was the right approach to my study and if I shouldn't maybe consider a VECM. Here some of the information:

- From the 7 variables 5 appear to be I(1), the rest are stationary.
- I conducted a Johansen test for all the variables and it finds 3 cointegration relations.
- If I choose only the I(1) variables to test for cointegration relations I obtain only 1 cointegration relation.

I'm very confused on what to do. After investing a lot of time on this project and going very deep into SVARs I have the feeling my research is wrong. Could you tell me what u think about this?
Thanks a lot for your input,
Chris

1. Chris - first, your Johansen testing for cointegration should involve only the 5 variables that are I(1). Second, if you have found cointegration then you really need to estimate a VECM model. The other 2 stationary variables should be included in the model in their levels (not differenced).

59. Dear Professor
Can you tell me how we can get the p-value for the long-run coefficients in a VECM after JJ cointegration? The long-run output has only standard errors and t-statistics, but it does not show the p-value. I would appreciate your guidance in this regard.
Thanks alot Professor

1. The t-statistics will be asymptotically standard normal, so you can compute the p-value trivially from that.
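
For anyone who wants to do this by hand, the calculation can be sketched as follows. It assumes a two-sided test and relies only on the asymptotic standard normal approximation mentioned in the reply. The function name `normal_two_sided_pvalue` is just a label chosen for this example.

```python
import math

def normal_two_sided_pvalue(t_stat):
    """Two-sided p-value, 2 * (1 - Phi(|t|)), under a standard normal reference."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

p_at_196 = normal_two_sided_pvalue(1.96)   # roughly 0.05
p_at_zero = normal_two_sided_pvalue(0.0)   # exactly 1.0
```

The identity used here is that 1 - Phi(x) equals 0.5 * erfc(x / sqrt(2)), so no statistics library is needed.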

60. Dear Prof Giles
I do not want to begin with any more adulatory comments about your blog because it is now a common reference point for all practitioners.
My question is about the sample size required for conducting Granger causality test using VAR or VECM model.
Suppose we have two variables with annual data for 30 years. Let p of VAR model is 3. So we have 14 parameters ( a,b, c and d, in your example) of the VAR model. Let order of integration be 1 for both the variables. So we will be adding another 2 parameters in each equation. Thus, the total number of parameters to be estimated are 18. How robust will be our estimated parameters and subsequent conclusion about the underlying causality based on these parameters? We also need to reckon with the fact that in macroeconomics chances of a structural break is very high in a span of 30 years.
This issue struck me while reviewing results of an econometric exercise. Your advice pl.
Regards
Ashok

1. Ashok - in short, for a model with these specifications, 30 annual observations seems rather short.
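
The arithmetic behind Ashok's question can be laid out explicitly. The sketch below assumes a bivariate VAR with an intercept in each equation, p = 3, one extra lag (m = 1) for the T-Y augmentation, and T = 30 annual observations, as in the question. The helper `var_parameter_count` is a hypothetical name used only here.

```python
def var_parameter_count(k, lags, intercept=True):
    """Return (coefficients per equation, coefficients in the whole VAR)."""
    per_equation = k * lags + (1 if intercept else 0)
    return per_equation, k * per_equation

k, p, m, T = 2, 3, 1, 30
per_eq_base, total_base = var_parameter_count(k, p)     # 7 per equation, 14 in total
per_eq_ty, total_ty = var_parameter_count(k, p + m)     # 9 per equation, 18 in total

usable_obs = T - (p + m)                  # 26 observations left after losing the initial lags
df_per_equation = usable_obs - per_eq_ty  # 17 residual degrees of freedom per equation
```

Seventeen residual degrees of freedom per equation is not much to lean on when the test is justified only asymptotically, which is consistent with the concern expressed in the reply.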

61. Hello and congratulations for your great and helpful post. A question: Can I apply a VARM at level with stationary data but with no cointegration? thanks in advance.

1. Yes. (If the data are all stationary, they can't be cointegrated.)

62. hello Mr. Giles,may you answer me this question... if the Johansen test shows two cointegrating equations.. why do always in a vecm we use just one? thanks

1. We don't. If there are 2 cointegrating relationships, then 2 error correction terms will enter the VECM model.

63. Dear Sir, Suppose in the VAR (variables in Level), the optimum lag length is 2. For Co-integration test and (VECM), I should take lag 1 as optimal lag length. Am I right. Or should I take the differenced variables I(0) and check the lag length and then proceed to check co integration in Levels. Sometimes I find lag 0 as optimal lag. What should I do in such cases in estimating VAR.

1. Regarding your first question - yes, you are right. If the optimal lag is zero then you don't have a VAR. I'd use one lag in this case.
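
The reason a VAR(2) in levels goes with one lagged difference in the VECM is purely algebraic: Y_t = A1 Y_{t-1} + A2 Y_{t-2} + e_t can be rewritten as ΔY_t = (A1 + A2 - I) Y_{t-1} - A2 ΔY_{t-1} + e_t. The following sketch checks that identity numerically. The coefficient matrices `A1` and `A2` are made-up values for illustration, not estimates from any data.

```python
import numpy as np

rng = np.random.default_rng(1)
k, T = 2, 50
A1 = np.array([[0.6, 0.2], [0.1, 0.5]])    # hypothetical VAR(2) coefficients
A2 = np.array([[0.3, -0.1], [0.0, 0.4]])

# Simulate the VAR(2) in levels.
Y = np.zeros((T, k))
eps = rng.standard_normal((T, k))
for t in range(2, T):
    Y[t] = A1 @ Y[t - 1] + A2 @ Y[t - 2] + eps[t]

# Error-correction form: one lagged level and ONE lagged difference.
Pi = A1 + A2 - np.eye(k)
Gamma1 = -A2
dY = np.diff(Y, axis=0)                    # dY[i] = Y[i+1] - Y[i]
lhs = dY[1:]                               # ΔY_t for t = 2, ..., T-1
rhs = Y[1:-1] @ Pi.T + dY[:-1] @ Gamma1.T + eps[2:]
max_gap = float(np.max(np.abs(lhs - rhs)))  # zero up to rounding error
```

The same rearrangement with p lags in levels leaves p - 1 lagged differences, which is why the lag length drops by one when moving from the levels VAR to the VECM.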

64. Dear Sir,

I'm analyzing 3 time series, each one of them has 45 observations. They're I(1). Is it enough the number of observations in order to fit a VECM with them? Should I fit an ARDL model or another one instead of it in order to get the relationship between them? In advance, thank you very much for your help.

1. First thing - what's the objective of this research??? That's crucial. Second -they're all I(1), but are there any cointegrating relations?

65. Dear Sir,
I am working with 5 variables where only two of the independent variables are I(0) and all the rest are I(1). Can I run a VECM here (assuming cointegration exists)? For a VECM to be operational, the variables must be cointegrated, so should I search for cointegration among all [I(0) and I(1)] the variables or only among the I(1) variables? Thank you in advance for your help.

1. Just among the I(1) variables.

66. Hello!

you might find this question boring! but i would higly appreciate your answer!

I used a ARDL bound test to find a cointegration relation between my variables (or should i use a johansen cointegration test since all my variables are I(1)?). the variables are cointegrated.
i think the traditional way to test for causality in presence of cointegration is by using a VECM to test for Granger non-causality. Yet, i would like to know if i can run a VARM (using TY procedure) to test for Granger non-causality.

Also, i would like to know if you would use TY procedure (in levels) in the next cases:
1. variables are a mixture of I(0)/I(1) but they are not cointegrated.
2. variables are a mixture of I(0)/I(1) and they are cointegrated.
3. variables are I(1) but they are not cointegrated.
4. variables are I(1) and they are cointegrated.
5. variables are I(1). they can`t be cointegrated.

1. If you are comfortable that they are all I(1), then Johansen's procedure would be appropriate.

Re T-Y:
1. to 5. Yes in all cases.

2. Thank you for answering, professor!

I'm currently working on my thesis and i want to test for cointegration with an ARDL bound test and then test for granger non-causality with a VARM using Toda-Yamamoto procedure.

I really want to use ARDL bound test, but it seems like all variables are I(1) (I tested for unit root in my series with ADF, PP, KPSS and ZA tests). is ARDL bound test viable in my case (all variables are I(1)).

If I didn't misunderstand, according to your suggestion it is ok to use the ARDL bounds test for cointegration and then a VARM (using the T-Y procedure) when all variables are integrated of the same order (in my case, all variables are I(1)).

3. If you're really sure that all of your series are I(1), then just use Johansen's tests for cointegration. Then use T-Y to test for Granger non-causality.

4. Thank you!

67. Dear prof. Giles,

Thank you for all your valuable contributions. This blog has taught me a lot.

Still, I have one question regarding TY causality. Does it refer to short run or long run causality?

My best guess is short run, the same as "regular" granger causality. As opposed to the causality through the error correction term (in VECM) that may be considered as long run causality.

1. That's correct.

68. Dear Prof. Dave Giles,

Thank you for the wonderful blog. I have one question.

When testing a VAR, do we need to test for all of autocorrelation, heteroscedasticity and normality? If we have a problem with heteroscedasticity, is increasing the lag length the only way to deal with it? Thank you.

1. Increasing the lag length should deal with autocorrelation, but it won't help with hetero. Use the het-consistent covariance matrix when constructing the Wald test for causality. The non-normality is not really a big issue. Your causality test is only going to be asymptotically valid, after all.
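
One way to build a Wald statistic from a heteroskedasticity-consistent covariance matrix is sketched below. It assumes White's basic (HC0) estimator and a single OLS equation. The function `robust_wald`, the simulated regressors, and the error structure are all hypothetical and serve only to show the mechanics.

```python
import numpy as np

def robust_wald(X, y, idx):
    """Wald statistic for H0: beta[idx] = 0 using White's HC0 covariance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = (X * u[:, None] ** 2).T @ X     # X' diag(u^2) X
    cov = XtX_inv @ meat @ XtX_inv         # sandwich estimator
    b = beta[idx]
    V = cov[np.ix_(idx, idx)]
    return float(b @ np.linalg.solve(V, b))

# Hypothetical data with error variance that depends on x1.
rng = np.random.default_rng(7)
n = 500
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
y = 1.0 + 2.0 * x1 + np.abs(x1) * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1, x2])

wald_x1 = robust_wald(X, y, [1])   # x1 truly matters, so this should be large
wald_x2 = robust_wald(X, y, [2])   # x2 has a true coefficient of zero
```

In a causality test, `X` would hold the lagged levels of the VAR equation and `idx` would pick out only the lags being tested, leaving out any extra T-Y lags.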

69. Dear Professor Giles, I'm trying to chose between VECM and ARDL approaches for modeling inflation (CPI Index) in Sri Lanka. My sample size is only 39 annual data and have altogether nine variables. Is it true that ARDL works better with small samples and higher number of variables compared to VECM? Thank you very much in advance!

1. Dhanusha - with only 39 observations, and nine variables (and their lags), you're not going to be able to do very much with either type of model. Just think about the degrees of freedom.

70. Dear Prof. Giles
Your blog helps a lot in understanding econometrics. Thank you!

I've got one thing that I do not fully understand:
When identifying lead-lag relationships between two variables (both I(1)), I take the first difference and model them in a VAR-model. Now when testing for Granger causality, I have to set up a new VAR, with the variables in levels. My questions: Is that procedure all right? And when I want to find out whether there is Granger causality between let's say stock returns and bond spreads, do I still include them in levels (prices not returns then)?
Help is very much appreciated.

1. Yes, you use the levels of the data when testing for Granger causality, even though the data are non-stationary. However, you need to alter the model (e.g., as in the Toda-Yamamoto procedure) to ensure that the Wald test statistic has its usual asymptotic chi-square distribution.

2. I see; I just wondered because it seemed unusual to me to use levels (despite the fact that you mentioned it in this and one of your other texts). I've started studying time series analysis about two months ago, so I still need time to get along with it a bit better.
Thank you very much for your help. I wish you a good day!

71. Dear Prof. Giles
You stated that with the T-Y-procedure, checking for cointegration is not necessary (since one only uses it to check the results at the end). My question: Is Granger causality "enough" to quantify the relationship between two variables? If one wants to answer the question "What relationship does X and Y have?", is cointegration really needed when there is or isn't Granger causality?

1. My comment was purely to do with establishing Granger causality. Nothing more than that. You can have Granger causality (uni-directional, or bi-directional) with or without there being cointegration. However, if you have cointegration then there MUST exist Granger causality in one direction or both. The latter follows from Granger's "Representation Theorem". If you want to answer a very general question, such as "What relationship does X and Y have?", I'd suggest you separate it into 2 sub-questions: "Is there a short-run relationship?"; and "Is there a long-run relationship?"

2. Thank you for your explanation, it's really helping.
Short-run relationships can be tested by Granger causality and long-run relations by looking for cointegration. Is that reasoning correct?

3. And if you have cointegration, the relationship in the levels of the data is the long-run "equilibrating" relationship, while the corresponding "error-correction" model gives you the short-run dynamic relationship.

4. If I find Granger causality and the variables are cointegrated, it is the long-run relationship? Okay, I was mistaken there, I'm sorry. So basically: Granger causality without cointegration refers to the short-run eq, Granger causality with cointegration to the long-run eq (+ VECM coefficients show you the short-run eq) and cointegration without Granger causality to the long-run as well. That should be alright now, isn't it?
(I'm sorry for all these questions)

5. Correct! (That's OK :-) )

72. Thank you for your help! :)

73. Dear Prof. Giles
I wanted to quickly ask you something: Is the T-Y procedure still valid, even if you choose an arbitrary lag length? For example: AIC tells you to take 8 lags, but the errors are highly correlated. So you increase the number of lags to 12. Is the T-Y procedure (applied to a VAR with 13 lags then) still valid and superior to Granger causality in differences?

1. Yes, that's fine. The main thing is to make sure that you don't UNDER-state the lag length.

74. When heavily overstating the lag length, will the Wald test have an asymptotic chi-squared distribution though?

1. It will still be O.K., but its power may be reduced somewhat.
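The lag-augmentation idea discussed in the last few comments can be sketched in code. This is only a minimal illustration, not the author's own implementation: the function name `ty_wald`, the simulated data, and the choice of p = 2 and m = 1 are all assumptions made for the example. It fits one equation of a VAR(p + m) in levels by OLS and computes the Wald statistic for the first p lags of X only, leaving the extra m lags unrestricted.

```python
import numpy as np

def ty_wald(y, x, p, m):
    """Wald statistic for H0: the first p lags of x have zero coefficients
    in the y-equation of a VAR(p + m) estimated in levels.
    (Hypothetical helper, written for illustration only.)"""
    k = p + m                      # total number of lags actually fitted
    n = len(y)
    rows = n - k                   # usable observations after losing k lags
    # Column j-1 holds the series lagged j periods.
    lags_y = np.column_stack([y[k - j:n - j] for j in range(1, k + 1)])
    lags_x = np.column_stack([x[k - j:n - j] for j in range(1, k + 1)])
    Z = np.column_stack([np.ones(rows), lags_y, lags_x])
    target = y[k:]
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    sigma2 = resid @ resid / (rows - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    # Test only the first p lags of x; the extra m lags are NOT restricted.
    idx = np.arange(1 + k, 1 + k + p)
    b = coef[idx]
    wald = b @ np.linalg.solve(cov[np.ix_(idx, idx)], b)
    return wald, p                 # statistic and its degrees of freedom

# Simulated example (assumed data): x is a random walk that drives y.
rng = np.random.default_rng(1)
n = 300
x = np.cumsum(rng.normal(size=n))
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

wald, df = ty_wald(y, x, p=2, m=1)
print(wald, df)
```

The statistic would be compared with a chi-squared critical value with `df` degrees of freedom. In the example from comment 73, p would be 12 and m would be 1, so 13 lags are fitted but only the first 12 are tested.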

75. Hi, Prof. Giles
I'm confused about the ECT. After I found cointegration with the Johansen test, I proceeded to a VECM, but my ECT was insignificant and it was negative. Does this mean that my result is spurious? As far as I know, the ECT refers to the long-run equilibrium, so in my case it contradicts the Johansen test. Can I still proceed to VEC Granger causality and further tests?

1. The coefficient of the ECT should be negative. Also, this term relates to the short-run dynamics, NOT the long-run relationship.

76. Dear Prof. Giles
By critical values, do you mean the ones from the chi square table?

1. Yes, see step 11.

2. Ah right, thank you!

77. Dear Prof Giles,

May I ask whether it is okay to include two perfectly negatively correlated series (i.e. one series is the negative of another series) in a VAR framework to test for Granger causality?

1. It makes no sense, and if you try to actually do this you'll find that everything "falls apart". Try it! :-)
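For readers who want to "try it", here is a small sketch of why things fall apart. The data are simulated and the VAR(1) layout is an assumption made purely for illustration. When one series is exactly the negative of the other, their lags are perfectly collinear, so the regressor matrix loses a rank and the OLS normal equations have no unique solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.cumsum(rng.normal(size=n))   # a simulated random walk
y = -x                              # the perfectly negatively correlated series

# Regressors for one equation of a VAR(1): constant, lagged x, lagged y.
Z = np.column_stack([np.ones(n - 1), x[:-1], y[:-1]])

rank = np.linalg.matrix_rank(Z)
n_cols = Z.shape[1]
print("columns:", n_cols, "rank:", rank)   # rank is below the column count
```

Because the rank is smaller than the number of columns, Z'Z cannot be inverted, and a package will typically report a singular (or near-singular) matrix.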

78. Hello Prof Giles,

Do you see any problems with using weekly data instead of daily data in a VECM if my time period is just 2 years long? I'm examining the price discovery process between the CDS and bond markets.

1. No, I don't see any problem with this.

79. Hello Prof Giles,
I am new to EViews. I am currently working on a paper on the EKC hypothesis, which involves squared GDP, and all of my data are transformed for a log-log model. The problem is that when I try to include squared GDP in ‘estimate equation’, I get the message ‘near singular matrix’. I later found out that EViews does not allow a series that is derived from another series to be used in the same equation, which is my situation because squared GDP is derived from GDP. Could you help me with this, please, as my lecturer could not provide an appropriate answer either?

80. Professor Giles, I was sure I asked this question already but can't see my post? Maybe I'm looking at the wrong post. I'm working on a project which was last done 8 years ago and I have a couple of questions.

1) The last time it was done they used an Unrestricted Error Correction model (UECM) of the form:

d.X = d.Y + d.Z + L.X + L.Y + L.Z

This seems to be similar to Pesaran and Shin's ARDL approach? With a lagged dependent variable on the right hand side with the differenced and lagged independent variables. It's very hard to find documentation of this UECM elsewhere. Also, in Pesaran and Shin they have not differenced the dependent variable. Can you tell me which is the most appropriate method? I am dealing with a small sample (24 yearly obs last time and no more than 30 this time).

Also, I am only interested in the coefficient estimates, to be used in another model. In his time-series notes, Eric Sims recommends just regressing cointegrated series in levels, as the estimates will be "superconsistent"? This seems a lot simpler, and he even says that including lags of the dependent variable protects against potentially non-cointegrated independent variables? Do you agree?

81. It's impossible to answer your first question without knowing what X, Y, and Z are, and what are their orders of integration.
Regarding your second point: If all of the series are I(1) AND they are cointegrated, then regular OLS using the LEVELS (not differences) of the data will be super-consistent. That is, the estimator converges to the true parameter values at the rate "n" (the sample size), and not the usual SQRT(n) rate. This is well known, and is unaffected by adding lags of the levels to the regression. Of course, such a regression will only provide information about the long-run equilibrating relationship among the variables. It will tell you nothing about the short-run dynamics - and the latter is precisely what any sort of ECM is all about.
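The super-consistency point can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for the example: the data-generating process (a random-walk regressor, a white-noise error, a true slope of 2), the sample sizes, and the helper name `median_abs_error`. If the OLS estimator converges at rate n, multiplying the sample size by 16 should shrink the typical estimation error by roughly a factor of 16, rather than the factor of 4 implied by the usual SQRT(n) rate.

```python
import numpy as np

def median_abs_error(n, reps=500, beta=2.0, seed=0):
    """Median absolute error of the levels OLS slope across simulated samples.
    (Hypothetical helper, written for illustration only.)"""
    rng = np.random.default_rng(seed)
    x = np.cumsum(rng.normal(size=(reps, n)), axis=1)   # I(1) regressor
    u = rng.normal(size=(reps, n))                      # I(0) error
    y = beta * x + u                                    # y and x are cointegrated
    # OLS slope (no intercept) for each replication.
    beta_hat = np.sum(x * y, axis=1) / np.sum(x * x, axis=1)
    return np.median(np.abs(beta_hat - beta))

err_small = median_abs_error(100)
err_large = median_abs_error(1600)    # 16 times as many observations
ratio = err_small / err_large
print("error ratio:", ratio)          # well above 4 under super-consistency
```

The simulation says nothing about short-run dynamics; it only shows how quickly the long-run coefficient is pinned down when the series really are I(1) and cointegrated.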

82. Thank you, Professor Giles, and apologies, I should have been clearer. X is a measure of demand and Y, Z, ... are explanatory variables like income etc. They are all non-stationary and I(1).

Interesting, so an ECM is for short run dynamics I see. So if these coefficients were to be used for forecasting (yearly out to 2050) would you recommend an ECM?

1. See my earlier post at https://davegiles.blogspot.com/2016/05/forecasting-from-error-correction-model.html

That's a VERY long forecast horizon, no matter what sort of model you used. It doesn't sound like short-run dynamics to me.

83. Dear Professor,

My cointegration vector includes three variables (I used a VECM model for government expenditure, government revenues and GDP in logs). The coefficient of revenues in the coint. equation is not significant. Is there a problem in general with insignificant cointegration parameters under the Johansen framework?

*Note that the error-correction term of the EC model of ΔRev is also not significant and positive.

1. The thing that is a concern is the positive sign for the coefficient of the ECT - that makes no sense.