Thursday, March 5, 2015

Granger Causality & Seasonal Adjustment

One decision that we often have to make when modelling with time-series data is whether to use "seasonally adjusted" data, or the original (unadjusted) data. In some cases the decision is effectively made for us - only the seasonally adjusted data are published. This arises, for example, with some U.S. macroeconomic data, and it can be a bit of a pain.

For some previous comments on this, see here.

However, suppose that we have a choice - original data, or data that have been seasonally adjusted by some filtering method (e.g., the Census X-11/12/13 filter) - and we're interested in testing for Granger causality. Is there any evidence in favour of using one version of the data or the other?

Well, yes, there is. Let's take a look at it.

Perhaps not surprisingly, Granger himself had something to say about this - not in his seminal causality paper (Granger, 1969), but in Granger (1979).  

A useful starting point is to note that the very notion of Granger causality involves the information set that's available. In a nutshell, X1 Granger-causes X2 if the prediction of X2 (based on only the past information about X2) is improved when the information set is augmented by past information about X1. Seasonally adjusting time-series data alters the information set. So, not surprisingly, this may have implications for Granger causality, and our ability to detect its presence.
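
In symbols, and using notation that is introduced here only for illustration (it is not taken from Granger's papers): let I_{2,t} denote the past of X2 up to time t, let I_{1,t} denote the past of X1, and let sigma^2(. | .) be the mean squared error of the optimal one-step-ahead predictor given an information set. Then X1 Granger-causes X2 if:

```latex
\[
\sigma^2\!\left(X_{2,t+1} \mid \mathcal{I}_{2,t} \cup \mathcal{I}_{1,t}\right)
\;<\;
\sigma^2\!\left(X_{2,t+1} \mid \mathcal{I}_{2,t}\right)
\]
```

Seasonal adjustment replaces each information set with the past of a filtered series, which is why the comparison can come out differently for adjusted and unadjusted data.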

One of the results that Granger highlights is the following.

Let:
               X1t = Y1t + S1t

               X2t = Y2t + S2t       ;     t = 1, 2, ...

where the two Y series have no seasonal component; and the two S series are stochastic and strongly seasonal. The X series are stationary.

Granger (1979, p.43) notes that:
"There are numerous possible interrelationships between the two X series, for example, Y1t may be causing Y2t, but S1t, S2t are interrelated in a feedback (two-way causal) manner.
The effects of seasonal adjustment on the analysis of such relationships have not been studied thoroughly, although both Sims [16] and Wallis [17] have recently considered in some detail the case where Y1t causes Y2t and S1t, S2t are possibly interrelated.
If S1t, S2t are important components, then it is clear that even if they are not strictly related, so that they do not have any causes in common, it is virtually impossible to properly analyze the relationship between X1t, X2t without using a seasonal adjustment procedure. This is because S1t and S2t will certainly appear to be correlated, with the maximum correlation between S1t and S2t-k where k is the average distance between the seasonal peaks of the two series. Such spurious relationships are disturbing, and thus an adjustment is required."
Granger goes on to say (p.43):
"One aspect not apparently previously emphasized is that spurious relations may be found if autoadjustment is used in the case where the Y series are unrelated, but the S series are related. Suppose that the economically important components are Y1t and Y2t but, in fact, these series are independent. The economic analyst would presumably want the adjusted series to be unrelated in any analysis that he performs. However, in theory, this will not occur if S1t, S2t are correlated, and an autoadjustment is used."
His conclusions include the following (p.44):
"By considering the causation of seasonal components, one reaches the conclusions that it is incorrect to believe that the seasonal component is deterministic, that a complete decomposition into seasonal and nonseasonal components is possible by analyzing only the past of the series and that autoadjustment methods do remove the seasonal part, and this can lead to relationships being found between series that are in some sense spurious."
In short,
  1. Treating seasonality as being deterministic (which is what we do when we use seasonal dummy variables in a regression) can be dangerous.
  2. If we routinely seasonally adjust our data when there are strong seasonal effects, we may falsely "discover" causal relationships between the non-seasonal components.
  3. If there are strong seasonal effects and we don't adjust for these, we may "discover" causal relationships which are really just between the seasonal components of the series, and not between the non-seasonal components.
  4. On the other hand, failing to seasonally adjust the data can lead to the same situation.
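
Point 3 can be illustrated with a small simulation. This is only a sketch under assumptions of my choosing: the Y components are independent white noise, each S component follows a stationary seasonal AR process with a hypothetical coefficient of 0.8 at lag 12, and the two S components share a common shock. It reports contemporaneous correlations rather than a formal Granger test, but it shows how unadjusted series can look related when only their seasonal parts are.

```python
import random

def seasonal_ar(shocks, phi=0.8, period=12):
    """Stochastic seasonal component: S_t = phi * S_{t-period} + shock_t."""
    s = []
    for t, e in enumerate(shocks):
        prev = s[t - period] if t >= period else 0.0
        s.append(phi * prev + e)
    return s

def corr(a, b):
    """Sample (Pearson) correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(12345)   # fixed seed so the run is reproducible
n = 1200                     # hypothetical sample: 100 years of monthly data

# Non-seasonal components: independent of each other by construction.
y1 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [rng.gauss(0, 1) for _ in range(n)]

# Seasonal components: both driven by the common shock u, so they are related.
u = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 0.5) for _ in range(n)]
s1 = seasonal_ar(u)
s2 = seasonal_ar([a + b for a, b in zip(u, v)])

# Observed (unadjusted) series: X = Y + S.
x1 = [a + b for a, b in zip(y1, s1)]
x2 = [a + b for a, b in zip(y2, s2)]

corr_y = corr(y1, y2)
corr_x = corr(x1, x2)
print(f"corr(Y1, Y2) = {corr_y:.3f}")
print(f"corr(X1, X2) = {corr_x:.3f}")
```

The non-seasonal components are close to uncorrelated, while the observed series are clearly correlated, and all of that association comes from the seasonal components.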
If there is stochastic seasonality, this may take the form of unit roots at the seasonal frequencies. Testing for the presence of such roots, and for seasonal cointegration, and filtering and modelling the data accordingly, can be crucially important. (More on this in future posts.)
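
To make "unit roots at the seasonal frequencies" concrete, here is a sketch for monthly data, with L denoting the lag operator (notation assumed here for illustration). The seasonal differencing filter factors as follows, and its roots are the twelfth roots of unity:

```latex
\[
1 - L^{12} = (1 - L)\,\bigl(1 + L + L^{2} + \cdots + L^{11}\bigr),
\]
\[
z_j = e^{2\pi i j / 12}, \qquad j = 0, 1, \ldots, 11 .
\]
```

The root z_0 = 1 is the usual zero-frequency unit root. The remaining eleven roots, which are the roots of the second factor, lie at the seasonal frequencies 2*pi*j/12.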

In the meantime, let's consider an empirical example to illustrate how tests for Granger non-causality can be affected by seasonally adjusting the data. I've used EViews, and the workfile and the data are on the code and data pages, respectively, for this blog.

The data are monthly price indices for retail trade in the U.K. - one for goods in food stores (PF) and one for goods in non-food stores (PNF). The sample period is from 1986M1 to 2007M12. Here are the original data, and you can see that each series exhibits a strong seasonal pattern:
I've seasonally adjusted both series using the Census X-13 routine in EViews, and here are the results:

On balance, the ADF and KPSS tests that I've undertaken suggest that both the unadjusted and seasonally adjusted food and non-food price series may be I(1). For this reason, I've used the Toda-Yamamoto / "modified Wald" testing procedure to test for Granger non-causality. (For more details, see my posts here, here, and here.)

The basic lag-lengths (k) for the VAR models were chosen using the AIC and SIC measures. The models were then augmented by one more lag of each variable, to allow for the non-stationarity of the data, but this extra lag was not included in the tests for non-causality.
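
In symbols, the set-up just described can be sketched as follows for the food-price equation. The coefficient names are illustrative only, and the single extra lag corresponds to a maximum order of integration of one:

```latex
\[
PF_t = \alpha_0 + \sum_{i=1}^{k+1} \alpha_i \, PF_{t-i}
                + \sum_{i=1}^{k+1} \beta_i \, PNF_{t-i} + \varepsilon_t ,
\]
\[
H_0 : \beta_1 = \beta_2 = \cdots = \beta_k = 0
\quad \text{(PNF does not Granger-cause PF)} .
\]
```

The coefficient on the extra lag, beta_{k+1}, is estimated but left unrestricted, and under the null the Wald statistic is asymptotically chi-square with k degrees of freedom. The equation for PNF is treated symmetrically, with the restrictions placed on the first k lags of PF.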

The results that were obtained are as follows (with p-values in parentheses):


Looking at the p-values, and with a 5% or 10% significance level in mind, we see that in the case of the unadjusted data there is evidence of Granger causality from non-food prices to food prices. (Remember that the null hypothesis is "non-causality".) In contrast, when the seasonally adjusted data are used, there is no evidence of Granger causality in either direction.

We appear to have an example of "Case 3" above. 

Remember, this is just an illustration. However, hopefully it will lead you to "tread carefully".


References

Granger, C. W. J., 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37, 424–438.

Granger, C. W. J., 1979. Seasonality: Causation, interpretation, and implications. (With discussion.) In A. Zellner (ed.), Seasonal Analysis of Economic Time Series. NBER, Washington DC, 33–56.

Sims, C. A., 1974. Seasonality in regression. Journal of the American Statistical Association, 69, 618–626.

Wallis, K. F., 1974. Seasonal adjustment and relations between variables. Journal of the American Statistical Association, 69, 18–31.


© 2015, David E. Giles

4 comments:

  1. Could you please specify what autoadjustment is? It seems to be an important element here. Also, did I get the main idea right: as long as seasonal adjustment is successful (the seasonal component is well captured during the adjustment procedure), Granger causality works with no problem; but when seasonal adjustment fails (when adjusting, something not quite equal to the seasonal component is captured), Granger causality will likely be found due to the impact of the remainders of the seasonal components in the two series. (I hope my wording is comprehensible.)

  2. Autoadjustment refers to the Census X / STAMP methods routinely used by virtually all statistical agencies in the world. Your interpretation is correct, and there's lots of evidence that the automatic methods are far from "perfect". Also, they assume a deterministic seasonal component, and will not be adequate if the series has a stochastic seasonal component (i.e., unit roots at the seasonal frequencies).

  3. May I ask: is it possible to determine Granger causality based on an ARDL model? I'd appreciate your help. Thanks.

    1. Lerry - no, you need to set up a VAR model.
