Saturday, September 13, 2014

The Econometrics of Temporal Aggregation - IV - Cointegration

My previous post on aggregating time series data over time dealt with some of the consequences for unit roots. The next logical thing to consider is the effect of such aggregation on cointegration, and on testing for its presence.

As in the earlier discussion, we'll consider the situation where the aggregation is over "m" high-frequency periods. A lower case symbol will represent a high-frequency observation on a variable of interest; and an upper-case symbol will denote the aggregated series. So,

           Y_t = y_t + y_{t-1} + ... + y_{t-m+1} .

If we're aggregating quarterly (flow) data to annual data, then m = 4. In the case of aggregation from monthly to quarterly data, m = 3, and so on.
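As a minimal sketch of this kind of flow aggregation, the snippet below sums consecutive, non-overlapping blocks of m observations, which amounts to evaluating the sum above at every m-th high-frequency date. The function name `aggregate_flow` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def aggregate_flow(y, m):
    """Sum consecutive, non-overlapping blocks of m high-frequency observations."""
    y = np.asarray(y, dtype=float)
    n_blocks = len(y) // m            # any incomplete final block is dropped
    return y[: n_blocks * m].reshape(n_blocks, m).sum(axis=1)

quarterly = np.arange(1.0, 13.0)      # 12 made-up quarterly flows: 1, 2, ..., 12
annual = aggregate_flow(quarterly, m=4)
print(annual)                         # [10. 26. 42.]
```

With m = 3 the same function would turn monthly flows into quarterly ones.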

We know, from my earlier post, that if y_t is integrated of order one (i.e., I(1)), then so is Y_t.

Suppose that we also have a second temporally aggregated series:

           X_t = x_t + x_{t-1} + ... + x_{t-m+1} .

Again, if x_t is I(1) then X_t is also I(1). There is the possibility that x_t and y_t are cointegrated. If they are, is the same true for the aggregated series, X_t and Y_t?

Granger (1990) showed that the answer to this question is "Yes". Let's see why.

If the two I(1) time-series, x_t and y_t, are cointegrated then there exists a unique α such that:

           z_t = (y_t - α x_t) ~ I(0) .

The vector (1 , -α) is the "cointegrating vector". It's unique in the case where we are dealing with just two variables.

Now, let's define a new (aggregated) variable at time τ (say):

         W_τ = (Y_τ - α X_τ) = [Σ y_{τ-j} - α Σ x_{τ-j}] ,

where the summations are each for j = 0 to (m - 1).

Pairing up the terms with the same date, we get:

         W_τ = (y_τ - α x_τ) + (y_{τ-1} - α x_{τ-1}) + ... + (y_{τ-m+1} - α x_{τ-m+1})

               = (z_τ + z_{τ-1} + ... + z_{τ-m+1}) .

Each of the "z" terms is I(0), so "W" is the sum of "m" I(0) terms, and a finite sum of I(0) series is itself I(0). This means that the aggregated series, X and Y, are cointegrated, with the same cointegrating vector, (1 , -α).
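The algebra here can be checked numerically. The sketch below simulates a cointegrated pair under assumed settings (a random walk for x, a stationary AR(1) equilibrium error with coefficient 0.5, α = 2, m = 4), aggregates both series, and confirms that Y - αX equals the aggregated equilibrium error. All of these parameter values, and the helper `block_sum`, are illustrative assumptions, not anything taken from Granger's paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, alpha = 400, 4, 2.0             # assumed sample size, aggregation period, and alpha

x = np.cumsum(rng.normal(size=n))     # I(1): a pure random walk
z = np.zeros(n)                       # I(0): stationary AR(1) equilibrium error
eps = rng.normal(size=n)
for t in range(1, n):
    z[t] = 0.5 * z[t - 1] + eps[t]
y = alpha * x + z                     # so that y - alpha * x = z

def block_sum(v, m):
    """Sum consecutive, non-overlapping blocks of m observations."""
    return v[: len(v) // m * m].reshape(-1, m).sum(axis=1)

X, Y, Z = block_sum(x, m), block_sum(y, m), block_sum(z, m)
W = Y - alpha * X                     # aggregated equilibrium error
identity_holds = np.allclose(W, Z)    # W is just the block sum of the I(0) errors
print(identity_holds)                 # True
```

This only verifies the identity; it is not a test for cointegration.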

If the high-frequency variables, x and y, are cointegrated then there must exist an error correction model (ECM) relating these two variables. The same will be true for the aggregated variables, X and Y. However, the lag structure associated with these two ECMs will be different.
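For concreteness, one common way of writing a single-equation ECM for the high-frequency data is sketched below. The adjustment coefficient γ, the short-run coefficients φ_i and θ_i, and the lag lengths p and q are notation assumed here for illustration; they do not appear in the papers cited in this post.

```latex
\Delta y_t = \gamma \,(y_{t-1} - \alpha x_{t-1})
           + \sum_{i=1}^{p} \phi_i \,\Delta y_{t-i}
           + \sum_{i=0}^{q} \theta_i \,\Delta x_{t-i}
           + \varepsilon_t , \qquad \gamma < 0 .
```

The ECM for the aggregated data has the same general form, written in terms of X and Y, but with its own lag lengths and short-run coefficients.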

There are two other results that are worth noting.

First, in a previous post I noted that when it comes to testing for unit roots, what matters is the temporal span of the sample we are using - not the number of observations in the sample. The same is true when it comes to testing for cointegration using the Engle-Granger two-step procedure, as was shown by Pierse and Snell (1995).
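To make the two-step procedure concrete, here is a rough NumPy sketch that applies it to a simulated cointegrated pair at the original frequency and again after aggregation, so that both samples cover the same span but have different numbers of observations. The data-generating process, the parameter values, and the function names are assumptions made for illustration, and the second step uses a plain Dickey-Fuller regression with no augmentation lags. The resulting statistic has to be compared with Engle-Granger (MacKinnon) critical values, not with Student-t or ordinary Dickey-Fuller tables.

```python
import numpy as np

def block_sum(v, m):
    """Sum consecutive, non-overlapping blocks of m observations."""
    return v[: len(v) // m * m].reshape(-1, m).sum(axis=1)

def engle_granger_stat(y, x):
    """Return (slope estimate, Dickey-Fuller t-ratio on the step-1 residuals)."""
    # Step 1: OLS of y on a constant and x.
    regressors = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(regressors, y, rcond=None)
    u = y - regressors @ beta
    # Step 2: regress the change in u on lagged u (no constant, no extra lags).
    du, u_lag = np.diff(u), u[:-1]
    rho = (u_lag @ du) / (u_lag @ u_lag)
    e = du - rho * u_lag
    s2 = (e @ e) / (len(du) - 1)
    return beta[1], rho / np.sqrt(s2 / (u_lag @ u_lag))

# Assumed design: random-walk x, AR(1) equilibrium error, alpha = 2, m = 4.
rng = np.random.default_rng(1)
n, m, alpha = 1200, 4, 2.0
x = np.cumsum(rng.normal(size=n))
z = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    z[t] = 0.5 * z[t - 1] + eps[t]
y = alpha * x + z

alpha_hf, tau_hf = engle_granger_stat(y, x)
alpha_lf, tau_lf = engle_granger_stat(block_sum(y, m), block_sum(x, m))
print(f"high frequency: n={n}, alpha_hat={alpha_hf:.3f}, tau={tau_hf:.2f}")
print(f"aggregated:     n={n // m}, alpha_hat={alpha_lf:.3f}, tau={tau_lf:.2f}")
```

A single draw like this says nothing about power. To study that, one would wrap the simulation in a Monte Carlo loop and compare rejection rates across sample spans and sampling frequencies.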

Second, suppose that we have more than two series - say, k series - that are I(1). Then the number of cointegrating vectors can be anything from zero to (k - 1). In this case we'd usually be turning to Johansen's methodology to test for the presence of cointegration, and to detect the number and nature of any cointegrating vectors. For this situation, Marcellino (1999) has shown that the number and composition of the cointegrating vectors are invariant to temporal aggregation. Moreover, the loadings of the aggregated and disaggregated error-correction terms are the same.

The bottom line: temporally aggregating high-frequency time series data doesn't affect the existence or nature of any cointegration.


Granger, C. W. J., 1990. Aggregation of time-series variables: A survey. In T. Barker and M. H. Pesaran (eds.), Disaggregation in Econometric Modelling. Routledge, London. (Discussion Paper version.)

Marcellino, M., 1999. Some consequences of temporal aggregation in empirical analysis. Journal of Business and Economic Statistics, 17, 129-136.

Pierse, R. G. and J. Snell, 1995. Temporal aggregation and the power of tests for a unit root. Journal of Econometrics, 65, 333-345.

© 2014, David E. Giles


  1. Hi Prof. Dave, my name is Isaac. I'm working with time series data covering 17 years. I intend to use the data for time series analysis (including unit root tests, Johansen/Bounds tests for cointegration, ECM, impulse response functions, etc.). However, Eviews8 keeps giving me the error message "near singular matrix" whenever I try the Johansen test for cointegration. All the cointegrating estimators (FMOLS and DOLS) are also giving me similar messages. Please, what do you think is going wrong? Is the sample size not large enough for such an analysis?

    1. You don't say how many variables (& how many lags) you are using in the VAR. Obviously too many for this number of observations.

  2. Thanks Dave. I'm using 8 variables. And given sample size, I'm fixing the lag length at 1 when specifying the VAR model.

    1. Can't be done - you have a singular covariance matrix. You need more data or fewer equations.