Saturday, July 6, 2019

Seasonal Unit Roots - Background Information

A recent email query about the language that we use in the context of non-stationary seasonal data, and how we should respond to the presence of "seasonal unit roots", suggested to me that a short background post about some of this might be in order.

To get the most from what follows, I suggest that you take a quick look at this earlier post of mine - especially to make sure that you understand the distinction between "deterministic seasonality" and "stochastic seasonality" in time-series data.

There's an extensive econometrics literature on stochastic seasonality and testing for seasonal unit roots, and this dates back at least to 1990. This is hardly a new topic, but it's one that's often overlooked in empirical applications.

Although several tests for seasonal unit roots are available, the most commonly used one is that proposed by Hylleberg et al. (1990) - hereafter "HEGY". Depending on what statistical/econometrics package you prefer to use, you'll have at least some access to the HEGY test(s), and perhaps some others. For instance, there are routines that you can use with R, Stata, and Gretl.

The EViews package includes a rather complete built-in suite of different seasonal unit root tests for time series data with various periodicities - 2, 4, 5, 6, 7, and 12. This enables us to deal with trading-day weekly data, and calendar weekly data, as well as the usual "seasonal" frequencies. 

I'm not going to be going over the tests themselves here.

Rather, the objectives of this post are, first, to provide a bit of background information about the language that's used when we're talking about seasonal unit roots. For instance, why do we refer to roots at the zero, π, etc., frequencies? Second, in what way(s) do we need to filter a time series in order to remove the unit roots at the various frequencies?

Let's begin by considering a quarterly time series, X_t (t = 1, 2, ...). We'll use the symbol "L" to denote the lag operator. So, L(X_t) = X_{t-1}; L^2(X_t) = L(L(X_t)) = L(X_{t-1}) = X_{t-2}; etc. In general, L^k(X_t) = X_{t-k}.

Unit roots at different frequencies

If we consider the difference between the value of the series, X, now and its value four quarters (one year) ago, we can represent this by (X_t - X_{t-4}) = (X_t - L^4 X_t) = (1 - L^4)X_t. Let's take a closer look at the polynomial equation, (1 - L^4) = 0, in the lag operator, and ask: what are its roots?

We can factorize (1 - L^4) as follows:

           (1 - L^4) = (1 - L^2)(1 + L^2)
                     = (1 - L)(1 + L)(1 + L^2)
                     = (1 - L)(1 + L)(1 + iL)(1 - iL) = 0,                  (1)

and then we see that the roots of (1) are L = 1; L = -1; L = i; and L = -i. Here, "i" is the imaginary unit, whose square is -1.
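The factorization and its roots can be checked numerically. The following is only a minimal sketch, assuming NumPy is installed; the variable names are illustrative and are not part of the original discussion. Polynomials are stored as coefficient lists in ascending powers of L.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Coefficients are in ascending powers of L: [c0, c1, c2] means c0 + c1*L + c2*L^2.
one_minus_L = [1, -1]        # (1 - L)
one_plus_L = [1, 1]          # (1 + L)
one_plus_L2 = [1, 0, 1]      # (1 + L^2)

# (1 + iL)(1 - iL) should collapse to (1 + L^2).
complex_pair = P.polymul([1, 1j], [1, -1j])

# Multiply the three real factors together; the result should be (1 - L^4).
product = P.polymul(P.polymul(one_minus_L, one_plus_L), one_plus_L2)

# Numerical roots of 1 - L^4 = 0.
roots = P.polyroots([1, 0, 0, 0, -1])

print("(1 + iL)(1 - iL)        ->", complex_pair)
print("(1 - L)(1 + L)(1 + L^2) ->", product)
print("roots of 1 - L^4        ->", np.round(roots, 6))
```

The roots are computed numerically, so they match 1, -1, i and -i only up to floating-point error.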

In fact, each of these roots can be written as a complex number of the form (x + iy). For instance, the root L = 1 corresponds to the case x = 1, y = 0. You may also recall that instead of expressing a complex number in terms of Cartesian coordinates, we can write it in terms of polar coordinates. That is, we can write it in the form r(cosθ + i sinθ), where "r" is what we call the "radial coordinate", and θ is the "angular coordinate". Each of our four roots has modulus one (hence the term "unit root"), so r = 1 in what follows.

Alright, so where does this leave us? Well, we're going to have to recall some of that trigonometry that you learned in high school!

Let's look at the graphs for the sine and cosine functions, with the argument (x) measured in radians:

These two functions "repeat themselves" every 2π radians (i.e., 360 degrees). This corresponds, of course, to going exactly once around a circle.

So, the root of (1) corresponding to L = 1 can be written as (1 + i0) = (cosθ + i sinθ), and from the sine and cosine graphs we can see that this implies that θ = 0 (or 0 +/- multiples of 2π, which correspond to the same point on the circle). A frequency of zero means that the associated component never completes a cycle: its period is infinite. This is the usual long-run (non-seasonal) unit root.

Similarly, the root of (1) corresponding to L = -1 can be written as (-1 + i0) = (cosθ + i sinθ), and from the sine and cosine graphs we can see that this implies that θ = π (or π +/- multiples of 2π). A frequency of θ radians per quarter corresponds to a period of 2π/θ quarters, so here the period is two quarters, and the series exhibits two cycles per year.

(We don't need to worry about the additional multiples of 2π, as this would take us "around the circle" more than once. So let's forget about this detail.)

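Finally, the roots of (1) corresponding to L = +/-i can be written as (0 +/- i) = (cosθ + i sinθ), and from the sine and cosine graphs we can see that this implies that θ = π/2 or 3π/2. At θ = π/2 the period is 2π/(π/2) = 4 quarters, so there is now one cycle per year in the data. (The angle 3π/2 is the same as -π/2, so it describes the same cycle.)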

To summarize, we can have roots of equation (1) that correspond to one or more of the zero, π, or π/2 or 3π/2 frequencies. Moreover, these last two frequencies really need to be thought of as a pair - after all, they're associated with a complex conjugate pair in the Cartesian coordinate system, whereas the other two roots are "real".
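As a quick check on the polar-coordinate language, here is a small sketch (assuming NumPy; the names are illustrative only) that recovers the radial and angular coordinates of the four roots of equation (1). The angles are mapped into the interval [0, 2π) so that the root -i is reported at 3π/2 rather than at -π/2.

```python
import numpy as np

# The four roots of equation (1), entered exactly.
roots = np.array([1, -1, 1j, -1j], dtype=complex)

modulus = np.abs(roots)                        # r in r(cos θ + i sin θ)
theta = np.mod(np.angle(roots), 2 * np.pi)     # θ mapped into [0, 2π)

for z, r, th in zip(roots, modulus, theta):
    print(f"L = {str(z):>8}: r = {r:.0f}, θ = {th / np.pi:.1f}π")
```

The roots are typed in exactly rather than computed numerically, because a tiny negative imaginary part on the root L = 1 would otherwise be mapped to an angle just below 2π instead of zero.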

So much for the language associated with seasonal unit roots.

Filtering the data

What filters are needed to eliminate the various roots, so as to render the X series stationary?

(i) If L = 1, then this corresponds to the transformation, or filter, (1 - L)X_t. In other words, if there is a unit root at the zero frequency then we need to construct Y_t = (1 - L)X_t = (X_t - X_{t-1}) to get a stationary series. The usual first-differencing of the data is appropriate.

(ii) If L = -1, then this corresponds to the filter (1 + L)X_t. That is, if there is a unit root at the π frequency then we need to construct Y_t = (1 + L)X_t = (X_t + X_{t-1}) to get a stationary series. Notice that this particular filter doesn't involve "differencing" the data.

(iii) If either L = i, or L = -i, then this corresponds to the filter (1 + L^2)X_t. So, if there is a unit root at the π/2 or 3π/2 pair of frequencies, then we need to construct Y_t = (1 + L^2)X_t = (X_t + X_{t-2}) to get a stationary series. Again, this filter doesn't involve the usual "differencing".
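The three filters can be illustrated with simulated data. This is a minimal sketch under simple assumptions that are mine rather than the post's: each series has a unit root at exactly one frequency (or conjugate pair) and is driven by Gaussian white noise, the seed and sample size are arbitrary, and `simulate` is a hypothetical helper. In each case the matching filter should return the white-noise innovations.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 200
e = rng.standard_normal(n)       # white-noise innovations

def simulate(sign, lag):
    """Build x[t] = sign * x[t-lag] + e[t], starting from x[t] = e[t] for t < lag."""
    x = e.copy()
    for t in range(lag, n):
        x[t] = sign * x[t - lag] + e[t]
    return x

x_zero = simulate(+1, 1)   # root L = 1    : (1 - L)X_t   = e_t
x_pi = simulate(-1, 1)     # root L = -1   : (1 + L)X_t   = e_t
x_half = simulate(-1, 2)   # roots L = +/-i: (1 + L^2)X_t = e_t

y_zero = x_zero[1:] - x_zero[:-1]   # X_t - X_{t-1}
y_pi = x_pi[1:] + x_pi[:-1]         # X_t + X_{t-1}
y_half = x_half[2:] + x_half[:-2]   # X_t + X_{t-2}

# Each filtered series should match the innovations (up to rounding error).
print(np.allclose(y_zero, e[1:]), np.allclose(y_pi, e[1:]), np.allclose(y_half, e[2:]))
```

In practice the innovations would usually be serially correlated rather than white noise; the point is only that the matching filter removes the unit root.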

Multiple roots

Of course, it's quite possible that our time series, X_t, has unit roots at more than one frequency. For example, it may have roots at both the zero and π frequencies. In that case, the filter that will make the series stationary is (1 - L)(1 + L) = (1 - L^2). So, we construct Y_t = (X_t - X_{t-2}). Similarly, if X has unit roots at all of the seasonal frequencies, but not at the zero frequency, then the appropriate filter is (1 + L)(1 + L^2) = (1 + L + L^2 + L^3), and the series Y_t = (X_t + X_{t-1} + X_{t-2} + X_{t-3}) will be stationary; and so on. If there are unit roots at all four frequencies, then the X series is said to be "seasonally integrated", and the relevant filter is (1 - L)(1 + L)(1 + L^2) = (1 - L^4), and so we "fourth-difference" X_t and form Y_t = (X_t - X_{t-4}).
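Building the composite filter amounts to multiplying lag polynomials, which can be sketched as follows. This assumes NumPy; `FACTORS`, `combined_filter` and `apply_filter` are hypothetical names introduced here for illustration, and the short series `x` is arbitrary.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Factor of the lag polynomial tied to each frequency (ascending powers of L).
FACTORS = {
    "zero": [1, -1],       # (1 - L)
    "pi": [1, 1],          # (1 + L)
    "pi/2": [1, 0, 1],     # (1 + L^2), covers the π/2 and 3π/2 pair
}

def combined_filter(frequencies):
    """Multiply together the factors for the frequencies that have unit roots."""
    coeffs = np.array([1.0])
    for name in frequencies:
        coeffs = P.polymul(coeffs, FACTORS[name])
    return coeffs

def apply_filter(x, coeffs):
    """Return y_t = sum_k coeffs[k] * x_{t-k}, dropping the first len(coeffs)-1 points."""
    return np.convolve(x, coeffs, mode="valid")

print(combined_filter(["zero", "pi"]))           # coefficients of (1 - L^2)
print(combined_filter(["pi", "pi/2"]))           # coefficients of (1 + L + L^2 + L^3)
print(combined_filter(["zero", "pi", "pi/2"]))   # coefficients of (1 - L^4)

x = np.arange(1.0, 9.0) ** 2                     # any illustrative series
y = apply_filter(x, combined_filter(["zero", "pi", "pi/2"]))   # equals x[4:] - x[:-4]
```

Using a convolution in "valid" mode keeps only those dates for which every lag required by the filter is actually observed.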

Some Extensions  

The above discussion is cast in terms of quarterly time-series data. If we have data that are recorded twice-yearly, you should be able to see from the factorization (1 - L^2) = (1 - L)(1 + L) that there can be unit roots only at either the zero or π frequencies. (See Feltham and Giles, 2003, for more on this.)

You might guess that the case of monthly data gets pretty messy! (See Beaulieu and Miron, 1993.) In this case the unit roots correspond to L = ± 1, ± i, ± (1 ± √3 i)/2, ± (√3 ± i)/2. The various frequencies are zero, π/6, π/3, π/2, 2π/3, 5π/6, and π.
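The monthly roots and frequencies can be tabulated numerically. This sketch assumes that the monthly analogue of the quarterly polynomial is (1 - L^12), whose roots are the twelve 12th roots of unity; NumPy is assumed, and the variable names are illustrative.

```python
import numpy as np

k = np.arange(12)
roots = np.exp(1j * np.pi * k / 6)        # the twelve 12th roots of unity

# Every root should satisfy 1 - L^12 = 0 (up to rounding error).
all_solve_equation = np.allclose(roots ** 12, 1)

# Closed forms for two of the complex roots.
root_pi_3 = (1 + np.sqrt(3) * 1j) / 2     # angle π/3
root_pi_6 = (np.sqrt(3) + 1j) / 2         # angle π/6

# Distinct frequencies in [0, π], in units of π/6; the other five roots are conjugates.
freqs_in_sixths_of_pi = np.round(np.angle(roots[:7]) / (np.pi / 6)).astype(int)

print(all_solve_equation)
print(freqs_in_sixths_of_pi)              # multiples of π/6
```

The seven multiples of π/6 between zero and π are the frequencies listed above; each frequency strictly between zero and π is shared by a complex conjugate pair of roots.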

The final matter that needs mentioning here is the possibility of cointegration when we have two or more seasonal time series. Suppose that X_{1t} and X_{2t} are quarterly series, and they each have unit roots at (say) the π frequency. Then it's possible that they may be cointegrated at this frequency. Similarly, if X_{1t} has unit roots at all frequencies, and X_{2t} has unit roots at the π and zero frequencies, then the two series may be cointegrated at the zero and/or π frequencies. And so on.

Engle et al. (1993) provide and illustrate a systematic testing framework for seasonal cointegration. It's essentially a generalization of the Engle-Granger two-step cointegration test, with the HEGY tests replacing the ADF test. For another application, see Reinhardt and Giles (2001). Lee (1992) extends the Johansen cointegration tests to the case of (quarterly) seasonal cointegration. A nice application of this procedure is given by Debenedictis (1997).


Beaulieu, J. J., & J. A. Miron, 1993. Seasonal unit roots in aggregate U.S. data. Journal of Econometrics, 55, 305-328.

Debenedictis, L. F., 1997. A vector autoregressive model of the British Columbia regional economy. Applied Economics, 29, 877-888.

Engle, R. F., C. W. J. Granger, S. Hylleberg, & H. S. Lee, 1993. Seasonal cointegration: The Japanese consumption function. Journal of Econometrics, 55, 275-298.

Feltham, S. G. & D. E. A. Giles, 2003. Testing for unit roots in semi-annual data. In D. E. A. Giles (ed.), Computer-Aided Econometrics. Marcel Dekker, New York, 175-208. (Pre-print here.)

Ghysels, E., H. S. Lee, & J. Noh, 1994. Testing for unit roots in seasonal time series: Some theoretical extensions and a Monte Carlo investigation. Journal of Econometrics, 62, 415-442.

Hylleberg, S., R. F. Engle, C. W. J. Granger, & B. S. Yoo, 1990. Seasonal integration and cointegration. Journal of Econometrics, 44, 215-238.

Lee, H. S., 1992. Maximum likelihood inference on cointegration and seasonal cointegration. Journal of Econometrics, 54, 1-47.

Reinhardt, F. S. & D. E. A. Giles, 2001. Are cigarette bans really good economic policy? Applied Economics, 33, 1365-1368.

© 2019, David E. Giles


  1. Wonderful post. Although I have read several time series books, I have not seen the conversion to polar coordinates so clearly explained. Would you consider explaining the ARMA frequency test in EViews, relating it to comments? Thanks

  2. Can we make use of this test in (S)ARIMA modelling? Meaning, instead of just looking at the ACF and PACF of the data to determine whether there are any seasonal patterns and then directly 'seasonal' differencing the data 'the Box-Jenkins way', can we use this test and then filter the data according to its results?

    1. Hi - as I mentioned in the first paragraph, take a look at this previous post which covers the distinction between deterministic and stochastic seasonality. The BJ approach you mention deals only with deterministic seasonality, not stochastic. Indeed, BJ/ARIMA modelling presumes that the series is stationary - and that's what the HEGY testing is checking.

    2. Despite your very clear explanation I still don't think that I get the point.

      For example, Terence Mills in his 'Applied Time Series Analysis' states: "As in the modeling of stochastic trends, ARIMA processes have been found
      to do an excellent job in modeling stochastic seasonality..."
      I understand that deterministic seasonality can be accounted for by 'seasonal dummies'. I also understand that one type of models with stochastic seasonality is the seasonal autoregressive model, which can either be stationary or nonstationary due to the presence of a 'seasonal' unit root.
      According to the BJ approach we 'seasonally difference' and/or difference the data depending on whether the ACF/PACF exhibit specific patterns.

      My question is: will it be of any benefit to test for the presence of seasonal unit roots and to filter the data the way you explained above instead of directly seasonal differencing the data in ARIMA modelling?

    3. Hi - Thanks for the follow-up, and for clarifying. The "seasonal differencing" in traditional (S)ARIMA modelling will be fine if the series is (fully) "seasonally integrated". That is, if there is a unit root at the zero frequency and all of the seasonal frequencies. For instance, in the case of quarterly data, that's when we need to fourth-difference to get stationarity. On the other hand, if there are unit roots at only some of the frequencies, the usual differencing associated with SARIMA modelling won't be appropriate. It won't result in a stationary series for the subsequent ARIMA analysis. So, the answer to your question is "yes - definitely". I hope this helps.