## Tuesday, April 7, 2015

Recently, I received an email from Ozan, who wrote:
"I’ve a simple but not explicitly answered question within the text books on stationary series. I’m estimating a model with separate single equations (I don’t take into account the interactions among them). I’ve only non-stationary series in some equations (type 1), only stationary in some (type 2), and a combination of the both in the others (type 3). For the first two cases I apply the usual procedures and for the last case the Pesaran (2011) test. I want to find the short term effects of some variables on the others. I’ve two questions:
1) If the Pesaran test turns out inconclusive or rejects cointegration, what’s the next step? Differencing all the series and applying an OLS? Or differencing only the non-stationary ones? Or another method?
2) As I mentioned I’m looking for the short-run effects. In the type 2 equations, I guess running an OLS in levels gives the long-run effects. Therefore I run an OLS in differences. Some claim that differencing an already stationary series causes problems. I’m confused. What do you think?"
Let's start by clarifying what Ozan means by "the usual procedures" for his "Type 1" and "Type 2" equations.

I'm presuming he means:

Type 1: All of the series are I(1). Then:

(i) If the variables are not cointegrated, estimate a model using the first-differences of the data (or, perhaps, the log-differences of the data), using OLS or IV.

(ii) If the variables are cointegrated:

(a) Estimate an error-correction model to determine the short-run effects.

(b) Estimate a model in the levels (or, perhaps, log-levels) of the variables to determine the long-run cointegrating relationship between them.
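
For the two-variable case, steps (a) and (b) are often written along the following lines. This is only a sketch of one common single-equation (Engle-Granger style) form. The symbols μ, θ, δ and λ are introduced here purely for illustration, and in practice lagged differences of y and x would usually be added to the second equation:

$$
\begin{aligned}
y_t &= \mu + \theta x_t + v_t && \text{(long-run relationship, in levels)} \\
\Delta y_t &= \alpha + \delta\, \Delta x_t + \lambda \left( y_{t-1} - \mu - \theta x_{t-1} \right) + e_t && \text{(error-correction model)}
\end{aligned}
$$

Here δ captures the short-run effect of x on y, θ is the long-run effect, and λ (expected to be negative) measures how quickly departures from the long-run relationship are corrected.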

Type 2: All of the series are I(0). Then you can:

(i) Model the variables in the levels (or, perhaps, the log-levels) of the data, using OLS or IV estimation.

(ii) Estimate the model using the first-differences (or, perhaps, the log-differences) of the variables. The transformed variables won't be I(0) (strictly speaking, they will be I(-1)), but they will still be stationary. There is nothing wrong with this. However, one possible downside is that you may have "over-differenced" the data, and this may show up in the form of an error term that follows an MA process, rather than being serially independent. On this point, see the discussion below.
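
To see what over-differencing does, take the simplest case of a white-noise series, εt. Its first difference, εt - εt-1, is an MA(1) process with a coefficient of -1, and its first-order autocorrelation is -1 / (1 + 1) = -0.5. The following is a minimal simulation sketch of that point, using only the Python standard library. The sample size, the seed, and the helper name `lag1_autocorr` are arbitrary choices made for this illustration.

```python
import random


def lag1_autocorr(series):
    """Sample first-order autocorrelation of a list of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))
    den = sum(d * d for d in dev)
    return num / den


random.seed(12345)  # arbitrary seed, only so the run is reproducible
n = 20000

e = [random.gauss(0.0, 1.0) for _ in range(n)]   # an I(0) white-noise series
de = [e[t] - e[t - 1] for t in range(1, n)]      # its (unnecessary) first difference

r_levels = lag1_autocorr(e)
r_diff = lag1_autocorr(de)

print(f"lag-1 autocorrelation, levels:      {r_levels:.3f}")
print(f"lag-1 autocorrelation, differences: {r_diff:.3f}")
```

The levels series should show essentially no autocorrelation, while the differenced series should show a first-order autocorrelation close to -0.5. The differenced data are still stationary, but the MA structure has been induced by the differencing.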

Now, what about the "Type 3" equations?

In this case, Ozan uses the ARDL/Bounds testing methodology, which I've discussed in some detail here, and in earlier posts. Now, in response to his two questions:

(1) In this case, you could apply either of the two approaches that you mention. However, I'd lean towards differencing all of the variables. The reason is that if the tests that you've used for stationarity / non-stationarity have led you to a wrong conclusion, differencing everything is a conservative, but relatively safe, way to proceed. You don't want to unwittingly fail to difference a variable that is I(1). The "costs" of doing so are substantial: in particular, you risk ending up with a "spurious regression". On the other hand, unnecessarily differencing a variable that is actually I(0) incurs a relatively low "cost". (See the comments for Type 2 (ii), above.)
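
The asymmetry between these two "costs" can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed settings (500 replications, 100 observations, Gaussian innovations, an arbitrary seed). It generates pairs of independent random walks, so there is no true relationship between them, and counts how often a conventional t-test on the slope rejects at the nominal 5% level when the regression is run in levels and in first differences. The helper names are hypothetical.

```python
import random


def slope_t_stat(y, x):
    """t-statistic on the slope in an OLS regression of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (ssr / (n - 2) / sxx) ** 0.5
    return slope / se


def random_walk(n, rng):
    """A driftless random walk with standard normal innovations."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First difference of a series."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


rng = random.Random(2015)   # arbitrary seed for reproducibility
reps, n = 500, 100          # assumed experiment size
reject_levels = 0
reject_diffs = 0

for _ in range(reps):
    y, x = random_walk(n, rng), random_walk(n, rng)   # independent I(1) series
    reject_levels += abs(slope_t_stat(y, x)) > 1.96
    reject_diffs += abs(slope_t_stat(diff(y), diff(x))) > 1.96

rate_levels = reject_levels / reps
rate_diffs = reject_diffs / reps
print(f"rejection rate, levels:      {rate_levels:.2f}")
print(f"rejection rate, differences: {rate_diffs:.2f}")
```

In the levels regressions the test should reject far more often than 5% of the time, even though the series are unrelated. In the differenced regressions the rejection rate should be close to the nominal level.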

(2) See the discussion for Type 2 (ii) above. However, to get at the short-run effects (and avoid the over-differencing issue), I'd be more inclined to explore some simple dynamic models of the old-fashioned ARDL type - not the Pesaran type. (See here.) That is, consider models of the form:

$$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_k x_{t-k} + \gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \cdots + \gamma_p y_{t-p} + \varepsilon_t .$$

I'd start with a fairly general specification (with large values for k and p), and then simplify the model using AIC or SIC, to get a parsimonious dynamic model.
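
As a rough sketch of this kind of search, the code below fits every ARDL(p, k) model up to a maximum lag of 2 by OLS, over a common estimation sample, and picks the one with the smallest AIC. Everything in it is an assumption made for illustration: the data are simulated from a model with one lag of y and no lags of x (true coefficients 1.0, 0.5 and 0.6), the function names are hypothetical, and OLS is done by solving the normal equations in pure Python so that the example is self-contained. In practice you would use your usual regression software.

```python
import math
import random


def ols(X, y):
    """OLS coefficients from the normal equations, via Gauss-Jordan elimination."""
    m = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(m)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(m)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for r in range(m):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][m] / A[i][i] for i in range(m)]


def fit_ardl(y, x, p, k, start):
    """Fit ARDL(p, k) on t = start..n-1; return (AIC, coefficients).

    Coefficient order: constant, x_t, ..., x_{t-k}, y_{t-1}, ..., y_{t-p}.
    """
    X = [[1.0] + [x[t - j] for j in range(k + 1)]
               + [y[t - i] for i in range(1, p + 1)]
         for t in range(start, len(y))]
    target = y[start:]
    b = ols(X, target)
    ssr = sum((yt - sum(c * v for c, v in zip(b, row))) ** 2
              for row, yt in zip(X, target))
    n = len(target)
    # Gaussian AIC, up to an additive constant common to all models
    return n * math.log(ssr / n) + 2 * len(b), b


# Simulated data (illustrative only): y_t = 1 + 0.5 x_t + 0.6 y_{t-1} + e_t
rng = random.Random(7)
n = 3000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(1.0 + 0.5 * x[t] + 0.6 * y[t - 1] + rng.gauss(0.0, 1.0))

max_lag = 2
results = {(p, k): fit_ardl(y, x, p, k, start=max_lag)
           for p in range(max_lag + 1) for k in range(max_lag + 1)}
best_p, best_k = min(results, key=lambda pk: results[pk][0])
best_aic, best_coefs = results[(best_p, best_k)]
print(f"selected ARDL({best_p}, {best_k}); coefficient on x_t = {best_coefs[1]:.3f}")
```

Holding the estimation sample fixed across all of the candidate models (here, by starting each one at t = max_lag) keeps the AIC values comparable. Replacing the penalty `2 * len(b)` with `math.log(n) * len(b)` would give the SIC instead.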

Then, for instance, if I were to end up with a model of the form:

$$y_t = \alpha + \beta_0 x_t + \gamma_1 y_{t-1} + u_t ,$$

the short-run (impact) marginal effect of x on y would be β₀, while the long-run effect would be given by [β₀ / (1 - γ₁)], provided that |γ₁| < 1, etc.
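
The long-run expression comes from setting y and x to constant steady-state values, y* and x*, in the simple model: y* = α + β0x* + γ1y*, so y* = (α + β0x*) / (1 - γ1). With more lags, the same argument gives (the sum of the β's) / (1 - the sum of the γ's). The sketch below computes this and checks it against the cumulative response of y to a permanent one-unit increase in x. The coefficient values 0.4 and 0.5 are made up for illustration, not estimates, and the function names are hypothetical. The formula presumes a dynamically stable model, which the code does not verify beyond guarding against division by zero.

```python
def long_run_effect(betas, gammas):
    """Long-run effect of x on y: sum(betas) / (1 - sum(gammas))."""
    denom = 1.0 - sum(gammas)
    if abs(denom) < 1e-12:
        raise ValueError("gammas sum to 1: no finite long-run effect")
    return sum(betas) / denom


def cumulative_response(beta0, gamma1, horizon):
    """Change in y, `horizon` periods after a permanent unit rise in x,
    for the model y_t = alpha + beta0 * x_t + gamma1 * y_{t-1} + u_t."""
    effect = 0.0
    for _ in range(horizon + 1):
        effect = beta0 + gamma1 * effect
    return effect


beta0, gamma1 = 0.4, 0.5   # illustrative values only

short_run = cumulative_response(beta0, gamma1, horizon=0)   # equals beta0
long_run = long_run_effect([beta0], [gamma1])
after_60 = cumulative_response(beta0, gamma1, horizon=60)

print(f"short-run effect:          {short_run:.4f}")
print(f"long-run effect (formula): {long_run:.4f}")
print(f"response after 60 periods: {after_60:.4f}")
```

The response after many periods should agree with the formula, which is just the sum of the geometric series β0(1 + γ1 + γ1² + ...).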