Econometrics Beat: Dave Giles' Blog: 10/01/2015

Friday, October 16, 2015

New Forecasting Blog

Allan Gregory, at Queen's University (Canada), has just started a new blog that concentrates on economic forecasting. You can find it here.

In introducing his new blog, Allan says:

"The goal is to discuss, compare and even evaluate alternative methods and tools for forecasting economic activity in Canada. I hope others involved in the business of forecasting will share their work, opinions and so on in this forum. Hopefully, we can understand the interaction of forecasting theory and practical forecasting."

This is a blog that you should follow. I'm looking forward to Allan's upcoming posts.

Tuesday, October 13, 2015

Angus Deaton, Consumer Demand, & the Nobel Prize

I was delighted by yesterday's announcement that Angus Deaton has been awarded the Nobel Prize in Economic Science this year. His contributions have been many, fundamental, and varied, and I certainly won't attempt to summarize them here. Suffice to say that the official citation says that the award is "for his contributions to consumption, poverty, and welfare".

In this earlier post I made brief mention of Deaton's path-breaking work, with John Muellbauer, that gave us the so-called "Almost Ideal Demand System".

The AIDS model took empirical consumer demand analysis to a new level. It facilitated more sophisticated, and less restrictive, econometric analysis of consumer demand behaviour than had been possible with earlier models. The latter included the fundamentally important Linear Expenditure System (Stone, 1954), and the Rotterdam Model (Barten, 1964; Theil, 1965).
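For reference, the budget-share equations of the AIDS model are usually written as below. This is the standard textbook statement of the Deaton and Muellbauer system, given here as a reminder rather than as the specification used in any particular empirical exercise. The symbols are: w_i is the budget share of good i, p_j is the price of good j, x is total expenditure, and P is a price index.

```latex
w_i = \alpha_i + \sum_{j=1}^{n} \gamma_{ij} \log p_j
      + \beta_i \log\!\left(\frac{x}{P}\right), \qquad i = 1, \dots, n,

\log P = \alpha_0 + \sum_{k=1}^{n} \alpha_k \log p_k
         + \frac{1}{2} \sum_{k=1}^{n} \sum_{j=1}^{n} \gamma_{kj} \log p_k \log p_j .
```

The restrictions of demand theory are linear in the parameters: adding-up (the α_i sum to one, and the β_i and the γ_ij sum to zero over i), homogeneity (the γ_ij sum to zero over j), and symmetry (γ_ij = γ_ji). This is a large part of what makes the model convenient to estimate and to test.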

I thought that readers may be interested in an empirical exercise with the AIDS model. Let's take a look at it.

Lies, Damned Lies, & Cointegration

My thanks to a colleague for bringing to my attention a recent discussion paper with the provocative title, "Why Most Published Results on Unit Root and Cointegration are False".

As you can imagine, I couldn't resist it!

After a quick read (and a couple of deep breaths), my first reaction was to break one of my self-imposed blogging rules, and pull the paper apart at the seams.

The trouble is, the paper is so outrageous in so many ways, that I just wasn't up to it. Instead, I'm going to assign it to students in my Time-Series Econometrics course to critique. They have more patience than I do!

The authors make sweeping claims that certain theoretical results are undermined by one poorly implemented piece of (their own) empiricism.

They provide no serious evidence that I could find to support the bold claim made in the title of their paper.

We are left with a concluding section containing remarks such as:

"In summary, three analogies between cointegration analysis and a sandcastle may be appropriate. First, a sandcastle may be built on sand, so it falls down because the foundation is not solid. Second, a sandcastle may be badly built. Third, a sandcastle built on seashore with a bad design may stay up but will not withstand the ebb and flow of the tides. The cointegration analysis, like a sandcastle, collapses on all three counts. In several planned research publications, we will report the criticism of research outcomes (results) and the methods employed to obtain such results. Below we provide one example why a research finding using the methodology of cointegration analysis to be false." (pp.11-12)

and:

"In the name of science, cointegration analysis has become a tool to justify falsehood -- something that few people believe to be true but is false. We recommend that except for a pedagogical review of a policy failure of historical magnitude, the method of cointegration analysis not be used in any public policy analysis." (p.14)

The most positive thing I can say is: I can't wait for the promised follow-up papers!

© 2015, David E. Giles

Sunday, October 4, 2015

Cointegration & Granger Causality

Today, I had a query from a reader of this blog regarding cointegration and Granger causality.

Essentially, the email said:

"I tested two economic time-series and found them to be cointegrated. However, when I then tested for Granger causality, there wasn't any. Am I doing something wrong?"

First of all, the facts:

If two time series, X and Y, are cointegrated, there must exist Granger causality either from X to Y, or from Y to X, or in both directions.
The presence of Granger causality in either or both directions between X and Y does not necessarily imply that the series will be cointegrated.
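The first of these facts can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not a proper testing procedure: the data-generating process (a random walk X, and Y equal to X plus white noise), the sample size, the seed, and the helper name `ecm_adjustment` are all hypothetical choices, and the cointegrating vector (1, -1) is treated as known rather than estimated. The idea is that if X and Y are cointegrated, the lagged equilibrium error must help to predict the change in at least one of the two series.

```python
import numpy as np

rng = np.random.default_rng(2015)   # arbitrary seed, for reproducibility only
T = 500                             # hypothetical sample size

x = np.cumsum(rng.standard_normal(T))   # X is an I(1) random walk
y = x + rng.standard_normal(T)          # Y is cointegrated with X
z = y - x                               # equilibrium error, I(0) by construction


def ecm_adjustment(d_series, z_lag):
    """OLS of a differenced series on a constant and the lagged
    equilibrium error. Returns the adjustment coefficient and its t-ratio."""
    X = np.column_stack([np.ones(len(z_lag)), z_lag])
    coef, *_ = np.linalg.lstsq(X, d_series, rcond=None)
    resid = d_series - X @ coef
    s2 = resid @ resid / (len(d_series) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1], coef[1] / se


# Does last period's disequilibrium help to predict this period's change?
alpha_y, t_y = ecm_adjustment(np.diff(y), z[:-1])
alpha_x, t_x = ecm_adjustment(np.diff(x), z[:-1])

print(f"Delta Y equation: alpha = {alpha_y:.3f}, t = {t_y:.2f}")
print(f"Delta X equation: alpha = {alpha_x:.3f}, t = {t_x:.2f}")
```

In this particular design Y adjusts to the disequilibrium and X does not, so the causality runs from X to Y only. With real data you would also need to allow for short-run dynamics, and to test using a properly specified model.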

Now, what about the question that was raised?

Truthfully, not enough information has been supplied for anyone to give a definitive answer.

What is the sample size? Even if applied properly, tests for Granger non-causality have only asymptotic validity (unless you bootstrap the test).
How confident are you that the series are both I(1), and that you should be testing for cointegration in the first place?
What is the frequency of the data, and have they been seasonally adjusted? This can affect the unit root tests, cointegration test, and Granger causality test.
How did you test for cointegration - the Engle-Granger 2-step approach, or via Johansen's methodology?
How did you test for Granger non-causality? Did you use a modified Wald test, as in the Toda-Yamamoto approach?
Are there any structural breaks in either of the time-series? These will likely affect any or all of the tests that you have performed.
Are you sure that you correctly specified the VAR model used for the causality testing, and the VAR model on which Johansen's tests are based (if you used his methodology to test for cointegration)?

The answers to some or all of these questions will contain the key to why you obtained an apparently illogical result.

Theoretical results in econometrics rely on assumptions/conditions that have to be satisfied. If they're not, then don't be surprised by the empirical results that you obtain.

Friday, October 2, 2015

Illustrating Spurious Regressions

I've talked a bit about spurious regressions in some earlier posts (here and here). I was updating an example for my time-series course the other day, and I thought that some readers might find it useful.

Let's begin by reviewing what is usually meant when we talk about a "spurious regression".

In short, it arises when we have several non-stationary time-series variables, which are not cointegrated, and we regress one of these variables on the others.

In general, the results that we get are nonsensical, and the problem only gets worse as we increase the sample size. This phenomenon was observed by Granger and Newbold (1974), and others. Phillips (1986) developed the asymptotic theory, and used it to prove that in a spurious regression the Durbin-Watson statistic converges in probability to zero; the OLS parameter estimators and R² converge to non-standard limiting distributions; and the t-ratios and F-statistic diverge in distribution, as T ↑ ∞.

Let's look at some of these results associated with spurious regressions. We'll do so by means of a simple simulation experiment.
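One way to set up such an experiment is sketched below. It is a minimal illustration under stated assumptions: the two series are independent, driftless Gaussian random walks, and the sample sizes (50, 200, 800), the 500 replications, the seed, and the function names are all hypothetical choices made for the sketch. For each sample size it regresses one random walk on the other (with an intercept), and records the slope's t-ratio, the R², and the Durbin-Watson statistic.

```python
import numpy as np


def spurious_regression(T, rng):
    """Regress one random walk on another, independent, random walk."""
    y = np.cumsum(rng.standard_normal(T))
    x = np.cumsum(rng.standard_normal(T))
    X = np.column_stack([np.ones(T), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    cov = (ssr / (T - 2)) * np.linalg.inv(X.T @ X)
    t_slope = beta[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - ssr / np.sum((y - y.mean()) ** 2)
    dw = np.sum(np.diff(resid) ** 2) / ssr       # Durbin-Watson statistic
    return t_slope, r2, dw


def summarize(T, reps=500, seed=12345):
    """Monte Carlo averages over `reps` independent spurious regressions."""
    rng = np.random.default_rng(seed)
    out = np.array([spurious_regression(T, rng) for _ in range(reps)])
    return {
        "reject_rate": np.mean(np.abs(out[:, 0]) > 1.96),  # nominal 5% t-test
        "mean_abs_t": np.mean(np.abs(out[:, 0])),
        "mean_r2": np.mean(out[:, 1]),
        "mean_dw": np.mean(out[:, 2]),
    }


results = {T: summarize(T) for T in (50, 200, 800)}
for T, s in results.items():
    print(f"T={T:4d}  reject={s['reject_rate']:.2f}  mean|t|={s['mean_abs_t']:.2f}  "
          f"mean R2={s['mean_r2']:.2f}  mean DW={s['mean_dw']:.3f}")
```

If the theory is right, then as T grows the "significant" t-ratios should become more frequent and larger, the Durbin-Watson statistic should shrink towards zero, and the R² should fail to settle down on zero even though the two series are unrelated.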

What NOT To Do When Data Are Missing

Here's something that's very tempting, but it's not a good idea.

Suppose that we want to estimate a regression model by OLS. We have a full sample of size n for the regressors, but one of the values for our dependent variable, y, isn't available. Rather than estimate the model using just the (n - 1) available data-points, you might think that it would be preferable to use all of the available data, and impute the missing value for y.

Fair enough, but what imputation method are you going to use?

For simplicity, and without any loss of generality, suppose that the model has a single regressor,

y_i = β x_i + ε_i , (1)

and it's the nth value of y that's missing. We have values for x₁, x₂, ...., x_n; and for y₁, y₂, ...., y_{n-1}.

Here's a great idea! OLS will give us the Best Linear Predictor of y, so why don't we just estimate (1) by OLS, using the available (n - 1) sample values for x and y; use this model (and x_n) to get a predicted value (y*_n) for y_n; and then re-estimate the model with all n data-points: x₁, x₂, ...., x_n; y₁, y₂, ...., y_{n-1}, y*_n?

Unfortunately, this is actually a waste of time. Let's see why.
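A quick numerical check shows what happens. The sketch below is only an illustration under stated assumptions: the data are simulated, the values β = 2 and n = 25 are hypothetical, and the helper name `ols_no_intercept` is made up for the example. It fits (1) to the first (n - 1) observations, imputes y*_n = b x_n, and then re-fits the model using all n data-points.

```python
import numpy as np

rng = np.random.default_rng(42)     # arbitrary seed, for reproducibility only
n = 25                              # hypothetical sample size
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)    # hypothetical beta = 2; y[-1] plays the "missing" value


def ols_no_intercept(x, y):
    """OLS for y_i = beta * x_i + e_i. Returns the slope and the SSR."""
    b = (x @ y) / (x @ x)
    resid = y - b * x
    return b, resid @ resid


# Step 1: estimate the model with the (n - 1) complete observations.
b_avail, ssr_avail = ols_no_intercept(x[:-1], y[:-1])

# Step 2: impute the missing y_n with the OLS prediction, and re-estimate.
y_imputed = np.append(y[:-1], b_avail * x[-1])
b_full, ssr_full = ols_no_intercept(x, y_imputed)

# Estimated error variances: same SSR, but different degrees of freedom.
s2_avail = ssr_avail / (n - 2)      # (n - 1) observations, 1 parameter
s2_full = ssr_full / (n - 1)        # n observations, 1 parameter

print(f"slope, (n-1) obs: {b_avail:.6f}   slope, imputed sample: {b_full:.6f}")
print(f"s^2,   (n-1) obs: {s2_avail:.6f}   s^2,   imputed sample: {s2_full:.6f}")
```

The two slope estimates are identical. Writing b for the estimate from the (n - 1) observations, the imputed point adds b x_n² to the numerator of the OLS formula and x_n² to the denominator, so the ratio is still b. The imputed point also lies exactly on the fitted line, so it has a zero residual and the sum of squared residuals is unchanged. All that changes is the degrees of freedom, which makes the estimated error variance (and the standard error of b) look smaller than it should.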
