Sunday, May 5, 2013

The Frequent Regressor Club

My friend, Ken White, developed the SHAZAM econometrics package in 1977. Ken's a funny guy - which is to say, he has a great sense of humour.

On one of his many visits to Christchurch, New Zealand (when I was living there, many years ago) he gave me a wooden die that he'd had an artisan carve at the local Arts Centre in Christchurch. On each of the six faces he'd had the guy put the name of an econometrics/statistics package - TSP, LIMDEP, GAUSS, RATS, and SHAZAM. Yes, I know that's only 5 names. The thing was, SHAZAM appeared on two of the faces! The idea was to roll the die to decide which package to use in your lab. class. I still have the die - much to the occasional bemusement of students who see it on my desk.

Saturday, May 4, 2013

Granger Causality Testing Done Properly

I enjoy following David Stern's Stochastic Trend blog. David is Research Director at the Crawford School of Public Policy at the Australian National University. He's an energy and environmental economist who does some really interesting work - not my field at all, but I always enjoy reading what he has to say.

In his latest blog post, David links to a recent paper that he's co-authored with Robert Kaufman, from Boston University. The paper is titled, "Robust Granger Causality Testing of the Effect of Natural and Anthropogenic Radiative Forcings on Global Temperature". 

As I said, this isn't my field. However, if you want to see an example of Granger causality testing done well, you should take a look at this well-written paper.

Nice one!


© 2013, David E. Giles

Friday, May 3, 2013

When Will the Adjusted R-Squared Increase?

The coefficient of determination (R2) and t-statistics have been the subjects of two of my posts in recent days (here and here). There's another related result that a lot of students don't seem to get taught. This one is to do with the behaviour of the "adjusted" R2 when variables are added to or deleted from an OLS regression model.

We all know, and it's trivial to prove, that the addition of any variable to such a regression model cannot decrease the R2 value. In fact, R2 will generally increase strictly when a variable is added. Conversely, deleting any regressor from an OLS regression model cannot increase (and will generally reduce) the value of R2.
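As a quick numerical illustration (not taken from the post - the data and the little fitting function are invented for the example), the following numpy sketch checks the R2 fact above, together with the standard textbook result that the adjusted R2 rises when one regressor is added if, and only if, that regressor's |t|-ratio (in the larger model) exceeds one:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                    # candidate regressor to be added
y = 1.0 + 2.0 * x1 + rng.normal(size=n)    # x2 is irrelevant by construction

def ols_r2(y, X):
    """Fit OLS with an intercept; return (R2, adjusted R2, coefficient t-ratios)."""
    Z = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ b
    ss_res, ss_tot = e @ e, ((y - y.mean()) ** 2).sum()
    r2 = 1 - ss_res / ss_tot
    k = Z.shape[1]
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - k)
    s2 = ss_res / (len(y) - k)
    se = np.sqrt(s2 * np.diag(np.linalg.inv(Z.T @ Z)))
    return r2, adj, b / se

r2_small, adj_small, _ = ols_r2(y, x1)
r2_big, adj_big, t_big = ols_r2(y, np.column_stack([x1, x2]))

# R2 can never fall when a regressor is added:
print(r2_big >= r2_small)                                # True
# Adjusted R2 rises exactly when the added variable's |t| exceeds 1:
print((adj_big > adj_small) == (abs(t_big[-1]) > 1))     # True
```

Both printed conditions hold by construction - the first is the algebraic fact stated above, and the second is the "|t| > 1" rule, which holds whatever data you feed in.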

Mark Thoma on "Replication"

Yesterday, in his Economist's View blog, Mark Thoma discussed the importance of replicating results in empirical economics. He's absolutely right, of course.

I'll leave you to read what he had to say, but I especially liked his closing passage:
"One place where replication occurs regularly is assignments in graduate classes. I routinely ask students to replicate papers as part of their coursework. Even if they don't find explicit errors (and most of the time they don't), it almost always raises good questions about the research (why this choice, this model, what if you relax this assumption, there's a better way to do this, here's the next question to ask, etc., etc.). So replication does occur routinely in economics, and it is very valuable, but it is not a formal part of the profession the way it should be, and much of the replication is done by people (students) who generally assume that if they can't replicate something, it is probably their error. We have a lot of work to do on the replication front, and I want to encourage efforts like this."
At least one of my colleagues also assigns replication exercises in this way, and I really should do the same. Fortunately, more journals are either recommending or requiring that data-sets be made available as a condition of publication. The Journal of Applied Econometrics is one such journal, and we've recently been pushing in that direction with the Journal of International Trade & Economic Development.

This should become part of our culture.


© 2013, David E. Giles

When Can Regression Coefficients Change Sign?

Let's suppose that you've been running regressions happily all morning. It's a sunny day, but what could be better than enjoying some honest-to-goodness econometrics? Suddenly, you notice that one of the estimated coefficients in your model has a sign that's the opposite of what you were expecting (from your vast knowledge of the underlying economics). Shock! Horror!

Well, it's really good that you're on the look-out for that sort of thing. Congratulations! However, something has to be done about this problem.

Being young, with good eyesight, you also happen to spot something else that's interesting. One of the other estimated coefficients has a very low t-statistic. You have a brilliant idea! If you delete the variable associated with the very small t-value, maybe the "wrong" sign on the first coefficient will be reversed. Is this possible?
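This toy numpy simulation (invented for illustration - it doesn't settle the post's precise question about a low-t regressor) shows the basic mechanism at work: when two regressors are strongly correlated, deleting one of them can flip the sign of the coefficient on the other.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = -0.9 * x1 + 0.2 * rng.normal(size=n)        # strongly negatively correlated with x1
y = 1.0 * x1 + 2.0 * x2 + 0.1 * rng.normal(size=n)

def ols(y, *cols):
    """OLS with an intercept; returns the coefficient vector."""
    Z = np.column_stack((np.ones(len(y)),) + cols)
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return b

b_full = ols(y, x1, x2)     # coefficient on x1 is close to its true value, +1
b_short = ols(y, x1)        # dropping x2 drives the x1 coefficient negative
print(b_full[1] > 0, b_short[1] < 0)   # prints: True True
```

The sign change is just omitted-variable bias: the short-regression slope converges to 1 + 2 × cov(x1, x2)/var(x1) ≈ 1 - 1.8 = -0.8, so the estimate ends up on the "wrong" side of zero.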

Thursday, May 2, 2013

All About Spherically Distributed Regression Errors

This post is based on a handout that I use for one of my courses, and it relates to the usual linear regression model,

                                  y = Xβ + ε

In our list of standard assumptions about the error term in this linear multiple regression model, we include one that incorporates both homoskedasticity and the absence of autocorrelation. That is, the individual values of the errors are assumed to be generated by a random process whose variance (σ2) is constant, and all possible distinct pairs of these values are uncorrelated. This implies that the full error vector, ε, has a scalar covariance matrix, σ2In.

We refer to this overall situation as one in which the values of the error term follow a “Spherical Distribution”. Let's take a look at the origin of this terminology.
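As a small numerical sketch (the value σ2 = 4 and the dimensions are arbitrary choices for the example), we can simulate error vectors that satisfy these two assumptions and confirm that their covariance matrix is, up to sampling noise, the scalar matrix σ2In. In the normal case, such a density depends on ε only through ε'ε, so it is constant on spheres ε'ε = c - which is the origin of the "spherical" terminology.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 4.0
n, reps = 5, 200_000

# Many iid draws of an n-vector of errors: constant variance, no correlation
eps = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))

# The empirical covariance matrix is (approximately) sigma2 * I_n
S = np.cov(eps, rowvar=False)
print(np.allclose(S, sigma2 * np.eye(n), atol=0.1))   # True
```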

Good Old R-Squared!

My students are often horrified when I tell them, truthfully, that one of the last pieces of information I look at when evaluating the results of an OLS regression is the coefficient of determination (R2), or its "adjusted" counterpart. Fortunately, it doesn't take long to change their perspective!

After all, we all know that with time-series data it's really easy to get a "high" R2 value, because of the trend components in the data. With cross-section data, very low R2 values are quite common. For most of us, the signs, magnitudes, and significance of the estimated parameters are of primary interest. Then we worry about testing the assumptions underlying our analysis. R2 is at the bottom of the list of priorities.
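The time-series point is easy to demonstrate with made-up numbers: below, two series share nothing but a deterministic time trend (the trend slopes and noise levels are arbitrary choices for the example), yet regressing one on the other delivers a very high R2.

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.arange(100.0)

# Two trending series whose innovations are completely unrelated
y = 0.5 * t + rng.normal(scale=2.0, size=100)
x = 0.3 * t + rng.normal(scale=2.0, size=100)

# OLS of y on x (with an intercept)
Z = np.column_stack([np.ones(100), x])
b, *_ = np.linalg.lstsq(Z, y, rcond=None)
e = y - Z @ b
r2 = 1 - (e @ e) / ((y - y.mean()) ** 2).sum()
print(r2)   # very close to 1, despite no link between the innovations
```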

Wednesday, May 1, 2013

Finite Sample Properties of GMM

In a comment on a post earlier today, Stephen Gordon quite rightly questioned the use of GMM estimation with relatively small sample sizes. The GMM estimator is weakly consistent, the "t-test" statistics associated with the estimated parameters are asymptotically standard normal, and the J-test statistic is asymptotically chi-square distributed under the null. But what can be said in finite samples?

Of course, this question applies to almost all of the estimators that we use in practice - IV, MLE, GMM, etc. Indeed, lots of work has been done to explore the finite-sample properties of such estimators. For instance, consider my own work on bias corrections for MLEs (see here, here, and here). So, I'm more than sympathetic to the general point that Stephen made.

Estimating an Euler Equation Using GMM

In one of my grad. econometrics courses we cover Generalized Method of Moments (GMM) estimation. I thought that some readers might be interested in the material that I use for one of the associated lab. classes.

The lab. exercise involves estimating the Euler equation associated with the "Consumption-Based Asset-Pricing Model" (e.g., Campbell, 1993, 1996). This is a great example for illustrating GMM estimation, because the Euler equation is a natural "moment equation".
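The handout itself isn't reproduced here, but to give the flavour of the exercise, here is a minimal self-contained sketch (simulated data; the parameter values, instruments, and crude grid search are all choices made purely for illustration, not the handout's setup). The CRRA Euler equation E[β(C_{t+1}/C_t)^{-γ} R_{t+1}] = 1, conditional on date-t information, supplies the moment conditions; lagged variables serve as instruments.

```python
import numpy as np

rng = np.random.default_rng(123)
beta0, gamma0, rho = 0.95, 2.0, 0.5
T = 5000

# Simulate persistent consumption growth g_t, and a gross return R_t built so
# that E[beta0 * g_t^(-gamma0) * R_t - 1 | t-1] = 0 holds by construction
log_g = np.zeros(T)
for t in range(1, T):
    log_g[t] = rho * log_g[t - 1] + 0.1 * rng.normal()
g = np.exp(log_g)
eta = np.exp(0.05 * rng.normal(size=T) - 0.5 * 0.05**2)   # E[eta] = 1
R = (1.0 / beta0) * g**gamma0 * eta

# Instruments dated t-1: a constant, lagged growth, lagged return
Z = np.column_stack([np.ones(T - 1), g[:-1], R[:-1]])
g1, R1 = g[1:], R[1:]

def gmm_objective(beta, gamma):
    """Identity-weighted GMM objective built from the Euler-equation residual."""
    u = beta * g1 ** (-gamma) * R1 - 1.0
    gbar = Z.T @ u / len(u)          # sample moments, one per instrument
    return gbar @ gbar

# Crude grid search (a real application would use a numerical optimiser
# and an optimal weighting matrix, typically with two-step or iterated GMM)
betas = np.linspace(0.90, 1.00, 21)      # step 0.005, includes 0.95
gammas = np.linspace(0.5, 4.0, 36)       # step 0.1, includes 2.0
Q = np.array([[gmm_objective(b, c) for c in gammas] for b in betas])
i, j = np.unravel_index(Q.argmin(), Q.shape)
beta_hat, gamma_hat = betas[i], gammas[j]
print(beta_hat, gamma_hat)   # should land near (0.95, 2.0)
```

Note that the persistence in consumption growth is what gives the lagged instruments identifying power here; with serially independent data the extra moments would add essentially nothing.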

The basic statement of the problem is given below, taken from the handout that accompanies the lab. class exercises:

Tuesday, April 30, 2013

Some Official Data Come With Standard Errors!

Without intending to, I seem to have been on a bit of a rant about data quality and reliability recently! For example, see here, here, and here.

This post is about a related topic that's bugged me for a long time. It's to do with the measures of uncertainty that some statistical agencies (e.g., Statistics Canada) correctly report with some of their survey-based statistics.

A good example of what I have in mind is the Labour Force Survey (LFS) from Statistics Canada.