"First, catch your hare"
(Mrs. Beeton - recipe for jugged hare)
(I understand that it's debatable whether or not Mrs. Beeton actually wrote those precise words, but they're generally attributed to her and it's still pretty good advice.)
Cookbooks certainly have their place, but sometimes they're misunderstood or misused. Indeed, sometimes they're mislaid, and then if you don't understand the rudiments of cooking you're either going to go hungry, or you may create something very unpalatable. The same is true when it comes to certain types of econometrics courses or textbooks.
I'll lay it on the table - I am definitely not a fan of "Cookbook Econometrics".
Here's what I'm referring to.
It's pointless, and frankly dangerous, to simply tell students what to do, without telling them why. And I know that this applies to more than just econometrics. When I'm explaining to students why we go through the proofs of important results, I usually make the following points.
First, depending on the nature and level of the course, I may or may not expect them to be able to reproduce the proof in a test or exam - usually, that's the least of my concerns. Second, the real benefit in being led through the proof of a standard result in econometrics is that it enables you to see exactly where (and how) any underlying assumptions are actually used. It's one thing to be told that a certain result holds only if assumptions A, B, and C are satisfied. It's much more revealing to actually see the step in the proof where assumption A gets used. And third, it's also helpful to see what conditions or assumptions are not used when establishing a particular result. At the very least, this may provide a hint that the result is robust to the violation of those conditions. Or it may suggest the possibility that an even "stronger" result will emerge if additional assumptions are satisfied.
My contention is that if you've been taken through the proof, and seen the assumptions "in action", you're more likely to pay proper attention to those assumptions being satisfied when you use the result, day to day, in your empirical work.
To take a familiar example, let's think about the Gauss-Markov Theorem (GMT). This result uses a set of assumptions that ensure that the OLS estimator is "Best Linear Unbiased" for the coefficient vector, β, in the linear regression model, y = Xβ + u. Just what are these assumptions?
1. The model is correctly specified - that is, there are no omitted or extraneous regressors; and the functional form (with respect to the variables) is correct.
2. The model is indeed linear in the parameters.
3. The regressors (the columns of the X matrix) are either non-random, or if they are random they are uncorrelated with the random errors (the elements of u).
4. The X matrix has full (column) rank.
5. The random error term has a zero mean.
6. The random error term has a scalar covariance matrix - that is, the errors are serially independent and homoskedastic.
Every one of these assumptions is used (often more than once) in the proof of the GMT. Some of these assumptions are used only implicitly, and it's up to the instructor to highlight the fact that they're really there - lurking behind the scenes, so to speak. For example, we can't even talk about the usual OLS estimator of β unless that estimator is well defined - so we need Assumption 4 from the start. It's also used in the proof of the GMT when we compare the covariance matrix of the OLS estimator, namely σ²(X'X)⁻¹, with that of any other linear and unbiased estimator. The (X'X) matrix will be singular unless X has full rank.
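To make the full-rank point concrete, here is a minimal sketch using only the Python standard library. The data, the variable names (`group`, `trend`) and the helper functions are all hypothetical, made up purely for illustration. It builds X'X for a design that falls into the "dummy variable trap" (an intercept plus a complete set of dummies) and for a design with linearly independent columns, and compares the determinants.

```python
def gram_matrix(X):
    """Return X'X for a matrix X stored as a list of rows."""
    k = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(k)]
            for i in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


# Hypothetical data: a 0/1 dummy and a simple trend, six observations.
group = [1, 0, 1, 0, 1, 0]
trend = [1, 2, 3, 4, 5, 6]

# Intercept plus BOTH dummies: the columns satisfy const = d + (1 - d),
# so X does not have full column rank.
X_deficient = [[1, d, 1 - d] for d in group]

# Intercept, one dummy, and the trend: linearly independent columns.
X_full_rank = [[1, d, t] for d, t in zip(group, trend)]

det_deficient = det3(gram_matrix(X_deficient))
det_full_rank = det3(gram_matrix(X_full_rank))

print("det(X'X), dummy-variable trap:", det_deficient)
print("det(X'X), full column rank:   ", det_full_rank)
```

Because the entries are integers, the first determinant is exactly zero, so (X'X) has no inverse and the OLS formula can't even be evaluated. With real-valued data you would more likely see a tiny non-zero determinant and a badly conditioned matrix, which is the same problem in disguise.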
Assumptions 3 and 5 are among those needed to ensure that the OLS estimator is unbiased, and to get us started on the proof of the GMT. They're also there in the background, along with Assumption 1, when we use that covariance matrix formula.
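As a sketch of where each assumption enters, here is the standard textbook argument for the case of non-random regressors, numbering the assumptions in the order they are listed above. The symbols b (the OLS estimator), b* (a competing linear unbiased estimator), and the matrices C and D are my own notation for this illustration.

```latex
\begin{align*}
b &= (X'X)^{-1}X'y
  && \text{(defined only if $X$ has full column rank: Assumption 4)} \\
  &= \beta + (X'X)^{-1}X'u
  && \text{(substituting $y = X\beta + u$: Assumptions 1 and 2)} \\
E[b] &= \beta + (X'X)^{-1}X'E[u] = \beta
  && \text{(non-random $X$ and $E[u] = 0$: Assumptions 3 and 5)} \\
V[b] &= (X'X)^{-1}X'\,E[uu']\,X(X'X)^{-1} = \sigma^{2}(X'X)^{-1}
  && \text{($E[uu'] = \sigma^{2}I$: Assumption 6)}
\end{align*}

Now let $b^{*} = Cy$ be any other linear estimator, and write
$C = (X'X)^{-1}X' + D$. Then
\begin{align*}
E[b^{*}] &= CX\beta = \beta + DX\beta
  && \text{(unbiased for every $\beta$ only if $DX = 0$)} \\
V[b^{*}] &= \sigma^{2}CC' = \sigma^{2}\left[(X'X)^{-1} + DD'\right]
  && \text{(cross terms vanish because $DX = 0$)} \\
V[b^{*}] - V[b] &= \sigma^{2}DD'
  && \text{(positive semi-definite)}
\end{align*}
```

Notice that every line after the first takes it for granted that E[u] and E[uu'] exist and are finite. Nothing in the algebra flags that requirement explicitly.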
And then, of course, there's the issue of the form of the distribution that the random errors follow. The obvious point to make is that the proof of the GMT does not use, or require, an assumption that the errors are Normally distributed. If in fact the errors are Normal, then we get a stronger result - the OLS estimator is then "Best Unbiased". That's to say, we don't need to restrict our attention any more to the class of estimators that are linear functions of the random data. A stronger set of assumptions gives us a stronger result, in this case.
Let's think about this last point a bit further though. Implicitly, at least, the GMT does require that the distribution of the error term meets some conditions. Suppose I were to assume that the error vector is multivariate Student-t, with 2 degrees of freedom. Does the GMT still hold? Actually, no it doesn't, even though the form of the errors' distribution doesn't appear to be used anywhere in the proof. If the error vector has this particular distribution then the OLS estimator's covariance matrix is not defined! It makes little sense to then compare something that isn't defined with other covariance matrices!
The situation gets worse if we drop the degrees of freedom for the Student-t distribution to just 1. Then we have the Cauchy distribution. In this case none of the moments of the error term's distribution exist, and the same is true for the sampling distribution of the OLS estimator of β. It's difficult to talk about the OLS estimator being unbiased when its mean isn't defined! And the same point as before applies to its covariance matrix. In this particular case we're actually violating Assumption 5, above. And remember how this particular assumption was hiding in the background?
So, adding an extra assumption doesn't necessarily lead to a stronger result. In this case it undermines the very foundations of the theorem, leaving us with no result at all!
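A small Monte Carlo experiment is one way to see this in action. What follows is only a sketch, using the Python standard library. The sample size, number of replications, seed, "true" coefficients and all function names are arbitrary choices of mine, not anything implied by the theorem. It estimates the slope in a simple regression many times over, once with standard Normal errors and once with Cauchy errors (Student-t with 1 degree of freedom), holding the regressor fixed.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """OLS slope in the simple regression of y on a constant and x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def draw_normal(rng):
    """Standard Normal error draw."""
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng):
    """Standard Cauchy draw (Student-t, 1 d.f.) via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def simulate_slopes(error_draw, n_obs=50, n_reps=2000,
                    beta0=1.0, beta1=2.0, seed=12345):
    """Return n_reps OLS slope estimates from y = beta0 + beta1*x + u."""
    rng = random.Random(seed)
    x = [i / (n_obs - 1) for i in range(n_obs)]  # fixed (non-random) regressor
    slopes = []
    for _ in range(n_reps):
        y = [beta0 + beta1 * xi + error_draw(rng) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


def summarize(slopes, beta1=2.0):
    """Summary statistics for a list of simulated slope estimates."""
    return {
        "median": statistics.median(slopes),
        "mean": statistics.fmean(slopes),
        "std_dev": statistics.stdev(slopes),
        "max_abs_error": max(abs(b - beta1) for b in slopes),
    }


normal_summary = summarize(simulate_slopes(draw_normal))
cauchy_summary = summarize(simulate_slopes(draw_cauchy))

print("Normal errors:", normal_summary)
print("Cauchy errors:", cauchy_summary)
```

With Normal errors the simulated slopes should cluster tightly around the true value of 2. With Cauchy errors the OLS slope is itself Cauchy distributed, so although the median of the estimates stays in the neighbourhood of 2, the sample mean and standard deviation are driven by a handful of wild draws and won't settle down however many replications you run. There is simply no population mean or variance for them to converge to.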
Seeing what assumptions are used, and how they're used is an important part of learning econometrics. Of course, it's true of many other fields of study too. I'm not claiming a uniqueness result here! For example, suppose you're learning about the concept and implications of Comparative Advantage in a principles course. There are some pretty strong assumptions that go along with the standard Ricardian model. You need to know about them and how they're used, so you'll then understand which ones can be relaxed without affecting the main result.
You miss out on a lot of interesting insights in those "Cookbook Econometrics" courses. And then there are some not-to-be-named econometrics texts that can be criticized on exactly the same grounds - they're just cookbooks. In my view, they contain less real econometrics than recent issues of Econometrica, and that's saying something these days!
Hi David,
I think that you may have forgotten the "econ" component of econometrics and, as a result, are a bit harsh on "cookbook econometrics". Think at the margin for a moment. For which margin are cookbook econometrics texts written? They are for those individuals who would, under other circumstances, not take an econometrics course at all. In my opinion, missing many interesting insights is better than missing the subject altogether. Of course, I'm firmly in the "an imperfect number is better than no number" camp.
Cheers,
Brandon
Brandon: That's a very fair point, and it's well taken. Thanks!
DG
I took a few advanced (for an undergrad) classes in econometrics my sophomore year (at a top US school). The courses were advanced because they were very proof oriented (as opposed to the slightly more cookbook regular econometrics courses). They really wanted us to understand econometrics at a more profound level than the cookbook you are displeased about... But I hated the proofs; they just made no sense. I could follow the strict logic from step to step, but reproducing the proof, let alone piecing together the big picture, was impossible for me and most of my classmates.
The following year I took real analysis (a very proof oriented math class) and gained an appreciation for the way proofs work and are constructed. Since then, I see proofs with a new appreciation, understanding, and even excitement.
In other words:
I wish I had been taught how to appreciate proofs in general before I was taught econometric proofs.
PS- I love the blog. Keep up the great work!
BTW,
If you ever run out of ideas for topics, I've always been confused about moving average models in econometrics. I can follow all the proofs with the sequences of error terms and what not, but I never quite understood how the error terms are derived in the first place. It seemed weird that we just took the error terms as given in all the proofs I've seen, but perhaps I'm just missing something obvious.
Anonymous 1: Thanks for the comment. I know what you mean - I was brought up in a math. dept. before moving to economics and econometrics at the grad. level.
DG
Anonymous 2: Thanks for the suggestion. I think I can do that. If you wanted to email me with a specific example, please feel free to do so.
DG
Proofs are nice, but the epistemological foundations of econometrics (particularly questions of model selection and specification searching) are even more important, and usually neglected.
Consider the first assumption: "The model is correctly specified - that is, there are no omitted or extraneous regressors; and the functional form (with respect to the variables) is correct."
The problem here is that the very notion of model X having "missing" regressors or the "correct" functional form is meaningless unless there is a "true model" with the "right" regressors and the "right" functional form to which X can be compared.
What would "the true model" be? If the data of interest were generated by a model, then that's the "true model". The problem of course is that the data of interest never come from models, they come from the world.
Therefore, when dealing with the real world, there are no "correctly specified models" (and therefore, there aren't any 'true coefficients' either).
I found all econometric theory completely baffling until I realized that all theoretical discussions of the properties of estimators were in the context of known DGPs, and that in the real world there ain't no such animal. The specification search (methods for which were never once discussed in any econometrics class I ever took, at either the undergrad or grad level), is the crude and imperfect fitting of a complex world into an oversimplified model.
Darren: All good points. Have you read Ed. Leamer's book, "Specification Searches: Ad Hoc Inference With Nonexperimental Data" (Wiley, N.Y., 1978)?
DG
Hi Dave,
Yes, I had heard of it, and in fact I'd downloaded it at one point, but I haven't as of yet read it. Thanks for the reminder! Leamer wrote a recent article called "Tantalus on the Road to Asymptopia", and he seems to feel not much has changed.
He's expert enough to credibly say what I merely suspected: most econometric theory is only usable in the context of a known DGP, and there are no known DGPs, only specification searches to choose a close approximation. The implications of this insight have not penetrated econometric practice and instruction anywhere near as much as they should have, it seems to me.
Darren: I have a lot of sympathy with your position. Thanks for the pointer to Leamer's 2010 paper. When Ed's "Specification Searches" book came out in '78 I got 3 or 4 of my grad. students to do a "book reading" with me. We met once a week at lunchtime and worked our way through it. It was extremely worthwhile for everyone - myself included, of course.
DG