Monday, September 12, 2011

Econometrics and One-Way Streets

It's a nice sunny day out there, so you decide to get on your bike and pedal down the street a couple of blocks to your favourite ice cream parlour. Great idea! Except that it's a one-way street and you're pedalling against the traffic. Fortunately, on this particular day, most people have already headed for the beach, there are very few cars that have to avoid you, and somehow you make it to your destination in one piece. Whew!

Now, maybe your ears are suffering from some of the abuse you received along the way - maybe not. I guess it would depend on exactly who you encountered during your little trip. In any event, you savour your well-earned ice cream and feel pretty good about yourself, and life in general. You could have travelled the longer route, around the block, to avoid the one-way street, but gee, the end result was the same, so that's all that matters. Right?

Wrong, of course! You got away with it this time, but the next time you take this short cut, you probably won't be so lucky. You may not get to your destination at all. No ice cream. Certainly not the same outcome as if you took the time to pedal around the block.

And on top of that, what about the bad example you set for those young kids in that car that had to swerve to miss you? Maybe they now think it's OK to take a short cut - you get the same result, after all.

It's these short cuts in the use of econometrics that really get to me at times.

How many times have you heard someone defend their use of (say) OLS estimation, in a context where it is patently inappropriate, by saying something like: "... I also estimated a (logit / probit / whatever) model and got essentially the same results; and these OLS results are easier to interpret"?

O.K. - they got away with it this time, but that's no excuse. What about next time?

And then there are those papers that describe research using time-series or panel data, but there's not a single mention of the words "non-stationary", "unit roots", or "cointegration" anywhere in the paper. Trust me, I read such a paper just last Friday.

Getting the "right answer" for the "wrong reason" is not something to be proud of! So why do some purveyors of empirical economics keep pretending that it is?

I can excuse ignorance - we can just encourage the person concerned to learn some more econometrics. But when the person concerned really knows better, it amounts to laziness or arrogance - or some mixture of the two. Then, I have no sympathy, and no respect.

So, unless you want some flak when you ride down a one-way street, in the wrong direction, maybe you should take the proper route - around the block. That way, if you reach your destination it's more likely to be for the right reason.

A while back I posted a piece that I titled "Gripe of the Day". I began that piece by serving notice that there are lots of things about the way some people do their empirical economics that annoy me. Count this post as another gripe - there'll be more coming!

10 comments:

  1. OK, first, guilty as charged (and feeling guilty about it too!). :) But, for the sake of argument (and not to defend any past crimes of my own--mmm...ice cream!), here are a couple rhetorical questions in response:

    1) Suppose I want to reach an audience that is less econometrically sophisticated than I am. Is it ever OK for me to, for example, estimate a linear probability model (instead of, say, a probit) because my readers stand a better chance of following the empirical argument?

    2) Page lengths represent an implicit constraint in publishing. The more sophisticated the econometrics used (often) the more length in setup is required (especially if your audience isn't as up to speed on the technique as you are). That takes away from space spent on the theory, the data description, and other valuable information that could be conveyed to the reader in the limited space for the article. At what rate should I trade off space spent on econometrics over space spent on, say, theory?

    3) Time constraints apply to applied empirical researchers. We can spend our next hour chasing down some interesting new data (or thinking up a new question or...), or we can learn some interesting new econometrics with that next hour. Obviously if we do too much of one, we end up doing too little of the other. What's the right balance?

    I guess a way of summarizing all of this in an overall question is to ask, "Is it ever appropriate to forgo estimating the 'perfect' econometric model, in order to gain something else on some other front?" In economic theory, for instance, the "best" models (i.e. those most used and most treasured by applied folks) are often the least correct, in the sense that parsimony is valued. So we throw out all kinds of real stuff in order to get something simple, tractable and easy to "play with." That doesn't mean we don't tailor its sophistication and applicability depending on the context, but we always keep some simplification for the sake of clarity. Does the same thinking apply in econometrics? Should it? Am I drawing a false dichotomy?

    Of course, logits and probits (instead of LPMs) are pretty straightforward for the sort of audience I write for. So, consider my questions as applying especially to higher level econometrics. Like the stuff econometricians know, but us applied people are shaky on (heteroskedasticity-consistent bivariate ordered probit, anyone?). ;)

  2. Martin: Thanks for the thoughtful comments. Taking your points in reverse order, I have a lot of sympathy with (3). It's an economic problem - right? There are constraints involved, and we have to weigh marginal costs and benefits. Balance is indeed the trick - I agree with that. I don't think there's a single perfect answer to that. I just get upset when someone has obviously put in a huge amount of effort cleaning up a really large and interesting data-set, and then ignores some basic econometric issues. To me, that's a lack of balance.

    As for (2), I think the answer probably lies with sending the material to the appropriate journal; proper use of appendices; etc. We're not going to raise the game in the profession if we keep dumbing it down.

    Re. (1): to take your specific example literally, I would answer "no, it's not." Unless it's a high school audience perhaps! This is REALLY basic undergrad. stuff. How does it help a reader if you use a model that's clearly mis-specified? You might help one or two readers feel a bit more comfortable, but you'll leave just as many wondering if they should trust anything you say!

    Take my other example of not even MENTIONING unit roots etc. in an otherwise interesting paper based on time-series or panel data. I wouldn't have minded if there had been just a paragraph that said that the author had applied this, that and another panel unit root test; had found no evidence of non-stationarity, and that detailed results were available on request (or on his/her web page).

    Again, thanks for the great comments - much appreciated.

  3. So, do you have a "10 Most Wanted" list of econometric crimes (or criminals!)?

  4. Martin: I like that idea!!!!! I just might steal it! If I do, I promise you a hat-tip, at the very least!

  5. You can have the idea--so long as you don't put me on the list!

  6. Dave: I agree entirely with your general point, but I am not so sure about the LPM/probit/logit example. Angrist & Pischke's "Mostly Harmless Econometrics" has a robust defence of the LPM on pp. 103-7. The argument is that (a) we usually care mostly about marginal effects in the probit/logit/tobit/etc. setting; (b) these marginal effects are usually pretty close to the LPM coefficients; (c) the nonlinear world is messier and less transparent w.r.t. things like weighting, IV, panel data, etc.

    I found the argument pretty convincing. I think it's legitimate enough to make the LPM a less-than-ideal example to make your point. The non-stationarity example is much more powerful, I think - ignoring non-stationarity is far more likely to generate spurious results than using an LPM instead of a probit.

    --Mark

  7. Mark (& Martin): On reflection, I agree. Fair point. I could have used a much better example than the LPM. There are plenty of others - e.g., failing to use I.V., and using OLS instead.

    I think what really bugs me is when a writer who should know better, simply takes short cuts, and then says "well, it wouldn't really affect the results much". I wouldn't want my physician doing that!

    Thanks for the very constructive feedback!

  8. I don't know what bothers me most, laziness, mendacity, or ignorance. Dealing properly with data cleaning and outliers is not quite what you had in mind, but the one-way street metaphor applies very nicely here, too. What's worse: people who don't do proper data cleaning and robustness-checking because they can't be bothered, or because they don't want their results to disappear, or because they don't realize they have a problem when, e.g., they have a t-stat of 1,000? I was once at a seminar given by a PhD student from a very respectable US university who reported the latter and was oblivious to the possibility that bigger is not always better.

    On reflection: mendacity, ignorance, laziness.

    --Mark

  9. David, o.k., here's my problem.

    I'm teaching a 4th year micro research essay course.

    4th year econometrics is a co-requisite. A good percentage of the students won't see logit or probit until close to the end of term, if at all. So do I:

    a. get them to estimate the linear probability model that they actually (I hope) understand?

    b. teach them some handwaving version of probit or logit? I.e. put a bunch of dots and a nice looking curve on the blackboard, tell the students, here's the formula, I can't really explain it to you, but you don't really need to understand all that math anyways, here's the Stata command. Oh, and remember to add that option at the end that generates marginal effects.

    c. don't let them look at any research problem that has categorical dependent variables, shutting down almost all projects involving Data Liberation Initiative datasets? (Even income data is categorical in these datasets.)

    d. get them to write theory papers

    I'll probably opt for the second alternative, but it won't be rigorous, and I'm not sure that it's the right thing to do.

  10. Frances: Thanks for the thoughtful comment. First, I agree - you have a problem! (The timing of the courses.) I would have hoped that by the time the students are taking a 4th year econometrics course they would at least have heard the words "Maximum Likelihood", so you could sprinkle those magic words of explanation (without any math) while taking option (b).

    Options (c) and (d) won't fly, in my view.

    I think you know already why I really don't like (a). For (a), I'd show them (it takes just 3 or 4 lines) why the model is inherently heteroskedastic; and point out that if y is binary, then the errors can't be normal, or even have any continuous distribution, so the LPM has obvious specification error issues. I'd say that this mis-specification may not be too severe in practice, but to do the job properly we need to re-think the model itself (not just the estimator). Once the model is fixed up then we don't have a standard regression framework any more, so a different estimator has to be used. A good choice is one that works well (by construction) with the sorts of large data-sets they're likely to meet in your course (hand wave), yadda, yadda, yadda......
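
    For readers who want those 3 or 4 lines written out, one possible version follows. It is a sketch only: the notation ($p_i$, $x_i'\beta$, $\varepsilon_i$) is assumed here and is not taken from the comment.

    ```latex
    \begin{align*}
    y_i &= x_i'\beta + \varepsilon_i, \qquad y_i \in \{0, 1\}, \qquad
           p_i \equiv \Pr(y_i = 1 \mid x_i) = x_i'\beta \\
    \varepsilon_i &=
      \begin{cases}
        1 - p_i & \text{with probability } p_i \\
        -p_i    & \text{with probability } 1 - p_i
      \end{cases} \\
    E[\varepsilon_i \mid x_i] &= p_i (1 - p_i) - (1 - p_i)\, p_i = 0 \\
    \operatorname{Var}(\varepsilon_i \mid x_i) &= p_i (1 - p_i)^2 + (1 - p_i)\, p_i^2
      = p_i (1 - p_i) = x_i'\beta \, (1 - x_i'\beta)
    \end{align*}
    ```

    The variance changes with $x_i$, so the errors are heteroskedastic, and each $\varepsilon_i$ can take only two values, so it cannot be normal or have any continuous distribution.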

    I remember being told as a grad. student that teaching an "applied econometrics" course is much harder than teaching a theory course. I'm totally convinced that's true!

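
A footnote to Mark's point in comment 6 about spurious results. The following is a minimal Python sketch, not anything taken from the post or the comments. It regresses one simulated random walk on another, completely independent one, and counts how often the usual OLS t-test "finds" a relationship at the nominal 5% level, with independent white noise as a benchmark. The sample size, number of replications, seed, and all function names are illustrative assumptions.

```python
import math
import random


def random_walk(n, rng):
    """Return a driftless random walk of length n with N(0, 1) steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(n, rng):
    """Return n independent N(0, 1) draws (a stationary benchmark)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def slope_t_stat(x, y):
    """Conventional OLS t-statistic for the slope in y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return b / se_b


def rejection_rate(make_series, n_obs=100, n_reps=500, seed=12345):
    """Share of replications in which |t| > 1.96 for two independent series."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        x = make_series(n_obs, rng)
        y = make_series(n_obs, rng)  # generated independently of x
        if abs(slope_t_stat(x, y)) > 1.96:
            rejections += 1
    return rejections / n_reps


spurious_rate = rejection_rate(random_walk)
benchmark_rate = rejection_rate(white_noise)
print(f"independent random walks: rejection rate {spurious_rate:.2f}")
print(f"independent white noise:  rejection rate {benchmark_rate:.2f}")
```

With settings like these, the rejection rate for the random walks should come out far above 5%, while the white-noise benchmark should sit close to it. The t-statistics in the first case look impressive only because the non-stationarity was ignored, which is the "right answer for the wrong reason" problem in reverse.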