## Friday, April 6, 2012

### Is it Me or is it Them??

I really do value these sessions we've been having together.

Occasionally I have some "gripe" that I just have to get off my chest. I try really hard not to let these things "get to me" - honest, I really do!! I'm sure that you've noticed.

I try. But sometimes it all gets too much. I can't explain it in rational terms. Maybe the meds. just didn't kick in as anticipated?

For whatever reason, I sometimes find myself feeling frustrated, and confused, by what I see around me .......... that is, with respect to some of the so-called "applied econometrics" literature that gets rammed down my throat. I know that I don't have to read it. But just when I'm happily ignoring it, I end up in a seminar where it rears its ugly head. I know, I know, .... I should just shrug it off.

An example? Sure - that's easy. By the way,.......Has the clock started?

Well, doctor (....... Oh! Jane? Sure.... of course, Jane is fine with me........ As long as..........yes, yes, of course.........)

O.K. (Jane), here we go! (I should warn you, though - this might get a bit technical.)

Here's the thing, Jane (this still feels a bit uncomfortable to me, but..........O.K., I'm working on it). I keep encountering these empirical papers where everything hinges on a regression model in which the dependent variable has been transformed by taking logarithms. Typically, the regressors are measured in the levels of the data. That's when they're not just a bunch of dummy variables.

(No, that wasn't meant to be rude - it's a technical term that we use, and......... O.K.!)

That's to say, we have a semi-logarithmic model. (I warned you that this might get technical........ sorry, Jane!) Not that I have anything against semi-logs (if I might use a colloquial expression). I mean, I've estimated them myself from time to time. In fact, some of my best friends........... Yes - I know that we've discussed them before.

Anyway! When I ask someone why they've estimated a model of this type, they typically say something like,
"Well, it's very convenient when it's in that form. It allows me to easily 'read off' the implications of the estimated coefficients."
Which makes me hopping mad! (Sorry,..... I mean, I get a little irked.)

Convenient?!?!?!  Econometrics of convenience is about as acceptable as a marriage of convenience!

What about some specification testing? We actually have a whole toolkit that we can dip into. Is this just laziness, or what?

And then, if I push the matter a little further (very gently, of course), I tend to get some really defensive reaction, along the lines of,
"But this is just a reduced form equation. It's not supposed to have any structural content."
Hmmm! Now, if the response had something to do with transforming the dependent variable so that the assumption of Normal errors was better approximated, I might have had an idea of where they were coming from. Even then, I'd have to be persuaded that this was the correct transformation to use.

Their bottom line seems to be:
"I've used a semi-log because that's what we all do with our empirical work in my field, and any fool can see that it's highly convenient to do so, so what's your beef?"
(By the way, how are we going for time?)

Trying to get these people to take any specification testing seriously just seems to be impossible. I don't know, Jane, am I losing it, or what? I mean, back in the day,.......................

Sorry, I know I shouldn't say that!

Anyway, there's something even more disturbing that tends to come up. After protesting that the equation is just a reduced form relationship, they usually contradict themselves by suddenly saying that they have reason to believe that one or more of the regressors is really endogenous, so they're going to have to find an instrumental variable. They make it sound like an Easter egg hunt.

Oh! So it's really a structural equation, and not a reduced form equation, after all! Am I confusing you, Jane?

Is it me, or is it them?

I mean, I certainly know the difference between these two things. Do they?

You can't have it both ways. It's either a reduced form equation, or it's a structural equation. It can't change its spots, so to speak, while we're looking at it. And it can't be both, at the same time, in the one context.

Do you see why I keep coming to see you each week?

So, now they're "instrumenting" like crazy - no, it's just a silly expression - most of them can't even carry a tune, let alone play "Chopsticks" on their keyboards.

Sorry! Where was I? Oh yes -

They've now changed horses in mid-stream. Yes, I know that takes a certain amount of skill, but this isn't a rodeo, after all!

Suddenly we have a structural equation - with no "structure" to it at all, really. Still no use of the underlying economic theory to justify the functional form. Do you see what I mean?

And this "instrumenting"! You know what recent religious converts are like? Zealous to a fault. They seem to forget some of us were actually publishing the theory behind this stuff when they were still in kindergarten. (Sorry!) The J test, for example. If they bothered to read Denis Sargan's '58 paper, they'd see that this is just his test for over-identification. I could go on..........

Oh really? Wow, that went fast. But we've barely started on this...........

O.K., O.K., I understand. Same time next week? Yes, of course, thank you. That would be just fine.

I'm sure I'll feel better about this eventually - thanks to you, of course................

© 2012, David E. Giles

1. This was a good post. One related thing I've always wondered is why the generalized linear model (GLM) framework isn't more popular in economics. Instead of taking the log of y and using that as the dependent variable, why not just use GLM with a log link function? In the GLM framework predicting the value of y instead of log y is not a problem. Yes, in GLM you are modeling ln(E(y)) rather than E(ln(y)) as in OLS, but in principle most people are interested in E(y), not E(ln(y)) anyway.

1. Thanks! You're absolutely right about GLMs. Regrettably, you'll find that 90% of econometricians haven't been exposed. Hence the "piecemeal", and ad hoc, approach in a lot of the applied "microeconometrics" literature. There's a lack of understanding that many of the models they use are just special cases of the GLM. Pity!
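For readers who want to see the log-link idea from this exchange in action, here is a minimal sketch. It is not anyone's preferred estimator: it assumes a Poisson-type variance function (so it is really a quasi-likelihood fit), uses simulated data, and the function name `fit_log_link_glm` is made up for illustration. It simply contrasts a model for ln(E(y)) with the naive back-transform of an OLS fit to ln(y).

```python
import numpy as np

def fit_log_link_glm(X, y, n_iter=50, tol=1e-10):
    """IRLS for a log-link GLM, assuming a Poisson-type variance function.

    Hypothetical helper for illustration only; it models ln(E[y | X]) = X @ beta.
    """
    # Start from the OLS fit to ln(y); any reasonable start would do.
    beta = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response for the log link
        XtW = X.T * mu                   # working weights equal mu in this case
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (an assumption): a semi-log model with Normal errors in ln(y).
rng = np.random.default_rng(42)
n = 5000
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
y = np.exp(1.0 + 0.5 * x + rng.normal(0.0, 0.8, size=n))

beta_ols = np.linalg.lstsq(X, np.log(y), rcond=None)[0]   # models E[ln y]
beta_glm = fit_log_link_glm(X, y)                          # models ln E[y]

mean_naive = np.exp(X @ beta_ols).mean()   # exp of fitted ln(y): too low for E[y]
mean_glm = np.exp(X @ beta_glm).mean()     # fitted E[y] from the log-link model

print("sample mean of y:        ", y.mean())
print("naive exp(OLS) mean:     ", mean_naive)
print("log-link GLM fitted mean:", mean_glm)
```

With an intercept in the model, the log-link fit reproduces the sample mean of y, while exponentiating the OLS fitted values under-predicts it. The slope estimates from the two fits should be close in this particular simulation; it is the intercept, and hence the predicted level of y, that differs.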

2. Do you have a good reference on GLM from an econometrics perspective that you like?

In my understanding, GLM makes a very different assumption on the error distribution. In the usual OLS log-log model, the error is assumed to be multiplicative, while in the GLM the error is assumed to be additive. I think in many economic applications the former makes more sense or has some theoretical interpretation.

You can also transform ln(y) back to the original scale using Duan's method:
Duan, N., 1983. Smearing estimate: A nonparametric re-transformation method. Journal of the American Statistical Association 78:605-610.
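A rough sketch of how the smearing estimate works, assuming a simple OLS fit of ln(y) with an intercept and simulated data (the function name `duan_smearing_predict` is hypothetical): the naive prediction exp(x'b) is scaled up by the sample average of the exponentiated residuals.

```python
import numpy as np

def duan_smearing_predict(x, y, x_new):
    """OLS of ln(y) on x (plus intercept), then predictions of y on the original scale.

    Returns the naive back-transform, the smeared prediction, and the smearing factor.
    Hypothetical helper for illustration only.
    """
    X = np.column_stack([np.ones(len(x)), x])
    X_new = np.column_stack([np.ones(len(x_new)), x_new])
    log_y = np.log(y)
    beta = np.linalg.lstsq(X, log_y, rcond=None)[0]
    resid = log_y - X @ beta
    smear = np.mean(np.exp(resid))       # Duan's smearing factor
    naive = np.exp(X_new @ beta)         # ignores the retransformation bias
    return naive, smear * naive, smear

# Simulated data (an assumption): Normal errors on the log scale.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0.0, 2.0, size=n)
y = np.exp(1.0 + 0.5 * x + rng.normal(0.0, 0.8, size=n))

naive, smeared, smear = duan_smearing_predict(x, y, x)
print("smearing factor:        ", smear)
print("sample mean of y:       ", y.mean())
print("mean naive prediction:  ", naive.mean())
print("mean smeared prediction:", smeared.mean())
```

Because the OLS residuals average to zero when there is an intercept, the smearing factor is at least one, so the correction always moves the naive prediction upwards. No distributional assumption on the errors is needed, which is the point of the method.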

I will admit that GLM usually gives you much more reasonable estimates than re-transformed OLS, at least in my limited applied experience.

1. I'd be happy if more econometricians read McCullagh & Nelder's book. I can't think of a GLM text that takes an econometric perspective, but I'd love to see one.

I'm glad you mentioned Duan's transformation - something else that gets overlooked a lot.

3. Dmitry said "while in the GLM the error is assumed to be additive". I don't think this is remotely accurate, except at the normal. For example, consider a Gamma GLM with log-link - there's no sense in which the 'error' there is any more additive than it is for a lognormal model for which you take logs and fit a normal linear model. And if you go to say a logistic regression, the claim that the error is additive would be bizarre.
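To spell out the Gamma example in symbols (standard notation, not taken from the comment itself): a Gamma GLM with a log link can be written with a multiplicative, unit-mean disturbance, just as the lognormal model can.

$$
\text{Gamma GLM, log link:}\quad y_i = \exp(x_i'\beta)\,\varepsilon_i,\qquad \varepsilon_i \sim \text{Gamma}(\text{shape}=\nu,\ \text{rate}=\nu),\qquad E[\varepsilon_i]=1,
$$

$$
\text{Lognormal model:}\quad y_i = \exp(x_i'\beta)\,e^{u_i},\qquad u_i \sim N(0,\sigma^2),\qquad E[y_i \mid x_i] = \exp\!\left(x_i'\beta + \tfrac{\sigma^2}{2}\right).
$$

In both cases the disturbance multiplies exp(x'β). What differs is the normalisation (E[ε] = 1 in the GLM, E[u] = 0 on the log scale in the lognormal model) and the assumed distribution, not whether the error is additive or multiplicative.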