Tuesday, October 11, 2011

On the Importance of Knowing the Assumptions

I've been pretty vocal in the past about the importance of understanding what conditions need to be satisfied before you start using some fancy new econometric or statistical "tool". Specifically, in my post, "Cookbook Econometrics", I grizzled about so-called "econometrics" courses that simply teach you to "do this", "do that", without getting you to understand when these actions may be appropriate.

My bottom line: you need to understand what assumptions lie behind such claims as "this estimator will yield consistent estimates of the parameters"; or "this test has good power properties" - preferably before you get too excited about using the estimator or test and you cause too much damage. In other words, it's all very well to understand what problems you face in your empirical work (simultaneity, missing observations, uncertain model specification, etc.), but then when you choose some tools to deal with these problems, you need to be confident that your choices will achieve your objectives.

A forthcoming paper in Economics Letters brings this point home to us all, rather nicely. Baldauf and Santos Silva (2011) re-visit Huber's (1964, 1973) M-estimator. They show that the version of this that is most commonly used by econometricians - the so-called MBW estimator - is inconsistent if the errors of the regression model are heteroskedastic and/or have a skewed distribution.

The MBW estimator gets its name from the fact that it is usually implemented as an iteratively re-weighted least squares estimator based on biweights (Beaton and Tukey, 1974). Why should econometricians care? Well:
  • This robust estimator is included in STATA - see the rreg command.
  • This package is widely used by researchers working in empirical micro.
  • These researchers often use cross-section data, which is often associated with models with heteroskedastic errors.
There's no comfort in having a huge sample size. That's the point about "inconsistency". Even if you have an infinitely large sample size (the whole population?) to work with, your estimates won't converge to the true parameter values. Of that you can be certain. What a shame!
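
To make the point concrete, here is a minimal simulation sketch. It is my own illustration, not code from Baldauf and Santos Silva, and it does not reproduce what rreg does step by step. The function names (biweight_irls, simulate), the tuning constant c = 4.685, the MAD-based scale and the data-generating process (true slope of 2, with mean-zero errors that are both right-skewed and heteroskedastic) are all assumptions chosen for illustration.

```python
import numpy as np


def biweight_irls(X, y, c=4.685, tol=1e-8, max_iter=100):
    """Biweight M-estimate of regression coefficients via IRLS.

    Illustrative only: starts from OLS and rescales the residuals by the
    normalized MAD at every iteration (an assumed, simplified scheme).
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        resid = y - X @ beta
        # Robust scale: median absolute deviation, normalized for the normal case
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        u = resid / (c * scale)
        # Tukey biweights: observations with |u| >= 1 get zero weight
        w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
        sw = np.sqrt(w)
        # Weighted least squares step
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta


def simulate(n=100_000, seed=0):
    """Return (OLS, biweight) coefficient estimates for one simulated sample."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0, size=n)
    # Mean-zero errors: right-skewed, with spread that grows with x
    e = (1.0 + x) * (rng.exponential(1.0, size=n) - 1.0)
    y = 1.0 + 2.0 * x + e  # true intercept 1, true slope 2
    X = np.column_stack([np.ones(n), x])
    ols = np.linalg.lstsq(X, y, rcond=None)[0]
    mbw = biweight_irls(X, y)
    return ols, mbw


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        ols, mbw = simulate(n)
        print(f"n={n:>7}: OLS slope={ols[1]:.3f}, biweight slope={mbw[1]:.3f}")
```

Under these assumptions the OLS slope should settle near 2 as n grows, while the biweight slope should settle somewhere else, because the large positive errors that get down-weighted are more common at high values of x. Making the sample bigger just makes the wrong answer more precise.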

Of course, the inconsistency of most standard non-linear estimators (such as Logit, Probit, Tobit) when the errors are heteroskedastic is well known, but frequently overlooked by enthusiastic practitioners. And I'm not talking about inconsistent standard errors here - I'm talking about the estimates of regression coefficients themselves. For more on this, see my earlier post, "Gripe of the Day".
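
A rough sketch of the Logit case, again under assumptions of my own: a latent-variable model with a true slope of 1 and logistic errors whose scale is either constant or equal to 1 + |x|. The helper names (fit_logit, logit_slopes) are hypothetical, and the Newton-Raphson routine is a bare-bones stand-in for whatever your package does.

```python
import numpy as np


def fit_logit(X, y, n_iter=25):
    """Logit maximum likelihood by Newton-Raphson (illustrative, no safeguards)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def logit_slopes(n=100_000, seed=0):
    """Return Logit slope estimates under homoskedastic and heteroskedastic errors."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=n)
    eps = rng.logistic(size=n)
    X = np.column_stack([np.ones(n), x])
    # Latent model y* = 0 + 1*x + error; we observe y = 1 if y* > 0
    y_homo = (x + eps > 0).astype(float)                      # constant error scale
    y_het = (x + (1.0 + np.abs(x)) * eps > 0).astype(float)   # scale grows with |x|
    return fit_logit(X, y_homo)[1], fit_logit(X, y_het)[1]


if __name__ == "__main__":
    b_homo, b_het = logit_slopes()
    print(f"homoskedastic slope: {b_homo:.3f}, heteroskedastic slope: {b_het:.3f}")
```

With the constant error scale the estimated slope should sit close to 1. With the heteroskedastic errors it should come out well below 1, even with 100,000 observations, and that is the coefficient itself going wrong, not just its standard error.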

So if you're an enthusiastic STATA user, be careful if you think that M-estimation is the answer to the outliers in your data. (And it's not just the MBW estimator that suffers from this problem, as Baldauf and Santos Silva point out.)

Their advice?
"...in typical econometric problems where the errors can be heteroskedastic and skewed, the MBW-estimates are difficult to interpret and can be very misleading. Therefore, the use of the MBW-estimator in econometrics cannot be generally recommended, and it certainly should not be routinely used as an alternative to OLS."  [Baldauf and Santos Silva, 2011, p.7.]
If you don't understand the assumptions/conditions, you'll probably screw up.


Note: The links to the following references will be helpful only if your computer's IP address gives you access to the electronic versions of the publications in question. That's why a written References section is provided.

References

Baldauf, M. and J. M. C. Santos Silva (2011). "On the use of robust regression in econometrics". Economics Letters, doi:10.1016/j.econlet.2011.09.031, in press.

Beaton, A. E. and J. W. Tukey (1974). "The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data". Technometrics, 16, 146-185.

Huber, P. J. (1964). "Robust estimation of a location parameter". Annals of Mathematical Statistics, 35, 73-101.

Huber, P. J. (1973). "Robust regression: Asymptotics, conjectures and Monte Carlo". Annals of Statistics, 1, 799-821.



© 2011, David E. Giles
