Econometrics Beat: Dave Giles' Blog

Friday, July 13, 2012

More Comments on the Use of the LPM

Alfredo drew my attention to Steve Pischke's reply to a question raised by Mark Schaffer on the Mostly Harmless Econometrics blog. The post was titled "Probit Better than LPM?". The question related to my own posts (here, here, and here, in reverse order) on this blog concerning the choice between OLS (the Linear Probability Model - LPM) and the Logit/Probit models for binary data.

Thanks, Alfredo, as this isn't a blog I follow.

Alfredo asked: "Would you care to respond? I feel like this is truly an exchange from which a lot of people can learn".

Concentrating, or Profiling, the Likelihood Function

We call it "concentrating", they (the statisticians) call it "profiling" - the likelihood function, that is.

Different language - same thing.

So what's this all about, anyway?
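As a minimal sketch of the idea in Python (the example, the function names, and the data are illustrative assumptions, not taken from the full post), consider an i.i.d. Normal sample with unknown mean mu and variance sigma2. For any given mu, the likelihood is maximized over sigma2 at the average squared deviation from mu. Substituting that value back in leaves a "concentrated" (or "profile") log-likelihood that depends on mu alone.

```python
import math

def full_loglik(mu, sigma2, data):
    """Log-likelihood of an i.i.d. N(mu, sigma2) sample."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

def concentrated_loglik(mu, data):
    """Log-likelihood with sigma2 replaced by its maximizer for the given mu."""
    # For fixed mu, the likelihood is maximized at sigma2 = mean squared deviation.
    sigma2_hat = sum((x - mu) ** 2 for x in data) / len(data)
    return full_loglik(mu, sigma2_hat, data)

# Hypothetical data, for illustration only.
sample = [1.2, 0.7, 2.9, 1.8, 2.4]
sample_mean = sum(sample) / len(sample)

for mu in (sample_mean - 0.5, sample_mean, sample_mean + 0.5):
    value = concentrated_loglik(mu, sample)
    print(f"mu = {mu:.2f}  concentrated log-likelihood = {value:.4f}")
```

Maximizing the concentrated function over mu alone gives the same estimate of mu as maximizing the full log-likelihood over both parameters, and it turns a two-dimensional search into a one-dimensional one.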

Decline and Fall of the Power Curve

When we think of the power curve associated with some statistical test, we usually envisage a curve that looks something like (half or all of) an inverted Normal density. That is, the curve rises smoothly and monotonically from a height equal to the significance level of the test (say 1% or 5%), until eventually it reaches its maximum height of 100%.

The latter value reflects the fact that power is a probability.

But is this picture that invariably comes to mind - and that we see reproduced in all elementary econometrics and statistics texts - really the full story?

Actually - no!
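For reference, here is a minimal Python sketch of the familiar textbook picture only, not of the exceptions. It assumes a two-sided z-test of a Normal mean with known variance, with "delta" denoting the true mean shift measured in standard-error units. The test, the function name, and the grid of delta values are illustrative assumptions.

```python
from statistics import NormalDist

def z_test_power(delta, alpha=0.05):
    """Power of a two-sided z-test when the standardized mean shift is delta."""
    std_normal = NormalDist()
    crit = std_normal.inv_cdf(1 - alpha / 2)
    # The statistic is N(delta, 1); we reject when it falls outside [-crit, crit].
    upper_tail = 1 - std_normal.cdf(crit - delta)
    lower_tail = std_normal.cdf(-crit - delta)
    return upper_tail + lower_tail

for delta in (0.0, 0.5, 1.0, 2.0, 3.0, 5.0):
    print(f"delta = {delta:3.1f}  power = {z_test_power(delta):.4f}")
```

At delta = 0 the power equals the significance level, and it then rises monotonically towards one as delta grows, which is exactly the well-behaved shape described above.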

"Data is", or "Data are"?

I guess I'm a ~~pedant~~ traditionalist when it comes to the word "data": one "datum", several pieces of "data", etc.

As with many matters relating to the use of language, though, this one isn't open and shut, by any means.

And so a few days ago we saw The Wall Street Journal, The Economist, and The Guardian grappling with this issue once again.

However, I'm going to stick to my guns, dust off my slide-rule, and also continue to use the "Oxford comma"!

Local vs. Global Approximations

Approximating unknown (continuously differentiable) functions by using a Taylor (Maclaurin) series expansion is commonplace in econometrics. However, do you ever pause to recall that such approximations are only locally valid - that is, valid only in a neighbourhood of the (possibly vector) point about which the approximation is made?

Unlike some other types of approximations - such as Fourier approximations - they are not globally valid.

Does this matter? Is it something we should be concerning ourselves with?
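A small Python sketch gives a feel for the "local" part. It uses the Maclaurin polynomial of ln(1 + x), chosen purely for illustration (it is an assumption, not the example from the full post), and compares it with the exact value at points near to, and far from, the expansion point x = 0.

```python
import math

def maclaurin_log1p(x, order=2):
    """Maclaurin polynomial of ln(1 + x): x - x^2/2 + x^3/3 - ... up to 'order'."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, order + 1))

def abs_error(x, order=2):
    """Absolute error of the polynomial approximation at x."""
    return abs(math.log1p(x) - maclaurin_log1p(x, order))

for x in (0.1, 0.5, 2.0):
    print(f"x = {x:3.1f}  order-2 error = {abs_error(x, 2):.5f}  "
          f"order-10 error = {abs_error(x, 10):.5f}")
```

Close to zero the approximation is very good, and adding terms helps. At x = 2, which lies outside the series' radius of convergence, adding terms makes the approximation worse rather than better.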

Mark Thoma in "The Browser"

It was nice to see this interview with Mark Thoma in The Browser today.

Enjoy!

Friday, July 6, 2012

The Milliken-Graybill Theorem

Let's think about a standard result from regression analysis that we're totally familiar with. Suppose that we have a linear OLS regression model with non-random regressors, and normally distributed errors that are serially independent and homoskedastic. Then, the usual F-test statistic, for testing the validity of a set of linear restrictions on the model's parameters, is exactly F-distributed in finite samples, if the null hypothesis is true.

In fact, the F-test is Uniformly Most Powerful Invariant (UMPI) in this situation. That's why we use it! If the null hypothesis is false, then this test statistic follows a non-central F-distribution.

It's less well-known that all of these results still hold if the assumed normality of the errors is dropped in favour of an assumption that the errors follow any distribution in the so-called "elliptically symmetric" family of distributions. On this point, see my earlier post here.

What if I were now to say that some of the regressors are actually random, rather than non-random? Is the F-test statistic still exactly F-distributed (under the null hypothesis)?
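As a reminder of the statistic under discussion, here is a minimal Python sketch that builds the F statistic from the restricted and unrestricted sums of squared residuals. It assumes the simplest possible case: an intercept plus one regressor, with a single restriction that the slope is zero. The function names and data are hypothetical, and the sketch says nothing about the statistic's distribution when the regressors are random, which is the question being posed here.

```python
import math

def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for q linear restrictions; k coefficients, n observations."""
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k))

def slope_tests(x, y):
    """Return (F, t) for H0: slope = 0 in y = a + b*x + error, fitted by OLS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Unrestricted model: intercept and slope are both estimated.
    ssr_u = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Restricted model: slope forced to zero, so the fit is just the mean of y.
    ssr_r = sum((yi - y_bar) ** 2 for yi in y)
    f_stat = f_statistic(ssr_r, ssr_u, q=1, n=n, k=2)
    t_stat = slope / math.sqrt((ssr_u / (n - 2)) / sxx)
    return f_stat, t_stat

# Hypothetical data, for illustration only.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [2.1, 3.9, 6.2, 7.8, 10.1]
f_stat, t_stat = slope_tests(x_obs, y_obs)
print(f"F = {f_stat:.3f}, t squared = {t_stat ** 2:.3f}")
```

With a single restriction the F statistic is just the square of the usual t-statistic, which makes for a handy check on the arithmetic.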

The Role of Statistics in the Higgs Boson Discovery

With the scientific world abuzz today over the (possible) confirmation of the existence of the Higgs Boson, this post from David Smith on the SmartData Collective is a must-read for anyone with an interest in statistics.

Friday, June 29, 2012

SURE Models

In recent weeks I've had several people email to ask if I can recommend a book that goes into all of the details about the "Seemingly Unrelated Regression Equations" (SURE, or just SUR) model.

Any decent econometrics text discusses this model, of course. However, the treatment usually focuses on the asymptotic properties of the standard estimators - iterated feasible GLS, or MLE.

Attention, Stata Users

I've mentioned the Econometrics by Simulation blog before. Although it's still relatively new, it's had some great posts, and Francis Smart is doing a terrific job, as is reflected in the way that the page view numbers are building up.

Definitely worth a look, especially (but not only) if you're a Stata user.