## Monday, October 3, 2011

### Making a Name for Yourself!

So you want to make a name for yourself? One way for an up-and-coming young econometrician to do this would be to come up with a new estimator or test that everyone subsequently associates with your name. Think of the Aitken estimator, the Durbin-Watson test, the Cochrane-Orcutt estimator, the Breusch-Pagan test, White's robust covariance matrix estimator, and so on.

This can be a bit risky - your new inferential procedure might not "catch on" as well as you hope it will. Worse yet, someone else might come up with a similar idea around the same time, and steal your glory.  A much safer way to make a name for yourself is to be the first to prove a result that has hitherto had everyone baffled.

Admittedly, some of the best of these have gone already. Don't waste time trying to prove Fermat's "last theorem", for example! Within statistics and econometrics there are still some possibilities, though. In a recent post I talked about the Behrens-Fisher problem. In a nutshell, there is no exact, finite-sample, solution to the problem of testing the equality of two normal means when the population variances are unknown and unequal. In that post I related this to the corresponding regression problem of testing for a structural break in the coefficient vector when the errors are heteroskedastic.

Now, there's a somewhat related estimation problem that you could use to gain fame and fortune. This problem is an "open" one. It's different from the Behrens-Fisher problem, for which we know there is no exact solution. This other statistical problem is more of a conjecture that has yet to be proven, and it has direct econometric implications.

The problem in question relates to the so-called "Graybill-Deal estimator". O.K., so this estimator has a name already, but there's still some room for you to tag along!

Here's the deal. Pal et al. (2007) note that:

> "One of the oldest and most interesting problems in statistical inference, which has dogged the researchers over the last five decades, is the estimation of a common mean of two normal populations with unknown and possibly unequal variances."

Sounds simple enough, doesn't it?

First, note that the variances are unknown and possibly different in the two populations. If the variances are known, then the solution is simple. Let $\mu$ denote the common mean, and let $\sigma_1^2$ and $\sigma_2^2$ be the two variances. Then the MVUE (and MLE) of $\mu$ is:

$$\mu^* = \frac{n_1\bar{x}_1/\sigma_1^2 + n_2\bar{x}_2/\sigma_2^2}{n_1/\sigma_1^2 + n_2/\sigma_2^2},$$

where $n_1$ and $n_2$ are the two sample sizes, and $\bar{x}_1$ and $\bar{x}_2$ are the respective sample means. This estimator has an intuitive "weighted average" interpretation.
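
To make the "weighted average" reading explicit, the estimator can be rewritten with a single weight. The symbol $w$ is introduced here purely for illustration and does not appear in the original discussion. The variance expression follows from the independence of the two sample means:

$$\mu^* = w\,\bar{x}_1 + (1 - w)\,\bar{x}_2, \qquad w = \frac{n_1/\sigma_1^2}{n_1/\sigma_1^2 + n_2/\sigma_2^2}, \qquad \operatorname{Var}(\mu^*) = \frac{1}{n_1/\sigma_1^2 + n_2/\sigma_2^2}.$$

So the sample with the smaller value of $\sigma_i^2/n_i$ (the more precise sample mean) gets the larger weight.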

In the case of unknown and unequal variances, the most commonly used (and unbiased) estimator of $\mu$ is the so-called Graybill and Deal (1959) estimator:

$$\mu_{GD} = \frac{n_1\bar{x}_1/s_1^2 + n_2\bar{x}_2/s_2^2}{n_1/s_1^2 + n_2/s_2^2},$$

where:

$$s_1^2 = \frac{\sum_i (x_{1i} - \bar{x}_1)^2}{n_1 - 1}; \qquad s_2^2 = \frac{\sum_i (x_{2i} - \bar{x}_2)^2}{n_2 - 1}.$$
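
As a quick illustration, here is a minimal Python sketch of the Graybill-Deal formula above. The function name `graybill_deal` and the toy data are made up for this example; only the standard library is used.

```python
from statistics import fmean, variance


def graybill_deal(x1, x2):
    """Graybill-Deal estimate of a common mean from two samples.

    Each sample mean is weighted by n_i / s_i^2, where s_i^2 is the
    usual unbiased sample variance (divisor n_i - 1).
    """
    n1, n2 = len(x1), len(x2)
    xbar1, xbar2 = fmean(x1), fmean(x2)
    s1_sq, s2_sq = variance(x1), variance(x2)  # variance() divides by n - 1
    w1, w2 = n1 / s1_sq, n2 / s2_sq
    return (w1 * xbar1 + w2 * xbar2) / (w1 + w2)


# Toy data (hypothetical): the first sample is much less noisy than the second.
x1 = [1.0, 2.0, 3.0]   # mean 2, s^2 = 1
x2 = [0.0, 4.0, 8.0]   # mean 4, s^2 = 16
print(graybill_deal(x1, x2))  # 36/17, roughly 2.1176
```

With these numbers the weights are 3/1 and 3/16, so the estimate sits much closer to the mean of the less noisy first sample.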

Now, there are four interesting things to note about $\mu_{GD}$:

- $\mu_{GD}$ is not the MLE of the common mean. The MLEs of the three parameters can't be written in "closed-form" and (as is often the case) the likelihood equations need to be solved numerically.
- The MLE of the common mean is also an unbiased estimator. Surprisingly, this was established formally only recently - see Pal et al. (2007).
- The MLEs for $\sigma_1^2$ and $\sigma_2^2$ can be shown to be downwards-biased, with biases of $-2\sigma_1^2/n_1$ and $-2\sigma_2^2/n_2$ respectively, to $O(n^{-1})$. See Giles (2011).
- Finally (and here's your chance), we don't know whether or not $\mu_{GD}$ is admissible under quadratic loss.

Let's take a look at this last point. Under a quadratic loss function, "risk" is just the same as MSE. So, what's being said here is the following:

For this problem, we don't know if there exists any other estimator of $\mu$ that has an MSE at least as small as that of the Graybill-Deal estimator in finite samples, for all possible values of the unknown parameters, and a strictly smaller MSE for at least some values of the parameters.

If even one such alternative (possibly biased) estimator exists, then the Graybill-Deal estimator will be inadmissible, and for a lot of statisticians this would be enough to dissuade them from using it.

There are some conjectures, and some results under additional restrictive conditions (see, e.g., Sinha and Mouqadem, 1982, and Pal and Lim, 1997), but the admissibility or otherwise of the Graybill-Deal estimator is still unknown.
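
To get a feel for what "risk under quadratic loss" means in practice, here is a rough Monte Carlo sketch in Python that estimates the MSE of three estimators of the common mean at one chosen parameter point. Everything in it is an assumption made for illustration: the function names, the parameter values, and in particular the way the MLE is computed (a simple fixed-point iteration on the likelihood equations, started from the Graybill-Deal estimate, with no check that it reaches the global maximum). It is not the procedure used in any of the papers cited here.

```python
import random
from statistics import fmean, variance


def graybill_deal(x1, x2):
    """Graybill-Deal estimate: sample means weighted by n_i / s_i^2."""
    w1 = len(x1) / variance(x1)  # variance() uses the n - 1 divisor
    w2 = len(x2) / variance(x2)
    return (w1 * fmean(x1) + w2 * fmean(x2)) / (w1 + w2)


def common_mean_mle(x1, x2, tol=1e-10, max_iter=500):
    """Common-mean MLE by fixed-point iteration on the likelihood equations.

    Illustrative only: convergence to the global maximum is assumed.
    """
    n1, n2 = len(x1), len(x2)
    xbar1, xbar2 = fmean(x1), fmean(x2)
    ss1 = sum((x - xbar1) ** 2 for x in x1)
    ss2 = sum((x - xbar2) ** 2 for x in x2)
    mu = graybill_deal(x1, x2)  # starting value
    for _ in range(max_iter):
        # Variance MLEs given the current mu: (1/n_i) * sum_j (x_ij - mu)^2
        v1 = ss1 / n1 + (xbar1 - mu) ** 2
        v2 = ss2 / n2 + (xbar2 - mu) ** 2
        w1, w2 = n1 / v1, n2 / v2
        mu_new = (w1 * xbar1 + w2 * xbar2) / (w1 + w2)
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu


def simulate_mse(mu, sigma1, sigma2, n1, n2, reps=2000, seed=0):
    """Monte Carlo MSE estimates at a single parameter point."""
    rng = random.Random(seed)
    sq_err = {"graybill_deal": 0.0, "mle": 0.0, "xbar1_only": 0.0}
    for _ in range(reps):
        x1 = [rng.gauss(mu, sigma1) for _ in range(n1)]
        x2 = [rng.gauss(mu, sigma2) for _ in range(n2)]
        sq_err["graybill_deal"] += (graybill_deal(x1, x2) - mu) ** 2
        sq_err["mle"] += (common_mean_mle(x1, x2) - mu) ** 2
        sq_err["xbar1_only"] += (fmean(x1) - mu) ** 2
    return {name: total / reps for name, total in sq_err.items()}


# One arbitrary parameter point; other points may rank the estimators differently.
for name, mse in simulate_mse(mu=0.0, sigma1=1.0, sigma2=3.0, n1=10, n2=10).items():
    print(f"{name}: {mse:.4f}")
```

Keep in mind that a simulation like this only compares risks at the parameter points you happen to try. Admissibility is a statement about the whole parameter space, so no amount of simulation will settle the question. It can, however, suggest where to look for a dominating estimator.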

Let's translate some of this into econometrics terms.

Consider the case of a linear regression model of the form $y = X\beta + \varepsilon$. The (conditional) mean of $y$ is $X\beta$. Suppose that we want to estimate $\beta$, but we are worried that there may be a structural break in the (unknown) variance of the errors, and of $y$, at some known point. (This is the situation we discussed in the context of the Goldfeld-Quandt test for homoskedasticity in that previous post.)

With normal errors, we now have the Graybill-Deal estimation situation. The Graybill-Deal estimator of $\beta$ will be:

$$\beta_{GD} = \frac{n_1 b_1/s_1^2 + n_2 b_2/s_2^2}{n_1/s_1^2 + n_2/s_2^2},$$

where $b_1$ and $b_2$ are the OLS estimators of $\beta$ based on the two sub-samples; and now $s_1^2$ and $s_2^2$ are the sums of squared OLS residuals, divided by the degrees of freedom, for each sub-sample regression.
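
Here is a small Python sketch of that calculation, applying the formula exactly as written above. To keep it short it assumes the simplest possible case, which is my simplification and not part of the discussion above: a single regressor and no intercept, so each sub-sample regression has one coefficient and n - 1 degrees of freedom. The function names, the break point, and the data are all hypothetical.

```python
def ols_through_origin(x, y):
    """OLS slope and s^2 for y = beta * x + error (one regressor, no intercept)."""
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    ssr = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, ssr / (len(x) - 1)  # one estimated coefficient, so n - 1 df


def beta_gd(x, y, break_at):
    """Combine the two sub-sample OLS estimates with weights n_i / s_i^2."""
    b1, s1_sq = ols_through_origin(x[:break_at], y[:break_at])
    b2, s2_sq = ols_through_origin(x[break_at:], y[break_at:])
    n1, n2 = break_at, len(x) - break_at
    w1, w2 = n1 / s1_sq, n2 / s2_sq
    return (w1 * b1 + w2 * b2) / (w1 + w2)


# Made-up data: the errors are visibly noisier after the fourth observation.
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.8, 6.3, 7.9, 3.0, 2.5, 7.5, 6.0]
print(beta_gd(x, y, break_at=4))
```

The combined estimate always lies between the two sub-sample OLS estimates, and it leans towards the one from the sub-sample with the smaller residual variance.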

Here's the question:

Is $\beta_{GD}$ admissible under quadratic loss?

(This loss function is the one of most interest, given that we are using least squares estimation here.)

This is your big chance to make a name for yourself!

Note: The links to the following references will be helpful only if your computer's IP address gives you access to the electronic versions of the publications in question. That's why a written References section is provided.

References

Giles, D. E. (2011). Bias correction for the MLEs of the scale parameters in the common mean problem, with implications for the admissibility of the Graybill-Deal estimator. Mimeo., Department of Economics, University of Victoria.

Pal, N. and W-K. Lim (1997). A note on second-order admissibility of the Graybill-Deal estimator of a common mean in several normal populations. Journal of Statistical Planning and Inference, 63, 71-78.

Pal, N., J-L. Lin, C-H. Chang and S. Kumar (2007). A revisit to the common mean problem: Comparing the maximum likelihood estimator with the Graybill-Deal estimator. Computational Statistics and Data Analysis, 51, 5673-5681.

Sinha, B. K. and O. Mouqadem (1982). Estimation of the common mean of two univariate normal populations. Communications in Statistics - Theory and Methods, 11, 1603-1614.