We all know that the Maximum Likelihood Estimator (MLE) is justified primarily on the basis of its desirable (large sample) asymptotic properties. Specifically, under the usual regularity conditions, the MLE is generally weakly consistent, asymptotically efficient, and its limit distribution is Normal. There are some important exceptions to this, but by and large that's what you get.
When it comes to finite-sample properties, the MLE may be unbiased or biased, and efficient or inefficient, depending on the context. It can be a "mix and match" situation, even in the context of one problem. For instance, for the standard linear multiple regression model with Normal errors and non-random regressors, the MLE for the coefficient vector is unbiased, while that for the variance of the error term is biased.
As we often use the MLE with relatively small samples, evaluating (and compensating for) any bias is of some interest.
One characteristic of many interesting MLEs is that they can't be written down as a formula, in closed form. Instead, they are obtained by using an iterative numerical algorithm to solve the "likelihood equations" - the set of first-order conditions associated with maximizing the (log-) likelihood function.
Recall that if θ* is an estimator of a parameter (vector), θ, then the bias of this estimator is defined as E[θ*] - θ, where E[.] denotes expectation with respect to the sampling distribution of the estimator. Think about the OLS estimator (the MLE if the errors are normally distributed) for the coefficient vector in the standard linear regression model: b = (X'X)^(-1)X'y. To show that this estimator is unbiased (assuming errors with a zero mean), we replace "y" in this formula for b with Xβ + ε; then we take the expectation of b, and we find that E[b] = β. The estimator is unbiased.
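To make this concrete, here's a minimal simulation sketch in Python that averages b = (X'X)^(-1)X'y over many replications of the error term, holding the regressors fixed. (The sample size, true coefficients, and regressor design are just illustrative values I've chosen for the example.)

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
beta = np.array([1.0, 2.0])                               # "true" coefficients (illustrative)
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # non-random regressors, held fixed

# Average b = (X'X)^(-1) X'y over many draws of the Normal error term
reps = 20000
b_sum = np.zeros(2)
for _ in range(reps):
    y = X @ beta + rng.normal(0, 1, n)
    b_sum += np.linalg.solve(X.T @ X, X.T @ y)

print(b_sum / reps)   # close to [1.0, 2.0]: OLS is unbiased
```

The average of the replicated estimates settles down on the true coefficient vector, which is just the simulation counterpart of the algebraic result E[b] = β.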
What if we didn't have the formula, b = (X'X)^(-1)X'y, written down in front of us? How could we determine whether this estimator is unbiased?
Well, we could simulate its sampling distribution, of course. The same would be true for any general MLE problem where all that we were told is that the parameter estimators are the solution to a set of first-order conditions. Examples include the MLEs for Logit and Probit models; count data models (e.g., Poisson and Negative Binomial); the FIML estimator for a simultaneous equations model; models that are non-linear in the parameters to begin with; etc.
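As an illustration, here's a sketch of such a simulation for a small Poisson regression model - one where the MLE has no closed form, and each replication solves the likelihood equations by Newton-Raphson. (The sample size, coefficients, and regressor design are assumptions made purely for the example.)

```python
import numpy as np

rng = np.random.default_rng(1)

def poisson_mle(X, y, tol=1e-10, max_iter=100):
    """Newton-Raphson solution of the Poisson regression likelihood equations,
    X'(y - exp(X beta)) = 0, starting from beta = 0."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

n = 30
beta_true = np.array([0.5, 0.3])                         # illustrative values
X = np.column_stack([np.ones(n), rng.uniform(0, 2, n)])  # fixed regressors

# Simulate the sampling distribution of the MLE at this small n
reps = 2000
est = np.array([poisson_mle(X, rng.poisson(np.exp(X @ beta_true)))
                for _ in range(reps)])
print(est.mean(axis=0) - beta_true)   # simulated bias of the MLE at n = 30
```

Even though we never wrote the estimator down as a formula, averaging the replicated estimates and subtracting the true parameter values gives us a direct (simulated) measure of the bias.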
We could bootstrap the bias of the MLE and then "bias-correct" ("bias-adjust") the MLE to reduce its bias. In the case of a bootstrap correction, we'll generally be able to reduce the bias from O(n^(-1)) to O(n^(-2)).
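For a simple case where we can see what's going on - the MLE of the rate of an exponential distribution, λ̂ = 1/x̄ - the bootstrap bias correction amounts to the following sketch. (The true rate, sample size, and number of bootstrap replications are illustrative choices.)

```python
import numpy as np

rng = np.random.default_rng(7)
lam_true, n = 2.0, 25
x = rng.exponential(1 / lam_true, n)
lam_hat = 1 / x.mean()                       # MLE of the exponential rate

# Bootstrap estimate of the bias: resample the data, re-estimate,
# and compare the average of the re-estimates to the original MLE
B = 2000
boot = np.array([1 / rng.choice(x, n).mean() for _ in range(B)])
bias_hat = boot.mean() - lam_hat

lam_bc = lam_hat - bias_hat                  # bias-corrected MLE: 2*lam_hat - mean(boot)
print(lam_hat, lam_bc)
```

Since the MLE of the exponential rate is biased upward in small samples, the bootstrap correction pulls the estimate back down.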
However, there's another way to proceed - a way that yields an analytic expression for the bias (to O(n^(-1))) of an MLE, even when we can't write down the estimator's formula. We can then use an estimate of this bias to "bias-correct" the original MLE, and again reduce its bias to O(n^(-2)).
The basis for this is the expansion of the likelihood equations proposed by Cox and Snell (1968).
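To give the flavour of the technique in a case where everything can be checked by hand: for an exponential distribution with rate λ, the MLE is λ̂ = 1/x̄, and the O(n^(-1)) bias works out to λ/n. Plugging in the MLE itself, the bias-corrected estimator is λ̃ = λ̂(1 - 1/n) - which, in this particular case, happens to be exactly unbiased, since E[λ̂] = nλ/(n - 1). A quick Monte Carlo check (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
lam_true, n, reps = 2.0, 20, 100000

x = rng.exponential(1 / lam_true, (reps, n))
lam_hat = 1 / x.mean(axis=1)          # MLE; biased upward by approx. lam/n
lam_cs = lam_hat * (1 - 1 / n)        # subtract the estimated O(1/n) bias, lam_hat/n

print(lam_hat.mean() - lam_true)      # approx. lam_true/(n-1) = 0.105
print(lam_cs.mean() - lam_true)       # approx. 0: the correction removes the bias
```

Of course, in interesting applications there is no closed-form expression for λ̂ to begin with; the point of the Cox-Snell expansion is that the O(n^(-1)) bias can still be derived analytically from the log-likelihood's cumulants, and then evaluated at the MLE.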
Over the past couple of years I've been working with Hui (Helen) Feng, Ryan Godwin, Jacob Schwartz and Ling (Linda) Xiao on some research that uses the Cox-Snell technique in a variety of contexts. This is not the place to go into details, but you can find some of our results to date in the publications by Giles and Feng (2011), Giles (2012), Giles, Feng and Godwin (2012), Schwartz, Godwin and Giles (2012), and Xiao and Giles (2012). (The links for each of these papers take you to the Working Paper versions.)
We also have other papers currently under revision, including Giles, Feng and Godwin (2011), and Schwartz and Giles (2011).
One interesting message that comes out of this work is that the Cox-Snell analytic bias corrections work extremely well for a wide range of interesting problems. Moreover, they are generally very easy to apply and they often dominate the computing-intensive bootstrap bias correction, once the MSEs of the bias-corrected estimators are taken into account.
Cox, D. R. and E. J. Snell, 1968. A general definition of residuals. Journal of the Royal Statistical Society, B, 30, 248-275.
Giles, D. E., 2012. Bias reduction for the maximum likelihood estimator of the parameters in the half-logistic distribution. Communications in Statistics – Theory and Methods, 41, 212-222.
Giles, D. E. and H. Feng, 2011. Reducing the bias of the maximum likelihood estimator for the Poisson regression model. Economics Bulletin, 31 (4), 2933-2943.
Giles, D. E., H. Feng and R. T. Godwin, 2011. Bias-corrected maximum likelihood estimation of the parameters of the generalized Pareto distribution. Econometrics Working Paper EWP1105, Department of Economics, University of Victoria.
Giles, D. E., H. Feng and R. T. Godwin, 2012. On the bias of the maximum likelihood estimator for the two-parameter Lomax distribution. Forthcoming in Communications in Statistics – Theory and Methods.
Schwartz, J. and D. E. Giles, 2011. Biased-reduced maximum likelihood estimation for the zero-inflated Poisson distribution. Econometrics Working Paper EWP1102, Department of Economics, University of Victoria.
Schwartz, J., R. T. Godwin and D. E. Giles, 2012. Improved maximum likelihood estimation of the shape parameter in the Nakagami distribution. Forthcoming in Journal of Statistical Computation and Simulation.
Xiao, L. and D. E. Giles, 2012. Bias reduction for the maximum likelihood estimator of the generalized Rayleigh family of distributions. Forthcoming in Communications in Statistics – Theory and Methods.
© 2012, David E. Giles