Sunday, December 22, 2013

More on Student-t Regression Models

My recent post on maximum likelihood estimation of non-standard regression models in EViews included the case where the model's errors are independent and Student-t distributed. In that example, the degrees of freedom for the Student-t distribution were assumed to be known. There was a good reason for making this assumption, as was spotted by Osman Dogan in his comment on that post.

If we relax this assumption and treat the degrees of freedom, v, of the t-distribution as an additional parameter to be estimated, then the likelihood function exhibits some unfortunate characteristics. Specifically, this function becomes unbounded at a boundary of the parameter space. Consequently, maximizing the likelihood function numerically will generally deliver only a local maximum, not a global maximum.
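
To see where the unboundedness comes from, here is a minimal Python sketch (my own illustration, not code from either of the papers cited in this post). It assumes the model y = X*beta + sigma*e with e i.i.d. Student-t(v), picks beta so that k of the n observations are fitted exactly, and then shrinks sigma towards zero. With m residuals equal to zero, the likelihood behaves like sigma^(v(n-m) - m) as sigma goes to zero, so it diverges whenever v < m/(n-m). The function name student_t_loglik, the simulated data, and the particular values of n, k and v are all assumptions made for the illustration.

```python
import math
import numpy as np


def student_t_loglik(y, X, beta, sigma, v):
    """Log-likelihood of y = X @ beta + sigma * e, with e i.i.d. Student-t(v)."""
    resid = y - X @ beta
    n = y.shape[0]
    # Log of the Student-t normalizing constant for one observation.
    const = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    u = (resid / sigma) ** 2 / v
    return n * (const - math.log(sigma)) - 0.5 * (v + 1) * np.sum(np.log1p(u))


# Simulated data: purely illustrative values, not taken from the post.
rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=5, size=n)

# Choose beta so that the first k observations are fitted exactly.
beta_exact = np.linalg.solve(X[:k], y[:k])

threshold = k / (n - k)   # the likelihood diverges for v below this value
v_small, v_large = 0.05, 5.0
sigmas = [1.0, 1e-2, 1e-4, 1e-6]

ll_small = [student_t_loglik(y, X, beta_exact, s, v_small) for s in sigmas]
ll_large = [student_t_loglik(y, X, beta_exact, s, v_large) for s in sigmas]

print(f"threshold for v: {threshold:.3f}")
for s, a, b in zip(sigmas, ll_small, ll_large):
    print(f"sigma={s:.0e}  loglik(v={v_small})={a:.2f}  loglik(v={v_large})={b:.2f}")
```

With v = 0.05 the log-likelihood keeps rising as sigma falls, while with v = 5 it falls away. So the trouble is confined to a corner of the parameter space where both sigma and v are small, but an optimizer that wanders into that corner will not come back.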

You might ask, "why would this matter?" Well, basically, if you want to be sure that your MLE achieves the good asymptotic properties that motivate us to use it in the first place, then you need to globally maximize the likelihood function.

I discussed this issue in some detail in an earlier post, here.

In the context of the multiple regression model with independent Student-t errors with an unknown degrees of freedom parameter, these issues have been discussed fully by Fernandez and Steel (1999), for example. In particular, those authors show how a Bayesian approach to this estimation problem can overcome the difficulties associated with MLE here.

The problem is very reminiscent of the "incidental parameters" problem that arises widely in statistics, as well as in certain econometric estimation problems. Good examples of this general type of problem in econometrics include "switching regression" models, models of markets that are in disequilibrium, and stochastic frontier production functions.

It's well known that a Bayesian approach is productive in the case of the "incidental parameters" problem, so it shouldn't be too surprising that it's also helpful with the Student-t regression model.

So, if you want to estimate a regression model with independent Student-t errors, and the degrees of freedom parameter associated with that distribution is unknown, then don't use maximum likelihood estimation! The Bayesian estimator discussed by Fernandez and Steel (1999) is one alternative. Pianto (2010) suggests a bootstrap estimator, and another possibility would be to consider method of moments estimation, which would result in estimates that are at least weakly consistent.
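
As a rough sketch of the method of moments idea (an illustration under my own assumptions, not the estimator from any of the papers cited here): OLS gives a consistent estimate of the coefficients, and the second and fourth moments of the residuals can then be matched to those of a scaled t-distribution, using variance = sigma^2 * v/(v - 2) and excess kurtosis = 6/(v - 4). This needs v > 4, which is an assumption, and the function name mom_student_t_regression and the simulated data are hypothetical.

```python
import numpy as np


def mom_student_t_regression(y, X):
    """OLS for beta, then match residual variance and kurtosis to a scaled t(v).

    Assumes v > 4 so that the fourth moment of the errors exists.
    Returns (beta, sigma, v); v is np.inf if the residuals show no excess kurtosis.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    m2 = np.mean(resid ** 2)
    m4 = np.mean(resid ** 4)
    excess_kurt = m4 / m2 ** 2 - 3.0
    if excess_kurt <= 0:
        # No finite v matches: the tails look no heavier than the normal's.
        return beta, np.sqrt(m2), np.inf
    v = 4.0 + 6.0 / excess_kurt          # from excess kurtosis = 6 / (v - 4)
    sigma = np.sqrt(m2 * (v - 2.0) / v)  # from variance = sigma^2 * v / (v - 2)
    return beta, sigma, v


# Simulated data: illustrative values only.
rng = np.random.default_rng(123)
n = 100_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true, v_true = np.array([1.0, -0.5]), 2.0, 10.0
y = X @ beta_true + sigma_true * rng.standard_t(df=v_true, size=n)

beta_hat, sigma_hat, v_hat = mom_student_t_regression(y, X)
print("beta:", beta_hat, "sigma:", sigma_hat, "v:", v_hat)
```

The estimate of v is noisy, because it leans on the sample fourth moment, so a large sample is needed before it settles down. The attraction is simply that nothing here is being maximized, so the unbounded likelihood never enters the picture.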


References

Fernandez, C., and M. F. J. Steel, 1999. Multivariate Student-t regression models: Pitfalls and inference. Biometrika, 86, 153-167. (Downloadable version here.)

Pianto, D. M., 2010. A bootstrap estimator for the Student-t regression model.


© 2013, David E. Giles

2 comments:

  1. I skimmed the Fernandez and Steel paper and noticed that they always consider the left-hand-side variable to be multivariate. Do these problems persist when the left-hand-side variable is univariate?

    As a practical matter, it is quite common to use Student's t distribution when fitting univariate ARCH models and to treat the degree-of-freedom parameter as a parameter to be estimated. Is that problematic?

    Replies
    1. Thanks for the comment. Yes, the same problems arise when the LHS variable is univariate. This is discussed in the other paper referenced in the post. Regarding the ARCH model, yes, this is very common practice. Again, there are issues with maximizing the likelihood function, and care should be taken. If you use this approach then you're in the hands of whoever programmed the maximization algorithm - they'd better know what they're doing! As always when using MLE with a non-linear problem, you need to try different starting values to see which local maximum you end up at.
