Thursday, January 16, 2014

Estimating the Generalized Pareto Distribution

The generalized Pareto distribution (GPD) arises in the modelling of "extremes", especially when the "peaks-over-threshold" approach is used. Estimating the parameters of the GPD by the method of maximum likelihood is particularly challenging, because the likelihood function doesn't satisfy the usual regularity conditions for all possible values of the parameters.
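To make the regularity issue concrete, here is a minimal Python sketch of the GPD log-likelihood. It is not taken from the paper. It assumes the common parameterization with shape xi and scale sigma, in which the excesses y over the threshold have distribution function F(y) = 1 - (1 + xi*y/sigma)^(-1/xi). The names gpd_loglik and gpd_sample are purely illustrative. When xi < 0 the support is bounded above by -sigma/xi, so it depends on the parameters, and the log-likelihood drops to minus infinity as soon as a single observation falls outside it.

```python
import math
import random


def gpd_loglik(xi, sigma, excesses):
    """GPD log-likelihood for excesses y >= 0 over a threshold.

    Assumed parameterization: F(y) = 1 - (1 + xi*y/sigma)**(-1/xi).
    Returns -inf if the parameters put any observation outside the support.
    """
    if sigma <= 0:
        return -math.inf
    n = len(excesses)
    if abs(xi) < 1e-12:
        # Limiting case xi -> 0: exponential distribution with mean sigma.
        return -n * math.log(sigma) - sum(excesses) / sigma
    total = 0.0
    for y in excesses:
        z = 1.0 + xi * y / sigma
        if z <= 0:
            # Observation lies beyond the upper end-point -sigma/xi.
            return -math.inf
        total += math.log(z)
    return -n * math.log(sigma) - (1.0 + 1.0 / xi) * total


def gpd_sample(xi, sigma, n, rng):
    """Draw n GPD excesses by inverting the distribution function."""
    draws = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        if abs(xi) < 1e-12:
            draws.append(-sigma * math.log(1.0 - u))
        else:
            draws.append(sigma / xi * ((1.0 - u) ** (-xi) - 1.0))
    return draws


# Illustrative values only: a small sample with a negative shape parameter.
rng = random.Random(0)
xi_true, sigma_true = -0.4, 1.0
y = gpd_sample(xi_true, sigma_true, 30, rng)

ll_true = gpd_loglik(xi_true, sigma_true, y)

# Shrink sigma until the implied end-point -sigma/xi falls below max(y).
sigma_bad = 0.9 * (-xi_true) * max(y)
ll_bad = gpd_loglik(xi_true, sigma_bad, y)

print("log-likelihood at the true parameters:", ll_true)
print("log-likelihood with end-point below max(y):", ll_bad)
```

The second value is minus infinity: part of the parameter space is ruled out by the data themselves, which is exactly the kind of behaviour the usual regularity conditions exclude.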

I've discussed some of these issues in earlier posts, here and here.

When my colleagues, Helen Feng and Ryan Godwin, and I started looking at analytic bias reduction techniques for maximum likelihood estimators that can't be expressed in closed form, we first tackled the case of the GPD. It was well motivated because, although you usually start with a very large sample, the number of extreme data-points that lie above a given threshold, and which form the sample for estimation purposes, is generally very small. So, small-sample bias is a real issue.

Well, we bit off a lot more than we realized at the time, and bias reduction when estimating the GPD's parameters turned into a bit of a nightmare! We published several papers (including ones with Jacob Schwartz) dealing with bias reduction for other distributions, but the GPD problem was always there in the background.

I'm pleased to be able to report that our paper on this problem has now been accepted for publication in Communications in Statistics - Theory & Methods. You can access a pre-print here.

© 2014, David E. Giles