## Monday, April 29, 2019

### Recursions for the Moments of Some Continuous Distributions

This post follows on from my recent one, Recursions for the Moments of Some Discrete Distributions. I'm going to assume that you've read the previous post, so this one will be shorter.

What I'll be discussing here are some useful recursion formulae for computing the moments of a number of continuous distributions that are widely used in econometrics. The coverage won't be exhaustive, by any means. I provide some motivation for looking at formulae such as these in the previous post, so I won't repeat it here.

When we deal with the Normal distribution, below, we'll make explicit use of Stein's Lemma. Several of the other results are derived (behind the scenes) by using a very similar approach. So, let's begin by stating this Lemma.

Stein's Lemma (Stein, 1973):

"If  X ~ N[θ , σ2], and if g(.) is a differentiable function such that E|g'(X)| is finite, then

E[g(X)(X - θ)] = σ2 E[g'(X)]."

It's worth noting that although this lemma relates to a single Normal random variable, in the bivariate Normal case the lemma generalizes to:

"If  X and Y follow a bivariate Normal distribution, and if g(.) is a differentiable function such that E|g'(Y)| is finite, then

Cov.[g(Y), X] = Cov.(X, Y) E[g'(Y)]."

In this latter form, the lemma is useful in asset pricing models.

There are extensions of Stein's Lemma to a broader class of univariate and multivariate distributions. For example, see Alghalith (undated) and Landsman et al. (2013), and the references in those papers. Generally, if a distribution belongs to an exponential family, then recursions for its moments can be obtained quite easily (see, e.g., Cobb et al., 1983).

Now, let's get down to business............

Recall that the rth. "raw moment" (or "moment about zero") for the random variable X, with distribution function F, is defined as μr' = E[X^r] = ∫ x^r dF(x), for r = 1, 2, .....; and the "central (centered) moments" of X are defined as μr = E[(X - μ1')^r], for r = 1, 2, 3, .......

Also, we can express one of these sets of moments in terms of the other set by using the two relationships:

μr = Σ{[r! / (i! (r - i)!)] μi' (- μ1')^(r-i)} ,                                              (1)

where the range of summation is from i = 0 to i = r; and

μr' = Σ{[r! / (i! (r - i)!)] μi (μ1')^(r-i)} ,                                                (2)

where, again, the range of summation is from i = 0 to i = r.
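As a quick illustration of equation (1), here is a short Python sketch (the function name is mine, and the blog's companion script is in R) that converts a list of raw moments into central moments:

```python
from math import comb

def raw_to_central(raw):
    """Convert raw moments [mu_1', ..., mu_k'] to central moments, using
    equation (1): mu_r = sum over i of C(r, i) * mu_i' * (-mu_1')**(r - i)."""
    m = [1.0] + list(raw)                     # m[i] = mu_i', with mu_0' = 1
    mean = m[1]                               # mu_1' is the mean
    return [sum(comb(r, i) * m[i] * (-mean) ** (r - i) for i in range(r + 1))
            for r in range(1, len(raw) + 1)]

# The first four raw moments of N[1, 4] are 1, 5, 13, 73 (see the Normal
# section), so the central moments should be 0, σ² = 4, 0, 3σ⁴ = 48:
print(raw_to_central([1, 5, 13, 73]))         # [0.0, 4.0, 0.0, 48.0]
```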

Normal distribution
If X ~ N[θ, σ²], then we know that μ1' = E[X] = θ, and μ2' = E[X²] = σ² + θ².

Also, note that we can write:

μ3' = E[X³] = E[X²(X - θ + θ)] = E[X²(X - θ)] + θ E[X²] .
Applying Stein's Lemma, with g(X) = X², we obtain:

μ3' = 2σ² E[X] + θ E[X²] = θμ2' + 2σ²μ1' = θ³ + 3θσ².

Similarly, if we apply the Lemma with g(X) = X^k ; k = 3, 4, ......, we obtain:

μ4' = E[X⁴] = E[X³(X - θ)] + θ E[X³] = θμ3' + 3σ²μ2' = θ⁴ + 6θ²σ² + 3σ⁴.

μ5' = E[X⁵] = E[X⁴(X - θ)] + θ E[X⁴] = θμ4' + 4σ²μ3' = θ⁵ + 10θ³σ² + 15θσ⁴;

and so on.

So, each moment is obtained recursively from all of the lower-order moments, through the repeated use of Stein's Lemma.

More particularly, you can see immediately that the following recursion formula holds for the raw moments of the Normal distribution (Bain, 1969; Willink, 2005):

μr+1' = θμr' + rσ²μr-1'  ;    r = 1, 2, 3, ........., with μ0' = 1.
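In Python (a sketch; the blog's own illustrative script, on the code page, is in R, and the function name here is mine), this recursion delivers all of the raw moments in a few lines:

```python
def normal_raw_moments(theta, sigma2, rmax):
    """Raw moments mu_1', ..., mu_rmax' of N[theta, sigma2], via the recursion
    mu_{r+1}' = theta*mu_r' + r*sigma2*mu_{r-1}', starting from mu_0' = 1."""
    m = [1, theta]                           # mu_0' = 1, mu_1' = theta
    for r in range(1, rmax):
        m.append(theta * m[r] + r * sigma2 * m[r - 1])
    return m[1:]

# theta = 1, sigma2 = 4, as in the numerical example later in the post:
print(normal_raw_moments(1, 4, 10))
# [1, 5, 13, 73, 281, 1741, 8485, 57233, 328753, 2389141]
```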

Chi-square distribution

We'll use the following standard result for the Chi-square distribution with "v" degrees of freedom:

"For any real-valued function, h,  E[h(χ2v)] = v E[h(χ2v+2) / χ2v+2], provided that the expectations exist."

(This result can be generalized to the case of a non-central Chi-square distribution. See Appendix B of Judge and Bock (1978).)

Let h(x) = x^k, for some integer, k. Then, applying the above result repeatedly, we get:
• If k = 1, then μ1' = E[χ²_v] = v E[χ²_{v+2} / χ²_{v+2}] = v .
• If k = 2, then μ2' = E[(χ²_v)²] = v E[(χ²_{v+2})² / χ²_{v+2}] = v E[χ²_{v+2}] = v(v + 2).
Immediately, we see that μ2 = Var.[χ²_v] = μ2' - (μ1')² = 2v.
• If k = 3, then μ3' = E[(χ²_v)³] = v E[(χ²_{v+2})³ / χ²_{v+2}] = v E[(χ²_{v+2})²] = v(v + 2)(v + 4) .
I'll bet that you can see right away what the expressions are for μ4', μ5', etc.!

In terms of a genuine recurrence relationship, we see from the above that we can write:

μr' = μr-1' (v + 2(r - 1))   ;   r = 1, 2, 3, ........, with μ0' = 1.

Although we can then use equation (1) to obtain the central moments of X, there's also a separate recursion formula for these moments in the case of the Chi-square distribution. This is discussed in the section on the Gamma distribution, below.
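As a sketch (in Python, with a function name of my choosing), the raw-moment recursion for the Chi-square distribution is just a running product:

```python
def chisq_raw_moments(v, rmax):
    """Raw moments mu_1', ..., mu_rmax' of a Chi-square with v d.o.f.,
    via mu_r' = mu_{r-1}' (v + 2(r - 1)), starting from mu_0' = 1."""
    m = [1]                                   # mu_0' = 1
    for r in range(1, rmax + 1):
        m.append(m[-1] * (v + 2 * (r - 1)))
    return m[1:]

# For v = 5: v = 5, v(v + 2) = 35, v(v + 2)(v + 4) = 315
print(chisq_raw_moments(5, 3))                # [5, 35, 315]
```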

Student-t distribution

Suppose that X follows a Student-t distribution, with v degrees of freedom. Then the moments of X can be summarized as follows:

μr' = 0                                              ; if r is odd, and 0 < r < v

μr' = v^(r/2) Π[(2i - 1) / (v - 2i)]       ; if r is even, and 0 < r < v                                       (3)

where the product is for i = 1 to i = (r / 2).

The inequality, v > r, ensures that the moment "exists" - that is, it is finite.  If v = 1, then X follows a Cauchy distribution, and none of its moments "exist".

Equation (3) gives us a nice, tidy, and direct formula for obtaining any of the even-order moments of X. For instance, we see that:

μ2' = v [1 / (v - 2)] = v / (v - 2)  ;          if v > 2.

Obviously, μ2 = Var.[X] = μ2', because E[X] = 0.

Also,

μ4' = v² [1 / (v - 2)][3 / (v - 4)] = 3v² / [(v - 2)(v - 4)]  ;          if v > 4.
μ6' = v³ [1 / (v - 2)][3 / (v - 4)][5 / (v - 6)] = 15v³ / [(v - 2)(v - 4)(v - 6)]  ;          if v > 6.

etc.

This set of results isn't in the form of a recurrence relationship, but we can obtain one very easily. Note that:

μ4' = [v / (v - 2)][3v / (v - 4)] = μ2' [3v / (v - 4)]                 ;     if v > 4.

μ6' = [v / (v - 2)][3v / (v - 4)][5v / (v - 6)] = μ4' [5v / (v - 6)]         ;     if v > 6,

and so on.

Clearly, in general, the following recurrence formula holds -

μr' = μr-2' [(r - 1)v / (v - r)]      ;    if r is even, and 0 < r < v.

Trust me - this formula is much more appealing than integrating functions of the density, or working with the characteristic function, for this particular distribution.

Finally, note that because E[X] = μ1' = 0, μr = μr', for all (even) r.
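Here's a minimal Python sketch of the Student-t recursion (the function name is mine); moments of order r ≥ v don't exist, so they're returned as None:

```python
def student_t_raw_moments(v, rmax):
    """Raw moments mu_1', ..., mu_rmax' of Student-t with v d.o.f.
    Odd-order moments are 0; even ones use mu_r' = mu_{r-2}' (r - 1)v/(v - r).
    Moments of order r >= v do not exist, and are returned as None."""
    m, even = [], 1.0                  # 'even' runs through mu_0', mu_2', ...
    for r in range(1, rmax + 1):
        if r >= v:
            m.append(None)
        elif r % 2:
            m.append(0.0)
        else:
            even *= (r - 1) * v / (v - r)
            m.append(even)
    return m

# v = 10: mu_2' = v/(v - 2) = 1.25, mu_4' = 3v²/((v - 2)(v - 4)) = 6.25, etc.
print(student_t_raw_moments(10, 6))   # [0.0, 1.25, 0.0, 6.25, 0.0, 78.125]
```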

F distribution

Suppose that X follows an F distribution, with numerator and denominator degrees of freedom, v1 and v2 respectively.

The general formula for the raw moments of X is:

μr' = (v2 / v1)^r Γ[(v1 / 2) + r] Γ[(v2 / 2) - r] / (Γ[v1 / 2] Γ[v2 / 2])   ;   if v2 > 2r.                    (4)

So,

μ1' = (v2 / v1) Γ[(v1 / 2) + 1] Γ[(v2 / 2) - 1] / (Γ[v1 / 2] Γ[v2 / 2]) .

Using the result that Γ[x + 1] = xΓ[x], and cancelling terms, this simplifies to

μ1' = v2 / (v2 - 2)  ;     if  v2  > 2

Similarly,

μ2' = (v2 / v1)² Γ[(v1 / 2) + 2] Γ[(v2 / 2) - 2] / (Γ[v1 / 2] Γ[v2 / 2]) .

Again, noting that Γ[(v1 / 2) + 2] = ((v1 / 2) + 1)Γ[(v1 / 2) + 1]; that Γ[(v2 / 2) - 2] = Γ[(v2 / 2) - 1] / ((v2 / 2) - 2); and then simplifying, we get:

μ2' = μ1' (v2 /v1)[(v1 + 2) / (v2 - 4)]   ;    if  v2  > 4

Proceeding in a similar manner, we find that:

μ3' = μ2' (v2 / v1)[(v1 + 4) / (v2 - 6)]   ;   if  v2 > 6

μ4' = μ3' (v2 / v1)[(v1 + 6) / (v2 - 8)]   ;   if  v2 > 8

etc.

So, in general you can see that we have the following recursion formula for the moments of X:

μr+1' = μr' (v2 / v1)[(v1 + 2r) / (v2 - 2(r + 1))]   ;  r = 0, 1, 2, .............; if v2 > 2(r + 1).

This recurrence relationship saves us from having to deal with the gamma functions in (4), let alone having to perform any integration, or deal with this distribution's messy characteristic function. (As with the Student-t distribution, the moment generating function isn't defined here, because of the above conditions on the existence of the moments.)
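A Python sketch of the F-distribution recursion (the function name is mine), checked against the v1 = 7, v2 = 24 numerical example given later in the post:

```python
def f_raw_moments(v1, v2, rmax):
    """Raw moments mu_1', ..., mu_rmax' of F[v1, v2], via the recursion
    mu_{r+1}' = mu_r' (v2/v1)(v1 + 2r)/(v2 - 2(r + 1)), with mu_0' = 1.
    The (r + 1)-th moment exists only if v2 > 2(r + 1); None otherwise."""
    m, cur = [], 1.0
    for r in range(rmax):                 # builds mu_{r+1}' from mu_r'
        if v2 <= 2 * (r + 1):
            m.append(None)
        else:
            cur *= (v2 / v1) * (v1 + 2 * r) / (v2 - 2 * (r + 1))
            m.append(cur)
    return m

print([round(x, 4) for x in f_raw_moments(7, 24, 4)])
# [1.0909, 1.6831, 3.5265, 9.8239]
```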

The distributions that we've considered so far are ones that you use every day in your econometrics work. Now let's consider a couple more distributions that arise a bit less frequently.

Gamma distribution

One situation where the Gamma distribution comes up in econometrics is in the context of "count data". This might seem a bit odd, because count data are non-negative integers, and we're talking about continuous random variables here.

However, the Gamma distribution comes into play when we generalize the Poisson distribution to a particular form of the Negative Binomial distribution. Both of these distributions were discussed in my previous, related, post. Looking back at that post, you'll be able to see that the variance of a Poisson random variable equals its mean. We say that the distribution is "equi-dispersed". This is very restrictive, and isn't realistic with a lot of count data in practice. On the other hand, the variance of the Negative Binomial distribution exceeds its mean. We say that this distribution is "over-dispersed", and in practice this is often more reasonable.

To construct the Negative Binomial distribution in the form that we usually use it in econometrics, we take a Poisson random variable and then add in an unobserved random effect to its (conditional) mean. If this random effect follows a Gamma distribution, we end up with the Negative Binomial distribution for the count data. (See Greene, 2012, pp.806-807 for details.)

With this by way of motivation, consider a random variable, X, that follows a Gamma distribution with a shape parameter 'a' and a scale parameter 'b'. (Be careful here - there are two forms of the Gamma distribution. The other one has the shape parameter 'a' and a rate parameter, θ = 1 / b.)

Willink (2003) shows that the central moments (moments about the mean) for X can be obtained from the following recursion -

μr = (r - 1)(bμr-1 + ab²μr-2)     ;  r = 2, 3, .........., with μ0 = 1 and μ1 = 0.

There are two special cases of the Gamma distribution that we might note. First, if a = (v / 2), and b = 2, then X follows a Chi-square distribution with v degrees of freedom. So, the central moments of the Chi-square distribution follow the recursion relationship:

μr = 2(r - 1)(μr-1 + vμr-2)        ;  r = 2, 3, ..........

Second, if a = b = 1, then X follows an Exponential distribution, and its central moments satisfy:

μr = (r - 1)(μr-1 + μr-2)      ;   r = 2, 3, .........

In addition, Withers (1992) shows that for the Exponential distribution we have the following simpler recursion:

μr = rμr-1 + (-1)^r      ;  r = 1, 2, 3, ......., with μ0 = 1.
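The Gamma recursion, and both of its special cases, can be cross-checked with a short Python sketch (function name mine):

```python
def gamma_central_moments(a, b, rmax):
    """Central moments mu_2, ..., mu_rmax of a Gamma(shape a, scale b),
    via mu_r = (r - 1)(b mu_{r-1} + a b^2 mu_{r-2}), with mu_0 = 1, mu_1 = 0."""
    m = [1, 0]
    for r in range(2, rmax + 1):
        m.append((r - 1) * (b * m[r - 1] + a * b * b * m[r - 2]))
    return m[2:]

# Chi-square with v = 5 d.o.f. is Gamma(v/2, 2): mu_2 = 2v = 10, mu_3 = 8v = 40
print(gamma_central_moments(2.5, 2, 3))      # [10.0, 40.0]

# Exponential is Gamma(1, 1); Withers' recursion mu_r = r mu_{r-1} + (-1)^r
# gives 1, 2, 9 for mu_2, mu_3, mu_4, matching the general recursion:
print(gamma_central_moments(1, 1, 4))        # [1, 2, 9]
```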

Beta distribution

One of the important features of the Beta distribution is that its density's support is the unit interval (although this can be generalized). It's one of the few distributions used by econometricians that has a finite support.

This suggests that it might be useful when modelling data that are continuous in nature, and are in the form of proportions. Indeed, this is the case. I discuss this in the context of consumer demand analysis in this old post. More generally, the Beta regression model is discussed in detail by Ferrari and Cribari-Neto (2004) and Cribari-Neto and Zeileis (2010).

The density function for X, when it follows a Beta distribution, is:

f(x | α, β) = [Γ(α + β) / (Γ(α)Γ(β))] x^(α-1) (1 - x)^(β-1)  ; 0 < x < 1  ; α , β > 0

The general formula for the rth. raw moment of X is

μr' = E[X^r] = [(α + r - 1)(α + r - 2) ...... α] / [(α + β + r - 1)(α + β + r - 2) ..... (α + β)]  ; r = 1, 2, 3, ......

Immediately, we have:

μ1' = α / (α + β)
μ2' = [α (α + 1)] / [(α + β + 1)(α + β)] = [(α + 1) / (α + β + 1)] μ1'
μ3' = [α (α + 1)(α + 2)] / [(α + β + 2)(α + β + 1)(α + β)] = [(α + 2) / (α + β + 2)] μ2'
etc.

So, a general recursion formula for the raw moments of the Beta distribution is:

μr' = [(α + r - 1) / (α + β + r - 1)] μr-1'   ;   r = 1, 2, 3, ....., with μ0' = 1.
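A final Python sketch (function name mine) for the Beta recursion, checked against the direct formula in the symmetric case α = β = 2:

```python
def beta_raw_moments(alpha, beta, rmax):
    """Raw moments mu_1', ..., mu_rmax' of Beta(alpha, beta), via
    mu_r' = mu_{r-1}' (alpha + r - 1)/(alpha + beta + r - 1), with mu_0' = 1."""
    m, cur = [], 1.0
    for r in range(1, rmax + 1):
        cur *= (alpha + r - 1) / (alpha + beta + r - 1)
        m.append(cur)
    return m

# alpha = beta = 2: mu_1' = 2/4, mu_2' = (1/2)(3/5), mu_3' = (3/10)(4/6)
print([round(x, 10) for x in beta_raw_moments(2, 2, 3)])   # [0.5, 0.3, 0.2]
```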

Numerical examples

Remember, the whole point about these recursion formulae is that they help us to rapidly compute all of the moments of a distribution, up to some pre-specified maximum order, using one general formula.

There is an R script file on the code page for this blog that illustrates this point, first for X ~ N[θ, σ²]; and second for X ~ F[v1, v2]. In the first case, the first ten raw moments for X when θ = 1 and σ² = 4 are:    1,  5,  13,  73,  281,  1741,  8485,  57233,  328753,  2389141,  ....

In the second case, the first ten raw moments for X when v1 = 7 and v2 = 24 are:  1.09,  1.68,  3.5265,  9.82,  36.09,  175.28,  1141.85,  10276.63,  135064.30,  2894235.00, ....

References

Alghalith, M., undated. A note on generalizing and extending Stein's lemma.

Bain, L. J., 1969. Moments of noncentral t and noncentral F distributions. American Statistician, 23, 33-34.

Cobb, L., P. Koppstein, & N. H. Chen, 1983. Estimation and moment recursion relations for multimodal distributions of the exponential family. Journal of the American Statistical Association, 78, 124-130.

Cribari-Neto F.  & A. Zeileis, 2010. Beta regression in R. Journal of Statistical Software, 1–24.

Ferrari, S. L. P. & F. Cribari-Neto, 2004. Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31, 799–815.

Greene, W. H., 2012. Econometric Analysis, 7th ed. Prentice Hall, Upper Saddle River, NJ.

Judge, G. G. & M. E. Bock, 1978. The Statistical Implications of Pre-Test and Stein-Rule Estimators in Econometrics. North-Holland, New York.

Landsman, Z., S. Vanduffel, & J. Yao, 2013. A note on Stein's lemma for multivariate elliptical distributions. Journal of Statistical Planning and Inference, 143, 2016-2022.

Stein, C. M., 1973. Estimation of the mean of a multivariate normal distribution. Proceedings of the Prague Symposium on Asymptotic Statistics, 345-381.

Willink, R., 2003. Relationships between central moments and cumulants, with formulae for the central moments of gamma distributions. Communications in Statistics - Theory and Methods, 32, 701-704.

Willink, R., 2005. Normal moments and Hermite polynomials. Statistics and Probability Letters, 73, 271-275.

Withers, S. W., 1992. A recurrence relation for the moments of the exponential. New Zealand Statistician, 27, 13-14.

Woodland, A. D., 1979. Stochastic specification and the estimation of share equations. Journal of Econometrics, 10, 361-383.
© 2019, David E. Giles