Econometrics Beat: Dave Giles' Blog: Let's be Consistent

Thursday, October 18, 2012

Let's be Consistent

One of the standard, large-sample, properties that we hope our estimators will possess is "consistency". Indeed, most of us take the position that if an estimator isn't consistent, then we should probably throw it away and look for one that is!

When you're talking about the consistency of an estimator, it's a really good idea to be quite clear regarding the precise type of consistency you have in mind - especially if you're talking to a statistician! For example, there's "weak consistency", "strong consistency", "mean square consistency", and "Fisher consistency", at least some of which you'll undoubtedly encounter from time to time as an econometrician.

When we first meet the concept of a “consistent” estimator, as students, we usually learn about what is actually called “mean square consistency”. This notion is usually described in the following manner, where for simplicity θ* is an estimator of a scalar-valued parameter, θ, based on a sample of size n:

If (i) Bias(θ*) → 0 as n → ∞ ;

and (ii) var.(θ*) → 0 as n →∞ ,

then θ* is a “mean square consistent” estimator of θ.

Note that, because mean squared error equals variance plus squared bias, the two conditions above imply that the mean squared error of the estimator converges to zero as n grows without limit; hence the terminology. If the parameter and its estimator are vectors, then we simply replace the “variance” with the “covariance matrix” in condition (ii).

Basically, what is happening under mean square consistency is that the density associated with the estimator's sampling distribution is collapsing to a degenerate "spike", located exactly at the true value of θ, when the sample size grows without limit.
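To make the "collapsing spike" concrete, here is a minimal simulation sketch. It assumes a Normal population with mean 2 and variance 1 and uses the sample average as the estimator; the function name `simulated_mse` and all parameter values are illustrative choices, not anything from the post itself.

```python
import random

def simulated_mse(n, mu=2.0, sigma=1.0, reps=2000, seed=0):
    """Monte Carlo estimate of E[(xbar - mu)^2] for samples of size n.

    Assumption: the population is Normal(mu, sigma^2) and the estimator
    is the sample average, so the theoretical MSE is sigma^2 / n.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        total += (xbar - mu) ** 2
    return total / reps

if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Second column: simulated MSE; third column: sigma^2 / n.
        print(n, simulated_mse(n), 1.0 / n)
```

The simulated mean squared error should track σ²/n and shrink toward zero as n grows, which is the spike forming around the true value.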

This type of consistency is actually rather a strong property. Specifically, it requires that E(θ*) and E(θ*²) are both defined, for all values of n. These expectations are only defined if the underlying integrals converge. (Recall that for any continuous random variable, X, with density p(x), E(X) = ∫xp(x)dx.) These integrals actually diverge for many distributions. For example, in the case of the Student-t distribution with v degrees of freedom, the mean is defined only if v > 1 and the second moment only if v > 2, so both expectations are defined only if v > 2.
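As a rough numerical illustration of a divergent integral, the sketch below takes the Student-t distribution with 2 degrees of freedom, whose density is (2 + x²)^(-3/2), and integrates x²p(x) over [-T, T] with a simple midpoint rule. The function names and the number of steps are hypothetical choices made for this example.

```python
import math

def t2_density(x):
    """Density of the Student-t distribution with 2 degrees of freedom."""
    return (2.0 + x * x) ** -1.5

def truncated_second_moment(T, steps=200_000):
    """Midpoint-rule approximation of the integral of x^2 p(x) over [-T, T]."""
    h = 2.0 * T / steps
    total = 0.0
    for i in range(steps):
        x = -T + (i + 0.5) * h   # midpoint of the i-th sub-interval
        total += x * x * t2_density(x)
    return total * h

if __name__ == "__main__":
    for T in (10.0, 100.0, 1000.0):
        print(T, truncated_second_moment(T))
```

The truncated integral keeps increasing as T is widened (it grows roughly like 2 ln T) instead of settling down, which is what it means for E(X²) to be undefined when v = 2.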

So, mean square consistency is a nice property, but it may not be possible to even talk about its existence in some cases.

However, a weaker form of consistency, based on the notion of the “probability limit”, or “plim”, can be used even when we can’t consider mean square consistency. We say that our estimator "converges in probability" to the true parameter value, or is “weakly consistent”, if plim(θ*) = θ. That is, if lim_(n→∞)Pr.[|θ* - θ| < ε] = 1 for every positive number ε, however small.
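The probability in this definition can be estimated by simulation. The sketch below is only an illustration under assumed settings: a Normal population with mean 2 and variance 1, the sample average as the estimator, and ε = 0.1. The name `prob_within` is hypothetical.

```python
import random

def prob_within(n, eps=0.1, mu=2.0, sigma=1.0, reps=2000, seed=1):
    """Monte Carlo estimate of Pr[|xbar - mu| < eps] for samples of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(xbar - mu) < eps:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, prob_within(n))
```

For a fixed ε the estimated probability should climb toward one as n increases; choosing a smaller ε just means a larger n is needed before that happens.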

You may have been told that if an estimator is mean square consistent, then it must also be weakly consistent, but that the converse is not necessarily true. Let’s see what is involved in establishing this result. Essentially, we just exploit a well-known theorem from probability theory, known as Chebyshev’s Inequality.

(Sometimes you’ll see this Russian mathematician’s name spelled differently. That’s because there is often more than one acceptable transliteration from the Cyrillic alphabet to ours.)

Anyway, here’s the inequality in question:

Let X be a random variable and let g(.) be a non-negative real-valued function.

Then, Pr.[g(X) ≥ k] ≤ E[g(X)] / k, for all k > 0.
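A quick empirical check of this inequality, as a sketch only: it assumes X is standard Normal and g(X) = X², so that E[g(X)] = 1. The function name `check_inequality` and the values of k are illustrative.

```python
import random

def check_inequality(k, reps=100_000, seed=2):
    """Return (empirical Pr[g(X) >= k], empirical E[g(X)] / k) with g(X) = X^2.

    Assumption: X is drawn from a standard Normal distribution.
    """
    rng = random.Random(seed)
    g_values = [rng.gauss(0.0, 1.0) ** 2 for _ in range(reps)]
    lhs = sum(g >= k for g in g_values) / reps   # relative frequency of g >= k
    rhs = (sum(g_values) / reps) / k             # sample mean of g, divided by k
    return lhs, rhs

if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0, 4.0):
        lhs, rhs = check_inequality(k)
        print(k, lhs, rhs, lhs <= rhs)
```

The left-hand side never exceeds the right-hand side, although the bound can be very loose (for small k it is larger than one, and so says nothing).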

Now, consider the probability that we use when defining weak consistency, namely Pr.[|θ* - θ| < ε].

Note that,

Pr.[|θ* - θ| < ε] = Pr.[(θ* - θ)² < ε² ] = 1 - Pr.[(θ* - θ)² ≥ ε²] . (1)

Also,

Pr.[(θ* -θ)² ≥ ε²] ≤ E(θ* -θ)² / ε² ,

by Chebyshev’s inequality.

(Here, g(θ*) is (θ* - θ)², which is clearly non-negative, as required, and k = ε².)

Now, if the estimator is mean square consistent, this means that E(θ* - θ)² → 0 as n → ∞.

So, if the estimator is mean square consistent, the Chebyshev bound forces Pr.[(θ* - θ)² ≥ ε²] → 0, and then from (1),

lim_(n→∞)Pr.[|θ* - θ| < ε] = (1 - 0) = 1.

That is, the estimator is also weakly consistent.

So, we have our result: mean square consistency implies weak consistency.
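For reference, the whole argument can be written as a single chain of inequalities. This is just a restatement of the steps above in LaTeX, using the same symbols; it holds for every fixed ε > 0.

```latex
\[
1 \;\ge\; \Pr\bigl[\,|\theta^* - \theta| < \varepsilon\,\bigr]
  \;=\; 1 - \Pr\bigl[(\theta^* - \theta)^2 \ge \varepsilon^2\bigr]
  \;\ge\; 1 - \frac{E(\theta^* - \theta)^2}{\varepsilon^2}
  \;\longrightarrow\; 1
  \quad \text{as } n \to \infty ,
\]
\[
\text{where}\quad
E(\theta^* - \theta)^2
  \;=\; \operatorname{var}(\theta^*) + \bigl[\operatorname{Bias}(\theta^*)\bigr]^2 .
\]
```

The probability is squeezed between one and a lower bound that tends to one, so its limit must be one.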

On the other hand, weak consistency does not imply mean square consistency. One counter-example will suffice to show this.

If we have a simple random sample of n observations, from a population with a finite mean and variance, then the sample average, x* = (Σ_i x_i) / n, is both a mean square consistent and a weakly consistent estimator of the population mean, μ. Consider using (1 / x*) as an estimator of (1 / μ).

This estimator is weakly consistent (provided μ ≠ 0), by Slutsky's Theorem. However, if the population is Normal, then neither E[1 / x*] nor E[(1 / x*)²] exists. This implies that this estimator cannot be mean square consistent.
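The weak-consistency half of this counter-example can be illustrated by simulation. The sketch below assumes a Normal population with μ = 2 and σ = 1 and a tolerance ε = 0.05; the name `prob_reciprocal_within` is hypothetical. A simulation cannot demonstrate that the moments fail to exist, since any finite set of draws has a finite average, so this only shows the convergence in probability.

```python
import random

def prob_reciprocal_within(n, eps=0.05, mu=2.0, sigma=1.0, reps=2000, seed=3):
    """Monte Carlo estimate of Pr[|1/xbar - 1/mu| < eps] for samples of size n.

    Assumption: Normal(mu, sigma^2) population with mu != 0. An exactly zero
    sample average has probability zero, so 1/xbar is computed without a guard.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(1.0 / xbar - 1.0 / mu) < eps:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, prob_reciprocal_within(n))
```

The estimated probability should approach one as n grows, even though the mean and variance of 1 / x* are undefined at every sample size.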

Finally, here's a simple regression example to illustrate some of the above points.

Suppose we have a simple regression model, where the only regressor is a time-trend variable. That is, x = 1, 2, 3, ..., n. So, the model is:

y_t = β₁ + β₂x_t + ε_t ; ε_t ~ [0, σ²]

and the errors are serially independent. Given the zero mean for the errors, and the non-random regressor, the OLS estimators (b₁ and b₂) of β₁ and β₂ are unbiased. You should be able to show that

var.(b₁) = 2σ²[(n + 1)(2n + 1)] / [n(n - 1)(n + 1)] = 2σ²(2n + 1) / [n(n - 1)]
and
var.(b₂) = 12σ² / [n(n - 1)(n + 1)].

Clearly, both of these variances go to zero as n grows without limit, and both estimators are unbiased. Hence, they are both mean square consistent estimators. It follows that they are both also weakly consistent estimators.
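If you would rather check the two variance expressions than derive them, the sketch below computes the diagonal of σ²(X'X)⁻¹ exactly, using rational arithmetic, for x = 1, 2, ..., n, and compares it with the closed forms above. It assumes σ² = 1 by default and n ≥ 2; the function names are illustrative.

```python
from fractions import Fraction

def ols_variances(n, sigma2=Fraction(1)):
    """Exact var(b1), var(b2) from sigma^2 (X'X)^{-1} with x_t = 1, ..., n."""
    xs = range(1, n + 1)
    sx = sum(xs)                      # sum of x_t
    sxx = sum(x * x for x in xs)      # sum of x_t squared
    det = n * sxx - sx * sx           # determinant of X'X = [[n, sx], [sx, sxx]]
    var_b1 = sigma2 * Fraction(sxx, det)
    var_b2 = sigma2 * Fraction(n, det)
    return var_b1, var_b2

def closed_form(n, sigma2=Fraction(1)):
    """The closed-form variances quoted in the text."""
    denom = n * (n - 1) * (n + 1)
    var_b1 = 2 * sigma2 * Fraction((n + 1) * (2 * n + 1), denom)
    var_b2 = 12 * sigma2 * Fraction(1, denom)
    return var_b1, var_b2

if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        assert ols_variances(n) == closed_form(n)
        v1, v2 = closed_form(n)
        print(n, float(v1), float(v2))
```

Note how much faster var.(b₂) shrinks: it is of order 1/n³, whereas var.(b₁) is of order 1/n, because the trend regressor becomes more and more spread out as n grows.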

11 comments:

Srk, October 18, 2012 at 7:34 PM
Thanks for your grateful informations, this blogs will be really help for students blogs.

Mark Schaffer, October 21, 2012 at 12:08 PM
Great post, Dave. The textbooks usually (always?) start with plims and then build up to stronger forms of consistency, but you've convinced me that starting with convergence in mean square is the right way to teach it. I have to lecture on this in a couple of weeks, so I will give it a try straight away (and will cite you, of course!).

--Mark

Mark Schaffer, October 31, 2012 at 3:55 PM
Hi Dave. The lecture went pretty well, actually. I think your basic idea was absolutely right - instead of doing what most (all?) of the textbooks do and start with the weakest notion of consistency and then work up through stronger versions, start instead with a strong notion that's easier for students to get their heads around and then work down through weaker notions.

At least, I *think* the lecture went pretty well. I've directed my students to your blog and to this entry in particular, so maybe some of them will want to comment on how this method worked from their perspective.

--Mark

Unknown, April 3, 2014 at 4:02 PM
Many thanks for this exposition. Though I'm studying it lately, I believe it will help me in subsequent work and research. I'm also a student of Mark Schaffer.

Anonymous, October 12, 2014 at 4:42 AM
Thanks for the post!