
Sunday, May 22, 2016

A Quick Illustration of Pre-Testing Bias

The statistical and econometric literature on the properties of "preliminary-test" (or "pre-test") estimation strategies is large and well established. These strategies arise when we proceed in a sequential manner when drawing inferences about parameters. 

A simple example would be where we fit a regression model; test if a regressor is significant or not; and then either retain the model, or else remove the (insignificant) regressor and re-estimate the (simplified) model.

The theoretical literature associated with pre-testing is pretty complex. However, some of the basic messages arising from that literature can be illustrated quite simply. Let's look at the effect of "pre-testing" on the bias of the OLS regression estimator.

Consider the following bivariate regression model:

                   yi = β1x1i + β2x2i + ui    ,

where x1 and x2 are non-random; and ui ~ i.i.d. N[0 , σ2] for all i.

Suppose that we adopt the following strategy:

1.  Test to see if x2 is a statistically significant regressor.
2.  If our test suggests that it is significant, then retain both x1 and x2 in the model.
3.  If our test suggests that it is not, then drop x2 from the regression, and re-estimate the model keeping just x1 as the sole regressor.

Let's focus on the estimator of β1 that's actually associated with this strategy. If we stop at step 2, denote the (unrestricted) OLS estimator (MLE) as b1U. If we stop at step 3, denote the (restricted) OLS estimator (MLE) as b1R.

The null hypothesis that we'll be testing is H0: β2 = 0, and the alternative hypothesis is HA: β2 > 0 (although nothing of substance changes if we adopt a two-sided HA).

Given our assumptions about the error term in the model, the obvious (and UMP) test will be a t-test. Denote the associated test statistic by "t2", and let tc(α) be the associated critical value when the chosen significance level is α.
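To be explicit about the notation, t2 = b2U / s.e.(b2U), where b2U is the unrestricted OLS estimator of β2. Under H0, and given our assumptions about the errors, t2 follows a Student-t distribution with (n - 2) degrees of freedom, so tc(α) is simply the upper-α quantile of that distribution.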

So, the"pre-test" estimator of β1 is of the following form:

β1* = b1U   ;  if  t2 > tc(α)

β1* = b1R   ;  if  t2 ≤ tc(α)

One final bit of notation will be particularly helpful in exposing the result that I want to illustrate.

Let I[A](x) be an "indicator function" that takes the value "1" if the random variable, x, lies in the interval A, and is zero otherwise. Let A' be the "complement" to the interval A. So, if x does not lie in A, it lies in A', and vice versa.

Note that I[A'](x) = 1 - I[A](x); and I[A](x)I[A'](x) = 0 , for any A and x.

Also, I[A](x) is a binary random variable, as it is a function of x.
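It's also worth recalling that E{I[A](x)} = Pr.[x ∈ A]. If this indicator were independent of whatever it multiplies, then the expectation of such a product would factor neatly into a non-rejection probability times an ordinary expectation. As we'll see shortly, that independence fails in our problem, and this is precisely what makes the bias of the pre-test estimator messy.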

Letting x = t2, we can write our pre-test estimator of β1 as:

β1* = I(tc(α) , ∞)(t2) b1U + I[0 , tc(α)](t2) b1R

      = {1 - I[0 , tc(α)](t2)} b1U + I[0 , tc(α)](t2) b1R

      = b1U + I[0 , tc(α)](t2) (b1R - b1U) .

Let's think about the biases of b1U and b1R, under the assumptions associated with our model.
  • b1U is always unbiased, whether H0 is true or false. (Over-fitting the model will not introduce a bias in the OLS estimator.)
  • b1R is unbiased if H0 is true, but it is biased if H0 is false. (Omitting a relevant regressor biases the OLS estimator; the familiar expression for that bias, in this model, is given just below.)
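Specifically, for our two-regressor model (with no intercept), b1R = (x1'y) / (x1'x1), and so

                   E[b1R] = β1 + β2 (x1'x2) / (x1'x1)   ,

where x1'x2 = Σi x1i x2i, etc. This bias term vanishes only if β2 = 0, or if x1 and x2 happen to be orthogonal in the sample.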
With this in mind, consider the expected value of β1*:

E[β1*] = E[b1U] + E{I[0 , tc(α)](t2) (b1R - b1U)}

          = β1 + E{I[0 , tc(α)](t2) (b1R - b1U)}.

The second term on the right of the last equation is typically non-zero, and it's going to be messy. That's because t2 (and hence the indicator function) is not independent of (b1R - b1U).

This second term is the bias in the (pre-test) estimator, β1*.

So, even without going to all of the trouble to evaluate the bias exactly - and it actually is quite a lot of trouble, even for this very simple model - we can see that pre-testing in the context of OLS estimation will typically introduce a bias.

Is there any circumstance in which this bias will be zero?

One sufficient condition is that b1R = b1U, and this in turn will hold if the OLS estimator of β2 in the original two-regressor model happens to yield a point estimate of exactly zero. Putting this extreme case to one side, the pre-test estimator of β1 will be biased.
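To make all of this concrete, here is a minimal Monte Carlo sketch of the pre-test estimator's bias. It is purely illustrative: the sample size, parameter values, regressor design, and significance level below are assumptions made for the example, not anything implied by the analysis above.

```python
# A minimal Monte Carlo sketch of pre-test bias; all settings are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(123)

n = 25                      # sample size (assumed)
beta1, beta2 = 1.0, 0.5     # true coefficients (assumed); here H0: beta2 = 0 is false
sigma = 1.0                 # error standard deviation (assumed)
alpha = 0.05                # significance level for the pre-test
n_reps = 50_000             # Monte Carlo replications

# Fixed (non-random) regressors, deliberately correlated so that
# omitting x2 biases the restricted estimator b1R.
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(scale=0.5, size=n)
X = np.column_stack([x1, x2])

t_crit = stats.t.ppf(1 - alpha, df=n - 2)   # one-sided critical value, HA: beta2 > 0

b1_star = np.empty(n_reps)
for r in range(n_reps):
    u = rng.normal(scale=sigma, size=n)
    y = beta1 * x1 + beta2 * x2 + u

    # Unrestricted OLS (both regressors, no intercept)
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    resid = y - X @ b
    s2 = (resid @ resid) / (n - 2)
    t2 = b[1] / np.sqrt(s2 * XtX_inv[1, 1])

    # The pre-test strategy: keep b1U if t2 is significant, otherwise use b1R
    if t2 > t_crit:
        b1_star[r] = b[0]                    # retain x2: b1U
    else:
        b1_star[r] = (x1 @ y) / (x1 @ x1)    # drop x2: b1R

print("True beta1:     ", beta1)
print("Mean of beta1*: ", b1_star.mean())
print("Estimated bias: ", b1_star.mean() - beta1)
```

The regressors are generated once and then held fixed across replications, to match the non-random-regressor assumption in the model above; only the errors are redrawn each time.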

The basic result demonstrated here extends fully to the case where we have a multiple regression model, and instead of testing a single "zero" restriction we test the validity of a set of independent linear restrictions on the coefficient vector, using an F-test.

Further, although pre-testing generally biases our regression estimator, it also has an impact on the latter's efficiency; that is, it affects the estimator's MSE. This suggests the following question: "Is pre-testing necessarily a 'bad' strategy?"

I'll take this up in more detail in a subsequent post, but in the meantime you might check out my earlier related posts (here and here), and the survey material on pre-testing in Giles and Giles (1993).


Reference

Giles, J. A. and D. E. A. Giles, 1993, “Pre-test estimation and testing in econometrics: Recent developments”, Journal of Economic Surveys, 7, 145-197.


© 2016, David E. Giles

2 comments:

  1. From a superficial look at the links, I gather you have found that pre-testing can be justified by certain loss functions (e.g., mean squared error) under certain circumstances.

    If I understand correctly, this analysis treats the likelihoods as probability distributions over the parameters, which makes sense only if they are seen as, in particular, the posterior distributions proceeding from an uninformative prior.

    Question: is there any case for pre-testing now that doing full Bayesian computation is easy (as of course it was not when you wrote the survey paper)? It seems to me if you can specify a loss function, you might as well report your whole posterior together with expected loss as a function of the parameters. Then the loss-minimizing point-estimate is evident, together with much else; the purpose is clear; and alternative loss functions can be considered without muddling up the statistical starting point.

    1. Michael - thanks for the comments. First, regarding Bayesian inference, I totally agree - my PhD in '75 was in Bayesian econometrics (and see various posts on this blog).

      The point about the pre-testing literature (which goes back to Ted Bancroft's work in the 1940's) is the following:

      1. It is dealing only with non-Bayesian inference, where the sampling distribution is a core concept. Hence the emphasis is on issues such as bias, MSE, etc.

      2. It does not condone pre-testing, which is only very rarely an optimal strategy, and is always inadmissible.

      3. Rather, the emphasis is on pointing out the (usually adverse) consequences of pre-testing.

      4. (Non-Bayesian) applied researchers invariably pre-test, often in complex ways. Then they interpret their results as if this pre-testing had not taken place. That's where the problem lies. Their final estimators and tests don't have the properties that they ascribe to them.

      The literature on pre-test issues continues to grow, which I believe is a good thing. A lot of this literature can be quite technical, so some simple illustrations can be helpful. Simulation exercises usually convince students about the associated issues pretty quickly too.

