Thursday, October 10, 2013

Beyond MSE - "Optimal" Linear Regression Estimation

In a recent post I discussed the fact that there is no linear minimum-MSE estimator for the coefficients of a linear regression model. Specifically, if you try to find one, you end up with an "estimator" that is non-operational, because it is itself a function of the unknown parameters of the model. It's not really an estimator at all, because it can't be computed.

However, by changing the objective of the exercise slightly, a computable "optimal estimator" can be obtained. Let's take a look at this.

We'll consider the same model as in the earlier post. Although it's the simple linear regression model without an intercept, the basic result generalizes to the usual multiple linear regression model. So, our model, with a non-random regressor, is:

                   y_i = βx_i + ε_i    ;     ε_i ~ i.i.d. [0, σ²]       ;       i = 1, 2, ..., n.

Let β* be any linear estimator of β, so that we can write β* = Σ(a_i y_i), where the a_i's are non-random weights, and all summations (here and below) are taken over i (or, later, j) = 1 to n.

So, E[β*] = βΣ(a_i x_i), and

                Bias[β*] = β[Σ(a_i x_i) - 1].                                                              (1)

Similarly,

                var.[β*] = Σ[a_i² var.(y_i)] = σ²Σ(a_i²).                                               (2)
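As a quick numerical check of (1) and (2) (a sketch of my own, not from the post; the regressor values and the weights below are arbitrary illustrative choices, not the OLS weights), we can simulate the model many times and compare the empirical bias and variance of a linear estimator with the formulas:

```python
import numpy as np

rng = np.random.default_rng(42)
n, beta, sigma = 10, 2.0, 1.0
x = np.linspace(0.5, 5.0, n)        # fixed, non-random regressor
a = x / (1.0 + x @ x)               # arbitrary non-random weights (not OLS)

# Theoretical bias and variance from (1) and (2)
theo_bias = beta * (a @ x - 1.0)
theo_var = sigma**2 * (a @ a)

# Monte Carlo: many draws of y, apply the linear estimator b* = sum_i a_i y_i
reps = 200_000
eps = rng.normal(0.0, sigma, size=(reps, n))
y = beta * x + eps
b_star = y @ a

print(b_star.mean() - beta, theo_bias)   # empirical vs. theoretical bias
print(b_star.var(), theo_var)            # empirical vs. theoretical variance
```

With a couple of hundred thousand replications, the simulated bias and variance agree with (1) and (2) to within Monte Carlo noise.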

Now, instead of trying to find the a_i weights (and hence the β*) that minimize MSE, let's try to find the a_i's that lead to a β* that minimizes the quantity

               Q = α[var.(β*) / σ²] + (1 - α)[Bias(β*) / β]²    ,                          (3)

where α is any number satisfying 0 < α < 1. The quantity we're going to minimize is a weighted sum of the relative variance and the squared relative bias of our linear estimator.

Notice that if we choose α = 0, then β* is just the OLS estimator; and if we choose α = 1, then β* = 0 (which is not exactly an interesting "estimator").

From (3), we get:

              (∂Q / ∂a_j) = 2αa_j + 2(1 - α)x_j[Σ(a_i x_i) - 1]  .                                      (4)

Setting all of the equations in (4) to zero, multiplying by y_j, and summing over j, we get:

               αβ* + (1 - α)Σ(x_j y_j)[Σ(a_i x_i) - 1] = 0  .                                            (5)

Similarly, setting all of the equations in (4) equal to zero, multiplying by x_j, and summing over all j, we get:

              αΣ(a_j x_j) + (1 - α)Σ(x_j²)[Σ(a_i x_i) - 1] = 0   .                                      (6)

From (6),

             Σ(a_j x_j) = Σ(a_i x_i) = (1 - α)Σ(x_j²) / [α + (1 - α)Σ(x_j²)]  .                  (7)

Using (7) in (5), and solving for β*, we get:

             β* = b[(1 - α)Σ(x_j²)] / [α + (1 - α)Σ(x_j²)]  ,                                 (8)

where b is the OLS estimator of β.

The estimator in (8) can be computed for any chosen value of α; and for 0 < α < 1, β* is a shrinkage estimator - it shrinks the OLS estimator towards the origin.
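To see the result in action (again a sketch of my own; the sample values are illustrative), we can compute β* from the closed form (8) and cross-check it against a direct solution of the first-order conditions (4), which form the linear system (αI + (1 - α)xx')a = (1 - α)x in the weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, sigma = 20, 2.0, 1.0
x = rng.uniform(0.5, 3.0, n)            # regressor, treated as fixed once drawn
y = beta * x + rng.normal(0.0, sigma, n)

b_ols = (x @ y) / (x @ x)               # the OLS estimator, b

for alpha in (0.1, 0.5, 0.9):
    # Closed form (8): b* = b(1 - alpha)S / [alpha + (1 - alpha)S], S = sum(x_j^2)
    shrink = (1 - alpha) * (x @ x) / (alpha + (1 - alpha) * (x @ x))
    b_star = b_ols * shrink

    # Cross-check: solve the first-order conditions (4) directly for the weights a,
    # i.e. (alpha*I + (1 - alpha) x x') a = (1 - alpha) x, then form b* = sum(a_j y_j)
    A = alpha * np.eye(n) + (1 - alpha) * np.outer(x, x)
    a = np.linalg.solve(A, (1 - alpha) * x)
    assert np.isclose(a @ y, b_star)

    print(alpha, b_star)                # |b*| < |b|: shrunk towards the origin
```

The shrinkage factor lies strictly between 0 and 1 for 0 < α < 1, and approaches 1 as α → 0 (OLS) and 0 as α → 1.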

© 2013, David E. Giles


  1. that's interesting dave. if you had included an intercept, would it still act as a shrinkage estimator ?

    1. Mark - thanks. Yes - just take deviations about sample means for x and y. The same result will then apply for the estimation of the slope coefficient (but not for the estimation of the intercept).

  2. Dave,

    Thanks for a really interesting couple of posts - good stuff!

    Can we say something concrete about how the two estimators are related? If I'm not mistaken, the feasible estimator #2 is identical to the infeasible estimator #1 if you set alpha = sigma^2 / (sigma^2 + beta^2).

    Of course we don't know sigma^2 or beta^2, but then alpha has to come from somewhere. Can we use the OLS estimates, set alpha = s^2 / (s^2 + b^2), and say our estimator (a) minimizes the weighted sum of the relative variance and the squared relative bias, and (b) uses an estimate of the "optimal" weight (where "optimal" means we would like to minimize the MSE if we could). Kind of hand-wavey, I know.

    You suggested this option at the end of your first post, and I wonder if there's anything more concrete that can be said about it. (I suspect not - otherwise you probably would have said it! - but hope springs eternal.)


    1. Mark - thanks. If you could set alpha equal to sigma^2 /(sigma^2 + beta^2), this will indeed be the choice that actually minimizes MSE. Replacing sigma^2 with s^2 and beta^2 with b^2 will alter the MSE (as before). However, at least these estimators are consistent, so in a large enough sample everything would be fine.

    2. And alphahat = s^2 / (s^2 + b^2) is consistent for alpha. But consistency is kind of weak tea in this context, since the MSE is going to zero in the limit anyway.

      Is there any Monte Carlo (or other) evidence about these estimators? I'm tempted to set some students loose on this as a project, but not if there's a paper or book out there with all the answers!

      And thanks again for the blogging - good stuff indeed.


    3. Mark - absolutely right. I'm not aware of any studies - let them loose!