
Wednesday, October 2, 2013

In What Sense is the "Adjusted" R-Squared Unbiased?

In a post yesterday, I showed that the usual coefficient of determination (R²) is an upward-biased estimator of the "population R²", in the following sense. If there is really no linear relationship between y and the (non-constant) regressors in a linear multiple regression model, then E[R²] > 0. However, both E[R²] and Var.[R²] → 0 as n → ∞, so R² is a consistent estimator of the (zero-valued) population R².
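Here is a minimal Monte Carlo sketch of that result (Python with numpy only; the values of k, the sample sizes, and the number of replications are illustrative choices of mine, not from the post). With y generated independently of X, the population R² is zero, yet the mean of the sample R² is positive, and both its mean and variance shrink as n grows:

```python
# Monte Carlo sketch: y is generated independently of X, so the population
# R-squared is zero, but the sample R-squared has a positive mean that
# shrinks (along with its variance) as n grows.
import numpy as np

rng = np.random.default_rng(42)
k, n_reps = 4, 20_000          # k regressors, including the intercept

for n in (20, 50, 200, 1000):
    # Fixed (non-random) X for each n, with an intercept as the first column.
    X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
    r2 = np.empty(n_reps)
    for i in range(n_reps):
        y = rng.standard_normal(n)              # no relationship with X
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ b) ** 2)
        tss = np.sum((y - y.mean()) ** 2)
        r2[i] = 1.0 - rss / tss
    print(f"n = {n:4d}: mean R2 = {r2.mean():.4f} "
          f"(theory {(k - 1) / (n - 1):.4f}), var = {r2.var():.6f}")
```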

At the end of that post I posed the following questions:
"You might ask yourself, what emerges if we go through a similar analysis using the "adjusted" coefficient of determination? Is the "adjusted R2" more or less biased than R2 itself, when there is actually no linear relationship between y and the columns of X?"
Here's the answer.......

We have the following linear multiple regression model:

                    y = Xβ + ε      ;    ε ~ N[0, σ²In]                                                                    (1)

where X is non-random and of full rank, k, and includes an intercept variable as its first column.

Consider the null hypothesis,  H0:  "β2 = β3 = ... = βk = 0"    vs.    HA:  "Not H0", and let F be the F-statistic for testing H0. In yesterday's post, we noted that we can write

                    R² = [(k - 1)F] / [(n - k) + (k - 1)F] ,

where R² is the usual coefficient of determination.
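For completeness, this expression is just the usual F-statistic rearranged; no new assumptions are involved:

                    F = [R² / (k - 1)] / [(1 - R²) / (n - k)] .

Cross-multiplying, (k - 1)F(1 - R²) = (n - k)R², so R²[(n - k) + (k - 1)F] = (k - 1)F, and solving for R² gives the expression above.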

Now, recall that the "adjusted" R² can be written as:

                  R*² = R² - (1 - R²)[(k - 1) / (n - k)] = R²(n - 1) / (n - k) - [(k - 1) / (n - k)].       (2)
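Both forms in (2) agree with the more familiar textbook definition, R*² = 1 - (1 - R²)(n - 1)/(n - k). Putting that definition over the common denominator (n - k):

                  R*² = [(n - k) - (1 - R²)(n - 1)] / (n - k) = [R²(n - 1) - (k - 1)] / (n - k) ,

which is the second expression in (2); expanding the first expression in (2) over (n - k) yields the same ratio.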

From the previous post, if H0 is true, then:

        E[R²] = [(k - 1) / (n - 1)]     and     Var.[R²] = 2(k - 1)(n - k) / [(n - 1)²(n + 1)] .
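These moments follow from the standard result that, under H0, F is distributed as F with (k - 1) and (n - k) degrees of freedom, so that R² = (k - 1)F / [(n - k) + (k - 1)F] has a Beta[(k - 1)/2, (n - k)/2] distribution. Writing a = (k - 1)/2 and b = (n - k)/2, the usual Beta moments give:

        E[R²] = a / (a + b) = (k - 1) / (n - 1)

        Var.[R²] = ab / [(a + b)²(a + b + 1)] = 2(k - 1)(n - k) / [(n - 1)²(n + 1)] .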

Immediately, it follows from (2) that E[R*²] = 0, and Var.[R*²] = 2(k - 1) / [(n - k)(n + 1)].
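To spell out that last step: R*² is just a linear transformation of R², so using the second form in (2),

        E[R*²] = [(n - 1)/(n - k)]E[R²] - (k - 1)/(n - k) = (k - 1)/(n - k) - (k - 1)/(n - k) = 0 ,

        Var.[R*²] = [(n - 1)/(n - k)]² Var.[R²] = 2(k - 1) / [(n - k)(n + 1)] .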

So, if there is no linear relationship between y and X (and the "population R²" is zero), the adjusted R² is both an unbiased and a consistent estimator of that population measure.
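A companion sketch to the one above checks this numerically (again Python/numpy, with one illustrative choice of n and k of my own): the simulated mean of R*² should sit near zero, and its variance near 2(k - 1)/[(n - k)(n + 1)].

```python
# Monte Carlo sketch for the adjusted R-squared under the same design:
# its mean should be approximately zero, and its variance close to
# 2(k - 1) / [(n - k)(n + 1)].
import numpy as np

rng = np.random.default_rng(123)
n, k, n_reps = 50, 4, 50_000

X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
r2_adj = np.empty(n_reps)
for i in range(n_reps):
    y = rng.standard_normal(n)                  # population R-squared is zero
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ b) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    r2_adj[i] = 1.0 - (1.0 - r2) * (n - 1) / (n - k)

print(f"mean adjusted R2 = {r2_adj.mean():+.4f}  (theory: 0)")
print(f"var  adjusted R2 = {r2_adj.var():.5f}  "
      f"(theory: {2 * (k - 1) / ((n - k) * (n + 1)):.5f})")
```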

Adjusting the usual R² for the degrees of freedom results in an interesting property for this sample statistic.


© 2013, David E. Giles

3 comments:

  1. Thanks! Your posts are truly helpful!

  2. Is there a way to test the significance of R² like any other statistic?

    1. Yes, absolutely. The usual F-test of the hypothesis that all of the regression coefficients (except that for the intercept) are zero is also a test that R-squared equals zero. (There is a monotonic relationship between F and R-squared.)

