Wednesday, July 3, 2013

The Adjusted R-Squared, Again

In an earlier post about the adjusted coefficient of determination, RA2, I mentioned the following results that a lot of students don't seem to be aware of, in the context of a linear regression model estimated by OLS:

  1. Adding a regressor will increase (decrease) RA2 if the absolute value of the t-statistic associated with that regressor is greater (less) than one. RA2 is unchanged if that absolute t-statistic is exactly equal to one. If you drop a regressor from the model, the converse result applies.
  2. Adding a group of regressors to the model will increase (decrease) RA2 if the F-statistic for testing that their coefficients are all zero is greater (less) than one. RA2 is unchanged if that F-statistic is exactly equal to one. If you drop a group of regressors from the model, the converse result applies.
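As a quick numerical sanity check on the first result, here's a short sketch using numpy. The data and variable names are mine (simulated for illustration); the point is that the sign comparison holds exactly, whatever the data:

```python
import numpy as np

def adj_r2(y, X):
    """Adjusted R-squared for an OLS fit of y on X (X includes the intercept)."""
    n, k = X.shape
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return 1.0 - (e @ e / (n - k)) / np.var(y, ddof=1)

rng = np.random.default_rng(42)
n = 50
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
y = 1.0 + 0.5 * x1 + 0.1 * x2 + rng.standard_normal(n)

X_small = np.column_stack([np.ones(n), x1])        # intercept + x1
X_big   = np.column_stack([np.ones(n), x1, x2])    # adds the regressor x2

# t-statistic for the coefficient on x2 in the larger model
b = np.linalg.solve(X_big.T @ X_big, X_big.T @ y)
e = y - X_big @ b
s2 = e @ e / (n - X_big.shape[1])
V = s2 * np.linalg.inv(X_big.T @ X_big)
t_x2 = b[2] / np.sqrt(V[2, 2])

# |t| > 1 holds exactly when adding x2 raised the adjusted R-squared
print(abs(t_x2) > 1, adj_r2(y, X_big) > adj_r2(y, X_small))
```

Changing the seed or the coefficient on x2 flips the booleans together, never separately.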
The first of these results is (effectively) stated as Theorem 3.1 in Greene (2012), but the proof is left as an exercise.

In a comment on my previous post, I was asked if I could supply simple proofs of these results.


In fact, a simple proof of result 1 is given by Haitovsky (1969). This was generalized to a proof of the second result by Edwards (1969).

Notice that because tv2 = F1,v, for any degrees of freedom, v, a proof of the second result effectively provides a proof of the first.
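You can confirm that identity numerically. Here's a one-liner sketch using scipy.stats (assuming scipy is available); the evaluation point and degrees of freedom are arbitrary choices of mine:

```python
from scipy.stats import t, f

v = 12   # denominator degrees of freedom (any positive value works)
x = 1.8  # any positive point of evaluation

# If T ~ t_v then T^2 ~ F_{1,v}, so the two-sided t_v tail probability at x
# equals the F_{1,v} tail probability at x^2.
p_t = 2 * t.sf(x, v)
p_f = f.sf(x**2, 1, v)
print(abs(p_t - p_f) < 1e-9)
```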

In case you can't access the Edwards paper cited below, here's my version of his proof, and hence of both of the results stated above. Edwards' proof uses the usual ANOVA table, but I'll phrase things a little differently.

More importantly, my proof will be more general than Edwards'. I'll deal with the case of imposing J exact linear restrictions on the coefficient vector. This includes deleting J regressors from the model (by restricting their coefficients to be zero) as a special case.

There are several references that are relevant at this point, including Leamer (1975), McAleer et al. (1986), Oksanen (1987), and Visco (1978, 1988).

Here we go...

Suppose that we have a k-regressor linear regression model, which includes an intercept,

                 y = Xβ + ε ,

and we estimate the model by OLS, using a sample of n observations. We'll denote the OLS estimator of β as b = (X'X)-1X'y. The associated residual vector is e = (y - Xb).
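In code, a minimal numpy sketch of this setup (simulated data; the names are mine) looks like this. Note that solving the normal equations directly is numerically safer than forming (X'X)-1 explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])  # includes an intercept
y = X @ np.array([1.0, 2.0, -0.5]) + rng.standard_normal(n)

b = np.linalg.solve(X.T @ X, X.T @ y)   # OLS: solves (X'X)b = X'y
e = y - X @ b                           # OLS residual vector

# The normal equations force the residuals to be orthogonal to every column of X
print(np.allclose(X.T @ e, 0))
```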

The "adjusted" (for degrees of freedom) R2 is:

               RA2 = 1 - [e'e / (n - k)] / [sy2] ,

where sy2 = [Σ(yi - ybar)2] / (n - 1) is the usual sample variance of the y data, and "ybar" is the sample mean, [Σyi] / n.
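Because sy2 has (n - 1) in its denominator, this definition agrees with the more familiar textbook form, RA2 = 1 - (1 - R2)(n - 1)/(n - k). A short numpy check (simulated data, my variable names):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 30, 4
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
y = X @ rng.standard_normal(k) + rng.standard_normal(n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
tss = np.sum((y - y.mean()) ** 2)       # total sum of squares

r2 = 1 - (e @ e) / tss
adj_r2_textbook = 1 - (1 - r2) * (n - 1) / (n - k)           # common textbook form
adj_r2_variance = 1 - (e @ e / (n - k)) / np.var(y, ddof=1)  # form used in this post

print(np.isclose(adj_r2_textbook, adj_r2_variance))
```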

Now, let's consider a set of J independent, linear restrictions on β, of the form Rβ = r, where R and r are non-random, r is (J x 1), and R is (J x k) and of rank J. Let the restricted least squares (RLS) estimator of β be b*. Then the adjusted R2 associated with the RLS results is

               RA*2 = 1 - [e*'e* / (n - k + J)] / [sy2] ,

and the test statistic for testing that the J restrictions are valid can be written as:

                          F = [(e*'e* - e'e) / J] / [e'e / (n-k)] .

We can write:

          [RA2 / RA*2 ] = [sy2 - e'e / (n - k)] / [sy2 - e*'e* / (n - k + J)] .

Then, we can see that

          RA2 > RA*2   if   e*'e* > e'e + e'e (J / (n - k)) ;

or, if
          (e*'e* - e'e) / J > e'e / (n - k)  ;

or if
          [(e*'e* - e'e) / J] / [e'e / (n - k)] = F > 1  .

There's the result: the adjusted R2 will increase when we relax the J restrictions (for example, when we add the J regressors back into the model) if F > 1. Similarly, it will decrease (stay the same) if F < 1 (F = 1).
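The whole derivation can be verified numerically for the special case of zero restrictions on the last J coefficients. A numpy sketch (simulated data, my names), in which the equivalence holds exactly whatever seed is used:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, J = 60, 5, 2
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
y = X @ rng.standard_normal(k) + rng.standard_normal(n)

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return e @ e

ee      = rss(y, X)             # unrestricted: all k regressors
ee_star = rss(y, X[:, :k - J])  # restricted: last J coefficients set to zero

F = ((ee_star - ee) / J) / (ee / (n - k))

s2y = np.var(y, ddof=1)
adj_r2      = 1 - (ee / (n - k)) / s2y
adj_r2_star = 1 - (ee_star / (n - k + J)) / s2y

# F > 1 exactly when the unrestricted model has the higher adjusted R-squared
print((F > 1) == (adj_r2 > adj_r2_star))
```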

By the way, notice that these results are purely algebraic, not statistical. They hold as long as the OLS estimator is defined - that is, no matter how much multicollinearity there is, as long as it's not "perfect" (i.e., as long as X has full column rank, k). They hold whether the regression errors are normal or not; serially independent or not; homoskedastic or not; and they hold even if the regressors are random and correlated with the errors.

Of course, the t-statistics and F-statistics wouldn't always follow their standard distributions in many of these cases, and I'm not saying we should still be using OLS! I'm just pointing out that these results are purely algebraic (or geometric) in nature, and hold in the same way that we cannot decrease the usual R2 by adding any regressor to the model.

These results extend in a straightforward way to the case where Instrumental Variables estimation, rather than OLS estimation, is used. See Giles (1989).


References

Edwards, J. B., 1969. The relation between the F-test and R-bar2. The American Statistician, 23(5), 28.

Giles, D. E. A., 1989. Coefficient sign changes when restricting regression models under instrumental variables estimation. Oxford Bulletin of Economics and Statistics, 51, 465-467.

Haitovsky, Y., 1969. A note on the maximization of R-bar2. The American Statistician, 23(1), 20-21.

Leamer, E. E., 1975. A result on the sign of restricted least-squares estimates. Journal of Econometrics, 3, 387-390.

McAleer, M., A. R. Pagan, and I. Visco, 1986. A further result on the sign of restricted least-squares estimates. Journal of Econometrics, 32, 287-290.

Oksanen, E. H., 1987. On sign changes upon deletion of a variable in linear regression analysis. Oxford Bulletin of Economics and Statistics, 49, 227-229.

Visco, I., 1978. On obtaining the right sign of a coefficient estimate by omitting a variable from the regression. Journal of Econometrics, 7, 115-117.

Visco, I., 1988. Again on sign changes upon deletion of a variable from a linear regression. Oxford Bulletin of Economics and Statistics, 50, 225-227.


© 2013; David E. Giles

2 comments:

  1. Dave -
    I was aware of this property but never made an effort to go through the math - until I've read this post. The way you showed it was very instructive. Thanks!

    btw, I think your second equation ought to read R2 = 1 - (e'e/n) / [sy2]

    Best,

    Boris

    1. Boris - thanks for spotting that - I've amended the text.

      DG
