Saturday, November 8, 2014

A Reverse Regression Inequality

Suppose that we fit the following simple regression model, using OLS:

            yi = βxi + εi   .                                                              (1)

To simplify matters, suppose that all of the data are calculated as deviations from their respective sample means. That's why I haven't explicitly included an intercept in (1). This doesn't affect any of the following results.

The OLS estimator of β is, of course,

            b = Σ(xiyi) / Σ(xi²) ,

where the summations are for i = 1 to n (the sample size).

Now consider the "reverse regression":

           xi = αyi + ui   .                                                             (2)

The OLS estimator of α is

           a = Σ(xiyi) / Σ(yi²) .

Clearly, a ≠ (1 / b), in general. However, can you tell if a ≥ (1 / b), or if a ≤ (1 / b)?

The answer is, "yes", and here's how you do it.

The trick is to recall the Cauchy-Schwarz Inequality. One variant of this inequality tells us that

          [Σ(xiyi)]² ≤ [Σ(xi²) Σ(yi²)]  .                                          (3)

Now,  (ab) =  [Σ(xiyi)]² / [Σ(xi²) Σ(yi²)] .

So, immediately, from (3),

          (ab) ≤  [Σ(xi²) Σ(yi²)] / [Σ(xi²) Σ(yi²)] = 1 ,

and hence:

          a ≤ (1 / b) ,   if b > 0 ;

          a ≥ (1 / b) ,   if b < 0 ,                                            (4)

regardless of the sample values for the data.
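The bound is easy to confirm numerically. Here is a minimal sketch in Python/NumPy; the data-generating process and seed below are arbitrary choices of mine, since the inequality is purely algebraic and holds for any sample:

```python
import numpy as np

# Arbitrary simulated data -- any sample would do.
rng = np.random.default_rng(42)
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

# Work with deviations from sample means, as in the post.
x = x - x.mean()
y = y - y.mean()

b = np.sum(x * y) / np.sum(x ** 2)   # OLS slope from regressing y on x
a = np.sum(x * y) / np.sum(y ** 2)   # OLS slope from the reverse regression

assert a * b <= 1.0                  # the Cauchy-Schwarz bound
assert a <= 1.0 / b                  # b > 0 for these data, so a <= 1/b
```

Re-running with different seeds, or with a negative true slope, shows the second case of (4) in the same way.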

Now, here are some questions for you to think about:
  1. Under what circumstances will (4) hold as an equality?
  2. What can you say about the relationship between the two R² values that we get when we estimate (1) and (2) by OLS?
  3. What can you say about the relationship between the t-ratios for testing H0: β = 0 in (1); and for testing H0': α = 0 in (2)?

© 2014, David E. Giles


  1. When x and y have the same variance, so the bivariate plot is circular?

  2. Dave: speaking of reverse regression, and recognising that my econometrics is very weak, I would be grateful if you would tell me if I am totally out to lunch on this post:

  3. The answer is "It depends".

    Equation (4) holds only if b > 0.

  4. For the third question you pose at the end of this post (the question about the t-ratios): should the t-statistics be identical? If so, what happens if x is measured with error? Will the rejection rates be different?

    1. See the follow-up post:

      If x is measured with error, this will affect the rejection rate(s).

  5. another question:

    Suppose you estimate

    y_i = b_0 + b_1 x_1_i + b_2 x_2_i + ε_i    (1)


    e_i = z_0 + z_1 x_1_i + u_i    (2)

    where e is the residual of a regression of y on a constant and x_2.

    Show that |z_1| is smaller than or equal to |b_1|, and
    show how to change (2) so that z_1 = b_1.
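For what it's worth, here is a numerical sketch of the exercise in the comment above. The data-generating process is my own arbitrary choice, and the "fix" sketched at the end uses the Frisch-Waugh-Lovell idea of partialling x_2 out of x_1 before running the second regression:

```python
import numpy as np

# Arbitrary simulated data with correlated regressors.
rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Full regression (1): y on a constant, x1, and x2.
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# e: residual from regressing y on a constant and x2.
Z2 = np.column_stack([np.ones(n), x2])
e = y - Z2 @ np.linalg.lstsq(Z2, y, rcond=None)[0]

# Regression (2) as written: e on a constant and x1.
Z1 = np.column_stack([np.ones(n), x1])
z = np.linalg.lstsq(Z1, e, rcond=None)[0]
assert abs(z[1]) <= abs(b[1])        # |z_1| <= |b_1|

# The fix: replace x1 by its residual from a regression on (constant, x2).
x1_res = x1 - Z2 @ np.linalg.lstsq(Z2, x1, rcond=None)[0]
z1_fwl = np.sum(x1_res * e) / np.sum(x1_res ** 2)
assert np.isclose(z1_fwl, b[1])      # now the slopes coincide
```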

