Monday, September 30, 2013

Solution to the Regression Trick

In a post earlier this month, I posed the following problem:

A researcher wishes to estimate the regression of y on X by OLS, but does not wish to include an intercept term in the model. Unfortunately, the only econometrics package available is one that "automatically" includes the intercept term. A colleague suggests that the following approach may be used to ‘trick’ the computer package into giving the desired result – namely a regression fitted through the origin: 
Enter each data point twice, once with the desired signs for the data, and then with the opposite signs. That is, the sample would involve ‘2n’ observations – the first ‘n’ of them would be of the form (yi, xi') and the next ‘n’ of them would be of the form (-yi, -xi'). Then fit the model (with the intercept) using all ‘2n’ observations, and the estimated slope coefficients will be the same as if the model had been fitted with just the first ‘n’ observations but no intercept.
Is your colleague's suggestion going to work?

The answer is.....
Yes!


The package forces the inclusion of an intercept in the model, so the regression "line" (actually, hyper-plane) will pass through the sample means of the data. However, because every observation is entered once with each sign, the sample mean of every variable we're supplying is zero. So, the regression will pass through the origin, as desired: the estimated intercept is exactly zero.
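A quick numerical check of this argument can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: a single regressor, simulated data, and hypothetical helper names (`slope_through_origin`, `fit_with_intercept`) that stand in for the "package" that insists on an intercept.

```python
import random

def slope_through_origin(x, y):
    """OLS slope for y = b*x + e (no intercept), single regressor."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def fit_with_intercept(x, y):
    """OLS intercept and slope for y = a + b*x + e, single regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Simulated data (an assumption for illustration only).
random.seed(0)
n = 25
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [3.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

# The "trick": append every observation again with the opposite sign.
x2 = x + [-xi for xi in x]
y2 = y + [-yi for yi in y]

b_origin = slope_through_origin(x, y)           # what we actually want
a_trick, b_trick = fit_with_intercept(x2, y2)   # what the package reports

print("slope, no intercept, n obs:   ", b_origin)
print("slope, with intercept, 2n obs:", b_trick)
print("intercept, 2n obs:            ", a_trick)
```

The two slopes should agree up to floating-point rounding, and the reported intercept should be zero up to the same rounding.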

If you really want to do the math, this is how it goes:
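One way to write the argument out is sketched below. The notation is assumed here rather than taken from the post: X is the n x k regressor matrix, y is the n x 1 vector of responses, starred quantities are the stacked 2n-observation versions, and \iota is a 2n x 1 vector of ones.

```latex
% Stacked data: each observation entered once with each sign.
\[
y^{*} = \begin{pmatrix} y \\ -y \end{pmatrix}, \qquad
X^{*} = \begin{pmatrix} X \\ -X \end{pmatrix}, \qquad
\iota' y^{*} = 0, \qquad \iota' X^{*} = 0' .
\]
% Normal equations for the model with intercept a and slopes b.
\[
\begin{pmatrix} \iota'\iota & \iota' X^{*} \\ X^{*\prime}\iota & X^{*\prime}X^{*} \end{pmatrix}
\begin{pmatrix} a \\ b \end{pmatrix}
=
\begin{pmatrix} \iota' y^{*} \\ X^{*\prime} y^{*} \end{pmatrix}
\quad\Longrightarrow\quad
\begin{pmatrix} 2n & 0' \\ 0 & 2X'X \end{pmatrix}
\begin{pmatrix} a \\ b \end{pmatrix}
=
\begin{pmatrix} 0 \\ 2X'y \end{pmatrix}
\]
% The system is block-diagonal, so it separates.
\[
a = 0, \qquad b = (2X'X)^{-1}(2X'y) = (X'X)^{-1}X'y .
\]
```

The last expression is the OLS estimator from regressing y on X with no intercept, using only the original n observations.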


© 2013, David E. Giles

2 comments:

  1. So, I knew this worked because I've done it before (in terms of forcing the intercept to the origin), but won't your standard errors be underestimated?

    Replies
    1. Yes, they certainly will. In fact, for an original sample size of 'n', and 'k' regressors (excluding the intercept), the reported standard errors will be 'c' times the correct standard errors, where c = SQRT[(n-k)/(2n-k-1)]. Clearly, c < 1.
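The factor c can be checked numerically. The sketch below assumes a single regressor (k = 1), simulated data, and hypothetical helper names (`se_through_origin`, `se_with_intercept`); it is meant only as an illustration of the formula, not as a general implementation.

```python
import math
import random

def se_through_origin(x, y):
    """Standard error of the OLS slope in y = b*x + e (k = 1, no intercept)."""
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 1)            # n - k degrees of freedom, with k = 1
    return math.sqrt(s2 / sxx)

def se_with_intercept(x, y):
    """Standard error of the OLS slope in y = a + b*x + e."""
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (m - 2)            # m - k - 1 degrees of freedom, with k = 1
    return math.sqrt(s2 / sxx)

# Simulated data (an assumption for illustration only).
random.seed(1)
n, k = 25, 1
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

# Doubled sample: every observation again with the opposite sign.
x2 = x + [-xi for xi in x]
y2 = y + [-yi for yi in y]

se_correct = se_through_origin(x, y)     # n observations, no intercept
se_reported = se_with_intercept(x2, y2)  # 2n observations, with intercept

ratio = se_reported / se_correct
c = math.sqrt((n - k) / (2 * n - k - 1))

print("reported / correct:", ratio)
print("c from the formula:", c)
```

If the two printed numbers agree, dividing the package's reported standard errors by c recovers the correct ones.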
