Sunday, October 7, 2012

Dancing With the Econometricians

Let's talk about the two-step. Not the tango or the polka. The two-step!

More specifically let's talk about a particular two-step estimator that we use all of the time in econometrics. I want to clear up some misconceptions that I seem to encounter all too frequently when I read empirical "applied" papers.

Why is it that some people insist on using the term "Two Stage Least Squares" inappropriately?

Let me explain what I mean.
Suppose that we have a system of M simultaneous equations, where the (ith) structural equation that we want to estimate can be written as:
yi = Yiγi + Xiβi + εi .        (1)

We'll assume that this equation is identified. Yes, let's be careful about this. If the "rank condition" for identification of the equation isn't satisfied, then the 2SLS estimator won't be weakly consistent. Even worse, if the "order condition" isn't satisfied then this estimator isn't even defined (because a matrix that has to be inverted will be singular).

The (mi) columns of the Yi matrix are endogenous regressors.  The (ki) columns of the Xi matrix are "predetermined" regressors. That is, they are strictly exogenous regressors (often including an "intercept" variable); or else they are lagged endogenous or exogenous variables (if we have time series data). The elements of the error term, εi, have a zero mean, and are homoskedastic and serially independent.

Remember that this equation is just one of M in the entire simultaneous system. Let's use the symbol, X, to denote the matrix of observations on all of the predetermined variables that appear anywhere in the system.
Suppose that X has K columns.
The 2SLS estimator of the coefficients in equation (1) can be described as follows:
1. Using OLS, regress Yi on all (yes, ALL) of the columns of X, and get the matrix of predictions, Yi* = X(X'X)^(-1)X'Yi.
2. Replace Yi in (1) with Yi*, and then use OLS to estimate this modified equation, yielding consistent estimates of the elements of γi and βi.

Notice the words that I've emphasized in the first stage.
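To make the two steps concrete, here is a minimal NumPy sketch on simulated data. Everything in it is an assumption made purely for illustration: the data-generating process, the sample size, the "true" coefficient values, and the helper name `ols` are all hypothetical. There is a single endogenous regressor (mi = 1), X has K = 4 columns, and Xi consists of the first two of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# X: ALL K = 4 predetermined variables in the (hypothetical) system, with an intercept.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
Xi = X[:, :2]  # the k_i = 2 predetermined regressors that appear in equation (1)

# Correlated errors are what make Yi endogenous in the structural equation.
u = rng.normal(size=n)              # structural error, eps_i
v = 0.8 * u + rng.normal(size=n)    # reduced-form error for Yi
Yi = (X @ np.array([0.5, 1.0, 1.0, -1.0]) + v).reshape(-1, 1)

gamma_true, beta_true = 2.0, np.array([1.0, -0.5])   # assumed "true" values
yi = Yi[:, 0] * gamma_true + Xi @ beta_true + u


def ols(A, b):
    """Least-squares coefficients from regressing b on the columns of A."""
    return np.linalg.lstsq(A, b, rcond=None)[0]


# Step 1: regress Yi on ALL columns of X and keep the predictions Yi*.
Yi_star = X @ ols(X, Yi)

# Step 2: replace Yi with Yi* in equation (1) and apply OLS.
coef_2sls = ols(np.column_stack([Yi_star, Xi]), yi)

# For comparison: OLS applied directly to equation (1), ignoring endogeneity.
coef_ols = ols(np.column_stack([Yi, Xi]), yi)

print("2SLS (gamma, beta):", coef_2sls)
print("OLS  (gamma, beta):", coef_ols)
```

In this particular design the 2SLS estimate of γi should land close to the assumed value of 2, while the direct OLS estimate should be pulled away from it by the correlation between Yi and the error term.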

Now, a bit of history - just to provide some light relief in the middle of the math! The 2SLS estimator emerged as a computationally convenient way of obtaining consistent estimates of the coefficients in an equation of the form (1). At the time that 2SLS was proposed by Theil (1953a, 1953b, 1954), Basmann (1957), and Sargan (1958), computational convenience was a big deal. In fact, it remained a big deal for a long time - e.g., see here. Anderson (2005) has recently drawn our attention to the fact that the 2SLS estimator was used (at least "obliquely") in the even earlier papers by Anderson and Rubin (1949, 1950) that introduced the LIML estimator and derived that estimator's asymptotic distribution.

So, what is it that I get upset about? Well, it's become very common to see the following two-step estimator described as "2SLS":
1. Using OLS, regress Yi on a set of suitable instruments, Z, and get the matrix of predictions, Yi** = Z(Z'Z)^(-1)Z'Yi.
2. Replace Yi in (1) with Yi**, and then use OLS to estimate this modified equation, yielding estimates of the elements of γi and βi.
This is often done in the context of a single structural equation, without a complete simultaneous equations model in sight. There's nothing wrong with this (provided that Xi is included in Z - see Angelo Melino's comment below) - but please don't refer to this as 2SLS! Unless Z = X, it's just an instrumental variables estimator, constructed in two steps.

In fact, although this second approach yields appropriate estimates of the coefficients, it won't give you the correct standard errors, or the correct values of anything else that is constructed from the second-step residuals. That's because the latter residual vector is of the form ei** = yi - (Yi**gi + Xibi), instead of the correct form, ei = yi - (Yigi + Xibi). Here, gi and bi are the estimated coefficient vectors. Of course, this is easily fixed, just as it is in the case of the genuine 2SLS estimator, but it's simpler to just use the one-step instrumental variables estimator rather than the (otherwise) equivalent two-step estimator.
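The residual issue is easy to see numerically. The sketch below is only an illustration under assumed conditions: the simulated design, the coefficient values, and all of the variable names are hypothetical, and the one-step estimator is written in the generalized IV form d = (W'PzW)^(-1)W'Pz yi, where W = [Yi, Xi] and Pz = Z(Z'Z)^(-1)Z'. Note that Z is built so that it contains Xi.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

Xi = np.column_stack([np.ones(n), rng.normal(size=n)])  # included exogenous regressors
Z_extra = rng.normal(size=(n, 2))                        # additional instruments
Z = np.column_stack([Xi, Z_extra])                       # Z contains Xi

u = rng.normal(size=n)                                   # structural error, variance 1
Yi = Z_extra @ np.array([1.0, 1.0]) + 0.8 * u + rng.normal(size=n)
W = np.column_stack([Yi, Xi])
yi = W @ np.array([2.0, 1.0, -0.5]) + u                  # assumed "true" coefficients

# Fitted values from regressing each column of W on Z.
# Because Xi lies in the column space of Z, this equals [Yi**, Xi].
PZ_W = Z @ np.linalg.lstsq(Z, W, rcond=None)[0]

# Two-step estimator: OLS of yi on [Yi**, Xi].
d_two_step = np.linalg.lstsq(PZ_W, yi, rcond=None)[0]

# One-step generalized IV estimator: (W' Pz W)^(-1) W' Pz yi.
d_iv = np.linalg.solve(PZ_W.T @ W, PZ_W.T @ yi)

# Residuals: what the second-step OLS reports versus the correct ones.
e_wrong = yi - PZ_W @ d_two_step   # uses Yi**
e_right = yi - W @ d_two_step      # uses the original Yi

s2_wrong = e_wrong @ e_wrong / n
s2_right = e_right @ e_right / n

V = np.linalg.inv(PZ_W.T @ PZ_W)
se_wrong = np.sqrt(s2_wrong * np.diag(V))
se_right = np.sqrt(s2_right * np.diag(V))

print("coefficients agree:", np.allclose(d_two_step, d_iv))
print("error variance from second-step residuals:", s2_wrong)
print("error variance from correct residuals:    ", s2_right)
print("standard errors (wrong, right):", se_wrong, se_right)
```

The two sets of coefficient estimates coincide, but only the residuals built from the original Yi give an error variance estimate near the value of 1 used in the simulation. Both sets of standard errors share the same matrix, so the whole difference comes from the variance estimate.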

Confession: One situation where the two-step approach is helpful is if you are worried that Z may include some "weak" instruments. In that case, the F-statistic for the significance of the estimated coefficients in the first step provides one basis for testing against such weakness. But that's another story!
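As a rough illustration of that first-step check, here is one way such an F-statistic might be computed by hand. This sketch makes a specific choice that goes beyond what is stated above: it tests only the coefficients on the instruments that are excluded from the structural equation, by comparing restricted and unrestricted residual sums of squares. The function name `first_stage_f` and the simulated "strong" and "weak" designs are hypothetical.

```python
import numpy as np


def first_stage_f(endog, X_incl, Z_excl):
    """F-statistic for H0: the coefficients on Z_excl are all zero in the
    first-step regression of endog on [X_incl, Z_excl]."""
    Z_full = np.column_stack([X_incl, Z_excl])

    def rss(A):
        resid = endog - A @ np.linalg.lstsq(A, endog, rcond=None)[0]
        return resid @ resid

    q = Z_excl.shape[1]                  # number of restrictions
    df = len(endog) - Z_full.shape[1]    # residual degrees of freedom
    return ((rss(X_incl) - rss(Z_full)) / q) / (rss(Z_full) / df)


rng = np.random.default_rng(2)
n = 2000
X_incl = np.column_stack([np.ones(n), rng.normal(size=n)])
Z_excl = rng.normal(size=(n, 2))
noise = rng.normal(size=n)

# Same regressors and noise; only the strength of the excluded instruments differs.
base = X_incl @ np.array([0.5, 1.0]) + noise
Y_strong = base + Z_excl @ np.array([1.0, 1.0])
Y_weak = base + Z_excl @ np.array([0.01, 0.01])

F_strong = first_stage_f(Y_strong, X_incl, Z_excl)
F_weak = first_stage_f(Y_weak, X_incl, Z_excl)
print("first-step F, strong instruments:", F_strong)
print("first-step F, weak instruments:  ", F_weak)
```

With these assumed designs, the statistic should be very large when the excluded instruments genuinely drive the endogenous regressor, and small when they barely do.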

I have to admit that as much as I enjoy using the EViews econometrics package, I find it intensely irritating that when it comes to instrumental variables estimation, the command is titled "Two Stage Least Squares"!

In another post, coming up this week, I'll explore the various connections between the (real) 2SLS estimator and instrumental variables estimation. There are more of these connections than you might think.

Meantime, call me pedantic if you will, but if you want to dance with the econometricians then you have to stay in time with the music, and you have to know that a samba is not the same as a rumba!

References
Anderson, T. W., 2005. Origins of the limited information maximum likelihood and two-stage least squares estimators. Journal of Econometrics, 127, 1-16.

Anderson, T.W. and H. Rubin, 1949. Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics, 20, 46–63.

Anderson, T.W. and H. Rubin, 1950. The asymptotic properties of estimates of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics, 21, 570–582.

Basmann, R.L., 1957. A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica, 25, 77–83.

Sargan, J.D., 1958. The estimation of economic relationships using instrumental variables. Econometrica, 26, 393–415.

Theil, H., 1953a. Repeated least-squares applied to complete equation systems. Centraal Planbureau Memorandum.

Theil, H., 1953b. Estimation and simultaneous correlation in complete equation systems. Centraal Planbureau Memorandum. (Reprinted in: Raj, B., Koerts, J. (Eds.), 1992, Henri Theil’s Contributions to Economics and Econometrics, Vol. 1, Kluwer, Dordrecht.)

Theil, H., 1954. Estimation of parameters in econometric models. Bulletin of the International Statistical Institute, 34, 122–129.

1. Pedantic, but informative!

2. I enjoyed this post on TSLS since I plan to use it as one of the methods for my Master's thesis. I look forward to future posts on TSLS.

Thanks!

3. Dave,

Can I take a contrary view? Or maybe you can correct me if I've misunderstood your post.

I've checked some of my more recent favourite rigorous econometrics texts, and the presentation and usage of the term "2SLS" seems to be the same as the one you are complaining about: estimate a single equation using IV, but do it in two stages.

For example, Davidson and MacKinnon, Econometric Theory and Methods (2004), say on pp. 323-4, "The IV estimator (8.29) is commonly known as the two-stage least-squares, or 2SLS, estimator, because, before the days of good econometrics software packages, it was often calculated in two stages using OLS regressions. ... Two-stage least squares was invented by Theil (1953) and Basmann (1957) at a time when computers were very primitive. Consequently, despite the classic papers of Durbin (1954) and Sargan (1958) on instrumental variables estimation, the term 'two-stage least squares' came to be very widely used in econometrics, even when the estimator is not actually computed in two stages. We prefer to think of two-stage least squares as simply a particular way to compute the generalized IV estimator...."

This is how I've always thought about 2SLS. It also seems to be the way the term is used in the other textbooks I've checked (Greene, Hayashi, Stock & Watson are the ones at hand). But you seem to be saying that 2SLS is a term that should be applied only to estimation of a system, and not to single-equation estimation. Is that right?

Personally ... I think the term "two-stage least squares" is terrible and should be avoided. It's very confusing for students, because they can easily be misled into thinking that the "two stages" are somehow integral to the definition of the estimator, when they're not - they're just a way of calculating the thing. These days we never calculate it in two stages, but it's the same estimator.

This is very different from "two-step estimators" which have to be done in two steps by their very nature. I have in mind Feasible GLS (1st step - estimate the variance components, 2nd step - get the coeffs); 2-step Efficient GMM (1st step - estimate the var-cov matrix of orthogonality conditions, 2nd step - get the coeffs); SUR; 3SLS; etc.

Personally, if I could I'd ban the term "two-stage least squares" entirely, and use "two-step" consistently for estimators that are necessarily done in two steps. But I suspect I am in the minority on this one.

--Mark

4. Mark - thanks for this. No, you haven't misunderstood my post. The term 2SLS has a specific historical meaning. We now know that it's just a particular IV estimator - that's how we usually prove its consistency.

I agree that it would be good if the "two-stage"/"two-step" terminology was now dropped. That's why I'd much prefer to see an "IV" command in EViews instead of the current "2SLS" command.

5. I've rummaged around some more, and the oldest econometrics books I have at hand (Johnston and Kmenta, though not the first editions) do introduce "2SLS" in the context of system estimation. But I don't think "2SLS" is used in that way any more, at least in common parlance (the "econometric vulgate", perhaps).

I see you have another post upcoming on IV et al., which I quite look forward to reading. I agree it would be great to drop the "two stage" terminology for this estimator, but I have a feeling that might complicate your drafting task unduly!

Cheers,
Mark

6. Thanks Mark!

7. Prof. Giles,

So you mean to say that when I use EViews and its TSLS command in my research, I should now label my methodology as "IV estimation method"? Thanks.

8. Yes - absolutely. And you should state what instruments you've used.

9. I agree with the message of your post but not with one of the details.

You write
"2. Replace Yi in (1) with Yi**, and then use OLS to estimate this modified equation, yielding estimates of the elements of γi and βi.
This is often done in the context of a single structural equation, without a complete simultaneous equations model in sight. There's nothing wrong with this - but please don't refer to this as 2SLS!"

In fact, there is a problem. In general, replacing Yi with Yi** leads to an inconsistent estimator because we can't guarantee that Xi will be orthogonal to Yi-Yi** unless Xi is included in the set of instruments Z.

1. Angelo - absolutely correct! In practice most people would do this, but it definitely should have been emphasised. I've amended the post accordingly.
Best, DG

10. Dear Prof.Dave
Can we solve a recursive model with the 2SLS method?
Example:
Y1=f(X1,X2,X3)
Y2=f(Y1,X1,X2,X3)
Y3=f(Y1,Y2,X1,X2,X3)

1. Hi - a recursive system is one that has the type of structure that you mention, AND the error covariance matrix is DIAGONAL. This implies that the errors in the different equations are independent of each other. In this case, OLS estimation can be applied.

However, it is most unlikely that the errors will be contemporaneously uncorrelated. In this case, you can just use 2SLS estimation, as you would for any other SEM.

11. post amazing