Wednesday, August 10, 2011

Flip a Coin - FIML or 3SLS ?

When it comes to choosing an estimator or a test in our econometric modelling, sometimes there are pros and cons that have to be weighed against each other. Occasionally we're left with the impression that the final decision may as well be based on computational convenience, or even the flip of a coin.

In fact, there's usually some sound basis for selecting one potential estimator or test over an alternative one. Let's take the case where we're estimating a structural simultaneous equations model (SEM). In this case there's a wide range of consistent estimators available to us.

There are the various "single equation" estimators, such as 2SLS or Limited Information Maximum Likelihood (LIML). These have the disadvantage of being asymptotically inefficient, in general, relative to the "full system" estimators. However, they have the advantage of usually being more robust to model mis-specification. Mis-specifying one equation in the model may result in inconsistent estimation of that equation's coefficients, but this generally won't affect the estimation of the other equations.

The two commonly used "full system" estimators are 3SLS and Full Information Maximum Likelihood (FIML). Under standard conditions, these two estimators are asymptotically equivalent when it comes to estimating the structural form of an SEM with normal errors. More specifically, they each have the same asymptotic distribution, so they are both asymptotically efficient.

On the face of it, then, it might seem that there's really nothing to choose between these two estimators on statistical grounds, at least in large enough samples. Using this reasoning, it's sometimes suggested that we may as well use 3SLS rather than FIML, because the former estimator is much simpler to compute than the latter, and we don't run into the problem of making sure we've located a global maximum of the likelihood function when computing the FIML estimates. (I've posted on this computational issue associated with MLEs here.)

The trouble with this line of reasoning, however, is that it ignores a couple of important issues. First, this equivalence between 3SLS and FIML holds only asymptotically - that is, if the sample is infinitely large. What about the situation where we have just a modest-sized sample?

In an unpublished paper, Sargan (1970) proved that in finite samples the sampling distribution of the FIML estimator has Cauchy-like tails. This implies that the mean and higher-order moments are not defined. It doesn't even make sense to ask, "is this estimator unbiased?", because the mean of its sampling distribution doesn't exist. Equally, it makes no sense to try and compare the MSE of this estimator with that of another estimator, in finite samples. This MSE isn't well-defined either.

So, some people would see this as a strike against the FIML estimator, and in favour of the 3SLS estimator, as the latter doesn't suffer from this particular problem. (See Sargan (1978).)

Second, notice that this asymptotic equivalence between 3SLS and FIML relates to the estimation of the structural form (SF) parameters. These parameters can be of great interest. If the model is linear in the variables and parameters, these parameters are the marginal effects between the exogenous variables and the endogenous variables, just like in a linear OLS regression model. These parameters can be used to compute other economically interesting quantities, such as elasticities.

However, there's more to an SEM than just its structural form. If we want to use the model for computing multipliers, or for forecasting (as is very often the case), then we need to use the reduced form of the model. To retain the information in the identifying restrictions associated with the structural form, we solve the latter for the restricted reduced form (RRF) of the model. The estimates of the structural form parameters are manipulated to give us the estimates of the parameters in the corresponding restricted reduced form of the model.
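
To make the step from the SF to the RRF concrete, here is a sketch in matrix notation. The symbols Y, X, U, Gamma, B, Pi and V are my own labels and are not used elsewhere in this post; M and G are the numbers of endogenous and predetermined variables, and T is the sample size.

```latex
% Structural form (SF): T observations, M endogenous and G predetermined variables.
% Y is T x M, \Gamma is M x M (assumed nonsingular), X is T x G, B is G x M, U is T x M.
Y \Gamma + X B = U

% Post-multiplying by \Gamma^{-1} gives the reduced form:
Y = X \Pi + V, \qquad \Pi = -B \Gamma^{-1}, \qquad V = U \Gamma^{-1}

% The restricted reduced form (RRF) estimates are derived from the SF estimates:
\hat{\Pi}_{RRF} = -\hat{B} \, \hat{\Gamma}^{-1}
```

Whatever estimator is used for the SF (2SLS, 3SLS or FIML), the derived RRF estimates inherit the identifying restrictions through the estimated B and Gamma matrices.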

Now, if we focus on the restricted reduced form, what can we say by way of comparing the pros and cons of the (derived) 3SLS and FIML estimates? At this point, some other interesting differences arise, and these can also help us in choosing between these two estimators.

Let me explain.

McCarthy (1972) proved that the estimator for the RRF coefficients derived from the 2SLS estimates of the SF possesses no moments. That is, its mean, variance, etc. are all infinite, because the integrals associated with their calculation don't converge. In the early 1970's, Denis Sargan proved the same result for the 3SLS estimator of the RRF. See Sargan (1988).
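
For some rough intuition about what "no moments" looks like in practice, here is a small stylised simulation. It is not a simulation of 2SLS or 3SLS, and it doesn't reproduce McCarthy's or Sargan's proofs. It just uses the fact that RRF coefficients are obtained by inverting estimated structural coefficients, and looks at a textbook-style multiplier 1/(1 - c) when the estimate of c is (by assumption) normally distributed. The function names, the mean of 0.6 and the standard deviation of 0.2 are all hypothetical choices of mine.

```python
import random
import statistics


def simulate_multipliers(n_draws, c_mean=0.6, c_sd=0.2, seed=12345):
    """Draw hypothetical estimates c_hat ~ N(c_mean, c_sd^2) and return
    the implied multipliers 1 / (1 - c_hat).

    The normal distribution for c_hat is an assumption made purely for
    illustration; it is not the sampling distribution of any particular
    SEM estimator.
    """
    rng = random.Random(seed)
    multipliers = []
    for _ in range(n_draws):
        c_hat = rng.gauss(c_mean, c_sd)
        # The denominator can land arbitrarily close to zero, which is
        # what generates the occasional enormous multiplier.
        multipliers.append(1.0 / (1.0 - c_hat))
    return multipliers


def summarize(draws):
    """Return the sample median, sample mean and largest absolute draw."""
    return {
        "median": statistics.median(draws),
        "mean": statistics.fmean(draws),
        "max_abs": max(abs(d) for d in draws),
    }


if __name__ == "__main__":
    # Repeat the experiment with a few different seeds and compare.
    for seed in (1, 2, 3):
        print(seed, summarize(simulate_multipliers(100_000, seed=seed)))
```

Across seeds, the sample median should stay in the neighbourhood of 1/(1 - 0.6) = 2.5, while the sample mean and the largest draw can differ a lot from one run to the next. That instability is the practical symptom of a sampling distribution whose mean doesn't exist.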

Now, that's a bit of a worry in practice. When the 3SLS estimates of the RRF parameters are combined with the exogenous data to generate forecasts, for instance, these forecasts can be all over the map!

On the other hand, this problem doesn't arise when we estimate the SF by FIML, and then obtain the corresponding estimates of the parameters of the RRF. Also in the early 1970's, Sargan proved that the finite sample moments of the FIML/RRF estimator exist up to the order (T - M - G), where T is the sample size; M is the number of endogenous variables in the system; and G is the number of predetermined variables in the model.

So, to me this makes a pretty strong case in favour of using FIML, rather than 3SLS, if you have any interest at all in the RRF of the model that you're estimating, and if your sample size is modest. On the other hand, if it's the structural coefficients that you're interested in, and your sample is relatively small, then 3SLS is a pretty compelling choice, on both statistical and computational grounds.

These pros and cons are something to keep in mind, that's for sure. You don't have to flip a coin!

Note: The links to the following references will be helpful only if your computer's IP address gives you access to the electronic versions of the publications in question. That's why a written References section is provided.


McCarthy, M. D., 1972. A note on the forecasting properties of two stage least squares restricted reduced forms - the finite sample case. International Economic Review, 13, 757-761.

Sargan, J. D., 1970. The finite sample distribution of FIML estimators. Econometric Society World Congress, Cambridge.

Sargan, J. D., 1978. On the existence of the moments of 3SLS estimators. Econometrica, 46, 1329-1350.

Sargan, J. D., 1988. The existence of the moments of estimated reduced form coefficients. Chapter 6 in J. D. Sargan, Contributions to Econometrics, (ed., E. Maasoumi), Cambridge University Press, Cambridge, 133-157.

© 2011, David E. Giles


  1. Dear Sir,
    Thank you very much for this post. I need some clarification regarding FIML. What is the under-identification problem? How can we solve it using exclusion restrictions? An explanation with an example would be very helpful.

    Kind Regards
    K. Sharmin

    1. Under-identification arises when the parameters can't be determined uniquely. It is logically an issue prior to the matter of estimation. If any parameter is under-identified, so is the equation in question. In this case, there does not exist ANY consistent estimator of that equation. If any equation is under-identified, then so is the whole SEM.

      We usually check for identification using the so-called "rank" and "order" conditions. The former is both necessary and sufficient, while the latter is necessary but not sufficient. The order condition is easy to check just by looking at each equation. If K is the total number of predetermined variables in the entire system, then the order condition for the i'th equation is that (K-Ki) >= (Mi-1). Here, Ki is the number of predetermined variables included in the i'th equation, and Mi is the number of endogenous variables (including the dependent variable) that are included in the i'th equation.

      Note that (K-Ki) is the number of predetermined variables EXCLUDED from the i'th equation. So, an under-identified equation can be made identifiable if we exclude additional predetermined variables from that equation. Of course, these exclusion restrictions would need to make economic sense.

      If an equation is under-identified, the 2SLS estimator is not even defined, as one of the matrices that has to be inverted in the construction of this estimator will be singular. The same problem arises if you try to apply 3SLS or FIML to a system in which any equation is under-identified.
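
By way of an example, the order condition described in the reply above can be sketched in a few lines of Python. The function names and the demand/supply system below are hypothetical and are used only for illustration. Keep in mind that the order condition is only necessary, so passing this check does not by itself establish identification; the rank condition would still need to be verified.

```python
def order_condition_gap(K, K_i, M_i):
    """Return (K - K_i) - (M_i - 1) for the i'th equation.

    K   : total number of predetermined variables in the system
    K_i : number of predetermined variables included in equation i
    M_i : number of endogenous variables included in equation i,
          counting the dependent variable
    """
    if not 0 <= K_i <= K:
        raise ValueError("K_i must lie between 0 and K")
    if M_i < 1:
        raise ValueError("M_i must be at least 1")
    return (K - K_i) - (M_i - 1)


def order_condition_holds(K, K_i, M_i):
    """True when the (necessary) order condition (K - K_i) >= (M_i - 1) holds."""
    return order_condition_gap(K, K_i, M_i) >= 0


if __name__ == "__main__":
    # Hypothetical system A: quantity and price are endogenous; the
    # predetermined variables are a constant and income (K = 2).
    # Demand includes the constant and income; supply includes only the constant.
    print("A, demand:", order_condition_holds(K=2, K_i=2, M_i=2))
    print("A, supply:", order_condition_holds(K=2, K_i=1, M_i=2))

    # Hypothetical system B: rainfall is added to the supply equation only,
    # so it is a predetermined variable excluded from demand (K = 3).
    print("B, demand:", order_condition_holds(K=3, K_i=2, M_i=2))
    print("B, supply:", order_condition_holds(K=3, K_i=2, M_i=2))
```

In system A the demand equation excludes no predetermined variables, so it fails the order condition. In system B, the exclusion of rainfall from the demand equation is exactly the kind of exclusion restriction that lets the order condition hold for that equation, provided the restriction makes economic sense.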