Tuesday, February 17, 2015

Non-Existent Instruments

Consider the following abstract for an econometrics paper:
"The method of instrumental variables (IV) and the generalized method of moments (GMM), and their applications to the estimation of errors-in-variables and simultaneous equations models in econometrics, require data on a sufficient number of instrumental variables that are both exogenous and relevant. We argue that, in general, such instruments (weak or strong) cannot exist." 
This is, in fact, the abstract for a recent paper by Hall et al. (2014), and when I first read it I was definitely intrigued!

Recall that when we look for instruments we need variables that are, on the one hand, (asymptotically) uncorrelated with the errors of our regression model, but, on the other hand, highly correlated (asymptotically) with the random regressors. The abstract, and the paper itself (of course), suggest that this objective is usually not achievable.

Why is this?

The difficulty arises if we view the error term in our regression equation as the result of various mis-specifications of the model. The authors argue that this interpretation is generally appropriate in econometric applications. Building on earlier work by Pratt and Schlaifer (1988), they show that in this case the error is generally a function of the very regressors that we're trying to "instrument". That being the case, legitimate instruments will be unattainable.
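
To see the mechanics, here's a toy simulation in R - my own illustration, not taken from the paper. The "error" contains a component that is a function of x, so any instrument z that is relevant for x is automatically correlated with the error, and the IV estimator is inconsistent:

```r
# Toy illustration (not from Hall et al.): if the error embodies a
# mis-specification component that is a function of x, the relevance of
# an instrument z for x mechanically destroys its exogeneity.
set.seed(123)
n <- 1e5
z <- rnorm(n)                  # candidate instrument
x <- 0.8 * z + rnorm(n)        # z is "relevant" for x
u <- 0.5 * x + rnorm(n)        # error = function of x, plus pure noise
y <- 1 + 2 * x + u             # true coefficient on x is 2

cor(z, x)                      # roughly 0.62 -- a strong instrument
cor(z, u)                      # roughly 0.34 -- exogeneity fails

b_iv <- cov(z, y) / cov(z, x)  # simple IV estimator of the slope
b_iv                           # converges to 2.5, not 2
```

The linear form of the mis-specification component is chosen purely for transparency; the point is that cov(z, u) inherits cov(z, x), so relevance and exogeneity can't be had together.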

Food for thought!   


References

Hall, S. G., P. A. V. B. Swamy, and G. S. Tavlas, 2014. On the interpretation of instrumental variables in the presence of specification errors. Working Paper 14/19, Department of Economics, University of Leicester.

Pratt, J. W. and R. Schlaifer, 1988. On the interpretation and observation of laws. Journal of Econometrics, 39, 23-52.


© 2015, David E. Giles

Monday, February 16, 2015

The Econometric Game, 2015

If you're a grad. student with an interest in econometrics, you've probably heard about The Econometric Game. It's been covered before on this blog (e.g., here, last year).

The 2015 Econometric Game is the sixteenth in the series, and it will take place at the University of Amsterdam, between 31 March and 2 April this year. You can find the list of participating universities here.

Last year's winner was the University of Copenhagen. Who's your pick for 2015?


© 2015, David E. Giles

Sunday, February 15, 2015

Testing for Multivariate Normality

The assumption that multivariate data are (multivariate) normally distributed is central to many statistical techniques, so testing the validity of this assumption is of paramount importance. A number of such tests are available.

A recently released R package, MVN, by Korkmaz et al. (2014) brings together several of these procedures in a friendly and accessible way. Included are the tests proposed by Mardia, Henze-Zirkler, and Royston, as well as a number of useful graphical procedures.
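
If you're curious about what such tests actually compute, here's a bare-bones sketch of Mardia's skewness and kurtosis statistics in base R. This is just for intuition - in practice use the package itself, whose function names and interface may differ across versions:

```r
# Bare-bones version of Mardia's multivariate normality tests.
# (Illustrative only -- the MVN package provides polished versions,
# plus the Henze-Zirkler and Royston tests and graphical diagnostics.)
mardia <- function(X) {
  X  <- as.matrix(X)
  n  <- nrow(X); p <- ncol(X)
  Xc <- scale(X, center = TRUE, scale = FALSE)  # centred data
  S  <- crossprod(Xc) / n                       # ML estimate of the covariance matrix
  D  <- Xc %*% solve(S, t(Xc))                  # D[i,j] = (xi - xbar)' S^(-1) (xj - xbar)
  b1 <- sum(D^3) / n^2                          # multivariate skewness
  b2 <- sum(diag(D)^2) / n                      # multivariate kurtosis
  skew <- n * b1 / 6                            # asy. chi-square, df = p(p+1)(p+2)/6
  kurt <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)  # asy. N(0,1)
  c(skew.p.value = pchisq(skew, p * (p + 1) * (p + 2) / 6, lower.tail = FALSE),
    kurt.p.value = 2 * pnorm(abs(kurt), lower.tail = FALSE))
}

mardia(iris[, 1:4])  # small p-values signal departures from multivariate normality
```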

If for some inexplicable reason you're not a user of R, the authors have thoughtfully created a web-based application just for you!


Reference

Korkmaz, S., D. Goksuluk, and G. Zararsiz, 2014. MVN: An R package for assessing multivariate normality. The R Journal, 6/2, 151-162.


© 2015, David E. Giles

New Canadian Provincial Data

I was delighted to see the release, last week, of a new Statistics Canada research paper, "Provincial Convergence and Divergence in Canada, 1926 to 2011", co-authored by W. Mark Brown and (UVic grad.) Ryan McDonald.

It's a very interesting paper that makes use of some equally interesting new data that Statistics Canada has released recently. For once, Statistics Canada has gone to considerable effort to assemble a really useful long-run data-set (see Cansim Table 384-5000). This is a far cry from the myriad of "Series Discontinued" flags that we're used to seeing in the Cansim database, and it's great to see that Ryan has been instrumental in its development (McDonald, 2015).

As he notes, "More advanced statistical methods, and models with greater breadth and depth, are difficult to apply to existing fragmented Canadian data. The longitudinal nature of the new provincial dataset remedies this shortcoming."

Data is an econometrician's life-blood. We need to see more of this from Statistics Canada.


Reference

McDonald, R., 2015. Constructing provincial time series: A discussion of data sources and methods. Income and Expenditure Accounts Technical Series, No. 77, Statistics Canada Catalogue no. 13-604-M. Ottawa: Statistics Canada.


© 2015, David E. Giles

Wednesday, February 4, 2015

Four Different Types of Regression Residuals

When we estimate a regression model, the differences between the actual and "predicted" values for the dependent variable (over the sample) are termed the "residuals". Specifically, if the model is of the form:

                     y = Xβ + ε ,                                                         (1)

and the OLS estimator of β is b, then the vector of residuals is

                    e = y - Xb .                                                           (2)

Any econometrics student will be totally familiar with this.

The elements of e (the n residuals) are extremely important statistics. We use them, of course, to construct other statistics - e.g., test statistics for checking the validity of the underlying assumptions of our regression model. For instance, are the errors (the elements of the ε vector) serially independent? Are they homoskedastic? Are they normally distributed?

What a lot of students don't learn is that these residuals - let's call them "Ordinary Residuals" - are just one of several types of residuals used when analysing a regression model. Let's take a look at this.
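
As a foretaste, the rescaled variants divide each element of e by an estimate of its own standard deviation, which involves the corresponding diagonal element of the "hat" matrix, X(X'X)-1X'. Base R extracts several of these variants directly; here's a quick sketch, where the model and data are arbitrary and purely illustrative:

```r
# Several distinct "residuals" coexist for one and the same fit.
# (The model and data here are arbitrary, purely for illustration.)
fit <- lm(dist ~ speed, data = cars)

e_ord  <- residuals(fit)  # ordinary residuals: e = y - Xb
e_std  <- rstandard(fit)  # internally studentized ("standardized") residuals
e_stud <- rstudent(fit)   # externally studentized residuals

head(cbind(ordinary = e_ord, standardized = e_std, studentized = e_stud))
```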

Saturday, January 31, 2015

Some Suggested Reading

  • Bachoc, F., H. Leeb, and B. M. Potscher, 2014. Valid confidence intervals for post-model-selection predictors. Working Paper, Department of Statistics, University of Vienna.
  • Baumeister, C. and J. D. Hamilton, 2014. Sign restrictions, structural vector autoregressions, and useful prior information. NBER Working Paper No. 20741.
  • Bjerkholt, O., 2015. Fellowship elections in the Econometric Society 1933-1948. Working Paper, Department of Economics, University of Oslo.
  • Deuchert, E. and M. Huber, 2014. A cautionary tale about control variables in IV estimation. Discussion Paper No. 2014-39, School of Economics and Political Science, University of St. Gallen.
  • Doornik, J. A. and D. F. Hendry, 2014. Statistical model selection with 'Big Data'. Discussion Paper 735, Department of Economics, University of Oxford.
  • Duvendack, M., R. W. Palmer-Jones, and W. R. Reed, 2014. Replications in economics: A progress report. Working Paper No. 26/2014, Department of Economics and Finance, University of Canterbury.

© 2015, David E. Giles

Saturday, January 24, 2015

Extreme Value Modelling in Stata

David Roodman wrote to me today, saying:
"I don’t know if you use Stata, but I’ve just released a Stata package for extreme value theory. It is strongly influenced by Coles’s book on EVT and the associated ismev package for R. Using maximum likelihood, it fits the generalized Pareto distribution and the generalized extreme value distribution, the latter including the extension to multiple order statistics. It also offers various diagnostic plots. There are already many sophisticated R packages for EVT. I suppose mine offers accessibility…and small-sample bias corrections. It can do the Cox-Snell correction for all the models, including with covariates (citing you for GPD, and promising a write-up for the rest). It also offers bias correction based on a parametric bootstrap. I’ve confirmed the efficacy of both bias corrections through simulations, for the GPD and GEV. I’m still tweaking the simulations, and they take time, but I hope to soon post some graphs based on them. The GPD results closely match yours.

Comments welcome. Please circulate to others who might be interested. To install the package in Stata, type “ssc install extreme”.  The help file contains clickable examples that reproduce most results in the Coles book. The web page is https://ideas.repec.org/c/boc/bocode/s457953.html."
The work on the Cox-Snell bias correction for the Generalized Pareto Distribution that David is referring to is Giles et al. (2015). You can find an earlier post about this work here, and you can download the paper here.
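
For those staying on the R side, the ismev workflow that David's package mirrors looks like this - a minimal sketch in which the data and threshold are placeholders, not a real application:

```r
# Minimal sketch of the corresponding R workflow with Coles's ismev
# package (placeholder data and threshold, purely illustrative).
library(ismev)

set.seed(42)
x <- rexp(5000)                    # stand-in for genuine data

u   <- quantile(x, 0.95)           # an arbitrary high threshold
gpd <- gpd.fit(x, threshold = u)   # ML fit of the generalized Pareto distribution
gpd.diag(gpd)                      # diagnostic plots, as in Coles's book

m   <- apply(matrix(x, nrow = 50), 2, max)  # 100 "block" maxima
gev <- gev.fit(m)                  # ML fit of the generalized extreme value distribution
gev.diag(gev)
```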

Update, 28/1/2015: See more at David's blog.

Reference

Giles, D. E., H. Feng, and R. T. Godwin, 2015. Bias-corrected maximum likelihood estimation of the parameters of the generalized Pareto distribution. Communications in Statistics - Theory and Methods, in press.


© 2015, David E. Giles

Sunday, January 11, 2015

Econometrics vs. Ad Hoc Empiricism

In a post in 2013, titled "Let's Put the "ECON" Back Into Microeconometrics", I complained about some of the nonsense that is passed off as "applied econometrics". Specifically, I was upset about the disconnect between the economic model (if there is one) and the empirical relationships that are actually estimated, in many "applied" papers.

I urge you to look back at the post before reading further.

Here's a passage from that post:
"In particular, how often have you been presented with an empirical application that's based on just a reduced-form model that essentially ignores the nuances of the theoretical model?
I'm not picking on applied microeconomic papers - really, I'm not! The same thing happens with some applied macroeconomics papers too. It's just that in the micro. case, there's often a much more detailed and rich theoretical model that just lends itself to some nice structural modelling. And then all we see is a regression of the logarithm of some variable on a couple of interesting covariates, and a bunch of controls - the details of which are frequently not even reported."
Well, things certainly haven't improved since I wrote that. In fact, it seems that I'm encountering more and more of this nonsense. This isn't "econometrics", and the purveyors of this rubbish aren't "econometricians". 

My real concern is that students who are exposed to these papers and seminars may not recognize it for what it is - just ad hoc empiricism. 


© 2015, David E. Giles

Friday, January 9, 2015

ARDL Modelling in EViews 9

My previous posts relating to ARDL models (here and here) have drawn a lot of hits. So, it's great to see that EViews 9 (now in Beta release - see the details here) incorporates an ARDL modelling option, together with the associated "bounds testing".

This is a great feature, and I just know that it's going to be a "winner" for EViews.

It certainly deserves a post, so here goes!

First, it's important to note that although there was previously an EViews "add-in" for ARDL models (see here and here), this was quite limited in its capabilities. What's now available is a full-blown ARDL estimation option, together with bounds testing and an analysis of the long-run relationship between the variables being modelled.
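
For readers meeting this methodology for the first time, the basic idea (in the simplest two-variable case, and in generic notation rather than EViews' own) is to re-cast the ARDL(p, q) levels model in conditional error-correction form:

                    Δy_t = β0 + Σ γ_i Δy_(t-i) + Σ δ_j Δx_(t-j) + θ1 y_(t-1) + θ2 x_(t-1) + v_t ,

where the first sum runs over i = 1, ..., (p-1), and the second over j = 0, ..., (q-1). The "bounds test" is an F-test of the hypothesis θ1 = θ2 = 0 (no long-run levels relationship), with the test statistic compared against the lower and upper critical bounds tabulated by Pesaran, Shin, and Smith (2001, Journal of Applied Econometrics). If that hypothesis is rejected, the implied long-run coefficient on x is (-θ2 / θ1).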

Here, I'll take you through another example of ARDL modelling - this one involves the relationship between the retail price of gasoline and the price of crude oil. More specifically, the crude oil price is for Canadian Par at Edmonton, and the gasoline price is for the Canadian city of Vancouver. Although crude oil prices are recorded daily, the gasoline prices are available only weekly. So, the price data that we'll use are weekly (end-of-week), for the period 4 January 2000 to 16 July 2013, inclusive.

The oil prices are measured in Canadian dollars per cubic metre. The gasoline prices are in Canadian cents per litre, and they exclude taxes. Here's a plot of the raw data:

Thursday, January 1, 2015

New Year Reading List

Happy New Year - and happy reading!
  • Arlot, S. and A. Celisse, 2010. A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40-79. (HT Rob)
  • Marsilli, C., 2014. Variable selection in predictive MIDAS models. Working Paper No. 520, Bank of France.
  • Kulaksizoglu, T., 2014. Lag order and critical values of the augmented Dickey-Fuller test: A replication. Journal of Applied Econometrics, forthcoming.
  • Mooij, J. M., J. Peters, D. Janzing, J. Zscheischler, and B. Scholkopf, 2014. Distinguishing cause from effect using observational data: Methods and benchmarks. Working Paper. (HT Roger)
  • Polak, J., M. L. King, and X. Zhang, 2014. A model validation procedure. Working Paper 21/14, Department of Econometrics and Business Statistics, Monash University.
  • Reed, W. R., 2014. Unit root tests, size distortions and cointegrated data. Working Paper 28/2014, Department of Economics and Finance, University of Canterbury. (HT Bob)


© 2015, David E. Giles