Saturday, February 28, 2015

March Reading List

Good grief! It's March already. You might enjoy:

Bajari, P., D. Nekipelov, S. P. Ryan, and M. Yang, 2015. Demand estimation with machine learning and model combination. NBER Working Paper No. 20955.

Baur, D. G. and D. T. Tran, 2014. The long-run relationship of gold and silver and the influence of bubbles and financial crises. Empirical Economics, 47, 1525-1541.

Efron, B., 2014. Estimation and accuracy after model selection. Journal of the American Statistical Association, 109, 991-1007.

Kennedy, P. E., 1995. Randomization tests in econometrics. Journal of Business and Economic Statistics, 13, 85-94.

Magnus, J. R., W. Wang, and X. Zhang, 2015. Weighted-average least squares prediction. Econometric Reviews, in press.

Osman, A. F. and M. L. King, 2015. A new approach to forecasting based on exponential smoothing with independent regressors. Working Paper 02/15, Department of Econometrics and Business Statistics, Monash University.

Perron, P. and Y. Yamamoto, 2015. Using OLS to estimate and test for structural change in models with endogenous regressors. Journal of Applied Econometrics, 30, 119-144.

© 2015, David E. Giles

Population Countdown

I was downloading data from the Statistics New Zealand website the other evening, and was alerted to the fact that an interesting event was about to occur. Here's my screen-capture of the N.Z. "Population Clock" about an hour later:





Thursday, February 19, 2015

Applied Nonparametric Econometrics

Recently, I received a copy of a new econometrics book, Applied Nonparametric Econometrics, by Daniel Henderson and Christopher Parmeter.

The title is pretty self-explanatory and, as you'd expect with any book published by CUP, this is a high-quality item.

The book's Introduction begins as follows:
"The goal of this book is to help bridge the gap between applied economists and theoretical econometricians/statisticians. The majority of empirical research in economics ignores the potential benefits of nonparametric methods and many theoretical nonparametric advances ignore the problems faced by practitioners. We do not believe that applied economists dismiss these methods because they do not like them.  We believe that they do not employ them because they do not understand how to use them or lack formal training on kernel smoothing."
The authors provide a very readable, but careful, treatment of the main topics in nonparametric econometrics, and a feature of this book is the set of empirical examples. The book's website provides the data that are used (for replication purposes), as well as a number of routines in R. The latter provide useful additions to those that are available in the np package for R (Hayfield and Racine, 2008).


Reference

Hayfield, T. and J. S. Racine, 2008. Nonparametric econometrics: The np package. Journal of Statistical Software, 27 (5), 1-32.



Wednesday, February 18, 2015

Tuesday, February 17, 2015

Non-Existent Instruments

Consider the following abstract for an econometrics paper:
"The method of instrumental variables (IV) and the generalized method of moments (GMM), and their applications to the estimation of errors-in-variables and simultaneous equations models in econometrics, require data on a sufficient number of instrumental variables that are both exogenous and relevant. We argue that, in general, such instruments (weak or strong) cannot exist." 
This is, in fact, the abstract for a recent paper by Hall et al. (2014), and when I first read it I was definitely intrigued!

Recall that when we look for instruments we need to find variables that are, on the one hand, (asymptotically) uncorrelated with the errors of our regression model; but are, on the other hand, highly correlated (asymptotically) with the random regressors. The abstract, and (of course) the paper itself, suggest that usually this objective is not achievable.

Why is this?

The difficulty arises if we view the error term in our regression equation as arising from various mis-specifications in the model. The authors argue that this interpretation is generally appropriate in econometric applications. Building on earlier work by Pratt and Schlaifer (1988), they show that in this case the error is generally a function of the very regressors that we're trying to "instrument". That being the case, legitimate instruments will be unattainable.
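To see the intuition, here's a small simulation sketch. It rests on a deliberately simple assumption of my own, not on the model in Hall et al.: the error u is a linear function of the regressor x plus independent noise. The names z, v, w and gamma are hypothetical and used only for illustration. Under that assumption, any candidate instrument z that is correlated with x is automatically correlated with u as well.

```python
import random

# Illustrative simulation only: the data-generating process below is an
# assumption made for this sketch, not the model used by Hall et al.
random.seed(12345)
n = 20000
gamma = 0.5  # hypothetical strength of the link from the regressor to the error


def sample_cov(a, b):
    """Sample covariance (divisor n) of two equal-length lists."""
    m = len(a)
    a_bar = sum(a) / m
    b_bar = sum(b) / m
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / m


z = [random.gauss(0.0, 1.0) for _ in range(n)]  # candidate instrument
v = [random.gauss(0.0, 1.0) for _ in range(n)]  # other variation in x
w = [random.gauss(0.0, 1.0) for _ in range(n)]  # noise unrelated to z

x = [zi + vi for zi, vi in zip(z, v)]           # regressor: z is "relevant"
u = [gamma * xi + wi for xi, wi in zip(x, w)]   # error depends on the regressor

cov_zx = sample_cov(z, x)  # relevance: well away from zero
cov_zu = sample_cov(z, u)  # exogeneity would need this to be near zero

print(cov_zx, cov_zu)
```

In this setup the covariance between z and u is approximately gamma times the covariance between z and x, so the more "relevant" the instrument is, the further it is from being exogenous.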

Food for thought!   


References

Hall, S. G., P. A. V. B. Swamy, and G. S. Tavlas, 2014. On the interpretation of instrumental variables in the presence of specification errors. Working Paper 14/19, Department of Economics, University of Leicester.

Pratt, J. W. and R. Schlaifer, 1988. On the interpretation and observation of laws. Journal of Econometrics, 39, 23-52.



Monday, February 16, 2015

The Econometric Game, 2015

If you're a grad. student with an interest in econometrics, you've probably heard about The Econometric Game. It's been covered before on this blog (e.g., here, last year).

The 2015 Econometric Game is the sixteenth in the series, and it will take place at the University of Amsterdam, between 31 March and 2 April this year. You can find the list of participating universities here.

Last year's winner was the University of Copenhagen. Who's your pick for 2015?



Sunday, February 15, 2015

Testing for Multivariate Normality

The assumption that multivariate data are (multivariate) normally distributed is central to many statistical techniques. Testing the validity of this assumption is therefore of paramount importance, and a number of tests are available.

A recently released R package, MVN, by Korkmaz et al. (2014) brings together several of these procedures in a friendly and accessible way. Included are the tests proposed by Mardia, Henze-Zirkler, and Royston, as well as a number of useful graphical procedures.

If for some inexplicable reason you're not a user of R, the authors have thoughtfully created a web-based application just for you!
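For readers who want to see what one of these tests involves, here's a minimal sketch of Mardia's multivariate skewness and kurtosis statistics, written in plain Python and restricted to the bivariate case to keep the matrix algebra short. This is not the MVN package's code or interface; the function name mardia_bivariate and the toy data are my own, and the p-values rely on the usual asymptotic approximations.

```python
import math


def mardia_bivariate(data):
    """Mardia's skewness and kurtosis statistics for a list of (x1, x2) pairs."""
    n, p = len(data), 2
    m1 = sum(a for a, _ in data) / n
    m2 = sum(b for _, b in data) / n
    c = [(a - m1, b - m2) for a, b in data]  # centred observations

    # Covariance matrix with divisor n, and its inverse (2 x 2 case).
    s11 = sum(a * a for a, _ in c) / n
    s22 = sum(b * b for _, b in c) / n
    s12 = sum(a * b for a, b in c) / n
    det = s11 * s22 - s12 * s12
    i11, i22, i12 = s22 / det, s11 / det, -s12 / det

    def d(u, v):
        # Quadratic form u' S^{-1} v
        return u[0] * (i11 * v[0] + i12 * v[1]) + u[1] * (i12 * v[0] + i22 * v[1])

    b1 = sum(d(u, v) ** 3 for u in c for v in c) / n ** 2  # skewness measure
    b2 = sum(d(u, u) ** 2 for u in c) / n                  # kurtosis measure

    # Asymptotic test statistics under multivariate normality.
    skew_stat = n * b1 / 6.0  # approx. chi-square with p(p+1)(p+2)/6 = 4 d.f.
    kurt_stat = (b2 - p * (p + 2)) / math.sqrt(8.0 * p * (p + 2) / n)  # approx. N(0, 1)

    # p-values: the chi-square(4) survival function has the closed form
    # exp(-x/2) * (1 + x/2); the two-sided normal p-value uses erfc.
    skew_p = math.exp(-skew_stat / 2.0) * (1.0 + skew_stat / 2.0)
    kurt_p = math.erfc(abs(kurt_stat) / math.sqrt(2.0))
    return {"b1": b1, "b2": b2, "skew_stat": skew_stat, "skew_p": skew_p,
            "kurt_stat": kurt_stat, "kurt_p": kurt_p}


# Made-up data, symmetric about the origin, so the skewness measure is zero.
toy = [(1.0, 2.0), (-1.0, -2.0), (2.0, -1.0), (-2.0, 1.0)]
result = mardia_bivariate(toy)
print(result)
```

With only four observations the asymptotic p-values mean nothing, of course; the toy data are there just to check the arithmetic.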


Reference

Korkmaz, S., D. Goksuluk, and G. Zararsiz, 2014. An R package for assessing multivariate normality. The R Journal, 6/2, 151-162.



New Canadian Provincial Data

I was delighted to see the release, last week, of a new Statistics Canada research paper, "Provincial Convergence and Divergence in Canada, 1926 to 2011", co-authored by W. Mark Brown and (UVic grad.) Ryan McDonald.

It's a very interesting paper that makes use of some equally interesting new data that Statistics Canada has released recently. For once, Statistics Canada has gone to considerable efforts to assemble a really useful long-run data-set (see Cansim Table 384-5000). This is a far cry from the myriad of "Series Discontinued" flags that we're used to seeing in the Cansim database, and it's great to see that Ryan has been instrumental in its development (McDonald, 2015).

As he notes, "More advanced statistical methods, and models with greater breadth and depth, are difficult to apply to existing fragmented Canadian data. The longitudinal nature of the new provincial dataset remedies this shortcoming."

Data is an econometrician's life-blood. We need to see more of this from Statistics Canada.


Reference

McDonald, R., 2015. Constructing Provincial Time Series: A Discussion of Data Sources and Methods. Income and Expenditure Accounts Technical Series, no. 77. Statistics Canada Catalogue no. 13-604-M. Ottawa: Statistics Canada.



Wednesday, February 4, 2015

Four Different Types of Regression Residuals

When we estimate a regression model, the differences between the actual and "predicted" values for the dependent variable (over the sample) are termed the "residuals". Specifically, if the model is of the form:

                     y = Xβ + ε ,                                                         (1)

and the OLS estimator of β is b, then the vector of residuals is

                    e = y - Xb .                                                           (2)

Any econometrics student will be totally familiar with this.
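As a quick refresher, here's a minimal sketch of equations (1) and (2) for the simplest case: one regressor plus an intercept, so that X has a column of ones and a column of x values. The function names and the data are made up purely for illustration.

```python
# Minimal sketch: OLS residuals for a simple regression with an intercept.
# The data below are made up purely for illustration.


def ols_simple(x, y):
    """Return (intercept, slope) from regressing y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ordinary_residuals(x, y):
    """Return e = y - Xb, where X = [1, x] and b is the OLS estimate."""
    b0, b1 = ols_simple(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
e = ordinary_residuals(x, y)

# With an intercept in the model, the residuals sum to zero and are
# orthogonal to the regressor (up to rounding error).
print(sum(e), sum(ei * xi for ei, xi in zip(e, x)))
```

The two printed values are the sample counterparts of the conditions X'e = 0 that define the OLS estimator.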

The elements of e (the n residuals) are extremely important statistics. We use them, of course, to construct other statistics - e.g., test statistics to be used for testing the validity of the underlying assumptions associated with our regression model. For instance, we want to check whether the errors (the elements of the ε vector) are serially independent, whether they are homoskedastic, whether they are normally distributed, etc.

What a lot of students don't learn is that these residuals - let's call them "Ordinary Residuals" - are just one type of residual used when analysing the regression model. Let's take a look at this.