Econometrics Beat: Dave Giles' Blog

Thursday, October 31, 2019

It's Time to Go

When I released my first post on this blog on 20 February 2011, I really wasn't sure what to expect! After all, I was aiming to reach a somewhat niche audience.

Well, 949 posts and 7.4 million page-hits later, this blog has greatly exceeded my wildest expectations.

However, I'm now retired and I turned 70 three months ago. I've decided to call it quits, and this is my final post.

I'd rather make a definite decision about this than have the blog just fizzle into nothingness.

For now, the Econometrics Beat blog will remain visible, but it will be closed for further comments and questions.

I've had a lot of fun and learned a great deal through this blog. I owe a debt of gratitude to all of you who've followed my posts, made suggestions, asked questions, made helpful comments, and drawn errors to my attention.

I just hope that it's been as positive an experience for you as it has been for me.

Thank you - and enjoy your Econometrics!

Wednesday, October 30, 2019

Everything's Significant When You Have Lots of Data

Well........, not really!

It might seem that way on the face of it, but that's because you're probably using a totally inappropriate measure of what's (statistically) significant, and what's not.

I talked a bit about this issue in a previous post, where I said:

"Granger (1998, 2003) has reminded us that if the sample size is sufficiently large, then it's virtually impossible not to reject almost any hypothesis. So, if the sample is very large and the p-values associated with the estimated coefficients in a regression model are of the order of, say, 0.10 or even 0.05, then this is really bad news. Much, much, smaller p-values are needed before we get all excited about 'statistically significant' results when the sample size is in the thousands, or even bigger."

This general point, namely that our chosen significance level should be decreased as the sample size grows, is pretty well understood by most statisticians and econometricians. (For example, see Good, 1982.) However, it's usually ignored by the authors of empirical economics studies based on samples of thousands (or more) observations. Moreover, a lot of practitioners seem to be unsure of just how much they should revise their significance levels (or re-interpret their p-values) in such circumstances.

There's really no excuse for this, because there are some well-established guidelines to help us. In fact, as we'll see, some of them have been around since at least the 1970's.

Let's take a quick look at this, because it's something that all students need to be made aware of as we work more and more with "big data". Students certainly won't gain this awareness by looking at the interpretation of the results in the vast majority of empirical economics papers that use even sort-of-large samples!
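To make the point concrete, here is a minimal sketch in Python. It is only an illustration, not the guideline developed in the full post. It assumes a hypothetical standardised effect of 0.02 and a regression with k = 2 coefficients, and it uses the sample-size-dependent critical value for an F-test that is usually attributed to Leamer (1978), namely ((n - k)/p)(n^(p/n) - 1) for p restrictions. The function names are mine. With one restriction, the square root of that value gives a critical |t| that grows roughly like the square root of log(n).

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def leamer_critical_f(n, k, p=1):
    """Sample-size-dependent critical value for an F-test of p restrictions
    in a regression with n observations and k coefficients:
    ((n - k) / p) * (n**(p / n) - 1).
    expm1 is used so that the subtraction stays accurate for large n."""
    return (n - k) / p * math.expm1((p / n) * math.log(n))


effect = 0.02  # hypothetical standardised effect (an assumption for illustration)
k = 2          # hypothetical number of regression coefficients

rows = []
for n in (100, 10_000, 1_000_000):
    z = effect * math.sqrt(n)                  # the test statistic grows with sqrt(n)
    crit_t = math.sqrt(leamer_critical_f(n, k))  # critical |t| for one restriction
    rows.append((n, z, two_sided_p(z), crit_t))
    print(f"n={n:>9,}  z={z:6.2f}  p={two_sided_p(z):.3g}  "
          f"fixed 5% critical value=1.96  size-adjusted critical value={crit_t:.2f}")
```

The same tiny effect moves from "insignificant" to "overwhelmingly significant" purely because n grows. With ten thousand observations, a statistic of about 2 clears the conventional 1.96 hurdle, but not a hurdle that rises with the sample size.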

Reporting an R-Squared Measure for Count Data Models

This post was prompted by an email query that I received some time ago from a reader of this blog. I thought that a more "expansive" response might be of interest to other readers............

In spite of its many limitations, it's standard practice to include the value of the coefficient of determination (R²) - or its "adjusted" counterpart - when reporting the results of a least squares regression. Personally, I think that R² is one of the least important statistics to include in our results, but we all do it. (See this previous post.)

If the regression model in question is linear (in the parameters) and includes an intercept, and if the parameters are estimated by Ordinary Least Squares (OLS), then R² has a number of well-known properties. These include:

0 ≤ R² ≤ 1.
The value of R² cannot decrease if we add regressors to the model.
The value of R² is the same, whether we define this measure as the ratio of the "explained sum of squares" to the "total sum of squares" (R_E²); or as one minus the ratio of the "residual sum of squares" to the "total sum of squares" (R_R²).
There is a correspondence between R² and a significance test on all slope parameters; and there is a correspondence between changes in (the adjusted) R² as regressors are added, and significance tests on the added regressors' coefficients. (See here and here.)
R² has an interpretation in terms of information content of the data.
R² is the square of the (Pearson) correlation (R_C²) between actual and "fitted" values of the model's dependent variable.

However, as soon as we're dealing with a model that excludes an intercept or is non-linear in the parameters, or we use an estimator other than OLS, none of the above properties are guaranteed.
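A small numerical check may help here. The following is a minimal sketch in pure Python, using a made-up five-observation data set and helper names of my own choosing. It fits a simple regression by least squares, once with an intercept and once without, and computes the three versions of R² defined above (R_E², R_R², and R_C²).

```python
from statistics import mean


def ols_fitted(x, y, intercept=True):
    """Fitted values from a one-regressor least squares fit."""
    if intercept:
        xbar, ybar = mean(x), mean(y)
        b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
        a = ybar - b * xbar
    else:
        a = 0.0
        b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    return [a + b * xi for xi in x]


def r2_measures(y, fitted):
    """The three R-squared definitions, all relative to the mean of y."""
    ybar, fbar = mean(y), mean(fitted)
    tss = sum((yi - ybar) ** 2 for yi in y)                     # total SS
    ess = sum((fi - ybar) ** 2 for fi in fitted)                # explained SS
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))      # residual SS
    cov = sum((yi - ybar) * (fi - fbar) for yi, fi in zip(y, fitted))
    var_f = sum((fi - fbar) ** 2 for fi in fitted)
    return {
        "R_E2": ess / tss,
        "R_R2": 1.0 - rss / tss,
        "R_C2": cov ** 2 / (tss * var_f),
    }


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 5.9, 7.2, 7.8, 9.1]

with_intercept = r2_measures(y, ols_fitted(x, y, intercept=True))
no_intercept = r2_measures(y, ols_fitted(x, y, intercept=False))

print("with intercept:   ", with_intercept)
print("without intercept:", no_intercept)
```

With the intercept included, the three measures agree up to rounding error. Once the intercept is dropped they differ from one another, and for these particular data R_E² exceeds one while R_R² is negative, so even the first property in the list fails.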

October Reading

Here's my latest, and final, list of suggested reading:

Bellego, C. and L-D. Pape, 2019. Dealing with the log of zero in regression models. CREST Working Paper No. 2019-13.
Castle, J. L., J. A. Doornik, and D. F. Hendry, 2018. Selecting a model for forecasting. Department of Economics, University of Oxford, Discussion Paper 861.
Gorajek, A., 2019. The well-meaning economist. Reserve Bank of Australia, Research Discussion Paper RDP 2019-08.
Güriş, B., 2019. A new nonlinear unit root test with Fourier function. Communications in Statistics - Simulation and Computation, 48, 3056-3062.
Maudlin, T., 2019. The why of the world. Review of The Book of Why: The New Science of Cause and Effect, by J. Pearl and D. Mackenzie. Boston Review.
Qian, W., C. A. Rolling, G. Cheng, and Y. Yang, 2019. On the forecast combination puzzle. Econometrics, 7, 39.

Sunday, September 1, 2019

Back to School Reading

Here we are - it's Labo(u)r Day weekend already in North America, and we all know what that means! It's back to school time.

You'll need a reading list, so here are some suggestions:

Franses, Ph. H. B. F., 2019. Professional forecasters and January. Econometric Institute Research Papers EI2019-25, Erasmus University Rotterdam.
Harvey, A. & R. Ito, 2019. Modeling time series when some observations are zero. Journal of Econometrics, in press.
Leamer, E. E., 1978. Specification Searches: Ad Hoc Inference With Nonexperimental Data. Wiley, New York. (This is a legitimate free download.)
MacKinnon, J. G., 2019. How cluster-robust inference is changing applied econometrics. Working Paper 1413, Economics Department, Queen's University.
Steel, M. F. J., 2019. Model averaging and its use in economics. Mimeo., Department of Statistics, University of Warwick.
Stigler, S. M., 1981. Gauss and the invention of least squares. Annals of Statistics, 9, 465-474.

Tuesday, August 20, 2019

Book Series on "Statistical Reasoning in Science & Society"

Back in early 2016, the American Statistical Association (ASA) made an announcement in its newsletter, Amstat News, about the introduction of an important new series of books. In part, that announcement said:

"The American Statistical Association recently partnered with Chapman & Hall/CRC Press to launch a book series called the ASA-CRC Series on Statistical Reasoning in Science and Society.

'The ASA is very enthusiastic about this new series,' said 2015 ASA President David Morganstein, under whose leadership the arrangement was made. 'Our strategic plan includes increasing the visibility of our profession. One way to do that is with books that are readable, exciting, and serve a broad audience having a minimal background in mathematics or statistics.'

The Chapman & Hall/CRC press release states the book series will do the following:

Highlight the important role of statistical and probabilistic reasoning in many areas

Require minimal background in mathematics and statistics

Serve a broad audience, including professionals across many fields, the general public, and students in high schools and colleges

Cover statistics in wide-ranging aspects of professional and everyday life, including the media, science, health, society, politics, law, education, sports, finance, climate, and national security

Feature short, inexpensive books of 100–150 pages that can be written and read in a reasonable amount of time."

Seven titles have now been published in this series -

Measuring Society, by Chaitra H. Nagaraja (2019)
Measuring Crime: Behind the Statistics, by Sharon L. Lohr (2019)
Statistics and Health Care Fraud: How to Save Billions, by Tahir Ekin (2019)
Improving Your NCAA® Bracket with Statistics, by Tom Adams (2018)
Data Visualization: Charts, Maps, and Interactive Graphics, by Robert Grant (2018)
Visualizing Baseball, by Jim Albert (2017)
Errors, Blunders, and Lies: How to Tell the Difference, by David S. Salsburg (2017)

Readers of this blog should be especially interested in Chaitra Nagaraja's recently published addition to this series. Chaitra devotes chapters in her book to the topics of Jobs, Inequality, Housing, Prices, Poverty, and Deprivation. I particularly like the historical perspective that Chaitra provides in this very readable contribution, and I recommend her book to you (and your non-economist friends).

Wednesday, August 14, 2019

Check out What Happened at the 2019 Joint Statistical Meetings

Each year, the Joint Statistical Meetings (JSM) bring together thousands (6,500 this year) of statisticians at what's the largest gathering of its type in the world. The JSM represent eleven international statistics organisations, including the four founding organisations - The American Statistical Association (ASA), The International Biometric Society, The Institute of Mathematical Statistics, and The Statistical Society of Canada.

As a member of the ASA since 1973 I've attended a few of these meetings over the years, but unfortunately I didn't make it to the JSM in Denver at the end of last month. As always, the program was amazing.

Yesterday, the ASA released a searchable version of the 2019 program that contains downloadable files of the slides used by many of the speakers. You can find that version of the program here. When you go through the program, look for presentations that have a blue (rectangular) "Presentation" button. Papers in sessions sponsored by the Business and Economic Statistics section of the ASA may be of special interest to you - but there's lots to choose from!

Tuesday, August 6, 2019

Including More History in Your Econometrics Teaching

If you follow this blog (or if you look at the "History of Econometrics" label in the word cloud in the right side-bar), you'll know that I have more than a passing interest in the history of our discipline. There's so much to be learned from this history. Among other things, we can gain insights into why certain methods became popular, and we can reduce the risk of repeating earlier mistakes!

When I was teaching I liked to inject a few historical facts/anecdotes/curiosities into my classes. I think that this brought the subject matter to life a little. The names behind the various theorems, tests, and estimators are those of real people, after all.

There are some excellent books on the history of econometrics, including those by Epstein (1987), Morgan (1990), and De Marchi and Gilbert (1990). (Also, see the short piece by Stephen Pollock, 2014.)

However, I think that we could do more in terms of making material about this history accessible to our students.

The Statistics community has gone much further in this direction, and we might take note of this.

The other day, Amanda Golbeck posted some very helpful links on the American Statistical Association's "History of Statistics Interest Group" community noticeboard.

Here's her posting in its entirety - and don't miss the first of her links:

"Why not include more history in your teaching? The History of Statistics Interest Group library has a collection of Activities for Classes: community.amstat.org/historyofstats/ourlibrary/...

We are pleased to let you know that Bob Rosenfeld has created 13 history of probability and statistics teaching modules, and he has kindly made them available for you to use in your classes! We hope you will find them to be useful.

Reading and Exercises on the History of Probability from the Vermont Mathematics Initiative, Bob Rosenfeld

Pre-history to 1600 (PDF)
17th Century France (PDF)
Jacob Bernoulli - Law of Large Numbers (PDF)
Inverse Probability - Thomas Bayes (PDF)
Laplace (PDF)

Reading and Exercises on the History of Statistics from the Vermont Mathematics Initiative, Bob Rosenfeld

John Graunt and the Bills of Mortality (PDF)
Origin of the Normal Curve (PDF)
Origins of Graphs in Statistics (PDF)
Fitting models to data - the Path to Least Squares (PDF)
Statistics Moves from Physical to Social Sciences (PDF)
Correlation - Francis Galton (PDF)
t-Distribution and Gosset (PDF)
Fisher and Design of Experiments (PDF)"

(Bob Rosenfeld is a former Co-Director for Statistics and School-Based Research at the Vermont Mathematics Initiative, and the author of a number of books on the teaching of statistics to K-8 students. D.G.)

Most of Bob Rosenfeld's pieces are directly relevant to econometrics students. It would be nice to see more material about the history of our discipline that could be incorporated into introductory econometrics courses.

References

De Marchi, N. & C. Gilbert, 1990. History and Methodology of Econometrics. Oxford University Press, Oxford.

Epstein, R. J., 1987. A History of Econometrics. North-Holland, Amsterdam.

Morgan, M. S., 1990. The History of Econometric Ideas. Cambridge University Press, Cambridge.

Pollock, D. S. G., 2014. Econometrics - An historical guide for the uninitiated. Working Paper No. 14/05, Department of Economics, University of Leicester.

Friday, August 2, 2019

Sunday, July 28, 2019

AAEA Meeting, 2019

The Agricultural and Applied Economics Association (AAEA) recently held its annual meeting in Atlanta, GA. You can find the extensive program here.

This year, I was fortunate enough to be able to attend and participate.

This was thanks to the kind invitation of Marc Bellemare, a member of the Executive Board of the AAEA, and (of course) a blogger whom many of you no doubt follow. (If you don't, then you should!)

Marc arranged a session in which he and I talked about the pros and cons of The Cookbook Approach to Teaching Econometrics. The session was well attended, and the bulk of the time was devoted to a very helpful discussion-question-answer period with the audience.

As you'll know from some of my previous posts (e.g., here and here), I'm not a big fan of The Cookbook Approach - at least, not if it's the primary/sole way of teaching econometrics. Marc made the point that there's a place for this approach if it's adopted after more formal courses in econometrics. I'm in agreement with that.

I put together a few background talking-point slides for my short presentation. For what they're worth, you'll find them here.

I really enjoyed my time at the AAEA meeting, and I learned a lot. Thanks, Marc, and thank you to the participants!