Wednesday, August 31, 2011

Beware of Econometricians Bearing Spreadsheets

"Let's not kid ourselves: the most widely used piece of
software for statistics is Excel"
(B. D. Ripley, RSS Conference, 2002)

What a sad state of affairs! Sad, but true when you think of all of the number crunching going on in those corporate towers.

With the billions of dollars that are at stake when some of those spreadsheets are being used by the uninitiated, you'd think (and hope) that the calculations are squeaky clean in terms of reliability. Unfortunately, you'd be wrong!

A huge number of reputable studies over the years - ranging from McCullough (1998, 1999) to the special section in Computational Statistics & Data Analysis in 2008 - have documented numerical inaccuracies in various releases of some widely used spreadsheets. With reputations at stake, and the potential for litigation, you'd again think (and hope) that by now the purveyors of such software would be on the ball. Not so, it seems!
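To see the kind of fragility those studies document, here's a small, self-contained illustration in Python (purely illustrative - it shows the generic numerical issue, not what any particular spreadsheet release does today). The one-pass "calculator" formula for the sample variance subtracts two huge, nearly equal numbers and gets shredded by rounding error; the two-pass formula, which subtracts the mean first, does not:

    # One-pass ("calculator") sample variance versus the numerically stable two-pass version.
    x = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]   # the true sample variance is exactly 30
    n = len(x)
    mean = sum(x) / n

    one_pass = (sum(v * v for v in x) - n * mean ** 2) / (n - 1)   # catastrophic cancellation
    two_pass = sum((v - mean) ** 2 for v in x) / (n - 1)           # subtract the mean first

    print("one-pass:", one_pass)   # badly wrong in double precision (it can even come out negative)
    print("two-pass:", two_pass)   # 30.0

Nothing here is specific to spreadsheets, of course - but this is exactly the sort of test that well-designed statistical software is expected to pass, and that the studies above found wanting.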

Tuesday, August 30, 2011

An Overly Confident (Future) Nobel Laureate

For some reason, students often have trouble interpreting confidence intervals correctly. Suppose they're presented with an OLS estimate of 1.1 for a regression coefficient, and an associated 95% confidence interval of [0.9,1.3]. Unfortunately, you sometimes see interpretations along the following lines: 
  • There's a 95% probability that the true value of the regression coefficient lies in the interval [0.9,1.3].
  • This interval includes the true value of the regression coefficient 95% of the time.

So, what's wrong with these statements?
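Before answering, it helps to be clear about what's random and what's fixed. Here's a minimal simulation sketch in Python (the true slope of 1.1, the intercept, the regressor design and the error distribution are all made up for illustration) that re-estimates the same regression on 10,000 fresh samples and builds a 95% interval for the slope each time:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(123)
    n, reps = 50, 10_000
    beta0, beta1 = 2.0, 1.1                 # hypothetical "true" coefficients
    x = rng.uniform(0.0, 10.0, size=n)      # keep the regressor fixed across replications
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    tcrit = stats.t.ppf(0.975, df=n - 2)

    covered = 0
    for _ in range(reps):
        y = beta0 + beta1 * x + rng.normal(size=n)
        b = XtX_inv @ X.T @ y                              # OLS estimates
        s2 = np.sum((y - X @ b) ** 2) / (n - 2)
        se = np.sqrt(s2 * XtX_inv[1, 1])
        lo, hi = b[1] - tcrit * se, b[1] + tcrit * se      # the interval is the random object
        covered += (lo <= beta1 <= hi)                     # the coefficient is fixed

    print(f"Fraction of intervals covering the true slope: {covered / reps:.3f}")

The printed fraction will be close to 0.95 - and noticing exactly what that 0.95 attaches to (the procedure that generates the intervals, not any single realized interval such as [0.9, 1.3]) is a big hint about where the statements above go astray.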

Monday, August 29, 2011

Missing Keys and Econometrics

There couldn't possibly be any connection between conducting econometric analysis and looking for your lost keys, could there? Or, maybe there could!

Jeff Racine (McMaster U.) put a nice little piece up on his web page at the start of this month. It's titled Find Your Keys Yet?, and has the sub-title "Some Thoughts on Parametric Model Misspecification". Jeff rightly points out some of the difficulties associated with the concept of "the true model" in econometrics, and the importance of specification testing in the games we play.

BTW, this ties in with "Darren's" comments on my earlier post, Cookbook Econometrics.

Students of econometrics - please read Jeff's piece. Teachers of econometrics - ditto!



© 2011, David E. Giles

Saturday, August 27, 2011

Levelling the Learning Paying Field

Whenever it comes time to assign a textbook for a course, I get the jitters. It's the price tag that always gets to me! And if it gets to me, then surely it must result in gasps of disbelief from the students (and parents) who are affected by my choices.

Often, I can (and do) make sure that the one text I assign can be used for two back-to-back courses. Hopefully, that helps a bit.

However, the cost of textbooks can still be a sizeable burden. Then, when students go to re-sell their texts the following year, they discover that those pesky publishing houses have churned out new editions! Guess what that does to the re-sale value of last year's purchase?

Playing fields (or paying fields in this case) would be level if the world were flat. Right? Right! Ideally, flat and at a height of zero. Zero dollars! That's exactly what Flat World Knowledge is all about.

Thursday, August 25, 2011

Reproducible Econometric Research

I doubt if anyone would deny the importance of being able to reproduce one's own econometric results. More importantly, other researchers should be able to reproduce our results: (a) to verify that we've done what we said we did; (b) to investigate the sensitivity of our results to the various choices we made (e.g., the functional form of our model, the choice of sample period, etc.); and (c) to satisfy themselves that they understand our analysis.

However, if you've ever tried to literally reproduce someone else's econometric results, you'll know that it's not always that easy to do so - even if they supply you with their data-set. You really need to have their code (R, EViews, STATA, Gauss) as well. That's why I include both Data and Code pages with this blog.
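Supplying the code helps most when the script records everything needed to re-run it: the data it used, the results it produced, and the environment it ran in. Purely as a sketch of the idea (Python here, with made-up file names and a made-up "analysis", but the same applies to an R or EViews program):

    import json
    import platform
    import sys

    import numpy as np

    rng = np.random.default_rng(20110825)            # fixed seed: any simulated results are repeatable
    x = rng.normal(loc=5.0, scale=2.0, size=200)     # stand-in for the data actually used in the paper
    results = {"mean": float(x.mean()), "sd": float(x.std(ddof=1)), "n": int(x.size)}

    np.savetxt("data_used.csv", x, delimiter=",")    # ship the data ...
    with open("results.json", "w") as f:             # ... the numbers reported in the paper ...
        json.dump(results, f, indent=2)
    with open("environment.txt", "w") as f:          # ... and the environment that produced them
        f.write(f"Python {sys.version.split()[0]}, NumPy {np.__version__}, {platform.platform()}\n")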

Wednesday, August 24, 2011

MoneyScience

MoneyScience - which describes itself as "the community resource for Quantitative Finance, Risk Management and Technology Practitioners, Vendors and Academics" - has recently released version 3 of its site.

There's a great deal of interesting work going on in "Financial Econometrics", and this is one site that provides really good content and excellent networking facilities that will help keep econometricians up to speed with what is going on in the finance community at large.

I'm pleased to be feeding this blog to MoneyScience (here), and you'll notice a new MoneyScience icon near the bottom of the right side-bar.


© 2011, David E. Giles

Innovations in Editing

I've posted in the past (here and here) about some of my experiences as an academic journal Editor. It has its ups and downs, for sure, but ultimately it's a rewarding job.

Preston McAfee (Yahoo! & Caltech) is currently the Editor of Economic Inquiry, where he's introduced some important innovations into the editorial process. He was formerly Co-Editor of American Economic Review. As well as being a highly respected economist, Preston has a wonderful way with words.

His thoughts on journal editing make excellent reading, for seasoned academics and newcomers alike. I particularly recommend Preston's piece in the American Economist last year - here.


Reference

McAfee, R. P. (2010). Edifying editing. American Economist, 55, 1-8.

© 2011, David E. Giles

Thursday, August 18, 2011

Visualizing Random p-Values

Here's a follow-up to yesterday's post on Wolfram's CDF file format. In an earlier post (here) I discussed the fact that p-values are random variables, with their own sampling distribution.

A great way of visualizing a number of the points that I made in that post is to use the CDF file for the Mathematica app. written by Ian McLeod (University of Western Ontario). You can run it/download it from here, using Wolfram's free CDF Player (here).

You'll recall that a p-value is uniformly distributed on [0 , 1] if the null hypothesis being tested is true. Using Ian's app., here is an example of what you see for the case where you're testing the null of a zero mean in a normal population, against a 2-sided alternative:


The null hypothesis is TRUE. The sample size is n = 50, and the simulation experiment involves 10,000 replications. You can see that the empirical distribution is approaching the true (uniform) sampling distribution.

When the true mean of the distribution is 2.615 (so the null hypothesis is FALSE), the sampling distribution of the (two-sided) p-value looks like this:



If you're not sure why the pictures look like this, you might want to take a look at my earlier post, "May I Show You My Collection of p-Values?".
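If you don't have the CDF Player handy, the same experiment is easy to mock up directly. Here's a minimal sketch in Python (the sample size and number of replications match the settings above; the non-zero mean under the alternative is just a hypothetical value, chosen so the second histogram isn't piled up entirely at zero):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2011)
    n, reps = 50, 10_000

    def sim_pvalues(true_mean):
        """Two-sided p-values for H0: mean = 0, from `reps` normal samples of size n."""
        return np.array([stats.ttest_1samp(rng.normal(loc=true_mean, size=n), 0.0).pvalue
                         for _ in range(reps)])

    p_null = sim_pvalues(0.0)   # H0 true: the p-values should look Uniform(0, 1)
    p_alt  = sim_pvalues(0.5)   # H0 false (hypothetical true mean of 0.5): they pile up near zero

    for label, p in (("H0 true ", p_null), ("H0 false", p_alt)):
        counts, _ = np.histogram(p, bins=10, range=(0.0, 1.0))
        print(label, counts)    # ten equal-width bins from 0 to 1

The first row of bin counts should be roughly flat - that's the Uniform (0, 1) distribution under the null - while the second row is heavily skewed towards small p-values.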



© 2011, David E. Giles

Wednesday, August 17, 2011

Interactive Statistics - Wolfram's CDF format

Many of you will be familiar with Wolfram Research, the company that delivers Mathematica, among other things. Last month, they launched their new Computable Document Format (CDF) - it's something I'm going to be using a lot in my undergraduate Economic Statistics course.

Here are a few words taken from their press release of July 21:

Monday, August 15, 2011

Themes in Econometrics

There are several things that I recall about the first course in econometrics that I took. I'd already completed a degree in pure math. and mathematical statistics, and along with a number of other students I did a one-year transition program before embarking on a Masters degree in economics. The transition program comprised all of the final-year undergrad. courses offered in economics.

As you'd guess, the learning curve was pretty steep in macro. and micro, but I had a comparative advantage when it came to linear programming and econometrics. So things balanced out - somewhat!

Thursday, August 11, 2011

That Darned Debt!

What with the recent S&P downgrading of the U.S., and the turmoil in the markets, it's difficult to watch the T.V. or read a newspaper without being bombarded with the dreaded "D-word". I was even moved to pitch in myself recently by posting a piece about the issue of units of measurement when comparing debt (a stock) with GDP (a flow).


Yesterday, Lisa Evans had a nice post titled 20 Outlandish and Informative Ways to Illustrate the U.S. National Debt. I think you'd enjoy it!

Lisa runs a site called Masters in Economics. It's devoted to assisting students who are thinking of undertaking a Masters degree in Economics as a stepping stone into a career in the discipline. Programs at this level are more widely available than you might have thought.

I'm hoping that Lisa will be able to expand her database to include the many programs at this level in Canadian schools.

Meantime, keep an eye on her site and watch for her future posts.


© 2011, David E. Giles

Wednesday, August 10, 2011

Flip a Coin - FIML or 3SLS ?

When it comes to choosing an estimator or a test in our econometric modelling, sometimes there are pros and cons that have to be weighed against each other. Occasionally we're left with the impression that the final decision may as well be based on computational convenience, or even the flip of a coin.

In fact, there's usually some sound basis for selecting one potential estimator or test over an alternative one. Let's take the case where we're estimating a structural simultaneous equations model (SEM). In this case there's a wide range of consistent estimators available to us.

There are the various "single equation" estimators, such as 2SLS or Limited Information Maximum Likelihood (LIML). These have the disadvantage of being asymptotically inefficient, in general, relative to the "full system" estimators. However, they have the advantage of usually being more robust to model mis-specification. Mis-specifying one equation in the model may result in inconsistent estimation of that equation's coefficients, but this generally won't affect the estimation of the other equations.

The two commonly used "full system" estimators are 3SLS and Full Information Maximum Likelihood (FIML). Under standard conditions, these two estimators are asymptotically equivalent when it comes to estimating the structural form of an SEM with normal errors. More specifically, they each have the same asymptotic distribution, so they are both asymptotically efficient.
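To make the "full system" idea concrete, here's a minimal sketch in Python/NumPy. The two-equation model, its coefficient values and the error covariance are entirely hypothetical. The sketch estimates each equation by 2SLS, uses the 2SLS residuals to estimate the error covariance matrix, and then does 3SLS as GLS on the stacked system with the first-stage fitted values. FIML isn't shown (it requires numerical optimization of the system likelihood), but under the conditions just described it would give asymptotically equivalent results:

    import numpy as np
    from scipy.linalg import block_diag

    rng = np.random.default_rng(42)
    n = 500

    # A hypothetical two-equation structural model (all coefficient values made up):
    #   y1 = 0.5*y2 + 1.0*x1            + u1
    #   y2 = 0.3*y1 + 1.0*x2 - 0.5*x3   + u2
    g1, b11, g2, b22, b23 = 0.5, 1.0, 0.3, 1.0, -0.5
    x1, x2, x3 = rng.normal(size=(3, n))
    U = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.4], [0.4, 1.0]], size=n)

    # Solve the structural system for (y1, y2) given the exogenous variables and errors.
    A = np.array([[1.0, -g1], [-g2, 1.0]])
    RHS = np.column_stack([b11 * x1 + U[:, 0], b22 * x2 + b23 * x3 + U[:, 1]])
    Y = RHS @ np.linalg.inv(A).T
    y1, y2 = Y[:, 0], Y[:, 1]

    X = np.column_stack([x1, x2, x3])               # all exogenous variables = the instruments
    P = X @ np.linalg.solve(X.T @ X, X.T)           # projection onto the instrument space

    Z1 = np.column_stack([y2, x1])                  # regressors in equation 1
    Z2 = np.column_stack([y1, x2, x3])              # regressors in equation 2
    Z1h, Z2h = P @ Z1, P @ Z2                       # first-stage fitted values

    # Single-equation 2SLS, one equation at a time.
    d1 = np.linalg.solve(Z1h.T @ Z1h, Z1h.T @ y1)
    d2 = np.linalg.solve(Z2h.T @ Z2h, Z2h.T @ y2)

    # 3SLS: estimate the error covariance from the 2SLS residuals, then GLS on the stacked system.
    E = np.column_stack([y1 - Z1 @ d1, y2 - Z2 @ d2])
    S = E.T @ E / n
    W = np.kron(np.linalg.inv(S), np.eye(n))        # Sigma^{-1} kron I_n
    Zh = block_diag(Z1h, Z2h)
    ys = np.concatenate([y1, y2])
    d_3sls = np.linalg.solve(Zh.T @ W @ Zh, Zh.T @ W @ ys)

    print("2SLS  eq.1:", d1, "  eq.2:", d2)
    print("3SLS      :", d_3sls)                    # first two entries are eq.1, last three eq.2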

Tuesday, August 9, 2011

Being Normal is Optional!

One of the cardinal rules of teaching is that you should never provide information that you know you're going to have to renege on in a later course. When you're teaching econometrics, I know that you can't possibly cover all of the details and nuances associated with key results when you present them at an introductory level. One of the tricks, though, is to try and present results in a way that doesn't leave the student with something that subsequently has to be "unlearned", because it's actually wrong.

If you're scratching your head, and wondering who on earth would be so silly as to teach something that has to be "unlearned", let me give you a really good example. You'll have encountered it a dozen times or more, I'm sure. You just have to pick up almost any econometrics textbook, at any level, and you'll come away with a big dose of mis-information regarding one of the standard assumptions that we make about the error term in a regression model. If this comes as news to you, then I'll have made my point!
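Without giving the whole game away, here's a sketch of the kind of simulation that makes the title's point (the design, sample size and error distribution below are all made up): the regression errors are drawn from a heavily skewed, decidedly non-normal distribution, and yet the usual t-test of a true null hypothesis still rejects at very close to its nominal 5% rate:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    n, reps = 100, 20_000
    beta0, beta1 = 1.0, 0.0                 # H0: beta1 = 0 is true by construction
    x = rng.uniform(size=n)                 # fixed design
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    tcrit = stats.t.ppf(0.975, df=n - 2)

    rejections = 0
    for _ in range(reps):
        e = rng.exponential(size=n) - 1.0   # skewed, decidedly non-normal, zero-mean errors
        y = beta0 + beta1 * x + e
        b = XtX_inv @ X.T @ y               # OLS
        s2 = np.sum((y - X @ b) ** 2) / (n - 2)
        t_stat = b[1] / np.sqrt(s2 * XtX_inv[1, 1])
        rejections += (abs(t_stat) > tcrit)

    print(f"Empirical size of the nominal 5% t-test: {rejections / reps:.3f}")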

Wednesday, August 3, 2011

The Article of the Future

The standard format for academic journal articles is pretty much "tried and true": Abstract; Introduction; Methodology; Data; Results; Conclusions. There are variations on this, of course, depending on the discipline in question. When it comes to journals that publish research in econometrics, it's difficult to think of innovations that have taken advantage of developments in technology in the past few years.

O.K., so you can follow your favourite journal on Twitter or Facebook - but when you get to the articles themselves, do they look that much different from, say, ten years ago? Not really.

You'd think that econometrics journals could be a bit more exciting. The content is always a blast, of course, but what about the way it's presented? Apart from moving from paper to pdf files, we haven't really come that far. Yes, in many cases you get hyperlinks to the articles listed in the References section, which is fine and dandy. But is that enough?

When we undertake the research that leads to the articles we use all sorts of data analysis and graphical tools, whichever econometric or statistical software we're wedded to. Yet, these powerful tools are left pretty much on the sideline when it comes to disseminating the information through the traditional peer-reviewed outlets.

Are there any glimmers of hope on the horizon?

In a recent electronic issue of their Editors' Update newsletter, the publishing company, Elsevier, discussed their so-called "Article of the Future" Project. It's worth looking at. In particular, it could get you thinking about how we can raise the bar a little when it comes to publishing new research results in econometrics. For example, see the interactive graphics in the "Fun With F1" article.

A lot of us have some pretty strong views about the pricing policies of academic journals, and about the extent to which the flow of scientific information should be commercialized. I have no affiliation with this particular publisher, but it's refreshing to see what they're up to in this regard. Hopefully there's more of this going on elsewhere.


© 2011, David E. Giles