Friday, September 9, 2016

Spreadsheet Errors

Five years ago I wrote a post titled "Beware of Econometricians Bearing Spreadsheets".

The take-away message from that post was simple: there's considerable, well-documented evidence that spreadsheets are very, very dangerous when it comes to statistical calculations. That is, if you care about getting the right answers!

Read that post, and the associated references, and you'll see what I mean.

(You might also ask yourself: why would I pay big bucks for commercial software of questionable quality when I can use high-quality statistical software, such as R, for free?)

This week, a piece in The Economist looks at the shocking record of publications in genomics that fall prey to spreadsheet errors. It's a sorry tale, to be sure. I strongly recommend that you take a look.

Yes, any software can be misused. Anyone can make a mistake. We all know that. However, it's not a good situation when a careful and well-informed researcher ends up making blunders just because the software they trust simply isn't up to snuff!


© 2016, David E. Giles

Friday, September 2, 2016

Dummies with Standardized Data

Recently, I received the following interesting email request:
"I would like to have your assistance regarding a few questions related to regression with standardized variables and a set of dummy variables. First of all, if the variables are standardized (xi-x_bar)/sigma, can I still run the regression with a constant? And, if my dummy variables have 4 categories, do I include all of them without the constant? Or just three and keep the constant in the regression? And, how do we interpret the coefficients of the dummy variables in such as case? I mean, idoes the conventional interpretation in a single OLS regression still apply?"

Here's my (brief) email response:
"If all of the variables (including the dependent variable) have been standardized then in general there is no need to include an intercept - in fact the OLS estimate of its coefficient will be zero (as it should be).
However, if you have (say) 4 categories in the data that you want to allow for with dummy variables, then the usual results apply:
1. You can include all 4 dummies (but no intercept). The estimated coefficients on the dummies will sum to zero with standardized data. Each separate coefficient gives you the deviation from zero for the intercept in each category.
OR (equivalently)
2. You can include an intercept and any 3 of the dummies. Again, the estimated coefficients of the dummies and the intercept will sum to zero. Suppose that you include the intercept and the dummies D2, D3, and D4. The estimated coefficient of the intercept gives you the intercept effect for category 1. The estimated coefficient for D2 gives you the deviation of the intercept for category 2, from that for category 1, etc."
You can easily verify this by fitting a few OLS regressions, and there's a lot more about regression analysis with standardized data in this earlier post of mine.
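For example, here's a quick numerical check in R. The data are simulated, with equally sized categories, and the variable names are mine, purely for illustration:

```r
# A quick numerical check (simulated data with equally sized categories;
# all names here are illustrative, not from the post)
set.seed(123)
n <- 200
g <- factor(rep(1:4, each = n / 4))   # 4 equally sized categories
x <- rnorm(n)
y <- 1 + 0.5 * x + c(0, 0.3, -0.2, 0.6)[g] + rnorm(n)

ys <- as.vector(scale(y))             # standardize y and x
xs <- as.vector(scale(x))

# Option 1: all four dummies, no intercept
fit1 <- lm(ys ~ xs + g - 1)
sum(coef(fit1)[-1])                   # effectively zero

# Option 2: an intercept plus three dummies (category 1 is the base)
fit2 <- lm(ys ~ xs + g)
b <- coef(fit2)
unname(c(b["(Intercept)"], b["(Intercept)"] + b[c("g2", "g3", "g4")]))
                                      # implied category intercepts: they
                                      # match Option 1, and sum to zero
```

The two parametrizations give identical fitted values; only the way the category intercepts are reported differs.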


© 2016, David E. Giles

Wednesday, August 31, 2016

September Reading

Here are a few suggestions for some interesting reading this month:
© 2016, David E. Giles

Tuesday, July 26, 2016

The Forecasting Performance of Models for Cointegrated Data

Here's an interesting practical question that arises when you're considering different forms of econometric models for forecasting time-series data:
"Which type of model will perform best when the data are non-stationary, and perhaps cointegrated?"
To answer this question we have to think about the alternative models that are available to us; and we also have to decide what we mean by 'best'. In other words, we have to agree on some sort of loss function or performance criterion for measuring forecast quality (mean squared forecast error, for example).

Notice that the question I've posed above allows for the possibility that the data that we're using are integrated, and the various series we're working with may or may not be cointegrated. This scenario covers a wide range of commonly encountered situations in econometrics.

In an earlier post I discussed some of the basic "mechanics" of forecasting from an Error Correction Model. This type of model is used in the case where our data are non-stationary and cointegrated, and we want to focus on the short-run dynamics of the relationship that we're modelling. However, in that post I deliberately didn't take up the issue of whether or not such a model will out-perform other competing models when it comes to forecasting.
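(For concreteness, here's a minimal sketch in R of the sort of two-step, Engle-Granger, ECM I have in mind. The data are simulated and the names are mine, purely for illustration.)

```r
# A minimal two-step (Engle-Granger) ECM sketch (simulated data;
# all names are illustrative)
set.seed(42)
n <- 250
x <- cumsum(rnorm(n))            # an I(1) series
y <- 2 + 0.8 * x + rnorm(n)      # cointegrated with x

# Step 1: estimate the long-run (cointegrating) relationship
longrun <- lm(y ~ x)
z <- resid(longrun)              # the equilibrium error

# Step 2: short-run dynamics, including the lagged equilibrium error
ecm <- lm(diff(y) ~ diff(x) + z[-n])
coef(ecm)                        # the coefficient on z[-n] (the "speed of
                                 # adjustment") should be negative
```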

Let's look at that issue here.

Tuesday, July 5, 2016

Recommended Reading for July

Now that the Canada Day and Independence Day celebrations are behind (some of) us, it's time for some serious reading at the cottage. Here are some suggestions for you:


© 2016, David E. Giles

Saturday, June 25, 2016

Choosing Between the Logit and Probit Models

I've had quite a bit to say about Logit and Probit models, and the Linear Probability Model (LPM), in various posts in recent years. (For instance, see here.) I'm not going to bore you by going over old ground again.

However, an important question came up recently in the comments section of one of those posts. Essentially, the question was, "How can I choose between the Logit and Probit models in practice?"

I responded to that question by referring to a study by Chen and Tsurumi (2010), and I think it's worth elaborating on that response here, rather than leaving the answer buried in the comments of an old post.

So, let's take a look.
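In the meantime, here's one very crude first pass in R: fit both models to the same data and compare their maximized log-likelihoods (or, equivalently here, their AIC values). To be clear, this is just to fix ideas - it is not the formal comparison procedure that Chen and Tsurumi evaluate - and the data and names are simulated and mine:

```r
# A crude first pass: fit both models and compare in-sample fit
# (simulated data; names are illustrative). This is NOT the formal
# comparison procedure evaluated by Chen and Tsurumi (2010).
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + 1.2 * x))   # data generated from a logit

logit  <- glm(y ~ x, family = binomial(link = "logit"))
probit <- glm(y ~ x, family = binomial(link = "probit"))

AIC(logit, probit)                 # same number of parameters in each model,
                                   # so this is just a log-likelihood comparison
```

In practice the two link functions often fit almost equally well, which is precisely why a more formal analysis of the model-selection problem is needed.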

Tuesday, June 7, 2016

The ANU Tapes of the British (Econometrics) Invasion

As far as I know, the Beatles never performed at the Australian National University (the ANU). But the "fab. three" certainly did, and we're incredibly lucky to have the video recordings to prove it!

Stan Hurn (Chair of the Board of the National Centre for Econometric Research, based in the Business School at the Queensland University of Technology) contacted me recently about a fantastic archive that has been made available.

The Historical Archive at the NCER now includes the digitized versions of the movies that were made in the 1970s and 1980s, when various econometricians from the London School of Economics visited and lectured at the ANU. Specifically, eight lectures by Grayham Mizon, five by Ken Wallis, and a further eight lectures by Denis Sargan can be viewed here.

I was on faculty at Monash University at the time of these visits (and that of David Hendry - so I guess the fab. four of the LSE did actually make it). I recall them well because the visitors also gave seminars in our department while they were in Australia.

Before you view the lectures - and I really urge you to do so - it's essential that you read the background piece, "The ANU Tapes: A Slice of History", written by Chris Skeels. (Be sure to follow the "Read more" link, and read the whole piece.) As it happens, Chris was a grad. student in our group at Monash back in the day, and his backgrounder outlines a remarkable story of how the tapes were saved.

Kudos to Stan and his colleagues for putting this archive together. And double kudos to Chris Skeels for having the foresight, energy, and determination to ensure that we're all able to share these remarkable lectures.

Thank you both!

© 2016, David E. Giles

Thursday, June 2, 2016

Econometrics Reading List for June

Here's some suggested reading for the coming month:


© 2016, David E. Giles

Saturday, May 28, 2016

Forecasting From an Error Correction Model

Recently, a reader asked about generating forecasts from an estimated Error Correction Model (ECM). Really, the issues that arise are no different from those associated with any dynamic regression model. I talked about the latter in a previous post in 2013.

Anyway, let's take a look at the specifics...
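(To preview the mechanics, here's a minimal sketch in R of generating dynamic, multi-step forecasts from a two-step ECM. The data are simulated, the names are mine, and a path of future x values is assumed to be available - in practice you'd need to forecast x as well.)

```r
# Dynamic forecasts from a fitted two-step ECM (simulated data; all
# names are illustrative). Assumes future x values are known/forecast.
set.seed(42)
n <- 250
x <- cumsum(rnorm(n))
y <- 2 + 0.8 * x + rnorm(n)

longrun <- lm(y ~ x)                    # long-run relationship
z   <- resid(longrun)
ecm <- lm(diff(y) ~ diff(x) + z[-n])    # short-run dynamics
a <- coef(longrun)
b <- coef(ecm)

h     <- 12                             # forecast horizon
x.fut <- x[n] + cumsum(rnorm(h))        # stand-in for a future x path
y.hat <- numeric(h)
y.last <- y[n]
x.last <- x[n]

for (t in 1:h) {
  z.last   <- y.last - a[1] - a[2] * x.last            # lagged equilibrium error
  dy.hat   <- b[1] + b[2] * (x.fut[t] - x.last) + b[3] * z.last
  y.hat[t] <- y.last + dy.hat           # build the level forecast recursively
  y.last   <- y.hat[t]
  x.last   <- x.fut[t]
}
y.hat                                   # the h-step-ahead forecasts of y
```

Notice that beyond one step ahead the forecasts are built up recursively, using earlier forecasts in place of actual values - exactly as with any dynamic regression model.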