Yesterday, in a post on the Worthwhile Canadian Initiative, Frances Woolley rightly drew attention to some rather disturbing issues associated with the upcoming release of the 2011 National Household Survey (NHS) by Statistics Canada. In a nutshell, she asks the question, "How can we be sure that the NHS information about the religious beliefs of Canadians is accurate?"
Recently, I made the comment: "Data - the econometrician's lifeblood! Can't function without it." I wish I'd been more specific, and said "reliable data."
Data quality, and the timeliness of its release, is something that affects us all. And the impact isn't always a positive one. Neither is the problem limited to survey data, of the type that Frances was discussing. It applies equally to time-series data.
Most practising economists are broadly aware of some of the pitfalls associated with working with data, the quality of which may be "mixed" or questionable. One matter for concern, though, is the lack of such awareness among some of our students who routinely use "official" data without questioning its quality, or its applicability to the questions that they are trying to address.
Among the matters that we should be telling our students more about are:
- Data get revised! All of the time! Don't assume that those GDP figures are going to stay that way.
- Data "disappear"! Don't assume that your favourite series published by your country's official statistical agency is always going to be available in that form. These agencies have a nasty habit of "discontinuing" time-series data - often without much warning!
- Data definitions change! Read the footnotes and the fine print associated with published data. It's important to know if there are "breaks" in a time-series resulting from changes in the way it has been defined. Sometimes these breaks are unavoidable, but on other occasions they are just plain irritating! Either way, they affect your analysis.
- Data are based on estimates! O.K., not all of them - but a lot more than you may think. It's a common fallacy among students that core macroeconomic data are somehow "exact". They're not!
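To make the first two points concrete, here is a minimal sketch of one way a student might check a series before using it: keep the release you downloaded earlier, and compare it with a later release of the same series. The function name `compare_vintages` and the numbers below are purely hypothetical, made up for illustration - they are not taken from any statistical agency.

```python
from statistics import mean


def compare_vintages(earlier, later):
    """Compare two releases (vintages) of the same time-series.

    Both arguments map a period label to a value. Returns:
      - the revision (later minus earlier) for each period in both releases,
      - the mean absolute revision over those periods,
      - the periods that were in the earlier release but not the later one.
    """
    common = [p for p in earlier if p in later]
    revisions = {p: later[p] - earlier[p] for p in common}
    dropped = [p for p in earlier if p not in later]
    mean_abs = mean(abs(r) for r in revisions.values()) if revisions else 0.0
    return revisions, mean_abs, dropped


# Hypothetical quarterly growth rates (percent), for illustration only.
first_release = {"2012Q1": 1.5, "2012Q2": 2.0, "2012Q3": 0.5, "2012Q4": 1.0}
later_release = {"2012Q1": 1.0, "2012Q2": 2.0, "2012Q3": 1.5}

revisions, mean_abs, dropped = compare_vintages(first_release, later_release)
print("Revisions by period:", revisions)
print("Mean absolute revision:", mean_abs)
print("Periods missing from later release:", dropped)
```

Non-zero revisions tell you the figures have moved since you first saw them, and any "missing" periods are a hint that the series may have been truncated, redefined, or discontinued. Either finding is a cue to go and read the agency's notes before estimating anything.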
The bottom line(s):
- The quality of your data is at least as important as the amount of data you have.
- Be as concerned about understanding your data, and its limitations, as you are about understanding the statistical/econometric tools that you intend to use.
© 2013, David E. Giles