Econometrics Beat: Dave Giles' Blog: 12/01/2013

Tuesday, December 31, 2013

My Top 5 For 2013

Everyone seems to be doing it at this time of the year. So, here are the five most popular new posts on this blog in 2013:

Thanks for reading, and for your comments.

Happy New Year!

© 2013, David E. Giles

Monday, December 30, 2013

A Cautionary Bedtime Story

Once upon a time, when all the world and you and I were young ~~and beautiful,~~ there lived in the ancient town of Metrika a young boy by the name of Joe.

Happy Birthday, Econometric Society

The Econometric Society was founded 83 years ago today, as a result of a meeting held at the Statler Hotel in Cleveland, Ohio.

One of my earliest posts was devoted to this aspect of the history of our discipline. If you haven't read it, this would certainly be an appropriate day to do so!

And if you want to look ahead, as well as back, keep in mind that the Econometric Society holds a World Congress every five years. The 11th Congress is scheduled for 15 to 21 August 2015, in Montreal, Canada.

See you there!

Saturday, December 28, 2013

Statistical Significance - Again

With all of this emphasis on "Big Data", I was pleased to see this post on the Big Data Econometrics blog, today.

When you have a sample that runs to the thousands (billions?), the conventional significance levels of 10%, 5%, 1% are completely inappropriate. You need to be thinking in terms of tiny significance levels.
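To see why, here's a rough sketch of the arithmetic, using only the Python standard library. The effect size (1% of a standard deviation) and the sample sizes are made-up numbers, chosen just for illustration. The t-ratio for a fixed, tiny effect grows with the square root of the sample size, so at a fixed 5% level almost anything ends up "significant" once n is big enough.

```python
import math

def t_statistic(effect, sd, n):
    """t-ratio for an estimated mean of `effect`, given standard deviation `sd` and n observations."""
    return effect / (sd / math.sqrt(n))

# Hypothetical numbers: a mean shift equal to 1% of one standard deviation.
effect, sd = 0.01, 1.0

for n in (100, 10_000, 1_000_000, 100_000_000):
    # The same tiny effect produces an ever larger t-ratio as n increases.
    print(n, round(t_statistic(effect, sd, n), 2))
```

The effect itself never changes here; only n does. That's the sense in which the significance level needs to shrink as the sample grows.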

I discussed this in some detail back in April of 2011, in a post titled, "Drawing Inferences From Very Large Data-Sets". If you're one of those (many) applied researchers who use large cross-sections of data, and then sprinkle the results tables with asterisks to signal "significance" at the 5%, 10% levels, etc., then I urge you to read that earlier post.

It's sad to encounter so many papers and seminar presentations in which the results, in reality, are totally insignificant!

Friday, December 27, 2013

Unbiased Estimation of a Standard Deviation

Frequently, we're interested in using sample data to obtain an unbiased estimator of a population variance. We do this by using the sample variance, with the appropriate correction for the degrees of freedom. Similarly, in the context of a linear regression model, we use the sum of the squared OLS residuals, divided by the degrees of freedom, to get an unbiased estimator of the variance of the model's error term.

But what if we want an unbiased estimator of the population standard deviation, rather than the variance?
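As a small illustration of why the question matters, here's a sketch under one specific assumption: the data are a random sample from a normal population. In that case the sample standard deviation s (with the n-1 divisor) satisfies E[s] = c4(n)σ, where c4(n) = sqrt(2/(n-1)) Γ(n/2) / Γ((n-1)/2) is less than one. So s under-estimates σ on average, and s / c4(n) is unbiased. The function names, sample size, and seed below are my own choices for the illustration.

```python
import math
import random
import statistics

def c4(n):
    """E[s] / sigma for a normal sample of size n, where s uses the n-1 divisor."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    )

def simulate_mean_s(n, sigma, reps, seed=0):
    """Average of the sample standard deviation over `reps` simulated normal samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += statistics.stdev(sample)  # square root of the unbiased variance estimator
    return total / reps

n, sigma = 5, 2.0
mean_s = simulate_mean_s(n, sigma, reps=20_000)
print("average s:          ", mean_s)          # falls short of sigma
print("average s / c4(n):  ", mean_s / c4(n))  # close to sigma
```

The correction factor depends on the normality assumption; for other populations a different adjustment would be needed.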

Solution to Regression Problem

O.K. - you've had long enough to think about that little regression problem I posed the other day. It's time to put you out of your misery!

Here's the problem again, with a solution.

Thought for the Day

As a number of writers have noted previously, sales of Christmas cards Granger-cause Christmas, but they certainly don't cause Christmas!

Best wishes for the holiday season.

Monday, December 23, 2013

A Simple Regression Problem

Here's a regression problem for student readers of this blog.

Suppose that we estimate the following regression model by OLS:

y_i = α + β x_i + ε_i .

The model has a single regressor, x, and the point estimate of β turns out to be 10.0.

Now consider the "reverse regression", based on exactly the same data:

x_i = a + b y_i + u_i .

What can we say about the value of the OLS point estimate of b?

It will be 0.1.
It will be less than or equal to 0.1.
It will be greater than or equal to 0.1.
It's impossible to tell from the information supplied.
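If you'd like to experiment before committing to an answer, here's a minimal sketch that fits both regressions to simulated data. The data-generating values are invented for the illustration. It relies on one standard fact: the product of the two OLS slope estimates equals the squared sample correlation between x and y, which can't exceed one.

```python
import random

def ols_slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def r_squared(x, y):
    """Squared sample correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Made-up data with a true slope of 10.
rng = random.Random(123)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [1.0 + 10.0 * xi + rng.gauss(0.0, 5.0) for xi in x]

beta_hat = ols_slope(x, y)   # regression of y on x
b_hat = ols_slope(y, x)      # "reverse regression" of x on y
print(beta_hat, b_hat, beta_hat * b_hat, r_squared(x, y))
```

Try changing the error variance and watch how far b_hat moves from 1 / beta_hat.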

Thomas Bayes - 250 Years On

Two hundred and fifty years ago today a paper titled, "An Essay Towards Solving a Problem in the Doctrine of Chances", was presented to a meeting of the Royal Society in London. (Although, see here.)

The presenter - Richard Price. The author - (the late) Reverend Thomas Bayes.

Thus, we received "Bayes' Theorem".

A few days ago, the International Society for Bayesian Analysis held a celebratory conference to honour this momentous occasion in the history of statistical and scientific thinking.

Bayesian thinking has had a significant impact on the field of econometrics. My own Ph.D. dissertation (1975) was in Bayesian econometrics, and I was fortunate enough to have had Arnold Zellner as an external examiner.

I just wish I'd had access to the computational technology that's so freely available today!

Sunday, December 22, 2013

More on Student-t Regression Models

My recent post relating to maximum likelihood estimation of non-standard regression models in EViews included the case where the model's errors are independent Student-t distributed. In that example, the degrees of freedom for the Student-t distribution were assumed to be known. There was a good reason for making this assumption, as was spotted by Osman Dogan in his comment on that post.

If we relax this assumption and include the degrees of freedom parameter, v, of the t-distribution as another parameter that has to be estimated, then the likelihood function exhibits some unfortunate characteristics. Specifically, this function becomes unbounded at a boundary of the parameter space. Consequently, maximizing the likelihood function will generally result in us achieving only a local maximum, not a global maximum.
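Here's a rough sketch of what "unbounded" can look like, for a simple regression with an intercept and one regressor. The data, the parameter values, and the function name are all invented for the illustration; this isn't the EViews code from the earlier post. The line y = 1 + 2x fits the first two data points exactly. A back-of-the-envelope calculation suggests that if s of the n residuals are exactly zero and v < s/(n - s), then the log-likelihood behaves like a negative multiple of log(σ) as σ shrinks, so it increases without limit. Below, s = 2, n = 6, and v = 0.2.

```python
import math

def student_t_loglik(y, x, alpha, beta, sigma, v):
    """Log-likelihood for y_i = alpha + beta*x_i + e_i, with e_i / sigma i.i.d. Student-t(v)."""
    const = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi) - math.log(sigma))
    total = 0.0
    for yi, xi in zip(y, x):
        z = (yi - alpha - beta * xi) / sigma          # standardized residual
        total += const - 0.5 * (v + 1) * math.log1p(z * z / v)
    return total

# Made-up data: the line y = 1 + 2x passes exactly through the first two points.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 4.0, 9.5, 7.0, 12.5]

for sigma in (1e-2, 1e-4, 1e-6):
    # Holding alpha, beta and v fixed, the log-likelihood keeps rising as sigma -> 0.
    print(sigma, student_t_loglik(y, x, alpha=1.0, beta=2.0, sigma=sigma, v=0.2))
```

A numerical optimizer that wanders towards this corner of the parameter space will report ever-higher likelihood values without ever converging to anything sensible.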

You might ask, "why would this matter?" Well, basically, if you want to be sure that your MLE achieves the good asymptotic properties that motivate us to use it in the first place, then you need to globally maximize the likelihood function.

I discussed this issue in some detail in an earlier post, here.

In the context of the multiple regression model with independent Student-t errors with an unknown degrees of freedom parameter, these issues have been discussed fully by Fernandez and Steel (1999), for example. In particular, those authors show how a Bayesian approach to this estimation problem can overcome the difficulties associated with MLE here.

The problem is very reminiscent of the "incidental parameters" problem that arises widely in statistics, as well as in certain econometric estimation problems. Good examples of this general type of problem in econometrics include "switching regression" models; as well as models of markets that are in disequilibrium; and stochastic frontier production functions.

It's well known that a Bayesian approach is productive in the case of the "incidental parameters" problem, so it shouldn't be too surprising that it's also helpful with the Student-t regression model.

So, if you want to estimate a regression model with independent Student-t errors, and the degrees of freedom parameter associated with that distribution is unknown, then don't use maximum likelihood estimation! The Bayesian estimator discussed by Fernandez and Steel (1999) is one alternative. Pianto (2010) suggests a bootstrap estimator; and another possibility would be to consider method of moments estimation, which would result in estimates that are at least weakly consistent.

References

Fernandez, C., and M. F. J. Steel, 1999. Multivariate Student-t regression models: Pitfalls and inference. Biometrika, 86, 153-167. (Downloadable version here.)

Pianto, D. M., 2010. A bootstrap estimator for the Student-t regression model.

Saturday, December 21, 2013

What is an Econometric Model? Objectivity vs. Reflexivity

In response to my recent post, titled, "The History of Econometrics - An Alternative View", Judea Pearl sent me a thoughtful and intriguing comment. The comment is posted already, but I think that it deserves more than just being tucked away at the bottom of another post.

So, I am giving Judea's comment additional attention here. I hope that you'll find it interesting, and that it will provoke some much-needed discussion.

Here's Judea's comment in its entirety:

Maximum Likelihood Estimation in EViews

This post is all about estimating regression models by the method of Maximum Likelihood, using EViews. It's based on a lab. class from one of my grad. econometrics courses.

We don't go through all of the material below in class - PART 3 is left as an exercise for the students to pursue in their own time.

The data and the EViews workfile can be found on the data page and the code page for this blog.

The purpose of this lab. exercise is to help the students to learn how to use EViews to estimate the parameters of a regression model by Maximum Likelihood, when the model is of some non-standard type. Specifically, they find out how to estimate models of types that are not “built in” as a standard option in EViews. This involves setting up the log-likelihood function for the model, based on the assumption of independent observations; and then maximizing this function numerically with respect to the unknown parameters.

First, to introduce the concepts and commands that are involved, we consider the standard linear multiple regression model with normal errors, for which we know that the MLE of the coefficient vector is just the same as the OLS estimator. This will give us a “bench-mark” against which to check our understanding of what is going on. Then we can move on to some more general models.
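For readers without EViews, the same bench-mark idea can be sketched in a few lines of Python. This is not the lab. code itself; the data are simulated and the function names are my own. It sets up the normal log-likelihood for a simple regression, evaluates it at the OLS estimates (with the error variance set to SSR/n, its MLE), and confirms that moving any parameter away from those values lowers the log-likelihood.

```python
import math
import random

def ols(x, y):
    """Closed-form OLS intercept and slope for a simple regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    return my - beta * mx, beta

def normal_loglik(y, x, alpha, beta, sigma2):
    """Log-likelihood for y_i = alpha + beta*x_i + e_i, with e_i i.i.d. N(0, sigma2)."""
    n = len(y)
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ssr / (2 * sigma2)

# Simulated (made-up) data.
rng = random.Random(42)
x = [rng.uniform(0.0, 10.0) for _ in range(200)]
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 2.0) for xi in x]

a_hat, b_hat = ols(x, y)
s2_mle = sum((yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y)) / len(y)
best = normal_loglik(y, x, a_hat, b_hat, s2_mle)

print("log-likelihood at OLS estimates:", best)
print("after nudging the slope:        ", normal_loglik(y, x, a_hat, b_hat + 0.05, s2_mle))
```

A numerical optimizer applied to `normal_loglik` should land on the same coefficient values, which is exactly the check we want before moving on to models without a closed-form solution.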

The History of Econometrics - An Alternative View

There are different ways of looking at history. Professor Annie Cot reminds us of this, in the context of econometrics, in one of her dissertations that has been made available here.

"Econometrics has become such an obvious, objective - almost natural - tool that economists often forget that it has a history of its own, a complex and sometimes problematic history. Two works - Morgan (1990) and Qin (1993) - constitute the Received View of the history of econometrics. Basing our analysis on Leo Corry's methodological (and historiographical) framework of image and body of knowledge, the main purpose of this dissertation is to provide a critical account of the Received View.

Our main criticism is that historians of econometrics have a particular image of knowledge that stems from within econometrics itself, generating a problem of reflexivity. This means that historians of econometrics would evaluate econometrics and its history from an econometrician point of view, determining very specific criteria of what should be considered as "true", what should be studied or what should be the questions that the scientific community should ask.

This reflexive vision has conducted the Received View to write an internalist and funnel-shaped version of the History of Econometrics, presenting it as a lineal process progressing towards the best possible solution: Structural Econometrics and Haavelmo's Probability Approach in Econometrics (1944).

The present work suggests that a new history of econometrics is needed. A new history that would overcome the reflexivity problem yielding a certainly messier and convoluted but also richer vision of econometrics' evolution, rather than the lineal path towards progress presented by the Received View".

If you have a serious interest in the history of our discipline, this is for you.

Monday, December 16, 2013

Dennis Lindley Passes Away

The loss of Dennis Lindley, yesterday, will be received with sadness by Bayesians - econometricians included.

Dennis was a major driving force in the formalization and dissemination of Bayesian thought.

Two posts that comment on his many contributions can be found here and here.

I recall presenting a Bayesian paper at a statistics conference in New Zealand in the 1970's, with Dennis in the front row. It was an unnerving experience!

Sunday, December 15, 2013

Proxy Variables and Biased Estimation

Here's a problem from the exam. that one of my econometrics classes sat recently. It's to do with some of the consequences of mis-specifying a regression model, and then applying OLS estimation.

Specifically, let's suppose that the data-generating process (the correct model specification) is actually of the form:

y = Xβ + ε ; ε ~ [0 , σ²I_n] . (1)

However, we can't observe the k variables in the X matrix, and instead we replace them with k "proxy variables" (substitutes) that we can observe. So, the model that we actually estimate is:

y = X^*β + v . (2)

The students were asked to show that the usual (unbiased) estimator of σ² is actually biased in this case; and they were asked if they could determine the "direction" of the bias.
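Here's a small simulation sketch of the problem, for the simplest case of k = 1 and no intercept, with both x and its proxy x* held fixed across replications. All of the numbers are invented. If e* is the residual vector from (2) and M* = I - X*(X*'X*)^(-1)X*', then e* = M*Xβ + M*ε, so E[e*'e*] = σ²(n - k) + β'X'M*Xβ. The simulation compares the average of s² = e*'e*/(n - k) with that expression.

```python
import random

def simulate_s2(n=50, beta=2.0, sigma=1.0, reps=2000, seed=0):
    """Return (average s^2 over replications, theoretical E[s^2]) when a proxy replaces x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # true regressor (unobserved), held fixed
    x_star = [xi + rng.gauss(0.0, 1.0) for xi in x]    # observed proxy, held fixed
    sxx_star = sum(v * v for v in x_star)

    # x'M*x for the one-regressor case: x'x - (x'x*)^2 / (x*'x*)
    x_m_x = sum(v * v for v in x) - sum(a * b for a, b in zip(x, x_star)) ** 2 / sxx_star
    expected = sigma ** 2 + beta ** 2 * x_m_x / (n - 1)

    total = 0.0
    for _ in range(reps):
        y = [beta * xi + rng.gauss(0.0, sigma) for xi in x]            # model (1)
        b = sum(a * c for a, c in zip(x_star, y)) / sxx_star           # OLS on model (2)
        ssr = sum((yi - b * xs) ** 2 for yi, xs in zip(y, x_star))
        total += ssr / (n - 1)                                         # usual estimator of sigma^2
    return total / reps, expected

mean_s2, expected_s2 = simulate_s2()
print("average s^2:", mean_s2, " theoretical E[s^2]:", expected_s2, " true sigma^2: 1.0")
```

Because M* is positive semi-definite, the extra term β'X'M*Xβ can't be negative, which settles the "direction" question.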

On Staying Awake in Class

Pedagogy Unbound is "A place for college teachers to share practical strategies for today's classrooms."

Their blog today contains a lovely piece by David Gooblar, titled "'Trucker Tricks' for Keeping Students Awake". Not that any of you would have such problems on either side of the rostrum in econometrics classes, I'm sure!

David writes:

"Of all the tips that have been posted at Pedagogy Unbound since the site’s launch in August, the one that has been read the most—by far—is titled “Help your students stay awake in class.” This, it seems, is what professors are most concerned about. Not student writing. Not the plague of plagiarism. Not even students who don’t participate in discussions. No, the most pressing problem facing college teachers today is merely getting their students to stay conscious for an hour and 15 minutes.

.......The tip is one (from an associate professor who) worked as a truck driver when he was a college student and now, as a teacher, he passes on his “trucker tricks” for staying awake to his students.

......Crucially, (he) follows up these tips by inviting the students, at any time during class throughout the term, to stand up if they feel like they are getting sleepy. They can take their notebooks and stand at the back or side of the classroom. He tells them he’d much prefer a class full of standing students to one full of sleeping, or even just drowsy, students.

.......So ask your students to stand up if they feel they are in danger of falling asleep. If it turns out that a number of them take you up on the offer, you can tell yourself that you’re getting a standing ovation."

I'm going to try it out!

Thursday, December 12, 2013

Time for Some More Reading!

With the weekend upon us once again, it's time to settle down with the papers - the econometrics research papers, that is. Here are my latest picks:

Cook, S., D. Watson, and L. Parker, 2014. New evidence on the importance of gender and asymmetry in the crime-unemployment relationship. Applied Economics, 46, 119-126.
Fan, J., F. Han, and H. Liu, 2013. Challenges of big data analysis. Mimeo.
Hashmi, A. R., 2014. Competition and innovation: The inverted-U relationship revisited. Review of Economics and Statistics, in press.
Juselius, K., N. F. Moller, and F. Tarp, 2014. The long-run impact of foreign aid in 36 African countries: Insights from multivariate time series analysis. Oxford Bulletin of Economics and Statistics, in press.
Li, R., D. K. J. Lin, and B. Li, 2013. Statistical inference in massive data sets. Applied Stochastic Models in Business and Industry, 29, 399-409.
Sanderson, E. and F. Windmeijer, 2013. A weak instrument F-test in linear IV models with multiple endogenous variables. CEMMAP Working Paper CWP58/13, The Institute for Fiscal Studies.

Data Do Not Imply Science

As a follow-up to my recent post on Big Data, I recommend today's post by Jeff Leek on the Simply Statistics blog. It's titled, 'The key word in "Data Science" is not Data, it is Science'.

Jeff says:

"Most people hyping data science have focused on the first word: data. They care about volume and velocity and whatever other buzzwords describe data that is too big for you to analyze in Excel. .........

But the key word in data science is not "data"; it is "science". Data science is only useful when the data are used to answer a question. That is the science part of the equation. The problem with this view of data science is that it is much harder than the view that focuses on data size or tools. It is much, much easier to calculate the size of a data set and say "My data are bigger than yours"......"

Right on, Jeff!

When Everything Old is New Again

We see it with clothing styles. Not just hemline lengths, but also the widths of jacket lapels and guys' ties. How wide should the trouser legs be? Cuffs or no cuffs? Leave your clothes in the closet long enough, and there's a good chance they'll be back in style some day!

And so it is with econometrics. Here are just a few examples:

Random Variable?

A big HT to Ryan MacDonald for drawing this quote to my attention:

"While writing my book (Stochastic Processes, 1953) I had an argument with Feller. He asserted that everyone said "random variable" and I asserted that everyone said "chance variable." We obviously had to use the same name in our books, so we decided the issue by a stochastic procedure. That is, we tossed for it and he won."

Joe Doob, in Statistical Science 12 (1997), No. 4, page 307.

Added - thanks to Arthur Charpentier for this link.

Friday, December 6, 2013

The Washing Machine Repairman

Here's a fun quote.

"As I remember, Bill X fixed my washing machine. My husband, Harry X, brought him home to talk economics after a Cambridge dinner in hall and they walked in on my frustration with the washer. I met a slight-statured, quiet man who modestly asked if he could help. He tried something with a screw-driver which may have worked - or perhaps it didn't work - and went back to talking economics."

Who were "Harry" and "Bill"?

Thursday, December 5, 2013

Econometrics and "Big Data"

In this age of "big data" there's a whole new language that econometricians need to learn. Its origins are somewhat diverse - the fields of statistics, data-mining, machine learning, and that nebulous area called "data science".

What do you know about such things as:

Decision trees
Support vector machines
Neural nets
Deep learning
Classification and regression trees
Random forests
Penalized regression (e.g., the lasso, lars, and elastic nets)
Boosting
Bagging
Spike and slab regression?

Probably not enough!

If you want some motivation to rectify things, a recent paper by Hal Varian will do the trick. It's titled, "Big Data: New Tricks for Econometrics", and you can download it from here. Hal provides an extremely readable introduction to several of these topics.

He also offers a valuable piece of advice:

"I believe that these methods have a lot to offer and should be more widely known and used by economists. In fact, my standard advice to graduate students these days is 'go to the computer science department and take a class in machine learning'."

Interestingly, my son (a computer science grad.) "audited" my classes on Bayesian econometrics when he was taking machine learning courses. He assured me that this was worthwhile - and I think he meant it! Apparently there's the potential for synergies in both directions.

Wednesday, December 4, 2013

The International Association for Applied Econometrics

Here's an organisation that deserves promoting - The International Association for Applied Econometrics. What more can I say?

Well, I had better add something!

First:

"The aim of the Association is to advance the education of the public in the subject of econometrics and its applications to a variety of fields in economics, in particular, but not exclusively, by advancing and supporting research in that field, and disseminating the results of such useful research to the public."

Second:

Their next Annual Conference will be held in London, U.K., in June 2014, and the line-up of keynote speakers is impressive. Submissions of papers are due by 1 February 2014, and there is a nice prize for the best paper presented by a graduate student.
