Monday, December 31, 2012

On Random Numbers

These days, we take it for granted that it's very simple, and relatively costless, to generate random numbers. Really, these are "pseudo-random numbers", and we should pay careful attention to the quality of the algorithms that are used to generate them, as I've noted previously (and also here) on this blog.

Putting these qualifications to one side, however, we've certainly come a long way from having to rely on that famous tome, A Million Random Digits With 100,000 Normal Deviates. Published by the Rand Corporation in 1955, this classic provided (lots of!) random digits that could be used as an aid to simple random sampling, as well as random deviates generated from the standard normal distribution. The latter could be used in Monte Carlo simulations, for example.
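These days a single library call, or the classical Box-Muller transform, replaces the book. Here's a minimal sketch in Python (the seed value is arbitrary, a nod to the Rand volume):

```python
import numpy as np

rng = np.random.default_rng(1955)  # arbitrary seed

# A modern one-liner for 100,000 standard normal deviates:
deviates = rng.standard_normal(100_000)

# The classical Box-Muller transform maps pairs of uniform draws
# into independent standard normal draws. (Use 1 - random() so the
# argument of log() is strictly positive.)
u1 = 1.0 - rng.random(50_000)
u2 = rng.random(50_000)
z = np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)
```

Box-Muller is only one of several methods; modern libraries typically use faster schemes such as the ziggurat algorithm.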

The following is an extract from a handout that I use in my introductory statistical inference course:

Sunday, December 30, 2012

Sad News

I was sad to learn that John Nankervis passed away on Christmas morning. Well known for his many important contributions to econometrics, John was Professor of Finance and Director of Research in the University of Essex Business School.

John will be greatly missed.

© 2012, David E. Giles

Saturday, December 29, 2012

The Greenfield Challenge

The European Network for Business and Industrial Statistics has the objective of "...connecting individuals and organisations, interested in theoretical developments and practical applications in the field of business and industrial statistics." They achieve this through their electronic network, and an annual Spring Conference.

The so-called "Greenfield Challenge", which is now an annual event, arose from a challenge laid down by Tony Greenfield in his George Box Medal acceptance speech in 2009:

"My challenge to you is that you will tell some audience about work you have done, and completed successfully because you used a statistical method. But that audience must be of people who are not statisticians.
And you will have spoken to those people through publications that are for the wider public, through magazines or newspapers, or from a public platform. You might even write a short story or a play. That is my challenge: Tell the world, outside your circle, of work you have done, and done successfully because you used statistics."
Fighting words!

You can read more about the responses to this challenge from members of the ENBIS community here.

I really like this challenge. It seems to me that it would be equally applicable to the econometrics community.

What do you think?

© 2012, David E. Giles

Friday, December 28, 2012

National Econometricians' Day

The Netherlands has been a power-house of econometrics since day one. Think of Jan Tinbergen's contributions to the birth of the discipline, for instance. 

So, perhaps it will come as no surprise that 5 February 2013 will be National Econometricians' Day, or "LED" (Landelijke Econometristendag), for students of econometrics. Billed as ".... the largest annual career event for econometricians in the Netherlands", the one-day event will be held in Utrecht in 2013. Their slogan - "LED it be your day!"

I've blogged previously (here and here) about another Netherlands-based initiative for students of econometrics - the Econometric Game. This event has gone from strength to strength, and attracts teams of students from all over the world.

It's great to see the extent of organization and involvement among the Dutch students. They have six associations for econometrics students, and these form the Landelijke Orgaan der Econometrische Studieverenigingen (LOES). The individual associations are: Asset | Econometrics (Tilburg University), Econometrisch Dispuut (Erasmus University), Kraket (VU University), SCOPE | Vectum (Maastricht University), VESTING (University of Groningen) and VSAE (University of Amsterdam).

© 2012, David E. Giles

Thursday, December 27, 2012

Top Posts for 2012

Here are the "Top Ten" new posts on this blog during 2012. They're ranked in order of popularity (1 = top), based on page-views:

  1. My "Must Read" List (26 September 2012)
  2. An Overview of VAR Modelling (23 March 2012)
  3. Regression and Causation (4 December 2012)
  4. Seminar Attendance: A Pep-Talk for Grad. Students (8 October 2012)
  5. Modelling Extremes (16 April 2012)
  6. Integrated & Cointegrated Data (5 June 2012)
  7. Degrees of Freedom in Regression (12 October 2012)
  8. Central Limit Theorems (29 October 2012)
  9. Listening to Your Data (31 October 2012)
  10. What I Learned Last Week (12 October 2012)
Thanks for your participation and your comments!

© 2012, David E. Giles

Saturday, December 22, 2012

Eggnog With an Econometrics Flavour

With Christmas almost upon us I was giving some thought to a seasonally-themed post. The truth is, it's not as easy to produce one with an econometric flavour as you might think.

I suppose we could look at forecasting retail sales for the holiday season; or construct a SUR model for the market shares held by the major courier companies as they deal with all of us last-minute shoppers and givers. But I wanted something that clearly boasts "Christmas" in the title.

Interestingly (or not) there are not that many published applied econometrics papers that fit the bill. A few years ago, however, I struggled to fill this void with a paper I titled, "Testing for a Santa Claus Effect in Growth Cycles". (You can download the working paper version from here.)

The abstract tells the story:
"We examine the seasonal distribution of turning points in the post-war growth cycles of sixteen economies. Using nonparametric tests for distributions on the circle, we cannot reject a uniform distribution for the turning points for most of the countries. In the case of troughs, uniformity is supported for eleven countries, notwithstanding the unusually large number of December or January troughs. This provides evidence against a ‘Santa Claus effect’ in the growth cycle."
This was one of the first papers that I wrote that involved goodness-of-fit testing with "circular" data. You'll find other posts on this general topic here and here, if you're interested. There are a few things I'd do differently if I were writing the Santa Claus paper now - but isn't that always the case?


Giles, D. E., 2005. Testing for a Santa Claus effect in growth cycles. Economics Letters, 87, 421-426.

© 2012, David E. Giles

Wednesday, December 19, 2012

Time Series Econometrics Conference

If you're working in Time Series Econometrics, then here's a conference that may interest you.

The next Workshop in Time Series Econometrics is being held in Zaragoza, Spain, in April 2013.

I like the aims of the workshop:
"Time Series Econometrics has been one of the most productive areas in quantitative economics in recent years. Along with the progress in theory and computation, great possibilities for applications have opened up in several economic fields, both for academics and professional practitioners. For these reasons, a group of us who are very devoted to this subject consider that it deserves a more prominent place in both national and international meetings. The main objective of this TSEW is to fill this gap.

TSEW wishes to bring together academics and non-academic professional practitioners working in Time Series Econometrics in Spain, both in the theoretical and applied dimensions.

We hope this meeting does not suffer any structural break with respect to previous events and that we can carry on enjoying a long time series, without sample size problems, with a deterministic trend towards quality, with long memory and with a common factor based on getting together and enjoying our work in such a stimulating area as Time Series. Even so, we should leave room for some stochastic or random elements."

© 2012, David E. Giles

Monday, December 17, 2012

Judea Pearl Wins 2012 Turing Award

My November issue of Amstat News arrived this morning, and I was most interested to see that ASA member, Judea Pearl, has been awarded the 2012 Turing Award. This is described as "...the most prestigious award in computer science".

There is an interesting interview with Judea here. In a related piece of news, the ASA has announced a new "Causality in Statistics Education" award:

"The ASA announces a new prize, Causality in Statistics Education, aimed at encouraging the teaching of basic causal inference in introductory statistics courses. The prize carries an award of $5,000 per year. Donated by Judea Pearl, the prize is motivated by the growing importance of introducing core elements of causal inference into undergraduate and lower-division graduate classes in statistics."

Judea's contributions have been the subject of two recent posts (here and here) on this blog.

© 2012, David E. Giles

Sunday, December 16, 2012

More on Regression & Causality

Now back in town, I enjoyed seeing this thoughtful blog post from William M. Briggs. It relates to my own recent post, Regression & Causation.

This is a big topic, with lots to be said, from various perspectives.

© 2012, David E. Giles

Tuesday, December 4, 2012

Regression & Causation

Recently I read, with interest, a thought-provoking paper by Bryant Chen & Judea Pearl. The paper is titled, "Regression and causation: A critical examination of econometrics textbooks".

Here's the abstract:
"This report surveys six influential econometric textbooks in terms of their mathematical treatment of causal concepts. It highlights conceptual and notational differences among the authors and points to areas where they deviate significantly from modern standards of causal analysis. We find that econometric textbooks vary from complete denial to partial acceptance of the causal content of econometric equations and, uniformly, fail to provide coherent mathematical notation that distinguishes causal from statistical concepts. This survey also provides a panoramic view of the state of causal thinking in econometric education which, to the best of our knowledge, has not been surveyed before."

Sunday, December 2, 2012

Some Recent Papers on Granger Causality

My various posts on testing for Granger non-causality seem to have been quite popular with readers of this blog.
For example, see here, here, here, here, and also see the Word-Count block in the right side-bar of this page.
The literature involving applications of non-causality testing continues to grow, even though it is now 43 years since Granger's seminal paper on the subject. Regrettably, some of these applications are lacking in various respects, but there are many that are really excellent examples of applied econometric analysis.
Contributions to the various theoretical and methodological issues surrounding the Granger causality literature also continue to emerge. Here are just a few such papers that have emerged in recent months.
These papers cover a lot of important ground, and they're well worth taking a look at if you have an interest in testing for Granger non-causality.
© 2012, David E. Giles

Saturday, December 1, 2012

Assistant Prof. Position at UVic

Not my usual sort of post, I know, but I just wanted to get it out there that my department (Economics, at the University of Victoria, on the West coast of Canada) is looking to hire a tenure-track Assistant Professor. The details of the position are available here.
We've only just got permission to hire, so we're really scrambling to catch up with this year's job market. Anything that you can do to get the word out to likely applicants, placement officers, etc. would be a great help to us.
Any enquiries should be addressed directly to
© 2012, David E. Giles

Sunday, November 25, 2012

Econometric Modelling With Time Series

That sounds like a snappy title, but it's been taken already!

There's a new econometrics book that's about to be released that looks really interesting. It's titled, Econometric Modelling With Time Series: Specification, Estimation and Testing. To be published by Cambridge University Press next month, this volume caught my eye, not only because of its title, but also because one of its co-authors is a former Monash U. colleague of mine, Vance Martin (now at the University of Melbourne). Vance is joined by co-authors Stan Hurn and David Harris.

Is the Cochrane-Orcutt Estimator Unique?

One of the work-horses of econometric modelling is the Cochrane-Orcutt (1949) estimator, or some variant of it such as the Beach-MacKinnon (1978) full ML estimator. The C-O estimator was proposed by Cochrane and Orcutt as a modification to OLS estimation when the errors are autocorrelated. Those authors had in mind errors that follow an AR(1) process, but it is easily adapted for any AR process.

I've blogged elsewhere about the historical setting for the work by Cochrane and Orcutt.

Given the limited computing power available at the time, the C-O estimator was a pragmatic solution to the problem of obtaining the GLS estimator of the regression coefficients, and approximating the full ML estimator. Students of econometrics will be familiar with the iterative process associated with the C-O estimator, as outlined below.

The use of this estimator leads to some interesting questions. Is this iterative scheme guaranteed to converge in a finite number of iterations? Is there a unique solution to this convergence problem, or can multiple local solutions (minima) occur?
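For concreteness, here is a hedged Python sketch of the iterative scheme for a model with AR(1) errors (the function and variable names are my own, not from any particular textbook):

```python
import numpy as np

def cochrane_orcutt(y, X, tol=1e-8, max_iter=100):
    """Iterative Cochrane-Orcutt estimation for AR(1) errors (a sketch)."""
    ols = lambda yy, XX: np.linalg.lstsq(XX, yy, rcond=None)[0]
    b, rho = ols(y, X), 0.0
    for _ in range(max_iter):
        e = y - X @ b                                    # residuals from the original equation
        rho_new = (e[:-1] @ e[1:]) / (e[:-1] @ e[:-1])   # estimate the AR(1) coefficient
        # Quasi-difference the data and re-estimate the betas by OLS:
        b = ols(y[1:] - rho_new * y[:-1], X[1:] - rho_new * X[:-1])
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return b, rho

# Check it on simulated data: y = 1 + 2x + u, with u_t = 0.6 u_{t-1} + eps_t
rng = np.random.default_rng(42)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = X @ np.array([1.0, 2.0]) + u
b, rho = cochrane_orcutt(y, X)   # b should be near (1, 2); rho near 0.6
```

Note that nothing in this scheme, by itself, guarantees convergence to a unique solution: the iterations descend coordinate-wise in (β, ρ), so the questions posed above about multiple local minima are exactly the right ones to ask.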

Sunday, November 18, 2012

Assessing Heckman's Two-Step Estimator

Good survey papers are worth their weight in gold. Reading and digesting a thoughtful, constructive, and well-researched survey can save you a lot of work. It can also save you from making poor choices in your own research, or even from "re-inventing the wheel".

For these reasons, The Journal of Economic Surveys is a great resource. Over the years it has published some really fine peer-reviewed survey articles, many of which I've benefited from personally.

Another piece of good news is that Wiley (the journal's publisher) makes a number of the most highly-cited articles available for free.

Wednesday, November 14, 2012

Failing the "Sniff Test"

If it looks like garbage, and smells like garbage, it probably is garbage! Insert any four-letter word of your choice, as long as it begins with "S" or "C", in place of "garbage".

HT to my former colleague, Peter Cribbett, for drawing my attention to this little gem:

Sunday, November 11, 2012

A Very Personal Thank You

Just two words, but from the heart. It being Remembrance Day, I'm led to reflect on the contributions and sacrifices that my father, Albert Thomas Giles, made for me. An infantryman in the British army (three times wounded) in World War II, he also sacrificed a great deal for the education of his children.
Inadvertently, Bert was also influential in my becoming an econometrician.

Wednesday, November 7, 2012

Granger Causality Testing in R

Today just gets better and better!

I had an email this morning from Christoph Pfeiffer, who follows this blog. Christoph has put together some nice R code that implements the Toda-Yamamoto method for testing for Granger causality in the context of non-stationary time-series data.

Given the ongoing interest in the various posts I have had (here, here, here & here) on testing for Granger causality, I'm sure that Christoph's code will be of great interest to a lot of readers.
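Christoph's code is in R; for readers who just want the flavour of the underlying idea, here is a hedged Python sketch of the standard Granger non-causality F-test. (The Toda-Yamamoto method augments the lag length to allow for non-stationary data; this basic sketch does not do that.)

```python
import numpy as np

def granger_f_test(y, x, p=2):
    """F-test of the null that lags of x do not help predict y,
    given p lags of y (the textbook Granger non-causality test)."""
    n = len(y)
    Y = y[p:]
    lags_y = np.column_stack([y[p - j : n - j] for j in range(1, p + 1)])
    lags_x = np.column_stack([x[p - j : n - j] for j in range(1, p + 1)])
    ones = np.ones(n - p)

    def ssr(Z):
        b = np.linalg.lstsq(Z, Y, rcond=None)[0]
        e = Y - Z @ b
        return e @ e

    ssr_r = ssr(np.column_stack([ones, lags_y]))           # restricted: own lags only
    ssr_u = ssr(np.column_stack([ones, lags_y, lags_x]))   # unrestricted: add lags of x
    df2 = (n - p) - (1 + 2 * p)
    return ((ssr_r - ssr_u) / p) / (ssr_u / df2)

# Simulate a system in which x Granger-causes y, but not vice versa:
rng = np.random.default_rng(11)
T = 500
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()
F_xy = granger_f_test(y, x)   # large: reject non-causality from x to y
F_yx = granger_f_test(x, y)   # small: no evidence of reverse causality
```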

Thanks for sharing this with us, Christoph.

© 2012, David E. Giles

Former Students

It's always great to see our former grad. students making great progress with their chosen careers. Singling out individuals for special mention may be a little risky. But what the heck!!!

Monday, November 5, 2012

Bayesian Exercises

In the Advanced Topics in Econometrics course that I'm teaching this semester, one of the topics we're covering is Bayesian Econometrics. I've blogged a little on this topic before - e.g., here, here, here, and here.

If you want some practice exercises on Bayesian inference, you may be interested in this set of problems, as well as the assignment that my class is working on currently.

There's not much "econometric" content to the questions - they're more broadly statistical in nature. However, they cover some of the key ideas associated with this topic. Solutions will be posted later.

We're also looking at computational issues, such as MCMC. More on the latter in a different post, perhaps.

© 2012, David E. Giles

Wednesday, October 31, 2012

Listening to your Data

The latest issue of Significance Magazine (a joint publication of the Royal Statistical Society, and the American Statistical Association), includes an interesting article by Ethan Brown and Nick Bearman. It's titled, "Listening to Uncertainty: Information That Sings". 

The article is about "sonification" - listening to your data!

Tuesday, October 30, 2012

Some Properties of Non-linear Least Squares

You probably know that when we have a regression model that is non-linear in the parameters, the Non-Linear Least Squares (NLLS) estimator is generally biased, but it's weakly consistent. This is the case even if the model has non-random regressors and an additive error term that satisfies all of the usual assumptions.

In addition, even if the model’s errors are normally distributed, the NLLS estimator will have a sampling distribution that is non-normal in finite samples, and the usual t-statistics will not be Student-t distributed in finite samples.

In this post I'll illustrate these, and some other results, by using a simple Monte Carlo experiment.
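In that spirit, here's a minimal sketch of such an experiment. It uses a toy model of my own choosing, y = β²x + ε, in which the NLLS estimator has a closed form, so the finite-sample bias is easy to see even with fixed regressors and well-behaved normal errors:

```python
import numpy as np

# Toy model: y = beta^2 * x + eps, nonlinear in beta. Here NLLS has a
# closed form: beta_hat = sqrt(gamma_hat), where gamma_hat is the OLS
# estimate of gamma = beta^2. Since sqrt(.) is concave, beta_hat is
# biased downward in finite samples.
rng = np.random.default_rng(0)
beta, n, reps = 1.0, 10, 20_000
x = np.linspace(1.0, 2.0, n)            # fixed, non-random regressor
estimates = np.empty(reps)
for r in range(reps):
    y = beta**2 * x + rng.normal(size=n)
    gamma_hat = (x @ y) / (x @ x)       # OLS for gamma = beta^2
    estimates[r] = np.sqrt(max(gamma_hat, 0.0))

bias = estimates.mean() - beta          # small, but systematically negative
```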

Monday, October 29, 2012

Central Limit Theorems

When we first encounter asymptotic (large sample) theory in econometrics, one of the most important results that we learn about is the Central Limit Theorem.  Loosely speaking we learn that if we aggregate together enough values that are sampled randomly from the same distribution, with a finite mean and variance, then this aggregate starts to behave as if it is normally distributed.

However, too few courses make it clear that this "classical" central limit theorem is just one of several such results. The one that assumes independently and identically distributed values is actually the Lindeberg-Lévy Central Limit Theorem. There are other, related, results that deal with less restrictive situations.
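A quick simulation makes the Lindeberg-Lévy result concrete. Here, standardized sample means of i.i.d. draws from a heavily skewed (exponential) distribution behave approximately like N(0, 1):

```python
import numpy as np

# Exponential(1) draws have mean 1 and variance 1, so the standardized
# sample mean is sqrt(n) * (xbar - 1). By the Lindeberg-Levy CLT this
# should be approximately N(0, 1) for large n.
rng = np.random.default_rng(123)
n, reps = 200, 50_000
draws = rng.exponential(scale=1.0, size=(reps, n))
z = (draws.mean(axis=1) - 1.0) * np.sqrt(n)

tail = (z > 1.96).mean()   # close to the standard normal tail, 0.025
```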

Friday, October 26, 2012

Viren Srivastava

Recently, a reader of this blog asked if I could provide some information about the late V.K. Srivastava, and the substantial contributions that he made to econometrics and to statistics generally.

I'm more than happy to oblige, as Virendra (Viren) was a good friend of mine, a treasured co-author, and a very caring and humble individual.

Tuesday, October 23, 2012

Jobs for Econometricians

My impression is that there is a strong international market for economists who have strong skills in econometrics. I'm not talking just about jobs in the academic community, but also about positions in the private, public, and non-profit sectors.
This blog has a page that lists a very small selection of such jobs. This list has never been meant to be exhaustive. That's not what this blog is about. Rather, the jobs listed on that page are meant to be illustrative of some of the various jobs that are available to econometricians.
If you're looking seriously for an academic position, especially at the entry level, then the obvious place to start is Job Openings for Economists (JOE). This is sponsored by the American Economics Association, but handles jobs internationally. Although the focus is on academic positions, jobs in other sectors appear in JOE too.
Another website that may interest you is a commercial site that lists positions specific to econometrics. It also has international coverage, and covers all sectors of the workforce. Just browsing some of the jobs that are advertised there may broaden your perception of the opportunities that are available to econometricians.
There are other sites too, of course. Perhaps some of these will get mentioned in comments to this post. 

© 2012, David E. Giles

Saturday, October 20, 2012


H/T to my colleague, Martin Farnham, for drawing my attention to Mathgen.
Thanks to Nate Eldredge, a mathematician at Cornell University, who blogs at That's Mathematics!, you can randomly generate your own mathematics research paper!
In fact, a Mathgen-generated paper was recently accepted for publication at one of those pseudo-journals that seem to have sprouted with a vengeance of late. If you weren't convinced already that these publishing outlets should be avoided like the plague, this ought to do it for you!
Just for funzies, I decided to solicit Mathgen's assistance in writing my own paper. It took just a few seconds, and you can read it here. Constructive comments are welcomed, of course. Just don't ask me what the title means.
I have a feeling that this is going to be a particularly productive weekend!
(As Martin suggested to me, this is every journal editor's new nightmare!)

© 2012, David E. Giles

Thursday, October 18, 2012

Let's be Consistent

One of the standard, large-sample, properties that we hope our estimators will possess is "consistency". Indeed, most of us take the position that if an estimator isn't consistent, then we should probably throw it away and look for one that is!

When you're talking about the consistency of an estimator, it's a really good idea to be quite clear regarding the precise type of consistency you have in mind - especially if you're talking to a statistician! For example, there's "weak consistency", "strong consistency", "mean square consistency", and "Fisher consistency", at least some of which you'll undoubtedly encounter from time to time as an econometrician.

Monday, October 15, 2012

Some Historical Links

You've probably noticed that some of my posts are essentially pieces that focus on some aspect of the history of econometrics, and/or the history of statistics.  I certainly have a bit of an interest in these topics, and I also find that it's helpful to inject a bit of historical content when I'm teaching. 

It doesn't necessarily have to be very much - just something interesting to make the name of the econometrician in question, or the origin of a concept a bit more memorable. Or perhaps some historical context that's intended to clarify why the literature took a certain turn at a certain time.

It's both interesting and enlightening to know something about where your discipline came from, how it evolved over time, and who the players were. Some of them were really interesting people!

Friday, October 12, 2012

What I Learned Last Week

Somewhat to my surprise, last month I got a great response to my post, "My Must-Read List" (HT's to Mark Thoma & Tyler Cowen). This past week I learned a lot by reading some terrific new papers on a variety of econometrics topics. Here they are, with some commentary, and in no particular order:

Degrees of Freedom in Regression

Yesterday, one of the students from my introductory grad. econometrics class was asking me for more explanation about the connection between the "degrees of freedom" associated with the OLS regression residuals, and the rank of a certain matrix. I decided to put together a quick handout to do justice to her question, and it occurred to me that this handout might also be of interest to a wider group of student readers.
So, here's what I wrote.
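The handout itself isn't reproduced in this extract, but its key fact - that the residuals' degrees of freedom equal the rank (and trace) of the idempotent "residual-maker" matrix M = I - X(X'X)⁻¹X' - is easy to verify numerically; a quick sketch:

```python
import numpy as np

# OLS residuals are e = M y, with M = I - X(X'X)^(-1)X'. M is symmetric
# and idempotent, so rank(M) = trace(M) = n - k: the degrees of freedom.
rng = np.random.default_rng(3)
n, k = 30, 4
X = rng.normal(size=(n, k))                 # any full-column-rank X will do
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

trace = np.trace(M)                         # equals n - k = 26
rank = np.linalg.matrix_rank(M)             # also n - k = 26
idempotent = np.allclose(M @ M, M)          # True
```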

Wednesday, October 10, 2012

How Good is Your Random Number Generator?

Simulation methods, including Monte Carlo simulation and various forms of the bootstrap, are widely used by econometricians. We use these tools to learn about the sampling distributions of our estimators and tests, especially in situations where a purely analytic approach is technically difficult.

For example, sometimes we're able to appeal to standard asymptotic (large sample) results - such as the central limit theorems, and the laws of large numbers - to figure out how good our inferences will be if the sample size is very large. However, when it comes to the question of how good they are when the sample size is quite small, the answer may not be so easily established.

In addition, when we come up with a new theoretical result in econometrics, most of us take the precaution of also simulating the result - as a check on its accuracy.

Monte Carlo and bootstrap methods rely critically on our ability to generate "pseudo"-random numbers that have the characteristics that we ascribe to them. How often have you actually checked if the random number generators in your favourite econometrics package produce values that are "random", and follow the distribution that you've asked for? Probably not often enough!
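A first-pass check is easy to run yourself. Here's a sketch that bins 100,000 uniform draws and applies a chi-square goodness-of-fit test - a necessary check, though far from a sufficient one:

```python
import numpy as np

# Bin 100,000 uniform draws into 10 equal cells and compare the observed
# counts with the expected count of 10,000 per cell.
rng = np.random.default_rng(2012)
u = rng.random(100_000)
observed = np.histogram(u, bins=10, range=(0.0, 1.0))[0]
expected = len(u) / 10
chi2 = ((observed - expected) ** 2 / expected).sum()

# Under uniformity, chi2 is approximately chi-square with 9 d.o.f., whose
# 5% critical value is about 16.92. Even a perfect generator "fails"
# about 5% of the time, and passing says nothing about, e.g., serial
# dependence - proper test suites probe much more than marginal fit.
```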

I follow John Cook's blog, The Endeavour. A couple of years ago he had a nice post titled, "How to test a random number generator". In that post, he links to a chapter of the same title that he wrote for the book, Beautiful Testing (edited by Tim Riley and Adam Goucher).

John's chapter is a short, but very valuable read, and I recommend it strongly.

© 2012, David E. Giles

Top 100 Economics Blogs

I was happy to learn this morning that the Economics Degree website has just released a list of Top Economics Sites for Enlightened Economists. There are 100 blogs in total, and the preamble notes:

"Listed in no specific order, these sites are a must-see for anyone who wants to be considered a quality, “enlightened” economist. Sites were selected based on a variety of factors including readership size, update frequency, information quality, and other awards received. "
I was even more pleased to see that this blog made the list (see number 71, & remember they're in no particular order!), with the following, very kind, description:
"This is a high-quality blog with a strong econometrics focus. The posts are jam-packed with information and ideas, and are clearly intended for readers with a background in statistics or econometrics."
(Blush. Blush.)

There are some great sites for students and professionals alike on this Top 100 list.

Nice job!

© 2012, David E. Giles

Tuesday, October 9, 2012

Mathematics, Economics, & the Nobel Prize

With the announcement of this year's Nobel Prize in Economic Science less than a week away, here's a recent working paper that you'll surely enjoy: "The use of mathematics in economics and its effect on a scholar's academic career", by Miguel Espinosa, Carlos Rondon, and Mauricio Romero. (Be sure that you download the latest version - dated September 2012.)

Here's the abstract:
"There has been so much debate on the increasing use of formal mathematical methods in Economics. Although there are some studies tackling these issues, those use either a little amount of papers, a small amount of scholars or cover a short period of time. We try to overcome these challenges constructing a database characterizing the main socio-demographic and academic output of a survey of 438 scholars divided into three groups: Economics Nobel Prize winners; scholars awarded with at least one of six prestigious recognitions in Economics; and academic faculty randomly selected from the top twenty Economics departments worldwide. Our results provide concrete measures of mathematization in Economics by giving statistical evidence on the increasing trend of number of equations and econometric outputs per article. We also show that for each of these variables there have been four structural breaks and three of them have been increasing ones. Furthermore, we found that the training and use of mathematics has a positive correlation with the probability of winning a Nobel Prize in certain cases. It also appears that being an empirical researcher as measured by the average number of econometrics outputs per paper has a negative correlation with someone's academic career success." (Emphasis added; DG)
The first of the highlighted conclusions doesn't surprise me. I'm not sure that I like the second one, though!

© 2012, David E. Giles

Monday, October 8, 2012

Seminar Attendance: A Pep-Talk for Grad. Students

Grad. students are busy, busy people. That's true, no matter what discipline or what institution we're talking about. They're busy with their course-work, comprehensive exams, research, drinking beer, working as teaching assistants and research assistants, maintaining relationships with partners and children... Grad. students even get to sleep every now and then!

So, something has to give. One way to grab an extra two or three hours each week is to avoid attending, and participating in, the research seminars put on by your department. Is that a smart choice, though?

Sunday, October 7, 2012

Dancing With the Econometricians

Let's talk about the two-step. Not the tango or the polka. The two-step!

More specifically let's talk about a particular two-step estimator that we use all of the time in econometrics. I want to clear up some misconceptions that I seem to encounter all too frequently when I read empirical "applied" papers.

Why is it that some people insist on using the term "Two Stage Least Squares" inappropriately? 

Let me explain what I mean.

Tuesday, October 2, 2012


Congratulations to Ryan Godwin for successfully defending his Ph.D. dissertation yesterday. Ryan's dissertation was titled, "Econometric Analysis of Non-Standard Count Data", and you can find the abstract on the notice for his defense here.
Ryan is now on faculty in the Department of Economics at U. Manitoba.

I'm looking forward to working with Ryan in the future.
© 2012, David E. Giles

Wednesday, September 26, 2012

My "Must Read" List

I have to confess that the number of items on my list of papers that I really must read (very soon) is rather large. My excuse is the same as everyone else's - too many papers, too little time. However, here's a small selection of some of the papers that I've added to that list recently:

Monday, September 24, 2012

The Journal of Econometric Methods

The first issue of the Journal of Econometric Methods is available online, and you can register for a FREE trial access if your library doesn't already subscribe to a package that includes this journal.

Edited by Jason Abrevaya, Bo Honoré, Atsushi Inoue, Jack Porter, and Jeff Wooldridge, the Journal of Econometric Methods promises to be a "must read" publication. For now, issues will be published once a year.

Here's an extract from the "Editorial" of the first issue:

Sunday, September 23, 2012

Different Views on Significance Testing

Econ Journal Watch is an online resource that provides "scholarly comments on academic economics". If you don't follow it, or at least browse it from time to time, then I urge you to do so. Yes, that includes econometricians!
In the September 2012 issue you'll find a piece by Thomas Mayer. It's titled, "Ziliak & McCloskey's Criticisms of Significance Tests: An Assessment", and you can download a pdf version of the full article for free.

 Here's the abstract of his article:

Sunday, September 16, 2012

Confidence Regions for Regression Coefficients

Let’s consider the usual linear regression model, with the full set of assumptions:

                     y = Xβ + ε ;    ε ~ N[0, σ²I_n] ,          (1)

where X is a non-random (n × k) matrix with full column rank.

Recall that, under our usual set of assumptions for the linear regression model, the OLS coefficient estimator, b = (X'X)⁻¹X'y, has the following sampling distribution:

               b ~ N[β, σ²(X'X)⁻¹] .          (2)

From the form of the covariance matrix for the b vector, we see that, in general:

(i) The leading diagonal elements will not all be the same, so each element of b will usually have a different variance.

(ii) There is no reason for the off-diagonal elements of the covariance matrix to be zero in value, so the elements of the b vector will be pair-wise correlated with each other.

You'll also remember that when we develop a confidence interval for one of the elements of β, say β_i, we start off with the following probability statement:

                  Pr.[-t_c < (b_i - β_i) / s.e.(b_i) < t_c] = (1 - α) ,        (3)

where tc is chosen to ensure that the desired probability of (1 - α) is achieved. Equation (3) is then re-written (equivalently) as:

                 Pr.[βi - tcs.e.(bi) < bi < βi +tcs.e.(bi)] = (1 - α),       (4)

and we then manipulate the event whose probability of occurrence we were interested in, until we ended up with the following random interval which, if constructed many, many times, would cover the true (but unobserved) βi, 100(1 - α)% of the time:

                 [bi - tcs.e.(bi) , bi + tcs.e.(bi)] .          (5)

Notice that this interval is centered at bi. Making the interval symmetric about this point ensures that we get the shortest (and hence most informative) interval for any fixed values of n, the sample size, and α. (See here and here for more details.)
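To make the recipe in (3) to (5) concrete, here's a small numerical sketch of my own (not from the original post), assuming numpy is available; the simulated data and the 95% critical value for 47 degrees of freedom are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 50, 3

# Simulated data satisfying the model y = X beta + eps, eps ~ N[0, sigma^2 I]
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)     # OLS: b = (X'X)^(-1) X'y
resid = y - X @ b
s2 = resid @ resid / (n - k)              # unbiased estimator of sigma^2
cov_b = s2 * np.linalg.inv(X.T @ X)       # estimated covariance matrix of b
se = np.sqrt(np.diag(cov_b))              # standard errors, s.e.(b_i)

t_c = 2.0117                              # 97.5th percentile of t(47), from tables
intervals = [(b[i] - t_c * se[i], b[i] + t_c * se[i]) for i in range(k)]
```

Notice that the computed cov_b generally has unequal diagonal elements and non-zero off-diagonal elements, exactly as points (i) and (ii) above describe.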

Now, suppose that we want to generalize the concept of a confidence interval (for a single element of β) to that of a confidence region that can be associated with two elements of β at once.

Saturday, September 15, 2012

Spherically Distributed Errors in Regression Models

Let's think about the standard linear regression model that we encounter in our introductory econometrics courses:
                                       y = Xβ + ε .                      (1)
By writing the model in this form, we've already made two assumptions about the stochastic relationship between the dependent variable, y, and the regressors (the columns of the X matrix). First, the relationship is a parametric one - hence the presence of the coefficient vector, β; and second, the relationship is a linear one. That's to say, the model is linear in these parameters. If it wasn't, we wouldn't be able to write the model in the form given in equation (1).
However, the model isn't fully specified until we lay out any assumptions that are being made about the regressors and the random error term, ε. Now, let's consider the full set of (rather stringent) assumptions that we usually begin with:

Friday, September 14, 2012

Dummy Variables - Again!

In a previous post (here) I had a few things to say about the dummy variables that we often use in regression analysis. I'm currently making changes to a related paper of mine that's at the "revise and re-submit" stage with a journal. So, to get further feedback, I presented the material in my department's Brown Bag seminar series earlier this week.

If you're interested, you can download the slides for that presentation from here.

© 2012, David E. Giles

Thursday, September 13, 2012

Granger Causality Testing With Panel Data

Some of my previous posts on testing for Granger causality (for example, here, here, and here) have drawn quite a lot of interest. That being the case, I'm sure that readers of this blog will enjoy reading a new paper by two of my colleagues, and a former graduate student of theirs.

The paper, by Weichun Chen, Judith Clarke, and Nilanjana Roy is titled "Health and Wealth: Short Panel Granger Causality Tests for Developing Countries". Here's the abstract of their paper:

Monday, September 10, 2012

Guy Medal for David Firth

At the recent annual conference of the Royal Statistical Society, the Guy Medal, in Silver, was awarded to Professor David Firth, Head of the Department of Statistics at the University of Warwick.

Alert readers of this blog will recall David's name appearing in a recent post about bias correction. David, a Fellow of the British Academy, was previously awarded the Guy Medal, in Bronze, in 1998.

You can find a full list of all winners of the Guy Medals, in Gold, Silver, and Bronze, here. Econometricians will see lots of very familiar names on the lists!

© 2012, David E. Giles

What's Your Favourite Data Analysis Cartoon?

This question was asked on Cross Validated, the Stack Exchange Q&A site for statistics. Your choice!


© 2012, David E. Giles

Sunday, September 9, 2012

Using Integrated Likelihoods to Deal With Nuisance Parameters

There are more possibilities open to you when using maximum likelihood estimation than you might think.

When we're conducting inference, it's often the case that our primary interest lies with a subset of the parameters, and the other parameters are essentially what we call "nuisance parameters". They're part of the data-generating process, but we're not that interested in learning about them.

We can't just ignore these other parameters - that would amount to mis-specifying the model we're working with. However, in the context of maximum likelihood estimation, there are several things that we can do to make life a little easier.
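As a toy illustration of one such option (my sketch, not the author's, assuming numpy): take y_i ~ N[μ, σ²], with μ the parameter of interest and σ the nuisance parameter, and integrate σ out of the likelihood numerically, leaving an "integrated likelihood" for μ alone. The flat weight function on σ and the grids are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=3.0, scale=1.5, size=40)
n = len(y)

def integrated_loglik(mu, sig_grid):
    """Log-likelihood for mu with the nuisance parameter sigma integrated out
    numerically over a grid (a flat weight function on sigma, for illustration)."""
    rss = np.sum((y - mu) ** 2)
    logv = -n * np.log(sig_grid) - rss / (2.0 * sig_grid ** 2)
    m = logv.max()                      # log-sum-exp trick for numerical stability
    dsig = sig_grid[1] - sig_grid[0]
    return m + np.log(np.sum(np.exp(logv - m)) * dsig)

sig_grid = np.linspace(0.2, 6.0, 400)
mu_grid = np.linspace(1.0, 5.0, 401)
vals = [integrated_loglik(mu, sig_grid) for mu in mu_grid]
mu_hat = mu_grid[int(np.argmax(vals))]  # maximizer is (essentially) the sample mean
```

In this simple case the integrated likelihood for μ is maximized at the sample mean, so nothing is lost by removing σ this way; the pay-off comes in messier models.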

Saturday, September 8, 2012

NBER Summer Institute 2012

Recently, I checked out the site for the NBER Summer Institute 2012 - Econometric Methods for Demand Estimation.
There, you'll find eight videos of some of the lectures presented by Ariel Pakes (Harvard) and Aviv Nevo (Northwestern). The slides that accompany the lectures are also available for downloading.
The topics covered in the video lectures are:
Pakes -
  • The primitives of static demand models
  • Confronting the precision problem, the information in prices, implications for use of hedonics
  • Incorporating micro data
  • Moment inequalities in demand analysis
Nevo -
  • Estimation of static discrete choice models using market level data
  • Applications and choice of IV's
  • Measurement of consumer welfare
  • Dynamic demand
I learned a lot from these lectures, and I hope that you find them interesting too!

© 2012, David E. Giles

Friday, September 7, 2012

So, What is Econometrics?

Over the years there have been many attempts to define what we mean by the term "Econometrics". I guess we all have our favourites. Mine comes from one of the most influential econometricians of our time - David Hendry:

"Unfortunately, I must now try to explain what "econometrics" comprises. Do not confuse the word with "econo-mystics" or with "economic-tricks", nor yet with "icon-ometrics". While we may indulge in all of these activities, they are not central to the discipline. Nor are econometricians primarily engaged in measuring the heights of economists."

Monday, September 3, 2012

On Crime and Punishment

Quite regularly, I take a look at the "Graphic Detail" blog that's published each business day on The Economist's website. Many of the graphs, maps and infographics that they produce are rather interesting.
Today's one is taken from a recent study, "Divergent Effects of Beliefs in Heaven and Hell on National Crime Rates", published by Azim Shariff and Mijke Rhemtulla in the open-access journal, PLoS ONE.
Here's the abstract from that paper:

Thursday, August 30, 2012

The Cauchy Estimator & Unit Root Tests

As we all know, there's more than one way to estimate a regression equation. Some of the estimators that we frequently use include OLS, GLS, IV, GMM, LAD, and ML. Some of these estimators are special cases of some of the others, depending on the circumstances.

But have you ever used the Cauchy estimator? Probably not, even though it's been around (at least) since 1836.
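For the simple regression y_i = βx_i + ε_i, one common way of writing the Cauchy estimator is as an instrumental-variables estimator that uses the sign of the regressor as the instrument. Here's an illustrative sketch of my own (not from the post), assuming numpy, with made-up data:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)   # true slope is 2.0

# Cauchy estimator: IV with z_i = sign(x_i) as the instrument, so
# beta_hat = sum(sign(x_i) * y_i) / sum(|x_i|)
beta_cauchy = np.sum(np.sign(x) * y) / np.sum(np.abs(x))

# For comparison, the familiar OLS estimator:
beta_ols = np.sum(x * y) / np.sum(x ** 2)
```

Both estimators land close to the true slope here; the Cauchy estimator's appeal in the unit root literature lies elsewhere, in the distribution of its t-ratio.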

Wednesday, August 29, 2012

Visualization Methods

Data visualization is an important part of any statistical analysis, including econometric modelling. This is a point I've made before, I know (here, for example). However, it's worth repeating the message from time to time. The Visual-Literacy website has a fun item that summarizes some of the different ways of conveying information in a visual manner: Periodic Table of Visualization Methods.

To be sure, not all of them are relevant to econometricians, but I thought it was kinda fun!

(H-T to John Cook and his @StatFact)

© 2012, David E. Giles

Tuesday, August 28, 2012

Topp-Leone Distribution

In May, I posted about bias-correcting maximum likelihood estimators (MLEs), and I referred to a series of related papers that I've been authoring/co-authoring in recent times.

I've just completed another such paper - this one relates to the estimation of the shape parameter in the so-called Topp-Leone (1955) distribution. You can access the paper here; and the abstract explains why this distribution is especially interesting, and summarizes the main results:

"The Topp-Leone distribution is attractive for reliability studies as it has finite support and a bathtub-shaped hazard function. We compare some properties of the method of moments, maximum likelihood, and bias-adjusted maximum likelihood estimators of its shape parameter. The last of these estimators is very simple to apply and dominates the method of moments estimator in terms of relative bias and mean squared error."

Sunday, August 26, 2012

Economic Forecasting

I really enjoyed this post, titled Economic Forecasting: Is Google Trends the Future?, by Livio Di Matteo, on the Worthwhile Canadian Initiative blog.

The title speaks for itself.

© 2012, David E. Giles

"The Rise of Econometrics"

Readers of this blog will know that I have an (untrained) interest in the history of econometrics.

Even so, I'm afraid I don't see myself spending $1,300 to buy the set of volumes going by the name of this post's title, when Routledge publishes it in December!
Edited by Duo Qin, the set runs to 1,777 pages, and:
"The set provides an authoritative one-stop resource to enable users to understand what has shaped econometrics into its current form. With a full index and comprehensive introductions to each volume, newly written by the editor, the collection also provides a synoptic view of many current key debates and issues."
Hopefully I can persuade the UVic Library to get on board!
In any case, the Table of Contents for this 4-volume set provides a fine reading guide for serious students of econometrics. Take a look, and then do some bedtime reading!

© 2012, David E. Giles

Friday, August 24, 2012

On Becoming a Sportsmetrician

Wouldn't you know it!? No sooner had I posted about Analysing Olympic Medal Data than the latest issue of AmstatNews hit my (snail) mailbox. AmstatNews is the monthly magazine for members of the American Statistical Association.

In the STATtr@ck section, there's a nice article by Jim Albert, titled "Preparing for a Career as a Sports Statistician: Two Interviews With People in the Field".

Take a look if you have any inclination toward becoming a Sportsmetrician.

© 2012, David E. Giles

Analysing Olympic Medal Data

So, the London Olympics are over - with the Paralympics still to come, of course. Sports, and events such as the Olympic Games, generate lots of lovely data. It's also usually "hard" data. So, there's a cottage industry out there comprised of statisticians of all shapes and forms who love to work with sports data.

The American Statistical Association has a Section for Statistics in Sport, publishes the Journal of Quantitative Analysis in Sports, and provides access to some interesting sports data-sets.

Tuesday, August 21, 2012

Interview With George Judge

The journal, Econometric Theory, has a long-standing tradition of publishing excellent interviews with econometricians (and some statisticians) who have made seminal contributions to our discipline over the years.

This has always struck me as a particularly worthwhile service to the econometrics community, and I often encourage my grad. students to read these interviews.

In an ET Interview that will be appearing shortly, Anil Bera talks to George Judge. Anil has made a copy of the interview available here, and I think that it will be of considerable interest to many readers.

© 2012, David E. Giles

Whose F Distribution Was It?

We use the F-distribution all of the time in our econometric work. But why is it called the "F" distribution?

A lot of students guess that the name is related to the great statistician, Sir Ronald A. Fisher. They're partly right. However, the occasionally encountered term, "Fisher's F Distribution" is somewhat misleading. The alternative terms, "Snedecor's F distribution" or the "Fisher-Snedecor Distribution" offer more accurate information.

Personally, I prefer to call it "Snedecor's F" - after George W. Snedecor. Here's why.

Monday, August 20, 2012

Egon Pearson

During the last few days, the Error Statistics blog has included posts about Egon Sharpe Pearson. Egon (son of Karl Pearson) made numerous fundamental contributions to mathematical statistics, many of which bear directly on the development and practice of econometrics.

I had a post about the Neyman-Pearson Lemma earlier this year.

Sunday, August 19, 2012

Goodness-of-Fit Testing With Discrete "Circular" Data

Tests for goodness-of-fit based on the empirical distribution function are pretty standard fare. Their applicability relies on the Glivenko-Cantelli Theorem.

However, things get a little tricky when the data are discrete (rather than continuous), or when they are "circular" in nature. When the data exhibit both of these characteristics, some really interesting testing issues have to be handled.

A paper that I wrote a short while back on this topic has just been accepted for publication in the Chilean Journal of Statistics. The paper is titled, "Exact Asymptotic Goodness-of-Fit Testing For Discrete Circular Data, With Applications", and it'll be appearing in the 2013 volume of the journal.

You can download a copy of the paper here.

© 2012, David E. Giles

Thursday, August 16, 2012

The Likelihood Principle

The so-called "Likelihood Principle" forms the foundation of both classical (frequentist) and Bayesian statistics. So, as an econometrician, whether you rely on Maximum Likelihood estimation and the associated asymptotic tests, or you prefer to adopt a Bayesian approach to inference, this principle is of fundamental importance to you.

What is this principle? Suppose that x is the value of a (possibly vector-valued) random variable, X, whose density depends on a vector of parameters, θ. Then, the Likelihood Principle states that:

"All the information about θ obtainable from an experiment is contained in the likelihood function for θ given x. Two likelihood functions for θ (from the same or different experiments) contain the same information about θ if they are proportional to one another."  (Berger and Wolpert, 1988, p.19).

Tuesday, August 14, 2012

Promoting Econometrics

A post today on the "Simply Statistics" blog is titled, "Statistics/Statisticians Need Better Marketing". 

I liked it a lot, and much of the content could be applied to the econometrics community. 

However, one of the suggestions worried me a bit - namely:

"Whenever someone does something with data, we should claim them as a statistician."

I'm not sure I'd like to claim as an econometrician, anyone who does some empirical analysis involving economic data. Goodness knows there's an awful lot of garbage out there! And it's produced by people I wouldn't call econometricians, even if that's how they describe themselves!

But maybe I'm just getting old and grumpy.

© 2012, David E. Giles

Monday, August 13, 2012

Videos on Using R

In this post on his blog some months ago, Ethan Fosse drew attention to Anthony Damico's collection of over 90 videos on using the R software environment.

Definitely worth looking at!

© 2012, David E. Giles

International Year of Statistics

2013 will be The International Year of Statistics. The associated website can be found here.

Quoting from the site:

"The International Year of Statistics ("Statistics2013") is a worldwide celebration and recognition of the contributions of statistical science. Through the combined energies of organizations worldwide, Statistics2013 will promote the importance of Statistics to the broader scientific community, business and government data users, the media, policy makers, employers, students, and the general public. 

The goals of Statistics2013 include: 

  • increasing public awareness of the power and impact of Statistics on all aspects of society;
  • nurturing Statistics as a profession, especially among young people; and
  • promoting creativity and development in the sciences of Probability and Statistics"
Various upcoming activities that acknowledge the International Year of Statistics can be found here.

© 2012, David E. Giles

Monday, August 6, 2012

James Durbin

James Durbin has passed away at the age of 89. Jim's numerous contributions to statistics included many that also made him a "household name" in econometrics circles.

There is a short obituary on p. 7 of the latest issue of RSS News. A full obituary will follow in a future issue of the Journal of the Royal Statistical Society, Series A.

For some earlier historical material relating to James Durbin in this blog, see the earlier post here.

© 2012, David E. Giles

Thursday, July 26, 2012

Beware of Tests for Nonlinear Granger Causality

Standard tests for Granger causality (or, more correctly, Granger non-causality) are conducted under the assumption that we live in a linear world.

I've discussed some of the issues associated with applying such tests in the presence of possibly integrated/cointegrated time-series data previously, here and here.

But can we justify limiting our attention to a linear environment?


It's been a quiet week at the lake - specifically, no internet access! And hence no posts.

I haven't been able to respond to comments and requests either, so please accept my apology for that.
Yes, I know I should be better organized!

Monday, July 16, 2012

Hodrick-Prescott Filter Paper

A while back I posted (here, here, and here) about constructing confidence bands to go with the Hodrick-Prescott filter. Subsequently, I wrote up the material more formally, and that paper is to appear in Applied Economics Letters.

You can find the final version of the paper here.

Hat-tip to my colleague, Graham Voss, for encouraging me to write up the material properly.


Giles, D. E., 2012. Constructing confidence bands for the Hodrick-Prescott filter. Forthcoming in Applied Economics Letters.

© 2012, David E. Giles

Sunday, July 15, 2012

Cleaning up Your Data Files

A recent post on The Data Monkey blog describes a really neat (and free) text editor, called Hex Editor Neo.

If you have large, messy, data files that need cleaning, this looks like the editor for you!

© 2012, David E. Giles

Saturday, July 14, 2012

Where Have All the Data Gone?

Perhaps the title of this post should be "Why Are All the Data Going?". This time it's Statistics Canada's SLID that's slip, sliding away. Or, more correctly, it effectively slid away last month!

Friday, July 13, 2012

More Comments on the Use of the LPM

Alfredo drew my attention to Steve Pischke's reply to a question raised by Mark Schaffer on the Mostly Harmless Econometrics blog. The post was titled, "Probit Better than LPM?" The question related to my own posts (here, here, and here, in reverse order) on this blog concerning the choice between OLS (the Linear Probability Model - LPM) and the Logit/Probit models for binary data.

Thanks, Alfredo, as this isn't a blog I follow. 

Alfredo asked: "Would you care to respond? I feel like this is truly an exchange from which a lot of people can learn".

Tuesday, July 10, 2012

Concentrating, or Profiling, the Likelihood Function

We call it "concentrating", they (the statisticians) call it "profiling" - the likelihood function, that is.

Different language - same thing.

So what's this all about, anyway?
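A minimal sketch of the idea (mine, assuming numpy): in the normal linear regression model, maximize the log-likelihood over σ² analytically first, giving σ̂²(β) = RSS(β)/n. Substituting this back in yields the concentrated (or profile) log-likelihood, a function of β alone, and its maximizer is still the OLS/ML estimator, b:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 60, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

def concentrated_loglik(beta):
    """Normal-regression log-likelihood with sigma^2 concentrated out:
    substitute sigma^2_hat(beta) = RSS(beta)/n back into the log-likelihood."""
    rss = np.sum((y - X @ beta) ** 2)
    return -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1.0)

b = np.linalg.solve(X.T @ X, X.T @ y)   # OLS = ML estimator of beta
```

The concentrated function is maximized at b itself, so the two-step maximization reproduces full maximum likelihood, while reducing the dimension of the numerical search.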

Monday, July 9, 2012

Decline and Fall of the Power Curve

When we think of the power curve associated with some statistical test, we usually envisage a curve that looks something like (half or all of) an inverted Normal density. That is, the curve rises smoothly and monotonically from a height equal to the significance level of the test (say 1% or 5%), until eventually it reaches its maximum height of 100%.

The latter value reflects the fact that power is a probability.
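That textbook shape is easy to trace out. Here's a stdlib-Python sketch of my own (not from the post) of the power function of a two-sided z-test of H0: μ = 0 - it rises from α at the null towards 100% in the limit:

```python
from statistics import NormalDist

def power_two_sided_z(mu, n=25, sigma=1.0, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = 0 against H1: mu != 0,
    for a sample of n i.i.d. N[mu, sigma^2] observations."""
    Phi = NormalDist().cdf
    z_c = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = mu * n ** 0.5 / sigma               # "non-centrality" under H1
    return Phi(-z_c + shift) + Phi(-z_c - shift)

# Trace the curve for true means mu = 0.0, 0.1, ..., 1.5
curve = [power_two_sided_z(mu / 10) for mu in range(0, 16)]
```

At μ = 0 the curve's height is exactly α, and here it climbs monotonically towards 1 as |μ| grows.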

But is this picture that invariably comes to mind - and that we see reproduced in all elementary econometrics and statistics texts - really the full story?

Actually - no!

Sunday, July 8, 2012

"Data is", or "Data are"?

I guess I'm a pedantic traditionalist when it comes to the word "data": one "datum", several pieces of "data", etc.

As with many matters relating to the use of language, though, this one isn't open and shut, by any means.

And so a few days ago we saw The Wall Street Journal, The Economist, and The Guardian grappling with this issue once again.

However, I'm going to stick to my guns, dust off my slide-rule, and also continue to use the "Oxford comma"!

© 2012, David E. Giles

Local vs. Global Approximations

Approximating unknown (continuously differentiable) functions by using a Taylor (MacLaurin) series expansion is common-place in econometrics. However, do you ever pause to recall that such approximations are only locally valid - that is, valid only in a neighbourhood of the (possibly vector) point about which the approximation is made?

Unlike some other types of approximations - such as Fourier approximations - they are not globally valid.

Does this matter? Is it something we should be concerning ourselves with?
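A quick stdlib-Python illustration of the "locally valid" point (my example, not from the post): the truncated Maclaurin series of log(1 + x) is excellent near the expansion point x = 0, but its accuracy deteriorates badly towards the edge of its interval of convergence:

```python
import math

def taylor_log1p(x, terms=10):
    """Truncated Maclaurin series of log(1 + x): sum over j of (-1)^(j+1) x^j / j.
    Converges only for -1 < x <= 1, and only slowly near the endpoints."""
    return sum((-1) ** (j + 1) * x ** j / j for j in range(1, terms + 1))

err_near = abs(taylor_log1p(0.1) - math.log1p(0.1))   # tiny
err_far = abs(taylor_log1p(0.9) - math.log1p(0.9))    # orders of magnitude larger
```

The same ten-term expansion that is essentially exact at x = 0.1 is noticeably wrong at x = 0.9, and diverges altogether beyond x = 1 - which is exactly the sense in which Taylor approximations are local rather than global.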

Friday, July 6, 2012

The Milliken-Graybill Theorem

Let's think about a standard result from regression analysis that we're totally familiar with. Suppose that we have a linear OLS regression model with non-random regressors, and normally distributed errors that are serially independent and homoskedastic. Then, the usual F-test statistic, for testing the validity of a set of linear restrictions on the model's parameters, is exactly F-distributed in finite samples, if the null hypothesis is true.

In fact, the F-test is Uniformly Most Powerful Invariant (UMPI) in this situation. That's why we use it! If the null hypothesis is false, then this test statistic follows a non-central F-distribution.

It's less well-known that all of these results still hold if the assumed normality of the errors is dropped in favour of an assumption that the errors follow any distribution in the so-called "elliptically symmetric" family of distributions. On this point, see my earlier post here.

What if I were now to say that some of the regressors are actually random, rather than non-random? Is the F-test statistic still exactly F-distributed (under the null hypothesis)?
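To fix notation before answering that, here's a small numerical sketch (mine, assuming numpy) of the F statistic for J linear restrictions Rβ = r, computed two equivalent ways - from the Wald form, and from the restricted and unrestricted residual sums of squares:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# Test the single restriction (J = 1) that the last coefficient is zero
R = np.array([[0.0, 0.0, 1.0]])
r = np.array([0.0])
J = R.shape[0]

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
rss_u = np.sum((y - X @ b) ** 2)            # unrestricted RSS
s2 = rss_u / (n - k)

# Wald form: F = (Rb - r)' [R (X'X)^(-1) R']^(-1) (Rb - r) / (J s^2)
d = R @ b - r
F_wald = (d @ np.linalg.solve(R @ XtX_inv @ R.T, d)) / (J * s2)

# RSS form: fit the restricted model (here: just drop the last regressor)
Xr = X[:, :2]
br = np.linalg.solve(Xr.T @ Xr, Xr.T @ y)
rss_r = np.sum((y - Xr @ br) ** 2)
F_rss = ((rss_r - rss_u) / J) / (rss_u / (n - k))
```

The two formulas agree exactly; under the classical assumptions above, the statistic is F(J, n - k) when the restrictions are true.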

Wednesday, July 4, 2012

The Role of Statistics in the Higgs Boson Discovery

With the scientific world abuzz today over the (possible) confirmation of the existence of the Higgs Boson, this post from David Smith on the SmartData Collective is a must-read for anyone with an interest in statistics.

© 2012, David E. Giles

Friday, June 29, 2012

SURE Models

In recent weeks I've had several people email to ask if I can recommend a book that goes into all of the details about the "Seemingly Unrelated Regression Equations" (SURE, or just SUR) model.

Any decent econometrics text discusses this model, of course. However, the treatment usually focuses on the asymptotic properties of the standard estimators - iterated feasible GLS, or MLE.

Attention, Stata Users

I've mentioned the Econometrics by Simulation blog before. Although it's still relatively new, it's had some great posts, and Francis Smart is doing a terrific job, as is reflected in the way that the page view numbers are building up.

Definitely worth a look, especially (but not only) if you're a Stata user.

© 2012, David E. Giles

Friday, June 15, 2012

F-tests Based on the HC or HAC Covariance Matrix Estimators

We all do it - we compute "robust" standard errors when estimating a regression model in any context where we suspect that the model's errors may be heteroskedastic and/or autocorrelated.

More correctly, we select the option in our favourite econometrics package so that the (asymptotic) covariance matrix for our estimated coefficients is estimated, using either White's heteroskedasticity-consistent (HC) estimator, or the Newey-West heteroskedasticity & autocorrelation-consistent (HAC) estimator.

The square roots of the diagonal elements of the estimated covariance matrix then provide us with the robust standard errors that we want. These standard errors are consistent estimates of the true standard deviations of the estimated coefficients, even if the errors are heteroskedastic (in White's case) or heteroskedastic and/or autocorrelated (in the Newey-West case).

That's fine, as long as we keep in mind that this is just an asymptotic result.

Then, we use the robust standard error to construct a "t-test"; or the estimated covariance matrix to construct an "F-test", or a Wald test.

And that's when the trouble starts!
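For concreteness, here's a sketch of my own (assuming numpy) of White's heteroskedasticity-consistent estimator in its simplest "HC0" form - the sandwich (X'X)⁻¹ X' diag(e²) X (X'X)⁻¹, whose diagonal square roots are the robust standard errors:

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
# Heteroskedastic errors: the error standard deviation grows with |x|
eps = rng.normal(size=n) * (0.5 + np.abs(X[:, 1]))
y = X @ np.array([1.0, 2.0]) + eps

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b                              # OLS residuals

# White's HC0 "sandwich" estimator of the covariance matrix of b
meat = X.T @ (e[:, None] ** 2 * X)         # X' diag(e^2) X
cov_hc0 = XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(cov_hc0))

# Conventional (non-robust) standard errors, for comparison
s2 = e @ e / (n - k)
se_ols = np.sqrt(np.diag(s2 * XtX_inv))
```

Keep in mind that se_robust is justified only asymptotically - which is precisely why treating the resulting "t-ratios" and "F-tests" as exact in small samples is where the trouble begins.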