Tuesday, February 2, 2016

February Reading List

Here's a suggested reading list for February:
  • Casey, G. and M. Klemp, 2016. Instrumental variables in the long run. MPRA Paper No. 68696.
  • Coglianese, J., L. W. Davis, L. Kilian, and J. H. Stock, 2016. Anticipation, tax avoidance, and the price elasticity of gasoline demand. Journal of Applied Econometrics, in press.
  • Falorsi, S., A. Naccarato, and A. Pierini, 2015. Using Google trend data to predict the Italian unemployment rate. Working Paper No. 203, Dipartimento di Economia, Università degli studi Roma Tre.
  • Harris, D., S. J. Leybourne, and A. M. R. Taylor, 2016. Tests of the co-integration rank in VAR models in the presence of a possible break in trend at an unknown point. Working Paper No. 5 01-2016, Essex Finance Centre, Essex Business School, University of Essex.
  • Inoue, A. and G. Solon, 2010. Two-sample instrumental variables estimators. Review of Economics and Statistics, 92, 557-561.
  • Kim, N., 2016. A robustified Jarque-Bera test for multivariate normality. Economics Letters, in press.

© 2016, David E. Giles

Sunday, January 24, 2016

(Legally) Free Books!

(An earlier version of this post inadvertently included links to "pirated" material. This has now been rectified, and the post has been completely re-written.)

There are several Econometrics books, and comprehensive sets of lecture notes, that can be accessed for free. These include a number of excellent books by world-class econometricians.

Here are a few that will get you started:

Thanks to Donsker Class for supplying several of these links.

If you know of others I'd love to hear about them.

Friday, January 22, 2016

Modelling With the Generalized Hermite Distribution

"Count" data occur frequently in economics. These are simply data where the observations are integer-valued, usually 0, 1, 2, .... However, the range of values may be truncated (e.g., 1, 2, 3, ...).

To model data of this form we typically resort to distributions such as the Poisson, negative binomial, or variations of these. These variations may account for truncation or censoring of the data, or the over-representation of certain count values (e.g., the "zero-inflated" Poisson distribution).

Covariates (explanatory variables) can be included in the model by making the mean of the distribution a function of these variables. After all, that's exactly what we do in a linear regression model.

If the "count" data form a time-series, then there are other issues that have to be taken into account.

However, the discrete distributions that we typically use have a number of limitations. The fact that the Poisson distribution is, of necessity, "equi-dispersed" (its variance equals its mean) is a big limitation. This leads us to consider distributions such as the negative binomial, in which the variance exceeds the mean. This enables us to model "over-dispersed" data, which are encountered frequently in practice.

The standard distributions are also limited in the distributional shapes that they can model. In particular, there are limitations on modal values in the data.

For instance, in the case of the Poisson distribution, these limitations are the following. If the parameter (λ) of the Poisson distribution is an integer, then there are two adjacent modes with equal modal height, at x = λ and x = λ-1. If λ is non-integer, then there is a single mode at int(λ), the integer part of λ.
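The modal behaviour of the Poisson distribution is easy to check numerically. The following is a minimal Python sketch; the function names and the values of λ are illustrative choices, not part of the original discussion.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_modes(lam, k_max=None):
    """Return the value(s) of k at which the Poisson pmf is largest."""
    if k_max is None:
        k_max = int(lam) + 20  # the mode never lies above lam, so this range is ample
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    peak = max(probs)
    # Exact ties can differ by rounding error, so compare with a relative tolerance.
    return [k for k, p in enumerate(probs) if math.isclose(p, peak, rel_tol=1e-12)]

modes_integer = poisson_modes(3.0)     # integer lambda: modes at lambda - 1 and lambda
modes_fractional = poisson_modes(2.5)  # non-integer lambda: single mode at int(lambda)
print(modes_integer, modes_fractional)
```

With λ = 3 the two modes sit at 2 and 3, and with λ = 2.5 the single mode sits at 2, in line with the statement above.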

In the case of the negative binomial distribution, there is a single mode.

This suggests that standard discrete distributions of the type that we typically use to model "count" data will not be very satisfactory if our data exhibit multi-modality.

We need to look to alternative distributions.

Here's an example of what I mean.

In an earlier post, I discussed some of my work involving the use of the so-called Hermite distribution, introduced by Kemp and Kemp (1965). As an example, I showed the distribution of data relating to the number of financial crises in various countries, as reproduced here:

You can see that, apart from being multi-modal, this empirical distribution is over-dispersed (its variance is approximately twice its mean).

In Giles (2010) I used the Hermite distribution, and various covariates, to model these data using maximum likelihood estimation.

The Hermite distribution can be generalized in various ways. Recently, Moriña et al. (2015) have released a terrific R package, called hermite, that makes it really easy to model "count data" using the Generalized Hermite distribution. We now have a convenient way of dealing with data that exhibit both over-dispersion and multi-modality.
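To see how a distribution of this type can be both over-dispersed and multi-modal, here is a small Python sketch of the basic (not the generalized) Hermite distribution. It relies on the standard representation X = Y1 + 2·Y2, with Y1 and Y2 independent Poisson variables with means a1 and a2. The parameter values are made up for illustration, and the code does not use or mimic the interface of the hermite R package.

```python
import math

def hermite_pmf(n, a1, a2):
    """P(X = n) when X = Y1 + 2*Y2, Y1 ~ Poisson(a1), Y2 ~ Poisson(a2)."""
    total = 0.0
    for j in range(n // 2 + 1):  # j counts the "double" events contributed by Y2
        total += (a1 ** (n - 2 * j) * a2 ** j
                  / (math.factorial(n - 2 * j) * math.factorial(j)))
    return math.exp(-(a1 + a2)) * total

a1, a2 = 0.1, 1.5  # hypothetical parameter values, chosen to produce several modes
probs = [hermite_pmf(n, a1, a2) for n in range(60)]  # the tail beyond 59 is negligible

mean = sum(n * p for n, p in enumerate(probs))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))

print("mean =", round(mean, 4), " variance =", round(variance, 4))
print("P(0), P(1), P(2) =", [round(p, 4) for p in probs[:3]])
```

For these values the mean is a1 + 2·a2 = 3.1 and the variance is a1 + 4·a2 = 6.1, so the variance is roughly twice the mean. The probability at 1 is lower than at both 0 and 2, so the distribution has more than one mode.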

I strongly recommend this new addition to R.


Giles, D. E., 2010. Hermite regression analysis of multi-modal count data. Economics Bulletin, 30(4), 2936–2945.

Kemp, C. D. and A. W. Kemp, 1965. Some properties of the ‘Hermite’ distribution. Biometrika, 52, 381-394.

Moriña, D., M. Higueras, P. Puig, and M. Oliveira, 2015. Generalized Hermite distribution modelling with the R package hermite. The R Journal, 7(2), 263-274.

Saturday, January 16, 2016

Why Does "Pi" Appear in the Normal Density

Every now and then a student will ask me why the formula for the density of a Normal random variable includes the constant, π, or more correctly (2π).

The answer is that this term ensures that the density function is "proper" - that is, the integral of the function over the full real line takes the value "1". The area under the density, or "total probability", is "1".
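This claim can be checked numerically before looking at any proof. The sketch below is only a sanity check, not a proof. It uses a simple trapezoidal rule (the helper name, integration limits and step count are arbitrary choices) to integrate exp(-x²/2) with and without the 1/√(2π) factor.

```python
import math

def trapezoid(f, lower, upper, steps):
    """Approximate the integral of f over [lower, upper] by the trapezoidal rule."""
    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * h)
    return total * h

def kernel(x):
    """The Normal density without its normalizing constant."""
    return math.exp(-0.5 * x * x)

def std_normal_density(x):
    """The standard Normal density, including the 1/sqrt(2*pi) factor."""
    return kernel(x) / math.sqrt(2.0 * math.pi)

# The tails beyond +/- 10 are negligible, so a finite range is adequate here.
kernel_area = trapezoid(kernel, -10.0, 10.0, 20000)
density_area = trapezoid(std_normal_density, -10.0, 10.0, 20000)

print(kernel_area, math.sqrt(2.0 * math.pi))  # the two values should agree closely
print(density_area)                           # should be very close to 1
```

Without the constant, the area under the curve is √(2π), which is why dividing by that quantity makes the total probability equal to one.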

Some students are happy with this (partial) answer, but others want to see a proof. Fair enough!

However, there's a trick to proving that this integral (area) is "1" in value. Let's take a look at it.

Saturday, January 9, 2016

Difference-in-Differences With Missing Data

This brief post is a "shout out" for Irene Botosaru (Economics, Simon Fraser University) who gave a great seminar in our department yesterday.

The paper that she presented (co-authored with Federico Gutierrez) is titled "Difference-in-Differences When the Treatment Status is Observed in Only One Period". So, the title of this post is a bit of an abbreviation of what the paper is really about.

When we conduct DID analysis, we need to be able to classify information about the behaviour/characteristics of survey respondents into a 4-way matrix. Specifically we need to be able to observe the respondents before and after a "treatment"; and in each case we need to know which respondents were treated, and which ones were not.

Usually, a true panel of data, observed at two or more time-periods, facilitates this.

However, what if we simply have repeated cross-sections of data, taken at different time-periods? In this case we aren't necessarily observing exactly the same respondents when we look at the cross-sections for two different time-periods. Typically, in the cross-section after the treatment we'll know which respondents were treated and which ones weren't. However, there will be no way of partitioning the respondents in the pre-treatment cross-section into "subsequently treated" and "not treated" groups.

Two of the four cells in the matrix of information that we need will be missing, so conventional DID can't be performed.
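The role of the four cells can be sketched in a few lines of Python. Everything here is hypothetical: the function name, the dictionary layout and the numbers are invented for illustration, and the code shows only the conventional DID calculation, not the estimator proposed in the paper.

```python
def did_estimate(cells):
    """Conventional difference-in-differences from the four group means.

    `cells` maps (group, period) to a mean outcome, where group is
    "treated" or "control" and period is "before" or "after".
    """
    needed = [(g, p) for g in ("treated", "control") for p in ("before", "after")]
    missing = [key for key in needed if cells.get(key) is None]
    if missing:
        raise ValueError(f"cannot compute DID; missing cells: {missing}")
    change_treated = cells[("treated", "after")] - cells[("treated", "before")]
    change_control = cells[("control", "after")] - cells[("control", "before")]
    return change_treated - change_control

# A true panel: all four cells are observed (made-up numbers).
panel = {
    ("treated", "before"): 10.0, ("treated", "after"): 15.0,
    ("control", "before"): 8.0,  ("control", "after"): 9.0,
}
panel_effect = did_estimate(panel)

# Repeated cross-sections: treatment status is unknown before the treatment,
# so the two "before" cells cannot be filled in.
repeated_cross_sections = {
    ("treated", "before"): None, ("treated", "after"): 15.0,
    ("control", "before"): None, ("control", "after"): 9.0,
}
try:
    did_estimate(repeated_cross_sections)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(panel_effect)
print(error_message)
```

In the panel case the estimate is (15 - 10) - (9 - 8) = 4. In the repeated cross-section case the calculation stops because the two pre-treatment cells are unavailable.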

This is the problem that Irene and Federico consider.

A natural response is to introduce some sort of proxy variable(s) to deal with the missing data, and of course this will introduce an estimation bias, even asymptotically. This paper basically takes this approach. The result is a GMM estimation strategy, together with a test that the underlying assumptions are satisfied.

This is a really nice paper - well motivated, technically solid, and with a nice empirical example and application. I urge you to take a look at it if DID is in your econometrics tool-kit (and even if it's not!)

I'm sure that Irene and Federico would appreciate hearing about situations where you've encountered this missing data problem, and how you've responded to it.

Wednesday, December 30, 2015

The Econometric Game, 2016

I like to think of The Econometric Game as the World Championship of Econometrics.

There have been 16 annual Econometric Games to date, and some of these have been featured previously in this blog. For instance in 2015 there were several posts, such as this one. You'll find links in that post to earlier posts for other years.

I also discussed the cases that formed the basis for the 2015 competition here.

In 2016, the 17th Econometric Game will be held at the University of Amsterdam between 6 and 8 April.

The competing teams will be representing the following universities:

Requests I Ignore

About six months ago I wrote a post titled, "Readers' Forum Page".

Part of my explanation for the creation of the page was as follows:

Tuesday, December 29, 2015

Job Market for Economics Ph.D.'s

In a post in today's Inside Higher Ed, Scott Jaschik discusses the latest annual jobs report from the American Economic Association.

He notes:
"A new report by the American Economic Association found that its listings for jobs for economics Ph.D.s increased by 8.5 percent in 2015, to 3,309. Academic jobs increased to 2,458, from 2,290. Non-academic jobs increased to 846 from 761." 
(That's an 11.2% increase for non-academic jobs, and a 7.3% increase for academic positions.)

The bounce-back in demand for graduates since 2008 is impressive:
"Economics, like most disciplines, took a hit after 2008. Between then and 2010, the number of listings fell to 2,285 from 2,914. But this year's 3,309 is greater not only than the 2008 level, but of every year from 2001 on. The number of open positions also far exceeds the number of new Ph.D.s awarded in economics."
And here's the really good news for readers of this blog:
"As has been the case in recent years, the top specialization in job listings is mathematical and quantitative methods."

© 2015, David E. Giles

Monday, December 28, 2015

Correlation Isn't Necessarily Transitive

If X is correlated with Y, and Y is correlated with Z, does it follow that X and Z are correlated?

No, not necessarily. That is, the relationship of correlation isn't necessarily transitive.

In a blog post from last year the Fields Medallist, Terence Tao, discusses the question: "When is Correlation Transitive?", and provides a thorough mathematical answer.

He also provides this simple example of correlation intransitivity: 

This is something for students of econometrics to keep in mind!
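For readers who would like to see intransitivity in numbers, here is a small Python sketch. It is my own toy construction, with made-up data, and is not taken from the post referred to above. X and Z are built to be exactly uncorrelated, and Y is their sum.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, -1, 1, -1]
z = [1, 1, -1, -1]                  # uncorrelated with x by construction
y = [a + b for a, b in zip(x, z)]   # y shares information with both x and z

r_xy = pearson(x, y)
r_yz = pearson(y, z)
r_xz = pearson(x, z)
print(r_xy, r_yz, r_xz)
```

Here X and Y have a correlation of about 0.71, as do Y and Z, yet the correlation between X and Z is exactly zero.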

Sunday, December 27, 2015

Bounds for the Pearson Correlation Coefficient

The correlation measure that students typically first encounter is actually Pearson's product-moment correlation coefficient. This coefficient is simply a standardized version of the covariance between two random variables (say, X and Y):

$$\rho_{XY} = \frac{\mathrm{cov}(X,Y)}{\mathrm{s.d.}(X)\,\mathrm{s.d.}(Y)} , \tag{1}$$

where "s.d." denotes "standard deviation".

In the case of sample data, this formula will be:

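$$\rho_{XY} = \frac{\sum_{i=1}^{n}(X_i - X^*)(Y_i - Y^*)}{\left\{\left[\sum_{i=1}^{n}(X_i - X^*)^2\right]\left[\sum_{i=1}^{n}(Y_i - Y^*)^2\right]\right\}^{1/2}} , \tag{2}$$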

where the summations run from 1 to n (the sample size); and X* and Y* are the sample averages of the X and Y variables.

Scaling the covariance in this way to create the correlation coefficient ensures that (i) the latter is unitless; and (ii) it takes values in the interval [-1, +1]. The first of these two properties facilitates meaningful comparisons of correlations involving data measured in different units. The second property provides a metric that enables us to think about the "degree" of correlation in a meaningful way. (In contrast, a covariance can take any real value - there are no upper or lower bounds.)
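Both properties can be illustrated with a short Python sketch of the sample formula. The data and the unit conversion (metres to centimetres) are invented for the purpose of the example.

```python
import math

def sample_cov(x, y):
    """Sample covariance (divisor n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample Pearson correlation: the covariance scaled by both standard deviations."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

x_metres = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical measurements in metres
y = [2.1, 3.9, 6.2, 7.8, 10.1]
x_cm = [100.0 * v for v in x_metres]      # the same measurements in centimetres

cov_m, cov_cm = sample_cov(x_metres, y), sample_cov(x_cm, y)
r_m, r_cm = sample_corr(x_metres, y), sample_corr(x_cm, y)

print(cov_m, cov_cm)  # the covariance changes with the units of measurement
print(r_m, r_cm)      # the correlation does not, and it lies in [-1, 1]
```

Changing the units multiplies the covariance by 100 but leaves the correlation unchanged.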

Result (i) above is obvious. Result (ii) can be established in a variety of ways.

(a)  If you're familiar with the Cauchy-Schwarz inequality, the result that -1 ≤ ρ ≤ 1 is immediate.

(b)  If you like working with vectors, then it's easy to show that ρ is the cosine of the angle between the two vectors of mean-centred X and Y observations. As cos(θ) is bounded below by -1 and above by +1 for any θ, we have our result for the range of ρ right away. See this post by Pat Ballew for access to the proof.

(c)  However, what about a proof that requires even less background knowledge? Suppose that you're a student who knows how to solve for the roots of a quadratic equation, and who knows a couple of basic results relating to variances. Then, proving that -1 ≤ ρ ≤ 1 is still straightforward:

Let Z = X + tY, for any scalar, t. Note that $\mathrm{var}(Z) = t^2\,\mathrm{var}(Y) + 2t\,\mathrm{cov}(X,Y) + \mathrm{var}(X) \ge 0$.

Or, using obvious notation, $at^2 + bt + c \ge 0$.

This implies that the quadratic must have either one real root or no real roots, and this in turn implies that $b^2 - 4ac \le 0$.

Recalling that $a = \mathrm{var}(Y)$; $b = 2\,\mathrm{cov}(X,Y)$; and $c = \mathrm{var}(X)$, some simple re-arrangement of the last inequality yields the result that -1 ≤ ρ ≤ 1.
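For completeness, the re-arrangement can be written out as follows. This is a sketch of the final step only, and it assumes that var.(X) and var.(Y) are both strictly positive, so that ρ is well defined.

```latex
\begin{align*}
b^2 - 4ac \le 0
&\;\Longrightarrow\; 4\,[\mathrm{cov}(X,Y)]^2 - 4\,\mathrm{var}(Y)\,\mathrm{var}(X) \le 0 \\
&\;\Longrightarrow\; [\mathrm{cov}(X,Y)]^2 \le \mathrm{var}(X)\,\mathrm{var}(Y) \\
&\;\Longrightarrow\; \rho_{XY}^2 = \frac{[\mathrm{cov}(X,Y)]^2}{\mathrm{var}(X)\,\mathrm{var}(Y)} \le 1 \\
&\;\Longrightarrow\; -1 \le \rho_{XY} \le 1 .
\end{align*}
```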

A complete version of this proof is provided by David Darmon here.
