Econometrics Beat: Dave Giles' Blog: 11/01/2014

Friday, November 28, 2014

The A. R. Bergstrom Prize, 2015

Tuesday, November 25, 2014

Thanks for Downloading!

In an earlier post I mentioned a paper that I co-authored with Xiao Ling. The paper is "Bias reduction for the maximum likelihood estimator of the parameters of the generalized Rayleigh family of distributions", Communications in Statistics - Theory and Methods, 2014, 43, 1778-1792.

Over the period January to July 2014, this paper was downloaded 144 times from the journal's website. That made it the 6th most downloaded paper for that period - out of all papers downloaded from all volumes/issues of Communications in Statistics - Theory and Methods.

My guess is that some of you were responsible for this. Thanks!

Wednesday, November 19, 2014

The Rise of Bayesian Econometrics

A recent discussion paper by Basturk et al. (2014) provides us with (at least) two interesting pieces of material. First, they give a very nice overview of the origins of Bayesian inference in econometrics. This is a topic dear to my heart, given that my own Ph.D. dissertation was in Bayesian Econometrics; and I began that work in early 1973 - just two years after the appearance of Arnold Zellner's path-breaking book (Zellner, 1971).

Second, they provide an analysis of how the associated contributions have been clustered, in terms of the journals in which they have been published. The authors find, among other things, that:

"Results indicate a cluster of journals with theoretical and applied papers, mainly consisting of Journal of Econometrics, Journal of Business and Economic Statistics, and Journal of Applied Econometrics which contains the large majority of high quality Bayesian econometrics papers."

A couple of the papers coming out of my dissertation certainly fitted into that group - Giles (1975) and Giles and Rayner (1979).

The authors round out their paper as follows:

"...with a list of subjects that are important challenges for twenty-first century Bayesian econometrics: Sampling methods suitable for use with big data and fast, parallelized and GPU, calculations, complex models which account for nonlinearities, analysis of implied model features such as risk and instability, incorporating model incompleteness, and a natural combination of economic modeling, forecasting and policy interventions."

So, there's lots more to be done!

References

Basturk, N., C. Cakmakli, S. P. Ceyhan, and H. K. van Dijk, 2014. On the rise of Bayesian econometrics after Cowles Foundation monographs 10 and 14. Tinbergen Institute Discussion Paper TI 2014-085/III.

Giles, D.E.A., 1975. Discriminating between autoregressive forms: A Monte Carlo comparison of Bayesian and ad hoc methods. Journal of Econometrics, 3, 229-248.

Giles, D.E.A. and A.C. Rayner, 1979. The mean squared errors of the maximum likelihood and natural-conjugate Bayes regression estimators. Journal of Econometrics, 11, 319-334.

Zellner, A., 1971. An Introduction to Bayesian Inference in Econometrics. Wiley, New York.

Sunday, November 16, 2014

Orthogonal Regression: First Steps

When I'm introducing students in my introductory economic statistics course to the simple linear regression model, I like to point out to them that fitting the regression line so as to minimize the sum of squared residuals, in the vertical direction, is just one possibility.

They see, easily enough, that squaring the residuals deals with the positive and negative signs, and that this prevents obtaining a "visually silly" fit through the data. Mentioning that one could achieve this by working with the absolute values of the residuals provides the opportunity to mention robustness to outliers, and to link the discussion back to something they know already - the difference between the behaviours of the sample mean and the sample median, in this respect.

We also discuss the fact that measuring the residuals in the vertical ("y") direction is intuitively sensible, because the model is purporting to "explain" the y variable. Any explanatory failure should presumably be measured in this direction. However, I also note that there are other options - such as measuring the residuals in the horizontal ("x") direction.

Perhaps more importantly, I also mention "orthogonal residuals". I mention them. I don't go into any details. Frankly, there isn't time; and in any case this is usually the students' first exposure to regression analysis and they have enough to be dealing with. However, I've thought that we really should provide students with an introduction to orthogonal regression - just in the simple regression situation - once they've got basic least squares under their belts.

The reason is that orthogonal regression comes up later on in econometrics in more complex forms, at least for some of these students; but typically they haven't seen the basics. Indeed, orthogonal regression is widely used (and misused - Carroll and Ruppert, 1996) to deal with certain errors-in-variables problems. For example, see Madansky (1959).

That got me thinking. Maybe what follows is a step towards filling this gap.
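As a rough illustration only (this is not the worked example from the full post), here is a minimal Python sketch that compares the usual OLS slope with the orthogonal-regression slope in the simple two-variable case. It assumes equal weight on errors in the x and y directions, and uses the standard closed-form slope for that case. The function name and the data are made up for the illustration.

```python
import math

def ols_and_orthogonal_slopes(x, y):
    """Return (OLS slope of y on x, orthogonal-regression slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Sums of squares and cross-products, in deviations from the means
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the orthogonal slope is not defined here")
    b_ols = sxy / sxx
    # Slope that minimizes the sum of squared perpendicular distances to the line
    b_orth = ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return b_ols, b_orth

# Hypothetical data, purely for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
b_ols, b_orth = ols_and_orthogonal_slopes(x, y)
print(f"OLS slope: {b_ols:.4f}   orthogonal slope: {b_orth:.4f}")
```

In absolute value, the orthogonal slope lies between the OLS slope and the slope implied by regressing x on y, so it is never flatter than the OLS line.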

Cointegration - The Definitive Overview

Recently released, this discussion paper from Søren Johansen will give you the definitive overview of cointegration that you've been waiting for.

Titled simply, "Time Series: Cointegration", Johansen's paper has been prepared for inclusion in the 2nd edition of The International Encyclopedia of the Social and Behavioural Sciences, 2014. In the space of just sixteen pages, you'll find pretty much everything you need or want to know about cointegration.

To get you started, here's the abstract:

"An overview of results for the cointegrated VAR model for nonstationary I(1) variables is given. The emphasis is on the analysis of the model and the tools for asymptotic inference. These include: formulation of criteria on the parameters, for the process to be nonstationary and I(1), formulation of hypotheses of interest on the rank, the cointegrating relations and the adjustment coefficients. A discussion of the asymptotic distribution results that are used for inference. The results are illustrated by a few examples. A number of extensions of the theory are pointed out."

Enjoy!

Tuesday, November 11, 2014

Normality Testing & Non-Stationary Data

Bob Jensen emailed me about my recent post about the way in which the Jarque-Bera test can be impacted when temporally aggregated data are used. Apparently he publicized my post on the listserv for Accounting Educators in the U.S. He also drew my attention to a paper from two former presidents of the AAA: "Some Methodological Deficiencies in Empirical Research Articles in Accounting", by Thomas R. Dyckman and Stephen A. Zeff, Accounting Horizons, September 2014, 28 (3), 695-712. (Here.)

Bob commented that an even more important issue might be that our data may be non-stationary. Indeed, this is always something that should concern us, and regular readers of this blog will know that non-stationary data, cointegration, and the like have been the subject of a lot of my posts.

In fact, the impact of unit roots on the Jarque-Bera test was mentioned in this old post about "spurious regressions". There, I mentioned a paper of mine (Giles, 2007) in which I proved that:

Read Before You Cite!

Note to self - file this post in the "Look Before You Leap" category!

Looking at The New Zealand Herald newspaper this morning, this headline caught my eye:

"How Did Sir Owen Glenn's Domestic Violence Inquiry Get $7 Billion Figure Wrong?"

$7 Billion? Even though that's (only) New Zealand dollars, it still sounds like a reasonable question to ask, I thought. And (seriously) this is a really important issue, so, I read on.

Here's part of what I found (I've added the red highlighting):

Reverse Regression Follow-up

At the end of my recent post on Reverse Regression, I posed three simple questions - homework for the students among you, if you will.

Here they are again, with brief "solutions":

A Source of Irritation

I very much liked one of ECONJEFF's posts last week, titled "Epistemological Irritation of the Day".

The bulk of it reads:

" "A direct test of the hypothesis is looking for significance in the relationship between [one variable] and [another variable]."

No, no, no, no, no. Theory makes predictions about signs of coefficients, not about significance levels, which also depend on minor details such as the sample size and the amount of variation in the independent variable of interest present in the data."

He was right to be upset - and see his post for the punchline!
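To make the quoted point concrete, here is a small Python sketch of how the t-ratio for a fixed slope moves with the sample size and with the spread of the regressor. It assumes a simple regression with error standard deviation sigma, where the standard error of the slope is sigma / (sd_x * sqrt(n)). All of the numbers are hypothetical.

```python
import math

def approx_t_ratio(beta, sigma, sd_x, n):
    """t-ratio implied by a slope equal to beta, for a given sample size and regressor spread."""
    std_error = sigma / (sd_x * math.sqrt(n))
    return beta / std_error

# Same (hypothetical) slope, very different "significance"
for n in (100, 400, 10000):
    print(f"n = {n:6d}   t = {approx_t_ratio(0.1, 1.0, 1.0, n):.1f}")
```

The sign and size of the coefficient are the same in every row; only the sample size changes.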

Saturday, November 8, 2014

Econometric Society World Congress

Every five years, the Econometric Society holds a World Congress. In those years, the usual annual European, North American, Latin American, and Australasian meetings are held over.

The first World Congress was held in Rome, in 1960. I've been to a few of these gatherings over the years, and they're always great events.

The next World Congress is going to be held in Montréal, Canada, in August of 2015. You can find all of the details right here.

Something to look forward to!

A Reverse Regression Inequality

Suppose that we fit the following simple regression model, using OLS:

y_i = βx_i + ε_i . (1)

To simplify matters, suppose that all of the data are calculated as deviations from their respective sample means. That's why I haven't explicitly included an intercept in (1). This doesn't affect any of the following results.

The OLS estimator of β is, of course,

b = Σ(x_iy_i) / Σ(x_i²) ,

where the summations are for i = 1 to n (the sample size).

Now consider the "reverse regression":

x_i = αy_i + u_i . (2)

The OLS estimator of α is

a = Σ(x_iy_i) / Σ(y_i²).

Clearly, a ≠ (1 / b), in general. However, can you tell if a ≥ (1 / b), or if a ≤ (1 / b)?

The answer is, "yes", and here's how you do it.
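One way to see it (a sketch, and not necessarily the argument in the full post): the product ab = [Σ(x_iy_i)]² / [Σ(x_i²)Σ(y_i²)] is just the squared sample correlation between x and y, so 0 ≤ ab ≤ 1. Hence a ≤ (1 / b) when b > 0, and a ≥ (1 / b) when b < 0. The Python snippet below checks this numerically on simulated data; the data-generating values are arbitrary.

```python
import random

def demean(z):
    """Express a series as deviations from its sample mean."""
    zbar = sum(z) / len(z)
    return [zi - zbar for zi in z]

def slope_no_intercept(regressor, dependent):
    """OLS slope when both series are already in deviation form."""
    return sum(r * d for r, d in zip(regressor, dependent)) / sum(r * r for r in regressor)

random.seed(42)
n = 50
x_raw = [random.gauss(0.0, 1.0) for _ in range(n)]
y_raw = [0.8 * xi + random.gauss(0.0, 1.0) for xi in x_raw]  # arbitrary illustrative DGP
x, y = demean(x_raw), demean(y_raw)

b = slope_no_intercept(x, y)  # regression (1): y on x
a = slope_no_intercept(y, x)  # regression (2): x on y
print(f"b = {b:.4f}, a = {a:.4f}, 1/b = {1 / b:.4f}, a*b = {a * b:.4f}")
```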

The Econometrics of Temporal Aggregation V - Testing for Normality

This post is one of a sequence of posts, the earlier members of which can be found here, here, here, and here. These posts are based on Giles (2014).

Some of the standard tests that we perform in econometrics can be affected by the level of aggregation of the data. Here, I'm concerned only with time-series data, and with temporal aggregation. I'm going to show you some preliminary results from work that I have in progress with Ryan Godwin. Although these results relate to just one test, our work covers a range of testing problems.

I'm not supplying the EViews program code that was used to obtain the results below - at least, not for now. That's because what I'm reporting is based on work in progress. Sorry!

As in the earlier posts, let's suppose that the aggregation is over "m" high-frequency periods. A lower case symbol will represent a high-frequency observation on a variable of interest; and an upper-case symbol will denote the aggregated series.

So,

Y_t = y_t + y_{t-1} + ... + y_{t-m+1} .

If we're aggregating monthly (flow) data to quarterly data, then m = 3. In the case of aggregation from quarterly to annual data, m = 4, etc.
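Here is a minimal Python sketch of this kind of flow aggregation. It assumes that the aggregated series is recorded once every m periods (non-overlapping blocks), and that any incomplete block at the end of the sample is dropped. The function name is just for illustration.

```python
def aggregate_flows(y, m):
    """Sum consecutive, non-overlapping blocks of m high-frequency observations."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    n_blocks = len(y) // m  # an incomplete final block is discarded
    return [sum(y[j * m:(j + 1) * m]) for j in range(n_blocks)]

monthly = list(range(1, 13))             # twelve hypothetical monthly flows: 1, 2, ..., 12
quarterly = aggregate_flows(monthly, 3)  # m = 3
print(quarterly)                         # [6, 15, 24, 33]
```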

Now, let's investigate how such aggregation affects the performance of the well-known Jarque-Bera (1987) (J-B) test for the normality of the errors in a regression model. I've discussed some of the limitations of this test in an earlier post, and you might find it helpful to look at that post (and this one) at this point. However, the J-B test is very widely used by econometricians, and it warrants some further consideration.
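For readers who want to experiment, here is a hedged Python sketch of the J-B statistic itself (not the EViews code behind the results in this post). It uses the textbook form JB = (n / 6)[S² + (K - 3)² / 4], where S and K are the sample skewness and kurtosis of the residuals, together with the asymptotic chi-square p-value with 2 degrees of freedom, which is exp(-JB / 2). The simulated series are purely illustrative.

```python
import math
import random

def jarque_bera(e):
    """Return (JB statistic, asymptotic chi-square(2) p-value) for a series e."""
    n = len(e)
    mean = sum(e) / n
    m2 = sum((v - mean) ** 2 for v in e) / n
    m3 = sum((v - mean) ** 3 for v in e) / n
    m4 = sum((v - mean) ** 4 for v in e) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = (n / 6.0) * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # survival function of a chi-square with 2 d.o.f.
    return jb, p_value

random.seed(123)
normal_draws = [random.gauss(0.0, 1.0) for _ in range(2000)]
skewed_draws = [random.expovariate(1.0) for _ in range(2000)]
jb_normal, p_normal = jarque_bera(normal_draws)
jb_skewed, p_skewed = jarque_bera(skewed_draws)
print(f"Normal draws:      JB = {jb_normal:.2f}, p = {p_normal:.3f}")
print(f"Exponential draws: JB = {jb_skewed:.2f}, p = {p_skewed:.3g}")
```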

Consider the following small Monte Carlo experiment.

The Village Idiot Hypothesis

Yesterday, I received an email from Michael Belongia (Economics, U. Mississippi). With it, he kindly sent a copy of the Presidential Address to the American Agricultural Economics Association in 1979. The talk, given by Richard A. King, was titled "Choices and Consequences". It makes interesting reading, and many of the points that King makes are just as valid today as they were in 1979.

He has a lot to say about empirical consumer demand studies, especially as they relate to agricultural economics. In particular, he's rightly critical of the very restrictive characteristics of the Linear Expenditure System (Stone, 1954), and the Rotterdam Model (Theil, 1975). However, many of the objections that King raised were overcome just a year later with the "Almost Ideal Demand System" introduced by Deaton and Muellbauer (1980).

However, it was my recent post on hypothesis testing that prompted Michael to email me, and King makes some telling observations on this topic in his address.

I liked this remark about the need to be explicit about the hypotheses that we have in mind when undertaking empirical work:

King also talks about "The Village Idiot Hypothesis", in relation to the preoccupation with testing hypotheses such as β = 0.

As Michael said to me in his email, "When, as in one example, decades of research have indicated that some elasticity is -0.2, why do new papers test whether β = 0 rather than β = -0.2?"
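To see how much the choice of null can matter, here is a tiny Python sketch with entirely hypothetical numbers: an estimated elasticity of -0.25 with a standard error of 0.05.

```python
def t_statistic(estimate, std_error, null_value):
    """Usual t-ratio for testing that the coefficient equals null_value."""
    return (estimate - null_value) / std_error

b_hat, se = -0.25, 0.05  # hypothetical estimate and standard error
t_zero = t_statistic(b_hat, se, 0.0)    # the "village idiot" null, beta = 0
t_prior = t_statistic(b_hat, se, -0.2)  # the null suggested by earlier research
print(f"t for beta = 0:    {t_zero:.2f}")
print(f"t for beta = -0.2: {t_prior:.2f}")
```

With these numbers the first null is rejected comfortably, which tells us almost nothing new, while the second is not rejected at conventional levels.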

If you have access to the American Journal of Agricultural Economics, I recommend that you take a look at Richard King's address, as he makes several other important points that practitioners should take to heart.

References

King, R. A., 1979. Choices and consequences. American Journal of Agricultural Economics, 61, 839-848.

Deaton, A. and J. Muellbauer, 1980. An almost ideal demand system. American Economic Review, 70, 312-326.

Stone, R., 1954. Linear expenditure systems and demand analysis: An application to the pattern of British demand. Economic Journal, 64, 511-527.

Theil, H., 1975. Theory and Measurement of Consumer Demand, Vol. 1. North-Holland, Amsterdam.

Update to ARDL Add-In for EViews

In a post back in January, I drew attention to an Add-In for EViews that allows you to estimate ARDL models. The Add-In was written by Yashar Tarverdi. At that time, one limitation was that the Add-In handles only two variables, X and Y.

Judging by the questions and feedback I get about ARDL models, I know you'll be delighted to know that this limitation has been eased considerably. News out of @IHSEViews on Twitter this morning announces that the Add-In will now handle up to ten variables.

Good job! And thanks!

Wednesday, November 5, 2014

Computing Power Curves

In a recent post I discussed some aspects of the distributions of some common test statistics when the null hypothesis that's being tested is actually false. One of the things that we saw there was that in many cases these distributions are "non-central", with a non-centrality parameter that increases as we move further and further away from the null hypothesis being true.

In such cases, it's the value of the non-centrality parameter that determines the power of tests. For a particular sample size and choice of significance level, this parameter usually depends on all of the other features of the testing problem in question.
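As a simple stand-in for the regression case, here is a Python sketch of a power curve for a two-sided z-test of a mean with known variance. In that setting the test statistic is N(δ, 1) with δ = √n(μ - μ0) / σ, so its square is non-central chi-square with one degree of freedom and non-centrality parameter δ². The sample size, σ, and the grid of departures from the null are assumptions chosen just for the illustration.

```python
import math
from statistics import NormalDist

def z_test_power(delta, alpha=0.05):
    """Power of a two-sided z-test when the statistic is distributed N(delta, 1)."""
    std_normal = NormalDist()
    crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of landing in either tail of the rejection region
    return std_normal.cdf(-crit - delta) + (1 - std_normal.cdf(crit - delta))

n, sigma = 25, 1.0  # hypothetical sample size and error standard deviation
for gap in (0.0, 0.2, 0.4, 0.6, 0.8):  # distance of the true mean from the null value
    delta = math.sqrt(n) * gap / sigma
    print(f"gap = {gap:.1f}   non-centrality = {delta ** 2:5.2f}   power = {z_test_power(delta):.3f}")
```

When the gap is zero the "power" is just the significance level, and it rises towards one as the non-centrality parameter grows.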

To illustrate this in more detail, let's consider a linear multiple regression model:

Central and Non-Central Distributions

Let's imagine that you're teaching an econometrics class that features hypothesis testing. It may be an elementary introduction to the topic itself; or it may be a more detailed discussion of a particular testing problem. We're not talking here about a course on Bayesian econometrics, so in all likelihood you'll be following the "classical" Neyman-Pearson paradigm.

You set up the null and alternative hypotheses. You introduce the idea of a test statistic, and hopefully, you explain why we try to find one that's "pivotal". You talk about Type I and Type II errors; and the trade-off between the probabilities of these errors occurring.

You might talk about the idea of assigning a significance level for the test in advance of implementing it; or you might talk about p-values. In either case, you have to emphasize to the class that in order to apply the test itself, you have to know the sampling distribution of your test statistic for the situation where the null hypothesis is true.

Why is this?

Confusing Charts

Today's on-line edition of The New Zealand Herald includes an article titled "Junior rugby putting little kids in harm's way". The article included two charts, presented one after the other, and explicitly intended to be viewed as a pair. Here they are:
