Econometrics Beat: Dave Giles' Blog: Can Your Results be Replicated?

Wednesday, September 11, 2013

Can Your Results be Replicated?

It's a "given" that your empirical results should be able to be replicated by others. That's why more and more journals are encouraging or requiring that authors of such papers "deposit" their data and code with the journal as a condition of acceptance for publication.

That's all well and good. However, replicating someone's results using their data and code may not mean very much!

This point was highlighted in a guest post today on the Political Science Replication blog site. The piece, by Mark Bell and Nicholas Miller, is titled "How to Persuade Journals to Accept Your Replication Paper". You should all read it!

~~Especially if you favour the STATA Stata package!~~

Here are a few excerpts, to whet your appetites:

"We were easily able to replicate Rauchhaus’ key findings in Stata, but couldn’t get it to work in R. It took us a long while to work out why, but the reason turned out to be an error in Stata: Stata was finding a solution when it shouldn’t have (because of separation in the data). This solution, as we show in the paper, was wrong – and led Rauchhaus’ paper to overestimate the effect of nuclear weapons on conflict by a factor of several million.

It’s very easy when you’re working through someone else’s code to be satisfied when you’ve successfully got your computer to produce the same numbers you see in the paper. But replicating a published result in one software package does not mean that you necessarily understand what that software package is doing, that the software is doing it correctly, or that doing it at all is the appropriate thing to do – in our case none of those were true and working out why was key to writing our paper."

"Learning new methods as you conduct the replication ensures that even if you don’t end up publishing your replication paper, you’ll have learnt new skills that will be valuable down the road."

(The emphasis in red is mine.)

By the way - Brad and Nicholas are both grad. students. More power to them!

13 comments:

AngeloSeptember 12, 2013 at 10:06 AM
I took a look at the paper by the two grad students. I think this is what happened.

With logit and probit models, ML requires that predicted probabilities for different "cells" (i.e. subsamples identified by dummy variables) match empirical frequencies. So if all observations in the same cell make the same choice (I think this is what they mean by "separation in the data"), then coefficients aren't identified (because linear combinations of them have to shoot off to positive or negative infinity). Apparently, STATA has some way of resolving the lack of identification so that some answer comes out of the package. It's kind of like having a perfect multicollinearity problem in a linear regression, but the package picks a particular solution out for you from the space of solutions to the LS problem--it would be ad hoc and different identification schemes could generate very different parameter estimates, although the predicted value of the dependent variable would be the same in all cases. (FWIW, my quick reading didn't make it clear how the graduate students resolved the lack of identification.)

My bottom line is that the problem wasn't in the software (I don't know if it allows you to check for this problem and the authors didn't or if it should have printed a warning regardless). The problem was the failure of the original authors to warn the reader that the parameter estimates weren't identified in the data, only the predicted probabilities.
ReplyDelete
Replies
AnonymousSeptember 12, 2013 at 12:53 PM
"More generally, packages that produce any results in this case (or the exact multicollinearity example you cited) still worry me. I'd rather they stopped, and produced an intelligible message that highlights the "problem"."

The glm function in R for example will not stop and
instead gives an answer under complete separation.
Here's an example from an lme4 github issue
(https://github.com/lme4/lme4/issues/124):

> set.seed(101)
> d <- data.frame(y=rbinom(1000,size=1,p=0.5),
+ x=runif(1000),
+ f=factor(rep(1:20,each=50)),
+ x2=rep(0:1,c(999,1)))
> glm(y ~ x+x2, data=d, family=binomial)

Call: glm(formula = y ~ x + x2, family = binomial, data = d)

Coefficients:
(Intercept) x x2
-0.10037 0.03549 -12.50117

Degrees of Freedom: 999 Total (i.e. Null); 997 Residual
Null Deviance: 1385
Residual Deviance: 1383 AIC: 1389

That amazing coefficient of -12.50117 is just a symptom of
complete separation.
ReplyDelete
Replies
Joerg LuedickeSeptember 13, 2013 at 11:38 AM
I agree, it is a good idea to replicate other's and one's own work with different packages (or do some other validations, e.g., simulate data that closely resembles some key features of the data at hand and see if the used command/package recovers the parameters correctly, etc.)

I got the insinuation from the following sentence in your original post:

"Especially if you favour the Stata package!"

This sentence in combination with your posted excerpts seem to present the problem as being caused by faulty software when in fact the problem seem to have arisen by researches choosing an inappropriate model.

Best,
Joerg
ReplyDelete
Replies
DimitriySeptember 13, 2013 at 2:30 PM
Here's a nice summary and commentary of this from Jeff Pitblado at http://www.stata.com/statalist/archive/2013-09/msg00618.html.
ReplyDelete
Replies
Andrew UsherSeptember 27, 2013 at 10:23 AM
I once had to convince a co-worker that Stata would do this. "But it generated coefficents they must be right." No these and another (infinite) set of coefficients.

When Stata tells you that the F-Stat is . there is likely something really wrong (also happens when you use a vce with not clusters)

I don't blame stata. Seems to be more a problem with canned procedures in general.
ReplyDelete
Replies

Add comment

Note: Only a member of this blog may post a comment.

Pages

Wednesday, September 11, 2013

Can Your Results be Replicated?

13 comments: