Scientific (and Economic?) Research on Trial?

At some point, I will probably have something to say on yet another week of tumultuous events and interesting comments; in particular I think that Krugman and Wolf may soon, very soon, be getting to the crux of things.

Meanwhile, I want to pass on one of Felix Salmon's latest posts in which he treats the issue of replicability in academic research. Clearly, this is an important issue and not only one of integrity but also one of honesty and simple code-of-conduct. Felix begins ...

Falsifiability and replicability are key cornerstones of any academic research. If you're running an empirical study, and your results aren't replicable, your study is largely worthless.

I would like to see any researcher disagree with this, although I suspect that many economists may retort, silently or not, that sometimes the theory is enough in itself. This is of course fair enough, even if I think that endless tinkering with micro foundations and more or less sophisticated representative agents is oftentimes quite arcane. Moreover, in this context it is also beside the point; here we are talking about empirical studies, and thus I think that Felix's initial shot across the bow is a universal ideal.

Apparently, the reality is far from the ideal. Felix consequently directs our attention to a recent piece of research from the Fraser Institute, which paints a bleak picture of the ability to replicate data, and thus results, in academic research. The blurb by the Fraser Institute, which publishes the piece, can be found from the link above, whereas the actual report, written by B.D. McCullough and Ross McKitrick, can be found here. The executive summary reads as follows:

Empirical research in academic journals is often cited as the basis for public policy decisions, in part because people think that the journals have checked the accuracy of the research. Yet such work is rarely subjected to independent checks for accuracy during the peer review process, and the data and computational methods are so seldom disclosed that post-publication verification is equally rare. This study argues that researchers and journals have allowed habits of secrecy to persist that severely inhibit independent replication. Non-disclosure of essential research materials may have deleterious scientific consequences, but our concern herein is something different: the possible negative effects on public policy formation. When a piece of academic research takes on a public role, such as becoming the basis for public policy decisions, practices that obstruct independent replication, such as refusal to disclose data, or the concealment of details about computational methods, prevent the proper functioning of the scientific process and can lead to poor public decision making. This study shows that such practices are surprisingly common, and that researchers, users of research, and the public need to consider ways to address the situation. We offer suggestions that journals, funding agencies, and policy makers can implement to improve the transparency of the publication process and enhance the replicability of the research that is published.

To my chagrin, the dismal science seems to be one of the culprits when it comes to publishing results that are difficult, and sometimes impossible, to replicate. That said, the examples cited in the report are fun to read as a whole, and economics is not the only discipline on trial.

There is, for example, also the well-known story of the historian Michael A. Bellesiles, who wrote a book entitled "Arming America: The Origins of a National Gun Culture" about how only very few Americans (or more aptly, colonials) owned guns prior to the Civil War, which suggested that gun ownership was (and is) not a fundamental trait of the American people. Bellesiles was, however, soon called on his bluff. One striking thing about the debacle that followed is that, as more and more people, both supposed amateurs and professional historians, looked into the matter, it all came down to the fact that the data on which the initial conclusions were based just did not exist.

As for economics, we have the rather telling example of a famous Boston Fed study that paved the way for lending to minorities that was often, in economic terms, reckless. There is also the paper on file sharing by Felix Oberholzer-Gee and Koleman Strumpf, who argue that the effect of file sharing and p2p downloading on record sales has been negligible. Again, this is an issue of replicability: other authors have been unable to find the same results, let alone get their hands on the original data.

Now, before we have a fit, there are clearly two sides to this coin. One is the root of the problem, in the sense that many scientific studies are simply not replicable, either because the data used in the study cannot be obtained or because it is not clear what kind of data was used. The other is the actual incentive for researchers to do replicative work. Clearly, scientific work in, say, economics that is done solely to check the validity and robustness of others' results does not seem to be a route to glory in any sense of the word. In this way, the bias which makes replicative work not only difficult but also scarce may be, in the authors' words, both a demand and supply side phenomenon.

My own view here is closely tied to Felix's points. First of all, any empirical study should readily lend itself to close scrutiny and robustness checks in the context of the original data. I really do not want to mince my words. Any empirical economic study whose author(s) do not want to give away their data and estimation methods should not be published. This has to stand as a minimum criterion for integrity. In fact, any confident scholar should promote the testing and scrutiny of his or her results by others. Far be it from me to put myself forward as an example to follow, but in the context of my own, admittedly most timid, contribution to the annals of financial research, the Excel sheets are still tucked away on my hard drive, and anyone who wants to check my results can get them at any time. Clearly, any empirically rooted financial or economic study should abide by the same standard as a minimum, no?

However, the world is never so simple, and even though I may appear quite pure and righteous (although, of course, I am not), I am not sure what to do about this (Felix again):

I'm not sure how it's possible to get economists in particular, or academics generally, to spend more time trying to replicate prior results. But it's clearly a major weakness in the academy, and it would be great to address it somehow.

Especially in the context of economics and finance, there seems to be a bias against trespassing too much onto other scholars' domains, and specifically against trespassing with the intention of stirring up the garden, even if such actions might be merited. For example, one very pervasive tendency in economics is that authors replicate their own studies with new datasets. Likewise, most replicative studies are not done on exactly the same data and period but rather in the context of a given result. This way of expanding scientific knowledge may not be as bad as it sounds. Clearly, if many scholars find the same results across a wide range of data and estimation techniques, this in itself is a strong case for the robustness of a given theory.

So, to summarize, I think that a fundamental openness towards concrete sharing of data and estimation techniques is an obvious minimum requirement. In this light, I would support the idea that academic journals also spend time checking the validity and replicability of empirical results, even if I am of course saying this without knowing exactly how it would change the refereeing process. As for promoting more precise replicative work, I am not optimistic, since the incentive structure of economic research in particular seems to be very far from promoting such studies.

Claus Vistesen, February 20, 2009