Secret Data Encore

My post "Secret data" on replication provoked a lot of comment and emails, more reflection, and some additional links.

This isn't about rules

Many of my correspondents missed my main point -- I am not advocating more and tighter rules by journals! This is not about what you are "allowed to do," how to "get published" and so forth.

In fact, this extra rumination points me even more strongly to the view that rules and censorship by themselves will not work. How to make research transparent, replicable, extendable, and so forth varies by the kind of work and the kind of data, and is subject like everything else to creativity and technical improvement. Most of all, it will not work if nobody cares; if nobody takes the kind of actions in the bullet points of my last post, and it's just an issue about rules at journals. Already (more below), rules are not that well followed.

This isn't just about "replication."

"Replication" is much too narrow a word. Yes, many papers have not documented transparently what they actually did, so that even armed with the data it's hard to produce the same numbers. Other papers are based on secret data, the problem with which I started.

But in the end, most important results are not simply due to outright errors in data or coding. (I hope!)

The important issue is whether small changes in instruments, controls, data sample, measurement error handling, and so forth produce different results, whether results hold out of sample, or whether collecting or recoding data produces the same conclusions. "Robustness" is a better overall descriptor for the problem that many of us suspect pervades empirical economic research.

You need replicability in order to evaluate robustness -- if you get a different result than the original authors', it's essential to be able to track down how the original authors got their result. But the real issue is that much larger one.

The excellent replication wiki (many good links) quotes Daniel Hamermesh on this difference between "narrow" and "wide" replication
Narrow, or pure, replication means first checking the submitted data against the primary sources (when applicable) for consistency and accuracy. Second the tables and charts are replicated using the procedures described in the empirical article. The aim is to confirm the accuracy of published results given the data and analytical procedures that the authors write to have used.
Replication in a wide sense is to consider the empirical finding of the original paper by using either new data from other time periods or regions, or by using new methods, e.g., other specifications. Studies with major extensions, new data or new empirical methods are often called reproductions.
But the more important robustness question is more controversial. The original authors can complain they don't like the replicator's choice of instruments, or procedures. So "replication," which sounds straightforward, quickly turns in to controversies.

Michael Clemens writes about the issue in a blog post here, noting
...Again and again, the original authors have protested that the critique of their work got different results by construction, not because anything was objectively wrong about the original work. (See Berkeley’s Ted Miguel et al. ...

... secret data post would be read as criticism of people who do large-data work, proprietary-data work, or work with government agencies that cannot currently be shared. The internet is pretty snarky, so it's worth stating explicitly that is not my intent or my view.

Quite the opposite. I am a huge fan of the pioneering work exploiting new data sets. If these pioneers had not found dramatic results and possibilities with new data, it would not matter whether we can replicate, check or extend those results.

It is only now, that the pioneers have shown the way, that we know how important the work can be, that it becomes vital to rethink how we do this kind of work going forward.

The special problems of confidential government data

The government has a lot of great data -- IRS and census for microeconomics; SEC, CFTC, Fed, and the financial product safety commission in finance. And there are obvious reasons why so far it has not been easily shared.

Journal policies allow exceptions for such data. So only a fundamental demand from the rest of us for transparency can bring about changes. And it has begun to do so.

In addition to the suggestions in the last post, more and more people are going through the vetting to use the data. That leaves open the possibility that a full replication machine could be stored on site, ready for a replicator with proper access to push a button. Commercial data vendors could allow similar "free" replication, controlling directly how replicators use the data.

Technological solutions are on the way too. "Differential privacy" is an example of a technology that allows results to be replicated without compromising the privacy of the data. Leapyear.io is an example of companies selling this kind of technology. We are not alone, as there is a strong commercial demand for this kind of data. (Medical data for example.)
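To make the idea concrete, here is a minimal sketch of one standard differential-privacy technique, the Laplace mechanism. It is not any vendor's actual product. The function name `private_mean`, the clipping bounds, and the sample figures are all hypothetical, and the sketch assumes the sample size itself is public.

```python
import random


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release the mean of `values` with Laplace noise (illustrative sketch).

    Each value is clipped to [lower, upper], so changing one record moves
    the mean by at most (upper - lower) / n. Noise with scale
    sensitivity / epsilon then gives epsilon-differential privacy.
    """
    rng = rng or random.Random()
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_mean + noise


# Hypothetical confidential records (say, incomes in thousands).
incomes = [30.0, 45.0, 52.0, 61.0, 250.0]
released = private_mean(incomes, 0.0, 100.0, epsilon=1.0, rng=random.Random(0))
```

A replicator with proper access could rerun the same query and get a statistically equivalent answer without ever seeing an individual record. A smaller `epsilon` means more noise and stronger privacy.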

Other institutions: Journals, replication journals, websites,

There is some debate whether checking "replication" should count as new research, and I argued if we want replication we need to value it. The larger robustness question certainly is "new" research. A finding that X's result does not hold out of sample, or is sensitive to the precise choice of instruments and controls, and so forth, is genuine, publishable, follow-on research.

I originally opined that replications should be published by the original journal to give the best incentives. That way an AER replication "counts" as an AER publication.

But with the idea that robustness is the wider issue, I am less inclined to this view. This broader robustness or reexamination is genuine new research, and there is a continuum between replication and the normal business of examining the basic idea of a model with new data and also some new methods. Each paper on the permanent income hypothesis is not a "replication" of Friedman! We don't want to only value as "new" research that which uses new methods -- then we become dry methodologists, not fact-oriented economists. And once a paper goes beyond pointing out simple mistakes, to questioning specification, a question which itself can be rebutted, it's beyond the responsibility of the original journal.

Ivo Welch argues that a third of each journal should be devoted to replication and critique. The Critical Finance Review, which he edits, asks for replication papers. The Journal of Applied Econometrics has a replication section, and now invites replications of papers in many other journals. Where journals fear to tread, other institutions step in. The replication network is one interesting new resource.

Faculties

A correspondent suggests an important additional bullet point for the "what can we do" list

  • Encourage your faculty to adopt a replicability policy as part of its standards of conduct, and as part of its standards for internal and outside promotions.

The precise wording of such standards should be fairly loose. The important thing is to send a message. Faculty are expected to make their research transparent and replicable, to provide data and programs, even when journals do not require it. Faculty up for promotion should expect that the committee reviewing them will look to see if they are behaving reasonably. Failure will likely lead to a little chat from your department chair or dean. And the policy should state that replication and robustness work is valued.

Another correspondent wrote that he/she advises junior faculty not to post programs and data, so that they do not become a "target" for replicators. To say we disagree on this is an understatement. A clear voice on this issue is an excellent outcome of crafting a written policy.

From Michael Kiley's excellent comment below

  • Assign replication exercises to your students. Assign robustness checks to your more advanced students. Advanced undergraduate and PhD students are a natural reservoir of replicators. Seeing the nuts and bolts of how good, transparent, replicable work is done will benefit them. Seeing that not everything published is replicable or right might benefit them even more.

Two good surveys of replications (as well as journals)

Maren Duvendack, Richard Palmer-Jones, and Bob Reed have an excellent survey article, "Replications in Economics: A Progress Report"
...a survey of replication policies at all 333 economics journals listed in Web of Science. Further, we analyse a collection of 162 replication studies published in peer-reviewed economics journals.
The latter is especially good, starting at p. 175. You can see here that "replication" goes beyond just can-we-get-the-author's-numbers, and maddeningly often does not even ask that question
a little less than two-thirds of all published replication studies attempt to exactly reproduce the original findings....A frequent reason for not attempting to exactly reproduce an original study’s findings is that a replicator attempts to confirm an original study’s findings by using a different data set
"Robustness" not "replication"
Original Results?, tells whether the replication study re-reports the original results in a way that facilitates comparison with the original study. A large portion of replication studies do not offer easy comparisons, perhaps because of limited journal space. Sometimes the lack of direct comparison is more than a minor inconvenience, as when a replication study refers to results from an original study without identifying the table or regression number from which the results come.
Replicators need to be replicable and transparent too!
Across all categories of journals and studies, 127 of 162 (78%) replication studies disconfirm a major finding from the original study.
But rather than just the usual alarmist headline, they have a good insight. Replication studies can suffer the same significance bias as original work:
Interpretation of this number is difficult. One cannot assume that the studies treated to replication are a random sample. Also, researchers who confirm the results of original studies may face difficulty in getting their results published since they have nothing ‘new’ to report. On the other hand, journal editors are loath to offend influential researchers or editors at other journals. The Journal of Economic & Social Measurement and Econ Journal Watch have sometimes allowed replicating authors to report on their (prior) difficulties in getting disconfirming results published. Such firsthand accounts detail the reticence of some journal editors to publish disconfirming replication studies (see, e.g., Davis 2007; Jong-A-Pin and de Haan 2008, 57).
Summarizing
.. nearly 80 percent of replication studies have found major flaws in the original research
Sven Vlaeminck and Lisa-Kristin Herrmann surveyed journals and report that many journals with data policies are not enforcing them.
The results we obtained suggest that data availability and replicable research are not among the top priorities of many of the journals surveyed. For instance, we found 10 journals (i.e. 20.4% of all journals with such policies) where not a single article was equipped with the underlying research data. But even beyond these journals, many editorial offices do not really enforce data availability: There was only a single journal (American Economic Journal: Applied Economics) which has data and code available for every article in the four issues.
Again, this observation reinforces my point that rules will not substitute for people caring about it. (They also discuss technological aspects of replication, and the impermanence and obscurity of zip files posted on journal websites.)

Numerical Analysis

Ken Judd wrote to me,
"Your advocacy of authors giving away their code is not the rule in numerical analysis. I point to the “market test”: the numerical analysis community has done an excellent job in advancing computational methods despite the lack of any requirement to share the code....
Would you require Tom Doan to give out the code for RATS? If not, then why do you advocate journals forcing me to freely distribute my code?...
The issue is not replication, which just means that my code gives the same answer on your computer as it does on mine. The issue is verification, which is the use of tests to verify the accuracy of the answers. That I am willing to provide."
Ken is, I think, reading more "rule and censorship" rather than "social norms" in my views. And I think it reinforces my preference for the latter over the former. Among other things, rules designed for one purpose (extensive statistical analysis of large data sets) are poorly adapted to other situations (extensive numerical analysis).

Rules can be taken to extremes. Nobody is talking about "requiring" package customers to distribute the (proprietary) package source code. We all understand that step is not needed.

For heavy numerical analysis papers, using author-designed software that the author wants to market, the verification suggestion seems a sensible social norm to me. If I'm refereeing a paper with a heavy numerical component, I would be happy to see the extensive verification, and happier still if I could use the program on a few test cases of my own. Seeing the source code would not be necessary or even that useful. Perhaps in extremis, if a verification failed, I would want the right to contact the author and understand why his/her code produces a different result.

Some other examples of "replication" (really robustness) controversies:

Andrew Gelman covers a replication controversy, in which Douglas Campbell and Ju Hyun Pyun dissect Enrico Spolaore and Romain Wacziarg's "The Diffusion of Development" in the QJE. There is no charge that the computer programs were wrong, or that one cannot produce the published numbers. The dispute is entirely over specification, that the result is sensitive to specification and controls.

Yakov Amihud and Stoyan Stoyanov, in Do Staggered Boards Harm Shareholders?, reexamine Alma Cohen and Charles Wang's Journal of Financial Economics paper. They come to the opposite conclusion, but could only reexamine the issue because Cohen and Wang shared their data. Again, the issues, as far as I can tell, are not a charge that programs or data are wrong.

Update: Yakov corrects me:

  1. We do not come to "the opposite conclusion". We just cannot reject the null that staggered board is harmless to firm value, using Cohen-Wang's experiment.
  2. Our result is also obtained using the publicly-available ISS database (formerly RiskMetrics).
  3. Why is the difference between the results? We used CRSP data and did not include a few delisted (penny) stocks that are in Cohen-Wang's sample. Our paper states which stocks were omitted and why. We are re-writing the paper now with more detailed analysis.

I think the point that replication slides in to robustness, which is more important and more contentious, remains clear.

Asset pricing is especially vulnerable to results that do not hold out of sample, in particular the ability to forecast returns. Campbell Harvey has a number of good papers on this topic. Here, the issue is once again not that the numbers are wrong, but that many good in-sample return-forecasting tricks stop working out of sample. To know, you have to have the data.
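As a rough illustration of what "holding out of sample" means in practice, the sketch below fits a one-predictor forecasting regression on an early subsample and scores it on the later one against the naive forecast of the early-sample mean. The function names and the toy numbers are hypothetical, not drawn from any of the papers mentioned, and real work would use richer specifications and actual return data.

```python
def ols_fit(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def out_of_sample_r2(x, y, split):
    """Estimate on observations [0, split), evaluate on [split, end).

    Returns 1 - SSE(model) / SSE(early-sample mean). A negative value
    means the forecast did worse out of sample than the naive mean.
    """
    a, b = ols_fit(x[:split], y[:split])
    train_mean = sum(y[:split]) / split
    sse_model = sum((yi - (a + b * xi)) ** 2
                    for xi, yi in zip(x[split:], y[split:]))
    sse_mean = sum((yi - train_mean) ** 2 for yi in y[split:])
    return 1 - sse_model / sse_mean


# Toy series: the predictor "works" in the first half, then stops working.
signal = [0, 1, 2, 3, 4, 5, 6, 7]
returns = [0, 1, 2, 3, 0, 0, 0, 0]
r2_oos = out_of_sample_r2(signal, returns, split=4)
```

In this toy case the in-sample fit is perfect while `r2_oos` is strongly negative. Nobody can run even this simple check without access to the data.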