Reproducibility projects yield headline-grabbing numbers, not practical steps for minimizing the investment in and publication of irreproducible research. If used inappropriately, these numbers may have unintended consequences.
Last month, citing budgetary constraints, the Reproducibility Project: Cancer Biology, a joint initiative of Science Exchange and the Center for Open Science, announced that it would scale back its ambitions and attempt to reproduce 37 instead of 50 studies. The announcement received little media attention, perhaps because with 37 of the studies still on track (including two published in Nature Medicine), the initiative will nevertheless be able to generate a percentage of original studies that report experiments that do not replicate. Last summer, hundreds of news articles covered an announcement that a similar project replicated only around 40% of 100 psychology experiments. If that is any indication, the results of even a scaled-back cancer reproducibility project will receive plenty of headlines.
This attention will be in addition to the great deal of focus given already—and rightly—to the general issue of the irreproducibility of some published scientific research. The problem affects everyone in the scientific community: academic labs that are attempting to build on and extend others' work, biopharmaceutical companies that are searching the scientific literature for new drug targets to pursue and journal editors who are deciding which papers to publish. Work that is not reproducible saps time, money and energy.
These reproducibility projects thus have the meritorious goal of attempting to estimate the magnitude of the problem. Whether they will achieve that goal, however, is less clear. Although financial considerations naturally constrain the number of studies that can be subjected to replication attempts, 37 (or even 100) studies are too few to draw conclusions about the full breadth of a scientific discipline. Moreover, the cancer reproducibility project will perform only selected experiments from each original study. Finally, failure to replicate is in essence a draw: it could be due to a flaw in the original study or a flaw in the replication study, but without further work, it is difficult to know which.
But even a mega-reproducibility project, in which hundreds of studies might each be subjected to ten independently conducted replication attempts, would state, rather than solve, the problem. For this reason, practical steps aimed at reducing the investment in and the publication of irreproducible research are at least as important as post-publication attempts to replicate published work.
Laudably, efforts are already under way. Initiatives by journals—including the Nature Research Journals—to ensure that the methods used in academic research are reported in as much detail as possible in the resulting publications are obvious and worthwhile. One is simply more likely to reproduce something successfully if provided with a detailed protocol than if handed a vague blueprint.
However, these efforts act at a single point—the time of publication—in the life cycle of a research study. Pre-publication efforts to stamp out the root causes of irreproducibility are essential, and they will require meaningful, time-consuming and often challenging changes in behavior by researchers, institutions, funders and editors.
As a start, graduate schools that do not already offer such courses must require first-year students to take courses on the core tenets of the scientific method and on statistics as they apply to experimental design. This way, tomorrow's researchers will incorporate these factors into their study designs, rather than applying them at the end of the study when prompted by a journal. The US National Institutes of Health recently produced a series of Rigor and Reproducibility Training Modules (http://1.usa.gov/1OmBIWZ), which the agency may incorporate into its intramural training, but PhD-granting institutions must also step up.
Researchers are not the only ones who need a solid grasp of the statistical issues relevant to experimental design; so, too, do journal editors. Another sensible initiative is the formation of relationships between editors and statisticians who are willing to act as referees for particularly complex files. Editors must also do a better job of recognizing the logical 'end' of a research project, in part by not capitulating to referees' demands to endlessly extend a study in several directions. This is because experiments performed in a setting of dwindling resources, by students and postdoctoral fellows who are under pressure to graduate or to start their own labs, may be repeated fewer times than those that form the primary focus of the study. Such studies hence risk being less reproducible.
Nor are funders off the hook in the task of reducing irreproducibility. Increasing sample sizes and repeating experiments cost money, and funders and reviewers must consider these factors when allocating grant resources.
Although these pre-publication efforts should be prioritized, care should also be taken to prevent the justifiable outcry over irreproducibility from drowning out attempts to highlight other ills that afflict today's translational-research enterprise. To stick with the topic of cancer biology, if one uses a breast cancer cell line that has been misidentified as a melanoma cell line to screen for melanoma drug targets, it may not matter how reproducible the results are, because they may have little relevance to melanoma. The same goes for compounds shown to suppress tumor growth if administered to mice before tumor cells are implanted. Even if 100% of attempts to replicate this finding meet with success, it might have no bearing on efforts to develop drugs to treat established malignancies. In other words, no measure of reproducibility will ever ensure that a result in biomedical research will translate to humans, achieve a cure or prevent disease.
And disproportionate attention to the irreproducibility of select experiments may have unintended consequences. If high enough and sufficiently publicized, the percentages of studies that do not replicate in reproducibility projects risk shocking taxpayers and their elected officials into lobbying for reduced funding for all academic research.
No one denies that irreproducibility of scientific research is a serious problem. But we must not let the glare emanating from news about reproducibility project results blind us to the hard work that is needed to reduce irreproducibility—or distract us from other challenges facing translational research.
Take the long view. Nat Med 22, 1 (2016). https://doi.org/10.1038/nm.4033