Nature | Editorial

Reality check on reproducibility

A survey of Nature readers revealed a high level of concern about the problem of irreproducible results. Researchers, funders and journals need to work together to make research more reliable.

Is there a reproducibility crisis in science? Yes, according to the readers of Nature. Two-thirds of researchers who responded to a survey by this journal said that current levels of reproducibility are a major problem.

The ability to reproduce experiments is at the heart of science, yet failure to do so is a routine part of research. Some amount of irreproducibility is inevitable: profound insights can start as fragile signals, and sources of variability are infinite. But, the survey suggests, there is a bigger issue — and something that needs to be fixed. One-third of the survey respondents said that they think about the reproducibility of their own research daily, and more than two-thirds discuss it with colleagues at least monthly. The survey, of course, probably attracted researchers most interested in these issues. But it would be foolish to pretend that there is not serious concern.

What does ‘reproducibility’ mean? Those who study the science of science joke that the definition of reproducibility itself is not reproducible. Reproducibility can occur across different realms: empirical, computational and statistical. Replication can be analytical, direct, systematic or conceptual. Different people use reproducibility to mean repeatability, robustness, reliability and generalizability.

Economists and social scientists often use the term to mean that computer code and data are available so that someone would be able, if so inclined, to redo the same analysis using the same data. For bench scientists, who made up most of our respondents, it usually means that another scientist using the same methods gets similar results and can draw the same conclusions. We asked respondents to use this definition.

Even with a fixed definition, the criteria for reproducibility can vary dramatically between scientists. Senior scientists will not expect each tumour sample they examine under a microscope to look exactly like the images presented in a scientific publication; less experienced scientists might worry that such a result shows lack of reproducibility.

Scientists will need more rigorous use of terminology to get to grips with the problem. For now, broad-brush discussions and solutions are useful. Researchers lament that experiments that cannot be repeated do not give a solid foundation to build on.

Pressure to publish, selective reporting, poor use of statistics and finicky protocols can all contribute to wobbly work. Researchers can also be hampered from building on basically solid work by difficult techniques, poorly described methods and incompletely reported data. Funding agencies and publishers are helping to reduce these problems. Funders have changed their grant requirements and awarded grants for the design of courses to improve statistical literacy; journals are supporting technologies and policies that help to address inadequate documentation. For example, Nature’s Protocol Exchange website is available to host a protocol for any experiment, pre- or post-publication.

One-third of survey respondents report that they have taken the initiative to improve reproducibility. The simple presence of another person ready to question whether a data point or a sample should really be excluded from analysis can help to cut down on cherry-picking, conscious or not. A couple of senior scientists have set up workflows that avoid having a single researcher in charge of preparing images or collecting results. Dozens of respondents reported steps to make better use of statistics, randomization or blinding. One described an institution-level initiative to teach scientists computer tools so they could share and analyse data collaboratively. Key to success was making sure that their data-management system also saved time. Another respondent spent three months working on a set of tools that enables different researchers to apply the same equations across different software and computing environments and found that it led to praise, publications and collaborations.

Nature’s survey was launched before the US National Institutes of Health revised its grant requirements to improve reproducibility, and no survey questions asked explicitly about how research institutions might contribute, or how much time and money respondents would be willing to allocate to dedicated efforts to enhance reliability or replicate work. Our respondents seemed in principle receptive to such initiatives, which is encouraging for those — including Nature — who have already introduced steps to improve reproducibility. More steps are needed — starting with a discussion in the research community on how to properly credit, and talk to each other about, attempted replications.

Journal name: Nature
Volume: 533
Pages: 437
DOI: doi:10.1038/533437a
