Nature | News Feature

1,500 scientists lift the lid on reproducibility

Survey sheds light on the ‘crisis’ rocking research.

More than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments. Those are some of the telling figures that emerged from Nature's survey of 1,576 researchers who took a brief online questionnaire on reproducibility in research.

The data reveal sometimes-contradictory attitudes towards reproducibility. Although 52% of those surveyed agree that there is a significant 'crisis' of reproducibility, less than 31% think that failure to reproduce published results means that the result is probably wrong, and most say that they still trust the published literature.

Data on how much of the scientific literature is reproducible are rare and generally bleak. The best-known analyses, from psychology (ref. 1) and cancer biology (ref. 2), found rates of around 40% and 10%, respectively. Our survey respondents were more optimistic: 73% said that they think that at least half of the papers in their field can be trusted, with physicists and chemists generally showing the most confidence.

The results capture a confusing snapshot of attitudes around these issues, says Arturo Casadevall, a microbiologist at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. “At the current time there is no consensus on what reproducibility is or should be.” But just recognizing that is a step forward, he says. “The next step may be identifying what is the problem and to get a consensus.”

Failing to reproduce results is a rite of passage, says Marcus Munafo, a biological psychologist at the University of Bristol, UK, who has a long-standing interest in scientific reproducibility. When he was a student, he says, “I tried to replicate what looked simple from the literature, and wasn't able to. Then I had a crisis of confidence, and then I learned that my experience wasn't uncommon.”

The challenge is not to eliminate problems with reproducibility in published work. Being at the cutting edge of science means that sometimes results will not be robust, says Munafo. “We want to be discovering new things but not generating too many false leads.”

The scale of reproducibility

But sorting discoveries from false leads can be discomfiting. Although the vast majority of researchers in our survey had failed to reproduce an experiment, less than 20% of respondents said that they had ever been contacted by another researcher unable to reproduce their work. Our results are strikingly similar to those of another online survey of nearly 900 members of the American Society for Cell Biology (see go.nature.com/kbzs2b). The low rate of contact may be because such conversations are difficult. If experimenters reach out to the original researchers for help, they risk appearing incompetent or accusatory, or revealing too much about their own projects.

A minority of respondents reported ever having tried to publish a replication study. When work does not reproduce, researchers often assume there is a perfectly valid (and probably boring) reason. What's more, incentives to publish positive replications are low and journals can be reluctant to publish negative findings. In fact, several respondents who had published a failed replication said that editors and reviewers demanded that they play down comparisons with the original study.

Nevertheless, 24% said that they had been able to publish a successful replication and 13% had published a failed replication. Acceptance was more common than persistent rejection: only 12% reported being unable to publish successful attempts to reproduce others' work; 10% reported being unable to publish unsuccessful attempts.

Survey respondent Abraham Al-Ahmad at the Texas Tech University Health Sciences Center in Amarillo expected a “cold and dry rejection” when he submitted a manuscript explaining why a stem-cell technique had stopped working in his hands. He was pleasantly surprised when the paper was accepted (ref. 3). The reason, he thinks, is that it offered a workaround for the problem.

Others put the ability to publish replication attempts down to a combination of luck, persistence and editors' inclinations. Survey respondent Michael Adams, a drug-development consultant, says that work showing severe flaws in an animal model of diabetes has been rejected six times, in part because it does not reveal a new drug target. By contrast, he says, work refuting the efficacy of a compound to treat Chagas disease was quickly accepted (ref. 4).

The corrective measures

One-third of respondents said that their labs had taken concrete steps to improve reproducibility within the past five years. Rates ranged from a high of 41% in medicine to a low of 24% in physics and engineering. Free-text responses suggested that redoing the work or asking someone else within a lab to repeat the work is the most common practice. Also common are efforts to beef up the documentation and standardization of experimental methods.

Any of these can be a major undertaking. A biochemistry graduate student in the United Kingdom, who asked not to be named, says that efforts to reproduce work for her lab's projects double the time and materials used, in addition to the time taken to troubleshoot when some things invariably don't work. Although replication does boost confidence in results, she says, the costs mean that she performs checks only for innovative projects or unexpected results.

Consolidating methods is a project unto itself, says Laura Shankman, a postdoc studying smooth muscle cells at the University of Virginia, Charlottesville. After several postdocs and graduate students left her lab within a short time, remaining members had trouble getting consistent results in their experiments. The lab decided to take some time off from new questions to repeat published work, and this revealed that lab protocols had gradually diverged. She thinks that the lab saved money overall by getting synchronized instead of troubleshooting failed experiments piecemeal, but that it was a long-term investment.

Irakli Loladze, a mathematical biologist at Bryan College of Health Sciences in Lincoln, Nebraska, estimates that efforts to ensure reproducibility can increase the time spent on a project by 30%, even for his theoretical work. He checks that all steps from raw data to the final figure can be retraced. But those tasks quickly become just part of the job. “Reproducibility is like brushing your teeth,” he says. “It is good for you, but it takes time and effort. Once you learn it, it becomes a habit.”
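Part of that retracing can be automated. The sketch below is one illustrative approach, not Loladze's own method: it records a checksum for each file in an analysis folder and later reports any file that is missing or has changed. The function names (`build_manifest`, `changed_files`) and the file names are hypothetical, and the file contents are placeholders.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(paths: list[Path]) -> dict[str, str]:
    """Map each file name to the checksum of its current contents."""
    return {p.name: sha256_of(p) for p in paths}


def changed_files(manifest: dict[str, str], directory: Path) -> list[str]:
    """List files that are missing or no longer match the manifest."""
    changed = []
    for name, digest in manifest.items():
        path = directory / name
        if not path.exists() or sha256_of(path) != digest:
            changed.append(name)
    return sorted(changed)


if __name__ == "__main__":
    # Hypothetical file names; the contents are placeholders, not real data.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        raw = folder / "raw_measurements.tsv"
        fig = folder / "figure1_values.tsv"
        raw.write_text("sample\tvalue\nA\t1.0\n")
        fig.write_text("sample\tmean\nA\t1.0\n")

        manifest = build_manifest([raw, fig])
        print(changed_files(manifest, folder))  # nothing has changed yet

        raw.write_text("sample\tvalue\nA\t1.5\n")  # simulate an untracked edit
        print(changed_files(manifest, folder))  # the raw file is reported
```

A check like this does not show that an analysis is correct. It shows only that the files behind a figure are the same ones that were recorded when the figure was made.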

One of the best-publicized approaches to boosting reproducibility is pre-registration, where scientists submit hypotheses and plans for data analysis to a third party before performing experiments, to prevent cherry-picking statistically significant results later. Fewer than a dozen people mentioned this strategy. One who did was Hanne Watkins, a graduate student studying moral decision-making at the University of Melbourne in Australia. Going back to her original questions after collecting data, she says, kept her from going down a rabbit hole. And the process, although time-consuming, was no more arduous than getting ethical approval or formatting survey questions. “If it's built in right from the start,” she says, “it's just part of the routine of doing a study.”

The cause

The survey asked scientists what led to problems in reproducibility. More than 60% of respondents said that each of two factors — pressure to publish and selective reporting — always or often contributed. More than half pointed to insufficient replication in the lab, poor oversight or low statistical power. A smaller proportion pointed to obstacles such as variability in reagents or the use of specialized techniques that are difficult to repeat.

But all these factors are exacerbated by common forces, says Judith Kimble, a developmental biologist at the University of Wisconsin–Madison: competition for grants and positions, and a growing burden of bureaucracy that takes away from time spent doing and designing research. “Everyone is stretched thinner these days,” she says. And the cost extends beyond any particular research project. If graduate students train in labs where senior members have little time for their juniors, they may go on to establish their own labs without having a model of how training and mentoring should work. “They will go off and make it worse,” Kimble says.

What can be done?

Respondents were asked to rate 11 different approaches to improving reproducibility in science, and all got ringing endorsements. Nearly 90% (more than 1,000 people) ticked “More robust experimental design”, “better statistics” and “better mentorship”. Those ranked higher than the option of providing incentives (such as funding or credit towards tenure) for reproducibility-enhancing practices. But even the lowest-ranked item, journal checklists, won a whopping 69% endorsement.

The survey — which was e-mailed to Nature readers and advertised on affiliated websites and social-media outlets as being 'about reproducibility' — probably selected for respondents who are more receptive to and aware of concerns about reproducibility. Nevertheless, the results suggest that journals, funders and research institutions that advance policies to address the issue would probably find cooperation, says John Ioannidis, who studies scientific robustness at Stanford University in California. “People would probably welcome such initiatives.” About 80% of respondents thought that funders and publishers should do more to improve reproducibility.

“It's healthy that people are aware of the issues and open to a range of straightforward ways to improve them,” says Munafo. And given that these ideas are being widely discussed, even in mainstream media, taking the initiative now may be crucial. “If we don't act on this, then the moment will pass, and people will get tired of being told that they need to do something.”

Download the full questionnaire used in the survey and the raw data in a spreadsheet (the data are also available as a tab-delimited file at Figshare).
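Readers who want to check the percentages against a tab-delimited file could start from something like the sketch below. It is a minimal illustration, not the analysis used for this article. The column names (`respondent_id`, `crisis`) and the rows in `TOY_ROWS` are invented for the example and are not taken from the survey file, whose real layout may differ.

```python
import csv
import io
from collections import Counter


def percentage_by_answer(tsv_file, column):
    """Return {answer: percentage} for one column, ignoring blank answers."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    counts = Counter()
    for row in reader:
        answer = (row.get(column) or "").strip()
        if answer:  # skip respondents who left the question blank
            counts[answer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {answer: round(100 * n / total, 1) for answer, n in counts.items()}


# Invented toy rows, for illustration only (not survey data).
TOY_ROWS = (
    "respondent_id\tcrisis\n"
    "1\tYes, significant\n"
    "2\tYes, slight\n"
    "3\tYes, significant\n"
    "4\t\n"
    "5\tNo\n"
)

if __name__ == "__main__":
    print(percentage_by_answer(io.StringIO(TOY_ROWS), "crisis"))
```

With a real file, the `io.StringIO` object would be replaced by a file opened with `open(path, newline="", encoding="utf-8")`. Percentages here are computed over respondents who answered the question, which is an assumption and may not match how the published figures were calculated.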

Journal name: Nature
Volume: 533
Pages: 452–454
Date published: ()
DOI: doi:10.1038/533452a

Corrections

An earlier version of the graphic ‘Is there a reproducibility crisis’ inadvertently switched the labels for ‘Don’t know’ and ‘No, there is no crisis’. The labels now appear with the correct percentages.

References

  1. Open Science Collaboration. Science http://dx.doi.org/10.1126/science.aac4716 (2015).

  2. Begley, C. G. & Ellis, L. M. Nature 483, 531–533 (2012).

  3. Patel, R. & Alahmad, A. J. Fluids Barriers CNS 13, 6 (2016).

  4. da Silva, C. F. et al. Antimicrob. Agents Chemother. 57, 5307–5314 (2013).

Author information

Affiliations

  1. Monya Baker writes and edits for Nature from San Francisco.

