Computer systems have been prone to error since the early days. Credit: USAF/Getty

The finding seemed counterintuitive: warming in North America was driving plant species to lower elevations, not towards the higher, cooler climes that ecologists had long predicted. But the research, published in Global Change Biology, turned out to be wrong. In February, the journal retracted the paper after its intriguing conclusion was found to be the result of errant software code1.
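How a single coding slip can flip a conclusion is easy to illustrate. The Python sketch below is purely hypothetical (the numbers and variable names are invented, and it is not the retracted study's actual code): swapping the operands of one subtraction turns an upslope shift in a species' range into an apparent downslope one.

```python
# Hypothetical illustration only; not the retracted study's code.
# Mean elevation (metres) of a species' occurrences in two survey periods.
elevation_earlier = 1450.0  # historical surveys
elevation_recent = 1510.0   # resurveys; the species actually moved uphill

# Correct: recent minus earlier, so a positive value means an upslope shift.
shift_correct = elevation_recent - elevation_earlier   # +60 m, upslope

# Buggy: operands swapped, so the same data appear to show a downslope shift.
shift_buggy = elevation_earlier - elevation_recent     # -60 m, "downslope"

print(f"correct shift: {shift_correct:+.0f} m")
print(f"buggy shift:   {shift_buggy:+.0f} m")
```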

Worried about a rising tide of results that fail to measure up, journals are starting to take action. In the latest such move, Nature Biotechnology announced on 7 April a plan to prevent such embarrassing episodes in its pages (Nature Biotechnol. 33, 319; 2015). Its peer reviewers will now be asked to assess the availability of documentation and algorithms used in computational analyses, not just the description of the work. The journal is also exploring whether peer reviewers can test complex code using services such as Docker, a piece of software that allows study authors to create a shareable representation of their computing environment.
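Docker packages an entire computing environment into a shareable image; a far lighter-weight habit pointing in the same direction is for an analysis script simply to record the interpreter and library versions it ran against. The Python sketch below is a generic illustration of that idea, not part of Nature Biotechnology's or Docker's workflow, and the package names are placeholders.

```python
# Minimal sketch: log the software environment an analysis ran in.
# Docker captures the whole system; this records only interpreter and
# package versions, but illustrates the idea of a shareable record.
import json
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or a note if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_record(packages):
    """Describe the Python environment used for an analysis."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    # 'numpy' and 'scipy' stand in for whatever the analysis actually imports.
    print(json.dumps(environment_record(["numpy", "scipy"]), indent=2))
```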

Researchers say that such measures are badly needed. They note that the increasing size of data sets and complexity of analysis software make errors harder to detect. “This is a big step forward,” says Ciera Martinez, a plant biologist at the University of California, Davis. “A large journal focusing on reproducibility is desperately needed.” Computational experts often raise issues about code quality or availability during peer review, she adds, but such concerns tend to be ignored because many journals do not require examination of code.

The result can be errors or irregularities that lead to retractions, corrections and divisive debates. In announcing its policy, Nature Biotechnology cited two of its studies that were called into question by scientists who could not replicate the conclusions. Both papers2,3 had reported new methods for analysing connections within networks, but neither provided sufficient documentation of their tools or approach. The journal has now published more information about how software was used in each analysis.

“We are simply seeking to make our editorial evaluation of computational tools more consistent,” says Nature Biotechnology editor Andrew Marshall, who adds that other journals that publish computational-biology research have taken similar steps.

But several issues complicate the drive for software reproducibility. One is the difficulty of finding qualified reviewers for papers in disciplines that cross departmental boundaries. “The research is collaborative, but the review process is stuck in a disciplinary mindset,” says Lior Pachter, a computational biologist at the University of California, Berkeley.

Another is social: there is no etiquette governing how those who wish to replicate results should behave towards those whose work they examine. If authors of erroneous studies face public embarrassment and shaming, that can discourage other researchers from submitting to the same scrutiny. “It’s like taking your clothes off; you don’t want to be embarrassed by someone pointing at you because you have a lot of body hair,” says Ben Marwick, an archaeologist at the University of Washington in Seattle.

Mindful of such concerns, advocates of software reproducibility are placing less emphasis on the publications themselves. Instead, they argue that published tools should be usable by other researchers, an approach that they say acknowledges the iterative nature of science.

“When we say ‘open science’ or ‘open research’, it’s not just about accessibility and availability of content or material,” says Kaitlin Thaney, director of the non-profit Mozilla Science Lab in New York. “It’s taking it one step further to think about use and reuse, so someone can carry that forward.”

An increasing number of initiatives aim to encourage scientists to ensure that their software is replicable. Courses run by organizations such as the non-profit Software Carpentry Foundation teach the value of writing and sharing solid scientific code, as well as the principles of constructing it. Software packages such as IPython and knitr make it easier to document code creation transparently and in its research context. The Mozilla Science Lab has experimented with training researchers in the scientific-coding process, and universities such as the University of California, Berkeley, are creating courses that train graduate students to code in a way that advances the cause of open and reproducible science.
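Those tools lower the cost of a habit that any language supports: keeping code, inputs and assumptions in one self-describing, re-runnable place. The sketch below is a tool-agnostic Python illustration of that habit rather than an example from IPython or knitr themselves, and the measurements are invented: a fixed random seed and an explicit docstring make a small bootstrap analysis repeatable by anyone.

```python
# Sketch of self-documenting, repeatable analysis code; the data are invented.
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, seed=42):
    """Bootstrap 95% confidence interval for the mean of `values`.

    The random seed is fixed so that re-running the script reproduces
    exactly the same interval, which is the point of the exercise.
    """
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    return means[int(0.025 * n_resamples)], means[int(0.975 * n_resamples)]


if __name__ == "__main__":
    measurements = [5.1, 4.8, 5.5, 5.0, 4.9, 5.2, 5.3]  # placeholder data
    low, high = bootstrap_mean_ci(measurements)
    print(f"mean = {statistics.mean(measurements):.2f}, "
          f"95% bootstrap CI = ({low:.2f}, {high:.2f})")
```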

The cause has been slow to catch on in the upper echelons of research. But those pushing for greater replicability hope that a combination of incentives could begin to make a difference. Measures aimed at the publication process, such as those announced by Nature Biotechnology, will hit home for many researchers. Others may be lured by the notion that replicable work is more likely to stand the test of time. “The incentive for me, as a young researcher, is simple,” says Martinez. “Better science.”