What ‘data thugs’ really need

Science needs to develop ways and means to support the checking of data, says Keith Baggerly.
Keith Baggerly is a recently retired professor of bioinformatics at the University of Texas MD Anderson Cancer Center.

I felt a sense of déjà vu last month, reading how a prominent nutritionist had resigned from his professorship after other researchers (self-dubbed ‘data thugs’) identified problems in his work that led to more than a dozen retracted papers. Years earlier, a colleague, Kevin Coombes, and I had spent more than 1,500 hours painstakingly checking analyses and raising alarms about work by Anil Potti, a cancer researcher then based at Duke University in Durham, North Carolina. Our efforts eventually led to retractions, lawsuits and halted clinical trials. It is past time for the scientific community to work out how to ease these kinds of investigations.

The case against nutritionist Brian Wansink at Cornell University in Ithaca, New York, began with his own blogpost celebrating ‘deep dives’ into data sets that turned up attention-grabbing findings about why people overeat, results with clear implications for policy and personal behaviour. Subsequently, early-career researchers Jordan Anaya, Nick Brown, James Heathers and Tim van der Zee delved into the data and found a welter of questionable research practices. They got plenty of publicity but faced, as Heathers put it, a “vacuum” when it came to getting credit or funding for being a scientific critic, or for the many hundreds of hours of careful work required.

In 2006, when Kevin and I started to examine Potti’s data on genetic signatures of cancer, both of us had tenure and resources to devote to the work. (We were funded to provide statistical support to faculty members and could easily argue that they would want to implement techniques Potti described.) Once we established (at least to our own satisfaction) that the results were simply wrong, it was harder to justify our work, but we couldn’t stop: patients were being enrolled in useless trials that would expose them to toxic agents.

Had our careers been at an earlier stage, or had our research institution required us to cover a greater part of our salaries with grant funding, allocating our time in this way would have been harder. There have been plenty of other instances when we’ve felt that we couldn’t face the effort required, even if it might have been good for science.

Lots of people have had to make this call. Just this week, a researcher came to me seeking advice about computational flaws he had spotted in a high-profile publication. He wanted to know whether I agreed that there were problems (I did), and how to proceed. He didn’t want to write a letter that might be seen as ‘angry’ or in any way denigrating of the original authors; he wanted the mistake corrected before it led to gross misallocations of resources. He also wanted to get credit for the work that his team had done in spotting the error, and to protect himself from any negative repercussions.

He had a clear understanding of the science involved, but he was less clear about the politics. When it comes to correcting high-profile work, good advice is available for ‘doing it right’: document your work carefully; ask someone else to double-check it (Kevin and I ran each other’s code to make sure our findings were solid); involve whoever chairs your department; ask authors for explanations (without making accusations), ideally in a live conversation; craft letters, both for the authors and for the journals concerned; and try to find a positive message. Also, to avoid ‘being a crank’, stay away from harsh words, keep a calm tone and write clearly, in full sentences.

There is, however, no good support mechanism for pursuing corrections. Stonewalling and blaming are common. Potti eventually stopped responding to our queries; Wansink’s critics describe similar behaviour. So we were left to rely on other options, including press coverage. Such efforts eventually led to broader investigations. Last month, a faculty committee at Cornell University found that Wansink’s research had included problematic use of statistical techniques, misreporting and sloppy documentation. (Wansink says that mistakes were unintentional, and he generally defends his conclusions.) The US Office of Research Integrity found that Potti had included false data in a grant application and in published manuscripts. (Potti settled with the agency without admitting or denying these findings.)

But corrections are much rarer than they should be. The problem is that most people aren’t going out and checking somebody else’s analysis as part of a broader programme of policing the literature. They’re doing it because they’re interested in a specific claim, which they hope to build on, or at least to understand. So when they do find something problematic, they face a conundrum: walk away and investigate something else, or expend more effort trying to fix the problem — which could take a good deal of time that’s hard to justify, and that could put their own career at risk.

If a scientific result might direct how millions of dollars of research funding are allocated, then it would make sense to have 1% or 2% of that sum devoted to ‘kicking the tyres’. Perhaps scientific funders could earmark a portion from each funding programme specifically for that purpose: researchers who found a problem could apply for a small grant from the original funding agency to try to pursue it, at the programme manager’s discretion. In 2016, the Netherlands Organisation for Scientific Research announced a €3-million (then US$3.3 million) pilot programme that would support researchers who wanted to repeat “cornerstone” research, but such a step is unusual.

Of course, applying for funds to critique work outside your research group could make people apprehensive and defensive. Another approach might be to make this task part of the job description for a scientist at the funding agency. That person would have no stake in the outcome, except to make sure that future funds are well spent. Journals, too, need mechanisms to assess and acknowledge scientific criticism from external sources.

In other words, it is neither wise nor fair to expect self-motivated data vigilantes to police scientific flaws, at least not without clearer reward mechanisms and rules of engagement. Instead, scientific funders should take on a checking role — it is in their own interests.

doi: 10.1038/d41586-018-06903-2