From where I work at the University of Sydney, you cannot see the ocean. However, in Australia, the ocean is part of our national consciousness. This is perhaps why I think of the research literature as an ocean, linking researchers in disparate yet ultimately connected fields. Just as there is growing alarm about our rising, polluted oceans, scientists are increasingly talking about the swelling research literature and its contamination by incorrect research results.
Most of the talk centres on unconscious bias and ill-informed sloppiness; conversations about intentional deception are more difficult. Unlike most faulty research practices, fraud actively evades detection. It is also overlooked because the scientific community has been unwilling to have frank and open discussions about it.
In 2015, I discovered several papers had been written about a gene that I and my colleagues first reported in 1998. All were by different authors based in China, but contained shared and strange irregularities. They also used highly similar language and figures. I think the papers came from third parties working for profit, fuelled by the pressure on authors to meet unrealistic publication expectations. (Such operations have been identified by investigative journalists.) I also think that, with most of the protein-coding and non-protein-coding genes in the human genome currently understudied, such third parties are targeting less-well-known human genes to produce low-value and possibly fraudulent papers.
How could such manuscripts slip through peer review? If genes are understudied, reviewers are unlikely to have the expertise needed to spot problems. To avoid detection, manuscripts could be distributed to different author groups and submitted over similar time periods across many low-impact journals.
I’m not alone in my suspicions that dubiously produced papers are getting published. Informatician Cyril Labbé at Grenoble Alps University in France and I have developed a tool, Seek & Blastn (go.nature.com/2hsk06q), to identify such papers on the basis of wrongly identified nucleotide sequences. So far, our work has uncovered dozens of papers and resulted in 17 retractions, with several investigations pending (see Nature 551, 422–423; 2017). Before contacting 22 journals, Cyril and I wrote to the corresponding authors of an initial set of 48 papers to describe our results. None replied.
Although papers have been retracted, I know of no formal accusations of misconduct; some authors have said that the experiments were performed as stated but that the results are unreliable. I can point out unexplained similarities, but I cannot prove that the flagged papers came from third parties.
Some might argue that my efforts are inconsequential, and that the publication of potentially fraudulent papers in low-impact journals doesn’t matter. In my view, we can’t afford to accept this argument. Such papers claim to uncover mechanisms behind a swathe of cancers and rare diseases. They could derail efforts to identify easily measurable biomarkers for use in predicting disease outcomes or whether a drug will work. Anyone trying to build on any aspect of this sort of work would be wasting time, specimens and grant money. Yet, when I have raised the issue, I have had comments such as “ah yes, you’re working on that fraud business”, almost as a way of closing down discussion. Occasionally, people’s reactions suggest that ferreting out problems in the literature is a frivolous activity, done for personal amusement, or that it is vindictive, pursued to bring down papers and their authors.
Why is there such enthusiasm for talking about faulty research practices, yet such reluctance to discuss deliberate deception? An analysis of the Diederik Stapel fraud case that rocked the psychology community in 2011 has given me some ideas (W. Stroebe et al. Perspect. Psychol. Sci. 7, 670–688; 2012). Fraud departs from community norms, so scientists do not want to think about it, let alone talk about it. It is even more uncomfortable to think about organized fraud that is so frequently associated with one country. This becomes a vicious cycle: because fraud is not discussed, people don’t learn about it, so they don’t consider it, or they think it’s so rare that it’s unlikely to affect them, and so papers are less likely to come under scrutiny. Thinking and talking about systematic fraud is essential to solving this problem. Raising awareness and the risk of detection may well prompt new ways to identify papers produced by systematic fraud.
Last year, China announced sweeping plans to curb research misconduct. That’s a great first step. Next should be a review of publication quotas and cash rewards, and the closure of ‘paper factories’.
Finally, efforts to police the literature need to be valued as highly as the publication of original data. It is more than ironic that systematic fraud is itself understudied. Like our environment, the literature is a commons, the care of which should be shouldered by all. National funding bodies should dedicate a proportion of their funds to developing, testing and implementing literature-screening approaches. Institutions need to implement faculty evaluations that are alert to fraudulently produced papers, with systems to discipline those found guilty of serious misconduct. Journals also need to devote more resources to monitoring the literature that they help to produce, and to purging it of faked science. They must respond to reported errors and be quick to investigate. They should encourage peer reviewers to be alert to the possibility of fraud and to describe reasonable suspicions.
We create the literature that we deserve. We must act against this under-recognized threat to valid science.
Nature 566, 9 (2019)