Article retractions have been growing steadily over the past few decades, soaring to a record of nearly 14,000 last year, compared with fewer than 1,000 per year before 2009 (see go.nature.com/3azcxan and go.nature.com/3x9uxfn).

Retracting flawed research papers is part of a healthy scientific process. Not all retractions stem from misconduct — they can also occur when mistakes happen, such as when a research group realizes it can’t reproduce its results.

But regardless of how erroneous results found their way into a published paper, it is important that they are not propagated in the scientific literature. No one wants to base their reasoning on false premises. In the same way that many people wouldn’t accept a medical treatment bolstered by shaky clinical trials, the scientific community doesn’t want researchers, the public and, increasingly, artificial intelligence (AI) to rely on erroneous data or conclusions from retracted articles.

One aspect that is often overlooked is what happens to the papers that cite retracted research. For example, in June, a Nature paper1 on stem cells was retracted amid concerns about the reliability of the data shown — 22 years after its publication, having amassed nearly 5,000 citations. Of course, an article lists references for a variety of reasons, such as to provide context, to introduce related work or to explain the experimental protocol. A retraction doesn’t mean that all the papers that cited the retracted article are now unreliable, too — but some might be. At a minimum, researchers should be aware of any retractions among the studies that they have cited. This would enable them to assess potential negative effects on their own work, and to mention the relevant caveats clearly should they continue to cite the retracted paper in the future. But, as far as I know, no systematic process is in place for publishers to alert citing scholars when an article is retracted. There should be.

Beyond retractions, what is needed is a large-scale mechanism to stop errors from propagating in the scientific literature. The tools exist — now, practices need to change.

Shaking up the status quo

Publications and citations are important currency in academia. Yet dubious papers or citations can be difficult to distinguish from genuine ones. Combined with the fact that the editorial, peer-review and publishing processes are highly reliant on trust, this has led to many distortions.

A researcher’s performance metrics — including the number of papers published, citations acquired and peer-review reports submitted — can all serve to build a reputation and visibility, leading to invitations to speak at conferences, review manuscripts, guest-edit special issues and join editorial boards. This can give more weight to job or promotion applications, be key to attracting funding and lead to more citations, all of which can build a high-profile career. Institutions generally seem happy to host scientists who publish a lot, are highly cited and attract funding.

Unscrupulous businesses known as paper mills have sprung up to capitalize on this system. They produce manuscripts based on made-up, manipulated or plagiarized data; sell those fake manuscripts, along with authorship and citations; and engineer the peer-review process.

But reputable publishers are also complicit when they churn out papers from high-profile researchers — including some who might have built their visibility quickly through dubious or dishonest practices — and regularly use those individuals as reviewers and editors. The publishing industry benefits from large volumes of papers, including ones that are not scientifically sound.

Tools for change

Researchers, publishers, institutions and funders must all act to uphold the integrity of the scientific record.

Scientists who discover a suspicious or problematic paper can flag it through the conventional route: contacting the editorial team of the journal in which it appeared. But it can be difficult to find out how to raise concerns, and with whom. Furthermore, this process is typically not anonymous and, depending on the power dynamics at play, some researchers might be unwilling or unable to enter these conversations.

And journals are notoriously slow. The process requires journal staff to mediate a conversation between all parties — a discussion that the authors of the criticized paper are typically reluctant to engage in, and one that sometimes involves extra data and post-publication reviewers. Investigations can take months or even years before the outcome is made public.

Other avenues exist to question a study after publication, such as commenting on the PubPeer platform, where a growing number of papers are being reported. As of 20 August, 191,463 articles had received comments on PubPeer — nearly all of them critical (see https://pubpeer.com/recent). But publishers typically don’t monitor these comments, and the authors of a criticized paper aren’t obliged to respond. It is common for post-publication comments, including those from eminent researchers in the field, to raise potentially important issues that go unacknowledged by the authors and the publishing journal.

In February 2021, I launched the Problematic Paper Screener (PPS; see go.nature.com/473vsgb). This software originally flagged randomly generated text in published papers. It now tracks a variety of issues to alert the scientific community to potential errors.

I devised a tool for the PPS that combs the literature for nonsensical ‘tortured phrases’ (see ‘Lost in translation’). Each tortured phrase must first be spotted by a human reader, then added as a ‘fingerprint’ to the tool, which regularly screens the 130 million scientific documents indexed by the data platform Dimensions. So far, 5,800 fingerprints have been collated. Humans are involved in a third step, checking the flagged papers for false positives. (Dimensions is in the portfolio of Digital Science, which is part of Holtzbrinck, the majority shareholder in Nature’s publisher, Springer Nature.)
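
In outline, the screening step reduces to matching a curated list of fingerprints against full texts. The Python sketch below is a minimal illustration of that idea, not the PPS code itself: the fingerprint list is a toy sample of the roughly 5,800 collated so far, and the toy corpus stands in for the Dimensions index, which cannot be reproduced here.

```python
# Minimal sketch of fingerprint-based screening (not the actual PPS code).
# Step 1 (human): curate tortured-phrase fingerprints.
# Step 2 (machine): scan document full texts for those fingerprints.
# Step 3 (human): review the hits to weed out false positives.
import re

# A toy sample of fingerprints; the real list holds ~5,800 entries.
FINGERPRINTS = {
    "counterfeit consciousness": "artificial intelligence",
    "bosom malignancy": "breast cancer",
    "kidney disappointment": "kidney failure",
}

def screen(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected phrase) pairs found in one document."""
    lowered = text.lower()
    return [(tortured, expected)
            for tortured, expected in FINGERPRINTS.items()
            if re.search(r"\b" + re.escape(tortured) + r"\b", lowered)]

# A toy corpus stands in for a full-text index such as Dimensions.
corpus = {"paper-001": "We apply counterfeit consciousness to detect bosom malignancy."}
for doc_id, text in corpus.items():
    for tortured, expected in screen(text):
        print(f"{doc_id}: '{tortured}' (expected: '{expected}') -> queue for human review")
```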

Lost in translation

Nonsensical phrases in scientific papers can sound alarm bells.

Co-authors, editors, referees and typesetters should keep an eye out for unnatural phrases in articles. Such phrases can expose text that has been generated by artificial intelligence, or produced by an elaborate form of copy-and-paste that runs passages through a translation tool to make them unrecognizable to plagiarism-detection software.

Yet even published articles that are riddled with dozens of these ‘tortured phrases’ are slow to be investigated, corrected or retracted. As of 20 August, the Problematic Paper Screener that I launched in 2021 had flagged more than 16,000 papers containing 5 or more tortured phrases — only 18% of which had been retracted (see go.nature.com/3mbey8m).

Tortured phrase (Expected phrase)

Counterfeit consciousness (Artificial intelligence)

Man-made brainpower (Artificial intelligence)

Bosom malignancy (Breast cancer)

Kidney disappointment (Kidney failure)

DNA fix (DNA repair)

DNA harm/hurt (DNA damage)

Lactose bigotry (Lactose intolerance)

Invulnerable framework (Immune system)

Since my colleagues and I reported that tortured phrases had marred the literature2, publishers — and not just those deemed predatory — have retracted hundreds of articles. Springer Nature alone, for example, has retracted more than 300 articles featuring nonsensical text (see go.nature.com/3ytezsw).

And I am increasingly concerned by the number of articles that cite retracted studies — even after their retraction3.

To facilitate ongoing checks and continuous clean-up of the literature, I have devised two other tools for the PPS. One is the Annulled Detector, which keeps track of papers that have been retracted, withdrawn or removed — these are the various labels used by publishers to flag that a study is no longer valid. The detector harvests data from individual publishers, the Crossref database (which includes the Retraction Watch database) and the biomedical database PubMed to track the global landscape of retractions, withdrawals and removals. Some 62,000 such ‘annulled’ articles have now been cited more than 836,000 times overall (see go.nature.com/4dp5d7f).
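
To make the harvesting concrete, here is a minimal sketch against the public Crossref REST API, which lets anyone look up the editorial updates (retraction, withdrawal or removal notices, among others) registered against a DOI. The `updates` filter and the `update-to` field reflect my reading of Crossref’s documentation and should be verified before reuse; the real detector also draws on PubMed and individual publishers’ feeds.

```python
# Sketch of one harvesting step: ask Crossref which editorial updates
# (retractions, withdrawals, removals, corrections, ...) target a DOI.
# Filter and field names follow the public Crossref REST API as I
# understand it -- verify against the current documentation.
import requests

API = "https://api.crossref.org/works"

def annulment_notices(doi: str) -> list[dict]:
    """Update notices (e.g. retractions) registered against `doi` in Crossref."""
    resp = requests.get(API, params={"filter": f"updates:{doi}"}, timeout=30)
    resp.raise_for_status()
    notices = []
    for item in resp.json()["message"]["items"]:
        for update in item.get("update-to", []):
            if update.get("DOI", "").lower() == doi.lower():
                notices.append({"type": update.get("type"),
                                "notice_doi": item.get("DOI")})
    return notices

print(annulment_notices("10.1234/example-doi"))  # placeholder DOI, not a real paper
```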

The other is the Feet of Clay Detector, which quickly spots articles that cite annulled papers in their reference lists (see go.nature.com/3ysnj8f). I have posted PubPeer comments on more than 1,700 such articles to prompt readers to assess the reliability of the references.
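
The same public metadata supports a rough approximation of such a check: fetch an article’s reference list (where the publisher has deposited it openly — see the point on reference metadata below) and test each cited DOI for annulment notices. A sketch under those assumptions, not the detector’s actual implementation:

```python
# Sketch: flag references to annulled papers in one article's reference list.
# Relies on openly deposited reference metadata in Crossref; the filter and
# field names are assumptions to verify against Crossref's documentation.
import requests

def cited_dois(doi: str) -> list[str]:
    """DOIs cited by `doi`, when its references are openly deposited."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    refs = resp.json()["message"].get("reference", [])
    return [r["DOI"] for r in refs if "DOI" in r]

def is_annulled(doi: str) -> bool:
    """True if Crossref records a retraction, withdrawal or removal for `doi`."""
    resp = requests.get("https://api.crossref.org/works",
                        params={"filter": f"updates:{doi}"}, timeout=30)
    resp.raise_for_status()
    return any(update.get("type") in {"retraction", "withdrawal", "removal"}
               for item in resp.json()["message"]["items"]
               for update in item.get("update-to", []))

def feet_of_clay_check(doi: str) -> list[str]:
    """Cited DOIs that carry an annulment notice -- candidates for reassessment."""
    return [ref for ref in cited_dois(doi) if is_annulled(ref)]
```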

Prevent and cure

There are simple steps, using widely available tools, that would significantly bolster the reliability of the scientific literature.

Authors should check for any post-publication criticism or retraction when using a study, and certainly before including a reference in a manuscript draft.

Two PubPeer browser extensions are instrumental here. One plug-in automatically flags any paper that has received comments on PubPeer — which can include notices of corrections and retractions — as readers browse journal websites. The other works in the reference manager Zotero to identify the same articles in a user’s digital library. For local copies of downloaded PDFs, the publishing industry provides Crossmark: readers can click the Crossmark button to check the current status of the article on its landing page at the publisher’s website.

Tools exist to check reference lists, such as RetractoBot, which alerts scholars when papers that they have cited are retracted. And the Feet of Clay Detector can be used, free of charge, to check whether the reference list of a published article raises any red flags. It can run checks on anything from a single article (using just its title) to an entire publisher’s portfolio, making it easy for individual researchers and journals to screen the literature that is of interest to them.

Science would also benefit from more active post-publication scrutiny, with more readers reporting concerns on PubPeer and to publishers. In turn, the authors of a criticized paper should engage in good faith in discussions with their peers and/or the relevant journal, and work towards a swift resolution.

Readers — especially reviewers — should be aware of red flags, such as tortured phrases and the possible machine generation of text by AI tools (including ChatGPT). Phrases that look as though they might be machine-generated can be checked with tools such as the PPS Tortured Phrases Detector2.

Journals should also contact the researchers who reviewed an article that was later retracted on technical grounds — for their own information and, if the technical issue falls within their area of expertise, to prompt them to be more cautious in future.

Publishers are best placed to make impactful changes to their practices and processes. They should routinely run submitted manuscripts through tools that check for plagiarism; doctored images; tortured phrases; retracted or questionable references; non-existent references erroneously generated (‘hallucinated’) by AI tools; and citation plantations (large shares of references that benefit certain individuals).
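
One of these checks lends itself to a compact illustration: a reference whose DOI resolves to no record anywhere is a candidate ‘hallucinated’ citation. The sketch below queries the Crossref API for each cited DOI; a missing record is a red flag rather than proof, because the DOI might be mistyped or registered with another agency.

```python
# Sketch: surface candidate 'hallucinated' references by checking whether
# each cited DOI resolves to a Crossref record. Absence is a red flag for
# a human to investigate, not proof of fabrication (the DOI could be
# mistyped or registered with another agency, such as DataCite).
import requests

def unresolvable_references(ref_dois: list[str]) -> list[str]:
    """Cited DOIs with no Crossref record."""
    missing = []
    for doi in ref_dois:
        resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
        if resp.status_code == 404:
            missing.append(doi)
    return missing

print(unresolvable_references(["10.1234/plausible-but-fake-doi"]))  # placeholder DOI
```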

These checks and balances are being integrated into the STM Integrity Hub, currently in development by STM, the association of the academic publishing industry, to serve the editorial offices of subscribing publishers. The software aims to detect duplicate submissions across publishers, and to flag to editors any suspicious signals, such as tortured phrases, comments on PubPeer or retracted references.

As well as preventive measures, publishers should speed up and scale up their curative efforts when it comes to investigations, corrections and retractions. They should take firm responsibility for the articles they have published, and conduct regular checks so that retractions in their portfolios do not go unnoticed.

To help independent tools such as the Feet of Clay Detector to harvest data on the current status of articles in their journals, all publishers should publicly release the reference metadata of their entire catalogue.

They should also identify retracted articles “unmistakably”, as stated in the guidelines from the Committee on Publication Ethics (COPE; see go.nature.com/4dh7fdg). Most publishers edit the article PDF to include a watermarked ‘Retracted’ banner. But any copy downloaded before the retraction took place won’t carry this crucial caveat.

‘Expressions of concern’ from publishers should be more widespread. Such notes serve to alert readers that the reliability of a paper’s conclusions has been called into question.

And when a study is retracted, it should trigger a cascade of reassessments — and, in some cases, ‘cascade retractions’. All the papers that cite the retracted study should be re-examined and, if their conclusions hinge on the now-retracted results, corrected or retracted as appropriate.

Overall, to facilitate all of these steps, publishers must update their practices and allocate more resources to both editorial and research-integrity teams.

Hold all parties accountable

Finally, another lever that could curb the propagation of errors in the literature is accountability. At the moment, there are few consequences for failing to correct or retract erroneous papers, and little reward for flagging them — a time-consuming endeavour. Universities and funders must give priority to good, solid science over indirect metrics, such as the number of papers published and the impact factors of the journals in which they appeared. Contributions to correcting the scientific record should be viewed more positively, perhaps as a form of community service.

As publishers retract ever more articles, I urge them to transfer to charity the article-processing charges they received on publication. For instance, IOP Publishing, owned by the Institute of Physics in London, was among the first publishers to retract articles on the basis of tortured phrases. It donates revenues from its retracted articles to Research4Life, an organization that provides institutions in low- and middle-income countries with online access to the academic literature.

A combined preventive and curative effort from all involved is key to sustaining the reliability of the scientific literature — a crucial undertaking for science and for public trust.