Computer scientists at Columbia University in New York have used a mathematical model to estimate the number of flawed scientific papers that go unretracted, and its relation to journal impact factors.

In correspondence published in EMBO Reports (M. Cokol et al. EMBO Rep. 8, 422–423; 2007), the researchers find that fewer papers are retracted by journals with low impact factors. But their model raises as many questions as it answers, say specialists in scientific publishing, some of whom argue that it greatly oversimplifies the issues.

Murat Cokol and his colleagues at the biomedical-informatics department at Columbia downloaded data for 9.4 million articles published between 1950 and 2004 from PubMed, an index of biomedical and general scientific literature. They identified 596 retracted articles — flagged up as such in PubMed — and found some striking relationships between the numbers of retractions and the impact factors of the journals that had published them.
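
For readers who want to probe the same flag, the sketch below shows one way such records can be counted today with Biopython's Entrez module. It is an illustration only: the contact e-mail is a placeholder, the query string is an assumption about how the PubMed retraction flag is expressed, and the authors themselves worked from a bulk download rather than live queries.

```python
# Illustrative sketch only: counting PubMed records flagged as retracted,
# using Biopython's Entrez utilities. Cokol and colleagues worked from a
# bulk download of 9.4 million records rather than queries like this one.
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks callers to identify themselves (placeholder)

# "Retracted Publication"[Publication Type] is the PubMed flag marking a
# retracted article; the date range matches the study period, 1950-2004.
handle = Entrez.esearch(
    db="pubmed",
    term='"Retracted Publication"[Publication Type] AND ("1950"[PDAT] : "2004"[PDAT])',
    retmax=0,  # only the total count is needed here
)
record = Entrez.read(handle)
handle.close()

print("Retracted articles indexed for 1950-2004:", record["Count"])
```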

The study finds that journals with high impact factors retract more papers, whereas flawed articles in low-impact journals are more likely to go unretracted. It also suggests that high- and low-impact journals differ little in their ability to detect flawed articles before publication.

Finally, the authors ran a model to estimate how many articles should have been retracted, and came up with 10,000 in a best-case scenario and more than 100,000 in a worst-case one. Most of the papers that needed to be retracted were published in low-impact journals.
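
The correspondence's model is not spelled out here, but the rough arithmetic behind such estimates can be illustrated with a deliberately crude sketch: assume that only some fraction of flawed papers is ever detected and retracted, and scale up the 596 observed retractions accordingly. The detection rates used below are hypothetical values chosen only to bracket the 10,000 and 100,000 figures; they are not the authors' parameters.

```python
# Deliberately crude back-of-envelope estimate -- not the model used by
# Cokol and colleagues. If only a fraction `detection_rate` of flawed papers
# is ever caught and retracted, the number that should have been retracted
# scales as observed / detection_rate.

OBSERVED_RETRACTIONS = 596  # retracted articles found in PubMed, 1950-2004


def papers_needing_retraction(detection_rate: float) -> int:
    """Estimated number of flawed papers, given an assumed detection rate in (0, 1]."""
    return round(OBSERVED_RETRACTIONS / detection_rate)


# Hypothetical detection rates chosen only to bracket the figures quoted above.
for rate in (0.06, 0.006):
    print(f"detection rate {rate:.1%}: ~{papers_needing_retraction(rate):,} papers")
# Roughly 10,000 papers at 6% detection and 100,000 at 0.6% detection,
# matching the best- and worst-case range quoted in the correspondence.
```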

The Cokol study was not peer-reviewed. Aviv Bergman, director of the Center for Computational Genetics and Biological Modeling at Stanford University in California, says that the researchers' modelling techniques are sound, but that he isn't in a position to judge their input data.

But scientists and editors familiar with retraction issues are sceptical of the quality of the model's input data. Theoretical modelling exercises will generate bad results if the input data are flawed, says Drummond Rennie, deputy editor of the Journal of the American Medical Association and a medical researcher at the Institute for Health Policy Studies at the University of California, San Francisco.

The number of retracted articles is probably only the tip of the iceberg compared with the number that should have been retracted, Rennie says, but a model based only on journal impact factor and retraction counts is too simple to capture the complex factors that determine the size and nature of the hidden part.

The model lumps together data from 1950 to 2004, for example, although trends are likely to have been affected by the fact that the United States introduced its first official policies on research misconduct only about twenty years ago, and that other nations did so even later, says Rennie.

Experience also shows that retraction figures are skewed by the fact that once misconduct is detected in one article by a researcher, dozens of articles by the same author often need to be retracted. “Digging into the data behind all these other articles is a truly monumental task, but until it's done, no one has much clue what the real number of retractions should be,” says Rennie.

Impact factor alone is also a very broad yardstick, says one scientific-literature specialist at PubMed, who points out that the impact factors of individual articles vary widely, even when they are in the same journal. Other models could also be made to fit the data with potentially very different outcomes, he says. High-impact journals might attract flawed papers, he speculates, simply because they publish cutting-edge research, in which competition and time pressure may favour both errors and misconduct.

In the correspondence, Cokol argues that the larger number of retractions in high-impact journals reflects the fact that they receive more scrutiny. But Sandra Titus, director of intramural research at the US Office of Research Integrity in Rockville, Maryland, says that's too simple a verdict. “It's only part of the issue,” she says, adding that “legal barriers to retraction are so awkward that many journals simply pass rather than face the hassle.”

Cokol defends his approach as a valid one to start exploring data on retraction. “All models are wrong, but some are useful,” he says. “Our model certainly does not capture the reality in full, and no model does. But it captures certain aspects and gives a general direction on how to understand the issue better.”