Scientific papers that are not widely read and that lack any great influence can end up being classed as high-impact, claim researchers in California.

The mistake occurs because citations are often just copied from the reference list of one paper to another. A largely unremarkable or unread paper can therefore end up becoming highly cited, the researchers suggest.

“Simple mathematical probability, not genius, can explain why some papers are cited a lot more than others,” says Vwani Roychowdhury, an electrical engineer at the University of California, Los Angeles.

The assertion hinges on a previous analysis by Roychowdhury and his colleague Mikhail Simkin. Last year, they tracked identical errors in reference lists citing a seminal 1973 paper and concluded that almost 80% of authors had not read the paper in question before citing it (see Nature 420, 594; 2002; doi:10.1038/420594a).

The pair have now built on that finding to generate a mathematical model to predict citation levels. They tested their prediction in an analysis of about 24,000 articles from the journal Physical Review D stored on SPIRES, a database of high-energy physics papers. The database sorts papers into six categories according to the number of citations that they receive — those receiving 500 or more are classed as 'renowned'.

Roychowdhury and Simkin's model closely matched the real distribution of citations. In results posted on the arXiv preprint server, they predicted that 40 papers would be cited 500 times or more. In reality, 44 articles in Physical Review D are renowned.

“If people cite randomly, the citation distribution would be the same as in reality,” says Roychowdhury. Given that citation patterns are similar in other sciences besides physics, the outcome of the model should be similar for biology or engineering papers, he argues.
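The copying mechanism can be illustrated with a toy simulation. The Python sketch below is not Roychowdhury and Simkin's model. It is a simplified stand-in, and the function name `simulate_copying` and the parameters `n_papers`, `refs_per_paper` and `copy_prob` are assumptions made for illustration. Each new paper picks earlier papers at random and then either cites the paper it picked or copies one entry from that paper's reference list.

```python
import random


def simulate_copying(n_papers=2000, refs_per_paper=10, copy_prob=0.75, seed=0):
    """Toy citation-copying simulation (illustrative assumptions only).

    Papers are published one at a time. For each reference slot, a new
    paper picks a random earlier paper; with probability copy_prob it
    copies one entry from that paper's reference list, otherwise it
    cites the picked paper itself. Returns the citation count per paper.
    """
    rng = random.Random(seed)
    citations = [0] * n_papers
    reference_lists = [[]]  # the first paper has nothing to cite

    for new_paper in range(1, n_papers):
        refs = set()  # a paper cites any given earlier paper at most once
        for _ in range(refs_per_paper):
            picked = rng.randrange(new_paper)  # any earlier paper
            picked_refs = reference_lists[picked]
            if picked_refs and rng.random() < copy_prob:
                # Copy a citation without "reading" the cited paper.
                refs.add(rng.choice(picked_refs))
            else:
                refs.add(picked)
        for cited in refs:
            citations[cited] += 1
        reference_lists.append(sorted(refs))

    return citations


if __name__ == "__main__":
    counts = simulate_copying()
    top = sorted(counts, reverse=True)[:5]
    print("most-cited papers in the toy run:", top)
    print("papers never cited:", counts.count(0))
```

Running the script with different values of `copy_prob` shows how the spread of citation counts shifts as copying becomes more common. The numbers it produces depend entirely on the assumed parameters and should not be compared with the Physical Review D figures.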

The idea is “fascinating, unorthodox and inspiring”, says bibliometrics expert Anthony van Raan of Leiden University in the Netherlands. But although van Raan's research also suggests that copying is rife, he believes that citation levels generally reflect a paper's true impact. Researchers tend to copy references rationally, rather than randomly, and do so from papers that they have read, he says.

Ben Martin, director of the Science Policy Research Unit at the University of Sussex in Brighton, UK, points out that the new model is important, even if it only partially reflects reality. “Citations are a surrogate for quality,” he says. “Some departments look at these when making decisions in recruitment or tenure. Even if this is just a quirk, it's one that is worth airing.”

http://xxx.lanl.gov/abs/cond-mat/0305150