News
Published: 03 October 2013

Formula predicts research papers' future citations

Richard Van Noorden

Nature (2013)

Mathematical model allows the success of publications, and perhaps scientists, to be forecast.

Figure: 'Predicting the future'. Credit: Data from Dashun Wang / Science

It sounds like a science administrator’s dream — or a scientist's worst nightmare: a formula that predicts how often research papers will be cited. But a team of data scientists now says it could be possible. They report¹ today that a simple model allows reasonably accurate predictions of a paper’s future performance on the basis of about five years of its citation history.

"We would like to be able to predict as early as possible, and with relatively stable accuracy, how impactful a particular paper will be in the future," says study co-author Dashun Wang at the IBM Thomas J. Watson Research Center in New York. Others say that the work, published today in Science¹, is an interesting advance, but is not yet useful for policy-makers. Even so, it could mark the beginnings of measures that focus on predicting a paper's future influence, rather than evaluating past performance.

The forecasting model relies on picking up clues from how a paper is cited in its early years. Surprisingly, the model does not need to know the subject of the paper, who published it or in which journal. Instead, it assumes that just three basic factors influence how a paper gains citations. The first is, of course, the underlying appeal of its ideas. But another is how immediately it gets cited: if a paper gains an early boost, its visibility makes it more likely to pick up more citations — a well-known ‘rich-get-richer’ network effect that speeds the paper's approach to peak influence². A third is that novelty fades; eventually, a paper's citation rate approaches zero.

Ultimate impact

Wang, working with Albert-László Barabási, a network theorist at Northeastern University in Boston, Massachusetts, and with Chaoming Song, a physicist at the University of Miami in Florida, built a model suggesting that if relative differences in these factors were mathematically corrected for, all papers would follow the same citation pattern over time. For any research paper, it is then a matter of finding out which relative values (for underlying appeal, rate of initial growth and decay rate) best adjust the observed citation pattern to the universal curve. And with those values in hand, the model can predict a range for the paper's future performance and its probable lifetime citations — or 'ultimate impact', as the team calls it.
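The article does not spell out the formula. As a rough sketch, the Science paper (ref. 1) gives the expected cumulative citations of a paper t years after publication as c(t) = m[exp(λ·Φ((ln t − μ)/σ)) − 1], where Φ is the standard normal cumulative distribution, λ plays the role of underlying appeal, μ governs how quickly citations arrive and σ how slowly they die away. The function names below, the example parameter values and the constant m = 30 (taken here to be the paper's global constant for a typical reference-list length) are assumptions made for illustration, not the team's code.

```python
import math

M = 30.0  # assumed global constant (typical number of references per paper)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cumulative_citations(t, lam, mu, sigma, m=M):
    """Expected cumulative citations t years after publication."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return m * (math.exp(lam * normal_cdf(z)) - 1.0)


def ultimate_impact(lam, m=M):
    """Lifetime citations: the limit of cumulative_citations as t grows."""
    return m * (math.exp(lam) - 1.0)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the curve's shape.
    lam, mu, sigma = 2.0, 1.0, 1.0
    for year in (1, 2, 5, 10, 30):
        print(year, round(cumulative_citations(year, lam, mu, sigma), 1))
    print("ultimate impact:", round(ultimate_impact(lam), 1))
```

In this form the ultimate impact depends only on λ, because Φ tends to 1 at long times whatever the values of μ and σ.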

The researchers tested their model on physics papers from the 1960s. They made predictions on the basis of five years' citation data, and found that 25 years later, 93.5% of papers fell within their predicted citation range. In fact, says Wang, forecasts can be made for many papers on the basis of less than five years' data, because their citations peak at around two years and then die out. The model also works on papers from the 1990s and 2000s, he says.
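The test described above amounts to fitting the three parameters to a paper's first five years of citations and extrapolating. A minimal sketch follows, assuming the functional form from the Science paper (ref. 1), c(t) = m[exp(λ·Φ((ln t − μ)/σ)) − 1], and using a coarse least-squares grid search in place of whatever fitting procedure the team used. The citation counts are synthetic, the helper names are hypothetical, and the sketch returns a single point forecast rather than the predicted range the article describes.

```python
import math

M = 30.0  # assumed global constant from the published model


def cumulative_citations(t, lam, mu, sigma, m=M):
    """Expected cumulative citations t years after publication."""
    z = (math.log(t) - mu) / sigma
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return m * (math.exp(lam * phi) - 1.0)


def grid(start, stop, step=0.1):
    """Evenly spaced values from start to stop inclusive, rounded to one decimal."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 1) for i in range(n)]


def fit_paper(years, counts):
    """Return the (lam, mu, sigma) on the grid with the smallest squared error."""
    best, best_err = None, float("inf")
    for lam in grid(0.5, 4.0):
        for mu in grid(0.0, 3.0):
            for sigma in grid(0.2, 2.0):
                err = sum(
                    (cumulative_citations(t, lam, mu, sigma) - c) ** 2
                    for t, c in zip(years, counts)
                )
                if err < best_err:
                    best, best_err = (lam, mu, sigma), err
    return best


# Synthetic "observed" history: five yearly cumulative counts from known parameters.
years = [1, 2, 3, 4, 5]
counts = [cumulative_citations(t, 2.0, 1.0, 1.0) for t in years]

lam, mu, sigma = fit_paper(years, counts)
forecast_30y = cumulative_citations(30, lam, mu, sigma)
lifetime = M * (math.exp(lam) - 1.0)  # ultimate impact in this functional form
print(f"fitted: lam={lam}, mu={mu}, sigma={sigma}")
print(f"forecast at 30 years: {forecast_30y:.1f}, lifetime: {lifetime:.1f}")
```

With real citation counts the early data are noisy, so several parameter combinations may fit almost equally well; that is presumably one reason the published forecasts are given as ranges.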

In some cases, however, there was a lot of uncertainty, so the range of future prediction was very wide (see 'Predicting the future' above). And 6.5% of papers defied the model entirely: these were ones that did not stand out in their first five years, but gained a second wind and became influential later on.

Anthony van Raan and Ludo Waltman, who study science citation networks and mapping at Leiden University in the Netherlands, told Nature that the model was elegant and the paper important. But policy-makers should not get excited just yet, says van Raan. "Prediction of citation impact five years after a publication's appearance is of little use in a policy context." And he cautions that even if a range of lifetime citation figures can be predicted, administrators should remember that citations inevitably differ between fields; for example, biologists cite each other more than physicists do.

Increasing complexity

Wang says that he will improve the model by incorporating more complex elements — such as the paper's topic, or where it was published. "The focus here is on the minimal factors needed. The surprising thing to me is we can achieve this level of predictability just by looking at the citations over time," he says.

The model is also applicable to collections of papers — for example, all the papers published in a given journal, by one institute or by a particular scientist. The last prospect is intriguing, because existing metrics by which scientists are judged, such as the h-index, have little capacity to predict future performance, says Wang. Although his model can predict how a scientist’s past papers will fare in later years, that does not imply that the scientist’s future papers will have similar impact. Even so, says Wang, just finding out how a scientist's future impact relates to past impact would be useful, because it would allow one to quantify, on the basis of citations, “to what extent an individual scientist’s career is predictable”. Wang now hopes to build a website that would produce citation forecasts for any research paper.

If administrators did use metrics to predict future impact, it could change how science is done, says James Evans, a sociologist at the University of Chicago in Illinois who wrote an article³ on the future of research to accompany Wang's paper in Science. Scientific discovery might move faster than it already does, he says. But, he warns, “knowing only the momentum of an article’s reception could act as a self-fulfilling prophecy” — where everyone would cluster around hot papers and ditch research areas that might later have proven fruitful.

References

1. Wang, D., Song, C. & Barabási, A-L. Science 342, 127–132 (2013).
2. Barabási, A-L. & Albert, R. Science 286, 509–512 (1999).
3. Evans, J. A. Science 342, 44–45 (2013).


Cite this article

Van Noorden, R. Formula predicts research papers' future citations. Nature (2013). https://doi.org/10.1038/nature.2013.13881
