Count on me

Journal name: Nature
Volume: 489
Page: 177
DOI: 10.1038/489177a

Sometimes, the use of metrics to assess the value of scientists is unavoidable. So let's come up with the best measure possible.

In an ideal world, scientists applying for grants or jobs would be judged holistically — balancing quantitative measures such as their publication record against indications of their potential from recommendation letters, personal interactions and other activities. So even if a candidate had not generated many papers, it would count in their favour if the few they had published had received positive post-publication review (comments, tweets and blog posts, for instance). Also favourable would be a tendency to ask insightful questions at talks that lead to valuable discussions and new experiments, or a willingness to share reagents and expertise with their colleagues. That would be ideal. But that is not the world in which most scientists live.

Instead, hiring committees and grant reviewers sweat through hundreds of applications, often with only enough time to give each submission a cursory glance. In 2010, a Nature poll found that most administrators say that metrics, the quantifiable measures of scientists' achievements, matter less in job decisions than scientists often think (see Nature 465, 860–862; 2010), but good peer review is often simply not possible.

As a result, evaluators are increasingly turning to metrics, such as total citation count and the h-index, a measure of both the quality and quantity of papers (a scientist has an h-index of 12 if they have published 12 papers that have each received at least 12 citations). Naturally, many scientists object to such cold quantification of their contribution. Plus, all metrics have obvious flaws — a paper may gather many citations not because of its importance, but because it is in a large field that publishes frequently, so generates more opportunities for citations. Review articles, which may not add much to the research, count the same as original research papers, which contribute a great deal. And all existing metrics capture only what a scientist has done, not what he or she might be capable of. Clearly, there is a need for more and better measures.
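The h-index definition given above can be made concrete with a short sketch. The function name and the citation counts below are illustrative assumptions, not data from any real scientist.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for 14 papers (made up for illustration).
example = [40, 33, 30, 25, 22, 20, 18, 17, 15, 14, 13, 12, 5, 1]
print(h_index(example))  # 12: twelve papers have at least 12 citations each
```

Note that the two least-cited papers do not affect the result, and neither would extra citations to the most-cited one, which is part of why the index rewards sustained output rather than a single hit.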

On page 201, Daniel Acuna, Stefano Allesina and Konrad Kording suggest an alternative: the future h-index. Unlike other metrics, this index estimates a scientist's publication prowess five years or so into the future — a useful timescale for tenure decisions.

Using publicly available data on the history of publication, citation and funding for thousands of neuroscientists, researchers working on the fruitfly Drosophila and evolutionary biologists, the authors constructed an algorithm that converts information on a typical scientist's CV — the number of journals published in and articles in top journals for instance — into a number that represents their probable h-index in the years that follow.

Outraged? Please send complaints to the usual address. Interested? Calculate your own future h-index here: go.nature.com/z4rroc.

Nature receives thousands of submissions a year, some of which point out the flaws in existing metrics and propose alternatives. We accepted the piece by Acuna et al. after submitting it to peer review. The reviewers and our editors felt that the authors had used appropriate methods to obtain their algorithm, and its predictive values seemed realistic. Furthermore, the authors are cautious about its value, pointing out that it is probably less accurate for scientists in other disciplines, and should not be considered a replacement for peer review. At the very least, the future h-index should help to address some problems with the current h-index, which tends to favour established scientists because they have had more time to accrue citations. A forward-looking metric may give a leg up to promising, early-career scientists who don't yet have impressive CVs.

Nevertheless, no one wants their career potential to be reduced to a number. Nature publishes many scientific gems that nevertheless achieve few citations; there is no substitute for examining the research itself to appreciate its value. We know that the idea of a new metric published in these pages will raise some anxieties, and a few hackles. But metrics are already being used, so it is important that they create the most accurate picture possible of someone's potential. Plus, they do hold some advantages over peer review, by helping to eliminate the unconscious biases that can creep into personal evaluations.

In that vein, scientists should continue to hunt for metrics that capture a scientist's true value, including aspects such as teaching, reviewing and public-speaking ability, as well as online responses to publications in blogs and comments — 'alt-metrics'. We may not live in an ideal world, but we can still improve the recruitment, reward and opportunities for scientists.

Comments

  1. Comment #50145

    John Bashinski said:

    There is no metric that cannot be gamed, and there aren't very many metrics that are even hard to game. Even arguably "non-gaming" attempts to affect metrics can distort behavior and incentives in negative ways. Not always just the behavior and incentives of the person being measured.

    The more complicated you make a metric, the more room there is for these unintended consequences. The more commonly used a metric is, the more incentive there is to find ways to mess it up.

    This argues against standardizing metrics; a patchwork of metrics is harder to game than a single one, especially if you don't know in advance which one is going to be the important one. And it argues against spending too much time adding bells and whistles to any one metric, because after a certain point you probably won't understand the effects of your changes, which means you're very likely to be making things worse.

    Maybe what y'all need is a diverse set of respectable ways to get published, hired, tenured, or otherwise rewarded?

  2. Comment #50353

    Timir Datta said:

    Do we need metrics? Of course: a simple tool for assessing the 'value' of a scientist's contributions, past or future, can make a lot of decision-making more manageable and perhaps guilt-free. Our entire system of measuring success is based upon parameters that relate to who, what, how much and when. However, I sincerely hope that an index that conjures up future creativity from past statistics proves more useful than rainfall predictions based on the 100-year precipitation table.

    John Bashinski is correct that a new index will be one more item to game the system with. Absolutely, if future possibilities are predicated upon joining the right research group at a destination institution, then that is the trajectory the 'gifted' will play. Criticisms regarding old-boy favoritism would greatly diminish with a double- or triple-blind peer-review process.

    By all measures, a larger and larger fraction of the highest GRE scorers, science and technology graduates and authors of publications in the top journals can be associated with certain regions of the world. Does this mean that a new population is becoming more gifted while the established ones are getting dumber? No; rather, it is a sign that a new segment can afford the cost of transcontinental jet travel and test-preparation fees, and is willing to invest in activities that can and will show up as metrics of quality. This is exactly how one becomes a member of the team, a team player. NSF is more prone to fund when current awardees favorably judge a proposal. What can be wrong with this formula? Doesn't every system desire continuity? Yes, but the past might not be the best indicator of the new.

    Nevertheless, kudos to the peer reviewers of the manuscript, and to Nature for publishing this new tool with an editorial flourish.

  3. Comment #50425

    Peter Cary said:

    On behalf of David Clarke:

    Metrics such as the h-index and the impact factors of the journals in which one publishes are increasingly used to evaluate candidates for faculty positions and promotions, and by granting agencies. A legitimate criticism of using a single metric such as the h-index (the largest number h such that h of one's papers have each been cited at least h times) (Hirsch, J.E., Proc. Natl. Acad. Sci. 102, 16569–16572 (2005)) is that it naturally favours the more senior researcher.

    Here, we propose the (annual) H.Y.P.E. index that provides more information about a candidate, his/her achievements and impact in their field. It can be presented in a fashion similar to that of an IP address:

    (Year):
    H (h-index):
    Y (Years since Ph.D. or M.D.):
    P (Productivity = total number of peer-reviewed papers):
    E (Expectation = number of peer-reviewed papers in the past 5 (or 3) years).

    An example is the current hype index for DMC: (2012): 052: 026: 125: 018.

    A researcher with retractions will report his/her H.Y.P.E.R (number of retracted papers) index.

    Tip W. Loo and David M. Clarke

    Department of Medicine and Department of Biochemistry
    University of Toronto

    e-mail: david.clarke@utoronto.ca
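The IP-address-like H.Y.P.E. format proposed in the comment above can be sketched as a small formatter. The function and parameter names are hypothetical, and the three-digit zero padding is inferred from the single example in the comment. The comment does not say how the retraction count in H.Y.P.E.R. would be written, so appending it as a fifth field is an assumption.

```python
def hype_index(year, h, years_since_degree, papers_total, papers_recent,
               retractions=None):
    """Format a H.Y.P.E. string such as '(2012): 052: 026: 125: 018'."""
    fields = [h, years_since_degree, papers_total, papers_recent]
    # Assumption: a retraction count (the 'R' of H.Y.P.E.R.) becomes a fifth field.
    if retractions is not None:
        fields.append(retractions)
    # Each field is zero-padded to three digits, as in the comment's example.
    return f"({year}): " + ": ".join(f"{value:03d}" for value in fields)


# Values taken from the example given in the comment.
print(hype_index(2012, 52, 26, 125, 18))  # (2012): 052: 026: 125: 018
```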

  4. Comment #50624

    Igor Mazin said:

    I believe we all agree that performance indices are evil, and I believe we all agree that it is not possible to do away with them entirely, for we need to evaluate scientific performance somehow. I further believe that most of us prefer as few indices as possible, and as simple as possible. Finally, I doubt that anybody will dispute Hirsch's original argument that the number of citations is skewed in favor of minor co-authors of a single spectacular paper, and the number of publications is skewed in favor of managers running large-scale operations with dozens of postdocs, so-called paper mills.

    The h-index is better than either of the two, and it reasonably quantifies the total scientific output of an individual scientist, yet it still does not tell us much about the quality of his or her work, as opposed to quantity. Phil Anderson has an h-index slightly in excess of 100, and I know a couple of scientists who are not even nearly in the same league, but who also have h>100, just by running "paper mills" that output 300 papers a year (of which some are statistically destined to be cited many times).

    A solution exists. It is easy to show (the credit goes to Steve Erwin at NRL) that, on average, the h-index depends on the number of publications N as h = 1.09 N^0.69. Two numbers, the actual h and the deviation of h from the value expected from the number of publications, characterize both the total scientific output and the quality of the person's work. A deviation of more than 10% in either direction indicates a truly outstanding scientist or an operator with access to funding.
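The two-number summary described in this comment can be sketched as follows. Only the relation h = 1.09 N^0.69 and the 10% threshold come from the comment. The function names are hypothetical, and reading a positive deviation as unusually well cited work and a negative one as high volume with modest impact is an interpretation, since the comment does not spell out the mapping.

```python
def expected_h(n_papers):
    """Average h-index expected from n_papers, using h = 1.09 * N**0.69."""
    return 1.09 * n_papers ** 0.69


def h_deviation(h, n_papers):
    """Relative deviation of the actual h from the expected value (0.10 means +10%)."""
    expected = expected_h(n_papers)
    return (h - expected) / expected


# Hypothetical scientist: 100 papers and an h-index of 30.
deviation = h_deviation(30, 100)
if abs(deviation) > 0.10:  # threshold quoted in the comment
    direction = "above" if deviation > 0 else "below"
    print(f"h is more than 10% {direction} the value expected from 100 papers")
```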
  5. Comment #52123

    Gunther Eysenbach said:

    There is already a much simpler metric that can predict future citation performance: the Twimpact Factor (the number of tweets an article receives within the first week of publication). Together with the derivative metric "Twindex" (the Twimpact Factor expressed as a rank percentile compared with similar articles), it is highly predictive of which articles will be highly cited 2 years later. In the field of health informatics, top-cited articles can be predicted from top-tweeted articles with 93% specificity and 75% sensitivity.

    Can Tweets Predict Citations? Metrics of Social Impact Based on Twitter and Correlation with Traditional Metrics of Scientific Impact
    J Med Internet Res 2011;13(4):e123
    http://www.jmir.org/2011/4/e123/
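For readers unfamiliar with the two figures quoted in this comment, the sketch below shows how sensitivity and specificity are computed when "top-tweeted" is used as a test for "top-cited". The counts are hypothetical, chosen only so that the results come out close to the quoted percentages. They are not presented as the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Share of top-cited articles that were also top-tweeted."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of articles that were not top-cited and were also not top-tweeted."""
    return true_neg / (true_neg + false_pos)


# Hypothetical confusion-matrix counts (illustration only).
top_cited_and_top_tweeted = 9   # true positives
top_cited_not_top_tweeted = 3   # false negatives
not_cited_but_top_tweeted = 3   # false positives
not_cited_not_top_tweeted = 40  # true negatives

print(f"sensitivity: {sensitivity(top_cited_and_top_tweeted, top_cited_not_top_tweeted):.0%}")
print(f"specificity: {specificity(not_cited_not_top_tweeted, not_cited_but_top_tweeted):.0%}")
```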
