Published online 28 July 2008 | Nature | doi:10.1038/news.2008.988


Stats reveal bias in NIH grant review

Alternative system could make 'fairer' funding decisions for a quarter of awards.

The system used by the US National Institutes of Health (NIH) to evaluate grant proposals does not adequately compensate for reviewer bias, a new study concludes.

[Image: Is the NIH grant system fair? Alamy]

The assessment of grant reviews generated by more than 14,000 reviewers suggests that the NIH needs to overhaul the peer-review system it uses to rank proposals, according to biostatistician Valen Johnson of the University of Texas M.D. Anderson Cancer Center in Houston, the study's author.

He proposes an alternative system that would affect about 25% of grant awards and change the destination of billions of dollars of research money.

Johnson's analysis comes as the NIH evaluates possible changes to the policies it uses to hand out more than $20 billion of research money annually. Irate biomedical researchers have long criticized the system for its slow funding decisions, along with an apparent favouritism towards established researchers and conventional approaches.

In 2007, NIH director Elias Zerhouni launched an initiative to examine the NIH peer-review process. The agency has since proposed several changes, including shortening grant applications and reviewing young investigators' proposals separately. But the changes did not include efforts to account for reviewer bias.

Most grants are assigned to a study section containing about 30 members. Each application is read by just two to five reviewers, and favoured proposals are then discussed when the section meets. The application is then assigned scores by all of those present, and these are averaged into a final verdict.

This system fails to account for individual bias, and places undue weight on panel members who have not even read the proposals, Johnson argues. "That introduces a really large potential for bias," he says. "It's astounding to me that $20–25 billion a year is being spent on research, yet the selection process is based on a rather primitive system."
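A minimal sketch of the arithmetic behind Johnson's objection (the panel size, scores, and split between readers and non-readers here are hypothetical, chosen only to illustrate the averaging step the article describes):

```python
def final_score(scores):
    """Average the scores from all panel members present,
    as in the current NIH procedure the article describes."""
    return sum(scores) / len(scores)

# Hypothetical study section of 30: three readers who studied the
# proposal closely, and 27 members scoring from the discussion alone.
reader_scores = [1.5, 2.0, 1.8]
nonreader_scores = [2.5] * 27

print(round(final_score(reader_scores), 2))                     # readers alone: 1.77
print(round(final_score(reader_scores + nonreader_scores), 2))  # full panel: 2.43
```

Because every score carries equal weight in the average, the 27 members who never read the proposal move its final score far more than the three who did.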

The power of the individual

Johnson has served on several NIH grant-review committees over the years. While attending a review meeting about five years ago, he realized that the rankings data were not being used effectively.

So he went to the NIH Center for Scientific Review (CSR) — the body that oversees grant applications — and got access to the reviews of almost 19,000 proposals evaluated in 2005. These involved about 14,000 reviewers, with each proposal having been read by an average of 2.8 people. "With only two to three people, on average, reading the proposals, the particular individuals that happen to read it have a major impact on its final score," he says.

Using those data, Johnson looked for patterns in how individual readers evaluated proposals [1]. He found that the top grants were largely unaffected by reader bias, but that such bias did impact grants closer to the funding cut-off line.

Money talks

Johnson recommends that the cost of proposals that fall close to this boundary should be taken into consideration. Favouring less expensive grants would allow the agency to fund more projects, he argues, and thus increase the likelihood that they have supported the best applications. It would also provide an incentive for researchers to request the minimum funds that they need. Under the current system, many applicants ask for the maximum amount they can justify, expecting that they may not receive all that they request.
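The budget arithmetic behind this argument can be illustrated with a toy example (the dollar figures and the simple fund-in-priority-order rule are hypothetical, not Johnson's actual proposal):

```python
def projects_funded(budget, costs):
    """Fund proposals in priority order until the budget runs out."""
    funded = 0
    for cost in costs:
        if cost > budget:
            break
        budget -= cost
        funded += 1
    return funded

# Hypothetical: a $10M residual budget and equally ranked borderline
# proposals; cheaper requests let more projects through.
print(projects_funded(10_000_000, [2_500_000] * 10))  # 4 projects
print(projects_funded(10_000_000, [1_000_000] * 10))  # 10 projects
```

With a fixed pot, halving or quartering the average request multiplies the number of borderline projects that can be supported, which is the incentive Johnson wants applicants to face.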

Toni Scarpa, director of the CSR, chafes at this suggestion. Some studies are inherently more expensive than others, and a proposal that includes a clinical trial should not be penalized for being more expensive than a proposal that does not, he says.

Nevertheless, Johnson's analysis was well executed, says pathologist David Kaplan of Case Western Reserve University in Cleveland, Ohio, who has performed his own analyses of the NIH peer-review system. "The statistical analysis that the NIH has done is pretty elementary," says Kaplan. "Johnson's approach basically says we can do better."

At the moment, it is unclear what the NIH will do with Johnson's analysis. After Scarpa became director of the CSR in 2005, the center asked Johnson to return the data. Johnson returned the original reviews, but kept copies by filing a Freedom of Information Act request.

Scarpa says that the CSR had heard Johnson present preliminary results and was not interested in pursuing the project further. Although the center is interested in revising its scoring procedures, "there was relatively little enthusiasm" about Johnson's analysis, says Scarpa. Instead, the center is considering other changes, such as varying the weight given to different criteria, including innovation or the strength of preliminary data. It is also considering shortening applications and increasing the number of reviewers to about ten, he says.

  • References

    1. Johnson, V. E. Proc. Natl Acad. Sci. USA doi:10.1073/pnas.0804538105 (2008).