

Research efficiency

Turn the scientific method on ourselves

How can we know whether funding models for research work? By relentlessly testing them using randomized controlled trials, says Pierre Azoulay.

In times of tight budget constraints, scientists' wranglings about the real and perceived sins of public funding agencies become particularly acute. Complaints usually lead to the creation of a panel of respected, thoughtful and well-meaning scientists who come up with a plan of reform based on their intuition and experience. Funding agencies, which are genuinely concerned about improving the productivity of the scientific enterprise, often adopt these recommendations, at least in part. In one example of this process, the US National Institutes of Health (NIH) in Bethesda, Maryland, has created a large array of funding mechanisms, each one targeted to a particular problem — including the K99/R00 or 'kangaroo' grants, which pair postdoctoral scientists with mentors to help them to prepare for tenure-track faculty positions and funding independence. Not only is this range of mechanisms confusing and costly to administer, but the effectiveness of such reforms is never seriously evaluated.

It is time to turn the scientific method on ourselves. In our attempts to reform the institutions of science, we should adhere to the same empirical standards that we insist on when evaluating research results. We already know how: by subjecting proposed reforms to a prospective, randomized controlled experiment. Retrospective analyses using selected samples are often little more than veiled attempts to justify past choices.


What could such a formal experiment look like? Let me give an example. It is well documented that the past 30 years have seen a marked increase in the age at which academic scientists achieve funding independence [1]. One way to ensure the continued injection of talent into these ranks would be to evaluate first-time applicants separately from a larger pool, and dedicate to them a predetermined share of the available funding. An alternative would be to keep the current system in place, but to award their proposals 'bonus points'. Which reform should we adopt? And what if the 'greying' of the scientific workforce does not stem from institutional failure, but rather reflects the influence of an ever-expanding burden of knowledge, whereby scientists must spend more time in training before they can become productive [2]?

To test these questions empirically, for example within the NIH (the agency I have studied most closely), we could choose a random subset of funding panels to implement the first method and a second subset to implement the other. A third subset, in which funding panels proceed with business as usual, would serve as the control group in the experiment. Ideally, the study would be designed to avoid 'panel shopping' by applicants; the 100,000 or so R01 grant proposals reviewed each year by the 183 NIH funding panels are more than enough to craft a statistical protocol with adequate power.
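The panel-level randomization described above can be sketched in a few lines. The panel identifiers, arm names and fixed seed below are illustrative assumptions for exposition, not an actual NIH protocol; the point is simply that assigning 183 panels to three equal arms is mechanically trivial and auditable.

```python
import random
from collections import Counter

# Hypothetical sketch of the three-arm design: set-aside review for
# first-time applicants, 'bonus points' for their proposals, and
# business-as-usual as the control.
random.seed(2012)  # fixed seed so the assignment is reproducible and auditable

panels = [f"panel-{i:03d}" for i in range(183)]  # roughly the number of NIH panels
arms = ["set-aside", "bonus-points", "control"]

shuffled = random.sample(panels, k=len(panels))  # random order, no repeats
assignment = {panel: arms[i % len(arms)] for i, panel in enumerate(shuffled)}

# With 183 panels and 3 arms, each arm receives exactly 61 panels.
print(Counter(assignment.values()))
```

Publishing the seed and the assignment rule in advance is one simple way to address the 'panel shopping' concern: applicants cannot steer a proposal toward a preferred arm if the mapping is fixed before submission.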


These experiments could exist outside government agencies, too. When philanthropic organizations develop new models to fund research, they should formally investigate how their approach compares with the dominant model, which averages experts' scores to determine the funding-priority ranking for particular projects. Some emerging models, for example, give higher ranking to projects that elicit enthusiasm and controversy than to projects that generate more consensus but only tepid support across reviewers. It might be that funding a project on the basis of reviewer sentiment is more likely to result in the selection of truly innovative, field-changing projects, but how will we know for sure? A serious evaluation of this question would compare the two systems by randomizing proposals to one of these two ranking approaches, and then examining which portfolio of projects is most successful.
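The contrast between the two ranking rules can be made concrete. In the sketch below, the reviewer scores and the 'controversy bonus' formula are invented for illustration; no funder's actual scoring rule is implied. Proposal A draws uniform, tepid support, while proposal B excites most reviewers and provokes one strong dissent.

```python
from statistics import mean, stdev

# Hypothetical reviewer scores on a 1-9 scale (assumptions for exposition).
proposals = {
    "A": [7, 7, 7, 7],   # solid consensus, no enthusiasm
    "B": [9, 1, 9, 8],   # enthusiasm plus one strong dissent
    "C": [5, 6, 5, 6],   # weak across the board
}

def mean_score(scores):
    # Dominant model: average the experts' scores.
    return mean(scores)

def controversy_score(scores, weight=0.5):
    # Emerging model (sketch): reward disagreement alongside the mean.
    return mean(scores) + weight * stdev(scores)

for label, rule in [("mean", mean_score), ("controversy", controversy_score)]:
    ranking = sorted(proposals, key=lambda p: rule(proposals[p]), reverse=True)
    print(label, ranking)
```

Under the averaging rule, A outranks B (7.0 versus 6.75); under the controversy rule, B jumps ahead. A randomized comparison of the two portfolios is exactly what would tell us whether that reordering selects more field-changing work.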

When I suggest these experiments, I encounter a lot of resistance. Wouldn't this be gambling with scientists' careers? How can we measure success — by counting publications and citations, looking at the students trained as a by-product of these grants, or using other metrics? Won't this work shift scarce funding away from actual scientific investigations?

These criticisms are without merit. The current system already gambles with scientific careers, just in a haphazard way. Scientists often disagree on how to measure success, and the choice of metric, as well as the period necessary for a careful assessment, will always be context-dependent and controversial. With the goodwill of administrators in public agencies or private foundations, these experiments could be rolled out with minimal disruption for about the cost of an R01 grant (typically US$250,000 per year for 5 years). If the scientific community could test even a small number of hypotheses in this way, the system-wide benefits would dwarf this modest investment.

I am well aware that this vision will sound utopian to some. Sceptics abounded when my colleagues at the Massachusetts Institute of Technology in Cambridge founded the Jameel Poverty Action Lab and subjected development-assistance methods to randomized, controlled trials to see which worked best. But as a result of this work, policy-makers now know that it is better to give out free mosquito nets to prevent malaria than to charge even a low price for their purchase — one example among many [3,4].

We inherited the current institutions of science from the period just after the Second World War. It would be a fortuitous coincidence if the systems that served us so well in the twentieth century were equally adapted to twenty-first-century needs. Experimenting on ourselves may well lay bare some shortcomings of the scientific community and expose us to criticisms from politicians, who are always looking for excuses to cut science funding. But the only alternative to such controlled experimentation is the gradual stultification of our most cherished scientific institutions.


  1. Rockey, S. Age Distribution of NIH Principal Investigators and Medical School Faculty (13 February 2012).

  2. Jones, B. F. Rev. Econ. Stud. 76, 283–317 (2009).


  3. Cohen, J. & Dupas, P. Q. J. Econ. 125, 1–45 (2010).


  4. Banerjee, A. V. & Duflo, E. Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty (PublicAffairs, 2011).





Azoulay, P. Turn the scientific method on ourselves. Nature 484, 31–32 (2012).
