China’s largest funder of basic science is piloting an artificial intelligence tool that selects researchers to review grant applications, in an attempt to make the process more efficient, faster and fairer. Some researchers say the approach by the National Natural Science Foundation of China is world-leading, but others are sceptical about whether AI can improve the process.

Choosing researchers to peer review project proposals or publications is time-consuming and prone to bias. Several academic publishers are experimenting with artificial intelligence (AI) tools to select reviewers and carry out other tasks, and a few funding agencies, including some in North America and Europe, have trialled simple AI tools to identify potential reviewers. Some of these systems match keywords in grant applications to those in publications of other scientists.
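To make the keyword approach concrete, here is a minimal sketch of how such matching could work. It is an illustration only, not the method of any agency mentioned here: the reviewer names, the keyword lists and the choice of Jaccard overlap as the score are all assumptions.

```python
def keyword_overlap(a, b):
    """Jaccard similarity between two keyword collections (case-insensitive)."""
    set_a = {k.lower() for k in a}
    set_b = {k.lower() for k in b}
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def rank_reviewers(application_keywords, reviewer_keywords, top_n=3):
    """Return (name, score) pairs for the top_n reviewers, best match first."""
    scored = [
        (name, keyword_overlap(application_keywords, keywords))
        for name, keywords in reviewer_keywords.items()
    ]
    # Sort by descending score; break ties alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical data, invented for illustration.
application = ["graphene", "batteries", "electrochemistry"]
reviewers = {
    "Reviewer A": ["graphene", "electrochemistry", "catalysis"],
    "Reviewer B": ["protein folding", "cryo-EM"],
    "Reviewer C": ["batteries", "solid electrolytes"],
}
ranking = rank_reviewers(application, reviewers)
```

With this made-up data, Reviewer A ranks first because two of the application's three keywords appear in that reviewer's publications. A scheme like this sees only exact keyword matches, which is one reason agencies are looking at richer text analysis.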

The National Natural Science Foundation of China (NSFC) is building a more sophisticated system that will crawl online scientific-literature databases and scientists’ personal web pages, using natural-language processing to glean detailed information about the publications or research projects of potential referees. The system will use semantic analysis of the text to compare the grant application with this information and identify the best matches, says agency head Li Jinghai, who is based in Beijing.
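The NSFC has not published the details of its system. As a rough illustration of comparing an application's full text with a referee's publication record, the sketch below uses TF-IDF weighting and cosine similarity, a much simpler lexical stand-in for the semantic analysis Li describes. The function names, the sample texts and the weighting formula are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    total = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency, so no weight is zero.
            term: count * (math.log((1 + total) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_matches(application_text, reviewer_profiles, top_n=2):
    """Rank reviewers by similarity between their profile and the application."""
    names = sorted(reviewer_profiles)
    texts = [application_text] + [reviewer_profiles[name] for name in names]
    vectors = tfidf_vectors(texts)
    scored = [(name, cosine(vectors[0], vec)) for name, vec in zip(names, vectors[1:])]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical texts, invented for illustration.
application = "Lithium battery electrodes made from graphene"
profiles = {
    "Reviewer A": "Graphene electrodes for lithium batteries and supercapacitors",
    "Reviewer B": "Deep learning for protein structure prediction",
}
matches = best_matches(application, profiles)
```

Reviewer A comes out on top here, but the sketch also shows the limits of word matching: it treats "battery" and "batteries" as unrelated terms. Closing that kind of gap is what a semantic approach is meant to do.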

Time saver

An early version of the tool selected at least one member of each of nearly 44,000 panels that approved projects last year, says Yang Wei, the agency’s former head, who presented data on the pilot at a meeting on scholarly communication in Hangzhou last month. Panels are composed of between three and seven people. The system is already cutting the time administrative staff have to spend looking for referees, says Yang. A similar approach will be used this year to select reviewers, he says.

The NSFC has become a world leader in reforming grant-review processes, says Patrick Nédellec, director of the international-cooperation department of the French CNRS, Europe’s largest basic-research agency. The NSFC is being forced to innovate as the number of grant applications keeps growing, says Nédellec, who attended a meeting last September at which Li discussed the agency’s reform plans. “Because the pressure is so high, China has no choice but to find the best way,” he says.

In the past five years, the number of applications the NSFC receives has increased by roughly 10% a year. In 2018, the organization evaluated 225,000 grant applications, almost six times the number received by the US National Science Foundation. The NSFC is struggling to process applications and find appropriate reviewers, says Li. “The challenge is not having enough people,” he says. “AI will solve that.”

Reducing bias

Li also wants the tool to reduce bias in reviewer selection. In China, scientists try to lobby for their projects, he says. “A problem with evaluations is that people use connections. AI can’t be corrupted,” says Li.

This is also an issue in countries where applicants are asked to suggest experts who could review their proposals. For instance, the Swiss National Science Foundation has found that reviewers who were recommended by the applicants were much more likely to endorse a project than were referees chosen by the foundation.

The NSFC’s pilot AI system works only on websites written in Chinese characters, but Li wants it to be able to crawl English-language websites in the future.

“NSFC’s reform plan is ambitious, forward-looking and comprehensive,” says Manfred Horvat, a science-policy adviser at the Vienna University of Technology, who also heard Li’s talk last September.

Other countries are following China’s lead. Last month, the Research Council of Norway started using natural-language processing to cluster about 3,000 research proposals into groups and match them to the best reviewer panels, says Thomas Hansteen, an adviser to the council.

Hints of scepticism

But not everyone is convinced that AI should be used in the review process. Susan Guthrie, a science-policy specialist at research organization RAND Europe in Cambridge, UK, notes that the Canadian Institutes of Health Research ran into significant challenges with an algorithm used for reviewer selection.

The Canadian agency hired RAND Europe in 2016 to carry out a meta-analysis of studies on grant peer review. Based partly on that report, the agency concluded that the algorithm sometimes selected reviewers who had conflicts of interest or were otherwise not appropriate or qualified to evaluate the proposal. “While algorithm-based matching sounded attractive, there is a limit at this stage of artificial intelligence to what it can possibly achieve,” the independent expert panel concluded. “Reviewer selection must be primarily informed by scientific human judgement.”

Elizabeth Pier, a policy researcher at Education Analytics in Madison, Wisconsin, thinks AI will not remove selection bias. She fears that AI systems will end up replicating the biases ingrained in human judgements, rather than avoiding them. She recommends that the NSFC run a study comparing the reviewers selected by AI with those chosen by people. Li says the NSFC might consider this once the system is up and running.

Credit for reviewers

Li plans to introduce other tools to make the grant system fairer over the next five years. These include a credit system that will reward researchers for good, fair and timely reviews — although Li would not comment on the nature of the rewards.

The idea of the credit system is to encourage reviewers to take the job seriously and be professional, he says.

Statistician John Ioannidis of Stanford University in California applauds the NSFC’s efforts to use objective, data-driven tools to match proposals with reviewers. But he thinks it will be difficult to evaluate whether reviewers have made good decisions and deserve credit. It can take decades for an idea to be considered “great or a waste”, says Ioannidis.

Li is ready to take on the challenges. “This task is not easy to accomplish and will require constant improvement in a long process of study and tests,” he says.