Countries the world over are looking to their research base to drive progress in an increasingly competitive environment. But how should research be guided or rewarded to achieve a country's goals — be they academic excellence, exemplified by breakthrough papers and awards such as the Nobel prize, or economic prosperity and social wellbeing?

A panel comprising (L–R) Philip Campbell, Shen Wenqing, David Sweeney, Daniel Hook, Maki Kawai and Li Xiaoxuan discusses China's research evaluation. Credit: Shanghai Association for Science & Technology


“Assessment and evaluation of research is a timely topic globally,” said Charlotte Liu, regional managing director for science (Greater China) and education (Asia) at Macmillan Science and Education, the parent company of Nature Publishing Group, when she opened the 2014 International Symposium on Research Assessment and Evaluation in Shanghai, China. “Crucial to this aim is to establish a comprehensive, rational and systematic evaluation framework to guide research investment and resource evaluation.” At the symposium, held in October at the Shanghai Association for Science and Technology (SAST), representatives from academia, industry and government convened to talk about their experiences and to discuss courses of action.

Growing and leading

In his introduction, Yang Jianrong, vice-chair of SAST, outlined important elements that any research evaluation plan must address. The first, he said, was a “focus on the quality of research benefits, including the need for objective criteria”. He stressed that, in a world plagued by resource shortages and environmental degradation, it is important to target innovations that are “beneficial to the planet's long-term stability and sustainable green development”.

The pages of Nature reflect the substantial growth of Chinese science in recent years, said the journal's editor-in-chief, Philip Campbell. The ultimate aims of research assessment, he said, were “to incentivize good practice and critical and ambitious thinking”. Research has an effect beyond the institution or even the community in which it takes place, with the potential to influence many lives through new technology, new governmental policies and better health care. Therefore, said Campbell, it is crucial to “capture the impact of research in the fullest sense — from both academia and the wider research community”.

Any evaluation of research needs to consider the interlocking chain of basic research, applied research, technology development and commercialization, said Zhang Xu, vice-president of the Shanghai branch of the Chinese Academy of Sciences (CAS). “A good system is not only conducive to the development of science and technology,” Zhang said, “but it will help our scientists grow, our education to improve, and help to create benefits for all of mankind and particularly for our environment.”

Promoting the underachievers

Kurt Wüthrich, a biophysicist who divides his time between the Scripps Research Institute in San Diego, California, the Swiss Federal Institute of Technology in Zürich, Switzerland, and the iHuman Institute at ShanghaiTech University, talked about the way in which scientists are evaluated. Wüthrich is best known for sharing the 2002 Nobel Prize in Chemistry for developing nuclear magnetic resonance spectroscopy to determine the structure of biological macromolecules. During his undergraduate years at the University of Bern, however, he was more successful in his athletic endeavours. He reflected on his sporting speciality in the conference's plenary address: “For the high jump, one needs to find children with talent and simply measure their achievement. Each attempt gives a clear and final result: good or bad.” But such clear-cut decision-making is not available to science, said Wüthrich. “The result of research, whether it is impact on quality of life or economics, may not appear for years or even decades.”

The appointment and promotion of scientists, Wüthrich continued, is often under-emphasized in discussions about research assessment. “If we do not select these scientists well, then we will not get good value out.” Wüthrich contends that there are many “talented underachievers” who might need encouragement to make better use of their skills, and who are usually overlooked in favour of people whose main ability lies in maintaining the status quo. Scientists who overachieve by conventional measures will not be the ones who produce the breakthroughs, he argued (see page S13).

His message was that, rather than examining researchers' past output, evaluators should place more emphasis on finding, supporting and retaining exceptional people. “There are very few who push things forward,” he said.

Impact in the United Kingdom

David Sweeney provided the counterpoint to Wüthrich's experience, speaking from the perspective of an assessing organization. Sweeney is a director of the Higher Education Funding Council for England (HEFCE), which evaluates research across the United Kingdom. The first task, said Sweeney, is to answer the question: “Why are you assessing?”

Kurt Wüthrich training in Switzerland in 1956. Credit: Courtesy of Kurt Wüthrich

For HEFCE, the answer to that question is clear: the evaluations guide the allocation of funding. Most research in the United Kingdom is undertaken at universities, which then choose what projects and researchers to support. “We expect universities to take wise decisions,” he explained.

But there is also another crucial question, Sweeney said: “What does research success look like — that is, what are you trying to achieve?” And here, he said, the answer is very much in flux. The United Kingdom's economy, like much of Europe's, is struggling. “The government sees our universities and their research as one of the most successful systems in the world and a key part of the approach to returning the economy to balanced growth.” The United Kingdom has, he said, “intellectual leadership in the development of new knowledge”. But whether this knowledge has a positive impact on society has not been clear. “We assume it has, but do we have the evidence?”

For more than two decades, the United Kingdom ran the Research Assessment Exercise (RAE) to evaluate the quality of university research. According to Sweeney, the RAE had a hugely positive effect on the number of publications and citations, and on the quality of this output. “We think we are spending our money wisely,” said Sweeney. “We want to spend it even better.”

And better, in this context, means more 'impact' — a much-used term at this symposium and in discussions of science evaluation in general (see page S21). Sweeney explained: “Research impact is the demonstrable contribution that research makes to the economy, society, culture, national security, health, public policy or services, quality of life, and to the environment”.

For the 2014 assessment, the United Kingdom transformed the RAE into the Research Excellence Framework (REF) and awarded 20% of the evaluation score for an institution on the basis of case studies, which describe the wider (non-academic) impact of research. However, modifying the assessment system is not intended to change the fundamental focus of UK research. “We don't want to discourage curiosity-driven research, but instead to prove that the best impacts come from this type of research,” he said.

The academy for change

China, unlike the United Kingdom, has no national system for research evaluation. However, the assessment process operating within CAS — one of the largest research organizations in the world — can be seen as a microcosm of the country, said Li Xiaoxuan, director of CAS's Institute of Policy and Management in Beijing.

China is a newcomer to the international research arena. “It was only 30 years ago that we started to reform,” said Li. “We were closed, but we started to open up to a market economy.” At the same time, efforts began to modernize industry and science, and the National Natural Science Foundation of China (NSFC) was established in 1986 as the main body for competitive project funding. “But we didn't appreciate how we should guide research through evaluation and assessment,” he added.

In 1990, the government began evaluating its national key laboratories, and shortly thereafter CAS began evaluating its own research institutes. The purposes behind the evaluations, said Li, were: to help select the best people through competition; to raise China's science and technology output, in line with international levels; to promote efficient resource allocation; and to bring scientific decision-making into the management of research and development (R&D).

But there were problems with the tools developed for these tasks. “The biggest issue was that there was too much focus on quantity,” Li admitted. Scientists would focus on increasing the number of projects they conducted and how many articles they wrote. “It led to very short-term behaviour,” said Li, with the knock-on effects of research misconduct and wasted funding.

To provide more helpful incentives, CAS is now moving towards qualitative evaluation (see page S18). Since 2011, CAS has been using the One-Three-Five programme. The name stands for one orientation, three breakthroughs and five major directions. “It is focused on outcomes, not papers,” said Li.

By focusing its research institutes on the One-Three-Five plan, CAS is hoping to avoid duplication and create areas of specialization, to help its institutes achieve major breakthroughs and to maintain the academy's fast pace of development. Results so far from the 19 institutes evaluated show that around one quarter can be considered world-leading, the highest rating, said Li. One of the core features of the plan is that it uses third-party assessment, including evaluation from international experts. Feedback from such experts should quickly identify any problems in management and provide constructive suggestions. And, as with the United Kingdom's research evaluation scheme, the plan “enables us to keep a balance between basic science and applied research”, Li said.

Wang Minmin, head of chemistry at Eli Lilly's Shanghai-based China research and development centre. Credit: Shanghai Association for Science & Technology

The afternoon of the symposium comprised panel discussions on the broader implications of research evaluation. One of the recurrent themes was how best to support and evaluate young scientists.


“Evaluating young people is different from evaluating older, established scientists,” said Li Mengfeng, vice-president of Sun Yat-sen University (SYSU) in Guangdong. “We need to look for potential rather than looking at past achievement.”

Shen Wenqing, an academician at CAS, observed that the pervasive conservatism of Chinese society tended to “kill advanced ideas” — ones that have a low probability of success but potentially high impact. Echoing Wüthrich's earlier talk, Shen urged a greater focus on encouraging young scientists to pursue original ideas.

These aims are also of concern to the top institutes in Japan, said Maki Kawai, executive director of RIKEN, Japan's largest dedicated research institution. Young scientists are overburdened by Japan's onerous annual evaluation system, she said. Moreover, the current system does not take diversity into account. “We should not choose just one sort of person,” she said. “It is important to have flexibility in the system.”

CAS's Zhang raised the issue of how best to evaluate genuine breakthrough ideas. He called on the research community to help develop a new way to determine a paper's quality, one that is not entirely dependent on the journal in which it is published, or on how many times it has been cited. Such a broadening of criteria would be particularly helpful to researchers at the start of their career. “At the very beginning, it is hard to publish in high-impact journals,” he said.

Anthony Cheetham, vice-president of the United Kingdom's Royal Society, took up Zhang's point. He noted that “many papers that have recently won Nobel prizes were actually published in second-tier journals”. Publishing only in such journals should not be an impediment to promotion, he said (see page S34).

Participants started to reach a consensus on this point. Wang Xiao-Jing, associate vice-chancellor for research at New York University Shanghai, added that “these high-impact journals can inform assessment but should not be all of it”. When interviewing someone for a position, Wang Xiao-Jing advocates “spending a day reading all their work”.

Providing a view from industry was Cory Williams, head of clinical-trial management for Pfizer's Shanghai R&D centre, part of the New York-based pharmaceutical firm's global network. He spoke about the three leading indicators that he looks for in young researchers. The first two concern uniqueness of research and productivity, but the third is perhaps the most important: capacity for collaborative research. “Through collaboration you can reach new skill sets,” he said. “These interactions lead to more innovation.”

Williams's advice is to “reverse engineer what a distinctive researcher looks like at various points in their career”; then it will be possible to measure and mentor people towards that template from the beginning.

Societal impact

Another emergent theme was the challenge of measuring and understanding the societal impact of research's diverse products. Campbell gave examples illustrating that high-impact research is not always published in high-impact journals. Work by “multidisciplinary teams of natural and social scientists creating solutions for water-stressed cities” is hugely important, he said, despite rarely appearing in the leading journals.

Assessment of science is of critical importance to research-funding agencies. According to Chu Junhao, a physicist at CAS's Shanghai Institute of Technical Physics, four factors can help funders to determine the societal value of projects: whether the research led to new knowledge, to industrial output, to enhanced technical proficiency or to the development of new expertise.

For some areas of research, the main impact will be in terms of government policy. Lu Yonglong, an environmental scientist at CAS's Research Center for Eco-Environmental Sciences in Beijing, spoke about his speciality. “For us, the point is not to just have papers in top journals, but to ensure that the public understand what the issues are.”

A key element in public understanding of science is accessibility of research papers. “We are more likely to have solutions if everyone has access to the relevant information,” said Carrie Calder, strategy director for open research at Nature Publishing Group. The value of open access is particularly important when it comes to solving global challenges such as food security, pollution and climate change, which require cross-disciplinary collaboration.

Seeding new companies

The final headline topic grappled with how to move from research to the creation of new companies. Guo Chongqing, a mechanical engineer at Tongji University in Shanghai, provided a perspective from his four decades at the forefront of engineering design. Guo said that China has come a long way in the past few decades, but that R&D still has its problems. In particular, he said, “there is too much meddling and interference from government”.

But there are good signs in certain sectors. “There is a lot of enthusiasm and drive in Chinese universities to contribute to drug discovery,” said Wang Minmin, head of chemistry at the Shanghai R&D centre of Eli Lilly, a US pharmaceutical company based in Indianapolis. But key to this field is the ability to learn from failure. “We should emphasize publication and sharing of these stories,” Wang Minmin said. At Eli Lilly, she explained, they have “started to celebrate any outcome, positive or negative, to allow people to openly say what they've learned”. Such a policy helps to share information and prevent duplication of efforts. Failure, in this context, is also success.

The symposium closed with some reflections on the day's discussions from Chen Kaixian, chair of the host organization, SAST. The meeting, which brought together participants from different scientific evaluation systems, “should help promote understanding of each other, through the blending and collision of ideas”, said Chen. As delegates prepared to return to their companies and laboratories, there was a strong sense that they would take these lessons home, to influence how science is assessed around the world.