Systematic evaluation of scientific research might strengthen public support, but could it also stifle innovation? The issues were debated at a symposium in Melbourne.
Since the United Kingdom's first Research Assessment Exercise in 1986, the concept of a national evaluation of publicly funded research has expanded to other countries, including Belgium, France, Italy, Australia and New Zealand. Some assessments are performed specifically to determine allocation of research funds, whereas others are benchmarking exercises of the performance of local research in a global context. Although the overall goals of these assessment systems are well understood, there is doubt as to how well each is working.
Their relative effectiveness was the focus of a symposium in February 2014 in Melbourne, Australia. Nature brought together experts from institutes and universities in Australia, New Zealand and Singapore to examine issues surrounding the outcomes and impact of how research is measured.
In his introduction to the symposium, Nature editor-in-chief Phil Campbell outlined several of the issues and views that were later discussed. “There is a need for research evaluators to be explicit about the methods they use to measure impact,” he said. “Openness is an essential part of earning trust. Nature welcomes a diversity of indicators.” Relying solely on citations, Campbell added, “absolutely can't be sustained”.
The United Kingdom has recently re-oriented its research-assessment programme to bring peer review, case histories and metrics into a system called the Research Excellence Framework (REF), which runs for the first time this year (see 'How research benefits the United Kingdom'). The Melbourne symposium examined this approach against various schemes in the Asia-Pacific region, including Singapore's carefully programmed development of knowledge-based industry; New Zealand's proposition that criteria for assessment be laid down even before research starts; and Australia's quantitative evaluation of its research strengths and weaknesses. Two things were clear: there are many reasons for evaluating research, and there are lots of approaches to get results. Perhaps the first hurdle to overcome is deciding what you want to achieve.
The symposium's keynote speaker was David Sweeney, the director for research, innovation and skills at the Higher Education Funding Council for England (HEFCE) in Bristol. Sweeney, who managed development of the REF, told delegates there was “no right to research funding”. He said, “If, as happened in previous budget proposals in the UK, senior scientists say to government 'Give us the money, and we will deliver the goods', the treasury has a right to say, 'Prove it!'.”
Sweeney said that scientists cannot assume that the general public understands the value of their research, so evaluation has become an essential tool for convincing the UK government, business and society to invest in universities and research. In fact, he said, the UK government wanted to enlist companies to help fund university research, unlocking some of the capital that businesses had put away during the global financial crisis to protect against hard times. The outcomes of the REF, teamed with matched-funding schemes, could help the government release previously hidden private pots of money, he argued.
Sweeney outlined the REF's methodology. “Academic excellence is still the number one objective of public funding,” he said. But conventional gauges of merit, such as peer review and citations, should not comprise the whole assessment; it's also important, he said, to reward research that has a positive impact on society. He asserted that the REF did not open the way for government to dictate research direction. Nor did it mean a bias towards funding applied research. Instead, said Sweeney, REF provided a means of validating the contribution of all research: “It's not about favouring one discipline over another.” He presented REF not as a perfect measure of impact, but as a first step. “The methodology does the job that needs to be done now, even though it's not perfect,” he explained.
Real-world issues, such as water and energy usage, are complex and interconnected, and research addressing these matters needs to draw on expertise from physical and biological sciences, as well as social sciences including economics, behavioural psychology and law. Yet, according to participants in a panel discussion on multidisciplinary research, such crucial work has rarely been valued appropriately in research assessment exercises.
The intrinsic value of multidisciplinary teams, and the difficulties of their coordination, were well illustrated by the story of the Murray-Darling Basin Plan, set up to manage water resources in Australia's largest and most agriculturally productive area. “It was a wonderful document that told us exactly what we should do,” said Robert Saint, pro vice-chancellor of research strategy at the University of Adelaide. The plan was unpopular as it proposed swingeing cuts to water allocation for many farmers. “Its release was closely followed by farmers burning it, and the whole business had to go back to the drawing board.” The problem was that the Murray-Darling Basin Authority, which compiled the report, lacked the specific capabilities for incorporating legal, political and social issues alongside the science.
Australia's largest national research body, the Commonwealth Scientific and Industrial Research Organisation (CSIRO), based in Canberra, is no stranger to multidisciplinary research, said its chief executive Megan Clark. CSIRO, she noted, specializes in large-scale, broad, “pan-disciplinary” research groups. “There is an understanding from the minute you walk in that this is not a place to work on personal research,” she said. “We work in multidisciplinary teams on mission-directed research.” As a result, CSIRO's evaluation of its own research includes traditional outputs, such as patents and journal publications, and quality assessments by independent peer review panels, but crucially also takes into account the impact of its work on end users — including the public, government departments, private companies and environmental organizations (see page S72).
CSIRO runs large-scale multidisciplinary research partnerships known as National Research Flagships (see 'Launching flagships'). These focus on issues of national and global importance such as biosecurity, preventative health, manufacturing and sustainable agriculture. In a little more than a decade, the Flagships programme has grown to encompass more than half of all CSIRO research activity.
Many stakeholders, Clark recalled, feared that the Flagships programme would lead to a decline in the quality of the organization's science. But CSIRO's experience has been the reverse, she said. “Last year, we hit a record in the quality of our science and our standing globally.” For instance, the citation rate for CSIRO research publications is now 56% higher than the global average, according to the organization's latest Science Health and Excellence report.
CSIRO's approach differs from multidisciplinary work undertaken at universities, which are the primary training grounds for researchers, said Kim Langfield-Smith, vice-provost for academic performance at Monash University in Melbourne. The academic environment tends to be organized by discipline, and promotion tracks follow the same lines. This silo structure is not conducive to researchers thinking outside their speciality.
Langfield-Smith spoke of the difficulties in recruiting university researchers for multidisciplinary projects. In particular, mid-career and older researchers found it difficult to justify interrupting their research to join projects that might not yield publications in the top journals of their own fields. What's more, multidisciplinary research is difficult to get underway: it routinely lacks common language, modes of analysis, conceptual frameworks and dedicated journals (many outcomes are instead published as government reports).
Saint observed that peer review could be disadvantageous to multidisciplinary projects at both the funding and publication stage. “I remember the early days of bioinformatics: statisticians would argue that all the theory had been done 40 years ago, and biologists couldn't see anything interesting in statistics.” The panellists suggested several ways to promote multidisciplinary work, including setting up dedicated funding streams for such research, and altering the criteria of assessment so that work published in government reports is eligible for consideration.
Hugh Durrant-Whyte, chief executive of NICTA, Australia's largest information and communications technology research organization, suggested that the solution lay in removing disincentives. Funding agencies, he said, should foster a research culture that encourages scientists to undertake projects because they are “cool and exciting, not because there is a paper at the end”. Young scientists, he said, should be urged to “find something interesting and get on with it”. This would naturally stimulate collaborations and multidisciplinary work, he added.
When options are limited
The value of research to government can be very different from its value to business, or to academia or the public. That's why it's critical to set the criteria for evaluation from the very beginning, said Peter Gluckman, chief science adviser to the prime minister of New Zealand. This approach “changes the way research is done”, said Gluckman. “It influences how scientists work and think.”
Gluckman was mainly referring to government-directed projects that account for a large portion of the science budget of small countries such as New Zealand. Perhaps the most compelling argument for this principle can be seen in Singapore, which has taken little more than a decade to generate a biomedical industry from a low starting point (see 'How to grow an industry').
David Lane is chief scientist of Singapore's Agency for Science, Technology and Research (A*STAR) which, with the country's Economic Development Board, was responsible for implementing the Biomedical Sciences Initiative to develop the industry. He said that determining impact was a major part of the government's strategy. “Our budget was increased,” he said, “but 30 to 40% was set aside and would only be released if we could show we were doing work aligned with industry.” The yardstick by which the effort was measured was the level of corporate investment in Singapore's biomedical industry.
Such a utilitarian view of science by governments, said Gluckman, differs enormously from the academic perspective, which focuses on accumulation of knowledge. Governments, he said, were typically concerned with research impact on the economy, the environment, defence and public health. Such priorities were greater in small economies that cannot so easily spare money for blue-sky research.
One of the purposes of the Excellence in Research for Australia (ERA) programme at its inception was to determine in which research fields Australia had world standing, said Margaret Sheil, provost of the University of Melbourne and a former head of the Australian Research Council (ARC). Sheil, who was heavily involved in the design and operation of the ERA (see page S67), pointed out that although Australia had a small population, it was competing globally in many disciplines.
Representing these different viewpoints in one assessment tool is not easy. Science entrepreneur and chancellor of Monash University, Alan Finkel, suggested that funding bodies needed a framework where activities such as working in industry, contributions to government reports or communication of research outcomes to audiences other than a researcher's peers could be converted into a “citation equivalent” for the purpose of improving the measurement of research impact (see page S77).
Metrics are not the answer
Assessing a country's research enterprise is not an end in itself. And when it comes to acting upon the outcomes of research assessment, funders have vastly differing viewpoints. The one issue on which they tend to agree is that any worthwhile evaluation of research, whether it be for disbursing grants or encouraging excellence, needs to be based on a range of measures, not just the quantity of publications and how often they are cited by others. In the final panel of the Melbourne symposium, representatives of four significant funding organizations discussed how best to incorporate the information gained from assessments.
Traditionally, research assessment evaluates completed projects. But, in an ever-changing research environment, a scientist's past successes might not be a predictor of how well they will perform in the future, said Tony Peacock, chief executive of the Australian Cooperative Research Centre (CRC) Association in Canberra, which runs the nation's 40 CRCs — collaborative partnerships between publicly funded researchers and industry. In fact, Peacock argued, rewarding only those strategies that were successful in the past would tend to discourage new approaches and stifle innovation, the essence of successful science. Relying solely on citation and peer review metrics was opposed for similar reasons by Warwick Anderson, chief executive of Australia's National Health and Medical Research Council (NHMRC) in Canberra, which dispenses more than AUS$750 million (US$700 million) of government money in research grants each year (see page S52).
“It's not only the research that's important, but also how it is used,” Anderson said. Health researchers typically wish to influence decision-makers and medical practitioners as well as other scientists, which means they need to publish outside the academic literature. To evaluate their work properly, he said, assessors need to consult sources other than scientific journals, such as government reports and health-care experts. Government has a huge interest in health care because of its enormous cost. Australia's AUS$140 billion health-care industry, which includes vaccine manufacturers and medical device developers among others, is also the nation's second largest exporter of manufactured goods, Anderson said.
Australia's other major research funding body is the ARC, responsible for disbursing more than AUS$900 million a year. It also administers the ERA, which aims to determine areas of Australia's research strengths. ERA assessments are made by internationally recognized researchers, organized by discipline and clustered into eight Research Evaluation Committees. They use traditional measures of quality, such as citation analysis or peer review, but also incorporate a broader view, considering income from commercialization and measures of esteem, for example election to a learned society, such as becoming a fellow of the Australian Academy of Science.
ERA ranks research quality against a global scale and is “a rigorous and robust measure across all discipline domains”, ARC's chief executive Aidan Byrne told the symposium. It aims, he said, to get researchers to change their focus from quantity of work to quality. “In that, the ERA exercise has been spectacularly successful. And it did it without tying the exercise to financial rewards.”
Furthermore, despite its reliance on metrics, ERA results for academic excellence correlate with other real-world outputs, Byrne said. For instance, 95% of industry investment in research in Australia is in the same areas in which researchers performed at world-class level or better. The same is true for 98% of the research that was commercialized and for 97% of the work that was patented.
HEFCE's Sweeney took a straightforward view of the various methods of assessment. No system will be perfect, he said, but you have to start somewhere: “You can propose alternatives, and spend five years discussing them, but that's not going to solve today's problems.”
About this article
Thwaites, T. Research metrics: Calling science to account. Nature 511, S57–S60 (2014). https://doi.org/10.1038/511S57a