Assessing (for) impact: future assessment of the societal impact of research

National research assessments play a role in providing accountability—to funders, government and civil society—for the activities of largely autonomous research systems. Increasingly, an element of such assessments is the assessment of societal impact. In this article, I review the development of impact assessment, with a focus on the UK’s Research Excellence Framework, and consider implications and challenges for the future. Notwithstanding these challenges, I further argue that the assessment of societal impact needs to become a more central aspect of research evaluation. This article is published as part of a collection on the future of research assessment.


Introduction
Across the world, many governments invest significant amounts of public money in research (OECD, 2015). At least part of the justification for these investments is the diverse array of benefits that an effective research base brings to society. The focus on societal benefit as an outcome of research investment leads to increasing political pressure both to demonstrate impact from research and to hold researchers and their institutions accountable for delivering benefits. As a result, a key question for research policy, at both a theoretical and practical level, is how considerations of societal impact can be included in both ex ante and ex post research evaluation. The aim of this article is to review the current state of play with regard to ex post evaluation, and to consider the development and future trajectory of research impact assessment.
My perspective on these developments is that of a policymaker who has spent almost a decade immersed in the practicalities of research impact. Over this period I have been involved in work to introduce considerations of research impact into ex ante evaluation, 1 and, most recently, my role has been focusing on ex post evaluation in the UK's national research assessment, the Research Excellence Framework (REF). I was responsible for the evaluation of the last REF, and, at the time of writing, I am deeply involved in developing proposals for the next national exercise.
This article represents a synthesis of evidence and debate on research impact from this perspective, leading to some thoughts on potential future trajectories. As such, it cannot be viewed as a disinterested or impartial take on the issues, and should be read with that context in mind. Equally, given how deeply issues of research impact are felt within the academy, it is important to recognize that the academic discourse on research impact may not always be completely impartial.

History and context
While the idea that research has practical implications has a long history, it began to be codified in the twentieth century, with a key moment being the publication of Bush's (1945) "Science, the endless frontier". Written shortly after the conclusion of the Second World War, Bush's report to the US Presidency can be thought of as a blueprint for research policy in the second half of the twentieth century in the United States and beyond. 2 Bush wrote: "As long as [universities] are vigorous and healthy and their scientists are free to pursue the truth wherever it may lead, there will be a flow of new scientific knowledge to those who can apply it to practical problems in Government, in industry, or elsewhere."
While, to modern readers, there is much to debate in Bush's view, which reflects both a separation between academic research and the society it serves and an essentially linear process of influence, there is no doubt that delivering benefit to society, the solving of "practical problems", was central to his case for investment in research.
Moving forward some decades, this same idea was central to reforms to the research funding system by the UK Government in the "Realising Our Potential" white paper, 3 published in 1993. The Government called for a renewed focus on the delivery of societal benefits from research, and proposed major reconfigurations of organizations and funding streams to bring about this objective. Following these reforms, continued focus led to the development of new ways of talking about the benefits from research: the "impact agenda" was born, first with an explicit link to economic development, and then as a broader concept of impact on "society and the economy". The important and influential report by Peter Warry, published in 2006, was central both in establishing the broad definition of impact and in setting a clear strategic direction for policy interventions aimed at maximizing impact (Warry, 2006).
More recently, further extensive changes to the research system architecture have been proposed by the UK Government. 4 The delivery of societal benefit from research continues to be central, with the creation of a new agency, UK Research and Innovation, that brings together research funding and innovation support. The stated aim is to: "… ensure that our research and innovation system is sufficiently strategic and agile to deliver national capability for the future that drives discovery and growth."
At the same time as the policy narrative emerged, there were also parallel developments in the understanding of the relationship between research and society. For Vannevar Bush, the route to impact was to leave scientists (researchers) alone, "free to pursue the truth", focussed on self-defined, presumably fundamental or "basic" research questions. It was for others to "apply it to practical problems". From the closing years of the twentieth century onwards that view has been increasingly challenged.
The work of Stokes (1997) critically evaluated the simplistic notion of "basic" and "applied" research. He argued that the motivation of research (whether it is focussed on expanding knowledge or solving problems) is to an extent independent of the nature of the research itself. Stokes emphasized that research with a specific aim of solving issues in the real world often depended on advancing fundamental understanding. Problem solving could itself generate new insight, rather than impact always flowing from "disinterested" research.
Further ideas were developed by Gibbons, Scott, Nowotny and colleagues (Gibbons et al., 1994; Nowotny et al., 2001, 2003), who articulated the notions of Mode 1 and Mode 2 research. In their words (Nowotny et al., 2003): "The old paradigm of scientific discovery ('Mode 1') – characterized by the hegemony of theoretical or, at any rate, experimental science; by an internally-driven taxonomy of disciplines; and by the autonomy of scientists and their host institutions, the universities – was being superseded by a new paradigm of knowledge production ('Mode 2'), which was socially distributed, application-oriented, trans-disciplinary, and subject to multiple accountabilities."
The key distinctions between these modes are their different foci (internally to the research community or externally to the needs of society) and their different accountabilities (only to researchers' peers or to a much broader set of stakeholders). Mode 2, in keeping with the policy trends outlined above, is about both focussing research on the needs of society and opening the process of research to forms of expertise beyond academia.
The two modes of research identified by Gibbons et al. are not necessarily mutually exclusive. It is possible for researchers to be operating in ways that satisfy both the requirements and norms of their academic peers, while at the same time seeking to solve societally relevant problems with partners from other sectors (Cruickshank, 2013). From the perspective of policymakers, operating in the context outlined above, there is a clear preference for a mode 2 orientation, with a related desire to design the policy frameworks and incentives to favour (or, at least, not discourage) that orientation.
ARTICLE PALGRAVE COMMUNICATIONS | DOI: 10.1057DOI: 10. /palcomms.2016
Despite the evolving policy objectives for research, and the increasingly sophisticated articulations in the Science and Technology Studies literature, the evaluation of research and researchers has remained focussed on notions of research excellence defined and implemented by the academic community itself. While it is debatable whether a focus on meeting standards defined within the academy is itself damaging to achieving broader impact, in a time- and resource-limited system, it can be argued that an excessive focus on "Mode 1" criteria in evaluation could be detrimental to the achievement of "Mode 2" objectives. This is especially pertinent in the consideration of national research assessment systems. These frameworks provide powerful incentives, with the potential to define the criteria of success for academic research. As well as setting delivery of societal impact as a goal, inclusion in national systems is essential in situating delivery of impact as a legitimate activity to which to commit time and resources. Not only is impact assessed, but space is also created for generation of impact: assessment "for impact" as well as "of impact".
As a result, the attention of policymakers has become increasingly focussed on developing approaches that assess research from the perspective of societal benefit or impact. There are a number of well-established challenges in assessing the impact from research, principally the interrelated concerns of the sometimes long time lags between research and impact, the problems of attributing impact to research and the difficulties in providing evidence of the links between research and impact (Bornmann, 2013; Penfield et al., 2014; Greenhalgh et al., 2016). These issues are further complicated by the complex, and potentially non-linear and non-sequential, relationship between research and impact.
In the face of these complexities, several methods have emerged. At the level of research systems, econometric approaches can quantify the relationship between investment in research and economic benefits (Bornmann, 2013), although these approaches are less useful for ex post evaluation at the level of institutions, programmes, projects or individuals. At these levels, the recommended approaches focus on capturing information about knowledge exchange interactions (Molas-Gallart and Tang, 2011; Spaapen and van Drooge, 2011) or on methods based on qualitative analysis and case studies (Donovan, 2011; Donovan and Hanney, 2011; Joly et al., 2015). In some national systems, notably Excellence in Research for Australia, attempts have also been made to assess broader impact on the basis of a small set of proxy indicators. While all these approaches have strengths and weaknesses (Bornmann, 2013), there is an emerging consensus that case studies are the most effective approach in assessing broader impact.
An important consideration in comparing approaches to impact assessment is the transaction cost. For case-study-based methods this can be high. For example, the cost of national impact assessment in the United Kingdom (described in detail below) was £55 million, although this is a relatively low transaction cost compared to the funding allocated on the basis of the assessment (Manville et al., 2015a). Costs could be reduced through the systematic capture of evidence of impact, with a number of commercial systems being available for this purpose. Examples include ResearchFish (see http://www.researchfish.com/) and the VV-Impact Tracker (see http://www.vertigoventures.com/#!vv-impact-tracker/c1mi3).
Internationally there is considerable interest in implementing research impact assessment at a national scale. For example, in the Netherlands the latest iteration of the Standard Evaluation Protocol includes assessment criteria related to "relevance to society". 5 In Sweden, a proposal for a national system, FOKUS, that includes assessment of impact is currently under discussion, 6 and the Australian Research Council is consulting on proposals for "Engagement and Impact Assessment" at the time of writing. 7 There are also efforts under way to achieve cross-border consensus on broader impact assessment, such as the Small Advanced Economies Initiative 8 and the Science Europe working group on Programme Evaluation. 9 Perhaps the most ambitious and comprehensive attempt to assess the societal impact of research at a national scale to date is the UK Research Excellence Framework 2014 (REF), which will be discussed in more detail in the following section.
Although the assessment of research impact is at a relatively early stage, it is timely to consider its future direction. This article will consider the REF as a case study of research impact assessment, and then expand on key themes for the future of impact assessment.
Assessing societal impact at a national scale: the experience of the Research Excellence Framework 2014
Completed at the end of 2014, the REF builds on three decades of experience of national research assessment in the United Kingdom (Bence and Oppenheim, 2005). Prior to the REF, the various Research Assessment Exercises had followed the conventional pattern for ex post evaluation. The key criteria related to the assessment of research outputs judged by academic peers: a focus on "Mode 1" criteria. For the REF, this changed with a new element to measure the impact of research, which was evaluated alongside more conventional elements focussing on research outputs and environment.
Following a review of international approaches to research impact assessment (Grant et al., 2010), and an extensive pilot exercise, 10 a methodology based around case studies of broader impact was adopted. 11 Societal impact was broadly defined: "… as an effect on, change or benefit to the economy, society, culture, public policy or services, health, the environment or quality of life, beyond academia."
Submissions were organized in discipline-based groups, or "units of assessment". For the assessment of impact each submission contained two elements. First, each unit completed a narrative element (the "impact template") setting out the approach to, and strategy for, delivering impact from research. Second, each unit was required to submit a number of case studies illustrating examples of impact that had occurred during the assessment period (2008-2013) related to their research. The number of case studies required was related to the size of the unit, with around one case study being needed for every 10 researchers, although a higher ratio was required for smaller units. A relatively simple structure was prescribed for case studies, with the following sections: summary of the impact; underpinning research; references to the research; details of the impact; and sources to corroborate the impact.
A central feature of the impact case studies was the linkage of impact to specific and identified "underpinning research", which was required to be evidenced by reference to research outputs of a minimum quality. 12 Only research outputs published since 1993 were eligible as underpinning research for case studies.
The impact element of the REF was assessed by expert, disciplinary panels during the course of 2014, alongside the other elements of the assessment. A key feature of the assessment was the inclusion of panel members and additional assessors drawn from the community of research users external to the academy. The research users were drawn from diverse backgrounds including business, the public and third sectors, journalism, and civil society.
The outcomes of the assessment, aggregated to the level of submitting disciplinary units, were published at the end of 2014, 13 and a searchable database of the impact case studies was made available during the course of 2015. 14 The process of the assessment has been subject to considerable evaluation and analysis (Manville et al., 2015a, b; Samuel and Derrick, 2015; Derrick and Samuel, 2016). The impact case studies themselves have also proved to be a valuable source of data on research impact, notwithstanding some limitations related to the nature of the assessment process (King's College London and Digital Science, 2015).
Although the introduction of impact assessment was not without controversy (Watermeyer, 2014), the key conclusion from the evaluation of impact assessment in the REF is that it was, by and large, successful (Manville et al., 2015b). Many of the inherent challenges of societal impact were identified, and remain key issues, but the use of case studies, combined with expert assessment (including experts from research user communities), allowed robust judgements about societal impact to be made. In advance of the assessment, panel members were concerned about the process (Samuel and Derrick, 2015), but following the assessment confidence improved (Manville et al., 2015b; Derrick, personal communication). There is also evidence that the assessment process has resulted in changing attitudes to societal impact within universities in the United Kingdom (Manville et al., 2015a).
The future of societal impact assessment
The experience of the REF suggests there is practical potential to broaden the assessment framework for research to include societal impact. This conclusion was recently reaffirmed following a major independent review of the REF commissioned by the UK Government (Stern, 2016). The review concluded that: "Impact is clearly one of the success stories of REF2014, providing a rich picture of the variety and quality of the contribution that UK research has made across our society and economy." Focusing on refinements to the approach, the review made three recommendations for the assessment of impact (Stern, 2016). First, some impact case studies should be submitted at the institutional level in order to foster and showcase further multi- and inter-disciplinary work that leads to impact. Second, the review recommends that impacts that build on a body of work, rather than specific research outputs, should be eligible for assessment. The third recommendation calls for an additional broadening of the definition of impact used in the future, and enhanced guidance to ensure clarity on the full range of eligible impacts.
In the near term, societal impact assessment in the United Kingdom will build on the success of the REF, with a focus on addressing the issues raised in the evaluations of the process (Manville et al., 2015a, b) and responding to the recommendations of the Stern Review. It is also likely that the processes of collecting evidence of impact will be streamlined through systematic collection of data in digital impact collection tools.
Looking further into the future, there remain challenges in the further extension and embedding of societal impact assessment.
In addition to the well-described issues of time-lag, attribution and evidence, there are three additional features of societal impact from research that will need consideration in the future: the incorporation and reflection of models of engagement and co-produced research that are becoming central to the delivery of research impact; the evolving relationship between research disciplines and the delivery of impact; and the nature and direction of goals for societal impact from research.
These areas are expanded on below.

Co-production and impact
Alongside the policy narrative relating to delivering impact from research is a related and complementary discourse about the co-production of knowledge and public value derived from that knowledge. In a UK context, this is exemplified by the Connected Communities programme coordinated by the Arts and Humanities Research Council. 15 A central notion in this programme is the idea of co-produced research; the programme aims: "… to achieve: new insights into community and new ways of researching community that put arts and humanities at the heart of research and connect academic and community expertise." [Emphasis added]
Facer and Enright (2016), reporting on a 2-year study of the programme, conclude that the focus on combining academic expertise with expertise situated outside of the academy is neither new nor related to a single research tradition. Rather, the term "co-production" has become a catch-all for a range of participatory and impact-focused research methodologies. At their core, these approaches recognize that societal impact from research often occurs when researchers and stakeholders work together throughout the research process, framing questions and solving problems together. These various approaches to knowledge generation bring co-production and engagement to the fore, with an increased focus on creating the conditions for this type of interaction (see, for example, Price and Delbridge, 2015).
An extensive range of beneficiaries are referenced within the REF impact case studies, many of whom will have been engaged to an extent with the research related to the described impact (King's College London and Digital Science, 2015). However, there is some evidence that research impact derived from co-production was under-represented in the case studies submitted for assessment in at least one REF sub-panel (Greenhalgh and Fahy, 2015).
In principle, the impact assessment approaches discussed above are equally applicable to co-production or other models of research impact. However, co-production models of research impact challenge the notion of prior "underpinning" research outputs in relation to impact (Baim-Lance and Vindrola-Padros, 2015). With co-production, it is possible, indeed likely, that impact and research outputs are delivered in parallel. In these circumstances, it is even feasible for research outputs aimed at a scholarly audience to lag behind impacts, rather than the other way round.
The relationship between disciplines and impact
One of the striking findings from the analysis of the impact case studies submitted to REF is the disciplinary diversity in the associated research (King's College London and Digital Science, 2015). In nearly 90% of the case studies there was evidence of some disciplinary diversity, and in two-thirds of case studies the disciplines involved were reasonably different from one another. 16 The analysis does not reveal whether distinct disciplinary knowledge was integrated during the research, or during the generation of societal impact, or both. The result could depend on associated research that was multi- or interdisciplinary, or could reflect the combination of disciplines during, for example, the processes of knowledge exchange. Nonetheless, the result provides important confirmation of the hypothesis that "mode 2" research, which is related to societal challenges or problems, is likely to draw on knowledge from different disciplinary domains. 17
The disciplinary diversity that is associated with societal impact has important implications for the delivery and assessment of impact. Three important issues emerge. First, the delivery of impact is strongly linked to the ability of researchers to work across disciplinary boundaries. Second, the assessment of societal impact needs to be carried out in a way that respects the essentially interdisciplinary nature of impact. Third, the assessment of academic research outputs needs to be able to evaluate outputs that cross or merge disciplinary boundaries, so that there are no barriers created to interdisciplinary work.
In the context of powerful disciplinary structures (both cultural and organizational) within the academy, there are well-established challenges associated with working across disciplinary boundaries. Siedlok and Hibbert (2014) group these challenges into two categories: institutional factors and differences in disciplinary traditions. Processes of research assessment and evaluation are themselves a potential component of the institutional factors that might limit interdisciplinary working, but the issues are broader (see, for example, Callard and Fitzgerald (2015) for an extended discussion of the complex environment for interdisciplinary research). In particular, disciplinary norms and expectations, which themselves influence the processes of assessment, may represent considerable barriers. In this context, inclusion of societal impact evaluation within research assessment frameworks may act as a positive pressure towards working across disciplinary boundaries.
However, research assessment, like the REF, is often structured along disciplinary lines, raising issues for how essentially interdisciplinary societal impact is situated. The experience of REF suggests that universities were comfortable submitting impact case studies drawing on different disciplines to disciplinary-focussed panels (King's College London and Digital Science, 2015), but it is important to be alive to the possibility that cases crossing a broad range of disciplines were not submitted because of concerns that they would not meet disciplinary norms. This, in turn, could reduce incentives for broad cross-disciplinary working.
A potential alternative to a strict disciplinary framing for societal impact would be to change the level of research impact assessment to broader disciplinary areas (equivalent to the main panels in the REF) or even to the institutional level, as recently recommended in the context of the UK REF (Stern, 2016). The alternatives themselves bring risks around disciplinary spread and involvement in societal impact. In the absence of a disciplinary framing for the assessment of impact, some alternative would be required, such as the socio-economic objectives of the Australia and New Zealand Research Classification (see http://www.abs.gov.au/Ausstats/abs@.nsf/Latestproducts/CF7ADB06FA2DFD69CA2574180004CB82?opendocument, accessed 19 May 2016). This framework was used in an impact assessment trial, carried out by 12 universities in Australia. 18 There is a risk, however, that any taxonomy excludes certain types of impact, so acting to narrow definitions and constrain activity.
Given the disciplinary structures of research assessment processes, the assessment of research outputs against academic or "mode 1" criteria also poses a challenge, which could have implications for the delivery of interdisciplinary societal impact. In the REF, for example, there is a well-established perception that research outputs that cross disciplinary boundaries may not be assessed appropriately, despite some analysis that suggested this is not the case (available at http://www.ref.ac.uk/results/analysis/outputprofilesanddiversity/, accessed 19 May 2016). There is also some evidence that a bias against interdisciplinary research outputs occurred during the selection of outputs for submission to the REF (Elsevier, 2015a, b). There is clearly a need to continue to improve the perception and practice of evaluation of interdisciplinary research outputs (Strang and McLeish, 2015).

Socially-desirable impact goals
While there are advantages to using broad and inclusive definitions of impact that focus on "influence" and "change" for assessment purposes, such definitions raise questions about the nature, direction and desirability of impacts. This is perhaps summed up in a neologism, "grimpacts", coined in the context of a workshop on societal impact organized as part of a National Science Foundation funded research project (see https://philosophyimpact.org/philosophy-of-impact/workshop-2016/, accessed 19 May 2016). A pertinent question is whether the nature of the impact should be a factor in its assessment.
It is important to distinguish two issues in the context of the direction of societal impact: the problem of poorly conducted, inaccurate or fraudulent research, the impact from which is inevitably problematic; and research that leads to value-laden choices in terms of the delivery of impact.
Both of these issues were raised by panel members in interviews carried out in advance of the assessment process (Samuel and Derrick, 2015; Derrick and Samuel, 2016). In the case of poorly conducted research, participants in the study of Derrick and Samuel (2016), who were drawn from the medical and life science disciplines, highlighted the example of the erroneous link between autism and the Measles-Mumps-Rubella vaccine.
A more complex question relates to the situations where there are choices to be made about the direction or nature of impact. This question was also raised by panel members in advance of the assessment (Samuel and Derrick, 2015), with evidence that panel members were aware of the importance of value judgements in the assessment of impact. There were also concerns about the potential variation in these judgements over time (Samuel and Derrick, 2015).
The inclusion of value judgements about the nature of impact raises important questions for evaluation, especially the breadth of stakeholder involvement within the assessment needed to make such judgements of "public value" (Bozeman and Sarewitz, 2011). There is limited empirical evidence on the variation in views about impact from research. Two studies, focusing on medical research, have examined the preferences of researchers and the public with respect to research impact (Miller et al., 2013; Pollitt et al., 2016). These studies suggest differing levels of alignment between researcher and public views, emphasizing the importance of the question for research impact evaluation. In both studies, however, researchers and the public alike showed a preference for health-related over economic or commercial outcomes (Miller et al., 2013; Pollitt et al., 2016).
In practical terms it may not be possible to broaden involvement in assessment of societal impact with the aim of enhancing the legitimacy of judgements about impact. An alternative approach is to include, within assessment, factors related to the processes whereby choices about impact are made.
For example, the inclusion of publics and stakeholder groups throughout the process of impact delivery and research, potentially using co-production approaches, could itself be used as evidence that impacts are socially desirable. In this case, public and stakeholder engagement is not (just) a potential pathway to impact, but is an indicator that decision-making has been framed within the context of societally desirable goals. Adoption of the principles of responsible research and innovation (Owen et al., 2012) should become increasingly integrated into the delivery and assessment of impact.
Conclusion: making impact assessment mainstream
Although there remain challenges and areas of debate, the potential for assessing research in relation to its societal impact is great. Two decades after Gibbons et al. (1994) coined the notion of "mode 2" research we have the theoretical insight and the practical tools to align research assessment and evaluation to "mode 2" criteria. However, this new assessment is seen as additional to, even subordinate to, a continued focus on more long-standing criteria. As has been observed by Cruickshank (2013): "It is noticeable that even in the 'bold experiment' of the REF academics providing an impact case study are still measured on their [Mode 1] research evidenced through publications as well as having to describe their [Mode 2] activity."
Is there a case for making assessment of societal impact "mainstream", even to the extent of judging research (and researchers) on the basis of "mode 2" criteria alone?
Answering this question depends on understanding the relationship between "mode 1" and "mode 2" research, and on questioning the behavioural implications of focussing assessment processes.
Evidence from the REF suggests that units that perform well against "mode 1" criteria (as judged by the assessment of research outputs, against the criteria of originality, rigour and (academic) significance) also perform well against "mode 2" criteria (as judged by the assessment of case studies of societal impact). However, this observation only confirms that research of both modes occurs in the same locations, rather than providing information on the relationship between them. It is also noteworthy that even the assessment against "mode 2" criteria in the last REF was conditioned by "mode 1" criteria, given the quality threshold that applied to "underpinning research" for impact case studies.
In principle, it is possible to view different relationships between the two modes: a linear relationship where "mode 1" is a precursor to "mode 2" or vice versa (the latter being prevalent in co-production models); or a more nested or networked relationship between the two.
In reality, there is likely to be a mixture of these relationships, with variation driven by context, circumstances or disciplinary norms. The justification for public funding of research is related to "mode 2" outcomes. So a question for assessment policy is whether removing "mode 1" criteria will adversely affect these outcomes. Given the history and importance of "mode 1" criteria within the culture of research, it seems likely that there is considerable scope for increasing the emphasis on assessment against "mode 2" criteria without jeopardizing the delivery of societal impact.
In conclusion, we face important questions about how we steer and guide the research system for the future. On the one hand, a focus on independent inquiry and self-governance of research has a long and productive history. On the other hand, the increasing complexity of the problems that society needs to solve is putting increasing pressure on research to deliver solutions, as well as expanded knowledge. With the tool of societal impact assessment we have the potential to provide the conditions and incentives needed to further accelerate and deepen the contribution of research.