World-wide barriers and enablers to achieving evidence-informed practice in education: what can be learnt from Spain, England, the United States, and Germany?

A global push exists to bolster the connections between research and practice in education. However, fostering evidence-informed practice (EIP) has proven challenging. Indeed, this ‘problem’ requires simultaneously attending to multiple aspects/levels of education systems, and to the contexts within which they reside. As such, comparative analyses using systems approaches hold potential for achieving context-specific insights regarding how to foster EIP. However, such analyses have been scarce, and what research does exist has generally been limited relative to methods and theory. Given this, the present study executes and describes/reflects upon a novel approach for analysing and comparing EIP in/across systems. In this study, educators’ evidence use patterns are described and comparatively analysed, using a sample of four regions within high-income national settings: Catalonia (Spain), England (UK), Massachusetts (USA), and Rheinland-Pfalz (Germany). This study employs a dual analytical frame (a cohesion/regulation matrix and institutional theory) to supply a methodological lens through which to understand EIP within and across these four systems. Together, this approach not only provides a way of accounting for the macro-level differences between contexts, it also enables a comparison of meso-level and micro-level factors (via institutional theory) that might be common and distinct across systems. This study’s findings reveal substantial diversity in the extent and nature of evidence use between systems, which in turn patterned according to distinctive cultural, systemic, and institutional features. Considering these findings, this study’s discussion advances some provisional insights and reflections regarding actual and potential EIP in education. For example, variability relative to the types/extents of accountability pressures, and how this affected educators’ data and evidence use, enabled a discussion holding relevance for policymakers. 
We also share process-related insights—i.e., describing the advances and challenges we experienced while undertaking this new approach. These points hold relevance for colleagues wishing to emulate and improve upon the efforts described herein, which we argue are applicable both in and beyond the education sector. Relative to education, these approaches can be applied and improved with an eye toward developing context-specific (vs. one-size-fits-all) packages for fostering EIP and, ultimately, achieving high quality and progressively improving schools/systems.

Introduction and aims
This paper examines the question of how to bring about more consistent evidence-informed practice (EIP) in education globally. To do so, we use a social regulation/cohesion matrix and an institutional analytic lens to engage in a comparative analysis across four contexts: Catalonia (Spain), England (UK), Massachusetts (MA, United States), and Rheinland-Pfalz (RP, Germany). The aim of the analysis is twofold. First, we aim to explore and critique a new way to analyse EIP in (and potentially beyond) education. Second, we aim to generate insights into how to foster EIP more routinely, and to ascertain whether there are generalisable lessons from education that can be applied to other social policy areas. Although our work is not without limitations, a number of insights do emerge from it. These include that research evidence is but one of the potential influences on practice and, furthermore, that its use is contingent on a host of favourable features and conditions set across micro, meso, and macro levels. Perhaps most significantly, this study demonstrates a new theoretical/methodological approach for studying EIP in/across systems, which we suggest can be taken up in and beyond the education sector.

Evidence-informed teaching
Across many countries, national, federal and district level governments are increasingly pursuing approaches to school improvement that seek to achieve so-called 'bottom-up' change: that is, improvements to teaching and learning that are generated by teachers and subsequently shared horizontally and vertically within educational systems. In particular, 'self-improvement' is now viewed by many as the preferred approach to enhancing educational provision at the school and system level (Greany, 2015). An approach often chosen to support self-improvement is that of EIP. EIP involves fostering situations in which teaching practice is consciously informed by evidence derived from: (1) formal research produced by researchers; (2) practitioner enquiry; and/or (3) routinely collected school or system-level data (for example, student assessment data). A focus on EIP is not without merit, and there is a nascent but growing evidence base to suggest that when teachers engage with evidence, this can lead to improvements in outcomes for both teachers and students. For instance, correlational data indicates that where research is used as part of high-quality initial teacher education and ongoing professional development, there is an association with higher school and school system performance (Mincu, 2014). More recently, Rose et al. (2017), using a randomised controlled trial, showed that increased collaborative research use by primary school teachers had a positive impact on primary school students' exam results. Positive teacher outcomes that emerge from collaborative research-informed practice include improvements in pedagogic knowledge and skills, greater teacher confidence, and higher teacher job satisfaction (Bell et al., 2010; Godfrey, 2016). Similarly, teachers' use of educational data (e.g. standardised test scores, data used for formative assessment, student self-assessment data, or other data such as attendance) can, in the right situations (such as when it forms part of a professional development initiative), also lead to improved teaching and student outcomes (e.g. Lai et al., 2014; Van Geel et al., 2016).
Various theories of change for why EIP should lead to improved teaching and student outcomes have been established (e.g. see Brown et al., 2017; Cain, 2015; Cain et al., 2019). Broadly, these argue that teachers, assuming they have both the access and the capacity to do so, can use a full gamut of evidence in the decision-making that occurs as part of their work. For example: (1) evidence, especially in the form of data, can be used by school leaders to identify and pinpoint areas for improvement, whether in terms of a given cohort or group of students or in relation to an innovation that is required at the level of the school or across schools; (2) evidence can aid teachers in the design of new bespoke strategies for teaching and learning in order to tackle specific identified problems; (3) evidence can provide teachers with ideas for how to improve aspects of their day-to-day practice by drawing on approaches that research has shown to be effective; (4) ideas from research can help teachers expand, clarify and deepen their own concepts, including the concepts they use to understand students, curriculum and teaching practice; and (5) programme evaluations can provide teachers with specific programmes or guidelines, shown by research to be effective, which set out how to engage in various aspects of teaching or specific approaches to improve learning. Finally, data can be used to assess the impact of embarking on (2)-(5) above. Thus, if teachers are able to engage with evidence (both research and data) in a way that enables them to undertake any of these actions, their teaching quality should be improved. Correspondingly, improved teaching quality should then lead to improved student outcomes.
Although it is now recognised that evidence can and should be used to improve practice, there is only limited evidence on how this might be facilitated at the school level (Graves and Moore, 2017). What's more, a systemic gap appears to exist between research and practitioners, which as yet shows little indication of narrowing (Coldwell et al., 2017; Graves and Moore, 2017; Whitty and Wisby, 2017). As a result, EIP occurs only sporadically within and across schools, with other factors, such as intuition and experience, instead solely driving much of the decision-making undertaken by teachers (Vanlommel et al., 2017). The danger, then, is that misconceptions, biases, and fallible 'fast' decisions (Kahneman, 2011) are as likely to influence teachers' decision-making as high-quality evidence that helps to identify problems or points to effective solutions. A key question, therefore, is what we can do to achieve EIP. In other words, how can we get school principals and teachers within education systems, globally, to systematically use high-quality academic research and other forms of evidence to improve how they lead and teach?
In this paper we attempt to shed light on this issue by examining EIP, in a new way, in/across four school systems: Catalonia, England, Massachusetts, and Rheinland-Pfalz (RP). (Our selection of these four cases is described further in a subsequent section.) In doing so, we aim both to generate provisional insights related to fostering more/better EIP in education, and to set out a discussion of the merits and drawbacks of the methodological and theoretical approaches that we took. In addition, in light of the aims of this special edition, this paper also seeks to ascertain whether there are generalisable lessons from education that can be applied to improve EIP (and/or its study) in other key areas such as health, justice and social care.
Comparative analyses of evidence use in education have been scarce. Typically, evidence use or research engagement has not been a primary focus of educational researchers who have undertaken comparative study, though these topics still sometimes surface. Darling-Hammond et al. (2017), for example, selected a small set of high-performing systems and then sought to examine commonalities and generate useful insights. Germane to the present study, they found these jurisdictions tended to view and support teaching as a "research-informed and research-engaged profession" (p. 15). Relatedly, they noted systemic ways in which these systems supported teaching as a collaborative rather than an isolated occupation (e.g., affording opportunities to observe others' lessons, fostering teacher sharing within and across schools).
There have also been a small number of more direct comparative examinations of EIP (or evidence-informed policy) in recent years, though these too have featured different samples, goals, and analytical approaches relative to our paper. For example, the Organisation for Economic Co-operation and Development (OECD) published an edited volume, "Evidence in Education: Linking Research and Policy" (Burns and Schuller, 2007). This book brought together "experts on evidence-informed policy in education from a wide range of OECD countries" (OECD, 2020, n.p.). The book as a whole, and the chapters within it, supplied numerous insights while detailing various 'cases' (e.g., organisations, knowledge brokers, programmes). However, its focus on evidence use within systems was generally limited, it focused more on policy than on practice, and contributors did not operate from a common framework to aid cross-case comparisons. In addition, work funded through the Evidence Informed Policy and Practice in Education in Europe network (EIPPEE; see http://www.eippee.eu/) yielded comparative information and insights relative to knowledge brokerage activities and mechanisms across 11 European countries (e.g., see Gough et al., 2011). Importantly, the analytical framework they developed and drew upon reflected systems thinking/modelling; in line with Best and Holmes (2010), they emphasised how various agents/actors are tied together by a system and embedded/organised through structures that shape the interactions and knowledge exchange that ultimately take place.
Similarly, we see value in a systems approach. Given this, we employ a dual analytical frame (as described below: a cohesion/regulation matrix and institutional theory) to supply a methodological lens through which to understand EIP within and across four systems. The system matrix, presented next, supports the idea of the "power of context" (Chapman, 2019, p. 4). It highlights the critical role of the diverse local settings in which systems operate, and it contributes to understanding the variety of layers that intervene in the implementation of any school reform or innovation, including EIP. In addition, the matrix ensures there is a focus on the different challenges systems experience, and it represents a starting point for any analysis that looks at the configuration of facilitators and barriers in a school improvement process. Each context is unique and powerful and determines a specific configuration of factors, agents, and conditions, which can be explored in depth through the lens of institutional theory. Together, then, our dual approach not only provides a way of accounting for the macro-level differences between contexts, it also enables a comparison of meso-level and micro-level factors (via institutional theory) that might be common and distinct across systems.

The cohesion/regulation matrix
Globally, school systems have a range of differentiating contextual and structural elements. The cohesion/regulation matrix, set out in Fig. 1, has previously been used by Chapman (2019) (drawing on the work of Hood, 1998) as a way of segmenting school systems according to the principal macro-level or system-level factors that define them. The axes used to form the matrix may be considered as follows. 'Social cohesion' refers to the institutions, norms and networks that bind societies together. Systems with high social cohesion have a higher propensity and willingness to collaborate. Threats to social cohesion, which typically result in systems with low social cohesion, tend to arise when such structures and systems (e.g. specific layers of government, the trade unions, the church, as well as the provision of universal services such as health) are dismantled and replaced with deregulation and privatisation; in other words, with approaches that place the onus on individual agency over collective approaches (Bauman, 2012).
The second axis, 'regulation', refers to the institutions that determine control and how accountability functions in a system. Typically, in a high regulation system there is a dominant hierarchical culture with associated bureaucratic control. High regulation systems often also involve the danger of 'high stakes' failure; i.e. a situation in which not meeting exacting accountability standards results in individuals or institutions being highly penalised. Systems displaying low social regulation, on the other hand, tend to exhibit much flatter, non-hierarchical cultures, with improvement achieved through partnership. A low social regulation system is also much less likely to have external accountability measures which lead to penalisation.
As can be seen in Fig. 1, combinations of high/low social cohesion and high/low social regulation result in the following four system types (Hood 1998, p.  Williams, 2019). These rules and norms tend to be durable, and within a field we tend to see substantial continuity because, for example, different organisations are frequently subject to similar pressures and tend to have reached similar understandings regarding what behaviours are (in)appropriate within their realms (Powell and DiMaggio, 1991). The 'state' is generally an important actor when considering public service provision since, for example, public sub-entities are typically at least partially dependent on the state for resources. Accordingly, state-level expectations are influential, and some are codified in formal policies that direct attention and work effort in certain directions. Thus, certain forms of evidence (the focus of this manuscript) and certain priorities are likely to be privileged while others are side-lined or given lesser focus. In many cases (albeit more so in some fields than others), professionals are also significant in terms of enabling/constraining and regulating members' behaviours. Within organisations some individuals are relatively more powerful than others, which also might have implications in terms of evidence use. For example, Brown and Malin (2017) described how school principals are pivotal in terms of bringing about EIP in their respective settings. Such individuals and groups are also often key to knowledge brokerage in various ways (e.g., via setting meetings, via their communications, and so forth; see Malin and Brown, 2020).
A key insight from institutional theory-as applied to the study of evidence use-is that "the strength of the evidence is not the sole, or even the major, determinant of its influence on practice; rather, more powerful actors hold considerable sway in determining what (and indeed whether) evidence is used" (Martin and Williams, 2019, p. 55). However, "this does not also mean that there is no deviation from institutionally prescribed behaviours" (Martin and Williams, 2019, p. 56). In fact, practitioners are often creative despite substantial constraining institutional forces. Thus, a challenge for researchers using institutional theory is to sensitively examine contextualised norms, rules, and structures, while also attending to the reality of what is occurring in order to accomplish the focal organisation's main tasks.

Research questions and approach
In this paper, we attempt to shed light on the following three questions:
1. To what extent are teachers in systems with different types of cohesion/regulation characteristics engaging in EIP?
2. What EIP-related enabling or hindering factors do different systems present, and what is the relative 'strength' of these factors?
3. Are there generalisable lessons from education, at the system level, that can be applied to improve EIP in other key areas such as health, justice and social care?
To address these questions we present four school systems (Catalonia, England, Massachusetts, and RP) as miniature case studies. These cases were selected out of a combination of convenience and strategy. Strategically, the recruitment of this study's team of authors was driven by the lead authors' desire to comparatively examine contexts reflecting diversity along the dimensions being studied. Accordingly, we sought authors known to possess the information access and expertise that would collectively enable us to address this study's main questions. Each author addressed the context with which they were most familiar before we turned to a cross-case analysis.
We selected cases that correspond to different system models, creating opportunities to identify commonalities and differences between them and lessons for improvement. We see a mosaic of policy landscapes across Europe and the U.S., each one with its unique development, challenges, tensions, and dilemmas. The Catalan model, being part of the Spanish system, is an example of the contextualisation of legal regulations and is characterised by collegiality, decentralisation, and institutional autonomy. Despite the high level of social cohesion, local regulations have a central influence, enhancing the level of bureaucracy and generating a hierarchical culture, which gives rise to a series of tensions between system levels and agents. The English system is typical of what Pasi Sahlberg refers to as Global Educational Reform Movement systems (Sahlberg, 2016): as well as being characterised by high autonomy and high accountability, there is a focus in England on the core subjects of literacy and numeracy. This has led to much standardisation of practice, despite the aim of recent reforms to create an innovative self-improving system. In other words, despite having the freedom to experiment, school leaders often choose to emulate the practices of others out of fear of being an outlier and subsequently being punished for failure. Massachusetts appears by most measures to be the top-performing state education system in the United States. It is also a context in which considerable efforts have been made, at state and national levels, to increase/enhance the use of research in policy and practice. For example, its state-level education department has shown a systematic and pioneering commitment to planning and research, and key federal law in the US maintains evidence and accountability requirements. RP is also unique, as it is the only German state without state-wide exams. It has justified this by pointing to its students' comparatively good results in nationwide tests of student performance. More recently, even the school inspection system has been abolished (the current study, however, refers to this instrument), as it was evaluated as too costly and not very effective. Instead, more investment is now being made in enhancing support systems for schools. RP thus offers a particularly good example of a rather egalitarian approach in the matrix, and of data use in such a system.
Despite our best efforts, this study contains certain limitations. For example, our study relied upon authors' access to extant data and other information (versus requiring novel data collection) related to the focal questions. Accordingly, we did not as part of this research collect/analyse uniform data across the cases we studied, but rather relied upon what was available for each case. Cross-case comparisons are therefore made, and should be taken, with some caution. The contexts we studied also do not represent the full diversity of educational systems internationally. For example, in this study we did not include any systems that fit in the individualist quadrant of the social cohesion/regulation matrix. Still, our aim is to compare an educational phenomenon within distinct cultural areas, with regard to its contextuality and different governance constellations in the sense of international comparative education: "As comparative education is a field that is fundamentally grounded on an interest in learning from each other's experience (that is, generated from each other's contexts), context has always mattered" (Lee et al., 2014, p. 150).
We have classified each context according to the cohesion/regulation matrix in Fig. 1 and justify this by detailing the specifics of the different systems that make this so. We then set out to examine: (i) the extent to which teachers implement research evidence in their teaching practice (outcome); (ii) which enablers of and barriers to research use are described in the different systems, using institutional theory to guide our account of how and why certain types of evidence are privileged and of who the more and less powerful evidence actors are within those systems; and (iii) the relative 'strength' of these enablers and barriers, linked to the specifics of the different systems.

Catalonia (Spain)
The Catalan educational model, mirroring the Spanish model, sits within the top left quadrant of Fig. 1 (the hierarchist way). It is characterised as a system based on a collegial model, decentralisation and processes of institutional autonomy, as well as the contextualisation of legal regulations (Marchesi and Martin, 2002). It is a model that offers a privileged position to agreed-upon proposals for action based on an educational project and a series of intervention projects for improvement which, according to the Act on the Right to Education (LODE, 1985), currently include improvement plans, the curricular project, the environmental plan, etc. Schools display different levels of autonomy in the areas of planning, management and organisation. However, with a strong background as a centralised system in the 1990s, schools are responsible for designing and implementing education and management plans under the supervision of their respective education authorities. According to the Act on Education Quality Improvement (LOMCE, 2013), it is within the powers of the principal to ensure the functioning of the school and to stimulate improvement processes. In line with this model, Catalan educational law reinforces the idea of co-responsibility between schools and local authorities in decision-making, promotes collective responsibility in management, and empowers school management teams and teachers themselves to promote innovation through a horizontal system that supports collaborative initiatives (Gairin, 2015). The model balances high social cohesion with a high level of participation. At the same time, it addresses educational inertia with a high interest in accountability and governance through a weakly articulated structure based on a cultural model of quality assessment that uses rigid standards and structures. The coexistence of two models of governance in tension makes it difficult to fully transition to a completely decentralised and cohesive model.
Use of evidence. The Catalan educational system is experiencing a very diffuse and spontaneous wave of change and educational innovation (Martinez, 2019), magnified by a clear commitment to the emergence of collaborative networks between schools and social and educational organisations (Azorín, 2019). Of late, this phenomenon has been driven by the emergence of complex problems and the lack of sufficient resources (Díaz-Gibson et al., 2015) to address them, leading schools to seek solutions through the 'Planes Educativos de Entorno' (environment educational plans) or 'Local Educational Networks' (Civís and Longás, 2015). These experiences are based on collaborative work and networks that assume the existence of a new paradigm for socioeducational services and the growth of extended (community) schools (Azorín, 2019). These developments have generated a complex scenario with different levels of involvement and improvement and specific adaptations as a function of contextual variables such as school ownership (public-private), teachers' stages of professional development and attitudes towards change and innovation (Perines, 2018), to mention just a few; these variables can generate an imbalance between levels of social cohesion and commitment to change.
To face these challenges, in recent years so-called 'evidence-informed practices', which involve teachers integrating research evidence into decision-making, have taken on increased importance in the practice of schools and have gained greater visibility in public discourse. With the adoption of the Catalan Education Act (Decret 274/2018), a systemic and formal commitment to the promotion and use of evidence and research in the field of education has been put into place. This commitment represents the beginning of a new stage in Catalan policymaking promoted by the public bodies, where the aim is to make scientific knowledge an engine for improving educational practices and policies. To that end, the programme called "Evidence-informed schools" [Escoles d'evidència] aims to put evidence on what works in education at the service of educational policies and schools, to promote the most rigorous empirical evidence, and at the same time to connect it with the needs of the system, schools, and teachers. According to Decret 274/2018, with this programme Catalonia will work towards the articulation of an ecosystem that brings together all the educational agents and puts research evidence at the service of improving education.
Despite the late visibility of the concept and some pioneer initiatives implemented in schools in recent years (especially those promoted by the private sector, such as the EduCaixa programmes [EduCaixa, 2019] or innovation movements based on evidence), one cannot say that this trend has become generalised. Studies show that adopting an evidence-based view of teaching requires an understanding of how to integrate teachers' experiential knowledge and should be complemented by contextual and experiential interpretations of research with a reflective approach to practice. This is a paradigm shift that involves interventions at all levels.
Application of institutional theory. Framed by institutional theory, we can discern that different levels are involved, from an epistemological level linked to one's conception of research, to the personal, organisational and systemic levels (these are further explained below). The ecology of educational practices and policies that are 'enriched' (Oancea, 2018) with research should involve, in a shared manner, all actors: teachers, students, public administration, local authorities, and government agencies. The dialogue among these actors involves harmonising different interests and narratives, which in turn depends on, and generates, power relations amongst different contexts, as well as tensions and dilemmas between contexts and levels, agents and decision makers. First, at the epistemological level, Spanish teachers tend to disconnect the conception of research from educational practice. For example, data indicates that teachers perceive research as a type of abstract knowledge that is useless, of poor quality and far removed from their daily practice (Murillo and Perines, 2017; Murillo, 2006; Díaz Costa, 2009).
The ability to support research-informed practices from the bottom up depends equally on teachers' individual capacity to promote and work in a climate of trust and collaboration, in which the exchange of knowledge and the shared and grounded construction of, and critical reflection on, their own practice represent their own professional ethos. However, studies in the Spanish context show that innovations are rarely based on research evidence, although teachers themselves claim to be in favour of the use of evidence in their classroom practice. In a study conducted with teachers in Madrid and Catalonia, 68.1% of teachers and 77.3% of principals declared that they frequently or always use research to inform their practices (Ion and Gairín, 2019). However, when they have to inform their innovations in class, teachers acknowledge limited use of scientific evidence in favour of experiential and peer knowledge (Perines, 2018). Among the factors that limit the development of the ability to use research, teachers include limitations of time, resources, or support from the management team (Perines, 2018). In addition, teachers identify clear deficiencies in their initial research training and a strong disconnect with the context of the production of research, marked by concerns about issues that diverge from the reality of the classroom (Perines, 2017).
At the organisational level, the development of the capacity to use evidence requires leadership that is clearly sensitive to research and favours a positive organisational culture. However, this is not sufficient to promote the use of evidence unless it is accompanied by a research culture that supports a general orientation towards the use of evidence in any decision-making process, assumes the contextual nature of knowledge, supports the integration of evidence into teachers' professional development and cultivates an organisational ethos favourable to collaboration and academic integration (Oancea, 2018; Ion and Gairín, 2019).
At the system level, promoting a vision of practice based on evidence requires coherent and responsible actions among all actors. In Catalonia, innovative educational initiatives are still far from being a generalisable trend; rather, initiatives are isolated and depend on personal initiative (Camacho, 2016). This undoubtedly contributes to the fact that efforts to promote improvement and innovation continue to be poorly recognised and rewarded, poorly documented (Perines, 2017, 2018) and very diverse. Furthermore, despite a high level of thematic diversity, initiatives appear to be minimally connected and are quite different at the methodological level. Additionally, at the system level, there are few mechanisms to identify and connect innovative educational practices with one another, which makes it difficult to identify different agents' degree of development and involvement in the progress of the innovation process (Camacho, 2016; Martinez, 2019). In Catalonia, in a context of increasing concern with accountability (Catalan Government, Departament d'Educació, 2019), politicians dedicated to managing educational systems tend to hold a narrow view of educational research, equating it with, for example, the evaluation of educational systems. Additionally, the approval of laws and decrees is often not supported by research, and the existence of accumulated research (such as research syntheses or meta-analyses) is simply unknown (Martinez, 2019). In this way, a divided model is reproduced, in which decision-making is separated from enquiry and reflection on practice, and the two exist in remote spheres that do not respond to each other.
Change involves: (1) treating discourse and practice on evidence as two sides of the same coin (i.e., political discourse supporting evidence in practice and the measures taken should be aligned, to ensure coherence between all layers and actors); (2) introducing research as an instrument of both the political system and of governance; and (3) creating the conditions for research to fulfil a social function; that is, to have an impact beyond the academic function. Political discourse must be matched by measures that promote governance mechanisms which stimulate this type of practice and support it with adequate resources and mechanisms of recognition and reward for horizontal and collaborative initiatives.

England (UK)
England's high accountability (high social regulation) and high autonomy (low social cohesion) context places it firmly in the top right-hand quadrant of the matrix (the fatalist way). These two elements have achieved particular prominence in the last decade. Beginning with the latter, central government policy makers in England have now devolved multiple decision-making powers and resources to schools. Included in this process of devolution is the responsibility for teacher professional development, in the belief that this will improve quality and increase innovation (Greany and Earley, 2018; Howland, 2015). This commitment has been described elsewhere as the move towards a 'self-improving school system' (Greany, 2017). Here the characteristics of 'self-improvement' include that individual schools now have greater responsibility for their own improvement; that teachers and schools are expected to learn from each other so that effective practice spreads; and that schools and school leaders should extend their reach to support other schools as they look to improve (Greany, 2014; Robinson, 2017).
A key point to understand is the structure of England's school system. While education policy is shaped centrally, since the early 20th century local authorities have had responsibility for the education of children in their locales. However, the relationship between local authorities and central government has not been easy, and in 1988 the Education Reform Act saw local authorities lose many of their powers until their role was one of scrutiny and support. A significant change came with the establishment of academies: state schools directly funded by the Department for Education and outside local authority control. As self-governing trusts, academies have a number of freedoms afforded to them in terms of innovation and curriculum which local authority schools do not. While some academies operate independently, a number of these schools are networked into trusts: groups of schools with centralised policies, curriculum and approaches to professional learning. Therefore, the type of autonomy that staff in schools experience will vary depending on the type of school and its structure.
To further encourage improvements in quality and innovation, policy makers have also embedded a range of accountability systems. These "combine quasi-market pressures - such as parental choice of school coupled with funding following the learner - with central regulation and control" (Greany and Earley, 2018, p. 7). A key aspect of this system is the regular school inspection process undertaken by Ofsted (the school inspection agency in England). Ofsted inspections are highlighted by many school leaders as a key driver of their behaviour, and for good reason. As a result of an inspection, for which there is less than 24 h notice, schools are placed into one of four hierarchical grade categories. The top grade, 'outstanding', has historically carried a number of benefits. For example, it makes the school more attractive to parents, meaning more students apply to attend, and thus more funding is directed towards the school. The reverse is also true: schools with lower ratings find it more challenging to attract families and the funding attached to them. In addition, up until 2019, schools rated outstanding were exempt from subsequent inspections (even with changes of leadership and staff), meaning that accountability pressures were considerably lessened. At the other end of the scale, schools judged to be in the lowest Ofsted category, 'inadequate', are subject to forcible removal from local authority control, and the Department for Education pays academy trusts to take on these schools in a bid to rapidly improve performance. Inspection frameworks have themselves changed a number of times, which directly affects the work of schools as they try to meet the ever-evolving criteria for what Ofsted considers to be good practice.
In addition to inspection, there is the use of government-produced annual 'league tables' of schools and a publicly available 'Find and compare schools in England' website, which allows those accessing it to rank schools on a number of different variables and student outcomes. As a result, it is acknowledged that England's accountability framework both focuses the minds of, and places pressure on, school leaders to concentrate on very specific forms of school improvement. In particular, such improvement tends principally to focus on ensuring students achieve well in progress tests in key subject areas (e.g. English literacy and mathematics) (Ehren, 2018), leading to a narrowing of the curriculum.
Use of evidence. Some data does exist in terms of EIP in the English context. For example, a survey of 1670 teachers in England was undertaken by the National Foundation for Educational Research in 2017. Here it was found that academic research had only a 'small to moderate' influence on teacher decision making.
Instead of research evidence, when deciding on approaches to improve student outcomes, teachers were in fact much more likely to draw ideas and support from their own experiences (60% of respondents identified 'ideas generated by me or my school'), or the experiences of other teachers/schools (42% of respondents identified 'ideas from other schools'). In addition, non-research-based continuing professional development (CPD) was also cited as an important influence (54% of respondents). These compare to the much lower figures of 13% and 7% for 'sources based on [the] work of research organisations' and 'advice/guidance from a university or research organisation', respectively (Walker et al., 2019). The survey also asked teachers to identify the relative importance of a range of factors likely to have an impact on any decision to adopt a new approach to teaching and learning. The factors that teacher respondents were most likely to identify were: straightforward to use (47%); aligned with professional expertise (46%); and a good fit with existing practices (44%). Only a third (32%) indicated research was a 'strong influence' on their decision to adopt their approach (Walker et al., 2019).
Application of institutional theory. Turning now to institutional theory, it is clear that some salient elements are present. For instance, EIP has been supported by non-governmental organisations that operate to support schools. The Education Endowment Foundation (EEF), the 'what works' centre for education in England, for example, provides a freely available 'tool kit' of what works evidence in order to ensure summaries of educational research are accessible to non-academics. In addition to this substantial investment, in 2014 the EEF launched a £1.4 million fund to improve the use of research in schools (EEF, 2014). This initiative was followed up in 2016 with the launch of the EEF's Research Schools initiative: schools charged with leading EIP in their local area. There has also been a substantial rise in bottom-up/teacher-led initiatives, such as the emerging network of 'Teachmeets' and 'ResearchED' conferences (Whitty and Wisby, 2017) designed to help teachers connect more effectively with educational research. Furthermore, a prominent example of a teacher-led initiative was the 2017 launch of England's Chartered College of Teaching: an organisation led by and for teachers and whose mission, in part at least, is to support the use of EIP (Whitty and Wisby, 2017). EIP is also being increasingly promoted and supported at a government level. For example, England's Department for Education funds the work of the EEF, and has also ensured the inclusion of references to EIP within principals' standards and in the pilot Early Career Framework for newly qualified teachers. Finally, the periodic Research Excellence Framework (the 'REF'), via which UK universities are funded, now requires them to account for the "impact" their research has had on "the economy, society, culture, public policy or services … beyond academia" (Higher Education Funding Council, England (HEFCE), 2011, p. 48).
In other words, the government's aim is to use REF to encourage universities to ensure that their research is used in the world beyond academia, for example by directly working with teachers and schools (Cain et al., 2019).
Yet from the figures above it is clear that EIP has some way to go before it becomes a way of life in schools in England (Bell et al., 2010; Walker et al., 2019). Furthermore, the take-up of the EEF toolkit is limited, with just under a quarter of teachers indicating they accessed it. The figure for school leaders is much higher, however, at just under 60% (EEF, 2018). Reported barriers to EIP are manifold and include research held behind paywalls; dense academic-style writing, which can be difficult to access; underdeveloped research literacy skills and support, both individually and organisationally; and the pressures of high-stakes accountability in England's schools (Greany, 2015). In addition, the ability of school leaders to put in place structures within their school to enable teachers to engage effectively in collaborative EIP development has been limited by the budget cuts which have ravaged the education and wider public sector in England (e.g. see Busby, 2019). Teachers themselves also lament the lack of time they have to do anything other than their day-to-day role, with EIP often seen very much as a luxury (Brown, 2020; Galdin-O'Shea, 2015). This argument is reinforced by OECD data which indicate that England's primary teachers have the fifth highest number of teaching hours out of all countries surveyed. While a teacher in Finland has 677 h of contact time with pupils, and a teacher in Germany 799 h, a teacher in England has on average 942 h (OECD, 2020).

Massachusetts (USA)
Placing the Massachusetts (MA) United States (U.S.) school system in the cohesion/regulation matrix is not entirely straightforward. Regarding the individualism-egalitarianism dimension, though MA and the U.S. are individualistic in nature, teachers in their work settings are often more communitarian (Shober, 2016). Most recently, however, MA has embraced neoliberal and managerialist education policies (Piazza, 2017;Horsford et al., 2018); for instance, in 2012 MA enacted a law that limits seniority-based job protections for teachers and may undercut a communitarian, professional ethos. (Also see later discussion of Race to the Top policy.) MA is also-relative to other U.S. states-socially cohesive (e.g. Wise, 2015 places it in the top 10 on this measure). However, the U.S. presently is conspicuously un-cohesive, and MA is no marked exception. All considered, we have placed MA (like England) within the top right quadrant, the fatalist way, albeit with the understanding that this is a dynamic context and policy area.
MA is unique, relative to other U.S. states, in that its K-12 students' achievement consistently ranks at or near the top (Papay et al., 2020). It has also, since at least 1993, been viewed as a leader in U.S. education reform. In 1993, the omnibus Massachusetts Education Reform Act introduced state-wide learning standards and an associated state testing/accountability system. In exchange for the increased accountability, state-based school funding also considerably increased. Notwithstanding these efforts and successes, educators and policymakers in MA have also taken note of, and have sought to rectify, considerable performance inequities between certain groups of students (i.e., inequitable outcomes according to class, race, ethnicity; Darling-Hammond, 2010; Papay et al., 2020).
In the U.S., as the national constitution does not specifically address education, states hold primary authority, though states historically have also delegated much authority and responsibility downward, i.e., to school district levels. Thus it is appropriate to examine U.S. school systems at a state level, but with attention also to local variation (i.e., districts, schools, teams). Nevertheless, educators are also substantially enabled/constrained by federal policies. U.S. educators have since the early 2000s needed to respond to a largely federally led "what works" agenda, characterised by a "strikingly narrow focus on evidence of the impact of interventions" (Tseng and Coburn, 2019, p. 351) and neglecting broader concerns and types of evidence. Most notably, the No Child Left Behind Act of 2001 (NCLB) introduced high-stakes student achievement testing mandates and required that school leaders and teachers reduce achievement gaps. Although NCLB stimulated educators' data use, it also squarely emphasised summative measures and contributed to other dubious practices (e.g., narrowed curricular offerings and a focus upon "bubble kids" who tested near proficiency cut-points; Datnow et al., 2013; Hackmann et al., 2019). In 2010, MA pursued and was subsequently awarded a US$250 million federal Race to the Top (RttT) grant. RttT was an Obama-era federal grant competition (the largest ever of its kind in U.S. history, and brought forth on the heels of a recession that left states especially solicitous for funds) that aimed to spur educational innovation, but that was specifically focused on stimulating particular state-level reforms, including: (a) adoption of core standards and assessments; (b) building data systems that could measure student success and inform teachers/schools how they could improve; (c) the recruitment, development, and retention of effective teachers; and (d) turning around low-performing schools (Horsford et al., 2018).
MA and other states needed to develop and implement policies in these areas, which clearly reflected a neoliberal approach to education reform, in order to compete for and receive this funding. These policy shifts subsequently enabled and constrained educators' behaviours and focal areas, introducing new regulatory pressures and directing their attention toward certain forms of data and evidence (more discussion to follow as part of institutional theory analysis).
MA's educational system is hierarchically organised, albeit in some cases with overlapping authority and in some aspects with the higher (i.e., state department of education) level serving more as a resource/support (and less as a heavy-handed governor) to the local districts and their educators. For example, and pertinent to this manuscript, MA's state-level education department was the first to include a state research director, part of a robust Office of Planning and Research (OPR). MA educators and state education officials are also beholden in some key ways to federal educational law, such as the Every Student Succeeds Act of 2015, which replaced NCLB and, while devolving some authority to states, largely kept intact key evidence and accountability requirements (Tseng and Coburn, 2019).

Use of evidence.
What does all of this mean in terms of MA educators' engagement with research evidence? This case draws primarily upon findings presented as part of a recent study, 'Evidence use in Massachusetts School Districts', completed by Hedberg (2018) for DESE's OSR. This interview study (N = 22 district-level interview participants), drawn from a stratified sample of MA districts, 'sought to understand how districts are currently using, building and sharing data and research' (p. 1). This study was undertaken as part of early-stage efforts by DESE/OSR to support MA districts' evidence use. Supportive insights are also drawn from a piece by Carrie Conaway (2020), who then was research director for OPR.
The first point is that 'data'-and especially from a particular source-are being used more frequently than research. In part, this finding reflects the aforementioned policy and larger agenda, which has tuned educators' focus especially toward students' performance on annual tests. Indeed, in this case educators mentioned their state test-"MCAS"-data as being of premium importance. Hedberg observes, "looking at student performance data is becoming part of the regular routines for teachers and administrators" (p. 4). Some districts (N = 6 of 19) have also invested in one or more staff members whose function revolves around data or evidence. In all, districts appear to "have more systems for integrating data use in their decision-making than for integrating outside research" (p. 9). These systems/structures include data meetings (13 mentions), professional development (5), and dedicated data teams (5).
In terms of 'building evidence,' findings were mixed. On one hand, many districts reported conducting in-house research and/or partnering with outside organisations in order to do so. On the other, in most cases the descriptions offered did not suggest formal research questions and data collection/analyses. Thus, most common, at least among this sample, is engagement in informal research related to several key areas (e.g., tracking school culture and climate).
Research evidence use reportedly occurred in certain ways-for instance, 25% noted looking at the research base as part of selecting new programmes or interventions. Respondents also reported using either data or research to "adopt new materials" (9.13 on a scale of 1-10; 10 = all the time), "select intervention" (8.87), "provide professional development" (8.27), "inform instruction" (8.07), "allocate funds" (7.53), and "allocate staff" (7.0). Evidence was also used by some respondents/districts to measure implementation, and to measure impact (primarily via student assessment data).
This study also inquired about real and potential barriers to evidence use. Responses suggest limited time/staff resources (12 mentions), value of available research (5), and culture (3) present the largest obstacles. Confidence in engaging with research did not appear to be a considerable barrier, at least for these respondents. Across-district evidence sharing appears to be modest amongst MA educators, with most such sharing occurring "at conferences or collaborative meetings" (p. 9). Hedberg (2018) globally sensed some "scepticism around research" (p. 13) including vendor-produced research that was thought to be biased.
Lastly, it was clear that most educators accessed research indirectly and in brokered fashion, i.e., through professional associations (8 mentions) and conferences, and from 'other education publications' such as the Marshall Memo (also see Malin et al., 2018; Malin and Paralkar, 2017). State resources were also noted, again underscoring the importance of the broader system in facilitating or hindering use.
Application of institutional theory. When applying institutional theory to the MA case, some key elements are evident. First, we can see how formal policies (state and federal) have constrained educators' attention in certain directions (e.g., attentive to high-stakes testing data, and more generally toward data relative to other sources of evidence). Within that parameter, we can see the state attempting to be helpful, providing data in a timely fashion and in a format that is said to be desirable. There also appear to be earnest and somewhat successful efforts to facilitate coherence across the system, with the state assuming a leading role. The pattern thus appears to be a primarily top-down approach to evidence (and, more specifically, certain canonical data), whereby evidence from some other authoritative location is brought into practice (Martin and Williams, 2019). The state also is showing interest in encouraging/facilitating bottom-up use, e.g., by "facilitating the cross-pollination of ideas and resources by evidence-oriented practitioners" (Hedberg, 2018 recommendation, p. 2).
Despite considerable isomorphic pressures on MA educational organisations and educators from the top, there also appears to be considerable organisational-level diversity in EIP. Some districts, for instance, have invested in data coaches or similar, whereas others have not. Meanwhile, some schools and districts are facing major pressures to improve or turn around their schools (again, a top-down policy), and educators within these districts are compelled to engage with evidence in different ways and with different levels of urgency. Within such a situation, we suggest there is still potential for cross-district sharing if/when educators/organisations could network and form consortia according to common challenges and interests. However, results of the evidence use survey suggest much of this potential is currently untapped.
We can also see that the state, though a very important part of the evidence use system in several ways (e.g., via relevant laws, its assessments, and its research supports), is not the sole influencer beyond the level of the organisation. For example, professional associations play a key role in facilitating evidence sharing and use in MA, and certain other brokers/mediators are performing important linkage functions as well. At the epistemic level, too, educators' scepticism toward research serves as a barrier to use, but the level of scepticism most likely varies considerably (a conclusion reached by Hedberg, 2018, p. 2, whose recommendations relied upon leveraging and connecting educators who are more "evidence-oriented"). Overall, it would seem EIP in MA is skewed top-down in important ways, but there is also recognition of the need for, and some earnest efforts towards, promoting more bottom-up EIP in and across MA schools and educational organisations. Such bottom-up efforts, however, typically do not include the conduct of formal research.

Rheinland-Pfalz (Germany)
The German school governance model can be classified in the quadrant of the egalitarian way. This means rather high social cohesion with lower institutional autonomy, where the idea of 'managing' development processes encounters a still bureaucratic administrative context. Meanwhile the German system is characterised by low social regulation; accountability is relatively low stakes. This is the result of a longer development process.
Until the 1990s, the German school system, with its 16 state-specific variants in the federal system, was characterised by an input-oriented governing model with hierarchically organised school supervision and top-down detailed control through laws and decrees. Comparable material and personnel resources and the binding nature of the curricula were seen as a guarantee for the quality and comparability of school results. This corresponds to the logic of conditional programming (Luhmann, 1970) in the sense of standardising the results of work by standardising the framework conditions.
As a result of the overall weak German results in Large Scale Assessments and especially since the PISA study, orientation towards a logic of goal programmes (Luhmann, 1970) is now also found in more output-oriented steering and control elements. This includes the expectation that schools will orientate themselves more towards educational standards and that they will be accountable within the framework of external evaluation. This has led to a paradigm shift in Germany-in line with the international trend-although elements like competition and the market are not yet apparent or are only beginning to emerge.
Use of evidence. In the context of the joint project EviS (Evidence-based School Development), evidence-based knowledge and action in schools in the German federal state of RP have been operationalised and descriptively analysed. Evidence was defined as systematically generated, objective and explicit information on the effectiveness of educational processes (Demski et al., 2012). The spectrum ranged from scientific empirical studies as well as state-wide assessments and school inspections (external evidence) to peer observation and student feedback (internal evidence), with any sort of information regarded as evidence if it is more or less objective, reliable, and valid (Dormann et al., 2016). Figure 2 illustrates the reception and use of different evidence sources by teachers (N = 1230) in the EviS study. The data show comparatively intensive use with regard to internal process-related information sources that are closely related to the teaching practice of teachers, e.g. on the basis of systematic student feedback on teaching (upper right quadrant in Fig. 2). The results also point to a relatively intensive use of school subject-related journals. On the other hand, the use of data generated in the context of external instruments (feedback from school inspections, state-wide tests) is significantly lower (bottom left quadrant). It is noticeable that the approval ratings of school management members tend to be higher than those of the teachers (not illustrated here). This might be explained by principals' more comprehensive view of the school as an organisation, but requires further analysis.
In addition to these data, Muslic (2017) shows in her study concerning the German federal states of Berlin and Baden-Württemberg that in many schools the subject-specific departments are central to the discussion of the results of state-wide assessments. The measures derived include, in particular, process-related activities such as support concepts and didactic agreements. An in-depth analysis of the results with regard to the teaching quality, however, rarely takes place. The exchange between teachers and school management remains largely informal and rather superficial. Only in the case of poor results would the school management intervene. In many cases it is up to teachers themselves whether to use evidence and they are hardly accountable (low social regulation).
As the results from RP show, the new assessment and evaluation instruments are not only little used overall, but are also regarded as comparatively unhelpful. Here the distance to teaching practice may seem bigger than with other instruments, and the descriptive data usually do not provide any explanation or knowledge of change. Furthermore, teachers perceive results, e.g. from state-wide testing, in particular as a starting point for considerations on quality development if they have an objective reference norm, i.e. if there is information about what students can do and in what respect they still have to develop competences (Kühle and van Ackeren, 2012). Longitudinal data with individual reference standards should also lead to more acceptance.
The effectiveness of data-based school development improvement measures also requires a competent, trained handling of evidence-based knowledge and its integration into reflective practice. Studies on the different use of data feedback in the context of state-wide assessments in Germany show that schools with high expertise in school improvement can benefit more from data-supported feedback (so-called Matthew effect of accumulated advantage) than those schools that have less competence in this area (Maier et al., 2011). Therefore, more systemic effort is needed to support schools' development processes.
Application of institutional theory. Since 2006, following the so-called "PISA-shock" of unexpectedly weak results for Germany, the introduction of a nationwide educational monitoring concept and data provision on school quality has been associated with the hope of making the actions of schools more effective and thus contributing to the improvement of schools. In the meantime, education policy seems disillusioned with regard to comparative measurements and data use.
On the system level, the usefulness of the data for school and teaching improvement has been questioned by the Standing Conference of the Ministers of Education and Cultural Affairs of the Länder in the Federal Republic of Germany (e.g. Kuhn, 2014). There is also increasing debate on how empirical knowledge can be better integrated into education policy, administration and school practice in order to achieve meaningful change. From a German perspective, education is not a directly steerable or controllable, technocratic production process. The partially critical policy view on continuous performance measurements and EIP in schools is also influenced by the above-mentioned research findings concerning the organisational level. In Germany a 'real' and meaningful data-driven change is rarely initiated. With reference to "school development through insight" (Kotthoff et al., 2016, p. 338) rather than control or competition, it will be up to the schools in the end to decide how to react if standards are not achieved within the low-stakes environment. There exists an understanding of school as a learning organisation, which has been partially strengthened in its scope for decision-making and action. Schools should be able to adapt to local changes continuously and with the help of data and be able to monitor their actions and the effects they have achieved themselves.
Nevertheless, the coupling of the development of the overall system and the individual school is complex, since German schools and the individual actors in them can decide quite independently how to deal with external interventions; from the individual perspective, the interpretive sovereignty over 'school quality' lies with the pedagogical professionals (Klein and Bremm, 2020). Overall, there seems to be a lack of fit between a demand for science-oriented reflection from the outside perspective and an experience-based practice within schools. The data also seem to lack 'cultural significance' in relation to the specific context of the individual school and individual teacher (Heinrich, 2015). Furthermore, school authorities often do not see themselves as managers or at least do not take on this role in practice (Klein and Bremm, 2020). This can be seen in Mintrop (2015), for instance, who speaks of a "public management reform without managers" (p. 790) in Germany. In this way, pressure and unintended side effects are largely avoided, but change is not systematically initiated and supported.
It seems helpful to provide longer-term support, e.g. within the framework of school networks and individual school improvement support. Furthermore, findings from design-based school improvement projects point to the relevance of different opportunities for schools in this context: to understand and recontextualise externally produced data in their specific individual school situation (Fend, 2009), to gain expertise in the assessment and interpretation of their own data and, most importantly, to draw adaptive conclusions and strategies that lead to measurable success in schools (Bremm et al., 2017). Accordingly, there is a need for a new balance between gaining knowledge and supporting schools in the reception and use of data, including a stronger focus on the action level (teaching) and the control level (school management). Corresponding competences, such as integrating empirical results into reflective practical experience, should be included in teacher training, which has rarely been the case so far in Germany. School leadership plays an important role in harnessing the benefits of data use (Brown and Flood, 2019). German studies have shown that directive or discursive leaders are more likely to favour the discussion of data within school, while delegated school management is more likely to inhibit these processes (e.g. Kronsfoth et al., 2018). School principals can also positively influence an organisational climate that is open to evaluations and instruments of performance measurement, as well as promote an overall organisational embedding of the processing and use of evidence (Muslic, 2017).
With regard to the current state of discussion in Germany, the situation is generally unsatisfactory: a cost-intensive system for data generation exists without systematic utilisation impact. Therefore, the 'Standing Conference of the Ministers of Education and Cultural Affairs' (KMK) has changed its overall monitoring strategy for education in order to support a more transfer-oriented development through more relevant and explanatory knowledge for administration and schools and to process research findings more systematically as well as to better support schools (Kultusministerkonferenz, 2015). It remains questionable, however, whether a more binding requirement for practitioners' data use is needed, for example regarding comprehensive and evaluated quality management systems in schools and by strengthening the role of supervision and leadership.

Discussion
In this study, we set out to further examine the question of how to bring about more consistent EIP in education. To do so, we undertook a novel approach, examining EIP comparatively across four geographical contexts (Catalonia, England, Massachusetts, and RP) through the use of particular theoretical perspectives/frameworks. We applied a cohesion/regulation matrix and institutional theory to frame our analyses of factors influencing evidence use across these contexts. Our aims in doing so were twofold. First, we hoped to generate provisional insights related to fostering more/better EIP in education. Second, we aimed to achieve and share process insights related to this undertaking (i.e., to inform those who might wish to emulate and improve upon these approaches). Additionally, given the aims of this special edition, we sought to ascertain whether there are generalisable lessons in this study (either in terms of findings, or process) that could be applied to improve EIP (and/or its study) in other sectors.
First, reflecting on our research process, here we offer several observations. Most generally, our diverse author team found the dual analytical approach to support our aims: it directed our attention in common (albeit not entirely overlapping) ways, providing us with structure through which to make sense of the level and nature of EIP that existed in these contexts. More specifically, the matrix offered us the chance to understand the uniqueness of each educational system and its specific national/local flavours, while the institutional theory provided the chance to find commonalities between these systems. Indeed, the intersection of these two axes of analysis provided an understanding of enablers and barriers to EIP, as well as agents involved in each one of them.
Accordingly, the dual frame has been (and can be) helpful for accurately diagnosing key aspects and levels. In turn, we suggest decision-makers might be able to apply the frame (and/or insights derived from others' applications of it) to "develop appropriate context-specific rather than one-size-fits-all packages of support to stimulate improvements" (Chapman, 2019, p. 5).
Nevertheless, in this instance, we were hampered by certain challenges, which mostly can be understood as study limitations that can be overcome by others. For example, we experienced unevenness across cases in terms of the level and depth of data and evidence available for analysis. Conceivably, a future study could utilise the same frame, but could proactively collect common data across contexts (and could do so at multiple levels of the respective systems) rather than relying, as we did, on extant data and literature related to the contexts. We expect such an approach would further the comparative process. Also, and in line with Martin and Williams' (2019) advice for scholars in these areas, we found institutional theory to provide a quite useful and illuminating, if also complicated, lens for analysing and understanding the relationships between evidence and practice in (and across) contexts. Although ultimately beneficial, it required substantial work for us as researchers to develop a shared understanding of the various elements and how they applied. In this aspect as well, though, we strongly suspect a pre-planned, active comparative study of evidence-use, using this dual approach, would go far in terms of understanding EIP variation (and, thus, suggesting potential approaches to improve EIP). Further, we suggest such research endeavours would be useful beyond education, or perhaps could support the simultaneous study of more than one sector; these lenses can be illuminative irrespective of sector, as nuances related to norms, policies, traditions, etc. (beyond mere differences in evidentiary bases across sectors) will go far in terms of understanding EIP patterns/variation.
Beyond the process, we also suggest this study's findings yield some tentative insights in terms of EIP, both for educators/educational scholars and for those outside the education area. Particularly for the latter group, certain facts may seem counterintuitive. For example, why is it that, in a system in which teachers have a great deal of autonomy like Catalonia, teachers report infrequently relying upon research to guide their practice? Here, some sector-specific analysis/understanding is supportive. For instance, as Cohen and Mehta (2017, p. 649) observe, teaching has across many (though not all) contexts "[failed] to crystalize as a full-fledged profession", a feature that renders teaching/learning "vulnerable to lay views of education and reform, as well as to inherited patterns of practice." Education as a field is challenged by manifold and often competing/conflicting goals, multiple constituencies and often fragmented systems (Labaree, 1997), which "complexifies the idea of a unified body of research informing classroom practice" (Lubienski, 2020, p. 183). Indeed, as compared to professions like law or medicine, education has fewer agreed-upon truths and shared definitions of problems and solutions (Willingham, 2012). In part, this is also a reflection of the inherent complexities and contingencies around teaching and learning. Altogether, one might better understand why educators who have considerable freedom may not direct it in consistent, easy-to-spot 'research-informed' ways. As such, education is indeed somewhat unique in terms of featuring relatively weak and conditional research-practice links (Lubienski, 2020).
Still, there are certain paths forward toward bringing about more/better evidence use in education. In education (much like in other fields), it helps to recognise that research evidence is but one of the potential influences on practice (Farley-Ripple et al., 2018). Indeed, as Cain et al. (2019, p. 1074) summarise, there is now "near-universal agreement that research-generated insights are an insufficient basis for practice". Above all else, educators, like other practitioners and policymakers, want to be able to confidently make decisions about problems/issues that are important to them. Ultimately, it is realistic first to understand that research knowledge represents just one of several forms of knowledge educators might draw upon as they go about their work. Accordingly, its professional use is not pre-given but is contingent on a host of favourable features and conditions, set across multiple levels. As such, and although none of the cases addressed in this study represents a utopia in terms of professional evidence use, individually and collectively they are assistive in drawing out some such features that can help move educational systems toward more desirable states.
The England and MA cases, for example, show how strong accountability pressures (via inspection and high-stakes assessments, respectively) certainly can coax educators to consistently focus on particular forms of data and research. MA educators are particularly attentive to students' performance on annual high-stakes state tests, while English educators are considerably driven by official school inspections. However, there can be considerable costs associated with such approaches, roughly summarised as 'what gets measured matters' (and the reverse: what is not measured might be underemphasised or ignored, reduced/cut from programming, etc.). Accordingly, we offer that when accountability systems are in place, specific details (e.g., assessment areas, format, foci, speed, and quality of feedback) are salient. RP offers a useful point of comparison here, in that their external assessment data appear to be attracting substantially less educator attention; this fact is most likely explained by the relative weakness of sanctions and gratuities associated with these measures. Depending on one's vantage point, this might be construed as a virtue or a vice. On the positive side, for example, perhaps RP educators are freer to direct their attention toward locally important data and research (e.g., bottom-up evidence generation/use), and as such are utilising the external data only to the extent that it is perceived to add value to their decision-making. On the other side, results show that low-stakes accountability and higher degrees of autonomy in Germany come with lower levels and less infrastructure for support that would help practitioners with understanding, discussing and recontextualizing these data.
The MA case also shows a relatively strong and layered infrastructure for supporting evidence use, including a state-level research director and department focused on planning and research. Moreover, with their recent report, they too are now positioned to be more evidence-informed about how to promote more routine and deep evidence use in schools (i.e., they can tailor their activities and processes to what they have been learning from MA educators regarding actual evidence use, supply, and demand).
Conaway (2020), formerly research director for MA, provides research-grounded and practice-grounded insights regarding how to potentially move toward next-level evidence use. Broadly, she writes: "If we want research to matter…we need to devote resources to building relationships and strengthening organisational practices, in service of building organisations that learn" (p. 2).
More specifically, she highlights the importance of several specific aspects and structures/arrangements, most of which are evident and/or incipient in one or more of the cases we reviewed. For example, she highlights the potential of learning networks (see the English and Catalonia cases to learn more about their relatively longstanding and more recent embrace of PLNs, respectively), embedded research directors (see MA), and research-practice partnerships (relatively prevalent in MA).
Perhaps most fundamental, and perhaps best exemplified presently in England and in RP, is the overarching goal to develop and support educational 'organisations that learn.' To foster such organisations on a broad scale, we suspect, ultimately will require that various conditions be simultaneously met, both philosophically and materially, and at multiple levels within complex educational systems such as those we have profiled herein. In other words, to truly approximate an educational 'evidence use utopia' will require attention to both institutional and social factors, as highlighted via our analyses. Additionally, we suggest, it will require establishing conditions in which teaching is treated and experienced as a full-fledged profession (see Darling-Hammond et al., 2017, for jurisdictions in which this is a reality, or nearly so). Ultimately, what we suggest should be envisioned and worked toward are coherent systems in which teachers and educational leaders routinely and effectively can access and integrate research evidence with other forms of knowledge/knowing at the "point of use" (Nutley et al., 2019, p. 242), as they are making consequential educational decisions.

Data availability
All datasets analysed or generated are indicated in the paper.

Note
1 It should be noted that in RP, as in most Länder, there is no longitudinal individual data at pupil level available that could show developments in a more differentiated way.