Introduction

Advocates of the use of evidence in policy development argue that the main way to achieve this goal is to encourage policymakers (a) to learn to think like scientists, and (b) to learn how to solve problems primarily with reference to the evidence generated by professional, scientific, and technical methods of inquiry. Yet the available research on the learning process of people engaged in the design and implementation of public policies highlights the need to further develop this argument (Cairney et al., 2016). It implies that policymakers seek to learn from the experiences of others, or the past, but that the process is no more technical or straightforward than the usual politics of agenda setting. They demand information in specific ways, to reflect their existing beliefs and the way in which they define problems. This enhances their ability to learn, but only in particular ways and from a limited range of experiences. Further, they do so in specific political settings and environments, which further limit the applicability of lessons from elsewhere.

Utilizing the policy-oriented learning perspective, this article deduces, from scientific evidence on policy learning, take-home messages that account for differences in how individuals acquire and process evidence in a political context (Footnote 1). Learning in this context is defined as an ongoing process of search and adaptation, which is motivated by the desire to improve one’s understanding of policy problems, their causes and the probable impacts of alternative solutions on policy objectives (Sabatier, 1988, p 151). This notion of policy learning is grounded in the assumption that individuals in a complex policy context strive to increase the utility of their choices by optimizing their judgment in order to fulfill their individual goals, in particular when they encounter an unfamiliar situation and non-routine tasks (problems referred to as ’novel problems’ hereafter). Feedback and environmental cues help them to learn from judgmental errors and optimize their reasoning. Heikkila and Gerlak (2013) draw our attention to the cognitive processes by which individuals directly or indirectly acquire and process relevant information, in particular instrumental lessons about the “viability of policy instruments or implementation designs,” social lessons “about the social construction of policy problems, the scope of policy, or policy goals,” and political lessons “about policy processes and prospects.” The first two lessons are related to thoughts about the validity of policy-related arguments, whereas the latter concerns the development of strategies about how to successfully promote these ideas (May, 1992).

A person’s immediate social environment and the socio-ecological environment within which it is embedded determine what feedback and environmental cues individuals perceive and learn from (Weible et al., 2010). Policy learning settings vary as each policy sector has a unique politics of its own (Lowi, 1972). With few exceptions (for example, Diaz-Kope et al., 2013), policy-oriented learning studies analyze individual and collective learning in complex or volatile settings that give rise to uncertainty and ambiguity. Dunlop and Radaelli (2013, p 603) label as “reflective” those settings that lack socially accepted expertise and contain policy problems too complex to be traced. Here the evidence justifies sampling methods that go beyond measuring governmental learning (Etheredge and Short, 1983). Unlike accounts rooted in Hall’s notion of social learning, the analyses of this subject generally focus only on influential individuals (a policy elite; Heclo, 1974; Hall, 1993) within a policy subsystem that is geographically and domain bound. For example, the policy elite that shapes Oregonian transport policy would constitute such a setting. The evidence suggests that these structures include an increasingly diverse and growing set of experts (policy analysts, scientists, consultants, and researchers in government and non-government organizations) who advise individuals with legal powers to impact decisions on a particular policy issue (Weible, 2008). As epistemic guidance diversifies, the available expertise becomes contested and individuals learn from observing others’ preferences and actions. This elite rarely draws direct lessons from expert-based evidence to either confirm (political use) or critically evaluate (instrumental use) their views. Rather, policy-oriented learning describes how lessons from lived or witnessed experience, analysis or social interaction accumulate, gradually altering or confirming a person’s thoughts or intentions (Weible et al., 2010; James and Jorgensen, 2009). Individual learning then incrementally translates into social learning and the development of paradigms in policy at time x, which can affect individual learning processes at time y, as well as subsequent policy changes (Howlett and Migone, 2011). In a related discussion, Birkland (1997) argues that sudden events beyond human control can facilitate or hinder learning and policy change.

Guided by the literature on policy-oriented learning, the remainder of this discussion distinguishes different types of policy learners in reflective settings according to their individual background and immediate social environment. The first section below outlines common assumptions about how human beings learn that form the basis of this literature. The focus here is on the acquisition of new knowledge rather than skills. The second part of this article introduces pathways that evidently result in policy learning under varying conditions. Here, the review focuses on contributions that take the individual into account. The third part summarizes take-home messages for scientists and advisors struggling to maximize the use of their evidence, and for seasoned and young policymakers who seek to optimize their reasoning. In doing so, this review distinguishes itself from more extensive literature reviews whose primary focus is the reconciliation of various notions of policy learning (May, 1992; Bennett and Howlett, 1992; Howlett and Cashore, 2009; Dunlop and Radaelli, 2013).

How we learn

How human beings learn is not an easy question to answer. While some believe in our capacity to think and act rationally, others argue that reality is nothing but a construct. This article builds on Sabatier’s definition of policy-oriented learning that sits somewhere between these extremes. It is rooted in the literature on dual learning (Sabatier, 1988). A handful of scholars have already followed his example, further developing the model of the individual subsystem actor. For example, Jenkins-Smith et al. (2014, p 488) describe individuals as “boundedly rational with cognitive filters and biases that influence how individuals acquire information and update beliefs”; some of these filters are then credited with a person’s inability to learn. Weible et al. (2011, p 127) argue that some individuals have a better developed capacity to think analytically and overcome these biases, depending on “their formal level of training and education”. Taken together, this evidence suggests that individual learners differ in terms of their analytical capacities and reliance on cognitive filters (heuristics). This reflects recent developments in the dual learning literature that distinguish between automatic brain operations (heuristic processing) and controlled operations (analytical processing) (Stanovich, 2012). Heuristic processing promotes associations that come spontaneously to the mind, without conscious search or computation, and without effort. Optimizing this process are our cognitive abilities and propensity for critical thinking, which are both associated with analytical processing (Evans and Stanovich, 2013). The pros and cons of each of these two cognitive styles will be roughly outlined in the following two sections, providing the foundation upon which to develop more realistic strategies to maximize the use of evidence in policy.

Type 1: Heuristic processing

Type 1 processing develops unconscious, implicit problem-solving strategies that have been practiced to automaticity (Footnote 2) (Kahneman and Klein, 2009). This evolved biological predisposition means that we learn some associations between cause and effect better than others (Evans, 2006). For example, early on in our evolution, we did not have the luxury of time to think but needed to respond quickly to environmental cues. Our response time meant the difference between life and death. This is when we came to rely on intuitive processing to quickly identify the optimal response to sudden changes in our environment.

The ease with which one can link a new observation to past experiences influences how we perceive probabilities (availability heuristics; Footnote 3). Frequently occurring events are easily recalled, and memorable or dramatic occurrences will come to mind more readily. Consequently, vivid and emotionally charged or recent memories, and well-publicized information, appear more plausible and are given greater weight than other information (Kahneman, 2011). Sabatier et al. (1987) credit these biases with the misperception of our opponents’ strength. They argue that we are more likely to remember failures than to correctly assess our opponents’ weaknesses. This devil shift argument is then further developed to explain why individuals maintain collaborations with like-minded others. Studies have shown that the devil shift affects a wide array of individuals, ranging from interest group leaders to scientific experts (Ingold, 2011b; Zafonte and Sabatier, 1998).

Experimental research furthermore shows that the accuracy of an intuitive assessment of a given situation depends on whether the evolved mental habit (“belief bias” hereafter) suits the contextual situation in which the action is being executed. Intuitive processing works very well in benign contexts, but can distort assessments of probabilities when complex reasoning processes take place (Stanovich, 2012). This makes us easy prey for those who seek to manipulate our perception of reality by using stereotypes. This is when we have to rely on our ability to reflect on the information we have been given.

Type 2: Analytical processing

Type 2 (analytical processing) is a recent evolutionary structure that allows us to make deliberate choices between multiple options and allows general reasoning. It is non-autonomous. It is slow and largely restricted to sequential thinking (taking a step by step approach to solve one problem). It permits what Evans (2003) calls “abstract hypothetical thinking”, which allows us to make inferences from hypothetical situations that have not yet happened, an attribute specifically linked to human learning. We learn this style of reasoning through abstract logic at school and indirectly through games and puzzles. Under the right conditions, Type 2 reasoning can optimize habitual responses by suppressing intuitive processes and responses. It does not require extensive background knowledge (Schneider and Newman, 2015). Its effectiveness depends on our propensity to collect and store information and to think about future consequences before taking action (cognitive ability). Furthermore, this function works best when paired with a reflective mind that tends to consider alternative representations. This association is most clearly recognized in Stanovich’s tripartite model, an extension of dual processing theory that links cognitive style and cognitive ability (Stanovich, 2012). For example, an individual with an excellent working memory but hardly any disposition to think critically can fail to detect and address biases (Stanovich and West, 1997). Policy scholars researching individual analytical policy capacity expect that policy analysts, scientists, consultants, and researchers in government and non-government organizations use an accepted analytical approach to acquire and critically process policy-relevant information (Leach et al., 2014).

While Type 1 is the modus operandi, Type 2 reasoning kicks in when conflicting associations are detected. Dual learning scholars commonly assume that Type 2 reasoning processes run parallel to Type 1 processes, though at a much slower pace (Pennycook et al., 2015). In addition, possible cues for analytical thought are embedded within a person’s environment, such as feedback or top-down instructions. This has only recently been the focus of discussion and experimental research. The evidence indicates that individuals who are informed about judgmental errors, are given time to think, and are encouraged to consider alternative responses optimize their performance (Macpherson and Stanovich, 2007).

Individual differences in learning

How individuals respond to a cue depends on what type of learner they are. Arguably, a person who is more open to logical reasoning is more likely to detect cues indicating cognitive conflicts before individuals inclined towards heuristic reasoning do (Pennycook et al., 2015). Thus, in a collective they are the most likely to give others the cue to employ Type 2 reasoning strategies. In other words, they are the teacher in Dead Poets Society who instructs all students to stand on the table to gain a new perspective in their life. Least likely to respond quickly to cues are individuals who lack the necessary open-mindedness or, as will be argued further below, contact with open-minded people.

Based on the aforementioned experimental research, scholars arrive at some predictions about an individual’s ability to reason and learn (Stanovich, 2012; Trippas et al., 2015; Schneider and Newman, 2015). Assuming similar perception and motor skills, four different types of learners can be deduced from this literature.

Novice: This category includes any person who has just entered a particular environment and any person who encounters an unfamiliar situation and non-routine task. These individuals are most likely to encounter novel problems. It can be assumed that they can utilize basic logical reasoning skills but generally lack the specialized knowledge that specialists or advisors have. This predisposition considerably lowers their ability to deliver a stellar performance.

Specialist: This category is used to refer to any person with basic logical reasoning skills who engages in routinized tasks and has specialized knowledge applicable only within a particular environment. Thus, when confronted with a problem in their field, they are likely to encounter a familiar pattern, and can solve task-related problems effectively by using heuristic processing. This results in satisfactory solutions. They would still be expected to effectively address an unfamiliar task in a familiar environment by using their logical reasoning skills (risking biased reasoning when failing to do so). However, the quality of solutions can be expected to be at the novice level when the person is confronted with novel tasks in an unfamiliar environment.

Advisor: Any person with advanced logical reasoning skills who has accumulated specialized knowledge applicable only within a particular environment. They are expected to behave like a specialist in a familiar situation. However, an advisor is more likely than a specialist to master an unfamiliar task in a familiar as well as an unfamiliar context.

Scientist: Any person with extremely potent abstract reasoning skills and working memory (possessing academic abilities that are considered general knowledge in most cultures, such as historical knowledge, literacy, and numeracy). They may lack specific knowledge, but can access a higher order cluster of problem-solving strategies to compensate for this deficiency by using cues from similar situations to arrive at good solutions to a novel problem set in an unfamiliar environment. This predisposition increases the potential that they can use reasoning to identify optimal solutions once they have accumulated more specific knowledge (Fig. 1).

Fig. 1 Learning types by ability to detect cues for analytical thought

In order to gain a first impression about how individuals learn (what learner type they are), try to answer the questions:

What level of specialized knowledge has the person accumulated?

Does their schooling imply advanced or basic logical reasoning skills?
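
As a rough illustration only, the two screening questions can be read as a two-by-two lookup onto the four learner types described above. The names `LearnerType` and `classify_learner` are hypothetical, and reducing each question to a yes/no answer is a simplifying assumption made here; the reviewed literature treats both knowledge and reasoning skills as matters of degree, and the sketch does not separate an advisor's "advanced" from a scientist's "extremely potent" reasoning.

```python
from enum import Enum


class LearnerType(Enum):
    """The four learner types named in the text (hypothetical encoding)."""
    NOVICE = "novice"
    SPECIALIST = "specialist"
    ADVISOR = "advisor"
    SCIENTIST = "scientist"


def classify_learner(has_specialized_knowledge: bool,
                     has_advanced_reasoning: bool) -> LearnerType:
    """Return a first-impression learner type from two yes/no answers.

    Assumption: both answers are binary, which is cruder than the
    graded distinctions drawn in the text.
    """
    if has_advanced_reasoning:
        # Advanced reasoning with domain knowledge: advisor.
        # Advanced reasoning without it: scientist.
        if has_specialized_knowledge:
            return LearnerType.ADVISOR
        return LearnerType.SCIENTIST
    # Basic reasoning with domain knowledge: specialist.
    # Basic reasoning without it: novice.
    if has_specialized_knowledge:
        return LearnerType.SPECIALIST
    return LearnerType.NOVICE
```

For instance, `classify_learner(True, False)` yields `LearnerType.SPECIALIST`. Such a lookup offers only a first impression; it says nothing about the learning situation in which the person is embedded.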

Pathways to policy learning

The discussion about different learner types supports observations that individuals in a policy environment develop deeply rooted cognitive filters that help them effectively address policy issues in familiar situations by using Type 1 processing (Jenkins-Smith et al., 2014). This can entail fundamental strategies of defining policy problems or possible solutions that are constantly reviewed and occasionally revised. Leach et al. (2014, p 593) argue that “whether one (new knowledge) leads to the next (new beliefs) depends on the extent to which people are assumed to be rational, and whether the learning context is designed to facilitate rational evolution of policy beliefs”. The discussion also adds weight to claims that individuals with high analytical capacity are more actively engaged in the search for information (Weible et al., 2011).

There are several interrelated pathways to policy learning. First, aspects of the learning situation can affect the extent to which individuals with potent abstract reasoning skills can overwrite heuristic processing, their own as much as that of others (Zafonte and Sabatier, 1998; Weible and Sabatier, 2005). Heikkila and Gerlak (2013, p 497) deduce from their review of the policy learning literature that individuals’ social interactions determine how they acquire, make sense of and disseminate information, in particular relationships that “involve trust and patterns of openness”. Likewise, Elgin and Weible (2013) argue that if we are to properly understand how individuals receive and respond to cues for Type 2 reasoning, an individual’s influence on another needs to be considered. It is thus important to understand how they are embedded in a learning situation.

In related research, Weible et al. (2010) find that the potential for learning is greatly reduced when individuals segregate into competing advocacy coalitions. In other words, they only maintain ties to like-minded others. Understanding the attributes of a learning situation is the second question that needs to be addressed to understand how individuals acquire, make sense of and disseminate information.

The answers to this and the previous question about an individual’s position in a learning situation depend on the factors that shape a learning situation. William Leach and associates observe that the diversity of participants that engage in a collaboration matters. More importantly, a perception that the procedure guiding their action is fair and that their opponent is trustworthy positively correlates with the acquisition of new technical or social knowledge and with instrumental and social learning (Leach and Sabatier, 2005; Leach et al., 2014). Thus the third question to address is what factors shape the learning situation.

What cues for learning individuals receive from their social environment

Field studies that link individual and collective learning in a political environment have in common a focus on social networks that form around the issues that require collective action. These relational structures exist in relation to and within other such networks, and involve a diverse set of individuals, such as governmental officials, activists, journalists, researchers, and policy analysts. The issues for which individuals within a specific network seek optimal solutions are often too complex to be fully comprehended by one individual alone. Thus, individuals within a network cluster around narrations that utilize shared heuristics (also referred to as beliefs) as to the nature of the problem and the effectiveness and beneficiaries of specific solutions. This occurs in particular where individuals regularly collaborate (Ingold et al., 2016; Matti and Sandstrom, 2011). Zafonte and Sabatier (2004) argue that in such settings individuals perceive it as less costly to build and maintain collaborations with like-minded others (through reciprocity) that can help them realize long-term policy objectives than to defect from such a partnership for short-term gains. Zafonte and Sabatier (1998, p 480) clarify that it is not strong but weak coordination that matters in this context. Individuals “monitor each other’s political behavior, and then alter their action to make their political strategies complementary with respect to a common goal.”

Besides their strength (i.e., trivial or non-trivial degree of coordination) and nature (e.g., collaboration or conflict), relationships between individuals are also measured by the evidence that can potentially flow from one to the other via these channels. Here, scholars distinguish convergent and divergent relationships (Leifeld, 2013; Ingold, 2011a). The term convergent describes a relationship in which individuals exchange information that confirms their causal understanding of an ambiguous situation (intra-coalition learning; Weible et al., 2010). This setting fosters intuitive reasoning and political learning that help to rationalize and institutionalize existing belief bias and foster routinized behavior and policy stability (Zafonte and Sabatier, 2004). This occurs more so amongst individuals that are over-reliant on Type 1 reasoning than amongst individuals that have an incentive, such as professional norms, to engage in Type 2 reasoning (Howlett, 2009). Individuals that maintain a divergent relationship are expected to receive cues from each other to engage Type 2 reasoning (inter-coalition learning; Weible et al., 2010). The exchange of natural science research is said to be particularly effective in this regard (Sabatier, 1988). If an analytical reasoning process is triggered, this can result in cognitive and subsequently, though not necessarily, in behavioral change. This process can also be described as instrumental learning, as it primarily involves lessons about policy tools and interventions (May, 1992).

It has been noted, however, that it is rare for an individual to have a direct effect on another’s learning. Weible et al. (2010) observe that subsystem members rarely consult one source of information to confirm (political use) or evaluate (instrumental use) information. Rather, the evidence points towards the indirect and unintended impact of networking behavior on learning. Here, Albright and Crow (2014) find that in the context of flood risk management (Colorado, USA) individuals gradually reaffirm and revise their causal reasoning about policies, targets, and outcomes. This is referred to as social learning by May (1992). Jenkins-Smith et al. (2014) argue that individuals that are linked to like-minded others are likely to face the same environmental cues and become increasingly similar in thought (Leifeld, 2013). Network homogeneity then enforces Type 1 reasoning (Jenkins-Smith et al., 2014). Hence, whose knowledge individuals are exposed to depends partially on their position in the network structure within which they are embedded (Weible, 2008). Three positions can be deduced from the policy learning literature, using the discussion presented in Ingold and Gschwend (2014) as guidance: Policy Entrepreneur, Policy Broker, and Advocate.

Policy Entrepreneur: At the core of each coalition, policy entrepreneurs use whatever resources are available to them at the time to gradually develop and maintain shared narrations of their personal beliefs about the causes of a problem and the effects of possible solutions (Mintrom, 1997). Theoretically, entrepreneurs “employ heresthetical arguments strategically linking policy options to important outcomes… in a manner designed to split the dominant coalition and render change possible” (Jones et al., 2009, p 42). In a related discussion, Mintrom and Norman (2009) highlight that it is this desire to gain resource supremacy (i.e., attract sufficient support to impact the policy process) that distinguishes the entrepreneur from an advocate. Entrepreneurs are expected to favor knowledge that supports their argument. The evidence furthermore suggests that individuals in an entrepreneurial position are invested in the process long-term and their expertise is recognized by their peers (Dudley, 2013; Crow, 2010; Weible et al., 2004; Ansell et al., 2009) (Footnote 4). From this it can be deduced that entrepreneurs constitute the most densely connected and homogenous clusters within a policy subsystem, a network component that consists of many symmetric relationships between like-minded individuals. Knowledge circulates within this group. Ingold (2011b) finds that this group includes representatives of government agencies or powerful interest groups. Crow (2010) observes entrepreneurs to be the most effective when they have accumulated more technical or managerial expertise over time.

Policy Broker: In contrast to entrepreneurs, policy brokers’ principal concern is to provide cues for Type 2 reasoning and to mediate conflict between competing coalitions (Sabatier, 1988, p 133). They are assumed to trigger “new ideas concerning, e.g., causal relationships and policy instruments” (Sabatier, 1988, p 159). Scientists, journalists, and civil servants tend to occupy this position at the periphery of each coalition (Ingold and Varone, 2012; Howlett and Newman, 2010). They engage in dialog and deliberation with individuals in opposing coalitions (Ingold, 2011a). They are central members in the network but are only intermittently involved, or involved for a short period, and they do not regularly engage in coalition-related activities (Weible, 2008). Thus, brokers play a less integrative role within a coalition but drive collaborations on a subsystem level. This evidence supports the argument that scientists and journalists who intervene in a specific discussion can, for a short period of time, create bridges via which individuals in opposing coalitions can then send and receive cues for Type 2 reasoning (Sabatier, 1988).

Advocate: To borrow a phrase found in Mintrom and Norman (2009, p 650), “an advocate is an individual who is comfortable working within established institutional arrangements.” They hold authority or potential for authority to enforce and monitor policy design and implementation, but they are not interested in upsetting the status quo. Although advocates may not be central actors in a coalition (like entrepreneurs) or the subsystem as a whole (like brokers), they are nevertheless skillful leaders with the legal authority, financial means, social capital, or knowledge to impact individual and social learning. Montpetit (2011) argues that this can include scientists who share their expertise with policy entrepreneurs. They are targeted by policy entrepreneurs who seek sufficient support to gain subsystem supremacy. Members in the group who receive cues for Type 2 reasoning from their exogenous environment may approach a brokerage position as they seek to make sense of their observations (Weible, 2008). It is thus expected that advocates constitute the loosely connected periphery of a coalition (Fig. 2).

Fig. 2 Positions within a network of allies advocating shared narrations

In the absence of contradictory evidence it seems fair to argue that, no matter their learning type, everyone can occupy one of these positions at a given time. It would thus be shortsighted to assume, for example, that all scientists take brokerage positions. To gain a first impression about a person’s structural position, ask:

Are they central to the development of an advocacy coalition?

Do they mediate between conflicting advocacy coalitions?

Are they less engaged but still affiliated with an advocacy coalition?
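
The three questions above can likewise be sketched as a simple decision rule. This is only an illustrative reading: the function name `structural_position`, the yes/no inputs, and in particular the order in which the questions are checked are assumptions made here. The text does not say how to resolve cases in which more than one answer is "yes", and it stresses that a person can occupy different positions at different times.

```python
def structural_position(central_to_coalition: bool,
                        mediates_between_coalitions: bool,
                        affiliated_but_less_engaged: bool) -> str:
    """Return a first-impression network position (hypothetical helper).

    The questions are checked in the order they are listed in the text;
    that tie-breaking order is an assumption, not a finding.
    """
    if central_to_coalition:
        # Central to the development of an advocacy coalition.
        return "policy entrepreneur"
    if mediates_between_coalitions:
        # Mediates between conflicting advocacy coalitions.
        return "policy broker"
    if affiliated_but_less_engaged:
        # Less engaged but still affiliated with a coalition.
        return "advocate"
    # None of the three positions applies.
    return "unclassified"
```

Because positions can change over time, the same person may need to be reassessed whenever the learning situation changes.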

What are the attributes of the learning situation

How individuals in a specific position acquire and process evidence depends as much on their cognitive disposition as on the overall network formation and the exogenous environment in which their action is embedded. Theoretically, three network formations can be distinguished: one dominant coalition, several collaborating coalitions, or several adversarial coalitions (Ingold and Gschwend, 2014).

Collaboration favors inter-coalition learning and analytical reasoning (Leach and Sabatier, 2005). Collaborative settings are most likely where the level of conflict between coalitions is low or the issue at hand is narrow in scope and traceable. In such a setting one would expect to see many brokers but also a large number of advocates and a few entrepreneurs that maintain divergent relationships with one another, resulting in policy-oriented learning and a subsequent increase in homogeneity across coalitions (Weible, 2007).

In an adversarial situation, policy entrepreneurs push convergent information onto advocates in an effort to nourish narratives promoting the beliefs that glue their coalition together (Ingold and Christopoulos, 2014). Entrepreneurs who maintain the coalition with the most influential advocates control the political agenda, and as such, several environmental cues to which network members are exposed. In other words, they determine what is discussed when and how in various policy venues. Policy entrepreneurs that lack the resources to influence policy design and implementation within the current situation may seek to attract new advocates to the network through brokerage (Dudley, 2007). Individuals that occupy brokerage positions are more open to divergent information and thus are able to mediate between conflicting coalitions, in particular those whose expertise is universally recognized. Their central position within the network exposes them to a diverse set of narratives and forces them to engage in Type 2 reasoning (Ingold and Gschwend, 2014). Successful brokerage may then alter advocates’ perceptions and convince policy entrepreneurs on both sides to reframe their narratives. This is easier when there are additional cues to give them an incentive to negotiate seriously (Lundin and Oberg, 2014). In settings that lack such cues, policy entrepreneurs first seek to rationalize their beliefs (political use of information), and only address a cognitive conflict (instrumental use of information) when this appears to be the most feasible option to maintain their links with influential advocates.

In a unitary setting, the network structure can be described as densely connected, enforcing homogeneity. In such a setting, scientists are expected to sit at the coalition’s periphery (Ingold and Fischer, 2014). Cues for Type 2 reasoning need to come from the exogenous network environment, which will be introduced in the next section.

From this it can be concluded that the attributes of a learning situation can impact policy learning. To get a first impression, ask:

Does the setting favor inter-coalition learning and analytical reasoning?

Does the setting lack potential triggers for analytical reasoning?

Does the level of conflict lead individuals to overly rely on heuristic reasoning?

What other factors shape the learning situation

The use of evidence in policy also reflects previously established policies (rules, norms, or strategies) that define what individuals can or cannot do in the present context (Heikkila and Gerlak, 2013). James and Jorgensen (2009) argue that policy designs (the content and structural logic of public policy) institutionalize patterns of behavior, and thus can help to predict with whom individuals are likely to engage. Likewise, Leach et al. (2014) find that, depending on their content, policies not only regulate access to a network but also define participants’ functions, tasks, and responsibilities. For example, collaborative policy designs may dictate elaborate stakeholder involvement, whereas policies that clearly outline decision processes ensure that these actors also trust in the procedure. In doing so, they can reduce the transaction costs of communication between individuals with divergent mindsets. Jenkins-Smith et al. (2014) draw our attention to paradigms that are rooted in our culture and can increase these transaction costs. Both culture and policies are expected to change slowly and incrementally.

Furthermore, an individual’s socio-economic and physical context influences what resources are available to individuals in a network at a given time (Heikkila and Gerlak, 2013). This can entail more rapid and unpredictable changes, for example man-made disasters (Nohrstedt and Weible, 2010; Jones and Jenkins-Smith, 2009), severe weather events (Albright and Crow, 2014), or changes in public opinion (Crow and Lawlor, 2016) that heighten the visibility of policy failures and trigger ex post learning. When confronted with rapid exogenous changes, the policy entrepreneurs observed by Birkland (1997) framed the novel problem in a way that highlighted similarities in their understanding of the change and possible solutions. When this strategy to recruit sufficient support to dominate a policy venue failed, they assumed a brokerage position to attract similarly minded advocates from networks that address related issues. This forced them into divergent relationships and in a few individual cases resulted in policy learning and defection.

In short, a macro-perspective can help diagnose or predict changes to a learning situation. To gain a first impression, ask:

Can the policy design, socio-economic and/or physical contexts explain who enters or leaves a learning situation?

Can the policy design, socio-economic and/or physical contexts explain which position an individual occupies in a specific learning situation?

Can the policy design, socio-economic and/or physical contexts explain an individual’s level of schooling (ability to reason logically) or their ability to accumulate specialized knowledge?

Take-home lessons

The review has shown that human beings have at any point in time already accumulated some knowledge that is likely to influence how they frame the available evidence defining an issue and its possible solutions. It is evident that among the individuals who negotiate possible solutions, some agree with each other and some do not. This is because nobody knows what the optimal choices are at the time. Most of the time we are forced to decide on and engage in a collective action prior to knowing the final outcome. We do so by responding as best we can to both predictable and unexpected cues from our environment. In other words, no one is an expert but everyone is a learner. Some are in a better position to be more rational than others. Common wisdom implies that those with academic training are predisposed to be the more rational learning agents. The above-cited evidence on dual learning indicates that this is true in situations in which such individuals are presented with a novel problem. It also shows that individuals who have had the opportunity to acquire specialist knowledge through experience can, with little more than basic training, outperform academically trained individuals in familiar settings. Likewise, it is important to look beyond an individual’s qualifications and assess the learning situation in which they are embedded, asking whether it favors inter-coalition or intra-coalition use of knowledge.

Based on our knowledge about what types of learner there are and how they are embedded in the learning situation at a given time, one individual can be distinguished from another. Assuming relatively stable exogenous parameters, and assuming that policymakers have received basic analytical training, we can deduce from the available evidence several tailored lessons about what constrains and what drives stellar learning performances in a political environment.

Scientist

Individuals falling under this label have been observed to take on brokerage roles within a specific network, but rarely engage in one particular process for a long period of time. Consequently, they do not have enough history in common with other network members to ensure that their perspective is heard. One communication barrier is that their sequential reasoning process (Type 2) is slow in comparison to that of advisors, whose specialist knowledge about the issue domain enables them to engage in heuristic reasoning (Type 1), much like specialists. Then again, this open-mindedness allows scientists to detect cognitive conflicts. Thus, scientists benefit the most from random contacts with advocate advisors, who understand their academic culture and maintain links to a coalition’s core membership. Their ability to think critically qualifies them to review an advisor’s action and highlight judgmental errors, whereas their general knowledge allows them to advise advisors on how to address novel problems. Scientists in advocacy positions do not actively engage in a policy process at the time but are still credited with expertise. In contrast to scientists in brokerage positions, they are more likely to be affected through inter- or intra-coalition learning. In an adversarial setting this can undermine their reputation as neutral experts, as policy entrepreneurs target advocates through intra-coalition learning, e.g., referencing their expertise to justify a specific cause and framing them either as an ally or an opponent.

Advisor

Whereas advisors and scientists have a similar academic background, the former have acquired more specialist knowledge through experience. In contrast to specialists, advisors have been trained in critical thinking. Like specialists, they rely on their innate belief structure to quickly arrive at satisfying outcomes. This can result in analytical biases (Type 1), which make it less likely for them to inhabit a brokerage position. These fast and frugal heuristics are not mental flaws, but rather they are necessary tools for good decision making, guiding time-constrained searches among alternatives in uncertain decision situations. The best choice under uncertainty is not the optimal decision. Uncertainty limits optimization. Given the lack of time, the search ends when a satisfactory choice can be made. Heuristics, despite their frugal nature, can be very accurate compared to classical algorithmic computation, because one’s utility is being maximized. The accuracy of an intuitive decision depends on whether the evolved mental habit suits the contextual situation in which the action is being executed. Unlike specialists, advisors find it easier to detect cognitive conflicts, using their advanced logical reasoning skills, and to adapt their routinized decision behavior.

Specialist

Compared to novices, specialists have acquired through experience sufficient knowledge about specific issues and tools to satisfactorily address such issues. Like novices, they lack advanced training in logical reasoning and are likely to rationalize their judgmental errors. Removing specialists from their domain of expertise or radically changing attributes of the domain would render them equal to a novice. As long as the characteristics of the new domain are similar, cues that link the situations can be used to optimize their response. It is thus not surprising to find specialists at a coalition’s core, given their specialist knowledge and social capital for a particular policy context. However, it is expected that specialists are no more than novice advocates when confronted with novel problems and tasks. Specialists use information to rationalize their belief bias. Hence, a direct link between scientists in brokerage positions and entrepreneurs at the core is not beneficial. The latter have to be able to think on their feet and lack the time to review scientific reports, while the former are less invested in a particular cause and more invested in producing verifiable and generalizable material that describes the bigger picture. However, adversarial settings that force entrepreneurs to attract new resources to their cause also create opportunities for them to regularly engage with advisors in an advocate position, maintaining relationships that can foster understanding and trust. This allows advisors to pass on to entrepreneurs tailored cues for analytical reasoning, cues that they deduce from scientific data received through their random connection with brokers.

Novice

Novices are the slowest learners of them all, due to a lack of specific knowledge and novel problem-solving ability. Since they lack the expertise to successfully engage in Type 1 reasoning, they are more inclined to process information analytically. Given their basic academic training and lack of specialist knowledge, they are the most likely to inhabit the position of an advocate. Depending on their social context, they will be more likely to acquire domain-specific or generalizable knowledge. In other words, depending on their education and contact with scientist brokers or specialist entrepreneurs, novice advocates develop a narrow or broad understanding of the policy problem.

Way forward

This discussion is but a first attempt to theoretically distinguish policy learners on the basis of their cognitive ability, cognitive style, and position within a specific learning situation. This review primarily utilizes policy-oriented learning research. Underlying this evidence base is the assumption that subsystems are embedded within a relatively stable political system. Out of this context, brokers can draw resources, such as individuals with the knowledge to facilitate policy learning, in their effort to alter subsystem membership and subsequently the nature of conflict between advocacy coalitions. This is due to the fact that the cited studies are designed to explain policy learning in North America and Western Europe. The discussion lacks evidence that takes into consideration learning situations in which individuals are not exposed to an education that develops logical reasoning skills, but are still exposed to uncertainty and ambiguity. These individuals, including those otherwise categorized as scientists and advisors, are presumably more likely to act on their intuition and less likely to detect and correct cognitive conflicts. Likewise, specialists and advisors require relatively stable settings to develop specialist knowledge. In short, instability creates novices. This statement and its likely effects have yet to be studied. Further research is necessary before any conclusions can be drawn about policy learning in these contexts.

The review has shown that methodologies are available to systematically measure not only individuals’ positions within a specific learning situation but also their cognitive ability and style. It is now a matter of integrating the available methodologies to help further specify what type of policy learner engages in brokerage or entrepreneurial activities in what learning situation. We recommend using these tools to further explore how we can influence individual and collective learning in a policy context.

At the same time, the discussion reminds us to be wary of our direct or indirect effect on the phenomenon studied. Throughout their research, scientists can form long-term partnerships with advisors and specialists, growing to accept judgmental errors and failing to detect analytical biases (Cohen, 2006). It is thus important that their activities are frequently exposed to blind reviews that involve peers who operate within a different academic paradigm (Lakatos, 1970). Equally important is the development of “multi-analytic competencies” (Weible et al., 2012). Hence, we do need the network of scientists who sit in their ivory tower and reflect on the ontological and epistemic beliefs that bias academic advice, as much as we need advisors to channel this advice to specialists and novices in a specific subsystem. Advisors act as an intermediary between the watchful ivory tower and the specialists on the ground, who are the first to know of potential problems. This system of checks and balances depends on a common understanding of concepts, a shared language so to speak. Advisors are in the best position to learn this language due to their education and experiences. So, yes, we all have different learning abilities. However, in a world in which nothing is certain, nobody trumps the other. All types of learners can fill an essential role in problem-solving.

Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.