## Abstract

While judging their sensory environments, decision-makers seem to use the uncertainty about their choices to guide adjustments of their subsequent behaviour. One possible source of these behavioural adjustments is arousal: decision uncertainty might drive the brain’s arousal systems, which control global brain state and might thereby shape subsequent decision-making. Here, we measure pupil diameter, a proxy for central arousal state, in human observers performing a perceptual choice task of varying difficulty. Pupil dilation, after choice but before external feedback, reflects three hallmark signatures of decision uncertainty derived from a computational model. This increase in pupil-linked arousal boosts observers’ tendency to alternate their choice on the subsequent trial. We conclude that decision uncertainty drives rapid changes in pupil-linked arousal state, which shape the serial correlation structure of ongoing choice behaviour.

## Introduction

In perceptual and sensory-motor tasks, humans and animals behave as if they make use of decision uncertainty, that is, the probability that a choice is incorrect, given the sensory evidence^{1,2,3}. Theoretical accounts postulate that decision uncertainty should shape subsequent decision processing and, thereby, subsequent choice behaviour^{1,4,5}. But how decision uncertainty is transformed into subsequent behavioural adjustments has, so far, remained elusive.

One prominent idea is that the brain broadcasts uncertainty signals across brain-wide neural circuits via low-level arousal systems^{4,6,7}. Arousal systems might be driven by uncertainty^{4,7,8,9,10,11}, and they profoundly shape the global state of the brain through the action of modulatory neurotransmitters^{12,13,14}. Uncertainty-dependent changes in global brain state, in turn, might translate into adjustments of choice behaviour. The goal of our study was to investigate whether arousal (1) reflects decision uncertainty in a perceptual choice task; and (2) predicts changes in subsequent choice behaviour.

Changes in central arousal state (as assessed by various measures of cortical dynamics) are tightly coupled to fluctuations in pupil diameter under constant luminance^{13,15,16,17,18}. We here built on this connection and monitored pupil diameter as a proxy for central arousal state. We used a model based on statistical decision theory, illustrated in Fig. 1, in which decision uncertainty is defined as the probability that a choice is incorrect, given the available evidence^{1,19}. This operationalization of decision uncertainty obviates the need for subjective confidence reports^{5}, bridging to the insight from animal physiology that neurons in a number of brain regions encode decision uncertainty, as defined in Fig. 1 (refs 2, 20, 21, 22).

The model assumes that observers base their judgment of each stimulus on a noisy decision variable, sampled from a distribution that depends on the identity and strength of the stimulus (Fig. 1a). Two-alternative forced choice tasks entail comparing this decision variable with a decision bound. When the decision variable happens to fall on the wrong side of the bound, errors occur. This happens more often for weaker stimuli, because the distributions corresponding to the two possible stimuli show higher overlap (Fig. 1b). A monotonic function of the distance between the decision variable and the bound is a metric of decision confidence; uncertainty is its complement^{2,19,23} (Fig. 1a and Methods).

This model predicts three signatures of decision uncertainty^{2,19}: (1) uncertainty decreases with evidence strength for correct choices (blue line in Fig. 1c) but, counter-intuitively, increases with evidence strength for incorrect choices (red line in Fig. 1c); (2) uncertainty predicts a monotonic decrease in choice accuracy from 100 to 50% (Fig. 1d); (3) higher uncertainty predicts lower choice accuracy, even for the same evidence strength (Fig. 1e). The opposite, monotonic scaling of uncertainty with evidence strength for correct and error trials (Fig. 1c) also emerges from a variety of dynamic decision-making models, including race models^{2}, Bayesian attractor models^{24}, and biophysically detailed circuit models of cortical dynamics^{25,26}.

We systematically manipulated the strength of sensory evidence and tested whether pupil responses exhibited the three signatures derived above. We then quantified the predictive effects of pupil-linked arousal on subsequent behaviour in terms of the key elements of the perceptual decision process: response time (RT), perceptual sensitivity, lapse rate, and choice bias. Choice bias was decomposed into an overall bias for one choice, and a serial bias dependent on the history of previous choices or stimuli. We found a predictive effect of pupil-linked arousal responses on serial choice bias.

## Results

### Pupil responses reflect decision uncertainty

Twenty-seven human observers performed a two-interval forced choice visual motion coherence discrimination task (Fig. 2a and Methods). We applied motion energy filtering^{27} to the stochastic random dot motion stimuli, yielding a more fine-grained estimate of the decision-relevant sensory evidence contained in the stochastic stimuli than the nominal level of motion coherence (Fig. 2b,c and Methods). The absolute value of this sensory evidence served as a single-trial measure of evidence strength (Fig. 2b). As expected, stronger evidence yielded higher choice accuracy and faster responses (Fig. 2d and Supplementary Fig. 2a).

In line with previous work^{19}, RT exhibited all three signatures of decision uncertainty derived in Fig. 1 above (Fig. 2e and Supplementary Fig. 1b,c). This was true despite the interrogation protocol^{28}, in which the test stimulus had a fixed duration, its offset prompted the choice, and observers were instructed to maximize accuracy without speed pressure (response deadline was 3 s after test offset). Specifically, RT decreased with evidence strength on correct trials but increased with evidence strength on errors (Fig. 2e). Further, RT predicted accuracy over a wide range, but not below 50% (Supplementary Fig. 1b), indicating that RT reflected decision uncertainty rather than error detection^{2}. We next assessed whether decision uncertainty also affected pupil-linked arousal.

The pupil dilated during decision formation, peaking just after the choice (button press) as observed in previous work^{29}, and then dilated again after feedback (Fig. 3a). Between these two peaks, dilation amplitudes diverged between different conditions, as predicted by decision uncertainty (compare with Fig. 1c): Pupil responses were smallest after correct decisions based on strong evidence, they were overall larger after errors than correct choices, and largest after errors made on trials with strong evidence (Fig. 3a).

To quantify the temporal evolution of uncertainty scaling in the pupil, we regressed baseline-corrected pupil time courses against each trial’s evidence strength, separately for correct and error trials. From choice onwards, pupil dilation scaled positively with evidence strength on error trials, and negatively on correct trials (Fig. 3b,c and Supplementary Fig. 3a). In other words, the scaling of the pupil response with evidence strength diagnostic of decision uncertainty emerged in the interval between choice and feedback. Consequently, this uncertainty scaling was not a response to the external information about choice correctness provided by the external feedback, but rather reflected internal decision-related computations as described in Fig. 1. For simplicity, we refer to the single-trial pupil dilation averaged across the 250 ms interval before feedback as ‘pupil response’ in the following.
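The time-resolved regression described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names (`ols_slope`, `timecourse_betas`), the data layout (one list of baseline-corrected samples per trial) and the toy values are all assumptions made for the example.

```python
def ols_slope(x, y):
    """Slope of an ordinary least-squares fit of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def timecourse_betas(pupil, evidence, correct):
    """Regress pupil size on evidence strength at every time sample,
    separately for correct (True) and error (False) trials."""
    n_samples = len(pupil[0])
    betas = {}
    for outcome in (True, False):
        trials = [i for i, c in enumerate(correct) if c == outcome]
        x = [evidence[i] for i in trials]
        betas[outcome] = [
            ols_slope(x, [pupil[i][t] for i in trials]) for t in range(n_samples)
        ]
    return betas


# Toy data (hypothetical): two samples per trial, with opposite scaling
# on correct and error trials at the second sample.
evidence = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
correct = [True, True, True, False, False, False]
pupil = [[0.0, -0.5 * e] if c else [0.0, 0.5 * e]
         for e, c in zip(evidence, correct)]
betas = timecourse_betas(pupil, evidence, correct)
```

In this toy example the slope at the second sample is negative for correct trials and positive for error trials, which is the sign pattern that the uncertainty model predicts.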

The pupil response also exhibited the other two signatures of decision uncertainty predicted by the model in Fig. 1. Larger pupil responses were accompanied by an overall lower choice accuracy (Fig. 3e and Supplementary Fig. 3c), and psychophysical sensitivity was lower on trials with a larger pupil response (Fig. 3d and Supplementary Fig. 3b). Notably, the pupil response did not predict choice accuracy below 50%, suggesting that it did not signal the detection of errors (Supplementary Fig. 3c).

The scaling of the pupil response with decision uncertainty was not inherited from the analogous scaling of RT, but was also present after first removing (via linear regression) the trial-to-trial variations accounted for by RT (Supplementary Fig. 3d–f). Indeed, trial-to-trial correlations between pupil responses and RTs were generally small (Pearson correlation, average *r*: 0.087, range: −0.042 to 0.302, for log-transformed RT). For all subsequent analyses reported in this paper, we removed RT-fluctuations from the trial-to-trial fluctuations of single-trial pupil responses via linear regression (see Methods).
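Removing RT-related variance from single-trial pupil responses amounts to keeping the residuals of a linear regression. The sketch below illustrates the idea under stated assumptions: the helper names (`residualize`, `pearson_r`), the use of log-transformed RT as the regressor, and the six toy trials are hypothetical and not taken from the study.

```python
import math


def residualize(y, x):
    """Return y with the linear contribution of x removed (OLS residuals)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical single-trial values for one observer.
rt = [0.62, 0.85, 1.10, 0.74, 1.35, 0.93]     # seconds
pupil = [0.10, 0.32, 0.41, 0.05, 0.58, 0.20]  # arbitrary units
log_rt = [math.log(v) for v in rt]

pupil_clean = residualize(pupil, log_rt)
```

By construction, `pupil_clean` is uncorrelated with `log_rt`, so any remaining relationship with behaviour cannot be attributed to a linear effect of RT.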

### Pupil-linked arousal alters subsequent choice behaviour

We proceeded to test whether uncertainty-related pupil responses predicted changes in subsequent choice behaviour. It has been proposed that arousal signals control various aspects of learning and decision-making^{4,6,7,8,11}. In the context of our task, the choice parameters of interest were perceptual sensitivity (measured as the slope of the psychometric function, Supplementary Fig. 4a), lapse rate (measured as the vertical distance of the asymptotes of the psychometric function from 0 or 1, Supplementary Fig. 4a), bias (measured as the horizontal shift of the psychometric function, Supplementary Fig. 4a) and RT. For RT, we focussed on increases after error trials, an effect referred to as post-error slowing^{30}, which was found to be modulated by pupil-linked arousal in a speeded RT task^{31}. Choice bias was further decomposed into two parameters: overall bias (that is, a general tendency towards one choice option, averaged across the entire experiment, Supplementary Fig. 4b) and serial bias (that is, a local, choice history-dependent tendency towards one option that becomes evident when conditioning the psychometric function on the preceding choice, Supplementary Fig. 4c (refs 32, 33, 34)). Because in our task (as common in laboratory choice tasks), the sensory evidence was independent across trials, any serial bias was maladaptive, reducing observers’ performance below the optimum they could achieve given their perceptual sensitivity.
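The three psychometric parameters named above can be made concrete with a small sketch. The cumulative-normal form with symmetric lapses used here is an assumption chosen for illustration (the exact parameterization is not given in this section), and the function name `psychometric` is hypothetical.

```python
import math


def psychometric(x, bias, sensitivity, lapse):
    """Probability of one choice as a function of signed evidence x.

    bias        -- horizontal shift of the curve
    sensitivity -- slope parameter (larger means steeper)
    lapse       -- vertical distance of each asymptote from 0 or 1
    """
    core = 0.5 * (1.0 + math.erf(sensitivity * (x - bias) / math.sqrt(2.0)))
    return lapse + (1.0 - 2.0 * lapse) * core


# The curve crosses 0.5 at x == bias and saturates at lapse and 1 - lapse.
p_mid = psychometric(0.5, bias=0.5, sensitivity=0.4, lapse=0.02)
p_low = psychometric(-100.0, bias=0.5, sensitivity=0.4, lapse=0.02)
p_high = psychometric(100.0, bias=0.5, sensitivity=0.4, lapse=0.02)
```

Serial bias would appear in this picture as a difference in the `bias` parameter between curves fitted separately to trials that follow one choice or the other.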

The pupil response predicted a reduction of serial choice bias (Fig. 4a and Supplementary Fig. 5). When a choice was followed by a small pupil response, observers tended to repeat this choice on the next trial; when the previous pupil response was large, this serial bias was abolished (Fig. 4a). This predictive effect was similar for correct and error trials (Supplementary Fig. 6a). An analogous pattern of predictive effects was observed when binning by previous trial RT: Fast, but not slow, RTs were followed by a tendency to repeat the previous choice (Fig. 4f and Supplementary Fig. 6b).

The pupil response predicted none of the other choice parameters on the next trial (assessed by one-way repeated-measures analysis of variance (ANOVA)), neither overall choice bias (signed overall bias: F_{(2,52)}=0.939, *P*=0.398, Bf_{10}=0.221; absolute value of overall bias: F_{(2,52)}=1.817, *P*=0.173, Fig. 4b), nor perceptual sensitivity (F_{(2,52)}=1.936, *P*=0.155, Fig. 4c), nor lapse rate (F_{(2,52)}=2.213, *P*=0.120, Fig. 4d), nor RT (overall RT: F_{(2,52)}=3.232, *P*=0.048, Bf_{10}=1.207; post-error slowing: F_{(2,52)}=2.056, *P*=0.138, Fig. 4e). Variations in RT, likewise, did not predict a change in any of the other parameters of the decision process (Fig. 4g–j, all *P*>0.05). The overall pattern of results implies that observers did not simply act more randomly after large pupil responses or long RTs. Random button presses would have reduced sensitivity, in other words, decreased the slope of the psychometric function, contrary to our observations (Fig. 4c,h). Rather, the pattern of results implies that, after large pupil responses or long RTs, observers’ tendency towards one or the other choice became less history-dependent.

In sum, large pupil responses and slow RTs were neither followed by improved processing of sensory evidence (a common effect of attention^{35}), nor a change in overall response bias. Large pupil responses and slow RTs were followed by only minor (and statistically not significant) changes in stimulus-independent lapses as well as small adjustments in speed-accuracy trade-off, as observed after response conflict, errors, or large pupil responses in speeded RT tasks^{31,36,37}. The weak effect on post-error slowing might be due to the use of an interrogation protocol in our study, which did not require observers to optimize their speed-accuracy trade-off^{28}. However, both RT and pupil-linked arousal had a robust effect on serial choice bias, reducing an overall repetition bias that predominated across the group of observers. This effect of both uncertainty-related measures on the serial correlation structure of choice behaviour has so far been unknown. We therefore proceeded to model and comprehensively quantify this effect at the level of individual observers.

### Pupil-linked arousal predicts choice alternation

To this end, we extended a previously established regression model of serial choice biases^{33} with pupil- and RT-dependent modulatory effects. The basic model (that is, without modulatory terms) quantified the impact of the previous seven choices and stimuli on the current choice bias in terms of linear combination weights (Fig. 5a, see Methods and ref. 33). We added to this model multiplicative interaction terms that quantified how much the effect of previous stimuli and choices was modulated by either pupil response or RT on those same trials (Fig. 5a). Simultaneously modelling the effects of both pupil responses and RT enabled us to estimate their independent impact on serial choice bias; we found the same results when fitting a separate regression model for each modulatory variable (Supplementary Fig. 7).
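The structure of such a history model with multiplicative modulation terms can be sketched as below. This is a simplified illustration, not the model of ref. 33: it assumes a plain logistic link, codes choices and stimuli as −1/+1, and treats pupil response and RT as z-scored covariates. All names (`p_choose_positive`, the keys of `weights`) are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def p_choose_positive(evidence, history, weights):
    """Probability of the +1 choice on the current trial.

    evidence -- signed sensory evidence on the current trial
    history  -- preceding trials, most recent first; each a dict with
                'choice' and 'stimulus' (-1 or +1), 'pupil' and 'rt' (z-scores)
    weights  -- dict with scalar 'bias' and 'sensitivity', and per-lag lists
                'choice', 'stimulus', 'pupil_x_choice', 'rt_x_choice',
                'pupil_x_stimulus', 'rt_x_stimulus'
    """
    z = weights["bias"] + weights["sensitivity"] * evidence
    for lag, past in enumerate(history):
        for event in ("choice", "stimulus"):
            # Main effect of the past event, scaled up or down by the
            # pupil response and RT measured on that same past trial.
            gain = (weights[event][lag]
                    + weights["pupil_x_" + event][lag] * past["pupil"]
                    + weights["rt_x_" + event][lag] * past["rt"])
            z += gain * past[event]
    return logistic(z)


# One-lag example with a repetition bias and a negative pupil modulation.
w = {"bias": 0.0, "sensitivity": 1.0,
     "choice": [0.3], "stimulus": [0.0],
     "pupil_x_choice": [-0.2], "rt_x_choice": [0.0],
     "pupil_x_stimulus": [0.0], "rt_x_stimulus": [0.0]}
prev_small = [{"choice": 1, "stimulus": 1, "pupil": -1.5, "rt": 0.0}]
prev_large = [{"choice": 1, "stimulus": 1, "pupil": 1.5, "rt": 0.0}]
p_after_small = p_choose_positive(0.0, prev_small, w)
p_after_large = p_choose_positive(0.0, prev_large, w)
```

With these illustrative weights, a small previous pupil response leaves a clear tendency to repeat the previous choice, whereas a large one brings the repetition probability back to about 0.5.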

The model fits revealed robust, and idiosyncratic, patterns of serial choice biases in most observers (Fig. 5c,d; see Supplementary Fig. 2b,c for individual sessions). As expected, the contribution of past stimuli and choices to current behaviour was strongest when sensory evidence was weak and decayed strongly with evidence strength (Fig. 5b). The weight of the immediately preceding choice was generally stronger than the weight of the previous stimulus (Fig. 5d). The effect of previous choices lasted up to seven trials in the past (corresponding to about 60 s, Fig. 5c), but had the largest absolute magnitude on the preceding trial (Fig. 5c, grey dashed line). There was large inter-individual variability in choice weights (Fig. 5c,d). While the majority of observers systematically repeated their choices (purple symbols; 12 significant at *P*<0.05), a good fraction tended to alternate their choices (orange symbols; 7 significant at *P*<0.05).

Observers’ serial choice biases were unrelated to the (small) serial correlations between stimuli. The transition probabilities between stimulus categories (that is, s2>s1 or s2<s1) were close to 0.5 (range across observers: 0.475–0.508), and did not correlate with individual choice weights (Pearson correlation *r*=0.010, *P*=0.960, Bf_{10}=0.149) or stimulus weights (Pearson correlation *r*=−0.176, *P*=0.381, Bf_{10}=0.217). Likewise, the auto-correlation of absolute motion coherence differences (that is, absolute levels of evidence strength) was close to 0 (range across observers: −0.061 to 0.028) and did not correlate with individual choice weights (Pearson correlation *r*=0.123, *P*=0.541, Bf_{10}=0.179) or stimulus weights (Pearson correlation *r*=−0.142, *P*=0.480, Bf_{10}=0.190).
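Both sequence statistics mentioned here are simple to compute. In the sketch below, "transition probability" is taken to mean the probability that the stimulus category repeats from one trial to the next, which is an assumption about the exact definition; the function names and toy sequences are hypothetical.

```python
def repetition_probability(categories):
    """Fraction of trial-to-trial transitions on which the category repeats."""
    pairs = list(zip(categories, categories[1:]))
    return sum(1 for a, b in pairs if a == b) / len(pairs)


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a sequence around its mean."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


# Toy sequences: +1 codes s2>s1, -1 codes s2<s1.
balanced = [1, 1, -1, -1, 1]
alternating = [1, -1, 1, -1]
p_balanced = repetition_probability(balanced)
p_alternating = repetition_probability(alternating)
ac_strength = lag1_autocorrelation([1.0, 2.0, 1.0, 2.0, 1.0, 2.0])
```

A value of `repetition_probability` near 0.5 and a lag-1 autocorrelation near 0 indicate that the stimulus sequence itself offers little basis for a serial bias.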

Critically, pupil responses and RT both negatively interacted with the effect of previous choices (Fig. 5e), in line with the observation that large pupil responses or long RTs were followed by less choice repetition (Fig. 4a,f). By contrast, neither pupil responses nor RT interacted with the effect of the previous stimulus (Fig. 5e). Pupil responses beyond one trial in the past, as well as baseline pupil diameter on the current trial, did not predict a modulation of serial biases (Supplementary Fig. 8). Moreover, these results were not accounted for by trial-to-trial variations in trial timing or the passage of time between trials (Supplementary Fig. 9).

The pupil response after feedback did not contain information predictive of serial choice bias, beyond the information already present during the pre-feedback interval. The post-feedback pupil responses similarly predicted modulation of serial choice biases, but no longer did so when removing (via linear regression) variance explained by pre-feedback pupil responses from the post-feedback pupil signal (Supplementary Fig. 10).

While the modulatory effects associated with pupil responses and RT were both negative on average, such an overall reduction of the group-level repetition bias (Fig. 4a,f) might be due to two alternative scenarios at the level of individual observers: either a reduction of each observer’s intrinsic serial choice bias for repetition or alternation (referred to as ‘bias reduction’ hereafter); or, alternatively, a general boost of choice alternation, regardless of the observer’s intrinsic serial bias (referred to as ‘alternation boost’). We quantified intrinsic serial bias as each observer’s choice weight (that is, the main effect of the previous on the current choice estimated by our model). The bias reduction scenario predicts a negative correlation between choice weights and modulation weights across observers. The alternation boost scenario predicts negative individual modulation weights for all observers, independently of their corresponding choice weights (that is, no correlation).
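The two scenarios can be contrasted with a toy calculation of the effective choice weight, here assumed to be the choice weight plus the modulation weight times a z-scored signal (pupil response or RT). The function name and all numerical values are hypothetical and serve only to illustrate the logic.

```python
def effective_choice_weight(choice_weight, modulation_weight, signal_z):
    """Weight of the previous choice after modulation by a z-scored signal."""
    return choice_weight + modulation_weight * signal_z


repeater, alternator = 0.2, -0.2  # intrinsic choice weights (toy values)

# Bias reduction: the modulation weight opposes each observer's own bias,
# so modulation and choice weights are negatively correlated.
reduced = [effective_choice_weight(w, -0.5 * w, 1.0)
           for w in (repeater, alternator)]

# Alternation boost: the same negative modulation weight for everyone,
# independent of the intrinsic bias.
boosted = [effective_choice_weight(w, -0.1, 1.0)
           for w in (repeater, alternator)]
```

Under bias reduction both observers move towards zero, whereas under the alternation boost the alternator becomes an even stronger alternator.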

The analysis of these individual behavioural patterns revealed dissociable effects of pupil-linked arousal and RT (Fig. 5f,g). Modulation weights for the pupil were negative for most observers, irrespective of their individual choice weight: we found no correlation between individual modulation and choice weights (Fig. 5f, Pearson correlation *r*=0.017, *P*=0.935, Bf_{10}=0.149). Further, when splitting all 27 observers into ‘alternators’ and ‘repeaters’ based on the sign of their intrinsic bias (that is, choice weight), the modulation weights were negative for both subgroups, and not significantly different between them (Fig. 5g). These observations are consistent with the idea that pupil-linked arousal generally boosted observers’ tendency to alternate their choice on the next trial.

The alternation boost scenario for pupil responses was further supported by a striking contrast to RT-linked modulations, which were in line with the bias reduction scenario. The RT-linked modulation weights exhibited a strong negative correlation with individual choice weights (Fig. 5f, Pearson correlation *r*=−0.634, *P*<0.001, Bf_{10}=76.359), were negative only for the group of repeaters, and differed significantly between alternators and repeaters (Fig. 5g). Correspondingly, the correlations with individual choice weights were significantly different for pupil- and RT-modulation weights (Fig. 5f). Moreover, RT-dependent bias reduction was most pronounced after incorrect choices, whereas the pupil-dependent alternation boost was most pronounced after correct choices (Supplementary Fig. 11).

In sum, the modulatory effects associated with post-decision pupil-linked arousal and RT both shaped the serial correlation structure of choices, but in distinct ways: pupil-linked arousal generally promoted choice alternation, regardless of the observer’s intrinsic bias, whereas RT-linked processes generally reduced observers’ intrinsic bias.

## Discussion

Decisions about an observer’s sensory environment do not only depend on the momentary sensory input but also on the behavioural context^{38}. One such contextual factor is the history of preceding choices and stimuli, which robustly biases even highly trained decision-makers^{33}. Although such serial choice biases were first identified in psychophysical tasks about a century ago^{32}, their determinants have remained poorly understood. Previous treatments of serial choice biases have conceptualized experimental history as sequences of binary external events (past stimulus identities, choices, or feedback)^{33,39}. We here established that these serial biases were also modulated by the decision-maker’s pupil-linked arousal state on the previous trial, which, in turn, reflected the uncertainty about the observer’s choice.

Several important features of our approach allowed us to move beyond previous work linking human pupil dynamics to uncertainty and performance monitoring. First, different from most previous studies, we here unravelled the temporal evolution of uncertainty information in the pupil response, enabling inferences about not only the existence, but also the time course of this information (see ref. 40 for a similar approach). Second, the model-based definition of decision uncertainty we used helped dissociate decision uncertainty from error detection, which has previously been linked to pupil dilation^{41}. In a two-choice task, a signal encoding decision uncertainty should predict behavioural performance over a range from 100% to 50% correct (corresponding to 50% for the maximum uncertainty signal, or larger when encoding is imprecise). By contrast, an error detection signal should predict performance over the range 100% to 0% correct^{2}. Our measurements were more consistent with decision uncertainty than error detection (Supplementary Fig. 3c). Third, in our task, decision uncertainty critically depended on internal noise (the primary source of the variance in Fig. 1a). By contrast, previous studies linking uncertainty to pupil dynamics^{9,10,40,42} have used tasks in which the primary source of uncertainty was in the observers’ environment. Last, in contrast to most previous pupillometry studies^{29,42,43} we comprehensively quantified the predictive effects of pupil-linked arousal on the parameters of choice beyond the current trial, thereby complementing recent work on the effects of pupil-linked arousal on learning^{9,40}. Taken together, our results critically advance the understanding of how internal decision uncertainty is encoded in pupil-linked arousal in humans, in a way that builds a direct bridge to single-unit recording studies of decision uncertainty in animals^{2,20,21,22}.

The neural sources of task-evoked pupil responses at constant luminance are not yet fully identified^{44}, but mounting evidence points to the noradrenergic locus coeruleus (LC)^{45,46,47} (a core component of the brain’s arousal system^{11}) as well as the superior and inferior colliculi^{48}. Microstimulation of all three structures triggers pupil dilation^{45}. Among these structures, activity of the LC (spontaneous or evoked by electrical stimulation) is followed by pupil dilation at the shortest latency^{45}. The LC also has widespread, modulatory projections to the cortex implicated in regulating central arousal^{11}. Dopaminergic and cholinergic systems, which are closely connected with the LC^{49}, are likewise implicated in central arousal state^{13} and may also contribute to task-evoked pupil responses.

We propose that decision-makers’ uncertainty about their choices might shape serial choice biases by recruiting pupil-linked neuromodulatory systems. Frontal brain regions encoding decision uncertainty send descending projections to several of these systems^{11,49}, which in turn project to large parts of the cortex, including networks of regions involved in perceptual inference and decision-making^{50}. Neuromodulators like noradrenaline can profoundly alter the dynamics and topology of cortical networks^{13,15,51,52}. Thus, these brainstem arousal systems might be in an ideal position to transform variations in decision uncertainty into adjustments of choice behaviour^{4,7}.

The behavioural effect of pupil-linked arousal might be explained by at least two (not mutually exclusive) scenarios. First, arousal responses might promote choice alternation at the level of response preparation, by altering the state of the motor system^{53}. Second, the arousal response might modulate the decision stage—specifically the dynamic updating of beliefs about the upcoming evidence, for example by shifting the criterion (assumed to be constant in signal detection theory, Fig. 1) from one choice to the next. When this criterion is shifted in the direction opposite to the last choice, alternation ensues. In line with these ideas, changes in pupil-linked arousal state can indeed translate into specific behavioural effects^{15,29}, presumably by interacting with selective cortical circuitry^{54}.

Our current observations are not easily reconciled with existing theoretical accounts of the impact of phasic arousal on decision-making. One account posits that threshold crossing of the decision variable triggers phasic noradrenaline release, facilitating the translation of the decision into a behavioural response^{11}. In contrast to our observations, this framework focuses on functional effects of phasic arousal within the same trial, rather than subsequent ones, and it predicts improvements in sensitivity and/or RT^{55}, rather than changes in bias. Other accounts have proposed that phasic noradrenaline release facilitates a ‘network reset’^{56}, enabling the transition of neural decision circuits to a new state^{8}. Our group-level finding that high pupil-linked arousal reduces serial biases might be interpreted as the discarding of post-decisional activity traces due to network reset^{57,58}. However, our analysis of individual choice patterns revealed that pupil-linked arousal boosted alternation also in those observers who already exhibited a tendency to alternate their choices, which is not easily reconciled with the network reset idea.

Previous theories of arousal and neuromodulation have coarsely distinguished between two timescales of arousal fluctuations: tonic fluctuations over the course of seconds to hours, and phasic responses on a sub-second timescale, time-locked to rapid cognitive acts^{7,8,11}. Changes in tonic arousal occur spontaneously^{13,59}, and might also track changes in task utility or uncertainty^{7,9,10,11}. Pupil-linked changes in tonic arousal strongly shape the operating mode of cortical circuits, including early sensory cortices, on slow timescales^{13}. Phasic pupil-linked arousal responses, on the other hand, predict behaviour related to the same transient cognitive processes that drive them^{29,42,60}. The uncertainty-linked pupil responses we identified here built up slowly after choice and predicted choice behaviour several seconds later. Thus, our current results suggest that pupil-linked arousal systems are driven by, and interact with, cognitive processes also at intermediate timescales; faster than tonic arousal, but more sustained than task-evoked phasic responses.

The dissociation between pupil- and RT-linked modulatory effects (Fig. 5f and Supplementary Fig. 11) on serial choice bias suggests that decision uncertainty signals were propagated along distinct central neural pathways, one linked to pupil responses and the other to RT, which then shaped serial choice biases in different ways. Even if the same uncertainty signals fed into these pathways, they might have become decoupled through independent internal noise. Specifically, it is tempting to speculate that the pupil-linked alternation boost reflected neuromodulator release from brainstem centres (such as noradrenaline from the LC^{58}), whereas RT-linked bias reduction was driven by frontal cortical areas involved in explicit performance monitoring and top-down control (such as anterior cingulate cortex)^{36,61,62}. Top-down effects of prefrontal cortex on decision-making^{36,63} are commonly associated with explicit strategic effects that are adaptive within the experimental task. Indeed, the RT-linked modulation of serial bias was adaptive, in that it generally reduced observers’ intrinsic serial bias. By contrast, pupil-linked arousal modulated serial choice patterns in a way that was maladaptive for a subset of the observers (the alternators). This finding might be related to the observation that maladaptive serial choice biases remain prevalent even in highly trained observers who know the statistics of the task^{32,33}. Taken together, the dissociation between pupil- and RT-linked effects suggests that serial choice biases result from a complex interplay between low-level, pupil-linked arousal systems and higher-level systems for strategic control. Future studies should pinpoint the neural systems underlying these distinct effects, as well as their interactions^{58}.

In conclusion, our study identified decision uncertainty as a high-level driver of phasic arousal, and it uncovered a role of this pupil-linked arousal response in shaping the dynamics of serial choice biases—a pervasive but often ignored characteristic of human decision-making. These insights shed new light on the link between decision uncertainty, pupil-linked arousal state, and serial dependencies in decision-making. They set the stage for further investigations into the neural bases of arousal-dependent modulations of serial choice behaviour.

## Methods

### Operationalizing decision uncertainty

In signal detection theory, a decision variable dv is drawn on each trial from a normal distribution *N*(*μ*, *σ*), with *μ* corresponding to that trial’s sensory evidence and *σ* reflecting the internal noise. In Fig. 1, we used the range of single-trial motion energy values [−6, 6] as our *μ*. We estimated *σ* from the data using a probit psychometric function fit on data combined across observers. The probit slope was *β*=0.367, and its inverse, *σ*=2.723, reflected the standard deviation of the distribution. The decision bound was set to *c*=0, reflecting an observer without overall choice bias. The two pairs of distributions in Fig. 1 were generated using *μ*=−1 and *μ*=1 for weak evidence, and *μ*=−4 and *μ*=4 for strong evidence. To calculate the relationship between evidence strength and decision uncertainty (Fig. 1c), we simulated a normal distribution of dv for each level of evidence strength, with *μ* in the range [0, 6] and *σ*=2.723. Since these uncertainty computations are symmetrical with respect to choice identity, we visualized only the pattern corresponding to *μ*>0 (stimulus B in Fig. 1a). All samples from such a distribution were split into correct and error parts based on their position with respect to the decision bound *c*. For each combination of evidence strength and choice, the average uncertainty level is

$$\mathrm{uncertainty} = 1 - \frac{1}{n}\sum_{i=1}^{n} f\left(\left|\mathrm{dv}_i - c\right|\right)$$

where *n* is the number of samples in that part and *f* is the cumulative distribution function of the normal distribution

$$f(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sigma\sqrt{2}}\right)\right]$$

which transforms the distance between dv and *c* into the probability of a correct response^{21}.

We simulated ten million trials based on the range of evidence in the data, and for each we computed a binary choice, the corresponding level of decision uncertainty, and the accuracy of the choice. Figure 1c–e visualizes the relationship between evidence strength, uncertainty and choice accuracy in these simulated data.
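The simulation can be approximated with a few lines of standard-library Python. This is a reduced sketch under stated assumptions, not the original simulation code: it uses 50,000 trials per evidence level instead of ten million, takes the decision bound to be zero, computes uncertainty as one minus the normal cumulative distribution function of the distance to the bound, and uses hypothetical function names throughout.

```python
import math
import random

SIGMA = 2.723  # internal noise (standard deviation) reported in the text
BOUND = 0.0    # decision bound c; zero is assumed for an unbiased observer


def p_correct(distance, sigma=SIGMA):
    """Normal CDF mapping the distance |dv - c| onto P(correct)."""
    return 0.5 * (1.0 + math.erf(distance / (sigma * math.sqrt(2.0))))


def simulate_trial(mu, rng):
    """Return (correct, uncertainty) for one trial with evidence mu > 0."""
    dv = rng.gauss(mu, SIGMA)
    correct = dv > BOUND  # for mu > 0 the correct choice lies above the bound
    uncertainty = 1.0 - p_correct(abs(dv - BOUND))
    return correct, uncertainty


def mean_uncertainty(mu, n_trials=50_000, seed=0):
    """Average uncertainty for correct (True) and error (False) trials."""
    rng = random.Random(seed)
    sums = {True: 0.0, False: 0.0}
    counts = {True: 0, False: 0}
    for _ in range(n_trials):
        correct, uncertainty = simulate_trial(mu, rng)
        sums[correct] += uncertainty
        counts[correct] += 1
    return {k: sums[k] / counts[k] for k in sums if counts[k] > 0}


weak = mean_uncertainty(1.0)    # weak evidence, as in Fig. 1 (mu = 1)
strong = mean_uncertainty(4.0)  # strong evidence, as in Fig. 1 (mu = 4)
```

Comparing `weak` and `strong` reproduces the first signature: average uncertainty is lower for strong than for weak evidence on correct trials, and higher on error trials.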

### Participants and sample size

Twenty-seven healthy human observers (10 male, aged 23±5.2 years) participated in the study. The ethics committee at the University of Amsterdam approved the study, and all observers gave their informed consent. We included all observers in all analyses presented in the paper. Each observer participated in one practice session and five main experimental sessions, each of approximately two hours and comprising 500 trials of the task. The number of observers was selected to allow for robust analyses of individual differences, as in previous pupillometry work from our laboratory^{29}, and the total number of trials per observer was chosen to allow for robust psychometric function fits and detection of subtle changes in the fit parameters.

### Task and procedure

Observers performed a two-interval forced choice motion coherence discrimination task at constant luminance (Fig. 2a). Observers judged the difference in motion coherence between two successively presented random dot kinematograms (RDKs): a constant reference stimulus (70% motion coherence) and a test stimulus (varying motion coherence levels specified below). The intervals before, in between, and after (until the inter-trial interval) these two task-relevant stimuli had variable duration (numbers in Fig. 2a) and contained incoherent motion. A beep (50 ms, 440 Hz) indicated the onset of each (test and reference) stimulus. After offset of the test stimulus, observers had 3 s to report their judgment (button press with left or right index finger, counterbalanced across observers). After a variable interval (1.5–2.5 s), a feedback tone was played (150 ms, 880 or 200 Hz, feedback-tone mapping counterbalanced across observers). Dot motion was stopped 2–2.5 s after feedback, with stationary dots indicating the inter-trial interval, during which observers were allowed to blink their eyes. Observers self-initiated the next trial by button press (range of median inter-trial intervals across observers: 0.68–2.05 s).

The difference between motion coherence of test and reference was taken from three sets: easy (2.5, 5, 10, 20, 30), medium (1.25, 2.5, 5, 10, 30) and hard (0.625, 1.25, 2.5, 5, 20). All observers started with the easy set. We switched to the medium set when their psychophysical thresholds (70% accuracy defined by a cumulative Weibull fit) dropped below 15%, and to the hard set when thresholds dropped below 10%, in a given session.

Motion coherence differences were randomly shuffled within each block. We applied a counterbalancing scheme ensuring that within a block, each stimulus category (s2 > s1 or s2 < s1) was followed by itself or its opposite equally often^{64}. The algorithm generated sequences of 53 trials, of which the first 50 were used per block.

### Random dot kinematograms

Stimuli were generated using PsychToolbox-3 (ref. 65) and presented on a 22′′ CRT monitor with a resolution of 1024 × 768 pixels and a refresh rate of 60 Hz. A red ‘bulls-eye’ fixation target^{66} of 1.5° diameter was present in the centre of the screen. Dynamic random noise was presented in a central annulus (outer radius 12°, inner radius 2°) around fixation. The annulus was defined by a field of dots with a density of 1.7 dots per degree^{2}, resulting in 768 dots on the screen in any given frame. Dots were 0.2° in diameter and had 100% contrast against the black screen background. All dots were divided into ‘signal dots’ and ‘noise dots’, whose proportions defined the varying motion coherence levels. Signal dots were randomly selected on each frame, and moved at 11.5° s^{−1} in one of four diagonal directions (counterbalanced across observers). Signal dots that left the annulus wrapped around and reappeared on the other side. Signal dots had a limited ‘lifetime’ and were re-plotted in a random location after being on the screen for four consecutive frames. Noise dots were assigned a random location within the annulus on each frame, resulting in ‘random position’ noise with a ‘different’ rule^{67}. Three independent motion sequences were interleaved^{68}, making the effective speed of signal dots in the display 3.8° s^{−1}.

### Motion energy filtering

Due to the stochastic nature of the dynamic RDKs, the sensory evidence fluctuated within and across trials, around the nominal motion coherence level. To quantify behaviour and pupil responses as a function of the actual, rather than the nominal, single-trial evidence, we used motion energy filtering to estimate those fluctuations^{27}.

Two spatial filters, resembling weighted sinusoids in opposite phase, were defined by

$$f_1(x, y) = \cos^4(\alpha)\,\cos(4\alpha)\,\exp\left(-\frac{y^2}{2\sigma_g^2}\right) \qquad (3)$$

$$f_2(x, y) = \cos^4(\alpha)\,\sin(4\alpha)\,\exp\left(-\frac{y^2}{2\sigma_g^2}\right) \qquad (4)$$

where *α*=tan^{−1}(*x*/*σ*_{c}). The parameters *σ*_{c}=0.35 and *σ*_{g}=0.05 defined the carrier sinusoid and the Gaussian envelope, respectively, in line with the response properties of MT neurons^{69}. The coordinate system (*x*, *y*) was rotated to match the stimulus’ target direction or its 180° opposite. Two temporal filters were defined by

$$g_1(t) = (kt)^{n_s}\exp(-kt)\left[\frac{1}{n_s!} - \frac{(kt)^2}{(n_s+2)!}\right] \qquad (5)$$

$$g_2(t) = (kt)^{n_f}\exp(-kt)\left[\frac{1}{n_f!} - \frac{(kt)^2}{(n_f+2)!}\right] \qquad (6)$$

where *k*=60 reflected the envelope of the temporal filters, and *n*_{s}=3 and *n*_{f}=5 controlled the width of the slow and fast filters, respectively^{69}. A pair of spatio-temporal filters in quadrature was obtained by *f*_{1}*g*_{1}+*f*_{2}*g*_{2} and *f*_{2}*g*_{1}–*f*_{1}*g*_{2}. We convolved each filter with the single-trial random dot movies. The resulting values were squared, and summed together across the two filters^{27}.

This filtering procedure was performed for each observer’s individual target direction as well as its 180° opposite. To avoid cardinal biases in motion perception, we used the four diagonals as target directions counterbalanced across observers. Outputs of the two filtering operations were subtracted to yield a direction-selective signal over time^{69}. To obtain a single measure of sensory evidence per trial, we averaged over all time points within each stimulus interval, and took the difference between motion energy in the first and second interval as our measure of single-trial sensory evidence. Evidence strength was defined by taking the absolute value of this sensory evidence, collapsing over the two stimulus identities (Fig. 2b).
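To make the filter construction concrete, the sketch below evaluates the spatial and temporal filters at a single point and combines them into a quadrature pair. It is an illustration, not the original analysis code. It assumes the functional forms commonly used for such filters (ref. 69): a cos^4 window around a sinusoidal carrier with a Gaussian envelope in space, and a difference of two gamma-like terms in time. Only the parameter values come from the text; the function names are hypothetical, and the convolution with the dot movies is omitted.

```python
import math

SIGMA_C, SIGMA_G = 0.35, 0.05   # spatial parameters from the text
K, N_SLOW, N_FAST = 60.0, 3, 5  # temporal parameters from the text


def spatial_filters(x, y):
    """Return (f1, f2), assumed to be an even/odd pair of spatial filters."""
    alpha = math.atan(x / SIGMA_C)
    window = math.cos(alpha) ** 4 * math.exp(-y * y / (2 * SIGMA_G ** 2))
    return window * math.cos(4 * alpha), window * math.sin(4 * alpha)


def temporal_filter(t, n):
    """Assumed temporal impulse response; n sets the filter width."""
    kt = K * t
    return kt ** n * math.exp(-kt) * (
        1 / math.factorial(n) - kt ** 2 / math.factorial(n + 2)
    )


def quadrature_pair(x, y, t):
    """Combine filters as f1*g1 + f2*g2 and f2*g1 - f1*g2 (from the text)."""
    f1, f2 = spatial_filters(x, y)
    g1, g2 = temporal_filter(t, N_SLOW), temporal_filter(t, N_FAST)
    return f1 * g1 + f2 * g2, f2 * g1 - f1 * g2


def motion_energy(response_a, response_b):
    """Square the two filter responses and sum them at each time point."""
    return [a * a + b * b for a, b in zip(response_a, response_b)]
```

Squaring and summing the two quadrature responses gives an output that does not depend on the phase of the stimulus, so that energy computed for the target direction and its opposite can be subtracted.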

### Pupillometry

Observers sat in a dark room with their head in a chinrest at 50 cm from the screen. Horizontal and vertical gaze position, as well as the area of the pupil, were monitored in the left eye using an EyeLink 1000 desktop mount (SR Research, sampling rate: 1,000 Hz). The eye tracker was calibrated before each block of 50 trials.

Missing data and blinks, as detected by the EyeLink software, were padded by 150 ms and linearly interpolated. Additional blinks were found using peak detection on the velocity of the pupil signal and linearly interpolated. We estimated the effect of blinks and saccades on the pupil response through deconvolution, and removed these responses from the data by linear regression, following the procedure detailed in ref. 70. The residual pupil time series were bandpass filtered using a 0.01–10 Hz second-order Butterworth filter, *z*-scored per run, and resampled to 100 Hz. We epoched trials, and baseline corrected each trial by subtracting the mean pupil diameter 500 ms before onset of the reference stimulus.
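The blink handling and baseline correction can be illustrated with a simplified sketch. It covers only padding, linear interpolation and baseline subtraction; the deconvolution, filtering, *z*-scoring and resampling steps are left out. The function names are hypothetical, and a padding of 150 samples at 1,000 Hz is assumed to correspond to the 150 ms mentioned above.

```python
def pad_mask(missing, pad):
    """Extend every missing sample by `pad` samples on both sides."""
    n = len(missing)
    padded = [False] * n
    for i, is_missing in enumerate(missing):
        if is_missing:
            for j in range(max(0, i - pad), min(n, i + pad + 1)):
                padded[j] = True
    return padded


def interpolate(signal, missing):
    """Linearly interpolate across runs of samples flagged as missing."""
    out = list(signal)
    n = len(out)
    i = 0
    while i < n:
        if not missing[i]:
            i += 1
            continue
        start = i
        while i < n and missing[i]:
            i += 1
        end = i  # index of the first valid sample after the gap, or n
        left = out[start - 1] if start > 0 else None
        right = out[end] if end < n else None
        if left is None and right is None:
            break  # the whole signal is missing; nothing to anchor on
        left = right if left is None else left
        right = left if right is None else right
        span = end - start + 1
        for k in range(start, end):
            out[k] = left + (k - start + 1) / span * (right - left)
    return out


def baseline_correct(trial, onset_index, n_baseline):
    """Subtract the mean of the n_baseline samples preceding onset_index."""
    window = trial[onset_index - n_baseline:onset_index]
    baseline = sum(window) / len(window)
    return [sample - baseline for sample in trial]
```

At a sampling rate of 1,000 Hz, `pad_mask(mask, 150)` followed by `interpolate(pupil, padded_mask)` would approximate the first step; gaps at the edges of a recording are filled with the nearest valid value.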

We included all trials from all five main sessions (that is, excluding the practice session) in the analyses reported in this paper. The time series of consecutive trial-wise stimuli, choices, RTs and pupil responses was necessary for the regression model of serial bias modulation. Observers were well-practiced in the task structure after the practice session. As a consequence, they made few blinks during the trial intervals (on average across observers, only 7.7% of trials contained more than 50% interpolated samples). The percentage of interpolated trials did not correlate with the estimated effect of pupil responses on serial choice bias (*r*=−0.268, *P*=0.175, Bf_{10}=0.369).

### Quantifying pupil time courses and single-trial responses

To characterize the time-course of uncertainty encoding in the pupil response, we regressed across-trial evidence strength onto each sample of the baseline-corrected pupil signal, separately for correct and error trials (Fig. 3b). The design matrix for this regression also included an intercept and three nuisance covariates: (i) log-transformed RTs; (ii) sample-by-sample horizontal gaze coordinates; and (iii) sample-by-sample vertical gaze coordinates. We tested the significance of this regression time course using cluster-based permutation statistics^{71}.

We took the mean baseline-corrected pupil signal during the 250 ms before feedback delivery as our single-trial measure of pupil response. Because of the temporal low-pass characteristics of the sluggish peripheral pupil apparatus^{72}, trial-to-trial variations in RT can cause trial-to-trial variations in pupil responses, even in the absence of amplitude variations in the underlying neural responses. To specifically isolate trial-to-trial variations in the amplitude (not duration) of the underlying neural responses, we removed components explained by RT via linear regression

$$\mathbf{y}' = \mathbf{y} - (\mathbf{r}^{T}\mathbf{y})\,\mathbf{r} \qquad (7)$$

where **y** was the original vector of pupil responses, **r** was the vector of the corresponding single-trial RTs (log-transformed and normalized to a unit vector), and *T* denoted matrix transpose. The residual **y**′ thus reflected pupil responses, after removing variance explained by trial-by-trial RTs. This residual pupil response was used for all analyses reported in the main text.
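Removing the RT component amounts to projecting the pupil vector onto the unit-length RT vector and subtracting that projection. The sketch below assumes exactly this reading and assumes that the log-transformed RTs are not mean-centred before normalization, which the text does not specify. The function name and example values are illustrative only.

```python
import math


def remove_rt_component(pupil, rt):
    """Return pupil responses with the log-RT projection removed, plus r."""
    log_rt = [math.log(value) for value in rt]
    norm = math.sqrt(sum(value * value for value in log_rt))
    r = [value / norm for value in log_rt]           # unit vector
    projection = sum(a * b for a, b in zip(r, pupil))  # scalar r^T y
    residual = [y - projection * a for y, a in zip(pupil, r)]
    return residual, r


# Hypothetical example values (pupil in z-scored units, RT in seconds)
pupil = [0.4, 1.1, -0.3, 0.8, 0.2]
rt = [0.6, 1.4, 0.5, 1.2, 0.9]
residual, r = remove_rt_component(pupil, rt)
```

By construction, the residual is orthogonal to **r**, so it carries no remaining linear dependence on the normalized log-RT vector.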

### Quantifying post-error slowing

We quantified post-error slowing, for tertiles of previous trial pupil responses, as described in ref. 30. Briefly, we selected those error trials that were both preceded and followed by a correct trial, and subtracted the pre-error RT from the associated post-error RT. This procedure ensured that estimates of post-error slowing could not be driven by error-unrelated, intrinsic fluctuations in RT over the course of a session. Before this subtraction, we removed trial-by-trial evidence strength from RTs using linear regression, to account for the large variations in RT with stronger sensory evidence (Fig. 2d).
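The trial selection rule can be sketched as follows. This is a simplified illustration of the pairing described above, without the tertile binning by pupil response and without the prior regression of evidence strength out of RT. The function name and example values are hypothetical.

```python
def post_error_slowing(correct, rt):
    """Mean of (post-error RT minus pre-error RT) over isolated errors.

    An error at trial t is used only if trials t-1 and t+1 were correct.
    """
    differences = [
        rt[t + 1] - rt[t - 1]
        for t in range(1, len(correct) - 1)
        if not correct[t] and correct[t - 1] and correct[t + 1]
    ]
    if not differences:
        return float("nan")
    return sum(differences) / len(differences)


# Only the error at index 1 is flanked by two correct trials
example_correct = [True, False, True, True, False, False, True]
example_rt = [0.5, 0.6, 0.7, 0.5, 0.6, 0.6, 0.9]
example_pes = post_error_slowing(example_correct, example_rt)
```

Because each post-error RT is compared with the RT immediately before the same error, slow drifts in RT across a session largely cancel out.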

### Quantifying the psychometric function

We modelled the psychometric function (Supplementary Fig. 4a) as follows. The probability of a particular response *r*_{t}=1 on trial *t* was described as

$$P(r_t = 1 \mid \tilde{s}_t) = \gamma + (1 - \gamma - \lambda)\, g(\delta + \alpha \tilde{s}_t) \qquad (8)$$

where *λ* and *γ* were the probabilities of stimulus-independent errors (‘lapses’), *s̃*_{t} was the signed stimulus intensity (here: signed sensory evidence as in Fig. 2b), *g*(*x*)=1/(1+e^{−x}) was the logistic function, *α* was perceptual sensitivity, and *δ* was a bias term. The free parameters *γ*, *λ*, *α* and *δ* were estimated by minimizing the negative log-likelihood of the data (using Matlab’s fminsearchbnd). We constrained *γ* and *λ* to be identical, so as to estimate a single, choice-independent lapse rate.

For the quantification of serial choice bias (Supplementary Fig. 5), we binned the data by previous choices and by previous pupil responses or RT. For each of those subsets of trials, we fit the psychometric function (equation (8)) to choices on the subsequent trials. The resulting bias term *δ* was transformed from log-odds into probability by *P*=e^{δ}/(1+e^{δ}). This quantified *P*(*r*_{t}=1) for ambiguous evidence (that is, strength of zero). Collapsing these values across the two choice options (shown separately in Supplementary Fig. 5) yielded the pooled measure of choice repetition probability in Fig. 4a,f.
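The logistic psychometric function and its likelihood can be written compactly. The sketch below assumes the standard lapse-rate form, in which a single lapse rate bounds the function between lapse and 1 minus lapse, and shows only the objective that an optimizer would minimize. The optimizer itself is not reproduced, and all function names are hypothetical.

```python
import math


def logistic(x):
    """Numerically stable logistic function g(x) = 1 / (1 + e^-x)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def p_response(evidence, sensitivity, bias, lapse):
    """Probability of response 1 given signed evidence (shared lapse rate)."""
    return lapse + (1.0 - 2.0 * lapse) * logistic(bias + sensitivity * evidence)


def negative_log_likelihood(params, evidence, responses):
    """Objective to minimize; params = (sensitivity, bias, lapse)."""
    sensitivity, bias, lapse = params
    total = 0.0
    for s, r in zip(evidence, responses):
        p = p_response(s, sensitivity, bias, lapse)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p) if r == 1 else math.log(1.0 - p)
    return total


def bias_to_probability(bias):
    """Convert a bias in log-odds into P(response = 1) at zero evidence."""
    return logistic(bias)  # identical to e^bias / (1 + e^bias)
```

For example, a fitted bias of zero maps onto a probability of 0.5, that is, no preference for either choice when the evidence is ambiguous.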

### Quantifying perceptual sensitivity using cumulative Weibull function fits

In Fig. 3d and Supplementary Figs 1c and 3b, we fit a cumulative Weibull function to accuracy as a function of evidence strength. The probability of a correct response *c*_{t}=1 on trial *t* was defined as

$$P(c_t = 1 \mid s_t) = 0.5 + (1 - 0.5 - \lambda)\, f\left(\left(\frac{s_t}{\theta}\right)^{\beta}\right) \qquad (9)$$

where *s*_{t} was the absolute evidence strength, *f*(*x*)=(1–e^{−x}) was the cumulative Weibull function, *λ* was the lapse rate, *θ* was the threshold indicating at which level of evidence strength an accuracy of ∼80% is achieved, and *β* was the slope of the cumulative Weibull function. The free parameters *θ*, *β* and *λ* were estimated by minimizing the negative log-likelihood of the data (using Matlab’s fminsearchbnd). Perceptual sensitivity was then defined as 1/*θ*.
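A short numerical check shows why the threshold corresponds to roughly 80% correct. The sketch assumes that the Weibull function rises from chance (0.5) to an upper asymptote of 1 minus the lapse rate; the function name is hypothetical.

```python
import math


def weibull_accuracy(strength, threshold, slope, lapse):
    """P(correct) as a function of absolute evidence strength."""
    rise = 1.0 - math.exp(-((strength / threshold) ** slope))
    return 0.5 + (0.5 - lapse) * rise


# At strength == threshold, the rise term equals 1 - 1/e regardless of slope
at_threshold = weibull_accuracy(2.0, 2.0, 1.5, 0.0)
print(round(at_threshold, 3))
```

Without lapses, accuracy at threshold is 0.5 + 0.5(1 − 1/e), or about 0.82; a non-zero lapse rate lowers this value slightly.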

### Modelling the modulation of serial choice bias

We modelled the pupil- and RT-linked modulation of serial choice bias by extending an established regression approach^{33}. The basic regression model extended the psychometric function model from equation (8) by means of a history-dependent bias term *δ*_{hist}(**h**_{t}), which was a linear combination of previous stimuli and choices

$$P(r_t = 1 \mid \tilde{s}_t, \mathbf{h}_t) = \gamma + (1 - \gamma - \lambda)\, g\left(\delta(\mathbf{h}_t) + \alpha \tilde{s}_t\right) \qquad (10)$$

with

$$\delta(\mathbf{h}_t) = \delta' + \sum_{k=1}^{K} \omega_k h_{kt} \qquad (11)$$

where the bias term *δ*(**h**_{t}) was the sum of the overall bias *δ*′ (see equation (8)) and the history bias *δ*_{hist}(**h**_{t})=Σ_{k} *ω*_{k}*h*_{kt}, where *ω*_{k} were the weights assigned to each previous stimulus or choice. We here modelled

$$\mathbf{h}_t = (r_{t-1}, r_{t-2}, \ldots, r_{t-7}, z_{t-1}, z_{t-2}, \ldots, z_{t-7}) \qquad (12)$$

as a concatenation of the last seven responses (*r*) and stimuli (*z*) (see ref. 33 for details). This procedure allowed us to quantify the effect of past trials on current choice processes (Fig. 5c). We convolved every set of seven past trials with three exponentially decaying basis functions^{33}. Positive history weights *ω*_{k} indicated a tendency to repeat the previous choice, or to make a choice that matched the previous stimulus. Negative weights described a tendency to alternate the corresponding history feature.

To model the effect of pupil-linked uncertainty on history biases, we extended this model by adding a multiplicative interaction term Σ_{k} *ω*′_{k}*p*_{kt}*h*_{kt}, which described the interaction of pupil responses with the choice and stimulus identities at the last seven lags:

$$\delta(\mathbf{h}_t, \mathbf{p}_t) = \delta' + \sum_{k=1}^{K} \omega_k h_{kt} + \sum_{k=1}^{K} \omega'_k p_{kt} h_{kt} + \sum_{k=1}^{K} \omega''_k p_{kt} \qquad (13)$$

where *ω*′_{k} were the history × pupil interaction weights, *ω*″_{k} were the pupil weights and **p**_{t} was a concatenation of the last seven pupil responses. The term Σ_{k} *ω*″_{k}*p*_{kt} acted as a nuisance covariate. To simultaneously model the effects of pupil responses and log-transformed RT, our model also included RT and history × RT terms, generated using the same procedure.

All parameters were fit using an expectation maximization algorithm. To assess whether individual observers were significantly influenced by their experimental history, we ran 1,000 iterations of permuting all trials, fitting the full model, and subsequently comparing the likelihood of the intact model to this null distribution (where permutation nullifies true history effects)^{33}. Confidence intervals for individual regression weights were obtained from a bootstrapping procedure.
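The structure of the pupil-modulated bias term can be illustrated with a small sketch. It makes several assumptions that are not taken from the text: previous choices are coded as +1 or −1, each history feature is paired with the pupil response from the same lag, the exponential basis functions and the RT terms are left out, and all names are hypothetical. It shows how the bias is assembled, not how the model was fit.

```python
import math


def logistic(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def modulated_bias(overall_bias, history, pupil, w_history, w_interaction, w_pupil):
    """Overall bias + history terms + history-by-pupil terms + pupil terms."""
    bias = overall_bias
    for h, p, w_h, w_hp, w_p in zip(history, pupil, w_history, w_interaction, w_pupil):
        bias += w_h * h + w_hp * p * h + w_p * p
    return bias


def p_choice(signed_evidence, sensitivity, bias, lapse=0.0):
    """Choice probability with a history-dependent bias term."""
    return lapse + (1.0 - 2.0 * lapse) * logistic(bias + sensitivity * signed_evidence)


# One lag only: previous choice was +1, repetition weight 0.5,
# negative interaction weight -0.2, no pupil main effect.
bias_small_pupil = modulated_bias(0.0, [1], [-1.0], [0.5], [-0.2], [0.0])
bias_large_pupil = modulated_bias(0.0, [1], [1.0], [0.5], [-0.2], [0.0])
```

In this toy example, a negative interaction weight lowers the bias towards repeating the previous choice after a large pupil response, which is the kind of modulation the interaction weights are meant to capture.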

### Serial bias and outcome-dependent choice strategies

The history weights for past stimuli and responses allowed us to characterize different decision strategies^{33} (Fig. 5d). Positive weights associated with the previous choice, or the previous stimulus category, indicate a tendency to repeat this previous choice, or to make a choice corresponding to the previous stimulus, respectively. Negative weights correspond to a tendency to alternate previous choice or stimulus. In the left and right triangle of this strategy space, the magnitude of the response weight is larger than the magnitude of the stimulus weight. Consequently, strategies are dominated by the previous choice and can be simply defined as choice alternation (left) or choice repetition (right).

In the upper and in the lower triangle, the magnitude of the stimulus weight is larger than the magnitude of the response weight, so strategies are dominated by the identity of the previous stimulus (which is only known to the observer as a function of their previous response and feedback). In the upper and lower triangle, strategies are thus defined by the sign of the stimulus weight. In the upper triangle stimulus weights are positive, indicating a tendency to repeat the previous stimulus. On a correct trial, previous choice and stimulus are equal and therefore, repeating the previous stimulus is equal to repeating the previous choice (a win-stay strategy). On errors, the previous choice is opposite to the previous stimulus and repeating the previous stimulus is equal to alternating the previous choice (lose-switch strategy). Conversely, in the lower triangle stimulus weights are negative, reflecting a tendency to alternate the previous stimulus. This implies a tendency to alternate the previous choice if the previous choice was correct (win-switch strategy) and a tendency to repeat the previous choice in case of a previous error (lose-stay strategy).

The weights for previous choices and stimuli can easily be combined to obtain weights reflecting a tendency to repeat previous correct or incorrect choices (Supplementary Fig. 6). Specifically, correct weights are defined by choice + stimulus, and error weights by choice − stimulus^{33}. The same holds for modulation weights. This transformation is identical to fitting a model with regressors for previous successes and failures^{39,73}.
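The mapping from weights to strategies can be summarized in code. The sketch follows the verbal description of the strategy space above; the function names are hypothetical, and the boundary case of equal magnitudes is assigned to the stimulus-dominated branch as an arbitrary choice.

```python
def outcome_weights(choice_weight, stimulus_weight):
    """Return (weight after correct trials, weight after error trials)."""
    return choice_weight + stimulus_weight, choice_weight - stimulus_weight


def strategy(choice_weight, stimulus_weight):
    """Label the region of strategy space for a pair of lag-1 weights."""
    if abs(choice_weight) > abs(stimulus_weight):
        # Left and right triangles: dominated by the previous choice
        return "choice repetition" if choice_weight > 0 else "choice alternation"
    # Upper and lower triangles: dominated by the previous stimulus
    if stimulus_weight > 0:
        return "win-stay lose-switch"
    return "win-switch lose-stay"


after_correct, after_error = outcome_weights(0.3, 0.1)
```

With a choice weight of 0.3 and a stimulus weight of 0.1, the observer tends to repeat after both outcomes, but more strongly after correct trials (0.4) than after errors (0.2).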

### Statistical tests

We used non-parametric permutation testing to test for the group-level significance of individually fitted parameter values (Figs 3 and 5e,g). We randomly switched labels of individual observations either between two paired sets of values, between one set of values and zero, or between two unpaired groups. After repeating this procedure 10,000 times, and computing the difference between the two group means on each permutation, the *P* value was the fraction of permutations that exceeded the observed difference between the means. All *P* values reported were computed using two-sided tests.
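For the comparison of one set of values against zero, label switching is equivalent to randomly flipping the sign of each observer's value. A minimal two-sided version is sketched below. The function name, the seed and the use of "at least as extreme" as the exceedance criterion are assumptions. The paired case can be handled by passing the within-observer differences.

```python
import random


def permutation_test_vs_zero(values, n_permutations=10_000, seed=0):
    """Two-sided sign-flip permutation test of the mean against zero."""
    rng = random.Random(seed)
    observed = abs(sum(values) / len(values))
    exceed = 0
    for _ in range(n_permutations):
        # Switching a label between the data and zero flips the sign
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if abs(sum(flipped) / len(flipped)) >= observed:
            exceed += 1
    return exceed / n_permutations


# Hypothetical per-observer weights that are consistently positive
weights = [0.5 + 0.01 * i for i in range(20)]
p_value = permutation_test_vs_zero(weights, n_permutations=2_000)
```

Because the null distribution is built from the data themselves, the test makes no assumption about the shape of the distribution of the fitted parameters.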

In Fig. 4, we split the data into tertiles of pupil response or RT, and computed next trial serial choice bias, signed choice bias, overall choice bias, perceptual sensitivity, lapse rate, RT and post-error slowing in each bin. We used a repeated-measures ANOVA to test for the main effect of bin on each dependent variable. We further used Bayes Factors (Bf), obtained from a Bayesian one-factor ANOVA^{74}, to support conclusions about null effects observed. Bf_{10} quantifies the evidence in favour of the null or the alternative hypothesis, where Bf_{10} < 1/3 or > 3 is taken to indicate substantial evidence for H_{0} or H_{1}, respectively. Bf_{10}=1 indicates inconclusive evidence. We similarly computed Bf_{10} for correlations, based on the Pearson correlation coefficient^{75}.

The *P*-value for the difference between the two correlation coefficients (choice weight by pupil modulation weight vs choice weight by RT modulation weight), shown in Fig. 5f, was obtained through permutation testing. To generate a null distribution of no difference, we randomly switched (or not, dependent on a simulated coin flip) each observer’s RT and pupil modulation weights, after which we computed the between-subject correlation between choice weights and pupil modulation weights as well as between choice and RT modulation weights. Repeating this procedure 10,000 times generated a distribution of the difference in correlation, under the null hypothesis of no difference.
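This procedure can be sketched as follows. The code is a simplified illustration with hypothetical function names and made-up example values. It assumes a two-sided comparison and counts permutations whose absolute difference is at least as large as the observed one.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_difference_test(choice_w, pupil_w, rt_w, n_permutations=10_000, seed=0):
    """Permutation test for r(choice, pupil) minus r(choice, RT)."""
    rng = random.Random(seed)
    observed = pearson(choice_w, pupil_w) - pearson(choice_w, rt_w)
    exceed = 0
    for _ in range(n_permutations):
        perm_pupil, perm_rt = [], []
        for p, r in zip(pupil_w, rt_w):
            if rng.random() < 0.5:  # simulated coin flip per observer
                p, r = r, p
            perm_pupil.append(p)
            perm_rt.append(r)
        difference = pearson(choice_w, perm_pupil) - pearson(choice_w, perm_rt)
        if abs(difference) >= abs(observed):
            exceed += 1
    return observed, exceed / n_permutations


# Made-up weights for ten observers
choice = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
pupil_mod = [-v for v in choice]
rt_mod = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 10.0, 9.0]
observed_diff, p_diff = correlation_difference_test(choice, pupil_mod, rt_mod, n_permutations=2_000)
```

Swapping the two modulation weights within an observer keeps the pairing between observers intact while removing any systematic difference between the pupil and RT weights.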

### Data availability

All raw and processed data, as well as the code to reproduce all analyses and figures, are available at http://dx.doi.org/10.6084/m9.figshare.4300043.

## Additional information

**How to cite this article:** Urai, A. E. *et al*. Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias. *Nat. Commun.* **8**, 14637 doi: 10.1038/ncomms14637 (2017).

**Publisher’s note:** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. Pouget, A., Drugowitsch, J. & Kepecs, A. Confidence and certainty: distinct probabilistic quantities for different goals. *Nat. Neurosci.* **19**, 366–374 (2016).
2. Kepecs, A., Uchida, N., Zariwala, H. A. & Mainen, Z. F. Neural correlates, computation and behavioural impact of decision confidence. *Nature* **455**, 227–231 (2008).
3. Ma, W. J. & Jazayeri, M. Neural coding of uncertainty and probability. *Annu. Rev. Neurosci.* **37**, 205–220 (2014).
4. Meyniel, F., Sigman, M. & Mainen, Z. F. Confidence as Bayesian probability: from neural origins to behavior. *Neuron* **88**, 78–92 (2015).
5. Kepecs, A. & Mainen, Z. F. A computational framework for the study of confidence in humans and animals. *Philos. Trans. R. Soc. Lond. B Biol. Sci.* **367**, 1322–1337 (2012).
6. Dayan, P., Kakade, S. & Montague, P. R. Learning and selective attention. *Nat. Neurosci.* **3**, 1218–1223 (2000).
7. Yu, A. J. & Dayan, P. Uncertainty, neuromodulation, and attention. *Neuron* **46**, 681–692 (2005).
8. Dayan, P. & Yu, A. J. Phasic norepinephrine: a neural interrupt signal for unexpected events. *Netw. Comput. Neural Syst.* **17**, 335–350 (2006).
9. Nassar, M. R. *et al.* Rational regulation of learning dynamics by pupil-linked arousal systems. *Nat. Neurosci.* **15**, 1040–1046 (2012).
10. de Berker, A. O. *et al.* Computations of uncertainty mediate acute stress responses in humans. *Nat. Commun.* **7**, 10996 (2016).
11. Aston-Jones, G. & Cohen, J. D. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. *Annu. Rev. Neurosci.* **28**, 403–450 (2005).
12. Lee, S.-H. & Dan, Y. Neuromodulation of brain states. *Neuron* **76**, 209–222 (2012).
13. McGinley, M. J. *et al.* Waking state: rapid variations modulate neural and behavioral responses. *Neuron* **87**, 1143–1161 (2015).
14. Harris, K. D. & Thiele, A. Cortical state and attention. *Nat. Rev. Neurosci.* **12**, 509–523 (2011).
15. Eldar, E., Cohen, J. D. & Niv, Y. The effects of neural gain on attention and learning. *Nat. Neurosci.* **16**, 1146–1153 (2013).
16. Reimer, J. *et al.* Pupil fluctuations track fast switching of cortical states during quiet wakefulness. *Neuron* **84**, 355–362 (2014).
17. Vinck, M., Batista-Brito, R., Knoblich, U. & Cardin, J. A. Arousal and locomotion make distinct contributions to cortical activity patterns and visual encoding. *Neuron* **86**, 740–754 (2015).
18. McGinley, M. J., David, S. V. & McCormick, D. A. Cortical membrane potential signature of optimal states for sensory signal detection. *Neuron* **87**, 179–192 (2015).
19. Sanders, J. I., Hangya, B. & Kepecs, A. Signatures of a statistical computation in the human sense of confidence. *Neuron* **90**, 499–506 (2016).
20. Komura, Y., Nikkuni, A., Hirashima, N., Uetake, T. & Miyamoto, A. Responses of pulvinar neurons reflect a subject’s confidence in visual categorization. *Nat. Neurosci.* **16**, 749–755 (2013).
21. Lak, A. *et al.* Orbitofrontal cortex is required for optimal waiting based on decision confidence. *Neuron* **84**, 190–201 (2014).
22. Teichert, T., Yu, D. & Ferrera, V. P. Performance monitoring in monkey frontal eye field. *J. Neurosci.* **34**, 1657–1671 (2016).
23. Hebart, M. N., Schriever, Y., Donner, T. H. & Haynes, J.-D. The relationship between perceptual decision variables and confidence in the human brain. *Cereb. Cortex* **26**, 118–130 (2014).
24. Bitzer, S., Bruineberg, J. & Kiebel, S. J. A Bayesian attractor model for perceptual decision making. *PLoS Comput. Biol.* **11**, e1004442 (2015).
25. Wei, Z. & Wang, X.-J. Confidence estimation as a stochastic process in a neurodynamical system of decision making. *J. Neurophysiol.* **114**, 99–113 (2015).
26. Insabato, A., Pannunzi, M., Rolls, E. T. & Deco, G. Confidence-related decision making. *J. Neurophysiol.* **104**, 539–547 (2010).
27. Adelson, E. H. & Bergen, J. R. Spatiotemporal energy models for the perception of motion. *J. Opt. Soc. Am. A* **2**, 284–299 (1985).
28. Bogacz, R., Brown, E., Moehlis, J., Holmes, P. & Cohen, J. D. The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. *Psychol. Rev.* **113**, 700–765 (2006).
29. de Gee, J. W., Knapen, T. & Donner, T. H. Decision-related pupil dilation reflects upcoming choice and individual bias. *Proc. Natl Acad. Sci. USA* **111**, E618–E625 (2014).
30. Dutilh, G. *et al.* How to measure post-error slowing: a confound and a simple solution. *J. Math. Psychol.* **56**, 208–216 (2012).
31. Murphy, P. R., van Moort, M. L. & Nieuwenhuis, S. The pupillary orienting response predicts adaptive behavioral adjustment after errors. *PLoS ONE* **11**, e0151763 (2016).
32. Fernberger, S. W. Interdependence of judgments within the series for the method of constant stimuli. *J. Exp. Psychol.* **3**, 126 (1920).
33. Fründ, I., Wichmann, F. A. & Macke, J. H. Quantifying the effect of intertrial dependence on perceptual decisions. *J. Vis.* **14**, 9 (2014).
34. Yu, A. J. & Cohen, J. D. Sequential effects: Superstition or rational behavior? *Adv. Neural Inf. Process. Syst.* **21**, 1873–1880 (2008).
35. Ress, D., Backus, B. T. & Heeger, D. J. Activity in primary visual cortex predicts performance in a visual detection task. *Nat. Neurosci.* **3**, 940–945 (2000).
36. Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S. & Cohen, J. D. Conflict monitoring and cognitive control. *Psychol. Rev.* **108**, 624 (2001).
37. Gao, J., Wong-Lin, K., Holmes, P., Simen, P. & Cohen, J. D. Sequential effects in two-choice reaction time tasks: decomposition and synthesis of mechanisms. *Neural Comput.* **21**, 2407–2436 (2009).
38. Gold, J. I. & Shadlen, M. N. The neural basis of decision making. *Annu. Rev. Neurosci.* **30**, 535–574 (2007).
39. Abrahamyan, A., Silva, L. L., Dakin, S. C., Carandini, M. & Gardner, J. L. Adaptable history biases in human perceptual decisions. *Proc. Natl Acad. Sci. USA* **113**, E3548–E3557 (2016).
40. O’Reilly, J. X. *et al.* Dissociable effects of surprise and model update in parietal and anterior cingulate cortex. *Proc. Natl Acad. Sci. USA* **110**, E3660–E3669 (2013).
41. Wessel, J. R., Danielmeier, C. & Ullsperger, M. Error awareness revisited: accumulation of multimodal evidence from central and autonomic nervous systems. *J. Cogn. Neurosci.* **23**, 3021–3036 (2011).
42. Preuschoff, K., ’t Hart, B. M. & Einhäuser, W. Pupil dilation signals surprise: evidence for noradrenaline’s role in decision making. *Front. Neurosci.* **5**, 115 (2011).
43. Lempert, K. M., Chen, Y. L. & Fleming, S. M. Relating pupil dilation and metacognitive confidence during auditory decision-making. *PLoS ONE* **10**, e0126588 (2015).
44. McDougal, D. H. & Gamlin, P. D. R. in *The Senses: A Comprehensive Reference* (eds Albright, T. D. *et al.*) 521–536 (Academic Press, 2008).
45. Joshi, S., Li, Y., Kalwani, R. M. & Gold, J. I. Relationships between pupil diameter and neuronal activity in the locus coeruleus, colliculi, and cingulate cortex. *Neuron* **89**, 221–234 (2016).
46. Murphy, P. R., O’Connell, R. G., O’Sullivan, M., Robertson, I. H. & Balsters, J. H. Pupil diameter covaries with BOLD activity in human locus coeruleus: pupil diameter and locus coeruleus activity. *Hum. Brain Mapp.* **35**, 4140–4154 (2014).
47. Varazzani, C., San-Galli, A., Gilardeau, S. & Bouret, S. Noradrenaline and dopamine neurons in the reward/effort trade-off: a direct electrophysiological comparison in behaving monkeys. *J. Neurosci.* **35**, 7866–7877 (2015).
48. Wang, C.-A. & Munoz, D. P. A circuit for pupil orienting responses: implications for cognitive modulation of pupil size. *Curr. Opin. Neurobiol.* **33**, 134–140 (2015).
49. Sara, S. J. The locus coeruleus and noradrenergic modulation of cognition. *Nat. Rev. Neurosci.* **10**, 211–223 (2009).
50. Siegel, M., Engel, A. K. & Donner, T. H. Cortical network dynamics of perceptual decision-making in the human brain. *Front. Hum. Neurosci.* **5**, 21 (2011).
51. Polack, P.-O., Friedman, J. & Golshani, P. Cellular mechanisms of brain state-dependent gain modulation in visual cortex. *Nat. Neurosci.* **16**, 1331–1339 (2013).
52. Marder, E. Neuromodulation of neuronal circuits: back to the future. *Neuron* **76**, 1–11 (2012).
53. de Lange, F. P., Rahnev, D. A., Donner, T. H. & Lau, H. Prestimulus oscillatory activity over motor cortex reflects perceptual expectations. *J. Neurosci.* **33**, 1400–1410 (2013).
54. Donner, T. H. & Nieuwenhuis, S. Brain-wide gain modulation: the rich get richer. *Nat. Neurosci.* **16**, 989–990 (2013).
55. Cavanagh, J. F., Wiecki, T. V., Kochar, A. & Frank, M. J. Eye tracking and pupillometry are indicators of dissociable latent decision processes. *J. Exp. Psychol. Gen.* **143**, 1476–1488 (2014).
56. Bouret, S. & Sara, S. J. Network reset: a simplified overarching theory of locus coeruleus noradrenaline function. *Trends Neurosci.* **28**, 574–582 (2005).
57. Karlsson, M. P., Tervo, D. G. R. & Karpova, A. Y. Network resets in medial prefrontal cortex mark the onset of behavioral uncertainty. *Science* **338**, 135–139 (2012).
58. Tervo, D. G. R. *et al.* Behavioral variability through stochastic choice and its gating by anterior cingulate cortex. *Cell* **159**, 21–32 (2014).
59. Steriade, M. Corticothalamic resonance, states of vigilance and mentation. *Neuroscience* **101**, 243–276 (2000).
60. Einhäuser, W., Stout, J., Koch, C. & Carter, O. Pupil dilation reflects perceptual selection and predicts subsequent stability in perceptual rivalry. *Proc. Natl Acad. Sci. USA* **105**, 1704–1709 (2008).
61. Ebitz, R. B. & Platt, M. L. Neuronal activity in primate dorsal anterior cingulate cortex signals task conflict and predicts adjustments in pupil-linked arousal. *Neuron* **85**, 628–640 (2015).
62. Yeung, N., Botvinick, M. M. & Cohen, J. D. The neural basis of error detection: conflict monitoring and the error-related negativity. *Psychol. Rev.* **111**, 931 (2004).
63. Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. *Annu. Rev. Neurosci.* **24**, 167–202 (2001).
64. Brooks, J. L. Counterbalancing for serial order carryover effects in experimental condition orders. *Psychol. Methods* **17**, 600–614 (2012).
65. Kleiner, M. *et al.* What’s new in Psychtoolbox-3. *Perception* **36**, 1 (2007).
66. Thaler, L., Schütz, A. C., Goodale, M. A. & Gegenfurtner, K. R. What is the best fixation target? The effect of target shape on stability of fixational eye movements. *Vision Res.* **76**, 31–42 (2013).
67. Scase, M. O., Braddick, O. J. & Raymond, J. E. What is noise for the motion system? *Vision Res.* **36**, 2579–2586 (1996).
68. Roitman, J. D. & Shadlen, M. N. Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. *J. Neurosci.* **22**, 9475–9489 (2002).
69. Kiani, R., Hanks, T. D. & Shadlen, M. N. Bounded integration in parietal cortex underlies decisions even when viewing duration is dictated by the environment. *J. Neurosci.* **28**, 3017–3029 (2008).
70. Knapen, T. *et al.* Cognitive and ocular factors jointly determine pupil responses under equiluminance. *PLoS ONE* **11**, e0155574 (2016).
71. Blair, R. C. & Karniski, W. An alternative method for significance testing of waveform difference potentials. *Psychophysiology* **30**, 518–524 (1993).
72. Hoeks, B. & Ellenbroek, B. A. A neural basis for a quantitative pupillary model. *J. Psychophysiol.* **7**, 315–324 (1993).
73. Busse, L. *et al.* The detection of visual contrast in the behaving mouse. *J. Neurosci.* **31**, 11351–11361 (2011).
74. Rouder, J. N., Morey, R. D., Speckman, P. L. & Province, J. M. Default Bayes factors for ANOVA designs. *J. Math. Psychol.* **56**, 356–374 (2012).
75. Wetzels, R. & Wagenmakers, E.-J. A default Bayesian hypothesis test for correlations and partial correlations. *Psychon. Bull. Rev.* **19**, 1057–1064 (2012).

## Acknowledgements

We thank O’Jay Medina for assistance with data collection, all members of the Donnerlab for valuable discussions, and Konstantinos Tsetsos, Jan Willem de Gee, Niklas Wilming, Camile Correa, Florent Meyniel and Sander Nieuwenhuis for helpful comments on the manuscript. We acknowledge computing resources provided by NWO Physical Sciences. This research was supported by the German Academic Exchange Service (DAAD) and G.-A. Lienert Foundation (to A.E.U.) and the German Research Foundation (DFG), SFB 936/A7, SFB 936/Z1, DO 1240/2-1 and DO 1240/3-1, and European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 604102 (Human Brain Project) (to T.H.D.).

## Author information

## Affiliations

### Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany

- Anne E. Urai, Anke Braun & Tobias H. Donner

### Department of Psychology, University of Amsterdam, Amsterdam 1018 WT, The Netherlands

- Anne E. Urai & Tobias H. Donner

### Amsterdam Brain and Cognition (ABC), University of Amsterdam, Amsterdam 1018 WT, The Netherlands

- Tobias H. Donner

## Authors

### Contributions

Conceptualization, A.E.U. and T.H.D.; Investigation, A.E.U.; Formal Analysis, A.E.U. and A.B.; Software, data curation and visualization, A.E.U.; Writing, A.E.U. and T.H.D.; Supervision, T.H.D.

### Competing interests

The authors declare no competing financial interests.

## Corresponding authors

Correspondence to Anne E. Urai or Tobias H. Donner.

## Supplementary information

## PDF files

1. Supplementary Information (Supplementary Figures)
2. Peer Review File

## Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
