Neural Circuit Mechanism of Decision Uncertainty and Change-of-Mind

Decision-making is often accompanied by a degree of confidence in whether a choice is correct. Decision uncertainty, or lack of confidence, may lead to change-of-mind. Studies have identified the behavioural characteristics associated with decision confidence or change-of-mind, and their neural correlates. Although several theoretical accounts have been proposed, there is no neural model that can compute decision uncertainty and explain its effects on change-of-mind. We propose a neuronal circuit model that computes decision uncertainty while accounting for a variety of behavioural and neural data of decision confidence and change-of-mind, and that makes testable predictions. Our theoretical analysis suggests that change-of-mind occurs due to the presence of a transient uncertainty-induced choice-neutral stable steady state and noisy fluctuation within the neuronal network. Our distributed network model indicates that the neural basis of change-of-mind is more distinctively identified in motor-based neurons. Overall, our model provides a framework that unifies decision confidence and change-of-mind.

changes-of-mind occur as a result of an internal error-correction mechanism 25 , suggesting decision uncertainty plays a role in inducing changes-of-mind 31 . However, the neural mechanism of decision uncertainty (within a single trial or across consecutive ones) and its link to change-of-mind has so far remained ambiguous. In particular, there is no neural circuit model that explains this shared neural mechanism 17 .
Studies of perceptual decision confidence/uncertainty and change-of-mind have identified several common findings (Supplementary Figs. 1 and 2). Firstly, more difficult tasks, associated with lower (sensory) evidence quality, lead to higher decision uncertainty, which is also associated with lower choice accuracy (Supplementary Fig. 1) 6,32 .
Secondly, higher decision uncertainty is associated with lower evidence quality for correct choices while counter-intuitively associated with better evidence quality for incorrect choices (forming the often observed "<" pattern) (Supplementary Fig. 1) 6,11,33,34 . Thirdly, changes-of-mind are more likely to occur when the task is more difficult, and are more often accompanied by the correction of an initial impending error choice; hence there are more error-to-correct changes than correct-to-error changes 4,35 (although the difference has been shown to vary in some cases 35 ).
Further, the likelihood of correct changes-of-mind (to the subsequent correct choices) may peak at an intermediate level of task difficulty and then decrease gradually when the task becomes much easier (Supplementary Fig. 2) 4,35 .
In this work, guided by the above findings and related neural data (Supplementary Fig. 3), we have developed what is, to the best of our knowledge, the first neural circuit computational model that can mechanistically quantify and monitor decision uncertainty, which may subsequently cause a change-of-mind, hence unifying the two areas of study. Our multi-layer recurrent network model not only accounts for the above-mentioned key characteristics of decision uncertainty 6,10,36 and change-of-mind 4,35 across a wide variety of experiments (of both behavioural and neural data) but also sheds light on their neural circuit mechanisms. In particular, using dynamical systems analysis, we show that change-of-mind occurs due to the presence of a transient choice-neutral stable steady state together with noisy fluctuations within the neuronal network. Interestingly, because our model consists of multiple layers of neural integrators, we found that the reversal of competing neural activity encoding the choices (the neural basis for change-of-mind) is likely to be more distinctive for neurons near the motor execution area, without necessarily requiring a clear reversal of neural activity at more upstream sensory or sensorimotor neurons.

Neural circuit model computes decision uncertainty
We propose a novel neural circuit model that can encode, quantify, and monitor decision uncertainty, which we named the decision uncertainty-monitoring module (Fig. 1a, grey box).
This circuit is built on our previous biologically-motivated neural circuit model of decision-making that focuses on sensory evidence accumulation 37 (Fig. 1a).
The uncertainty-monitoring module receives input based on the summed activities of the sensorimotor neuronal populations (Fig. 1a and b). In particular, a population of inhibitory neurons (Fig. 1a, green circle) integrates these summed activities (Fig. 1a, blue and orange pointed arrows; Methods). This neuronal population in turn inhibits a neighbouring excitatory neuronal population that encodes decision uncertainty (Fig. 1a, red circle). Hence, decision uncertainty can be continuously monitored (Fig. 1b, middle). Together, the network structure with these two neuronal populations is reminiscent of a cortical column 38 .
Further, decision uncertainty information from the uncertainty-monitoring module is continuously fed back equally to the sensorimotor neuronal populations (Fig. 1a, light blue box). This effectively provides an excitatory feedback mechanism between the two brain systems, which may consequently affect the final decision outcome and, in some instances, even lead to change-of-mind, as we shall demonstrate below. This feedback loop, as in control theory, provides the key computational basis for linking decision uncertainty and change-of-mind. Without this feedback loop, the model does not exhibit change-of-mind behaviour (Supplementary Fig. 4). However, it can still encode decision uncertainty and produce the experimentally observed relationship between decision uncertainty and task difficulty (Supplementary Fig. 5). In addition, the neural circuit model also has motor-based neuronal populations either located within the same brain region or downstream in the decision-processing pathway (Fig. 1a, green box). Inputs to these populations are temporally integrated based on the neural firing rate outputs of the associated sensorimotor neuronal populations.

Figure 1 | ... populations. Inhibitory neuronal population (green) receives excitatory input (straight arrows) from the output of the sensorimotor module while inhibiting the uncertainty-encoding neuronal population (lines with filled circles), which in turn provides excitatory feedback to the sensorimotor module. The uncertainty-encoding population receives a constant tonic excitatory input which varies across trials in specific cases (i.e. multi-stage paradigm, see Methods and below). The sensorimotor module consists of two competing (mutually inhibitory) neuronal populations, each selective to noisy sensory information (e.g. rightward or leftward random-dot motion stimulus) favouring one of two (e.g. right R or left L) choice options. The motor module, receiving inputs from the sensorimotor module, also consists of neural integrators that report the choice made. (b) Timecourse of neuronal population firing rates averaged over non-change-of-mind trials with evidence quality ε = 25.6% (easy task; solid lines) and ε = 3.2% (difficult task; dashed lines), where ε is equivalent to motion coherence in the classic random-dot stimulus. Faster ramping activity (top and bottom panels) with lower uncertainty quantification (middle panel; red) with larger ε. Colour of activity traces reflects the associated neural populations in (a). To reveal the full network dynamics, the network activities (greyed out) were not reset after a choice was made. (c) Psychometric function used to fit choice accuracy (using a Weibull function, see Methods). (d) Response times for correct (blue) and error (red) responses from the model.

In this example, the activation onset times for the inhibitory and uncertainty-encoding neuronal populations are 400 ms and 500 ms after stimulus onset, respectively. Importantly, the (phasic) activity of the uncertainty-encoding neuronal population is higher for trials with higher uncertainty (due to lower evidence quality) (Fig. 1b, middle panel). This rise-and-decay activity around the motor movement onset is consistent with observations from neural recordings in animal and human studies 6,11,25,42 . More specifically, single neuronal firing activity in the orbitofrontal cortex (from rodents) 6,11 , and EEG 25 and fMRI 42 recordings in humans, exhibited this rise-and-decay pattern in experimental studies of decision-making under uncertainty (Supplementary Fig. 3), and these activities are higher with higher decision uncertainty. We shall henceforth use this phasic neural activity as an indicator of decision uncertainty monitoring in real-time, and the temporal integral of its neural activity (i.e. area under the curve, as a proxy for any downstream neural integrator) as a readout of the decision uncertainty (see Methods). Further, a tonic constant excitatory bias input to the uncertainty-encoding population (Fig. 1a) is required to provide overall excitation. As will be shown below, when trials are sequentially dependent (i.e. a reward is only received when a pair of coupled trials result in two correct choices), this same parameter is linearly varied based on the level of uncertainty in the first trial, influencing the uncertainty level (and response time) of the second trial 43 (see below and Methods).
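The two readouts mentioned above (peak activity and temporal integral) can be illustrated with a minimal Python sketch. The function name `uncertainty_readouts`, the trapezoidal rule, and the example trace are assumptions made for illustration; they are not taken from the authors' code.

```python
def uncertainty_readouts(times_s, rates_hz):
    """Return (peak, area) of an uncertainty-encoding activity trace.

    peak: maximum firing rate (Hz)
    area: temporal integral by the trapezoidal rule (Hz * s), standing in
          for a downstream neural integrator (an assumption of this sketch)
    """
    if len(times_s) != len(rates_hz) or len(times_s) < 2:
        raise ValueError("need two equally long sequences with >= 2 samples")
    peak = max(rates_hz)
    area = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        area += 0.5 * (rates_hz[i] + rates_hz[i - 1]) * dt
    return peak, area


# Hypothetical rise-and-decay trace (a triangular bump), not model output.
t = [0.0, 0.5, 1.0, 1.5, 2.0]
r = [0.0, 0.0, 4.0, 0.0, 0.0]
peak, area = uncertainty_readouts(t, r)
print(peak, area)
```

Both readouts grow when the phasic activity is larger or lasts longer, which is why a longer response time can translate into a higher uncertainty readout.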

Model accounts for relationships among decision uncertainty and psychophysics
We next simulate our network model to replicate the key experimental findings related to decision uncertainty and confidence, as discussed in the Introduction. As most of the decision uncertainty and change-of-mind tasks are based on two-choice reaction-time task paradigms, we shall only focus on such paradigms. Our model first replicates choice accuracy decreasing monotonically with decision uncertainty (Fig. 2a), while producing the '<' pattern 6,11,33,34 of decision uncertainty (Fig. 2b), in which decision uncertainty is higher for lower (higher) evidence quality in correct (error) choices 6,34 (compare with Supplementary Fig. 1).
This pattern also correlates with the response time pattern in Fig. 1d.
To explain the results in Figs. 2a and b, we map out the neural activity of the uncertainty-encoding population (denoted by the colours in Figs. 2c and d) with respect to the evidence quality and total input to the uncertainty-encoding neuronal population. Based on Figs. 2c and d, it is clear that as long as the total input is high, and there is sufficient time (i.e. long response time, see Fig. 1d) for the uncertainty-encoding population to integrate its input, the uncertainty level will be high, regardless of correct or error responses. From the perspective of the network dynamics, for correct responses with low evidence quality, the inhibition to the uncertainty-encoding population will initially be higher, i.e. lower total input. This leads to an initially weaker excitatory feedback to the sensorimotor neural populations, causing the ramping-up speed of the latter's activity to become slower, which in turn results in a prolonged response time. The longer response time allows the uncertainty-encoding population more time to integrate and eventually attain a higher activity level, i.e. encode higher uncertainty. The activities of the competing sensorimotor populations will also eventually deviate (i.e. have a clear winner), resulting in higher total input (i.e. less inhibition) to the uncertainty-encoding population (moving vertically upwards in Fig. 2c, left side). For correct responses with higher evidence quality, the response times are typically faster (Fig. 1d, blue), hence allowing less time for the uncertainty-encoding population to integrate, leading to lower uncertainty activity levels (moving vertically upwards in Fig. 2c, right side; see also Supplementary Fig. 6). However, for error responses, the response times are longer for higher evidence quality ...

Figure 2 | ... Uncertainty measure based on averaged peak (peak) or temporal integral (area) of the uncertainty-encoding neuronal population activity (Methods). Error bars are SEM. (c-d) Activity level of uncertainty-encoding population depends on the total input to the uncertainty-encoding population and evidence quality. Uncertainty activity level is normalised (see Methods). (c) Correct responses. Activity of uncertainty-encoding population is higher for correct responses in difficult tasks (lower ε) due to prolonged response times (RTs) (Fig. 1d), allowing the uncertainty-encoding population longer time to integrate. See text for more detailed description. (d) Error responses. Activity of uncertainty-encoding population is higher during errors in easier tasks (higher ε) due to prolonged RTs (Fig. 1d), allowing the uncertainty-encoding population longer time to integrate.

... evidence quality (Fig. 1d) (leading to longer integration time for the uncertainty-encoding population) and a higher total input to the uncertainty-encoding population, higher uncertainty level is reached. Hence, the larger difference.

Model accounts for change-of-mind behaviours
Previous studies have shown that change-of-mind during decision-making usually leads to the correction of an impending error 4,35 . Although previous studies have linked change-of-mind to the temporal integration of noisy stimulus 4,35 , we demonstrate that the simulated change-of-mind in our biologically-motivated model is due not only to noise but, more importantly, to an excitatory feedback mechanism induced by decision uncertainty. When there is no uncertainty-induced excitatory feedback loop, decision uncertainty can still be encoded (Supplementary Fig. 5) but there is no change-of-mind (Supplementary Fig. 4). This suggests that, for the biophysically-constrained network model, noisy fluctuation may be necessary but is not sufficient to produce significant change-of-mind behaviour. Importantly, a choice-neutral stable steady state (or attractor) due to nonlinearity may be needed.

Figure 4 | Model accounts for and predicts key characteristics of change-of-mind. (a) Probability of change-of-mind with respect to evidence quality. Probability of change-of-mind for a single evidence quality level is calculated by dividing the total number of change-of-mind trials by the total number of simulated trials for a specific evidence quality (see Methods). Black: Total probability of change-of-mind, consisting of both correct and error choices. Green (red): only subsequent correct (error) change-of-mind choices. Probability of change-of-mind for subsequent correct choices peak at

Neural circuit mechanism of change-of-mind behaviours
Next, we will apply dynamical systems analysis 37 to demonstrate that this reversal phenomenon is caused not only by noise and strong sensory evidence favouring one population over the other, as indicated in previous modelling work 35 (Fig. 5c). Notice that the choice attractors have vanished. Furthermore, while the trajectory is being drawn, it moves closer towards and crosses the diagonal line (Fig. 5c). Importantly, the model suggests that this new stable steady state plays an important role in change-of-mind -it provides the initially losing neuronal population a higher chance of winning.
Due to the transient nature of the uncertainty-encoding neuronal population activity (Fig. 1b, middle, and Supplementary Fig. 7), the excitatory feedback returns to baseline level, and the phase plane reverts to its initial configuration (Fig. 5d), i.e. the configuration prior to the activation of the uncertainty-monitoring module (Fig. 5b). This causes the trajectory to move towards the higher part of the phase plane and, coupled with noise, leads to a change-of-mind behaviour.
Overall, this is reflected in the reversal of dominance in the neural activities of the motor populations (Fig. 5a, middle) and motor movement (negative-to-positive) direction (Fig. 5a, bottom) (see also Supplementary Fig. 8). It should be noted that in the model, the final decision is determined by whether the firing rate of the motor neural populations, which themselves are neural integrators, reaches a prescribed target threshold (see Methods). Thus, change-of-mind could still occur even if the activity reversal is not clearly observed in the sensorimotor module.
In our analyses we found that the new central choice-neutral stable steady state is less likely to emerge with higher evidence quality, due to shorter response times and weaker excitatory feedback from the uncertainty-monitoring module (Figs. 2c and d; Supplementary Fig. 6). This explains why higher evidence quality generally leads to lower probability of change-of-mind 4,35 (Fig. 4a, black). For lower evidence quality, the phase plane is almost symmetrical (Fig. 5b). Thus, the network is likely to make an error choice initially due to noisy fluctuations. This can lead to longer integration time for the uncertainty-monitoring module, which then provides stronger excitatory feedback, in the form of a transient, centralized attractor state, and consequently corrects the decision. Hence, this explains why there are more correct change-of-mind trials than error change-of-mind trials. However, increasing the evidence quality leads to lower probability of change-of-mind, as discussed above. This explains the observed peak in probability of correct changes-of-mind (Fig. 4a and Supplementary Fig. 2).

Discussion
We have proposed a novel neural circuit computational model that encodes decision uncertainty, the reciprocal of decision confidence. Decision uncertainty in the model can be ... Our model was able to exhibit higher levels of decision uncertainty and lower choice accuracy with more difficult tasks 6,10,36 (Fig. 2a). Further, the model showed higher decision uncertainty with lower evidence quality for correct choices but, counter-intuitively, higher decision uncertainty with higher evidence quality for incorrect choices, in line with the previously observed '<' pattern 6,11,33,34 (Fig. 2b, Supplementary Fig. 1). This was explained by the faster response times for correct choices at higher evidence quality, which left less integration time for the uncertainty-monitoring module and led to lower decision uncertainty (Figs. 2c and d). For error choices, the response time was longer with higher evidence quality (Fig. 1d). This led to longer integration time for the uncertainty-monitoring module and hence higher decision uncertainty.
Furthermore, the uncertainty-monitoring module provided a closed-loop recurrent network mechanism of excitatory feedback with the sensorimotor neuronal population, enhancing the latter's responses. This was reminiscent of a dynamic gain or urgency mechanism 46,47 . Future work could test this mechanism, e.g. using a task paradigm that produces fast error choices 48 and determining whether the '<' pattern is absent.
By utilising a proxy memory mechanism instantiated in the existing tonic bias input to the uncertainty-encoding neural population, our model was also able to show that decision uncertainty from a correct first trial caused a slower response time in the second trial, compared to when the first trial was incorrect (Fig. 3a). Moreover, the model predicted a slightly larger difference in response times when the second responses were error choices than when the second responses were correct choices (Fig. 3b). This difference was more pronounced with increasing evidence quality. Future work could test our model's prediction, for instance, by direct micro-stimulation or inactivation of the uncertainty-encoding neurons or brain region while performing a multi-stage decision version of the waiting-time task 6,11 .
It should be noted that the multi-stage decision paradigm is a special case of sequential decision-making 43 . Specifically, two coupled decisions have to be correct in order to receive a reward. The time delay from the first choice to the next stimulus onset (response-stimulus interval, RSI) is sampled from a truncated exponential distribution (range 0.3-1.0 s; mean 0.57 s). When simulating this paradigm, we have reset the uncertainty bias upon the completion of every pair of coupled trials. Thus, our implementation of the paradigm would not be affected by the RSI. Our stored uncertainty bias could perhaps be allowed to decay over time, for instance, similar to our previous work 49 . However, to the best of our knowledge, this multi-stage decision study 43 is the only published work that links decision confidence to response times with a sequential dependency, and we defer such speculation to further experimental evidence.
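For readers who wish to reproduce the RSI distribution, the following is a minimal Python sketch of sampling from an exponential distribution truncated to 0.3-1.0 s. The text gives only the range and the mean (0.57 s); the rate parameter is not stated, so the sketch solves for it numerically. The function names and the bisection bounds are assumptions made for illustration.

```python
import math
import random


def truncated_exp_mean(rate, low, high):
    """Mean of an exponential with the given rate, truncated to [low, high]."""
    width = high - low
    tail = math.exp(-rate * width)
    return low + 1.0 / rate - width * tail / (1.0 - tail)


def solve_rate(target_mean, low, high, lo=1e-3, hi=100.0, iters=100):
    """Bisection; the truncated mean decreases monotonically as rate grows."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if truncated_exp_mean(mid, low, high) > target_mean:
            lo = mid  # mean still too large, so a faster decay is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_rsi(rate, low, high, rng):
    """Inverse-CDF sample from the truncated exponential."""
    u = rng.random()
    width = high - low
    return low - math.log(1.0 - u * (1.0 - math.exp(-rate * width))) / rate


rate = solve_rate(0.57, 0.3, 1.0)          # inferred, not stated in the text
rng = random.Random(0)
samples = [sample_rsi(rate, 0.3, 1.0, rng) for _ in range(20000)]
print(rate, sum(samples) / len(samples))
```

Because the uncertainty bias is reset after every pair of coupled trials, these sampled intervals would only matter in a variant of the model where the stored bias decays over time.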
The results in Figs. 3a and b could be explained by the uncertainty level mappings (Figs. 2b, c and d). Specifically, in pairs of coupled trials, errors in first decisions led to a higher tonic bias input (and subsequently, higher overall input, Fig. 2d) in second decisions, due to higher uncertainty levels in first error decisions (Fig. 2b, red) than correct decisions (Fig. 2b, blue), which resulted in stronger excitatory feedback to the sensorimotor module. This led to faster activity ramping up of the sensorimotor populations, which in turn caused faster error (than correct) response times in second decisions. Furthermore, Fig. 3b showed that such differential effect would be more prominent for higher evidence quality.
The same model could exhibit changes-of-mind, which were more likely to occur with lower evidence quality 4,35 (Fig. 4a, black). Specifically, the model showed that changes-of-mind were more often accompanied by the correction of an impending error choice, hence more error-to-correct changes than correct-to-error changes (Fig. 4a, green vs red), consistent with previous observations 4,35 . Furthermore, the likelihood of error-to-correct changes peaked slightly at an intermediate level of evidence quality before decreasing as the task became easier 4,35 (Fig. 4a, green). The model predicted slower response times during changes-of-mind, regardless of evidence quality (Fig. 4b). Critically, when we removed the excitatory feedback from the uncertainty-monitoring module to the sensorimotor module, decision uncertainty could still be encoded, but there was no change-of-mind (Supplementary Figs. 4 and 5). This demonstrated the importance of the uncertainty-induced excitatory feedback for changes-of-mind.
We used phase-plane analysis to explain the change-of-mind phenomenon. First, the process of change-of-mind could be understood in terms of the sensorimotor network state being attracted to three distinct basins of attraction: the initial choice, then to the central choice-neutral 'uncertain' state, and finally to the other choice. With higher evidence quality, we found that the correct choice attractor dominated the phase plane, with its generally larger basin of attraction (e.g. Supplementary Fig. 9; see also 37 ) and the central attractor was less likely to appear due to the weaker uncertainty-based excitatory feedback (e.g. compare Supplementary Fig. 6 to Supplementary Fig. 7). This explains the monotonic decrease of the probability of change-of-mind (Fig. 4a). In other words, changes-of-mind did not occur due to the heavily biased phase plane and fast response times. However, at low evidence quality levels (ε < 4%), the phase plane was almost symmetric (Fig. 5b), which led to more initial errors (Fig. 4a). Under such low evidence quality, it was increasingly likely that the network would make an initial error choice 37 Figs. 7 and 8). On the contrary, increasing the evidence quality led to lower probability of changes-of-mind. This explains the peak in probability of correct changes-of-mind at an intermediate evidence quality (Fig. 4a; Supplementary Fig. 2).
The model further suggested that during changes-of-mind, noisy fluctuation around the phase-plane diagonal led to subtle deviations early in the trial (Fig. 5). The downstream motor module, which was itself a neural integrator, amplified any slight deviation and led to movement being initiated towards a choice target (Fig. 5a, and Supplementary Figs. 7 and 8). To provide further insights, we have provided a bifurcation (or stability) analysis of the activity of a neuronal population (selective to choice 1/Left) in the sensorimotor module, S1, with respect to the systematic variation (bifurcation parameter) of the overall excitatory feedback input current from the uncertainty-monitoring module, with evidence quality ε = 0 (Fig. 6b). The stable steady states are denoted by black lines, while dotted lines represent the unstable saddle steady states. During the initial epoch of a trial, this excitatory feedback input from the uncertainty-monitoring module (specifically the uncertainty-encoding neuronal population) to the sensorimotor population is very low or zero (green vertical dashed line). This is the regular winner-take-all regime 37 . As sensory evidence is accumulated in the sensorimotor populations, the uncertainty level is increased, which leads to higher excitatory feedback from the uncertainty-monitoring module. When the overall excitatory feedback is sufficiently large ...

Figure 6 | ... When making a choice between two alternatives, the strength of the stimulus (and noise) drives the ball towards one of the two wells (in this case, an error choice). A transient strong excitatory input (due to excitatory feedback from the uncertainty-monitoring module) changes the "energy" landscape into one centralized deep well, allowing a higher chance to change the initial decision. (b) Bifurcation (or stability) diagram of the activity of a neuronal population selective to choice 1/Left in the sensorimotor module, S1, with respect to variation in the overall excitatory feedback input current from the uncertainty-monitoring module. Evidence quality ε = 0. Black bold: stable steady states; black dotted: unstable saddle steady states. Green dashed: initial low uncertainty-induced excitatory feedback, lying within the winner-take-all regime.

Our model complements simpler computational cognitive models such as the extended drift-diffusion models 4,51,52 , by providing a neural circuit perspective on the neural mechanism behind decision confidence/uncertainty and change-of-mind. Specifically, our model links to psychophysical data (Figs. 1c, 1d, 2a, 2b, 3a, and 4a) and also directly relates to neurophysiological data (Figs. 1b and 5a, and Supplementary Figs. 6, 7 and 8a), which simpler models cannot do. Hence, both psychophysical (Figs. 3b and 4b) and neural (Figs. 1 and 5, and Supplementary Figs. 6-9) predictions are naturally embedded in the model. That said, such biologically-motivated models can be linked back (through various model reductions and assumptions) to simpler cognitive models such as the drift-diffusion models 37,45,52 .
Our distributed neural circuit model is more realistic than other biologically-motivated computational models of decision confidence or change-of-mind 35,50 . Evidence shows perceptual decisions are performed and distributed across multiple brain regions 53 .
Specifically, the activity of our motor module can be directly transformed to motor positional coordinates, and hence maps directly to physical output. Our model with the feedforward neural integrator architecture (from sensorimotor to motor module) suggests that the reversal of neural activities resembling a change-of-mind could be more clearly identified in more motor-based neurons than sensory-based neurons (Fig. 5 and Supplementary Figs. 7 and 8). Future experiments could show the difference in neural dynamics in different brain regions during change-of-mind tasks, e.g. via dual recordings at the sensory and motor-based brain regions.
In summary, our work has provided a neural circuit model that can compute decision confidence or uncertainty within and across trials while also occasionally exhibiting changes-of-mind. The model can replicate several important observations of decision confidence and change-of-mind and is sufficiently simple to allow rigorous understanding of its mechanisms.
Taken together, our modelling work has shed light on the neural circuit mechanisms underlying decision confidence and change-of-mind.

Methods
Psychometric and chronometric function. We used a Weibull function 54 to fit the psychometric function, $p = 1 - 0.5\exp\left(-(\varepsilon/\alpha)^{\beta}\right)$, where $p$ is the probability of a correct choice and $\varepsilon$ is the evidence quality, which, in the case of the random-dot stimulus 55,56 , is equal to the motion coherence level. With the parameters used with our model (see Table 1), $\alpha$ (the threshold at which the performance is 85%) is set to 7.32%, while $\beta$, the slope, is equal to 1.32. We defined the model's response (or reaction) time as the overall time it took for the motor neuronal population activity to reach a threshold value of 17.4 Hz (mimicking a motor output and physically reaching a choice target) from stimulus onset time. This is equivalent to the motor position reaching some threshold value (see below).
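As a quick illustration of the psychometric function above, the following Python sketch evaluates the Weibull form with the stated threshold and slope. The function name `weibull_accuracy` and the choice of evaluation points are assumptions made for illustration; only the formula and the two parameter values come from the text.

```python
import math

ALPHA = 7.32  # threshold parameter (% evidence quality), as stated in the text
BETA = 1.32   # slope parameter, as stated in the text


def weibull_accuracy(eps, alpha=ALPHA, beta=BETA):
    """Probability of a correct choice at evidence quality eps (in %)."""
    return 1.0 - 0.5 * math.exp(-((eps / alpha) ** beta))


# Evaluate at chance level, at the two evidence qualities used in Fig. 1b,
# and at the threshold parameter itself.
for eps in (0.0, 3.2, ALPHA, 25.6):
    print(f"eps = {eps:5.2f}%  ->  p = {weibull_accuracy(eps):.3f}")
```

With this functional form, accuracy starts at chance (0.5) for zero evidence, rises monotonically, and equals 1 - 0.5/e when the evidence quality equals the threshold parameter.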
Modelling sensorimotor populations using two-variable model. We used the reduced version of the spiking neural network model 50 described by its two slowest dynamical variables, which are the population-averaged NMDA-mediated synaptic gating variables 37 .
The dynamics of the two neuronal populations can be described by: ... where the two excitatory neuronal populations representing the two choice options are labelled 1 and 2, and the S's are the population-averaged NMDA-mediated synaptic gating variables.
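To make the two-variable description concrete, here is a minimal, noise-free Python sketch of a reduced decision model of this type. It assumes the widely used published form of the reduced two-variable model (Wong and Wang, 2006) together with that paper's parameter values; these may differ from the values in Table 1 of this work. The `feedback` argument is a hypothetical stand-in for a common excitatory current applied equally to both populations, and is not the authors' uncertainty-monitoring module.

```python
import math

# ASSUMED parameter values from the published reduced model (Wong & Wang,
# 2006); they may differ from those listed in this paper's Table 1.
A, B, D = 270.0, 108.0, 0.154            # input-output function constants
GAMMA, TAU_S = 0.641, 0.1                # fitting constant; NMDA time constant (s)
J_SELF, J_CROSS = 0.2609, 0.0497         # self-excitation, cross-inhibition (nA)
I_0, J_EXT, MU_0 = 0.3255, 5.2e-4, 30.0  # background (nA), nA/Hz, stimulus (Hz)


def rate(x):
    """Population firing rate (Hz) for a total input current x (nA)."""
    y = A * x - B
    if abs(y) < 1e-9:      # removable singularity: the limit is 1 / D
        return 1.0 / D
    return y / (1.0 - math.exp(-D * y))


def simulate(coherence, feedback=0.0, t_end=2.0, dt=0.0005):
    """Forward-Euler integration of the gating variables S1 and S2.

    coherence: evidence quality in % (positive values favour population 1)
    feedback:  hypothetical common excitatory current (nA) to both populations
    """
    s1 = s2 = 0.1
    i1 = J_EXT * MU_0 * (1.0 + coherence / 100.0)
    i2 = J_EXT * MU_0 * (1.0 - coherence / 100.0)
    for _ in range(int(t_end / dt)):
        x1 = J_SELF * s1 - J_CROSS * s2 + I_0 + i1 + feedback
        x2 = J_SELF * s2 - J_CROSS * s1 + I_0 + i2 + feedback
        ds1 = -s1 / TAU_S + (1.0 - s1) * GAMMA * rate(x1)
        ds2 = -s2 / TAU_S + (1.0 - s2) * GAMMA * rate(x2)
        s1, s2 = s1 + dt * ds1, s2 + dt * ds2
    return s1, s2


s1_easy, s2_easy = simulate(25.6)
print(s1_easy, s2_easy)
```

Without noise, the population favoured by the stimulus wins the competition; adding noise terms and varying `feedback` would be the natural next step for exploring how a common excitatory input reshapes the phase plane.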
... is a fitting constant based on previous work 37 . ... A top-down inhibition term (1000 nA) acts on the uncertainty-encoding (and inhibitory) populations from the beginning of the trial; it is removed from Eqs. (6) and (7) after stimulus onset (500 ms for the uncertainty-encoding population and 400 ms for the inhibitory population, as in Fig. 1) (see Supplementary Figure 10, where the effect of this timing feature on the model performance was explored). We used these delay values for all the figures in the main text and in Supplementary Information, unless noted otherwise (see Fig. 1). When the activity of one of the sensorimotor neuronal populations crosses a threshold value of 35.5 Hz, this inhibition is reactivated (3000 nA). As a result, the activity pattern of the uncertainty-monitoring module mimics data observed in neural recordings 6,25 (see Fig. 1b, middle panel). A coupling constant sets the inhibition strength from the inhibitory neuronal population to the uncertainty-encoding neuronal population, while a constant excitatory bias input can be modulated (only in multi-stage decisions) by decision uncertainty from the first trial in a pair of coupled trials (see below).
Motor neuronal populations. Similar to the uncertainty-monitoring neuronal populations, we dynamically modelled the motor output module using threshold-linear functions (with a threshold value of 0). Two neural populations selective for right and left, with mutual inhibition, were used. The persistent activity can be maintained using mutual inhibition to create a line attractor model 57 . The dynamics of the neuronal populations for the two choices (1 and 2) are described by: ... where the two dynamical variables are those of the left and right motor neuronal populations, respectively. The inputs are the firing rates of the two corresponding sensorimotor populations (Fig. 1), with an associated coupling constant of 1 nA Hz⁻¹. The remaining coupling constants link one population to another; the negative sign indicates that the connectivity is effectively inhibitory. As for the uncertainty-monitoring module, a top-down inhibition (1000 nA) acts on the motor populations from the beginning of the trial and is removed when the activity of one of the sensorimotor neuronal populations crosses a threshold value of 35.5 Hz.
Mapping the activity of the motor module to X position. The motor module output as a position in the directional space is approximated by a linear function: ... with a constant scaling factor whose value is determined by the equation: ... which depends on the hypothetical position of the two opposing choice targets. 1366x768 is one of the most commonly used screen resolutions. Therefore, in the model, this value is set to ...

... where the index denotes the trial number and two scaling parameters are used. The parameter values set in this work are 0.008 and 0.5. The resulting uncertainty bias is then used to modulate the tonic input (and hence baseline activity) of the uncertainty-encoding population (in equation (7)) in the second trial, by adding it to the tonic input (equation (13)). Upon the completion of a pair of coupled trials, the stored uncertainty bias is reset to 0.
Regression and classification of model outputs. We used a smoothing spline function in MATLAB to fit the model's decision accuracy as a function of uncertainty level. The model behaviour is identified as a change-of-mind if there is a reversal in the order of dominance between the two motor neuronal population firing rates, i.e. if the quantity in eq. 10 changes sign, and a choice target is eventually reached (before a 4 s timeout, see below) (see ...).

Selection of parameter values. Please refer to Table 1 in Supplementary Information for more information on how the parameter values were selected. In some cases, parameters were adopted from previous work 37 . Some parameter values, such as the coupling strength between the uncertainty-encoding and sensorimotor populations and the integration timing parameters, were selected to fit qualitative aspects of existing observations ('<' pattern 6,11,33,34 , probability of changes-of-mind 4 , neural profile of experimentally-observed uncertainty-encoding neurons and regions 6,25 ). We describe the effect of changing these parameter values on model behaviour in Supplementary Figs. 4 and 10.
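The change-of-mind criterion and the response-time definition given in Methods can be sketched in Python as follows. This is one possible reading under stated assumptions: times are measured from stimulus onset, any reversal of dominance before the threshold crossing counts as a change-of-mind, and the function names and example traces are hypothetical. The authors' MATLAB code may apply additional conditions.

```python
MOTOR_THRESHOLD_HZ = 17.4  # motor threshold stated in Methods
TIMEOUT_S = 4.0            # trial timeout stated in Methods


def classify_trial(times_s, left_hz, right_hz):
    """Return (choice, response_time_s, change_of_mind) for one trial.

    choice is 'left', 'right', or None if no target is reached in time.
    """
    reversals = 0
    last_sign = 0
    for t, l, r in zip(times_s, left_hz, right_hz):
        if t > TIMEOUT_S:
            break
        diff = r - l
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                reversals += 1          # dominance switched between populations
            last_sign = sign
        if max(l, r) >= MOTOR_THRESHOLD_HZ:
            choice = "right" if r >= l else "left"
            return choice, t, reversals > 0
    return None, None, False


def change_of_mind_probability(trials):
    """Change-of-mind trials divided by all simulated trials."""
    flags = [classify_trial(*trial)[2] for trial in trials]
    return sum(flags) / len(flags)


# Hypothetical motor traces (Hz), not model output.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
com_trial = (times, [0, 5, 8, 6, 4], [0, 2, 7, 12, 18])
plain_trial = (times, [0, 1, 2, 3, 4], [0, 6, 12, 18, 20])
print(classify_trial(*com_trial), classify_trial(*plain_trial))
print(change_of_mind_probability([com_trial, plain_trial]))
```

Applying `change_of_mind_probability` separately to the trials simulated at each evidence quality gives the kind of curve described for Fig. 4a.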
Code availability. MATLAB and XPPAUT codes were used to simulate the model and generate the figures. They can be found at the following GitHub repository (repo): https://github.com/nidstigator/uncertainty_com_revised_submission The accompanying README file includes instructions on how to reproduce these figures.