Dissociable roles for Anterior Cingulate Cortex and Basolateral Amygdala in Decision Confidence and Learning under Uncertainty

It has been suggested that the subjective sense of certainty, or confidence, in ambiguous sensory cues can alter the interpretation of reward feedback and facilitate learning. We trained rats to report the orientation of ambiguous visual stimuli according to a spatial stimulus-response rule. Following choice, rats could wait a self-timed delay for reward or initiate a new trial. Waiting times increased with discrimination accuracy, demonstrating that this measure could be used as a proxy for confidence. Chemogenetic silencing of BLA shortened waiting times overall whereas ACC inhibition rendered waiting times insensitive to confidence-modulating attributes of visual stimuli, suggesting contribution of ACC but not BLA to confidence computations. Subsequent reversal learning was enhanced by confidence. Both ACC and BLA inhibition blocked this enhancement but via differential modulation of learning strategies and consistency in using learned rules. Altogether, we demonstrate dissociable roles for ACC and BLA in transmitting confidence and learning under uncertainty.


Introduction
Learning relies on the ability to use external cues to predict the state of the world, take actions based on those predictions, and associate those actions with subsequent reward. Learning such associations can be straightforward when stimuli that precede actions or rewards can be discriminated clearly. However, this is not the case in naturalistic settings in which sensory cues or stimuli are ambiguous and thus the perception of or prediction about the state of the world is uncertain. In such situations, stimulus detection or discrimination are frequently accompanied by a sense of certainty, or confidence, in choice (Grimaldi, Lau, & Basso, 2015;Kepecs, Uchida, Zariwala, & Mainen, 2008). Recent evidence indicates that confidence may influence neural activity in brain regions involved in orchestrating reward responses (Guggenmos, Wilbertz, Hebart, & Sterzer, 2016;Hebart, Schriever, Donner, & Haynes, 2016) particularly when reward is significantly delayed (Iigaya, 2016). Consequently, the sensory properties of reward-predicting cues and confidence in disambiguating them may directly influence valuation (Chen, Mihalas, Niebur, & Stuphorn, 2013) and learning from reward feedback.
Recent studies in humans have revealed neural correlates of confidence estimation and learning in several brain regions, including the prefrontal cortex (Morales, Lau, & Fleming, 2018;Rounis, Maniscalco, Rothwell, Passingham, & Lau, 2010). However, it is unclear whether these areas are causally involved in these processes. Despite powerful interference techniques in rodents (Mahler & Aston-Jones, 2018;Smith, Bucci, Luikart, & Mahler, 2016), most rodent studies on neural mechanisms of confidence have been conducted within olfactory and auditory modalities (Foote & Crystal, 2007;Lak et al., 2014). In contrast, human studies on choice and learning under perceptual uncertainty have focused on visual processing, making it difficult to link findings across species.
Here, we trained rats to report the orientation of noisy Gabor patches by making spatial choices based on a learned stimulus-response rule (e.g., Horizontal→left and Vertical→right).
We manipulated different aspects of the visual stimuli to alter performance and uncertainty associated with discriminating the orientation. Following action selection using a touchscreen, rats expressed their confidence by time-wagering: they could wait for a variable amount of time before they could receive a possible reward or initiate a new trial (Lak et al., 2014). This design allowed us to measure confidence on a trial-by-trial basis. After ensuring that rats learned the stimulus-response associations, we reversed these associations to study the effect of confidence on learning using a counterbalanced design. Extensive studies in rodents have shown a distributed network supports learning and choice involving uncertain outcomes, including basolateral amygdala, BLA (Ghods-Sharifi, St. Onge, & Floresco, 2009;St Onge, Stopper, Zahm, & Floresco, 2012;Stopper & Floresco, 2011;Winstanley & Floresco, 2016), and anterior cingulate cortex, ACC (Akam et al., 2017;Tervo et al., 2014). Therefore, we used inhibitory Designer Receptors Exclusively Activated by Designer Drugs (DREADDs) to transiently inactivate projection neurons in the ACC or BLA in order to test the causal role of each area in confidence estimation or computation, and learning under perceptual uncertainty.

Results
Waiting time provides a proxy for confidence. To assess confidence during perceptual choice with uncertain visual information, we used a novel experimental paradigm in which rats were first presented with a single Gabor patch with one of two possible dominant orientations (horizontal (H) and vertical (V)) embedded in noise (Figure 1A, B). Perceptual uncertainty was manipulated using two parameters of the visual stimuli: 1) the signal-to-noise ratio (SNR), defined as the ratio of the contrast of the Gabor patch relative to the contrast of the added Gaussian noise; and 2) the overall contrast of both the Gabor patch and the added noise for a given SNR. These manipulations allowed us to modulate performance and confidence independently in order to design "matched-performance different-confidence" stimulus pairs used for learning (Koizumi, Maniscalco, & Lau, 2015;Odegaard et al., 2018). Following stimulus presentation, rats reported the perceived orientation by nosepoking one of the two side compartments of the touchscreen based on a complementary stimulus-response rule (e.g., H→left and V→right). Following action selection, rats expressed confidence in their response via time wagering; that is, they could wait for a probabilistically delivered reward if confident or initiate a new trial otherwise (see Methods).
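The two stimulus parameters described above can be illustrated with a short Python sketch that generates a noisy Gabor patch. This is a minimal example under stated assumptions, not the stimulus code used in the study: the function name `noisy_gabor`, the image size, the spatial wavelength, the envelope width, and the particular way the overall contrast scales both signal and noise are hypothetical choices made only for illustration.

```python
import numpy as np

def noisy_gabor(orientation_deg, snr, contrast, size=128, wavelength=16.0,
                sigma=None, seed=None):
    """Return a size x size image in [0, 1]: a Gabor patch plus Gaussian pixel noise.

    orientation_deg : stripe orientation (0 = horizontal stripes, 90 = vertical)
    snr             : Gabor contrast divided by noise contrast, as defined in the text
    contrast        : overall scale applied to both signal and noise
                      (assumption: noise SD = contrast, Gabor amplitude = snr * contrast)
    """
    rng = np.random.default_rng(seed)
    if sigma is None:
        sigma = size / 6.0  # hypothetical envelope width
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)  # x varies along columns, y along rows
    theta = np.deg2rad(orientation_deg)
    # Luminance varies perpendicular to the stripe orientation.
    u = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    gabor = envelope * np.cos(2.0 * np.pi * u / wavelength)
    noise = rng.standard_normal((size, size))
    # Scale signal and noise so that their contrast ratio equals snr.
    image = 0.5 + 0.5 * (snr * contrast * gabor + contrast * noise)
    return np.clip(image, 0.0, 1.0)

# Example: a horizontal patch at SNR = 4 and a hypothetical overall contrast of 0.1
stimulus = noisy_gabor(orientation_deg=0, snr=4, contrast=0.1, seed=1)
```

In this sketch, changing `snr` at a fixed `contrast` changes only the signal amplitude, whereas changing `contrast` at a fixed `snr` rescales signal and noise together, which mirrors the two manipulations described in the text.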

Figures 2 and 3).
Averaging over different contrast levels, we found that the probability of making a correct response was larger when SNR was larger (GLM; F(3,437) = 106, p < 0.001, adjusted R² = 0.418; ratio: β = 0.054, p < 0.001; p(correct) = 0.69 ± 0.06, 0.74 ± 0.05, and 0.80 ± 0.04 for SNR of 2, 3, and 4, respectively). In addition, the number of re-initiations decreased as SNR increased (GLM; F(3,437) = 17.7, p < 0.001, adjusted R² = 0.102; SNR: β = −0.01, p < 0.001; fraction of re-initiated trials = 0.17 ± 0.02, 0.16 ± 0.03, and 0.15 ± 0.02 for SNR of 2, 3, and 4, respectively). Finally, the distribution of waiting time generally followed that of reward delivery (Supplementary Figure 4). These results illustrate that rats learned the task and used visual stimuli to make a choice and for post-choice wagering.
Figure 1. (A) The rat first initiated a trial by nosepoking a white square in the center of the screen. The initiation stimulus then disappeared, and the rat was briefly presented (1 s) with a single horizontal (H) or vertical (V) Gabor patch masked by noise. Rats were required to report the dominant orientation (H or V) via nosepoke based on a complementary stimulus-response rule; e.g., H→left and V→right. Correct choices were rewarded probabilistically (on 70% of randomly selected trials), following variable delay times. After stimulus discrimination, rats could wait a self-timed delay in anticipation of reward or initiate a new trial. The initiation stimulus appeared on the touchscreen 2 seconds after a rat indicated its choice. (B) Examples of visual stimuli and one of the two stimulus-response rules. We refer to their discriminability as an SNR value reflecting the strength of visual signal (4, most discriminable; 3, moderately discriminable; 2, least discriminable). After discrimination of the visual stimulus, the rat makes a response (using the touchscreen) according to the rule H→left and V→right.
(C) Representative expression of inhibitory (Gi-coupled) DREADDs under CaMKIIa are shown in ACC and BLA.
This negative correlation between post-decision wagering and reaction time involving visual stimuli has also been reported in primates, but not in rodents.
To examine the relationship between accuracy and time wagering, we computed waiting time separately for correct and incorrect responses. We found that rats waited significantly longer following correct relative to incorrect responses (trial type; diff(mean) = 4.74; GLM: F(3,878) = 123, p < 0.001, adjusted R² = 0.294; trial type: β = 4.57, p < 0.001; Figure 2D). Consistent with this result, on trials with a re-initiation, rats discriminated more accurately when waiting times were longer for a given SNR (Pearson [...]).
Figure 2. Waiting time serves as a proxy for confidence that is more sensitive than reaction time. (A) Waiting time before re-initiation increases as SNR increases. Plotted are the distributions of waiting times for each SNR: 2 (blue), 3 (green), and 4 (red), following vehicle administration. Solid lines show the median of each distribution. The y-axis on the right shows the probability of a correct response. Solid circles indicate the calculated probability of correct responses for each bin of the waiting time distribution, for each SNR. The dashed lines show the regression line of the probability of correct responses for bins in each SNR. (B) Reaction time decreases as SNR increases. Plotted are the distributions of reaction time for all trials. Similar to panel A, the y-axis on the right shows the probability of a correct response. Solid circles indicate the calculated probability of correct responses for each bin of the reaction time distribution, for each SNR. Conventions are the same as in panel A. (C) Waiting time before re-initiation of a new trial is negatively correlated with reaction time to make a choice. Waiting time is plotted as a function of the reaction time for all trials and all rats. Each data point is a trial in a session following vehicle administration. (D) Waiting time is larger for correct compared to incorrect responses for any SNR.
Plotted is waiting time for all trials (black), correct trials (green), and incorrect trials (purple) for different SNR. The inset shows the relative difference in waiting time between correct and incorrect responses for different SNR. Error bars show the S.E.M. over sessions (typically smaller than the symbol). (E) Reaction time only weakly reflects the accuracy of the response. Plotted is the reaction time for all trials and separately for correct and incorrect responses. Conventions are the same as in panel D. The inset shows the relative difference in reaction time between correct and incorrect responses for different SNR. Overall, response accuracy is reflected in waiting time an order of magnitude better than in reaction time.
Compatible with previous findings, we also found that reaction times decreased with larger SNR and were faster for correct responses relative to incorrect responses (GLM; F(3,878) = 2.56, p = 0.05, adjusted R² = 0.005; SNR: β = −0.069, p = 0.03; Figure 2E). The normalized difference in reaction times between correct and incorrect responses, however, was only 1-5% across SNR values, compared to 30-50% for waiting times (Figure 2E inset). In addition, unlike waiting time, there was no significant correlation between performance and reaction time on a session-by-session basis (Pearson correlation; SNR = 2, r = −0.1, p = 0.2; SNR = 3, r = 0.13, p = 0.1; SNR = 4, r = 0.01, p = 0.82; Figure 2B). Importantly, we found similar results when we performed all above analyses for each value of contrast separately (Supplementary Figure 2). Moreover, independently of SNR and contrast levels, waiting time increased with discrimination performance (Supplementary Figure 1B).
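The comparison of normalized differences above can be made concrete with a small Python sketch. The normalization is not spelled out in this section, so the example assumes one plausible definition, (mean on correct trials − mean on incorrect trials) / mean over all trials, computed separately for each SNR; the function name `relative_difference_by_snr` and the toy numbers are hypothetical.

```python
import numpy as np

def relative_difference_by_snr(values, correct, snr):
    """Relative correct-vs-incorrect difference of a trial measure, per SNR level.

    Assumed definition (not taken from the source):
        (mean(correct trials) - mean(incorrect trials)) / mean(all trials)
    """
    values = np.asarray(values, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    snr = np.asarray(snr)
    result = {}
    for level in np.unique(snr):
        sel = snr == level
        v, c = values[sel], correct[sel]
        result[level.item()] = float((v[c].mean() - v[~c].mean()) / v.mean())
    return result

# Toy waiting times (arbitrary units) for eight trials at two SNR levels
waiting = [10, 12, 6, 4, 9, 11, 5, 7]
is_correct = [1, 1, 0, 0, 1, 1, 0, 0]
snr_level = [2, 2, 2, 2, 4, 4, 4, 4]
print(relative_difference_by_snr(waiting, is_correct, snr_level))
```

Applying the same function to reaction times would give the quantity that the text reports as much smaller than the corresponding value for waiting times.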
Together, our results illustrate that waiting time reflects confidence in a perceptual discrimination with much higher fidelity than reaction time does, including the proportional relationship between confidence and accuracy. Our findings thus extend previous observations in primates (Kiani & Shadlen, 2009) to rodents, and suggest that waiting time in our paradigm can also serve as a proxy for decision confidence (Lak et al., 2014).
Dissociable contributions of BLA and ACC to time wagering. We expressed Gi-coupled DREADD receptors in projection neurons of ACC and BLA (Figure 1C). After allowing time for transduction, we injected rats with CNO prior to a subset of testing sessions to inhibit these brain regions, using a within-subject design. In addition, to confirm the effect of CNO using ex vivo electrophysiological recording, we prepared a separate group of rats (n = 3) with ACC DREADDs using identical procedures. We found a significant reduction in field potential after CNO application only in the transfected slices (Supplementary Figure 5).
Despite strong, dissociable effects on waiting time, inhibition of ACC and BLA did not change the overall task performance, discrimination accuracy, response bias, or reaction time.
First, probability of correct response was not significantly different between vehicle and CNO administration (GLM; F(3,289) = 2.77, p = 0.04, adjusted R² = 0.017; drug: β = 0.003, p = 0.57; Figure 4A). Second, we computed discrimination performance, or d′, and found that this measure was also not significantly different between vehicle and CNO administration (GLM; F(7,82) = 40.7, p < 0.001, adjusted R² = 0.757; drug: β = −0.26, p = 0.21; Figure 4B). Importantly, we observed a strong and significant correlation between discrimination accuracy (d′) following CNO administration and following vehicle administration in rats with [...], indicating that perceptual discrimination was not affected by ACC or BLA inhibition. Third, we found no significant effect of drug condition (CNO vs. vehicle) and targeted brain region on the decision criterion (i.e., the response bias; Macmillan & Creelman, 1997) [...]. Finally, in additional control conditions, we also tested whether the presence of active virus was essential for the observed changes, and whether vehicle administration alone could cause changes in behavior. To do so, we measured behavioral responses in the rats with expressed DREADDs but without the administration of vehicle (no-injection control prior to reversal) and in rats with null virus and compared them with those under vehicle administration (i.e., the main control condition). We found that waiting time and reaction time did not differ between vehicle administration and no-injection control (see Supplementary analyses and Supplementary Figure 6). In addition, we found that the observed effects of CNO depended on the presence of active virus (Supplementary analyses).
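For readers unfamiliar with the signal detection quantities used above, the following Python sketch computes d′ and the decision criterion from trial counts using the standard equal-variance Gaussian formulas. It is an illustration only: treating one orientation as the "signal" category, the log-linear correction for extreme rates, and the function name `dprime_and_criterion` are assumptions, and the study's Methods may differ.

```python
from statistics import NormalDist

def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Equal-variance signal detection estimates from trial counts.

    Assumption: one orientation (e.g., H) is treated as "signal", so a hit is
    an H response to an H stimulus and a false alarm is an H response to a V
    stimulus. A log-linear correction avoids infinite z-scores at rates of 0 or 1.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = z(hit_rate) - z(fa_rate)             # discrimination sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion

# Toy counts: 80% correct on both stimulus categories gives an unbiased observer
d, c = dprime_and_criterion(hits=80, misses=20, false_alarms=20, correct_rejections=80)
print(round(d, 2), round(c, 2))
```

A manipulation that leaves both values unchanged, as reported here for ACC and BLA inhibition, affects neither first-order sensitivity nor response bias under this model.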
Interestingly, ACC inhibition renders waiting time even insensitive to the contrast level of visual stimuli such that for higher contrast and higher SNR, waiting time following inhibition dropped below the control condition (Supplementary Figure 3A-C). These findings suggest that ACC is involved in modifying visual uncertainty, perhaps via gain modulation, in order to compute perceptual uncertainty and to influence post-decision processes based on the latter.
To further test this, we computed metacognitive efficiency (meta-d′/d′; see Methods for details), which assesses how well waiting time tracks discrimination performance (d′) across trials (Maniscalco & Lau, 2012), or equivalently, the trial-by-trial correspondence of accuracy and waiting time. We found that this measure was significantly reduced following ACC inhibition compared to vehicle administration (Wilcoxon rank-sum test; p < 0.001; Figure 4C) but remained intact following BLA inhibition (Wilcoxon rank-sum test; p = 0.11). Nonetheless, metacognitive efficiency was larger than zero in both vehicle and CNO administration conditions [...].
Collectively, these results suggest that whereas inhibition of BLA decreases waiting time, this effect is most likely due to a general delay aversion or an increase in impulsive choice, because rats are still able to appropriately scale their waiting times according to performance and trial difficulty. In contrast, inhibition of the ACC renders rats' waiting times relatively insensitive to discrimination accuracy (d′) and SNR, suggesting that this region meaningfully participates in estimating the reliability of visual stimuli and consequently, computing and reporting confidence. Taken together with the results in the previous section, we show that in rats we are able to interfere with and dissociate first-order (discrimination performance) from second-order (metacognition) processes, as has been done in nonhuman and human primates (Koizumi et al., 2015;Maniscalco, Peters, & Lau, 2016;Miyamoto, Setsuie, Osada, & Miyashita, 2018;Odegaard et al., 2018).
Confidence enhances reversal learning. Following learning of the task and the stimulus-response rule, rats were randomly assigned into a high confidence (HC) or low confidence (LC) condition, and subsequently experienced a reversal in the stimulus-response rule. Unlike in the initial task, upon reversal, rats were not permitted to re-initiate the trial while waiting for a possible reward delivery. Correct responses, now under a reversed stimulus-response rule, were reinforced probabilistically as before (70% of the time). We selected the two parameters of the stimulus (SNR and contrast level) for each rat from a pair with matched performance (i.e., matched discrimination accuracy, or d′) but different average waiting times. This was only possible because, as we illustrated earlier, SNR and contrast allowed us to modulate performance and waiting time independently. As we show below, although we used different visual stimuli across HC and LC conditions, there was no systematic difference in performance accuracy and instead, there was a difference only in confidence levels measured by the waiting time before the reversal.
Importantly, we calculated d′ and confidence using only the data from sessions that were not preceded by injections (i.e., no-injection control prior to reversal) in order to assign rats to HC and LC conditions.
We performed several analyses to ensure that the only difference between HC and LC conditions was the confidence reported via waiting time. First, we found that d′ for HC and LC conditions were not significantly different for each of the stimuli that was administered after reversal, i.e., the contrast-SNR pairs that were chosen for reversal were not associated with [...]
To identify how learning and choice strategies were affected by confidence, we first compared learning between HC and LC conditions following vehicle administration. We found that rats in the HC condition performed better than the rats in the LC condition following vehicle administration (diff [...]). The observed faster learning occurred simultaneously with an increase in selection of the correct stimulus-response rule following selection of this rule and being rewarded on the preceding trial (Win-Stay; Permutation test; p < 0.001; Figure 5B). In addition, animals increased their tendency to switch from the incorrect to correct stimulus-response rule following unrewarded trials when the response on the preceding trial was incorrect (Lose-Switch after incorrect; Permutation test; p = 0.046). The improvement in learning due to higher confidence was also accompanied by a decrease in switch from the correct to incorrect stimulus-response rule when the response on the preceding trial was correct but not rewarded (Lose-Switch after correct; note that 30% of correct responses were not rewarded by design; Permutation test; p < 0.001). We also compared the tendency of the animals to repeat the same stimulus-response rule as in the previous trial beyond what is expected by chance, measured by the rule-based repetition index (RRI; Soltani, Noudoost, & Moore, 2013; see Methods). We found that RRI was larger for the HC relative to LC condition (Permutation test; p = 0.029), indicating that animals were more consistent/persistent in their behavior (following a specific rule) under higher confidence.
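The three strategy measures described above can be written as conditional proportions over consecutive trials. The Python sketch below is one reading of those verbal definitions, not the analysis code of the study: it assumes that a correct response means the animal applied the correct (post-reversal) rule and an incorrect response means it applied the incorrect rule, and the function name `strategy_proportions` is hypothetical.

```python
import numpy as np

def strategy_proportions(correct, rewarded):
    """Win-Stay and Lose-Switch proportions from per-trial outcomes.

    correct[t]  : True if the response on trial t followed the correct rule
    rewarded[t] : True if trial t was rewarded (correct trials are rewarded
                  only probabilistically, so correct-but-unrewarded trials exist)
    """
    correct = np.asarray(correct, dtype=bool)
    rewarded = np.asarray(rewarded, dtype=bool)
    prev_c, prev_r, curr_c = correct[:-1], rewarded[:-1], correct[1:]

    def cond_mean(outcome, condition):
        # Proportion of trials with `outcome`, among trials meeting `condition`
        return float(outcome[condition].mean()) if condition.any() else float("nan")

    return {
        # Correct rule repeated after it was used and rewarded
        "win_stay": cond_mean(curr_c, prev_c & prev_r),
        # Switch to the incorrect rule after a correct but unrewarded trial
        "lose_switch_after_correct": cond_mean(~curr_c, prev_c & ~prev_r),
        # Switch to the correct rule after an incorrect (hence unrewarded) trial
        "lose_switch_after_incorrect": cond_mean(curr_c, ~prev_c),
    }

# Toy session of eight trials
toy_correct = [1, 0, 0, 1, 1, 1, 0, 1]
toy_rewarded = [1, 0, 0, 1, 0, 1, 0, 0]
print(strategy_proportions(toy_correct, toy_rewarded))
```

Under this reading, faster learning in the HC condition corresponds to higher `win_stay` and `lose_switch_after_incorrect` and lower `lose_switch_after_correct`.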
Together, these results suggest that confidence can improve learning strategies from all possible outcomes and moreover, can increase consistency in following learned stimulus-response rules. To our knowledge, the observed enhancing effect of perceptual confidence on learning has been reported in humans (Guggenmos et al., 2016) but not in rodents, and the effects on rule consistency are novel.
Figure 5. [...] showing performance (probability of correct response) following reversal for rats prepared with DREADDs after vehicle administration. Plot shows the performance across all rats in high-confidence (HC; black) and low-confidence (LC; gray) conditions averaged over a sliding window of 100 trials. The inset shows the average performance in early (first half of the trials) and late (last quarter of the trials) trials, demonstrating that higher confidence improved the rate of learning but did not change the steady state. (*) indicates a significant difference in median between the two conditions (Chi-square test of ratio, p < 0.05). (B) The influence of confidence on learning strategies and perseveration. Plotted are the differences in the proportions of Win-Stay, Lose-Switch following correct but unrewarded responses, Lose-Switch after incorrect responses, and rule-based repetition index between HC and LC conditions in the first half of trials after the reversal.
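The learning curves described in the caption are averaged over a sliding window of 100 trials. A minimal Python sketch of such a moving average is given below; it assumes a simple boxcar window applied to a binary correct/incorrect sequence, and the function name `sliding_window_performance` is hypothetical.

```python
import numpy as np

def sliding_window_performance(correct, window=100):
    """Probability of a correct response within each window of consecutive trials.

    Returns an array of length len(correct) - window + 1, where entry i is the
    mean of correct[i : i + window].
    """
    correct = np.asarray(correct, dtype=float)
    if correct.size < window:
        raise ValueError("need at least `window` trials")
    kernel = np.ones(window) / window
    return np.convolve(correct, kernel, mode="valid")

# Toy example: 100 incorrect trials followed by 100 correct trials
curve = sliding_window_performance([0] * 100 + [1] * 100, window=100)
print(curve[0], curve[50], curve[-1])
```

Plotting one such curve per condition (HC and LC) would give a figure of the kind described in panel A.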
Both BLA and ACC support reversal learning. We next compared overall learning across different DREADDs inhibition conditions. In contrast to the control conditions, the overall performance over time in the LC condition was significantly better than in the HC condition following CNO treatment (diff [...] p < 0.001; BLA: p < 0.001; Figure 6B,D), but this reduction was stronger for BLA inhibition (Permutation test; p = 0.0048). This suggests that both ACC and BLA inhibition attenuate reversal learning by reducing the tendency to repeat a rewarded stimulus-response rule due to higher confidence. Therefore, although both ACC and BLA contribute to mediating the effect of confidence on learning from positive feedback, the stronger attenuation and reversal of this effect following BLA inhibition illustrates a more prominent role for BLA in learning under uncertainty.
Secondly, we analyzed the effect of negative feedback on switching from the previous stimulus-response rule, separately for when the previous rule was correct (Lose-Switch after correct) and when the previous rule was incorrect (Lose-Switch after incorrect). We examined these two types of trials separately because, as we showed above, confidence has differential effects on these trials (Figure 5B). We observed that both ACC and BLA inhibition reversed the effect of confidence (i.e., difference between HC and LC conditions) on learning from negative feedback; however, this was only significant for BLA (Permutation test; ACC Lose-Switch after correct: p = 0.23; ACC Lose-Switch after incorrect: p = 0.44; BLA Lose-Switch after correct: p = 7.4 × 10⁻⁴; BLA Lose-Switch after incorrect: p = 0.0044; Figure 6B,D). In addition, the effect of confidence on Lose-Switch after an incorrect response became more negative after ACC inhibition compared to BLA inhibition (Permutation test; p = 0.0051), indicating a stronger role for ACC in learning from negative feedback when the response was incorrect. Together, these results suggest that BLA has a more pronounced role in mediating the effect of confidence in learning from positive feedback whereas ACC is more involved in mediating the effect of confidence in learning from negative feedback.
Figure 6. (A) Learning curves (probability of correct response) after a reversal for rats prepared with ACC DREADDs following CNO (blue) or vehicle (Veh; black) administration. Plot shows the performance across all rats averaged over a sliding window of 100 trials for high-confidence (HC) and low-confidence (LC) conditions. The inset shows the average performance in early (first half of the trials) and late (last quarter of the trials) trials, demonstrating that either perceptual uncertainty or ACC inhibition decreases the rate of learning.
Following ACC inhibition, rats eventually reach a similar performance level compared to the control condition (vehicle administration). (*) indicates a significant difference in median between the two conditions (Chi-square test of ratio, p < 0.05). (B) Win-Stay, Lose-Switch following correct response, Lose-Switch after incorrect response, and rule-based repetition index for the HC and LC conditions in ACC DREADDs during the first half of trials after the reversal. ACC inhibition only removes the benefit of confidence on Win-Stay but weakens the effect of confidence on learning from negative feedback or consistency in rule selection. (*) indicates a median significantly different from zero or a significant difference in median between the two conditions (Permutation test, Bonferroni corrected, p < 0.01). Magenta squares indicate a significant difference between ACC and BLA inhibition (Permutation test, p < 0.05). (C) Learning curves (probability of correct response) after a reversal for rats prepared with BLA DREADDs following CNO (red) or vehicle (black) administration. BLA inhibition decreases the rate of learning but eventually rats reach a similar performance level compared to the vehicle administration condition. (D) The same as in panel B but for BLA DREADDs. Unlike ACC inhibition, BLA inhibition reverses the benefits of confidence on all learning strategies and consistency in rule selection.
Finally, we evaluated the consistency in using learned stimulus-response rules using the RRI. We found that BLA but not ACC inhibition reversed the effect of confidence on RRI (Permutation test; ACC: p = 0.36; BLA: p < 0.001; Figure 6B,D), and this effect was more negative after BLA than ACC inhibition (Permutation test; p < 0.001). This indicates that BLA, but not ACC, is important for mediating the effect of confidence in consistently using [...]
Together, these results reveal dissociable effects of ACC and BLA inhibition on learning under uncertainty. Importantly, only BLA inhibition consistently reversed the benefit of confidence on learning from both positive and negative feedback. This suggests that BLA is directly involved in confidence-dependent learning (and not estimation) because BLA inhibition only shifts confidence readout with respect to perceptual uncertainty, as shown earlier. Different from BLA effects, ACC has a more specific role in supporting learning, mainly from negative feedback following an incorrect response and to a lesser extent from positive feedback (Win-Stay), perhaps by making confidence computation sensitive to the level of perceptual uncertainty, as suggested by our results on confidence readout. In addition, the consistency in using learned stimulus-response rules did not differ by confidence condition following ACC inhibition, whereas BLA inhibition made rats less likely to apply the learned rules under higher confidence.
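Many of the comparisons in this section rely on permutation tests. The Python sketch below shows a generic two-sample permutation test as an illustration of the idea. It is not the procedure used in the study: the test statistic (difference in means), the two-sided p-value, the number of permutations, and the function name `permutation_test` are all assumptions.

```python
import numpy as np

def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in means of two samples.

    Group labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose absolute mean difference is at least as large as the
    observed one (with the usual +1 correction so p is never exactly zero).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        diff = shuffled[:a.size].mean() - shuffled[a.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Toy example: a hypothetical per-session measure in two conditions
hc = [0.80, 0.90, 0.85, 0.95, 0.90, 0.88]
lc = [0.50, 0.55, 0.60, 0.45, 0.52, 0.58]
difference, p_value = permutation_test(hc, lc)
print(difference, p_value)
```

The same scheme extends to other statistics, such as a difference in medians or a difference of differences between brain regions, by changing the quantity computed on each shuffle.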

Discussion
We examined the causal roles of ACC and BLA in confidence report and learning under perceptual uncertainty. We studied learning under uncertainty by training rats to report the orientation of ambiguous visual stimuli based on a learned stimulus-response rule and read out their confidence in choice using a time wagering task. Decision opt-out and wagering tasks have been previously used to assess confidence in rats (Foote & Crystal, 2007;Lak et al., 2014), showing that rats exhibit similar behavior to those of humans with monetary rewards (Persaud, McLeod, & Cowey, 2007). We observed that rats are willing to tolerate larger delays to outcomes after faster and easier perceptual decisions involving more salient stimuli. We showed that ACC is required for appropriate waiting according to the uncertainty of the visual stimulus and ensuing choice. Following ACC inhibition, post-decision waiting times were less sensitive to the strength of the visual evidence, and accuracy tracked less well with these waiting times on a trial-by-trial basis. In contrast, inhibition of BLA decreased rats' willingness to wait overall, regardless of the strength of the visual information and decision difficulty.
It has been proposed that confidence in a decision not only affects our choices but also influences how we learn (Guggenmos et al., 2016;Kiani & Shadlen, 2009). However, the effect of high confidence on reinforcement learning had not been explored directly in any animal model. We found that high confidence in a perceptual decision can boost subsequent reversal learning of stimulus-response rules using reward feedback, even when we controlled for signal processing capacity (i.e., task performance). Critically, all rats were able to learn new reward contingencies upon the change in stimulus-response mapping, but the learning was faster in the group of rats that had higher confidence at the onset of reversal. We show that the BLA and ACC are both required for the enhancement of learning by perceptual certainty or confidence.

ACC in perceptual decision making.
In rats, perceptual metacognition has been previously assessed within olfactory and auditory (Foote & Crystal, 2007;Kepecs et al., 2008;Lak et al., 2014) but not visual modalities. These studies have revealed a role of orbitofrontal cortex (OFC); for example, it has been shown that activity in the rat OFC reflects the degree of uncertainty in decisions based on olfactory information during reward anticipation (Kepecs et al., 2008). Similar to our results for the ACC, inhibition of OFC impairs behavioral adjustments to decision confidence, but not perceptual choices based on conflicting evidence themselves (Lak et al., 2014). However, there are several important differences, not just similarities, between previous and present studies. First, Lak et al. (2014) showed that waiting time increased for correct trials and decreased for incorrect trials, whereas we show that waiting time increased for both correct and incorrect trials, across all contrast levels. This relationship, however, depended on the ACC. Specifically, following ACC inhibition, waiting time became insensitive to SNR especially on trials with an incorrect response. Nevertheless, here, we show that the ACC plays a similar role to the OFC but in visual information processing. That is, the ACC may guide commitment to and persistence with the current behavior based on the quality of visual evidence that led to the decision. These similarities offer interesting possibilities for the frontocortical mechanisms of confidence estimation and suggest there may not be a subregional specialization for this process (Hunt & Hayden, 2017;Yoo & Hayden, 2018). Consequently, future research should be directed at uncovering the constraints (and if there is differential involvement that may be revealed over different timescales) for the ACC and OFC in decisional confidence.
We show here that unlike BLA inhibition, ACC inhibition renders confidence readout rather insensitive to both attributes of visual stimuli (SNR and contrast), suggesting that ACC "gain" modulates visual uncertainty computed in visual areas to determine perceptual uncertainty and post-decision processes. Anatomically, the ACC is densely interconnected with visual cortices in rodents (Vogt & Miller, 1983;Vogt & Paxinos, 2014), particularly the more rostral aspect of ACC in rat as we have targeted here (Vogt & Paxinos, 2014). Furthermore, this brain region is well positioned to integrate information about stimuli, actions, and rewards by tracking trial-by-trial outcomes of responses (Bryden, Johnson, Tobia, Kashtelyan, & Roesch, 2011;Hayden, Heilbronner, Pearson, & Platt, 2011;Heilbronner & Hayden, 2016). In our task, inhibition of the ACC rendered post-decision waiting times less sensitive to the strength of visual information and performance accuracy across trials, without affecting perceptual discrimination itself: i.e., impaired second order but left the first order processes intact (Miyamoto et al., 2017).
Previous work in primates has demonstrated that confidence reports are informed by both decision difficulty and elapsed decision time, or reaction time (Fetsch, Kiani, Newsome, & Shadlen, 2014; Kiani, Corthell, & Shadlen, 2014; Kiani & Shadlen, 2009). Even in the absence of a change in decision accuracy, longer reaction times are associated with lower confidence. In the present work, we demonstrate that the same effect is present in rats and is also supported by the ACC. Finally, we found that ACC inhibition decreased metacognitive efficiency, or the trial-by-trial correspondence between decision accuracy and waiting times. In humans, a similar effect has been reported for perturbations of activity in the dorsolateral prefrontal cortex, which is shown to be important for visual metacognition (Rounis et al., 2010).
We note that waiting time is an indirect measure of confidence and as such, the effect of brain manipulations should be interpreted with caution. Firstly, several cortical and subcortical brain regions participate in reward timing (Bakhurin et al., 2017;Huertas, Hussain Shuler, & Shouval, 2015;Levy, Zold, Namboodiri, & Hussain Shuler, 2017;Murakami, Shteingart, Loewenstein, & Mainen, 2017). Secondly, an overall reduction in waiting time can result from an increased delay sensitivity or impulsivity and therefore may not be reflective of confidence per se. Here, we found that inhibition of the BLA renders rats less willing to wait overall.
However, this effect of BLA inhibition was independent of the strength of visual evidence to make a perceptual decision. Furthermore, whereas inhibition of the ACC decreased metacognitive efficiency, inhibition of the BLA failed to change this measure. Thus, during perceptual decision making, the BLA may overall increase waiting time for reward, perhaps enabling other brain regions to interpret and/or act on ACC signals related to the strengths of visual information.
Our post-decision wagering paradigm mimics many features of foraging tasks that involve patch-leaving decisions. In rats, the ACC represents expected outcomes and signals errors in reward prediction, and is engaged when a change in the course of action is required and encodes information about rewards in remote locations (Bryden et al., 2011;Hyman, Holroyd, & Seamans, 2017;Mashhoori, Hashemnia, McNaughton, Euston, & Gruber, 2018). Similarly, in primates, the dorsal ACC participates in foraging decisions, signaling the value of leaving a patch in pursuit of other opportunities in the environment (Hayden et al., 2011). Furthermore, the dorsal ACC signals the value of the rejected option after the decision has been made (Blanchard & Hayden, 2014). Future research is needed to determine whether the impairment produced by ACC inhibition is specific to post-decision wagering tasks or will also manifest in opt-out tasks.

Visual metacognition in rats. Recent work documents important similarities in visual information processing between rodents and primates, although species differences do exist (Meier & Reinagel, 2011; Reinagel, 2015). Pigmented rat strains, like the Long-Evans strain we studied here, have previously been used for vision research (Reinagel, 2015). Here, we found that rats also show high levels of visual metacognition, adjusting post-decision waiting times based on the uncertainty in perceptual decisions. This may allow direct comparison with the modality most often assessed in human and nonhuman primates while enabling easier, precise circuit manipulations.

BLA and ACC in learning under perceptual uncertainty.
We show that stimulus-response remapping is facilitated by perceptual certainty. Critically, both the BLA and ACC are required for faster learning when perceptual certainty is strong enough to improve learning. Considering that the BLA only shifts confidence readout, the observed reversals of all benefits of confidence on learning strategies and consistency in following a stimulus-response rule after BLA inhibition suggest a direct role of BLA in learning under uncertainty. That is, if the influence of BLA on learning was due to shifting confidence readout we would expect a bias in a certain direction and not reversal of all effects. In contrast, the effects of ACC seem to work through distorting confidence readout because its inhibition mainly attenuated the effect of confidence on learning.
Our results are also consistent with previous observations that the ACC-BLA circuit adjusts the levels of attention directed at environmental cues for learning based on prediction errors (Bryden et al., 2011). More specifically, it has been shown in rats that attention-related activity in the ACC is strong during the entire trial following unexpected changes in reward and is most pronounced prior to and during outcome-predictive cues (Bryden et al., 2011). In contrast, unsigned reward prediction errors in the BLA may serve as attention signals, occurring at the time of unexpected reward delivery and omission (Roesch, Calu, Esber, & Schoenbaum, 2010). The ACC and BLA share direct and indirect bi-directional projections and the activity in this circuit appears to be required for adaptive learning under conditions of uncertainty in the visual cues guiding decisions or perhaps under more general cases of learning under uncertainty (Farashahi et al., 2017; Stolyarova & Izquierdo, 2017; Soltani & Izquierdo, 2019).

Subjects.
In total, 31 male outbred Long Evans rats (Charles River Laboratories, Crl:LE, Strain code: 006) were used in the experiments. The housing room in the vivarium was maintained under a reversed 12/12 h light/dark cycle at 22°C, and all behavioral testing was conducted during the rats' active phase, the dark portion of the cycle (between 08:00 and 18:00h). Rats remained undisturbed for 3 days after arrival to our facility to acclimate to the vivarium. Each rat was then handled for a minimum of 10 min once per day for 5 days. Following handling, rats underwent stereotaxic surgery to express inhibitory Designer Receptors Exclusively Activated by Designer Drugs (DREADDs; or control null virus to express only a fluorescent protein but no mutant receptors) and were allowed to recover for three weeks. Rats were subsequently food-restricted to ensure motivation to work for food for one week prior to and during the behavioral testing, while water was available ad libitum except during behavioral testing. All rats were pair-housed at arrival and separated on the last day of handling to facilitate post-surgical recovery and minimize aggression during food restriction. We ensured that rats did not fall below 85% of their free-feeding body weight, and we saw a significant increase in rat body weight throughout the prolonged behavioral testing. On the last two days of food restriction prior to behavioral training, rats were fed 20 sugar pellets in their home cage to accustom them to the food rewards. For each experiment, rats were randomly assigned into groups, with the exception of assignment into high-confidence (HC) and low-confidence (LC) conditions for reversal learning as detailed below. All procedures were approved by the Chancellor's Animal Research Committee at the University of California, Los Angeles.

Viral constructs. We used inhibitory (Gi-coupled) DREADDs on a CaMKIIa promoter to transiently inactivate projection neurons in the ACC and BLA during performance on the behavioral task. An adeno-associated virus AAV8 driving the hM4Di-mCherry sequence under the CaMKIIa promoter was used to transduce ACC or BLA neurons with DREADDs (AAV8-CaMKIIa-hM4D(Gi)-mCherry, packaged by Addgene). A virus lacking the hM4Di DREADD gene (AAV8-CaMKIIa-EGFP, packaged by Addgene) was used as a null virus control. There were four experimental groups of rats: the active virus in BLA (n=8), the active virus in ACC (n=7), the null virus in BLA (n=8), and the null virus in ACC (n=8). The groups with the null virus expressed in brain regions of interest allowed us to control for virus exposure, non-specific effects of surgical procedures and subsequent injections on behavior.
Surgery. All surgeries were performed using aseptic stereotaxic techniques under isoflurane gas anesthesia (5% in O2 during induction and 2-2.5% in O2 for maintenance). After being placed into a stereotaxic apparatus (David Kopf; model 306041), the scalp was incised and retracted.
The skull was then leveled to ensure that bregma and lambda were in the same horizontal plane.
Small burr holes were drilled in the skull to allow cannulae with an injection needle to be lowered into the BLA (the injection needle extended 1mm below the cannulae and its tip was at AP: −2.5; ML: ±5.0; DV: −7.8 (0.1μl) and −8.1 (0.2μl) from skull surface) or ACC (0.3 μl, AP = +3.7; ML= ±0.8; DV = −2.6). The injection needle was attached to polyethylene tubing connected to a Hamilton syringe controlled by a syringe pump. The viruses were infused bilaterally at a rate of 0.1 μl/min. For the BLA, the ventral infusion was administered first (at -8.1) followed by the dorsal site (-7.8) since our prior experiments demonstrated more precise targeting with this approach. There was no waiting time between the two infusions for BLA.
After the last viral infusions in BLA or single infusion in ACC, the needle was left in place for 10 minutes to allow for diffusion of the virus, after which the cannulae were slowly lifted out of the brain and the wounds stapled. Each surgery took approximately 40 min. All rats were given a three-week recovery period prior to food restriction and subsequent behavioral training.
Carprofen (5mg/kg, s.c.) was administered for 5 days postoperatively to minimize pain and discomfort. Behavioral measures of discomfort and conditions of the wounds were monitored daily, and all surgical staples were removed within 7-10 days after surgeries depending on a rat's recovery.
Electrophysiological confirmation of DREADDs. Separate rats were prepared with ACC DREADDs using identical surgical procedures to the main experiments. Slice recordings did not begin until at least three weeks following surgery to allow sufficient hM receptor expression.
Slice recording methods were similar to those previously published (Babiec, Jami, Guglietta, Chen, & O'Dell, 2017). Three rats were deeply anesthetized with isoflurane and decapitated. The brain was rapidly removed and submerged in ice-cold, oxygenated (95% O2/5% CO2) artificial cerebrospinal fluid (ACSF) containing (in mM): 124 NaCl, 4 KCl, 25 NaHCO3, 1 NaH2PO4, 2 CaCl2, 1.2 MgSO4, and 10 glucose (Sigma-Aldrich). 400-μm-thick slices containing the ACC were then cut using a Campden 7000SMZ-2 vibratome. Slices from the site of viral infusion were used for inhibitory G-protein (Gi) validation. Expression of mCherry was confirmed post-hoc. Slices were maintained (at 30°C) in interface-type chambers that were continuously perfused (2-3 ml/min) with ACSF and allowed to recover for at least 2 hours before recordings. Following recovery, slices were perfused in a submerged slice recording chamber (2-3 ml/min) with ACSF containing 100 μM picrotoxin to block GABAA receptor-mediated inhibitory synaptic currents. A glass microelectrode filled with ACSF (resistance = 5-10 MΩ) was placed in layer 2/3 ACC to record field excitatory postsynaptic potentials and population spikes elicited by layer 1 stimulation delivered using a bipolar, nichrome-wire stimulating electrode placed near the medial wall in ACC. Stimulation intensity (0.2 msec duration pulses delivered at 0.33 Hz) was set to the minimum level required to induce reliable population spiking in ACC. Once reliable responses (measured as the area of postsynaptic responses over a 4 second interval) were detected, baseline measures were taken for at least 10 minutes, followed by a 20-minute bath application of 10 μM CNO. Unless noted otherwise, all chemicals were obtained from Sigma-Aldrich.
Behavioral training. Behavioral training was conducted in operant conditioning chambers (Model 80604, Lafayette Instrument Co., Lafayette, IN) that were housed within sound- and light-attenuating cubicles. Each chamber was equipped with a house light, tone generator, video camera, and LCD touchscreen opposing the pellet dispenser. The pellet dispenser delivered 45mg dustless precision sucrose pellets. Software (ABET II TOUCH; Lafayette Instrument Co., Model 89505) controlled the hardware. All testing schedules were customized in ABET by our group and can be requested from the corresponding author. During habituation, rats were required to eat five pellets out of the pellet tray inside of the chambers within 15 min before exposure to any stimuli on the touchscreen. They were then progressively trained to respond to visual stimuli presented on the screen, to initiate the trial, report the orientation of the visual stimulus (vertical or horizontal) by nosepoking left or right on a white square stimulus, and wait for rewards.

Behavioral Testing and experimental paradigm.
A rat first initiated each trial by nosepoking a bright white square in the center of the screen. The initiation stimulus then disappeared, and a rat was briefly (1s) presented with a vertical (V) or horizontal (H) Gabor patch embedded in noise, and required to report the orientation (H or V) based on a complementary stimulus-response rule, e.g., H→left and V→right. These spatial responses were made by nosepoking the right or left compartments of the touchscreen that became illuminated after the disappearance of the oriented visual stimulus. We altered two properties of the visual stimuli to manipulate their ambiguity.
First, we changed the signal-to-noise ratio (SNR), defined as the ratio of the contrast of the Gabor patch relative to the contrast of the added Gaussian noise. Second, we changed the overall contrast of both the Gabor patch and the added noise for a given SNR. Gratings were 200 pixels square, with spatial frequency 20 px/cycle. For training, gratings were presented at 100% contrast. For testing, gratings were embedded in white noise as follows. To create different contrasts designed to produce a range of performance (measured by d′) and confidence (measured by waiting time) responses such that HC and LC conditions could be established, animals performed the task on 40%, 60%, and 80% maximum contrast Gabor patches embedded in noise also with three possible levels of increasing contrast, for nine possible full-factorial combinations in total. This method of constant stimuli (Macmillan & Creelman, 2004) facilitated selection of a pair of stimuli from these nine levels such that the animal had produced matched perceptual performance capacity (d′) but different waiting time in HC and LC conditions. Correct choices were reinforced probabilistically after a randomly assigned delay: 70% of correct responses resulted in reward delivery. Time to reward delivery was drawn from an exponential distribution with a mean of 8 sec (see Supplementary Figure 4 for an example and the average distributions of reward delivery), and on trials with no reward, the trial ended after 40 sec if no re-initiation occurred. Specifically, following stimulus discrimination, rats expressed their confidence by time wagering: they could wait a self-timed delay in anticipation of reward or initiate a new trial, similar to previous work by the Kepecs lab (Lak et al., 2014). The initiation stimulus appeared on the touchscreen 2 sec after a rat indicated its choice. This delay was imposed to prevent indiscriminate responding.
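The reinforcement schedule described above can be illustrated with a minimal Python sketch. The 70% reward probability, the 8 sec mean delay, and the 40 sec limit come from the text; the function names, the untruncated exponential draw, and the `trial_outcome` helper are our own assumptions for illustration and are not the ABET schedule used in the experiments.

```python
import random

REWARD_PROB = 0.70       # probability that a correct choice is rewarded (from the text)
MEAN_DELAY_S = 8.0       # mean of the exponential reward-delay distribution (from the text)
TRIAL_TIMEOUT_S = 40.0   # unrewarded trials end after this long without re-initiation

def schedule_trial(correct, rng):
    """Return (rewarded, reward_delay_s); the delay is None when no reward is scheduled."""
    if correct and rng.random() < REWARD_PROB:
        # expovariate takes the rate (1 / mean), not the mean itself.
        # Assumption: the draw is not truncated at the trial timeout.
        return True, rng.expovariate(1.0 / MEAN_DELAY_S)
    return False, None

def trial_outcome(correct, waiting_time_s, rng):
    """Hypothetical helper: outcome of a trial given how long the rat was willing to wait."""
    rewarded, delay = schedule_trial(correct, rng)
    if rewarded and delay <= waiting_time_s:
        return "reward", delay
    # The rat re-initiated (or the trial timed out) before any reward arrived.
    return "no_reward", min(waiting_time_s, TRIAL_TIMEOUT_S)
```

Because the delay is exponentially distributed, a longer self-timed wait always raises the chance of collecting a scheduled reward, which is what makes waiting time usable as a wager.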
We define the time that the animal waited before re-initiating a trial as the waiting time (see Supplementary Figure 4 for an example and the average distributions of waiting time). After fully learning the task and being tested on perceptual decision-making with re-initiation (confidence report), rats were randomly assigned to a high-confidence (HC) or low-confidence (LC) condition and experienced a reversal in the stimulus-response rule. In order to determine the visual stimuli for HC and LC conditions for each rat, we selected two SNR and contrast levels that had equal discrimination accuracy (d′) and reinforcement history and were different only in confidence levels measured by waiting time. After determining discrimination-matched stimuli for each rat, rats were randomly assigned to LC and HC conditions and the corresponding stimulus was used for each rat based on the assigned condition. After the reversal in stimulus-response rule, rats were no longer offered an option to re-initiate the trial, but were required to wait a random delay before reward delivery or the end of the trial (on no-reward trials) following a response. This was to simplify the re-learning and ensure rats were not adopting a complex strategy due to the availability of the re-initiation option.
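As an illustration of the stimulus-matching logic, the Python sketch below picks, for one rat, the pair of stimulus conditions whose discrimination accuracy is matched but whose waiting times differ most. The tolerance value, the data layout, and the function name are hypothetical, and matching on reinforcement history is omitted for brevity, so this should not be read as the selection procedure actually applied.

```python
from itertools import combinations

def pick_matched_pair(stats, dprime_tol=0.1):
    """Pick a (low-confidence, high-confidence) stimulus pair for one rat.

    stats maps (contrast, snr) -> (d_prime, mean_waiting_time_s).
    dprime_tol is a hypothetical tolerance for calling two d' values "matched".
    Returns None if no pair of stimuli has matched discrimination.
    """
    best = None
    for a, b in combinations(sorted(stats), 2):
        (d_a, w_a), (d_b, w_b) = stats[a], stats[b]
        if abs(d_a - d_b) > dprime_tol:
            continue  # discrimination accuracy is not matched
        gap = abs(w_a - w_b)
        if best is None or gap > best[0]:
            low, high = (a, b) if w_a < w_b else (b, a)
            best = (gap, low, high)
    return None if best is None else (best[1], best[2])
```

For example, with made-up values `{(40, 2): (1.0, 3.0), (60, 2): (1.05, 5.0), (80, 4): (2.0, 7.0)}` the function returns `((40, 2), (60, 2))`: the first two conditions have near-identical d′ but differ in waiting time.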
To study the contributions of BLA and ACC to decision-making and learning under perceptual uncertainty, we used a within-subject design: rats were given vehicle injections, CNO injections, and no injection (prior to reversal). The order of CNO and vehicle injections was counterbalanced. Therefore, a subset of behavioral sessions was preceded by inactivation of ACC or BLA pyramidal neurons via peripheral (3mg/kg; i.p.) administration of clozapine-N-oxide (CNO) 10 min prior to the testing. The injections were administered in rats' housing room.
Due to the long duration of pretraining on our task, all CNO injections were administered at least 12 weeks following the surgery, ensuring sufficient virus transduction and receptor expression.
On another subset of sessions, rats received vehicle to control for behavioral effects of the stress of injections. All rats received a 2-day wash-out period between drug conditions.
Histology. Rats were euthanized within 90 min following the last testing session with an overdose of Euthasol (Euthasol, 0.8 mL, 390 mg/mL pentobarbital, 50 mg/mL phenytoin; Virbac, Fort Worth, TX), were transcardially-perfused, and their brains removed for histological processing. Brains were fixed in 10% buffered formalin acetate for 24 hours followed by 30% sucrose for 5 d. To visualize hM4Di-mCherry and -EGFP expression in BLA or ACC cell bodies, free-floating coronal sections were mounted onto slides and coverslipped with mounting medium for DAPI. Slices were visualized using a BZ-X710 microscope (Keyence, Itasca, IL), and analyzed with BZ-X Viewer and analysis software.
Signal detection theory analyses. According to standard signal detection theory, d′ measures how well a subject's perceptual decisions track physical stimuli. d′ is preferred to other discrimination accuracy measures (i.e. percent or probability correct) because it accounts for biases such as side bias or stimulus preference (Odegaard et al., 2018). Extending the same approach to confidence measures, metacognitive sensitivity measures how well confidence tracks the likelihood that a perceptual decision is correct, and like d′, can also be formulated to account for bias. Specifically, Maniscalco & Lau (2012) have proposed meta-d′ to measure metacognitive sensitivity on the same scale as d′ so that one can calculate the ratio between the two (meta-d′/d′) to assess the metacognitive efficiency of a subject. They defined the task of classifying stimuli as a Type 1 task, and the rating of confidence in this classification as a Type 2 task. Meta-d′/d′ varies between 0 and 1, where 0 indicates that the rat's trial-by-trial waiting times (i.e. confidence) do not correspond with trial accuracy and 1 indicates that the rat's Type 2 capacity exactly matches its Type 1 sensitivity. In other words, subjects could wait a longer time before re-initiating a trial when their response is correct, and wait less time before re-initiating when the response is incorrect, considering the limitation in discrimination.
Meta-d′/d′, therefore, should be larger than 0 and also significantly less than 1; a meta-d′/d′ of 1 would indicate ideal or optimal metacognitive behavior. Compared to commonly used Type 2 receiver operating characteristic (ROC) analysis, the meta-d′/d′ approach has the advantage of allowing one to isolate the effects of confidence on behavior from basic perceptual performance capacity. To calculate meta-d′, we used MATLAB (MathWorks, Natick, MA) functions freely available at http://www.columbia.edu/~bsm2105/type2sdt/.
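For readers less familiar with the Type 1 measure, the following Python sketch computes the standard equal-variance d′ and criterion from a two-by-two table of response counts. It assumes one orientation (say, vertical) is arbitrarily treated as the "signal" category, and it applies a 0.5-per-cell correction to avoid infinite z-scores; both choices are ours for illustration. It does not compute meta-d′, which requires the model fitting implemented in the MATLAB functions cited above.

```python
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def _rates(hits, misses, false_alarms, correct_rejections):
    # Adding 0.5 to each cell (our assumption) keeps rates strictly between 0 and 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return hit_rate, fa_rate

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Type 1 sensitivity: d' = z(hit rate) - z(false-alarm rate)."""
    hit_rate, fa_rate = _rates(hits, misses, false_alarms, correct_rejections)
    return _Z(hit_rate) - _Z(fa_rate)

def criterion(hits, misses, false_alarms, correct_rejections):
    """Response bias: c = -0.5 * (z(hit rate) + z(false-alarm rate))."""
    hit_rate, fa_rate = _rates(hits, misses, false_alarms, correct_rejections)
    return -0.5 * (_Z(hit_rate) + _Z(fa_rate))
```

Separating sensitivity from criterion is what lets d′ discount a side bias: a rat that favors one response shifts the criterion while leaving d′ largely unchanged.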

Data analyses.
We used MATLAB (MathWorks, Natick, MA; Version R2018b) for data and statistical analyses. For trial-by-trial learning analyses, we included data from all 15 rats that completed a total of 127,303 trials in 603 sessions. Learning occurred in a mixed design with three within-subject/repeated-measures of stage (vehicle, inhibition, no-injection) and three between-subjects-(group) conditions of vehicle, ACC inhibition, and BLA inhibition. All 15 rats experienced the vehicle and no-injection conditions. However, 8 and 7 rats experienced BLA and ACC inhibition conditions, respectively. Since it was possible that a rat received reward prior to "intended" re-initiation of a trial, we excluded rewarded trials in the analysis of waiting time and reaction time. Furthermore, we excluded the trials in which reaction time deviated from the mean of the reaction time of the session by more than three times the standard deviation. This criterion resulted in removal of 1.1% of the trials.
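The two exclusion rules above (dropping rewarded trials and dropping reaction-time outliers beyond three standard deviations of the session mean) can be sketched in Python as follows. The trial representation, the function name, and the use of the sample standard deviation are assumptions made for this illustration; the original analyses were run in MATLAB.

```python
from statistics import mean, stdev

def filter_trials(session_trials, n_sd=3.0):
    """Apply both exclusion rules to the trials of one session.

    session_trials: list of dicts with keys 'rt' (reaction time, s) and
    'rewarded' (bool). The mean and SD are computed over all trials of the
    session before any exclusion (an assumption of this sketch).
    """
    rts = [t["rt"] for t in session_trials]
    mu, sd = mean(rts), stdev(rts)
    return [
        t for t in session_trials
        if not t["rewarded"]                 # rewarded trials carry no usable waiting time
        and abs(t["rt"] - mu) <= n_sd * sd   # keep reaction times within n_sd SDs of the mean
    ]
```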
Unlike the analyses presented in the preceding sections with n=7 or 8 for each targeted brain region, statistical results for after reversal were restricted to 3 or 4 rats per group due to the additional HC and LC conditions (n=4 rats in each of the HC and LC conditions with the BLA as the targeted region, n=4 rats in the HC condition and n=3 rats in the LC condition with the ACC as the targeted region). This constraint was a consequence of the experimental (within-subject) design and longitudinal nature of pre-training on the task, followed by extensive learning. Not including pretraining, the mean number of sessions to complete both the initial and re-learning part of the experiments was 40-210 trials on average, in each session.
For comparisons of learning strategies and consistency in following a stimulus-response rule across different experimental conditions, we used a permutation test. More specifically, we first calculated the actual probability of using a strategy (e.g., Win-Stay) in the observed data and then permuted the data 10000 times to construct the permutation or null distribution. We then calculated the probability of obtaining the observed value for use of the strategy based on the null distribution, from which we estimated p-values (Hesterberg et al., 2005).
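A permutation test of this kind can be sketched in Python as below. The sketch assumes the comparison is between two conditions, that each condition is summarized as a list of 0/1 indicators (for example, whether the rat stayed after each win), and that condition labels are shuffled to build the null distribution; the function name, the two-sided statistic, and the add-one p-value estimate are our choices, not necessarily those of the original analysis.

```python
import random

def permutation_test(stay_a, stay_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in strategy probability.

    stay_a, stay_b: sequences of 0/1 indicators, one per eligible trial,
    for two experimental conditions. Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    n_a, n_b = len(stay_a), len(stay_b)
    observed = sum(stay_a) / n_a - sum(stay_b) / n_b
    pooled = list(stay_a) + list(stay_b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign condition labels at random
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance guards against float ties
            extreme += 1
    # The add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)
```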
To compare the rate of learning after reversal, we fit an exponential function to the learning curve from all rats in a given experimental condition and used its fitted learning parameter as the measure of learning rate.
Finally, we showed the only significant difference between the HC and LC conditions prior to reversal learning was waiting time for the assigned contrast-SNR pairs that were administered after reversals. Since contrast is correlated with SNR, we used a stepwise regression to find the variables which contributed significantly in explaining the response variables (waiting time, reward history). Confidence condition was entered as a predictor variable in a GLM along with other variables (SNR, contrast, targeted brain region, and trial accuracy) in a stepwise manner to observe which of these increased adjusted R-squared significantly for the response variables.

Rule-based Repetition index (RRI).
In order to examine the consistency in following a stimulus-response rule on two consecutive trials, we used a repetition index that was previously introduced to capture the tendency to repeat the same choice beyond what is expected by chance (Soltani et al., 2013), and extended it to selection based on response rules. Specifically, we computed the probability that the same rule (either correct or incorrect) was used on two consecutive trials, $p(\mathrm{StayRule})$, and subtracted the tendency to repeat the same rule on two consecutive trials due solely to chance to arrive at the rule-based repetition index: $\mathrm{RRI} = p(\mathrm{StayRule}) - \left[p(C)^2 + (1 - p(C))^2\right]$, where $p(C)$ is the probability of choosing the correct rule. Therefore, unlike other perseveration indices (Izquierdo et al., 2006), the RRI accounts for the probability of using the same rule on consecutive trials by chance.
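The index can be illustrated with a short Python sketch. It assumes that each trial is coded by whether the response followed the correct rule, and that the chance level of repeating a rule is p(C)² + (1 − p(C))², i.e. the probability of two independent draws landing on the same rule; this chance term is our reading of the description above and of Soltani et al. (2013), and the function name is hypothetical.

```python
def rule_repetition_index(used_correct_rule):
    """RRI = p(StayRule) - [p(C)^2 + (1 - p(C))^2].

    used_correct_rule: sequence of booleans, one per trial in temporal order,
    True when the response followed the correct stimulus-response rule.
    """
    n = len(used_correct_rule)
    if n < 2:
        raise ValueError("need at least two trials")
    # Fraction of consecutive trial pairs on which the same rule was used.
    stays = sum(a == b for a, b in zip(used_correct_rule, used_correct_rule[1:]))
    p_stay = stays / (n - 1)
    # Probability of repeating a rule if trials were independent draws.
    p_c = sum(used_correct_rule) / n
    chance = p_c ** 2 + (1.0 - p_c) ** 2
    return p_stay - chance
```

Under this definition a rat that alternates rules on every trial scores −0.5, a rat that uses the same rule throughout scores 0 (no repetition beyond chance, since chance is already 1), and long runs of one rule followed by long runs of the other give positive values.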
Brain and Behavioral Health for enabling the use of equipment for microscopy imaging.

Supplementary analyses
Trial re-initiation is only affected by SNR. In order to ensure that our drug manipulations did not change the tendency of the animals to re-initiate a new trial, we used a GLM to examine whether the number of re-initiations was affected by vehicle or CNO administration, by the targeted brain region (BLA or ACC), or by the SNR of the stimuli. Importantly, we found that only SNR significantly affected the number of trial re-initiations (GLM; F(7,871) = 13.5, p = 1.14 × 10 =MT , adjusted R² = 0.0908; ratio: = −2.26, p = 0.00026), indicating that re-initiation mainly depended on the strength of the visual information and thus perceptual uncertainty.

Relationship between waiting time and reaction time in different experimental conditions.
The confidence intervals (95%) for the slopes of the linear regressions indicated that the negative correlations between waiting time and reaction time in the two control conditions (vehicle and no-injection prior to reversal) were not significantly different from each other, but were significantly different following CNO administration (Figure 7C,D). Thus, the correlation between these measures was still negative but weaker in both ACC and BLA inhibition conditions.
Vehicle administration does not change the behavior. To show that vehicle administration alone does not produce the observed behavioral changes, we compared the effect of vehicle administration and no-injection on waiting time and reaction time. Similar to vehicle administration, we found that in the sessions in which no injection was administered, waiting times were longer on trials with a larger SNR (GLM; no-injection: F(7,904) = 212, p = 1.26 × 10 =MV> , adjusted R² = 0.618; = 1.75, p = 2.46 × 10 =M> ; Supplementary Figure 6).
Absence of non-specific effects of virus exposure. In the main body of the manuscript we presented data on behavior following vehicle and CNO administration in rats with DREADDs expressed in the BLA and ACC. Here, we include data demonstrating that these impairments are not due to non-specific effects of surgery/virus exposure. Therefore, in this section we compare null (EGFP) and active (DREADDs) virus following vehicle administration on the major measures of the task such as waiting time, reaction time, performance, ′, and meta-′.
We first observed that the type of virus (null vs. active) had no significant effect or interaction with trial type (correct vs. incorrect) and/or SNR values.
(80) or less (40) contrast with different signal-to-noise (SNR) ratio, reflecting the strength of the visual signal (4, most discriminable; 3, moderately discriminable; 2, least discriminable). (B) Waiting time and d′ increase with SNR for any value of contrast.
Figure S2. Waiting time increases with SNR for any value of contrast whereas reaction time dependence on SNR was strongly modulated by contrast. Plotted are the waiting time and reaction time as a function of SNR for different values of contrasts, and separately for correct and incorrect trials. (A-C) Plotted is the waiting time for all trials as a function of three contrast levels (40, weaker to 80, stronger) and different signal-to-noise (SNR) ratio, reflecting the strength of the visual signal (4, most discriminable; 3, moderately discriminable; 2, least discriminable) for correct responses (green), incorrect responses (blue), and all trials (black). (D-E) Same as in panel A-C but for reaction times. Error bars show the S.E.M. over sessions (typically smaller than the symbols).
Conventions are the same as in Figure 3. Overall, ACC inhibition rendered waiting time insensitive to SNR for all contrast values such that at high contrast, waiting time in ACC-inhibited rats fell below the control condition. In contrast, BLA inhibition reduced waiting time for all contrast values without reducing the sensitivity to SNR.
Bottom: Points show FP area (normalized to baseline) at end of CNO application (averaged over last 5 min) from individual slices. Application of CNO strongly suppressed FPs in transfected slices (filled circles; FP area was reduced to 13.4 ± 8% of baseline; paired t-test for comparison to baseline, t(4) = 11.333, p = 3.46 × 10⁻³, n = 5 slices from 3 rats) but had no effect on responses in nontransfected slices (open circles; FP area was 103.7 ± 7% of baseline; paired t-test for comparison to baseline, t(3) = 0.578, p = 0.604, n = 4 slices from 3 rats).

Figure S6. Behavior is similar following vehicle administration and no-injection control. (A) Waiting time in sessions with no injection increases similarly to sessions following vehicle administration. Plotted are the distributions of the waiting time for each SNR: 2 (blue), 3 (green), and 4 (red), following no injection. Solid lines show median of each distribution. Dashed lines show the median of the same distributions but following vehicle administration as is shown in Figure 2A.
(B) Same as in panel A but for sessions following inhibition of BLA, via CNO injection. Solid lines show median of each distribution. Dashed lines show the median of the same distributions but following vehicle administration as is shown in Figure 2A. (C) Waiting time before re-initiation of a new trial is negatively correlated with reaction time to make a choice following inhibition of ACC. Waiting time is plotted as a function of the reaction time for all trials and all rats. Each data point is a trial in a session following vehicle administration. (D) Same as in panel C but for sessions following inhibition of BLA, via CNO injection.