Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex

Trudel, Nadescha; Scholl, Jacqueline; Klein-Flügge, Miriam C.; Fouragnan, Elsa; Tankelevitch, Lev; Wittmann, Marco K.; Rushworth, Matthew F. S.

doi:10.1038/s41562-020-0929-3

Article
Published: 31 August 2020

Nature Human Behaviour volume 5, pages 83–98 (2021)

Abstract

Environments furnish multiple information sources for making predictions about future events. Here we use behavioural modelling and functional magnetic resonance imaging to describe how humans select predictors that might be most relevant. First, during early encounters with potential predictors, participants’ selections were explorative and directed towards subjectively uncertain predictors (positive uncertainty effect). This was particularly the case when many future opportunities remained to exploit knowledge gained. Then, preferences for accurate predictors increased over time, while uncertain predictors were avoided (negative uncertainty effect). The behavioural transition from positive to negative uncertainty-driven selections was accompanied by changes in the representations of belief uncertainty in ventromedial prefrontal cortex (vmPFC). The polarity of uncertainty representations (positive or negative encoding of uncertainty) changed between exploration and exploitation periods. Moreover, the two periods were separated by a third transitional period in which beliefs about predictors’ accuracy predominated. The vmPFC signals a multiplicity of decision variables, the strength and polarity of which vary with behavioural context.

**Fig. 1: Experimental task and design.**

**Fig. 2: Task statistics, Bayesian model and choice hypotheses.**

**Fig. 3: Dissociable effects of accuracy and uncertainty on predictor selections and subjective confidence judgements.**

**Fig. 4: Modulation of uncertainty prediction difference in vmPFC according to behavioural mode.**

**Fig. 5: Whole-brain maps for uncertainty prediction difference during exploration and exploitation.**

**Fig. 6: Interaction of repetition and uncertainty representation in vmPFC.**

**Fig. 7: Accuracy processing mediates uncertainty polarity change from exploration to exploitation.**

**Fig. 8: Polarity of subjective uncertainty in vmPFC changes from exploration to exploitation.**

Data availability

We have deposited all choice raw data used for the analyses in the OSF repository at https://osf.io/d5qzw/?view_only=037ea3b875914623a06999cef97ac57f. We have deposited unthresholded fMRI maps of all contrasts depicted in the manuscript on NeuroVault at https://identifiers.org/neurovault.collection:8073. Source data are provided with this paper.

Code availability

The above OSF repository includes the full Bayesian modelling pipeline. Relevant behavioural and neural regressors were derived from this pipeline. We also provide the code for behavioural GLMs shown in Fig. 3. Please follow the README file inside the repository for details of its use: https://osf.io/d5qzw/?view_only=037ea3b875914623a06999cef97ac57f.

Acknowledgements

N.T. was funded by a DTC ESRC studentship (ES/J500112/1); J.S. was supported by an MRC Skills Development Fellowship (MR/NO14448/1); M.C.K.-F. by a Sir Henry Wellcome Fellowship (103184/Z/13/Z); M.F.S.R. by a Wellcome Senior Investigator Award (WT100973AIA); and E.F. by a UKRI FLF (MR/T023007/1). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank all members of the Rushworth lab for great discussions on this project.

Author information

These authors contributed equally: Marco K. Wittmann, Matthew F. S. Rushworth.

Authors and Affiliations

Wellcome Integrative Neuroimaging (WIN), Department of Experimental Psychology, University of Oxford, Oxford, UK
Nadescha Trudel, Jacqueline Scholl, Miriam C. Klein-Flügge, Elsa Fouragnan, Lev Tankelevitch, Marco K. Wittmann & Matthew F. S. Rushworth
School of Psychology, University of Plymouth, Plymouth, UK
Elsa Fouragnan

Contributions

N.T., M.K.W. and M.F.S.R. conceived and designed the experiment; N.T., J.S. and M.K.W. constructed the Bayesian model; N.T. conducted the experiment; N.T., E.F., L.T., M.K.W. and M.F.S.R. conceived behavioural analyses; N.T., M.C.K.-F., M.K.W. and M.F.S.R. conceived neural analyses; N.T. conducted data analyses; N.T., M.K.W. and M.F.S.R. wrote the manuscript; all authors provided expertise and feedback on the write-up; M.K.W. and M.F.S.R. supervised the research project.

Corresponding author

Correspondence to Nadescha Trudel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Primary Handling Editor: Marike Schiffer

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Model comparison: Bayesian model with informative prior.

a, Participants might form expectations about possible sigma values across time. Then prior beliefs at block start should reflect information gathered from past observations. We constructed a Bayesian model that incorporated block-wise priors that reflect the previous history of all observations irrespective of predictor (referred to as adaptive model; green model) and compared it to the Bayesian model using uniform priors (referred to as original model; grey model). b, We replicated all effects of interest when deriving belief estimates with the adaptive Bayesian model (accuracy, t(23) = 4.7, p<0.001, d=0.96, 95% CI=[0.91 2.3]; uncertainty, t(23) = 1.2, p= 0.25, d=−0.24, 95% CI=[−0.77 0.21]; uncertainty x block time, t(23) = 6, p<0.001, d=1.2, 95% CI=[0.83 1.73]; accuracy x block time, t(23) = 2.6, p= 0.015, d=−0.54, 95% CI=[−1.1 −0.13]). c, The original Bayesian model was a better model fit for all behavioural analyses (here we only show across all trials) compared to the adaptive Bayesian model. One reason might be that a uniform prior provides more flexibility for estimates to converge towards their true value across time. d, Importantly, for behavioural and neural analyses, variables are constructed in relative terms (for example difference between left and right predictor): changing prior distributions only impact absolute values while the relative values keep the same proportions. For this reason, the key results remain unchanged when modifying initial block-wise priors. e, Next, we used the confidence judgement (that is interval size) at each block start averaged across four predictors as an index of prior beliefs and compared it across all (six) blocks. We show confidence judgements at the start of a block (red line) and at the end of a block (blue line). Participants reset their prior beliefs from one block to the next (blue line). 
Analysis of the interval size shows no credible evidence for a change in confidence judgement during the first encounters across blocks when excluding blocks that were affected by practice trials (red line: Mauchly’s test indicated a violation of equal variances, χ²(9)=23, p=0.006; we therefore used the Greenhouse–Geisser correction: F(2.6,92) = 2.5, p=0.08, η²=0.71, Bayes factor₁₀=0.91, error%=0.36). See Supplementary Methods, Section 2 for detailed model construction and confidence judgement analysis. (n = 24; error bars are SEM across participants).

Extended Data Fig. 2 Model comparison: reinforcement learning model tracking payoff history.

The payoff scheme reflects participants’ beliefs in the accuracies and certainties associated with selected predictors; however, it may itself exert an additional independent effect on behaviour and neural activity. We constructed a reinforcement learning (RL) model that tracked each predictor’s payoff history in a recency-weighted way and compared it to the Bayesian model using uniform priors (referred to as original model). a, Behavioural effects of interest (Fig. 3a) were replicated when controlling for the RL-derived value difference (in yellow) (accuracy, t(23) = 5.5, p<0.001, d=1.1, 95% CI=[0.48 1.1]; uncertainty, t(23) = −3.1, p= 0.0049, d=−0.63, 95% CI=[−0.75 −0.15]; uncertainty x block time, t(23) = 5.2, p<0.001, d=1.1, 95% CI=[0.49 1.13]; accuracy x block time, t(23) = −6.8, p<0.001, d=−1.4, 95% CI=[−0.84 −0.44] and RL value difference, t(23) = 11.9, p<0.001, d= 2.43, 95% CI=[0.7 0.99]). b, This is consistent with the relative lack of correlation between variables derived from the Bayesian model and the RL-derived value difference. c, A combination (red bar) of the RL value model (yellow bar) and original Bayesian model (grey bar) was the best fit for choice behaviour, supporting the relevance of value-based and information-based variables for explaining choice behaviour. d, We repeated the previous whole-brain analysis (fMRI-GLM1) and additionally included the RL value difference. We replicated a domain-general prediction difference in vmPFC (upper panel), while there was no cluster-significant activation for the RL-derived value difference (lower panel); however, the strongest activation was located within vmPFC. In conclusion, RL value terms complement the Bayesian model but do not substitute for the Bayesian model terms as an explanation of behaviour. (n = 24; error bars are SEM across participants; whole-brain effects family-wise error cluster corrected with z > 2.3 and p < 0.05).
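The idea of tracking a predictor's payoff history "in a recency-weighted way" can be illustrated with a simple delta-rule update. This is only a sketch under stated assumptions, not the authors' model: the learning rate `alpha`, the initial value and all function names are hypothetical, and the actual RL model is described in the Supplementary Methods.

```python
def update_value(value, payoff, alpha=0.3):
    """Delta-rule update: move the stored value a fraction alpha towards the new payoff.

    alpha is a hypothetical learning rate; a larger alpha weights recent payoffs more.
    """
    return value + alpha * (payoff - value)


def track_payoffs(payoffs, alpha=0.3, initial_value=0.0):
    """Return the running value estimate after each payoff in the sequence."""
    values = []
    value = initial_value  # assumed starting estimate
    for payoff in payoffs:
        value = update_value(value, payoff, alpha)
        values.append(value)
    return values


def value_difference(left_payoffs, right_payoffs, alpha=0.3):
    """RL value difference (left minus right) after all observed payoffs.

    Both payoff histories are assumed to be non-empty.
    """
    left = track_payoffs(left_payoffs, alpha)[-1]
    right = track_payoffs(right_payoffs, alpha)[-1]
    return left - right
```

With this kind of update, a payoff received most recently contributes more to the estimate than the same payoff received earlier, which is what makes the estimate recency-weighted.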

Extended Data Fig. 3 Model fit across time.

a, Model fits are better for the second compared to the first block half (paired t-test: t(23)=8.5, p<0.001, d=1.74, 95% confidence interval (CI)=[28.5 46.8]). We tested whether we could replicate the main effects of interest when equalizing model fits across block halves. b, c, We used choice residuals, which offer an index of model fit but, unlike BIC measures, are specific to each trial and do not depend on trial number or number of parameters. b, We show absolute choice residuals and block time for one example participant across all trials (‘short, medium and long’ refer to the horizon length). There is a limited correlation between the choice residuals across all trials and the block time variable (inset shows correlation across all participants; r=0.27; 7% shared variance, 95% CI=[−0.15 0.7]). This makes it very unlikely that results are driven by linear changes in model fit alone. c, We tested this empirically. First, we extracted absolute residuals from the main GLM across all trials and separated them into first and second halves. c-i, In accord with the model fit results, there was more residual variance during the first compared to the second block half (paired t-test: t(23)=7.7, p<0.001, d=1.6, 95% CI=[0.08 0.14]). c-ii, Next, we excluded trials on the basis of the trial-wise choice residuals until there was no credible evidence that block halves were different in their residual variance (in effect this meant trials with residuals above 0.6 had to be excluded; paired t-test: t(23)=1.35, p=0.19, d= 0.28, 95% CI=[−0.004 0.02], Bayes factor₁₀=0.48, error%=1.164e-4). d, Trials were collapsed back into one category and the main GLM (Fig. 3a) was applied to the new subset of trials. 
We replicated all effects of interest (accuracy: t(23)=13.1, p<0.001, d=2.67, 95% CI=[2.8 3.8]; uncertainty: t(23) = −1.14, p=0.26, d=−0.23, 95% CI=[−1.6 0.45]; uncertainty x block time: t(23)=9.7, p<0.001, d=1.98, 95% CI=[2 3.1]; accuracy x block time: t(23)= −6.81, p<0.001, d=−1.39, 95% CI=[−2.8 −1.5]). (n = 24; error bars are SEM across participants).
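The residual-based exclusion in panels c and d can be sketched as follows. This is a rough illustration rather than the authors' code: it assumes a choice residual is the observed choice (coded 1 or 0) minus the model-predicted choice probability, and the function names and toy data are hypothetical. Only the 0.6 cut-off comes from the caption.

```python
from statistics import mean


def choice_residuals(choices, predicted_probs):
    """Observed choice (1 or 0) minus predicted probability (assumed definition)."""
    return [c - p for c, p in zip(choices, predicted_probs)]


def filter_by_residual(residuals, threshold=0.6):
    """Indices of trials whose absolute residual does not exceed the threshold."""
    return [i for i, r in enumerate(residuals) if abs(r) <= threshold]


def mean_abs_residual_by_half(residuals, first_half_flags, keep=None):
    """Mean absolute residual for first-half and second-half trials.

    keep optionally restricts the computation to a subset of trial indices.
    """
    if keep is None:
        keep = range(len(residuals))
    first = [abs(residuals[i]) for i in keep if first_half_flags[i]]
    second = [abs(residuals[i]) for i in keep if not first_half_flags[i]]
    return mean(first), mean(second)


# Toy data for one hypothetical participant (not from the study).
choices = [1, 0, 1, 1, 0, 1]
probs = [0.2, 0.5, 0.6, 0.9, 0.1, 0.7]
first_half = [True, True, True, False, False, False]

residuals = choice_residuals(choices, probs)
kept = filter_by_residual(residuals)
before = mean_abs_residual_by_half(residuals, first_half)
after = mean_abs_residual_by_half(residuals, first_half, keep=kept)
```

In the toy data, dropping the single poorly fitted first-half trial narrows the gap in mean absolute residual between the two block halves, which is the effect the exclusion step is meant to achieve before the GLM is re-run.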

Extended Data Fig. 4 Trial classification into exploration/exploitation according to individual choices.

A-i, We classified trials into exploration and exploitation according to subjects’ choices. To this end, we compared accuracy and uncertainty of the chosen and unchosen predictors, defining the prediction difference. Explorative choices (Ai-1) were defined as those directed towards higher uncertainty (positive uncertainty prediction difference) and less accurate predictors (negative accuracy prediction difference) (approximately 18% of trials). An exploitative trial (Ai-2) was defined by choices of predictors with a higher accuracy (positive accuracy prediction difference) and lower uncertainty (negative uncertainty prediction difference) than the unchosen predictor (approximately 52% of trials). A-ii, When participants chose predictors that were both more accurate and more uncertain, we compared the relative magnitudes of the accuracy prediction difference and the uncertainty prediction difference (Ai-3). If the difference in accuracy prediction was greater than the uncertainty prediction difference, then that trial was allocated to the exploitative bin. If the difference in uncertainty prediction difference was greater than the difference in accuracy prediction difference, then the trial was assigned to the exploratory bin. However, if the difference between the sizes of the decision variables was small (<5) then the trial was assigned to both exploration and exploitation bins (5% of trials). The remaining 25% of trials (white area in panel i) were not assigned to either exploitative or exploratory bins, because these choices were neither guided by uncertainty nor accuracy. A-iii, Example of an exploitative choice: the chosen predictor has a higher accuracy and a lower uncertainty prediction difference. B, As a manipulation check, we plot the prediction differences for accuracy and uncertainty separated by exploration and exploitation. 
We find that exploratory trials are indeed characterized by a positive uncertainty prediction difference (the chosen predictor was associated with greater uncertainty than the unchosen predictor), while exploitative trials are defined by a positive accuracy prediction difference (the chosen predictor was associated with greater predictive accuracy than the unchosen predictor) and a negative uncertainty prediction difference (the chosen predictor was associated with lower uncertainty than the unchosen predictor). For robustness of trial classification, see Supplementary Fig. 8. (n = 24).
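The classification rules in this caption can be written out as a small function. This is a minimal sketch, assuming both prediction differences are computed as chosen minus unchosen and are on comparable scales (the caption compares their magnitudes directly). The function name, the treatment of exact zeros as unclassified, and the parameter name `tie_margin` are assumptions; the value 5 is taken from the "<5" criterion in the caption.

```python
def classify_trial(accuracy_diff, uncertainty_diff, tie_margin=5.0):
    """Return the set of bins ("exploration", "exploitation") for one trial.

    accuracy_diff and uncertainty_diff are chosen-minus-unchosen prediction
    differences. An empty set means the trial is left unclassified.
    """
    # Chosen predictor is more uncertain and less accurate: explorative choice.
    if uncertainty_diff > 0 and accuracy_diff < 0:
        return {"exploration"}
    # Chosen predictor is more accurate and less uncertain: exploitative choice.
    if accuracy_diff > 0 and uncertainty_diff < 0:
        return {"exploitation"}
    # Chosen predictor is both more accurate and more uncertain:
    # compare the magnitudes of the two differences.
    if accuracy_diff > 0 and uncertainty_diff > 0:
        gap = accuracy_diff - uncertainty_diff
        if abs(gap) < tie_margin:
            return {"exploration", "exploitation"}
        return {"exploitation"} if gap > 0 else {"exploration"}
    # Neither accuracy nor uncertainty favoured the chosen predictor.
    return set()
```

For example, `classify_trial(10, 8)` assigns the trial to both bins because the two differences are within the margin, whereas `classify_trial(-3, -4)` leaves it unclassified.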

Extended Data Fig. 5 Uncertainty-related signals in subcortical regions during exploration and exploitation.

We show subcortical activation associated with the uncertainty prediction difference for exploration, exploitation, and their difference. Activation is shown during the decision phase and, when relevant, during the outcome phase. We used bilateral masks and averaged the results over both hemispheres for each ROI. a, Amygdala represents uncertainty prediction difference during exploration more strongly than during exploitation (paired t-test: uncertainty prediction difference, explore vs exploit: t(23) = 3.5, p=0.002, d=0.71, 95% confidence interval (CI)=[6 23.5]). b, Activation patterns in ventral striatum (VS) during decision (left panel) and outcome phases (right panel) suggest its primary involvement is during exploitation. VS represented both a negative uncertainty prediction difference during the decision phase during exploitation (t(23)=−2.4, p=0.02, d=−0.49, 95% CI=[−15.8 −1.3]) and the payoff during the outcome phase during both exploration and exploitation, but it did so more strongly during exploitation (paired t-test: payoff for explore vs exploit: t(23) = −2.3, p=0.033, d=−0.47, 95% CI=[−21.2 −0.96]). c, Finally, ventral tegmental area (VTA) activity reflected uncertainty during both exploration and exploitation in the decision phase (exploration: t(23) = 2.3, p= 0.03, d=0.47, 95% CI=[1.94 40]; exploitation: t(23) = −3, p=0.007, d=−0.6, 95% CI=[−25.3 −4.4]). (n = 24; error bars are SEM across participants).
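The effect sizes reported in these captions appear consistent with the standard relation between a paired (or one-sample) t statistic and Cohen's d, d = t / √n, with n = 24 participants. The captions do not state how d was computed, so treating this relation as the one used is an assumption; the function name below is hypothetical.

```python
import math


def cohens_d_from_paired_t(t_value, n):
    """Cohen's d for a paired or one-sample t-test: d = t / sqrt(n)."""
    return t_value / math.sqrt(n)


# Amygdala, explore vs exploit: t(23) = 3.5 with n = 24 participants.
amygdala_d = round(cohens_d_from_paired_t(3.5, 24), 2)  # 0.71
```

Small discrepancies for other entries are to be expected, because the t values in the captions are themselves rounded.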

Supplementary information

Supplementary Information

Supplementary Methods (1, details on task versions; 2, alternative computational models), Supplementary Figs. 1–9 (related to: 3, methods/experimental design; 4, neural results), Supplementary Tables 1–3 (6, peak coordinates of cluster-corrected whole-brain effects) and Supplementary References.

Reporting Summary

Source data

Source Data Fig. 3

Statistical source data for Fig. 3.

Source Data Fig. 6

Statistical source data for Fig. 6.

Source Data Fig. 7

Statistical source data for Fig. 7.

Source Data Extended Data Fig. 1

Statistical source data for Extended Data Fig. 1.

Source Data Extended Data Fig. 2

Statistical source data for Extended Data Fig. 2.

Source Data Extended Data Fig. 3

Statistical source data for Extended Data Fig. 3.

Source Data Extended Data Fig. 5

Statistical source data for Extended Data Fig. 5.

About this article

Cite this article

Trudel, N., Scholl, J., Klein-Flügge, M.C. et al. Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex. Nat Hum Behav 5, 83–98 (2021). https://doi.org/10.1038/s41562-020-0929-3

Received: 25 November 2019
Accepted: 17 July 2020
Published: 31 August 2020
Issue Date: January 2021
DOI: https://doi.org/10.1038/s41562-020-0929-3
