Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex

Abstract

Environments furnish multiple information sources for making predictions about future events. Here we use behavioural modelling and functional magnetic resonance imaging to describe how humans select predictors that might be most relevant. First, during early encounters with potential predictors, participants’ selections were explorative and directed towards subjectively uncertain predictors (positive uncertainty effect). This was particularly the case when many future opportunities remained to exploit knowledge gained. Then, preferences for accurate predictors increased over time, while uncertain predictors were avoided (negative uncertainty effect). The behavioural transition from positive to negative uncertainty-driven selections was accompanied by changes in the representations of belief uncertainty in ventromedial prefrontal cortex (vmPFC). The polarity of uncertainty representations (positive or negative encoding of uncertainty) changed between exploration and exploitation periods. Moreover, the two periods were separated by a third transitional period in which beliefs about predictors’ accuracy predominated. The vmPFC signals a multiplicity of decision variables, the strength and polarity of which vary with behavioural context.

Fig. 1: Experimental task and design.
Fig. 2: Task statistics, Bayesian model and choice hypotheses.
Fig. 3: Dissociable effects of accuracy and uncertainty on predictor selections and subjective confidence judgements.
Fig. 4: Modulation of uncertainty prediction difference in vmPFC according to behavioural mode.
Fig. 5: Whole-brain maps for uncertainty prediction difference during exploration and exploitation.
Fig. 6: Interaction of repetition and uncertainty representation in vmPFC.
Fig. 7: Accuracy processing mediates uncertainty polarity change from exploration to exploitation.
Fig. 8: Polarity of subjective uncertainty in vmPFC changes from exploration to exploitation.

Data availability

We have deposited all choice raw data used for the analyses in the OSF repository at https://osf.io/d5qzw/?view_only=037ea3b875914623a06999cef97ac57f. We have deposited unthresholded fMRI maps of all contrasts depicted in the manuscript on NeuroVault at https://identifiers.org/neurovault.collection:8073. Source data are provided with this paper.

Code availability

The above OSF repository includes the full Bayesian modelling pipeline. Relevant behavioural and neural regressors were derived from this pipeline. We also provide the code for behavioural GLMs shown in Fig. 3. Please follow the README file inside the repository for details of its use: https://osf.io/d5qzw/?view_only=037ea3b875914623a06999cef97ac57f.

Acknowledgements

N.T. was funded by a DTC ESRC studentship (ES/J500112/1); J.S. was supported by an MRC Skills Development Fellowship (MR/NO14448/1); M.C.K.-F. was supported by a Sir Henry Wellcome Fellowship (103184/Z/13/Z); M.F.S.R. was funded by a Wellcome Senior Investigator Award (WT100973AIA); and E.F. was funded by a UKRI FLF (MR/T023007/1). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript. We would like to thank all members of the Rushworth lab for great discussions on this project.

Author information

Contributions

N.T., M.K.W. and M.F.S.R. conceived and designed the experiment; N.T., J.S. and M.K.W. constructed the Bayesian model; N.T. conducted the experiment; N.T., E.F., L.T., M.K.W. and M.F.S.R. conceived behavioural analyses; N.T., M.C.K.-F., M.K.W. and M.F.S.R. conceived neural analyses; N.T. conducted data analyses; N.T., M.K.W. and M.F.S.R. wrote the manuscript; all authors provided expertise and feedback on the write-up; M.K.W. and M.F.S.R. supervised the research project.

Corresponding author

Correspondence to Nadescha Trudel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Primary Handling Editor: Marike Schiffer

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Model comparison: Bayesian model with informative prior.

a, Participants might form expectations about possible sigma values across time. Then prior beliefs at block start should reflect information gathered from past observations. We constructed a Bayesian model that incorporated block-wise priors that reflect the previous history of all observations irrespective of predictor (referred to as adaptive model; green model) and compared it to the Bayesian model using uniform priors (referred to as original model; grey model). b, We replicated all effects of interest when deriving belief estimates with the adaptive Bayesian model (accuracy, t(23) = 4.7, p<0.001, d=0.96, 95% CI=[0.91 2.3]; uncertainty, t(23) = 1.2, p= 0.25, d=−0.24, 95% CI=[−0.77 0.21]; uncertainty x block time, t(23) = 6, p<0.001, d=1.2, 95% CI=[0.83 1.73]; accuracy x block time, t(23) = 2.6, p= 0.015, d=−0.54, 95% CI=[−1.1 −0.13]). c, The original Bayesian model was a better model fit for all behavioural analyses (here we only show across all trials) compared to the adaptive Bayesian model. One reason might be that a uniform prior provides more flexibility for estimates to converge towards their true value across time. d, Importantly, for behavioural and neural analyses, variables are constructed in relative terms (for example difference between left and right predictor): changing prior distributions only impact absolute values while the relative values keep the same proportions. For this reason, the key results remain unchanged when modifying initial block-wise priors. e, Next, we used the confidence judgement (that is interval size) at each block start averaged across four predictors as an index of prior beliefs and compared it across all (six) blocks. We show confidence judgements at the start of a block (red line) and at the end of a block (blue line). Participants reset their prior beliefs from one block to the next (blue line). 
Analysis of the interval size shows no credible evidence for a change in confidence judgement during the first encounters across blocks when excluding blocks that were affected by practice trials (red line: Mauchly’s test indicated a violation of sphericity, χ²(9) = 23, p = 0.006, so we applied a Greenhouse-Geisser correction: F(2.6,92) = 2.5, p = 0.08, η² = 0.71, Bayes factor10 = 0.91, error% = 0.36). See Supplementary Methods, Section 2 for detailed model construction and confidence judgement analysis. (n = 24; error bars are SEM across participants).

Extended Data Fig. 2 Model comparison: reinforcement learning model tracking payoff history.

The payoff scheme reflects participants’ beliefs in the accuracies and certainties associated with selected predictors; however, it may itself exert an additional independent effect on behaviour and neural activity. We constructed a reinforcement learning (RL) model that tracked each predictor’s payoff history in a recency-weighted way and compared it to the Bayesian model using uniform priors (referred to as original model). a, Behavioural effects of interest (Fig. 3a) were replicated when controlling for the RL-derived value difference (in yellow) (accuracy, t(23) = 5.5, p<0.001, d=1.1, 95% CI=[0.48 1.1]; uncertainty, t(23) = −3.1, p= 0.0049, d=−0.63, 95% CI=[−0.75 −0.15]; uncertainty x block time, t(23) = 5.2, p<0.001, d=1.1, 95% CI=[0.49 1.13]; accuracy x block time, t(23) = −6.8, p<0.001, d=−1.4, 95% CI=[−0.84 −0.44] and RL value difference, t(23) = 11.9, p<0.001, d= 2.43, 95% CI=[0.7 0.99]). b, This is consistent with the relative lack of correlation between variables derived from the Bayesian model and the RL-derived value difference. c, A combination (red bar) of the RL value model (yellow bar) and original Bayesian model (grey bar) was the best fit for choice behaviour, supporting the relevance of value-based and information-based variables for explaining choice behaviour. d, We repeated the previous whole-brain analysis (fMRI-GLM1) and additionally included the RL value difference. We replicated a domain-general prediction difference in vmPFC (upper panel), while there was no cluster-significant activation for the RL-derived value difference (lower panel); however, the strongest activation was located within vmPFC. In conclusion, RL value terms complement the Bayesian model but do not substitute for the Bayesian model terms as an explanation of behaviour. (n = 24; error bars are SEM across participants; whole-brain effects family-wise error cluster corrected with z > 2.3 and p < 0.05).
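As an illustration of what tracking payoff history "in a recency-weighted way" can look like, the sketch below uses a standard delta-rule update. This is an assumption for illustration only: the caption does not state the update rule, and the learning rate `alpha`, the initial values, and the function names are hypothetical rather than taken from the authors' pipeline.

```python
# Minimal sketch of a recency-weighted payoff tracker (delta rule).
# ASSUMPTION: the update rule, alpha, and all names here are hypothetical;
# the source only says payoffs were tracked "in a recency-weighted way".

def update_values(values, chosen, payoff, alpha=0.3):
    """Return a new value table after observing `payoff` for `chosen`.

    Recent payoffs weigh more because each update moves the estimate
    a fraction `alpha` of the way towards the latest payoff.
    """
    updated = dict(values)
    updated[chosen] = values[chosen] + alpha * (payoff - values[chosen])
    return updated


def value_difference(values, left, right):
    """RL value difference between the two predictors offered on a trial."""
    return values[left] - values[right]


# Usage: two predictors, predictor "A" is chosen twice and pays 10 each time.
values = {"A": 0.0, "B": 0.0}
values = update_values(values, "A", payoff=10.0, alpha=0.5)
values = update_values(values, "A", payoff=10.0, alpha=0.5)
diff = value_difference(values, "A", "B")
```

A value difference of this kind could then enter the choice GLM as an additional regressor alongside the Bayesian accuracy and uncertainty terms.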

Extended Data Fig. 3 Model fit across time.

a, Model fits are better for the second compared to the first block half (paired t-test: t(23) = 8.5, p<0.001, d=1.74, 95% confidence interval (CI)=[28.5 46.8]). We tested whether we could replicate the main effects of interest when equalizing model fits across block halves. b, c, We used choice residuals, which offer an index of model fit but, unlike BIC measures, are specific to each trial and do not depend on trial number or number of parameters. b, We show absolute choice residuals and block time for one example participant across all trials (‘short, medium and long’ refer to the horizon length). There is a limited correlation between the choice residuals across all trials and the block time variable (inset shows correlation across all participants; r=0.27; 7% shared variance, 95% CI=[−0.15 0.7]). This makes it very unlikely that results are driven by linear changes in model fit alone. c, We tested this empirically. First, we extracted absolute residuals from the main GLM across all trials and separated them into first and second halves. c-i, In accord with the model fit results, there was more residual variance during the first compared to the second block half (paired t-test: t(23)=7.7, p<0.001, d=1.6, 95% CI=[0.08 0.14]). c-ii, Next, we excluded trials on the basis of the trial-wise choice residuals until there was no credible evidence that block halves were different in their residual variance (in effect this meant trials with residuals above 0.6 had to be excluded; paired t-test: t(23)=1.35, p=0.19, d= 0.28, 95% CI=[−0.004 0.02], Bayes factor10=0.48, error%=1.164e-4). d, Trials were collapsed back into one category and the main GLM (Fig. 3a) was applied to the new subset of trials. 
We replicated all effects of interest (accuracy: t(23)=13.1, p<0.001, d=2.67, 95% CI=[2.8 3.8]; uncertainty: t(23) = −1.14, p=0.26, d=−0.23, 95% CI=[−1.6 0.45]; uncertainty x block time: t(23)=9.7, p<0.001, d=1.98, 95% CI=[2 3.1]; accuracy x block time: t(23)= −6.81, p<0.001, d=−1.39, 95% CI=[−2.8 −1.5]). (n = 24; error bars are SEM across participants).
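The residual-based exclusion in panel c-ii can be sketched as follows. The caption gives the resulting cut-off (residuals above 0.6 were excluded) but not how a choice residual was computed, so the definition used here, the absolute difference between a binary choice and the model-predicted choice probability, is an assumption, and the function names are hypothetical.

```python
# Minimal sketch of excluding trials by absolute choice residual.
# ASSUMPTION: residual = |observed choice (0 or 1) - predicted probability|;
# the source states only the 0.6 cut-off, not the residual definition.

def absolute_residuals(choices, predicted_probs):
    """Trial-wise absolute residuals between choices and model predictions."""
    return [abs(c - p) for c, p in zip(choices, predicted_probs)]


def keep_low_residual_trials(choices, predicted_probs, threshold=0.6):
    """Indices of trials whose absolute residual does not exceed `threshold`."""
    residuals = absolute_residuals(choices, predicted_probs)
    return [i for i, r in enumerate(residuals) if r <= threshold]


# Usage: the last two trials are poorly predicted and are dropped.
choices = [1, 0, 1, 0]
predicted_probs = [0.9, 0.2, 0.3, 0.7]
kept = keep_low_residual_trials(choices, predicted_probs)
```

The retained trials would then be pooled across block halves before the main GLM is refitted, as described in panel d.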

Extended Data Fig. 4 Trial classification into exploration/exploitation according to individual choices.

A-i, We classified trials into exploration and exploitation according to participants’ choices. To this end, we compared the accuracy and uncertainty of the chosen and unchosen predictors, which defines the prediction difference for each variable. Explorative choices (Ai-1) were defined as those directed towards more uncertain (positive uncertainty prediction difference) and less accurate predictors (negative accuracy prediction difference) (approximately 18% of trials). An exploitative trial (Ai-2) was defined by the choice of a predictor with higher accuracy (positive accuracy prediction difference) and lower uncertainty (negative uncertainty prediction difference) than the unchosen predictor (approximately 52% of trials). A-ii, When participants chose predictors that were both more accurate and more uncertain, we compared the relative magnitudes of the accuracy prediction difference and the uncertainty prediction difference (Ai-3). If the accuracy prediction difference was greater than the uncertainty prediction difference, the trial was allocated to the exploitative bin. If the uncertainty prediction difference was greater than the accuracy prediction difference, the trial was assigned to the exploratory bin. However, if the difference between the sizes of the two decision variables was small (<5), the trial was assigned to both exploration and exploitation bins (5% of trials). The remaining 25% of trials (white area in panel i) were not assigned to either the exploitative or the exploratory bin, because these choices were guided by neither uncertainty nor accuracy. A-iii, Example of an exploitative choice: the chosen predictor has a higher accuracy and a lower uncertainty than the unchosen predictor. B, As a manipulation check, we plot the prediction differences for accuracy and uncertainty separated by exploration and exploitation. 
We find that exploratory trials are indeed characterized by a positive uncertainty prediction difference (the chosen predictor was associated with greater uncertainty than the unchosen predictor), while exploitative trials are defined by a positive accuracy prediction difference (the chosen predictor was associated with greater predictive accuracy than the unchosen predictor) and a negative uncertainty prediction difference (the chosen predictor was associated with lower predictive uncertainty than the unchosen predictor). For robustness of trial classification, see Supplementary Fig. 8. (n = 24).
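The classification rule described in this caption can be written as a short decision function. This is a sketch under stated assumptions rather than the authors' code: prediction differences are taken as chosen minus unchosen, the margin of 5 is assumed to be on the same scale as those differences, trials with a difference of exactly zero are left unassigned, and the name `classify_trial` is hypothetical.

```python
# Minimal sketch of the explore/exploit trial classification.
# ASSUMPTIONS: differences are chosen minus unchosen; the tie margin of 5
# shares their scale; exact zeros are left unassigned; names are hypothetical.

def classify_trial(acc_diff, unc_diff, tie_margin=5.0):
    """Return the set of bins ("explore", "exploit") a trial falls into."""
    if unc_diff > 0 and acc_diff < 0:
        # More uncertain and less accurate predictor chosen.
        return {"explore"}
    if acc_diff > 0 and unc_diff < 0:
        # More accurate and less uncertain predictor chosen.
        return {"exploit"}
    if acc_diff > 0 and unc_diff > 0:
        # Both more accurate and more uncertain: compare magnitudes.
        if abs(acc_diff - unc_diff) < tie_margin:
            return {"explore", "exploit"}
        return {"exploit"} if acc_diff > unc_diff else {"explore"}
    # Remaining trials are guided by neither accuracy nor uncertainty.
    return set()


# Usage: a chosen predictor that is much more accurate but slightly more
# uncertain than the alternative counts as an exploitative choice.
bins = classify_trial(acc_diff=20.0, unc_diff=3.0)
```

Returning a set keeps the ambiguous trials, which the caption assigns to both bins, explicit instead of forcing them into a single label.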

Extended Data Fig. 5 Uncertainty-related signals in subcortical regions during exploration and exploitation.

We show subcortical activation associated with the uncertainty prediction difference for exploration, exploitation, and their difference. Activation is shown during the decision phase and, when relevant, during the outcome phase. We used bilateral masks and averaged the results over both hemispheres for each ROI. a, Amygdala represents uncertainty prediction difference during exploration more strongly than during exploitation (paired t-test: uncertainty prediction difference, explore vs exploit: t(23) = 3.5, p=0.002, d=0.71, 95% confidence interval (CI)=[6 23.5]). b, Activation patterns in ventral striatum (VS) during decision (left panel) and outcome phases (right panel) suggest that it is primarily involved during exploitation. VS represented a negative uncertainty prediction difference in the decision phase during exploitation (t(23)=−2.4, p=0.02, d=−0.49, 95% CI=[−15.8 −1.3]). It also represented the payoff in the outcome phase during both exploration and exploitation, but did so more strongly during exploitation (paired t-test: payoff for explore vs exploit, t(23) = −2.3, p=0.033, d=−0.47, 95% CI=[−21.2 −0.96]). c, Finally, ventral tegmental area (VTA) activity reflected uncertainty during both exploration and exploitation in the decision phase (exploration: t(23) = 2.3, p= 0.03, d=0.47, 95% CI=[1.94 40]; exploitation: t(23) = −3, p=0.007, d=−0.6, 95% CI=[−25.3 −4.4]). (n = 24; error bars are SEM across participants).

Supplementary information

Supplementary Methods (1, details on task versions; 2, alternative computational models), Supplementary Figs. 1–9 (related to: 3, methods/experimental design; 4, neural results), Supplementary Tables 1–3 (6, peak coordinates of cluster-corrected whole-brain effects) and Supplementary References.

Source data

Source Data Fig. 3

Statistical source data for Fig. 3.

Source Data Fig. 6

Statistical source data for Fig. 6.

Source Data Fig. 7

Statistical source data for Fig. 7.

Source Data Extended Data Fig. 1

Statistical source data for Extended Data Fig. 1.

Source Data Extended Data Fig. 2

Statistical source data for Extended Data Fig. 2.

Source Data Extended Data Fig. 3

Statistical source data for Extended Data Fig. 3.

Source Data Extended Data Fig. 5

Statistical source data for Extended Data Fig. 5.

About this article

Cite this article

Trudel, N., Scholl, J., Klein-Flügge, M.C. et al. Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex. Nat Hum Behav (2020). https://doi.org/10.1038/s41562-020-0929-3
