Behavioural and neural characterization of optimistic reinforcement learning

Lefebvre, Germain; Lebreton, Maël; Meyniel, Florent; Bourgeois-Gironde, Sacha; Palminteri, Stefano

doi:10.1038/s41562-017-0067

Article
Published: 20 March 2017

Germain Lefebvre^1,2,
Maël Lebreton^3,4,
Florent Meyniel^5,
Sacha Bourgeois-Gironde^2,6 &
Stefano Palminteri^1,7 (ORCID: orcid.org/0000-0001-5768-6646)

Nature Human Behaviour volume 1, Article number: 0067 (2017)

Abstract

When forming and updating beliefs about future life outcomes, people tend to consider good news and to disregard bad news. This tendency is assumed to support the optimism bias. Whether this learning bias is specific to ‘high-level’ abstract belief update or a particular expression of a more general ‘low-level’ reinforcement learning process is unknown. Here we report evidence in favour of the second hypothesis. In a simple instrumental learning task, participants incorporated better-than-expected outcomes at a higher rate than worse-than-expected ones. In addition, functional imaging indicated that inter-individual difference in the expression of optimistic update corresponds to enhanced prediction error signalling in the reward circuitry. Our results constitute a step towards the understanding of the genesis of optimism bias at the neurocomputational level.
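
The asymmetry described in the abstract can be illustrated with a value-update rule that uses one learning rate for better-than-expected outcomes and another for worse-than-expected ones. The following is a minimal sketch under stated assumptions, not the authors' fitted model: the function names, the learning-rate values and the alternating outcome sequence are all hypothetical and chosen only for illustration.

```python
def update_value(q, outcome, alpha_plus, alpha_minus):
    """Update an option value with a learning rate chosen by prediction-error sign."""
    delta = outcome - q  # prediction error: positive means better than expected
    alpha = alpha_plus if delta > 0 else alpha_minus
    return q + alpha * delta


def learn(outcomes, alpha_plus, alpha_minus, q0=0.5):
    """Apply the update rule over a sequence of outcomes, starting from q0."""
    q = q0
    for outcome in outcomes:
        q = update_value(q, outcome, alpha_plus, alpha_minus)
    return q


# Hypothetical outcome stream: reward (1) and no reward (0) alternate,
# so the true average payoff is 0.5.
outcomes = [1.0, 0.0] * 50

# Assumed parameter values, for illustration only.
q_optimistic = learn(outcomes, alpha_plus=0.4, alpha_minus=0.1)
q_symmetric = learn(outcomes, alpha_plus=0.25, alpha_minus=0.25)

print(f"optimistic learner: {q_optimistic:.3f}")
print(f"symmetric learner:  {q_symmetric:.3f}")
```

With a larger rate for positive than for negative prediction errors, the learned value settles above the true average payoff of 0.5, which is one way an optimistic bias could arise from a simple reinforcement learning process.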

**Figure 1: Behavioural task and variables.**

**Figure 2: Behavioural and computational identification of optimistic reinforcement learning.**

**Figure 3: Functional signatures of the optimistic reinforcement learning.**

**Figure 4: Robustness of optimistic reinforcement learning.**

Acknowledgements

We thank Y. Worbe and M. Pessiglione for granting access to the first dataset, V. Wyart, B. Bahrami and B. Kuzmanovic for comments, and T. Sharot and N. Garrett for providing activation masks. S.P. was supported by a Marie Sklodowska-Curie Individual European Fellowship (PIEF-GA-2012 Grant 328822) and is currently supported by an ATIP-Avenir grant (R16069JS). G.L. was supported by a PhD fellowship of the Ministère de l'enseignement supérieur et de la recherche. M.L. was supported by an EU Marie Sklodowska-Curie Individual Fellowship (IF-2015 Grant 657904) and acknowledges the support of the Bettencourt-Schueller Foundation. The second experiment was supported by the ANR-ORA, NESSHI 2010–2015 research grant to S.B.-G. The Institut d’Étude de la Cognition is supported by the LabEx IEC (ANR-10-LABX-0087 IEC) and the IDEX PSL* (ANR-10-IDEX-0001-02 PSL*). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Authors and Affiliations

Laboratoire de Neurosciences Cognitives, Institut National de la Santé et de la Recherche Médicale, 75005 Paris, France
Germain Lefebvre & Stefano Palminteri
Laboratoire d'Économie Mathématique et de Microéconomie Appliquée (LEMMA), Université Panthéon-Assas, Paris, 75006, France
Germain Lefebvre & Sacha Bourgeois-Gironde
Amsterdam Brain and Cognition (ABC), Nieuwe Achtergracht 129, Amsterdam, 1018 WS, The Netherlands
Maël Lebreton
Amsterdam School of Economics (ASE), Faculty of Economics and Business (FEB), Roetersstraat 11, Amsterdam, 1018 WB, The Netherlands
Maël Lebreton
INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Centre, 91191 Gif sur Yvette, France
Florent Meyniel
Institut Jean-Nicod (IJN), CNRS UMR 8129, Ecole Normale Supérieure, Paris, 75005, France
Sacha Bourgeois-Gironde
Institut d’Étude de la Cognition, Departement d’Études Cognitives, École Normale Supérieure, Paris, 75005, France
Stefano Palminteri

Contributions

G.L. performed the experiment, analysed the data and wrote the manuscript. M.L. provided analytical tools, interpreted the results and edited the manuscript. F.M. provided analytical tools and edited the manuscript. S.B.-G. interpreted the results and edited the manuscript. S.P. designed the study, performed the experiments, analysed the data and wrote the manuscript.

Corresponding author

Correspondence to Stefano Palminteri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Supplementary information

Supplementary Notes, Supplementary Figures 1–7, Supplementary Discussion, Supplementary Methods, Supplementary References. (PDF 606 kb)

About this article

Cite this article

Lefebvre, G., Lebreton, M., Meyniel, F. et al. Behavioural and neural characterization of optimistic reinforcement learning. Nat Hum Behav 1, 0067 (2017). https://doi.org/10.1038/s41562-017-0067

Received: 18 April 2016
Accepted: 10 February 2017
Published: 20 March 2017
DOI: https://doi.org/10.1038/s41562-017-0067
