Article

Behavioural and neural characterization of optimistic reinforcement learning

  • Nature Human Behaviour 1, Article number: 0067 (2017)
  • doi:10.1038/s41562-017-0067

Abstract

When forming and updating beliefs about future life outcomes, people tend to consider good news and to disregard bad news. This tendency is assumed to support the optimism bias. Whether this learning bias is specific to ‘high-level’ abstract belief update or a particular expression of a more general ‘low-level’ reinforcement learning process is unknown. Here we report evidence in favour of the second hypothesis. In a simple instrumental learning task, participants incorporated better-than-expected outcomes at a higher rate than worse-than-expected ones. In addition, functional imaging indicated that inter-individual difference in the expression of optimistic update corresponds to enhanced prediction error signalling in the reward circuitry. Our results constitute a step towards the understanding of the genesis of optimism bias at the neurocomputational level.
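The asymmetry described in the abstract can be illustrated with a standard prediction-error update that uses one learning rate for better-than-expected outcomes and another for worse-than-expected ones. The following is a minimal sketch under stated assumptions, not the authors' fitted model: the function names, the parameter names `alpha_pos` and `alpha_neg`, the starting value and the outcome sequence are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List


def asymmetric_update(q: float, outcome: float,
                      alpha_pos: float, alpha_neg: float) -> float:
    """Apply one value update with a valence-dependent learning rate.

    alpha_pos is used when the outcome is better than expected,
    alpha_neg when it is worse than (or equal to) expected.
    All names here are illustrative assumptions.
    """
    delta = outcome - q  # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def learn(outcomes: Iterable[float], alpha_pos: float,
          alpha_neg: float, q0: float = 0.0) -> List[float]:
    """Return the value estimate after each outcome in the sequence."""
    q = q0
    trajectory = []
    for outcome in outcomes:
        q = asymmetric_update(q, outcome, alpha_pos, alpha_neg)
        trajectory.append(q)
    return trajectory


# Hypothetical outcome sequence: reward, no reward, reward, no reward.
outcomes = [1.0, 0.0, 1.0, 0.0]
optimistic = learn(outcomes, alpha_pos=0.5, alpha_neg=0.25)
unbiased = learn(outcomes, alpha_pos=0.5, alpha_neg=0.5)
print(optimistic[-1], unbiased[-1])
```

With identical outcomes, the learner whose `alpha_pos` exceeds `alpha_neg` ends with a higher value estimate than the symmetric learner, which is the sense in which such an update is optimistic.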

Acknowledgements

We thank Y. Worbe and M. Pessiglione for granting access to the first dataset, V. Wyart, B. Bahrami and B. Kuzmanovic for comments, and T. Sharot and N. Garrett for providing activation masks. S.P. was supported by a Marie Sklodowska-Curie Individual European Fellowship (PIEF-GA-2012 Grant 328822) and is currently supported by an ATIP-Avenir grant (R16069JS). G.L. was supported by a PhD fellowship of the Ministère de l'enseignement supérieur et de la recherche. M.L. was supported by an EU Marie Sklodowska-Curie Individual Fellowship (IF-2015 Grant 657904) and acknowledges the support of the Bettencourt-Schueller Foundation. The second experiment was supported by the ANR-ORA, NESSHI 2010–2015 research grant to S.B.-G. The Institut d’Étude de la Cognition is supported by the LabEx IEC (ANR-10-LABX-0087 IEC) and the IDEX PSL* (ANR-10-IDEX-0001-02 PSL*). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Affiliations

  1. Laboratoire de Neurosciences Cognitives, Institut National de la Santé et de la Recherche Médicale, 75005 Paris, France

    • Germain Lefebvre & Stefano Palminteri
  2. Laboratoire d'Économie Mathématique et de Microéconomie Appliquée (LEMMA), Université Panthéon-Assas, 75006 Paris, France

    • Germain Lefebvre & Sacha Bourgeois-Gironde
  3. Amsterdam Brain and Cognition (ABC), Nieuwe Achtergracht 129, 1018 WS Amsterdam, The Netherlands

    • Maël Lebreton
  4. Amsterdam School of Economics (ASE), Faculty of Economics and Business (FEB), Roetersstraat 11, 1018 WB Amsterdam, The Netherlands

    • Maël Lebreton
  5. INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Centre, 91191 Gif sur Yvette, France

    • Florent Meyniel
  6. Institut Jean-Nicod (IJN), CNRS UMR 8129, Ecole Normale Supérieure, 75005 Paris, France

    • Sacha Bourgeois-Gironde
  7. Institut d’Étude de la Cognition, Departement d’Études Cognitives, École Normale Supérieure, 75005 Paris, France

    • Stefano Palminteri

Authors

  1. Germain Lefebvre

  2. Maël Lebreton

  3. Florent Meyniel

  4. Sacha Bourgeois-Gironde

  5. Stefano Palminteri

Contributions

G.L. performed the experiment, analysed the data and wrote the manuscript. M.L. provided analytical tools, interpreted the results and edited the manuscript. F.M. provided analytical tools and edited the manuscript. S.B.-G. interpreted the results and edited the manuscript. S.P. designed the study, performed the experiments, analysed the data and wrote the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Stefano Palminteri.

Supplementary information

PDF files

  1. Supplementary Information

    Supplementary Notes, Supplementary Figures 1–7, Supplementary Discussion, Supplementary Methods, Supplementary References.