Skip to main content

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

Behavioural and neural characterization of optimistic reinforcement learning

Abstract

When forming and updating beliefs about future life outcomes, people tend to consider good news and to disregard bad news. This tendency is assumed to support the optimism bias. Whether this learning bias is specific to ‘high-level’ abstract belief update or a particular expression of a more general ‘low-level’ reinforcement learning process is unknown. Here we report evidence in favour of the second hypothesis. In a simple instrumental learning task, participants incorporated better-than-expected outcomes at a higher rate than worse-than-expected ones. In addition, functional imaging indicated that inter-individual difference in the expression of optimistic update corresponds to enhanced prediction error signalling in the reward circuitry. Our results constitute a step towards the understanding of the genesis of optimism bias at the neurocomputational level.

Access options

Rent or Buy article

Get time limited or full article access on ReadCube.

from$8.99

All prices are NET prices.

Figure 1: Behavioural task and variables.
Figure 2: Behavioural and computational identification of optimistic reinforcement learning.
Figure 3: Functional signatures of the optimistic reinforcement learning.
Figure 4: Robustness of optimistic reinforcement learning.

References

  1. 1

    Burtt, E. A. The English Philosophers from Bacon to Mill (Modern Library, 1939).

    Google Scholar 

  2. 2

    Weinstein, N. D. Unrealistic optimism about future life events. J. Pers. Soc. Psychol. 39, 806–820 (1980).

    Article  Google Scholar 

  3. 3

    Shepperd, J. A., Klein, W. M. P., Waters, E. A. & Weinstein, N. D. Taking stock of unrealistic optimism. Perspect. Psychol. Sci. 8, 395–411 (2013).

    Article  Google Scholar 

  4. 4

    Shepperd, J. A., Waters, E. A., Weinstein, N. D. & Klein, W. M. P. A primer on unrealistic optimism. Curr. Dir. Psychol. Sci. 24, 232–237 (2015).

    Article  Google Scholar 

  5. 5

    Shepperd, J. A., Ouellette, J. A. & Fernandez, J. K. Abandoning unrealistic optimism: performance estimates and the temporal proximity of self-relevant feedback. J. Pers. Soc. Psychol. 70, 844–855 (1996).

    Article  Google Scholar 

  6. 6

    Waters, E. A. et al. Correlates of unrealistic risk beliefs in a nationally representative sample. J. Behav. Med. 34, 225–235 (2011).

    Article  Google Scholar 

  7. 7

    Schoenbaum, M. Do smokers understand the mortality effects of smoking? Evidence from the health and retirement survey. Am. J. Public Health 87, 755–759 (1997).

    CAS  Article  Google Scholar 

  8. 8

    Sharot, T., Korn, C. W. & Dolan, R. J. How unrealistic optimism is maintained in the face of reality. Nat. Neurosci. 14, 1475–1479 (2011).

    CAS  Article  Google Scholar 

  9. 9

    Eil, D. & Rao, J. M. The good news–bad news effect: asymmetric processing of objective information about yourself. Am. Econ. J. Microecon. 3, 114–138 (2011).

    Article  Google Scholar 

  10. 10

    Sharot, T. & Garrett, N. Forming beliefs: why valence matters. Trends Cogn. Sci. 20, 25–33 (2016).

    Article  Google Scholar 

  11. 11

    Sharot, T., Riccardi, A. M., Raio, C. M. & Phelps, E. A. Neural mechanisms mediating optimism bias. Nature 450, 102–105 (2007).

    CAS  Article  Google Scholar 

  12. 12

    Moutsiana, C. et al. Human development of the ability to learn from bad news. Proc. Natl Acad. Sci. USA 110, 16396–16401 (2013).

    CAS  Article  Google Scholar 

  13. 13

    Garrett, N. et al. Losing the rose tinted glasses: neural substrates of unbiased belief updating in depression. Front. Hum. Neurosci. 8, 639 (2014).

    Article  Google Scholar 

  14. 14

    Moutsiana, C., Charpentier, C. J., Garrett, N., Cohen, M. X. & Sharot, T. Human frontal-subcortical circuit and asymmetric belief updating. J. Neurosci. 35, 14077–14085 (2015).

    CAS  Article  Google Scholar 

  15. 15

    Garrison, J., Erdeniz, B. & Done, J. Prediction error in reinforcement learning: a meta-analysis of neuroimaging studies. Neurosci. Biobehav. Rev. 37, 1297–1310 (2013).

    Article  Google Scholar 

  16. 16

    Worbe, Y. et al. Reinforcement learning and Gilles de la Tourette syndrome: dissociation of clinical phenotypes and pharmacological treatments. Arch. Gen. Psychiatry 68, 1257–1266 (2011).

    Article  Google Scholar 

  17. 17

    Palminteri, S., Boraud, T., Lafargue, G., Dubois, B. & Pessiglione, M. Brain hemispheres selectively track the expected value of contralateral options. J. Neurosci. 29, 13465–13472 (2009).

    CAS  Article  Google Scholar 

  18. 18

    Palminteri, S. et al. Critical roles for anterior insula and dorsal striatum in punishment-based avoidance learning. Neuron 76, 998–1009 (2012).

    CAS  Article  Google Scholar 

  19. 19

    Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 1998).

    Google Scholar 

  20. 20

    Rescorla, R. A. & Wagner, A. R. in Classical Conditioning: Current Research and Theory 64–99 (Appleton Century Crofts, 1972).

    Google Scholar 

  21. 21

    Daunizeau, J., Adam, V. & Rigoux, L. VBA: a probabilistic treatment of nonlinear models for neurobiological and behavioural data. PLoS Comput. Biol. 10, e1003441 (2014).

    Article  Google Scholar 

  22. 22

    O’Doherty, J. P., Hampton, A. & Kim, H. Model-based fMRI and its application to reward learning and decision making. Ann. N. Y. Acad. Sci. 1104, 35–53 (2007).

    Article  Google Scholar 

  23. 23

    Shah, P., Harris, A. J. L., Bird, G., Catmur, C. & Hahn, U. A pessimistic view of optimistic belief updating. Cogn. Psychol. 90, 71–127 (2016).

    Article  Google Scholar 

  24. 24

    Sharot, T. & Garrett, N. The myth of a pessimistic view of optimistic belief updating — a commentary on Shah et al. Preprint at http://dx.doi.org/10.2139/ssrn.2811752 (2016).

  25. 25

    Doll, B. B., Hutchison, K. E. & Frank, M. J. Dopaminergic genes predict individual differences in susceptibility to confirmation bias. J. Neurosci. 31, 6188–6198 (2011).

    CAS  Article  Google Scholar 

  26. 26

    Niv, Y. et al. Reinforcement learning in multidimensional environments relies on attention mechanisms. J. Neurosci. 35, 8145–8157 (2015).

    CAS  Article  Google Scholar 

  27. 27

    Sharot, T., Guitart-Masip, M., Korn, C. W., Chowdhury, R. & Dolan, R. J. How dopamine enhances an optimism bias in humans. Curr. Biol. 22, 1477–1481 (2012).

    CAS  Article  Google Scholar 

  28. 28

    Kahneman, D. & Tversky, A. Prospect theory: an analysis of decision under risk. Econometrica 47, 263–292 (1979).

    Article  Google Scholar 

  29. 29

    Huys, Q. J. M., Maia, T. V & Frank, M. J. Computational psychiatry as a bridge from neuroscience to clinical applications. Nat. Neurosci. 19, 404–413 (2016).

    CAS  Article  Google Scholar 

  30. 30

    Sharot, T. The Optimism Bias: Why We’re Wired to Look on the Bright Side (Robinson, 2012).

    Google Scholar 

  31. 31

    Voltaire . Candide, or Optimism (Penguin, 2013).

    Google Scholar 

  32. 32

    Gifford, R. The dragons of inaction: pychological barriers that limit climate change mitigation and adaptation. Am. Psychol. 66, 290–302 (2011).

    Article  Google Scholar 

  33. 33

    Sharot, T., Guitart-Masip, M., Korn, C. W., Chowdhury, R. & Dolan, R. J. How dopamine enhances an optimism bias in humans. Curr. Biol. 22, 1477–1481 (2012).

    CAS  Article  Google Scholar 

  34. 34

    Kuzmanovic, B., Jefferson, A. & Vogeley, K. The role of the neural reward circuitry in self-referential optimistic belief updates. Neuroimage 133, 151–162 (2016).

    Article  Google Scholar 

  35. 35

    Skvortsova, V., Palminteri, S. & Pessiglione, M. Learning to minimize efforts versus maximizing rewards: computational principles and neural correlates. J. Neurosci. 34, 15621–15630 (2014).

    CAS  Article  Google Scholar 

  36. 36

    Bartra, O., McGuire, J. T. & Kable, J. W. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412–427 (2013).

    Article  Google Scholar 

  37. 37

    Domenech, P. & Koechlin, E. Executive control and decision-making in the prefrontal cortex. Curr. Opin. Behav. Sci. 1, 101–106 (2015).

    Article  Google Scholar 

  38. 38

    Kolling, N., Behrens, T. E. J., Wittmann, M. K. & Rushworth, M. F. S. Multiple signals in anterior cingulate cortex. Curr. Opin. Neurobiol. 37, 36–43 (2016).

    CAS  Article  Google Scholar 

  39. 39

    Mathys, C., Daunizeau, J., Friston, K. J. & Stephan, K. E. A Bayesian foundation for individual learning under uncertainty. Front. Hum. Neurosci. 5, 39 (2011).

    Article  Google Scholar 

  40. 40

    Lebreton, M., Abitbol, R., Daunizeau, J. & Pessiglione, M. Automatic integration of confidence in the brain valuation signal. Nat. Neurosci. 18, 1159–1167 (2015).

    CAS  Article  Google Scholar 

  41. 41

    Hampton, A. N., Bossaerts, P. & O’Doherty, J. P. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J. Neurosci. 26, 8360–8367 (2006).

    CAS  Article  Google Scholar 

  42. 42

    van Den Bos, W., Cohen, M. X., Kahnt, T. & Crone, E. A. Striatum-medial prefrontal cortex connectivity predicts developmental changes in reinforcement learning. Cereb. Cortex 22, 1247–1255 (2012).

    Article  Google Scholar 

  43. 43

    Frank, M. J., Moustafa, A. A, Haughey, H. M., Curran, T. & Hutchison, K. E. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc. Natl Acad. Sci. USA 104, 16311–16316 (2007).

    CAS  Article  Google Scholar 

  44. 44

    Niv, Y., Edlund, J. A., Dayan, P. & O’Doherty, J. P. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J. Neurosci. 32, 551–562 (2012).

    CAS  Article  Google Scholar 

  45. 45

    Carver, C. S., Scheier, M. F. & Segerstrom, S. C. Optimism. Clin. Psychol. Rev. 30, 879–889 (2010).

    Article  Google Scholar 

  46. 46

    Tindle, H. A. et al. Optimism, cynical hostility, and incident coronary heart disease and mortality in the Women’s Health Initiative. Circulation 120, 656–662 (2009).

    Article  Google Scholar 

  47. 47

    Macleod, A. K. & Conway, C. Well-being and the anticipation of future positive experiences: the role of income, social networks, and planning ability. Cogn. Emot. 19, 357–374 (2005).

    Article  Google Scholar 

  48. 48

    Johnson, D. D. P. & Fowler, J. H. The evolution of overconfidence. Nature 477, 317–320 (2011).

    CAS  Article  Google Scholar 

  49. 49

    Cazé, R. D. & van der Meer, M. A. A. Adaptive properties of differential learning rates for positive and negative outcomes. Biol. Cybern. 107, 711–719 (2013).

    Article  Google Scholar 

  50. 50

    Raafat, R. M., Chater, N. & Frith, C. Herding in humans. Trends Cogn. Sci. 13, 420–428 (2009).

    Article  Google Scholar 

  51. 51

    Hills, T. T., Todd, P. M., Lazer, D., Redish, A. D. & Couzin, I. D. Exploration versus exploitation in space, mind, and society. Trends Cogn. Sci. 19, 46–54 (2015).

    Article  Google Scholar 

  52. 52

    Palminteri, S., Khamassi, M., Joffily, M. & Coricelli, G. Contextual modulation of value signals in reward and punishment learning. Nat. Commun. 6, 8096 (2015).

    CAS  Article  Google Scholar 

  53. 53

    Popper, K. The Logic of Scientific Discovery (Routledge, 2005).

    Book  Google Scholar 

  54. 54

    Dienes, Z. Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference (Palgrave Macmillan, 2008).

    Google Scholar 

  55. 55

    Lebreton, M. & Palminteri, S. Assessing inter-individual variability in brain-behavior relationship with functional neuroimaging. Preprint at bioRxivhttp://dx.doi.org/10.1101/036772 (2016).

Download references

Acknowledgements

We thank Y. Worbe and M. Pessiglione for granting access to the first dataset, V. Wyart, B. Bahrami and B. Kuzmanovic for comments, and T. Sharot and N. Garrett for providing activation masks. S.P. was supported by a Marie Sklodowska-Curie Individual European Fellowship (PIEF-GA-2012 Grant 328822) and is currently supported by an ATIP-Avenir grant (R16069JS). G.L. was supported by a PhD fellowship of the Ministère de l'enseignement supérieur et de la recherche. M.L. was supported by an EU Marie Sklodowska-Curie Individual Fellowship (IF-2015 Grant 657904) and acknowledges the support of the Bettencourt-Schueller Foundation. The second experiment was supported by the ANR-ORA, NESSHI 2010–2015 research grant to S.B.-G. The Institut d’Étude de la Cognition is supported by the LabEx IEC (ANR-10-LABX-0087 IEC) and the IDEX PSL* (ANR-10-IDEX-0001-02 PSL*). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Affiliations

Authors

Contributions

G.L. performed the experiment, analysed the data and wrote the manuscript. M.L. provided analytical tools, interpreted the results and edited the manuscript. F.M. provided analytical tools and edited the manuscript. S.B.-G. interpreted the results and edited the manuscript. S.P. designed the study, performed the experiments, analysed the data and wrote the manuscript.

Corresponding author

Correspondence to Stefano Palminteri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Supplementary information

Supplementary Information

Supplementary Notes, Supplementary Figures 1–7, Supplementary Discussion, Supplementary Methods, Supplementary References. (PDF 606 kb)

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Lefebvre, G., Lebreton, M., Meyniel, F. et al. Behavioural and neural characterization of optimistic reinforcement learning. Nat Hum Behav 1, 0067 (2017). https://doi.org/10.1038/s41562-017-0067

Download citation

Further reading

Search

Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing