Information about action outcomes differentially affects learning from self-determined versus imposed choices

Chambon, Valérian; Théro, Héloïse; Vidal, Marie; Vandendriessche, Henri; Haggard, Patrick; Palminteri, Stefano

doi:10.1038/s41562-020-0919-5

Article
Published: 03 August 2020

Nature Human Behaviour volume 4, pages 1067–1079 (2020)

Abstract

The valence of new information influences learning rates in humans: good news tends to receive more weight than bad news. We investigated this learning bias in four experiments, by systematically manipulating the source of required action (free versus forced choices), outcome contingencies (low versus high reward) and motor requirements (go versus no-go choices). Analysis of model-estimated learning rates showed that the confirmation bias in learning rates was specific to free choices, but was independent of outcome contingencies. The bias was also unaffected by the motor requirements, thus suggesting that it operates in the representational space of decisions, rather than motoric actions. Finally, model simulations revealed that learning rates estimated from the choice-confirmation model had the effect of maximizing performance across low- and high-reward environments. We therefore suggest that choice-confirmation bias may be adaptive for efficient learning of action–outcome contingencies, above and beyond fostering person-level dispositions such as self-esteem.
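
The learning rule summarized in the abstract can be illustrated with a minimal sketch, assuming a Rescorla-Wagner style value update in which free choices use separate learning rates for confirmatory (positive prediction error) and disconfirmatory (negative prediction error) outcomes, while forced choices use a single rate. The function names, the parameter values, the +1/−1 reward coding, the logistic choice rule and the two-option task are all assumptions made for illustration; they are not taken from the authors' code, which is available from the GitHub repository listed under Code availability.

```python
import math
import random


def update_value(q, reward, free_choice, alpha_conf, alpha_disc, alpha_forced):
    """Return the updated value of the chosen option after one outcome."""
    delta = reward - q  # prediction error
    if free_choice:
        # Assumption: a positive prediction error confirms a free choice,
        # a negative one disconfirms it; each gets its own learning rate.
        alpha = alpha_conf if delta > 0 else alpha_disc
    else:
        # Assumption: forced choices use one valence-independent rate.
        alpha = alpha_forced
    return q + alpha * delta


def simulate(n_trials=200, p_reward=(0.7, 0.3), p_free=0.5, beta=5.0,
             alpha_conf=0.4, alpha_disc=0.1, alpha_forced=0.2, seed=0):
    """Simulate a hypothetical two-option task mixing free and forced trials."""
    rng = random.Random(seed)
    q = [0.0, 0.0]  # initial option values
    for _ in range(n_trials):
        free = rng.random() < p_free
        if free:
            # Logistic (two-option softmax) choice with inverse temperature beta.
            p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
            choice = 0 if rng.random() < p0 else 1
        else:
            # On forced trials the option is imposed at random.
            choice = rng.randrange(2)
        # Outcome is +1 with the option's reward probability, otherwise -1.
        reward = 1.0 if rng.random() < p_reward[choice] else -1.0
        q[choice] = update_value(q[choice], reward, free,
                                 alpha_conf, alpha_disc, alpha_forced)
    return q


if __name__ == "__main__":
    print(simulate())
```

With `alpha_conf` larger than `alpha_disc`, values of freely chosen options move more after good news than after bad news, whereas forced-choice updates are symmetric. Setting the two free-choice rates equal recovers an unbiased learner for comparison.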

**Fig. 1: Schematic of the trial procedure and stimuli.**

**Fig. 3: Parameter results of the full model from all four experiments.**

**Fig. 4: Model comparison results from all four experiments.**

**Fig. 5: Model comparison from experiment 2.**

**Fig. 6: Comparison of the winning model (3α) with a model with a simple perseveration parameter.**

**Fig. 7: Learning rate analysis and model comparison for H&L models.**

**Fig. 8: Valence-dependent learning biases as a function of the choice type, execution mode and outcome type.**

Data availability

The data that support the findings of this study are available from the GitHub repository (https://github.com/spalminteri/agency).

Code availability

Custom code scripts have been made available on the GitHub repository (https://github.com/spalminteri/agency). Additional modified scripts can be accessed upon request.

Acknowledgements

V.C. was supported by the Agence Nationale de la Recherche (ANR) grants ANR-17-EURE-0017 (Frontiers in Cognition), ANR-10-IDEX-0001-02 PSL (program ‘Investissements d’Avenir’), ANR-16-CE37-0012-01 (ANR JCJ) and ANR-19-CE37-0014-01 (ANR PRC). H.T. was supported by a PSL/ENS studentship. M.V. was supported by FIRE (‘Programme Bettencourt’) and by a Région Île-de-France studentship. P.H. was supported by the Chaire Blaise Pascal of the Région Île-de-France. S.P. was supported by an ATIP-Avenir grant (R16069JS), the Programme Emergence(s) de la Ville de Paris, the Fyssen Foundation and the Fondation Schlumberger pour l’Education et la Recherche (FSER). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

These authors contributed equally: Valérian Chambon, Héloïse Théro.

Authors and Affiliations

Institut Jean Nicod, Département d’Études Cognitives, École Normale Supérieure, EHESS, CNRS, PSL University, Paris, France
Valérian Chambon & Marie Vidal
Laboratoire de Neurosciences Cognitives et Computationnelles, Département d’Études Cognitives, École Normale Supérieure, INSERM, PSL University, Paris, France
Héloïse Théro, Henri Vandendriessche, Patrick Haggard & Stefano Palminteri
Institute of Psychiatry and Neuroscience of Paris (IPNP), INSERM U1266, Université de Paris, Paris, France
Marie Vidal
Institute of Cognitive Neuroscience, University College London, London, UK
Patrick Haggard

Contributions

V.C., S.P. and P.H. developed the study concept. Testing and data collection were performed by H.T. and M.V. H.V. helped to write the Psychtoolbox script for data collection. Data analysis was performed by V.C., H.T., M.V. and S.P. V.C. and H.T. drafted the manuscript. S.P. and P.H. provided critical revisions. All authors approved the final version of the manuscript for submission.

Corresponding authors

Correspondence to Valérian Chambon, Héloïse Théro or Stefano Palminteri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Primary Handling Editor: Marike Schiffer.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Methods, Supplementary Results, Supplementary Figs. 1–3, Supplementary Tables 1–3 and Supplementary References.

Reporting Summary

About this article

Cite this article

Chambon, V., Théro, H., Vidal, M. et al. Information about action outcomes differentially affects learning from self-determined versus imposed choices. Nat Hum Behav 4, 1067–1079 (2020). https://doi.org/10.1038/s41562-020-0919-5

Received: 10 September 2019
Accepted: 26 June 2020
Published: 03 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1038/s41562-020-0919-5
