Abstract
The valence of new information influences learning rates in humans: good news tends to receive more weight than bad news. We investigated this learning bias in four experiments, by systematically manipulating the source of required action (free versus forced choices), outcome contingencies (low versus high reward) and motor requirements (go versus no-go choices). Analysis of model-estimated learning rates showed that the confirmation bias in learning rates was specific to free choices, but was independent of outcome contingencies. The bias was also unaffected by the motor requirements, thus suggesting that it operates in the representational space of decisions, rather than motoric actions. Finally, model simulations revealed that learning rates estimated from the choice-confirmation model had the effect of maximizing performance across low- and high-reward environments. We therefore suggest that choice-confirmation bias may be adaptive for efficient learning of action–outcome contingencies, above and beyond fostering person-level dispositions such as self-esteem.
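As an illustration of what a valence-dependent learning rate means in practice, the following is a minimal sketch and not the authors' fitted model (their scripts are in the GitHub repository cited under Code availability). It assumes a simple delta-rule value update. The function name `update_value`, the parameter names `alpha_confirm`, `alpha_disconfirm` and `alpha_forced`, and all numeric values are hypothetical and chosen only for illustration. After a free choice, positive prediction errors are weighted more than negative ones. After a forced choice, a single learning rate is applied.

```python
def update_value(q, reward, free_choice,
                 alpha_confirm=0.3, alpha_disconfirm=0.1, alpha_forced=0.2):
    """Return the updated value of the chosen option after one outcome.

    All parameter values are illustrative assumptions, not estimates
    reported in the article.
    """
    delta = reward - q  # prediction error: outcome minus current estimate
    if free_choice:
        # Free choice: "good news" (positive error) gets the larger rate.
        alpha = alpha_confirm if delta > 0 else alpha_disconfirm
    else:
        # Forced choice: same rate regardless of outcome valence.
        alpha = alpha_forced
    return q + alpha * delta


if __name__ == "__main__":
    # Same starting value, one good and one bad outcome, free versus forced.
    for free in (True, False):
        after_win = update_value(0.0, 1.0, free)
        after_loss = update_value(0.0, -1.0, free)
        print(f"free={free}: after win {after_win:+.2f}, after loss {after_loss:+.2f}")
```

Under these assumed parameters, the value estimate moves further after a good outcome than after an equally sized bad outcome only when the choice was free, which is the asymmetry the abstract describes.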
Relevant articles
Open Access articles citing this article.
- The reliability of assistance systems modulates the sense of control and acceptability of human operators. Scientific Reports, Open Access, 02 September 2023
- Agency rescues competition for credit assignment among predictive cues from adverse learning conditions. Scientific Reports, Open Access, 10 August 2021
Data availability
The data that support the findings of this study are available from the GitHub repository (https://github.com/spalminteri/agency).
Code availability
Custom code scripts have been made available on the GitHub repository (https://github.com/spalminteri/agency). Additional modified scripts can be accessed upon request.
Acknowledgements
V.C. was supported by the Agence Nationale de la Recherche (ANR) grants ANR-17-EURE-0017 (Frontiers in Cognition), ANR-10-IDEX-0001-02 PSL (program ‘Investissements d’Avenir’), ANR-16-CE37-0012-01 (ANR JCJ) and ANR-19-CE37-0014-01 (ANR PRC). H.T. was supported by a PSL/ENS studentship. M.V. was supported by FIRE (‘Programme Bettencourt’) and by a Région Île-de-France studentship. P.H. was supported by the Chaire Blaise Pascal of the Région Île-de-France. S.P. was supported by an ATIP-Avenir grant (R16069JS), the Programme Emergence(s) de la Ville de Paris, the Fyssen Foundation and the Fondation Schlumberger pour l’Education et la Recherche (FSER). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
V.C., S.P. and P.H. developed the study concept. Testing and data collection were performed by H.T. and M.V. H.V. helped to write the Psychtoolbox script for data collection. Data analysis was performed by V.C., H.T., M.V. and S.P. V.C. and H.T. drafted the manuscript. S.P. and P.H. provided critical revisions. All authors approved the final version of the manuscript for submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Primary Handling Editor: Marike Schiffer.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Methods, Supplementary Results, Supplementary Figs. 1–3, Supplementary Tables 1–3 and Supplementary References.
About this article
Cite this article
Chambon, V., Théro, H., Vidal, M. et al. Information about action outcomes differentially affects learning from self-determined versus imposed choices. Nat Hum Behav 4, 1067–1079 (2020). https://doi.org/10.1038/s41562-020-0919-5
DOI: https://doi.org/10.1038/s41562-020-0919-5