
Advances in modeling learning and decision-making in neuroscience

Abstract

An organism’s survival depends on its ability to learn about its environment and to make adaptive decisions in the service of achieving the best possible outcomes in that environment. To study the neural circuits that support these functions, researchers have increasingly relied on models that formalize the computations required to carry them out. Here, we review the recent history of computational modeling of learning and decision-making, and how these models have been used to advance understanding of prefrontal cortex function. We discuss how such models have advanced from their origins in basic algorithms of updating and action selection to increasingly account for complexities in the cognitive processes required for learning and decision-making, and the representations over which they operate. We further discuss how a deeper understanding of the real-world complexities in these computations has shed light on the fundamental constraints on optimal behavior, and on the complex interactions between corticostriatal pathways to determine such behavior. The continuing and rapid development of these models holds great promise for understanding the mechanisms by which animals adapt to their environments, and what leads to maladaptive forms of learning and decision-making within clinical populations.
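
The review's starting point is the family of basic algorithms for value updating and action selection. As a minimal sketch of that idea (not the specific models analyzed in the article), the Python snippet below pairs a delta-rule (Rescorla-Wagner / Q-learning-style) value update with softmax choice in a toy two-armed bandit; the parameter names (alpha, beta) and the reward probabilities are illustrative assumptions.

```python
import numpy as np

def softmax(q, beta):
    """Action selection: convert learned values into choice probabilities."""
    z = beta * (q - q.max())            # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def run_bandit(reward_probs, alpha=0.1, beta=3.0, n_trials=200, seed=0):
    """Delta-rule value updating in a toy two-armed bandit (illustrative only)."""
    rng = np.random.default_rng(seed)
    q = np.zeros(len(reward_probs))     # learned action values, initialized at zero
    for _ in range(n_trials):
        p = softmax(q, beta)                        # action selection
        a = rng.choice(len(q), p=p)                 # sample an action
        r = float(rng.random() < reward_probs[a])   # binary reward outcome
        q[a] += alpha * (r - q[a])                  # prediction-error update
    return q

print(run_bandit([0.8, 0.2]))  # learned values roughly track the true reward probabilities
```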


Fig. 1: Levels of state and action representation during learning.
Fig. 2: Evaluating and selecting between options.


Acknowledgements

The authors are grateful to Romy Frömer for helpful feedback on earlier drafts. AC is supported by NSF grant 2020844 and NIH grant R01MH119383. AS is supported by an Alfred P. Sloan Foundation Research Fellowship in Neuroscience and NIH grants R01MH124849, R21MH122863, and P20GM103645.

Author information

Corresponding authors

Correspondence to Anne G. E. Collins or Amitai Shenhav.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Collins, A.G.E., Shenhav, A. Advances in modeling learning and decision-making in neuroscience. Neuropsychopharmacol. (2021). https://doi.org/10.1038/s41386-021-01126-y
