Model-based choices involve prospective neural activity


Decisions may arise via 'model-free' repetition of previously reinforced actions or by 'model-based' evaluation, which is widely thought to follow from prospective anticipation of action consequences using a learned map or model. While choices and neural correlates of decision variables sometimes reflect knowledge of their consequences, it remains unclear whether this actually arises from prospective evaluation. Using functional magnetic resonance imaging and a sequential reward-learning task in which paths contained decodable object categories, we found that humans' model-based choices were associated with neural signatures of future paths observed at decision time, suggesting a prospective mechanism for choice. Prospection also covaried with the degree of model-based influences on neural correlates of decision variables and was inversely related to prediction error signals thought to underlie model-free learning. These results dissociate separate mechanisms underlying model-based and model-free evaluation and support the hypothesis that model-based influences on choices and neural decision variables result from prospection.


Figure 1: Task design.
Figure 2: Model behavioral predictions and data.
Figure 3: Neural evidence of prospective activation correlates with model-based behavior.
Figure 4: Correlates of choice probabilities derived from chosen minus unchosen values estimated by model-free and model-based learning at the task's first stage.
Figure 5: Neural evidence of model-free prediction errors and correlates of prediction error with model-free behavior.




We thank S.M. Fleming and L.Y. Atlas for helpful discussions. This work was supported by NINDS grant R01NS078784.

Author information


All authors designed the experiment and analyses. B.B.D. and K.D.D. performed the experiment. B.B.D. analyzed the data. B.B.D., N.D.D. and D.S. wrote the paper.

Corresponding author

Correspondence to Bradley B Doll.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Inferior frontal gyrus activation and model-free behavior

Relationship between inferior frontal gyrus (IFG) activation and model-free behavior (Online Methods, GLM4). A prospective model-based learner is indifferent to changes in start states, facing the same prospective problem on each trial. In contrast, a model-free learner who maintains a separate set of expected values for each start state may face additional processing demands (e.g., retrieval) when start states change. To test this possibility, we sought regions where such a switch cost might be reflected in the BOLD response, via greater activation when start states differed from one trial to the next relative to when they remained the same. a. Contrast of task start states (faces, tools) that differed from the previous trial, relative to those that matched. Effect plotted at P = 0.001 uncorrected for display purposes. (Peak voxel: −48 16 22; P = 1.1 × 10⁻⁷, cluster family-wise error corrected for whole-brain comparisons. Cluster size: 833 voxels. Peak t(19) = 6.27. No other clusters survived correction.) b. IFG activation correlates with model-free behavior. Individual values reflect average activation of cluster identified from group-level contrast. IFG activation correlates negatively with model-based behavior (estimate = −0.65, χ²(1) = 11.91, P = 0.0006). Lines depict group-level linear effects and 95% confidence curves.
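The contrast between the two learners described in this caption can be illustrated with a minimal sketch. It is not the authors' fitted model: the deterministic transition table, the action names, the assignment of scenes and body parts to the second stage, and the learning rate of 0.5 are all assumptions made for illustration. The point it shows is that a cached model-free value learned from one start state does not transfer to the other start state, whereas a prospective lookup through a learned map does.

```python
# Hypothetical task map (assumption): both start states lead to the same
# second-stage states, so experience from one start state is informative
# about the other.
TRANSITIONS = {
    ("faces", "left"): "scenes",
    ("faces", "right"): "body_parts",
    ("tools", "left"): "scenes",
    ("tools", "right"): "body_parts",
}


def model_free_update(q, start, action, reward, alpha=0.5):
    """Update only the cached value of the (start state, action) pair experienced."""
    key = (start, action)
    old = q.get(key, 0.0)
    delta = reward - old  # reward prediction error
    q[key] = old + alpha * delta
    return delta


def second_stage_update(v, state, reward, alpha=0.5):
    """Learn the value of the second-stage state that was actually visited."""
    old = v.get(state, 0.0)
    v[state] = old + alpha * (reward - old)


def model_based_value(v, start, action):
    """Evaluate prospectively: find where the action leads, then read that state's value."""
    return v.get(TRANSITIONS[(start, action)], 0.0)


q_mf, v_stage2 = {}, {}

# One rewarded trial, experienced from the "faces" start state.
model_free_update(q_mf, "faces", "left", reward=1.0)
second_stage_update(v_stage2, TRANSITIONS[("faces", "left")], reward=1.0)

# Values for the equivalent action in the other start state.
mf_tools = q_mf.get(("tools", "left"), 0.0)               # nothing cached yet
mb_tools = model_based_value(v_stage2, "tools", "left")   # generalizes via the map
```

Under these assumptions the model-free learner must retrieve a different set of cached values whenever the start state changes, which is the kind of switch cost the caption describes.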

Supplementary Figure 2 Group level depiction of category-specific activation

Group level depiction of category-specific activation used to create functional ROIs from localizer data (ROIs for analysis were created in native space for each subject). Each category ROI was constructed from the intersection of contrasts with all other categories (e.g., scenes ROI: scenes > body parts ∩ scenes > faces ∩ scenes > tools), thus preventing any overlap in ROIs (here, the conjunction of these group level contrasts is presented). Each contrast was thresholded at P < 0.001, uncorrected. Peaks of clusters surviving family-wise error correction for whole-brain multiple comparisons: body parts: 50 −78 8, t(19) = 9.23, cluster P = 2 × 10⁻⁶; −48 −76 12, t(19) = 6.48, cluster P = 0.008; scenes: −26 −46 −10, t(19) = 11.71, cluster P = 6.8 × 10⁻⁵; 24 −34 −16, t(19) = 9.42, P = 1.7 × 10⁻⁵; −12 −98 0, t(19) = 8.7, P = 2.8 × 10⁻⁹; tools: −8 −78 6, t(19) = 10.22, P = 9.3 × 10⁻¹⁴. No clusters survived correction for the faces category (peak: 34 −90 −12, t(19) = 4.28, P = 0.992).
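The intersection rule in this caption can be sketched with boolean masks. This is an illustrative reconstruction, not the authors' pipeline: the function name `build_category_rois`, the simulated response maps, the toy volume size, and the threshold value are all hypothetical, and a real analysis would threshold statistical maps at the stated P < 0.001 rather than at an arbitrary cutoff. Because a voxel cannot exceed a positive threshold for both "A > B" and "B > A", the resulting masks cannot overlap.

```python
import itertools

import numpy as np

CATEGORIES = ["scenes", "body_parts", "faces", "tools"]


def build_category_rois(tmaps, categories, t_threshold):
    """Return one boolean mask per category.

    tmaps[(a, b)] is assumed to hold the statistic map for the contrast a > b.
    A voxel enters category a's ROI only if a > b is suprathreshold for
    every other category b (the intersection of all pairwise contrasts).
    """
    rois = {}
    for cat in categories:
        mask = np.ones_like(next(iter(tmaps.values())), dtype=bool)
        for other in categories:
            if other != cat:
                mask &= tmaps[(cat, other)] > t_threshold
        rois[cat] = mask
    return rois


# Simulated data (assumption): one random response map per category, with
# each pairwise contrast taken as a simple difference of responses.
rng = np.random.default_rng(0)
response = {c: rng.normal(size=(6, 6, 6)) for c in CATEGORIES}
tmaps = {
    (a, b): response[a] - response[b]
    for a, b in itertools.permutations(CATEGORIES, 2)
}

rois = build_category_rois(tmaps, CATEGORIES, t_threshold=0.5)

# Count voxels shared by each pair of ROIs; the construction implies none.
overlap = {
    (a, b): int(np.sum(rois[a] & rois[b]))
    for a, b in itertools.combinations(CATEGORIES, 2)
}
```

Checking `overlap` after construction is a cheap way to confirm that the category ROIs are mutually exclusive before they are used for any downstream analysis.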

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1 and 2 and Supplementary Tables 1–4 (PDF 295 kb)

Supplementary Methods Checklist (PDF 175 kb)


Cite this article

Doll, B., Duncan, K., Simon, D. et al. Model-based choices involve prospective neural activity. Nat Neurosci 18, 767–772 (2015).
