Rethinking model-based and model-free influences on mental effort and striatal prediction errors

Abstract

A standard assumption in neuroscience is that low-effort model-free learning is automatic and continuously used, whereas more complex model-based strategies are only used when the rewards they generate are worth the additional effort. We present evidence refuting this assumption. First, we demonstrate flaws in previous reports of combined model-free and model-based reward prediction errors in the ventral striatum that probably led to spurious results. More appropriate analyses yield no evidence of model-free prediction errors in this region. Second, we find that task instructions generating more correct model-based behaviour reduce rather than increase mental effort. This is inconsistent with cost–benefit arbitration between model-based and model-free strategies. Together, our data indicate that model-free learning may not be automatic. Instead, humans can reduce mental effort by using a model-based strategy alone rather than arbitrating between multiple strategies. Our results call for re-evaluation of the assumptions in influential theories of learning and decision-making.

Fig. 1: Structure and timeline of two-stage tasks.
Fig. 2: Results of a logistic regression analysis of the stay probability in consecutive trial pairs.
Fig. 3: Distribution of ratings by condition for effort, understanding and complexity.
Fig. 4: Correlation between BOLD activity and RPEs in the striatum and the medial prefrontal cortex (mPFC) for the abstract condition (n = 48).
Fig. 5: Brain activity comparison between the abstract (n = 48) and story (n = 46) conditions.
Fig. 6: Pupil diameter z scores at feedback for the abstract (n = 36) and story (n = 37) conditions.

Data availability

The behavioural and eye-tracking data are available at https://github.com/carolfs/fmri_magic_carpet and the fMRI images are available at https://openneuro.org/datasets/ds004455.

Code availability

The code used to run the task and the analyses is available at https://github.com/carolfs/fmri_magic_carpet. It makes use of PsychoPy v.1.90.3, SPM v.12, FSL v.6.0.5, Python v.3.8.13, R v.4.1.3, Julia v.1.7.2, MATLAB v.R2019b and MACS v.1.3.

References

  1. Daw, N. D., Niv, Y. & Dayan, P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711 (2005).

  2. Akam, T., Costa, R. & Dayan, P. Simple plans or sophisticated habits? State, transition and learning interactions in the two-step task. PLoS Comput. Biol. 11, e1004648 (2015).

  3. Kool, W., Cushman, F. A. & Gershman, S. J. When does model-based control pay off? PLoS Comput. Biol. 12, e1005090 (2016).

  4. Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P. & Dolan, R. J. Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215 (2011).

  5. Wunderlich, K., Smittenaar, P. & Dolan, R. J. Dopamine enhances model-based over model-free choice behavior. Neuron 75, 418–424 (2012).

  6. Dezfouli, A. & Balleine, B. W. Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized. PLoS Comput. Biol. 9, e1003364 (2013).

  7. Otto, A. R., Raio, C. M., Chiang, A., Phelps, E. A. & Daw, N. D. Working-memory capacity protects model-based learning from stress. Proc. Natl Acad. Sci. USA 110, 20941–20946 (2013).

  8. Smittenaar, P., FitzGerald, T. H., Romei, V., Wright, N. D. & Dolan, R. J. Disruption of dorsolateral prefrontal cortex decreases model-based in favor of model-free control in humans. Neuron 80, 914–919 (2013).

  9. Eppinger, B., Walter, M., Heekeren, H. R. & Li, S.-C. Of goals and habits: age-related and individual differences in goal-directed decision-making. Front. Neurosci. https://doi.org/10.3389/fnins.2013.00253 (2013).

  10. Dezfouli, A., Lingawi, N. W. & Balleine, B. W. Habits as action sequences: hierarchical action control and changes in outcome value. Philos. Trans. R. Soc. B: Biol. Sci. 369, 20130482 (2014).

  11. Otto, A. R., Skatova, A., Madlon-Kay, S. & Daw, N. D. Cognitive control predicts use of model-based reinforcement learning. J. Cogn. Neurosci. 27, 319–333 (2014).

  12. Friedel, E. et al. Devaluation and sequential decisions: linking goal-directed and model-based behavior. Front. Human Neurosci. https://doi.org/10.3389/fnhum.2014.00587 (2014).

  13. Economides, M., Kurth-Nelson, Z., Lübbert, A., Guitart-Masip, M. & Dolan, R. J. Model-based reasoning in humans becomes automatic with training. PLoS Comput. Biol. 11, e1004463 (2015).

  14. Deserno, L. et al. Ventral striatal dopamine reflects behavioral and neural signatures of model-based control during sequential decision making. Proc. Natl Acad. Sci. USA 112, 1595–1600 (2015).

  15. Voon, V. et al. Disorders of compulsivity: a common bias towards learning habits. Mol. Psychiatry 20, 345–352 (2015).

  16. Gillan, C. M., Otto, A. R., Phelps, E. A. & Daw, N. D. Model-based learning protects against forming habits. Cogn., Affect., Behav. Neurosci. 15, 523–536 (2015).

  17. Doll, B. B., Bath, K. G., Daw, N. D. & Frank, M. J. Variability in dopamine genes dissociates model-based and model-free reinforcement learning. J. Neurosci. 36, 1211–1222 (2016).

  18. Decker, J. H., Otto, A. R., Daw, N. D. & Hartley, C. A. From creatures of habit to goal-directed learners: tracking the developmental emergence of model-based reinforcement learning. Psychol. Sci. 27, 848–858 (2016).

  19. Konovalov, A. & Krajbich, I. Gaze data reveal distinct choice processes underlying model-based and model-free reinforcement learning. Nat. Commun. 7, 12438 (2016).

  20. Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A. & Daw, N. D. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife https://elifesciences.org/articles/11305 (2016).

  21. Sharp, M. E., Foerde, K., Daw, N. D. & Shohamy, D. Dopamine selectively remediates ‘model-based’ reward learning: a computational approach. Brain 139, 355–364 (2016).

  22. Miller, K. J., Botvinick, M. M. & Brody, C. D. Dorsal hippocampus contributes to model-based planning. Nat. Neurosci. 20, 1269–1276 (2017).

  23. Shahar, N. et al. Credit assignment to state-independent task representations and its relationship with model-based decision making. Proc. Natl Acad. Sci. USA 116, 15871–15876 (2019).

  24. Shahar, N. et al. Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling. PLoS Comput. Biol. 15, e1006803 (2019).

  25. Grosskurth, E. D., Bach, D. R., Economides, M., Huys, Q. J. M. & Holper, L. No substantial change in the balance between model-free and model-based control via training on the two-step task. PLoS Comput. Biol. 15, e1007443 (2019).

  26. Sebold, M. et al. When habits are dangerous: alcohol expectancies and habitual decision making predict relapse in alcohol dependence. Biol. Psychiatry 82, 847–856 (2017).

  27. Nebe, S. et al. No association of goal-directed and habitual control with alcohol consumption in young adults. Addiction Biol. 23, 379–393 (2018).

  28. Feher da Silva, C. & Hare, T. A. Humans primarily use model-based inference in the two-stage task. Nat. Hum. Behav. 4, 1053–1066 (2020).

  29. Seow, T. X. F. et al. Model-based planning deficits in compulsivity are linked to faulty neural representations of task structure. J. Neurosci. 41, 6539–6550 (2021).

  30. Doll, B. B., Simon, D. A. & Daw, N. D. The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075–1081 (2012).

  31. Chen, H. et al. Model-based and model-free control predicts alcohol consumption developmental trajectory in young adults: a 3-year prospective study. Biol. Psychiatry 89, 980–989 (2021).

  32. Sharp, P. B., Dolan, R. J. & Eldar, E. Disrupted state transition learning as a computational marker of compulsivity. Psychol. Med. https://doi.org/10.1017/S0033291721003846 (2021).

  33. Dromnelle, R. et al. in Biomimetic and Biohybrid Systems (eds Vouloutsi, V. et al.) 68–79 (Springer International Publishing, 2020).

  34. Wise, R. A. Dopamine, learning and motivation. Nat. Rev. Neurosci. 5, 483–494 (2004).

  35. Gläscher, J., Daw, N., Dayan, P. & O’Doherty, J. P. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66, 585–595 (2010).

  36. Lee, S. W., Shimojo, S. & O’Doherty, J. P. Neural computations underlying arbitration between model-based and model-free learning. Neuron 81, 687–699 (2014).

  37. Donoso, M., Collins, A. G. E. & Koechlin, E. Foundations of human reasoning in the prefrontal cortex. Science 344, 1481–1486 (2014).

  38. Charpentier, C. J., Iigaya, K. & O’Doherty, J. P. A neuro-computational account of arbitration between choice imitation and goal emulation during human observational learning. Neuron 106, 687–699.e7 (2020).

  39. Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B. & Dolan, R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).

  40. Raja Beharelle, A., Polania, R., Hare, T. A. & Ruff, C. C. Transcranial stimulation over frontopolar cortex elucidates the choice attributes and neural mechanisms used to resolve exploration-exploitation trade-offs. J. Neurosci. 35, 14544–14556 (2015).

  41. Kahneman, D. & Beatty, J. Pupil diameter and load on memory. Science 154, 1583–1585 (1966).

  42. Poock, G. K. Information processing vs pupil diameter. Percept. Mot. Skills 37, 1000–1002 (1973).

  43. Jepma, M. & Nieuwenhuis, S. Pupil diameter predicts changes in the exploration-exploitation trade-off: evidence for the adaptive gain theory. J. Cogn. Neurosci. 23, 1587–1596 (2011).

  44. Reimer, J. et al. Pupil fluctuations track fast switching of cortical states during quiet wakefulness. Neuron 84, 355–362 (2014).

  45. Richer, F. & Beatty, J. Contrasting effects of response uncertainty on the task-evoked pupillary response and reaction time. Psychophysiology 24, 258–262 (1987).

  46. Urai, A. E., Braun, A. & Donner, T. H. Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias. Nat. Commun. 8, 14637 (2017).

  47. O’Reilly, J. X. et al. Dissociable effects of surprise and model update in parietal and anterior cingulate cortex. Proc. Natl Acad. Sci. USA 110, E3660–E3669 (2013).

  48. Grueschow, M., Kleim, B. & Ruff, C. C. Role of the locus coeruleus arousal system in cognitive control. J. Neuroendocrinol. 32, e12890 (2020).

  49. Kool, W., Gershman, S. J. & Cushman, F. A. Cost-benefit arbitration between multiple reinforcement-learning systems. Psychol. Sci. https://doi.org/10.1177/0956797617708288 (2017).

  50. Kool, W., Gershman, S. J. & Cushman, F. A. Planning complexity registers as a cost in metacontrol. J. Cogn. Neurosci. 30, 1391–1404 (2018).

  51. Daw, N. D. Are we of two minds? Nat. Neurosci. 21, 1497 (2018).

  52. Collins, A. G. & Cockburn, J. Beyond dichotomies in reinforcement learning. Nat. Rev. Neurosci. 21, 576–586 (2020).

  53. Bennett, D., Niv, Y. & Langdon, A. J. Value-free reinforcement learning: Policy optimization as a minimal model of operant behavior. Curr. Opin. Behav. Sci. 41, 114–121 (2021).

  54. Heo, S., Sung, Y. & Lee, S. W. Effects of subclinical depression on prefrontal-striatal model-based and model-free learning. PLoS Comput. Biol. 17, e1009003 (2021).

  55. Bromberg-Martin, E. S., Matsumoto, M., Hong, S. & Hikosaka, O. A pallidus-habenula-dopamine pathway signals inferred stimulus values. J. Neurophysiol. 104, 1068–1076 (2010).

  56. Sadacca, B. F., Jones, J. L. & Schoenbaum, G. Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework. eLife https://elifesciences.org/articles/13665 (2016).

  57. Sharpe, M. J. et al. Dopamine transients are sufficient and necessary for acquisition of model-based associations. Nat. Neurosci. 20, 735–742 (2017).

  58. Feher da Silva, C., Lombardi, G., Edelson, M. & Hare, T. Is model-based learning related to dietary self-control? (Centre for Open Science, 2018); osf.io/wkcvx

  59. Esteban, O. et al. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat. Methods 16, 111–116 (2019).

  60. Esteban, O. et al. fMRIPrep 1.2.5 (2018).

  61. Lewandowski, D., Kurowicka, D. & Joe, H. Generating random correlation matrices based on vines and extended onion method. J. Multivar. Anal. 100, 1989–2001 (2009).

  62. Stan modeling language users guide and reference manual, version 2.16.0 (Stan Development Team, 2017).

  63. Carpenter, B. et al. Stan: a probabilistic programming language. J. Statist. Softw. http://www.jstatsoft.org/v76/i01/ (2017).

  64. PyStan: the Python interface to Stan (Stan Development Team, 2017); http://mc-stan.org

  65. Vehtari, A., Gelman, A. & Gabry, J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. https://doi.org/10.1007/s11222-016-9696-4 (2016).

  66. McElreath, R. Monsters and Mixtures 2nd edn, 369–397 (CRC Press, 2020).

  67. Gorgolewski, K. et al. Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in Python. Front. Neuroinform. 5, 13 (2011).

  68. Gorgolewski, K. J. et al. Nipype (2018).

  69. Tustison, N. J. et al. N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging 29, 1310–1320 (2010).

  70. Fonov, V., Evans, A., McKinstry, R., Almli, C. & Collins, D. Unbiased nonlinear average age-appropriate brain templates from birth to adulthood. NeuroImage 47, S102 (2009).

  71. Avants, B., Epstein, C., Grossman, M. & Gee, J. Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Med. Image Anal. 12, 26–41 (2008).

  72. Zhang, Y., Brady, M. & Smith, S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Med. Imaging 20, 45–57 (2001).

  73. Wang, S. et al. Evaluation of field map and nonlinear registration methods for correction of susceptibility artifacts in diffusion MRI. Front. Neuroinform. http://journal.frontiersin.org/article/10.3389/fninf.2017.00017/full (2017).

  74. Huntenburg, J. M. Evaluating Nonlinear Coregistration of BOLD EPI and T1w Images. Master’s thesis, Freie Univ., Berlin (2014).

  75. Treiber, J. M. et al. Characterization and correction of geometric distortions in 814 diffusion weighted images. PLoS ONE 11, e0152472 (2016).

  76. Jenkinson, M. & Smith, S. A global optimisation method for robust affine registration of brain images. Med. Image Anal. 5, 143–156 (2001).

  77. Greve, D. N. & Fischl, B. Accurate and robust brain image alignment using boundary-based registration. NeuroImage 48, 63–72 (2009).

  78. Jenkinson, M., Bannister, P., Brady, M. & Smith, S. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage 17, 825–841 (2002).

  79. Cox, R. W. & Hyde, J. S. Software tools for analysis and visualization of fMRI data. NMR Biomed. 10, 171–178 (1997).

  80. Power, J. D. et al. Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage 84, 320–341 (2014).

  81. Behzadi, Y., Restom, K., Liau, J. & Liu, T. T. A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37, 90–101 (2007).

  82. Lanczos, C. Evaluation of noisy data. J. Soc. Ind. Appl. Math. Ser. B Numer. Anal. 1, 76–85 (1964).

  83. Abraham, A. et al. Machine learning for neuroimaging with scikit-learn. Front. Neuroinform. https://www.frontiersin.org/articles/10.3389/fninf.2014.00014/full (2014).

  84. Gorgolewski, K. J. Confounds from fmriprep: which one would you use for GLM? (2017); https://neurostars.org/t/confounds-from-fmriprep-which-one-would-you-use-for-glm/326/2

  85. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).

  86. Bürkner, P.-C. brms: an R package for Bayesian multilevel models using Stan. J. Stat. Softw. 80, 1–28 (2017).

Acknowledgements

We thank G.M. Parente for the illustrations used in the experimental tasks, K. Treiber and E. Silingardi for helping with the fMRI data collection, S. Gobbi for helping with the fMRI preprocessing and analysis as well as reviewing our calculations, and N.D. Daw, P. Dayan, M. Grueschow, A. Konovalov, I. Krajbich and S. Nebe for helpful comments on early drafts of this manuscript. Our acknowledgement of their feedback does not imply that these individuals fully agree with our conclusions or opinions in this paper. This work was supported by the CAPES Foundation (grant no. 88881.119317/2016-01), awarded to C.F.S., and the European Union’s Seventh Framework programme for research, technological development and demonstration under grant agreement no. 607310 (Nudge-it), awarded to T.A.H. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

C.F.S. and T.A.H. conceived the project. All authors designed the experiments. C.F.S. and G.L. collected and analysed the data with input from M.E. and T.A.H. C.F.S. and T.A.H. wrote the first draft of the manuscript. All authors revised the manuscript for submission.

Corresponding authors

Correspondence to Carolina Feher da Silva or Todd A. Hare.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Human Behaviour thanks Mehdi Khamassi, Jan Gläscher and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Mean estimated coefficients from the combined-RPE and separated-RPE GLMs within the nucleus accumbens.

Mean estimated coefficients from the (a, c) combined-RPE and (b, d) separated-RPE GLMs within the nucleus accumbens for the abstract (N = 48) and story (N = 46) conditions. Each black dot represents the coefficient from a single participant. The box and whisker plots show the distribution across the entire sample. The box extends from the first quartile to the third quartile of the distribution, with a line at the median. The whiskers extend from the box by 1.5 times the inter-quartile range. These coefficients were obtained using hybrid parameter estimates from the previous sample (ref. 4).
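
For readers unfamiliar with the box and whisker convention described in this legend, the following is a minimal sketch of how the box edges and whisker limits could be computed for one set of coefficients. It is not taken from the authors' code. The function name `box_whisker_summary`, the example values and the choice of linear (inclusive) quantile interpolation are all assumptions made for illustration, as is the common plotting convention that each whisker stops at the most extreme data point lying within 1.5 times the inter-quartile range of the box.

```python
from statistics import quantiles


def box_whisker_summary(coefficients):
    """Summarise values the way a standard box and whisker plot does.

    Assumption: quartiles use linear ("inclusive") interpolation, and each
    whisker ends at the most extreme data point within 1.5 * IQR of the box.
    """
    q1, median, q3 = quantiles(coefficients, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers are drawn to the furthest observations inside the fences.
    lower_whisker = min(x for x in coefficients if x >= lower_fence)
    upper_whisker = max(x for x in coefficients if x <= upper_fence)
    # Points beyond the fences would be drawn individually.
    outliers = [x for x in coefficients if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical per-participant coefficients, not data from the study.
example = [-0.4, -0.1, 0.0, 0.1, 0.2, 0.3, 0.5, 1.9]
summary = box_whisker_summary(example)
```

Different plotting libraries interpolate quartiles differently, so box edges computed this way may differ slightly from those in the published figure.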

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Feher da Silva, C., Lombardi, G., Edelson, M. et al. Rethinking model-based and model-free influences on mental effort and striatal prediction errors. Nat Hum Behav 7, 956–969 (2023). https://doi.org/10.1038/s41562-023-01573-1

  • DOI: https://doi.org/10.1038/s41562-023-01573-1
