Article

Dorsal hippocampus contributes to model-based planning

Nature Neuroscience volume 20, pages 1269–1276 (2017)

This article has been updated

Abstract

Planning can be defined as action selection that leverages an internal model of the outcomes likely to follow each possible action. Its neural mechanisms remain poorly understood. Here we adapt recent advances from human research for rats, presenting for the first time an animal task that produces many trials of planned behavior per session, making multitrial rodent experimental tools available to study planning. We use part of this toolkit to address a perennially controversial issue in planning: the role of the dorsal hippocampus. Although prospective hippocampal representations have been proposed to support planning, intact planning in animals with damaged hippocampi has been repeatedly observed. Combining formal algorithmic behavioral analysis with muscimol inactivation, we provide causal evidence directly linking dorsal hippocampus with planning behavior. Our results and methods open the door to new and more detailed investigations of the neural mechanisms of planning in the hippocampus and throughout the brain.

Change history

  • 17 November 2017

    In the version of this article initially published, the green label in Fig. 1c read "rightward choices" instead of "leftward choices." The error has been corrected in the HTML and PDF versions of the article.

Acknowledgements

We thank J. Erlich, C. Kopec, C.A. Duan, T. Hanks and A. Begelfer for training K.J.M. in the techniques necessary to carry out these experiments, as well as for comments and advice on the project. We thank N. Daw, I. Witten, Y. Niv, B. Wilson, T. Akam, A. Akrami and A. Solway for comments and advice on the project, and we thank J. Teran, K. Osorio, A. Sirko, R. LaTourette, L. Teachen and S. Stein for assistance in carrying out behavioral experiments. We especially thank T. Akam for suggestions on the physical layout of the behavior box and other experimental details. We thank A. Bornstein, B. Scott, A. Piet and L. Hunter for comments on the manuscript. K.J.M. was supported by training grant NIH T-32 MH065214 and by a Harold W. Dodds fellowship from Princeton University.

Author information

Affiliations

  1. Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA.

    • Kevin J Miller
    • Matthew M Botvinick
    • Carlos D Brody
  2. Gatsby Computational Neuroscience Unit, University College London, London, UK.

    • Matthew M Botvinick
  3. Google DeepMind, London, UK.

    • Matthew M Botvinick
  4. Howard Hughes Medical Institute and Department of Molecular Biology, Princeton University, Princeton, New Jersey, USA.

    • Carlos D Brody

Authors

  1. Kevin J Miller

  2. Matthew M Botvinick

  3. Carlos D Brody

Contributions

K.J.M., M.M.B. and C.D.B. conceived the project. K.J.M. designed and carried out the experiments and the data analysis, with supervision from M.M.B. and C.D.B. K.J.M., M.M.B. and C.D.B. wrote the paper, starting from an initial draft by K.J.M.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Matthew M Botvinick or Carlos D Brody.

Integrated supplementary information

Supplementary figures

  1. Reward rates of model-based and model-free agents.

  2. Results of one-trial-back analysis applied to the behavioral dataset.

  3. Results of logistic regression analysis applied to each rat, as well as simulated data generated from a fit of the mixture model to that rat’s dataset.

  4. Movement times are faster following common transition trials.

  5. Placement of cannula in individual rats.

  6. Results of logistic regression analysis applied to the inactivation dataset.

  7. Results of logistic regression analysis applied to each rat in the inactivation dataset.

  8. Results of logistic regression analysis applied to simulated data generated by the reduced model fit to each rat in the inactivation dataset.

  9. Rat performances compared between inactivation and control sessions.

  10. Results of one-trial-back analysis applied to the inactivation dataset.

  11. Results of one-trial-back stay/switch analysis applied to each rat in the inactivation dataset.

  12. Results of fitting the multiagent model jointly to the OFC inactivation and saline datasets.

  13. Results of fitting the multiagent model jointly to the dH inactivation and saline datasets.

  14. Plots of posterior density projected onto planes defined by the parameter governing change in model-based weight and other population parameters for hippocampus (top, orange) and OFC (bottom, purple) inactivation datasets.

  15. Normalized cross-validated likelihood for logistic regression models (Online Methods), as a function of the number of previous trials used to predict the upcoming choice.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–15 and Supplementary Discussion

  2. Life Sciences Reporting Summary

About this article

DOI

https://doi.org/10.1038/nn.4613