Abstract
We use mental models of the world—cognitive maps—to guide behavior. The lateral orbitofrontal cortex (lOFC) is typically thought to support behavior by deploying these maps to simulate outcomes, but recent evidence suggests that it may instead support behavior by underlying map creation. We tested between these two alternatives using outcome-specific devaluation and a high-potency chemogenetic approach. Selectively inactivating lOFC principal neurons when male rats learned distinct cue–outcome associations, but before outcome devaluation, disrupted subsequent inference, confirming a role for the lOFC in creating new maps. However, lOFC inactivation surprisingly led to generalized devaluation, a result that is inconsistent with a complete mapping failure. Using a reinforcement learning framework, we show that this effect is best explained by a circumscribed deficit in credit assignment precision during map construction, suggesting that the lOFC has a selective role in defining the specificity of associations that comprise cognitive maps.
Data availability
All data used in this study are available at https://colab.research.google.com/drive/1VYRAnvAO8OmzQpVaJe5radKIZnpEn638?usp=sharing and https://colab.research.google.com/drive/1ORP8Q9ceLBXlupvrCDh7HLAhQKsjQswr?usp=sharing. Additional information on materials and protocols is available upon request to the corresponding authors.
Code availability
All code used in this study is available at https://colab.research.google.com/drive/1VYRAnvAO8OmzQpVaJe5radKIZnpEn638?usp=sharing and https://colab.research.google.com/drive/1ORP8Q9ceLBXlupvrCDh7HLAhQKsjQswr?usp=sharing.
References
Behrens, T. E. J. et al. What is a cognitive map? Organizing knowledge for flexible behavior. Neuron 100, 490–509 (2018).
Titone, D., Ditman, T., Holzman, P. S., Eichenbaum, H. & Levy, D. L. Transitive inference in schizophrenia: impairments in relational memory organization. Schizophr. Res. 68, 235–247 (2004).
Schoenbaum, G., Chang, C. Y., Lucantonio, F. & Takahashi, Y. K. Thinking outside the box: orbitofrontal cortex, imagination, and how we can treat addiction. Neuropsychopharmacology 41, 2966–2976 (2016).
Sharp, P., Dolan, R. & Eldar, E. Disrupted state transition learning as a computational marker of compulsivity. Psychol. Med. 1–11 (2021); https://psyarxiv.com/x29jq/
Wallis, J. D. Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat. Neurosci. 15, 13–19 (2011).
Rudebeck, P. H. & Rich, E. L. Orbitofrontal cortex. Curr. Biol. 28, R1083–R1088 (2018).
Rudebeck, P. H. & Murray, E. A. The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron 84, 1143–1156 (2014).
Wilson, R. C., Takahashi, Y. K., Schoenbaum, G. & Niv, Y. Orbitofrontal cortex as a cognitive map of task space. Neuron 81, 267–279 (2014).
Schuck, N. W., Cai, M. B., Wilson, R. C. & Niv, Y. Human orbitofrontal cortex represents a cognitive map of state space. Neuron 91, 1402–1412 (2016).
Rustichini, A. & Padoa-Schioppa, C. A neuro-computational model of economic decisions. J. Neurophysiol. 114, 1382–1398 (2015).
Gallagher, M., McMahan, R. W. & Schoenbaum, G. Orbitofrontal cortex and representation of incentive value in associative learning. J. Neurosci. 19, 6610–6614 (1999).
Izquierdo, A., Suda, R. K. & Murray, E. A. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J. Neurosci. 24, 7540–7548 (2004).
Howard, J. D. et al. Targeted stimulation of human orbitofrontal networks disrupts outcome-guided behavior. Curr. Biol. 30, 490–498.e4 (2020).
West, E. A., DesJardin, J. T., Gale, K. & Malkova, L. Transient inactivation of orbitofrontal cortex blocks reinforcer devaluation in macaques. J. Neurosci. 31, 15128–15135 (2011).
Rudebeck, P. H., Saunders, R. C., Prescott, A. T., Chau, L. S. & Murray, E. A. Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating. Nat. Neurosci. 16, 1140–1145 (2013).
Gardner, M. P. H. & Schoenbaum, G. The orbitofrontal cartographer. Behav. Neurosci. 135, 267–276 (2021).
Miller, K. J., Botvinick, M. M. & Brody, C. D. Value representations in the rodent orbitofrontal cortex drive learning, not choice. eLife 11, e64575 (2022).
Malvaez, M., Shieh, C., Murphy, M. D., Greenfield, V. Y. & Wassum, K. M. Distinct cortical–amygdala projections drive reward value encoding and retrieval. Nat. Neurosci. 22, 762–769 (2019).
Baltz, E. T., Yalcinbas, E. A., Renteria, R. & Gremel, C. M. Orbital frontal cortex updates state-induced value change for decision-making. eLife 7, e35988 (2018).
Gardner, M. P. H. et al. Processing in lateral orbitofrontal cortex is required to estimate subjective preference during initial, but not established, economic choice. Neuron 108, 526–537.e4 (2020).
Hart, E. E., Sharpe, M. J., Gardner, M. P. & Schoenbaum, G. Responding to preconditioned cues is devaluation sensitive and requires orbitofrontal cortex during cue-cue learning. eLife 9, e59998 (2020).
Sias, A. C. et al. A bidirectional corticoamygdala circuit for the encoding and retrieval of detailed reward memories. eLife 10, e68617 (2021).
Bonaventura, J. et al. High-potency ligands for DREADD imaging and activation in rodents and monkeys. Nat. Commun. 10, 4627 (2019).
Costa, K. M., Sengupta, A. & Schoenbaum, G. The orbitofrontal cortex is necessary for learning to ignore. Curr. Biol. 31, 2652–2657.e3 (2021).
Gomez, J. L. et al. Chemogenetics revealed: DREADD occupancy and activation via converted clozapine. Science 357, 503–507 (2017).
Panayi, M. C. & Killcross, S. The role of the rodent lateral orbitofrontal cortex in simple Pavlovian cue–outcome learning depends on training experience. Cereb. Cortex Commun. 2, tgab010 (2021).
Weiler, M. et al. Effects of repetitive transcranial magnetic stimulation in aged rats depend on pre-treatment cognitive status: toward individualized intervention for successful cognitive aging. Brain Stimul. 14, 1219–1225 (2021).
Daw, N. D., Niv, Y. & Dayan, P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711 (2005).
Panayi, M. C., Khamassi, M. & Killcross, S. The rodent lateral orbitofrontal cortex as an arbitrator selecting between model-based and model-free learning systems. Behav. Neurosci. 135, 226–244 (2021).
Miller, K. J., Shenhav, A. & Ludvig, E. A. Habits without values. Psychol. Rev. 126, 292–311 (2019).
Walton, M. E., Behrens, T. E. J., Buckley, M. J., Rudebeck, P. H. & Rushworth, M. F. S. Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron 65, 927–939 (2010).
Dearden, R., Friedman, N. & Russell, S. Bayesian Q-learning. Proc. 15th National Conference on Artificial Intelligence 761–768 (AAAI, 1998).
Wilson, R. C. & Collins, A. G. E. Ten simple rules for the computational modeling of behavioral data. eLife 8, e49547 (2019).
Dezfouli, A. & Balleine, B. W. Learning the structure of the world: the adaptive nature of state-space and action representations in multi-stage decision-making. PLoS Comput. Biol. 15, e1007334 (2019).
Gershman, S. J. & Niv, Y. Learning latent structure: carving nature at its joints. Curr. Opin. Neurobiol. 20, 251–256 (2010).
Panayi, M. C. & Killcross, S. Functional heterogeneity within the rodent lateral orbitofrontal cortex dissociates outcome devaluation and reversal learning deficits. eLife 7, e37357 (2018).
Jones, J. L. et al. Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science 338, 953–956 (2012).
McDannald, M. A., Saddoris, M. P., Gallagher, M. & Holland, P. C. Lesions of orbitofrontal cortex impair rats’ differential outcome expectancy learning but not conditioned stimulus-potentiated feeding. J. Neurosci. 25, 4626–4632 (2005).
Volkow, N. D. & Fowler, J. S. Addiction, a disease of compulsion and drive: involvement of the orbitofrontal cortex. Cereb. Cortex 10, 318–325 (2000).
Van Hoesen, G. W., Parvizi, J. & Chu, C. C. Orbitofrontal cortex pathology in Alzheimer’s disease. Cereb. Cortex 10, 243–251 (2000).
Denburg, N. L. et al. The orbitofrontal cortex, real-world decision making, and normal aging. Ann. NY Acad. Sci. 1121, 480–498 (2007).
Jackowski, A. P. et al. The involvement of the orbitofrontal cortex in psychiatric disorders: an update of neuroimaging findings. Braz. J. Psychiatry 34, 207–212 (2012).
Decker, J. H., Otto, A. R., Daw, N. D. & Hartley, C. A. From creatures of habit to goal-directed learners: tracking the developmental emergence of model-based reinforcement learning. Psychol. Sci. 27, 848–858 (2016).
Rauch, S. L. et al. Functional magnetic resonance imaging study of regional brain activation during implicit sequence learning in obsessive–compulsive disorder. Biol. Psychiatry 61, 330–336 (2007).
Walther, S. et al. Limbic links to paranoia: increased resting-state functional connectivity between amygdala, hippocampus and orbitofrontal cortex in schizophrenia patients with paranoia. Eur. Arch. Psychiatry Clin. Neurosci. 1, 1021–1032 (2022).
Winkowski, D. E. et al. Orbitofrontal cortex neurons respond to sound and activate primary auditory cortex neurons. Cereb. Cortex 28, 868–879 (2018).
Banerjee, A. et al. Value-guided remapping of sensory cortex by lateral orbitofrontal cortex. Nature 585, 245–250 (2020).
Gardner, M. P. H., Conroy, J. S., Shaham, M. H., Styer, C. V. & Schoenbaum, G. Lateral orbitofrontal inactivation dissociates devaluation-sensitive behavior and economic choice. Neuron 96, 1192–1203.e4 (2017).
Takahashi, Y. K. et al. The orbitofrontal cortex and ventral tegmental area are necessary for learning from unexpected outcomes. Neuron 62, 269–280 (2009).
Ostlund, S. B. & Balleine, B. W. Orbitofrontal cortex mediates outcome encoding in Pavlovian but not instrumental conditioning. J. Neurosci. 27, 4819–4825 (2007).
Schoenbaum, G., Nugent, S. L., Saddoris, M. P. & Setlow, B. Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor discriminations. Neuroreport 13, 885–890 (2002).
Dias, R., Robbins, T. W. & Roberts, A. C. Dissociable forms of inhibitory control within prefrontal cortex with an analog of the Wisconsin Card Sort Test: restriction to novel situations and independence from “on-line” processing. J. Neurosci. 17, 9285–9297 (1997).
Murray, E. A., Moylan, E. J., Saleem, K. S., Basile, B. M. & Turchi, J. Specialized areas for value updating and goal selection in the primate orbitofrontal cortex. eLife 4, e11695 (2015).
Murray, E. A. & Rudebeck, P. H. Specializations for reward-guided decision-making in the primate ventral prefrontal cortex. Nat. Rev. Neurosci. 19, 404–417 (2018).
Folloni, D. et al. Ultrasound modulation of macaque prefrontal cortex selectively alters credit assignment-related activity and behavior. Sci. Adv. 7, eabg7700 (2021).
Rudebeck, P. H., Saunders, R. C., Lundgren, D. A. & Murray, E. A. Specialized representations of value in the orbital and ventrolateral prefrontal cortex: desirability versus availability of outcomes. Neuron 95, 1208–1220.e5 (2017).
Iordanova, M. D., Killcross, A. S. & Honey, R. C. Role of the medial prefrontal cortex in acquired distinctiveness and equivalence of cues. Behav. Neurosci. 121, 1431–1436 (2007).
West, E. A. et al. Noninvasive brain stimulation rescues cocaine-induced prefrontal hypoactivity and restores flexible behavior. Biol. Psychiatry 89, 1001–1011 (2021).
Blair, C. A. J., Blundell, P., Galtress, T., Hall, G. & Killcross, S. Discrimination between outcomes in instrumental learning: effects of preexposure to the reinforcers. Q. J. Exp. Psychol. B. 56, 253–265 (2003).
Acknowledgements
We thank J. Bonaventura for guidance on chemogenetic methods, N. Raheja and S. Agyemang for technical assistance, and M. Panayi, E. Hart, P. Rudebeck and P. Holland for stimulating discussions. The opinions expressed in this article are the authors’ own and do not reflect the view of the NIH or Department of Health and Human Services. This work was funded by the NIDA IRP (K.M.C., M.G. and G.S.), the Max Planck Society (R.S., K.L. and P.D.), the German Federal Ministry of Education and Research (R.S.), the Humboldt Foundation (P.D.) and the German Research Foundation grant MA 8509/1-1 (K.M.C.).
Author information
Contributions
K.M.C., M.P.H.G., and G.S. conceived the study. K.M.C., R.S., K.L., P.M.C., M.P.H.G., P.D. and G.S. developed the methods. K.M.C. performed the investigation. R.S., K.L. and P.D. prepared the software. K.M.C., R.S., K.L. and P.D. verified the results. K.M.C. and R.S. contributed to data curation. K.M.C., R.S. and P.M.C. performed the formal analysis. K.M.C. and R.S. prepared the data for visualization. P.D. and G.S. were responsible for the provision of resources. P.D. and G.S. supervised the work. P.D. and G.S. managed the project. K.M.C. wrote the initial draft. K.M.C., R.S., P.D. and G.S. reviewed and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Neuroscience thanks Kate Wassum and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Data fitting with a reinforcement learning model that allows for a shift between model-based (MB) and model-free (MF) learning.
(a) Model fit results for our MB vs MF reinforcement learning model. Note that it can also replicate our behavioral results well. (b) Schematic of the critical aspect of the model and the expected result: the observation rate for both the MB and MF systems, as well as the potential contribution of each to behavior, were free parameters, and we expected that the contribution of the MB system would be diminished, either by a reduced MB observation rate or an increase in the MF contribution. (c) Values of the critical observation rate-related parameters, namely the proportion of contribution of the MF (wmf) system, the MF observation rate (ηmf), and the MB observation rate (ηmb) for both control and hM4d model fits (two-tailed unpaired t-test; P = 0.007**). Note that instead of a reduction in MB learning or proportional contribution, only the MF observation rate was significantly higher in the hM4d group. See Supplementary Table 2 for detailed parameter comparisons. (d) Correlations between estimated and original parameters for the MB vs MF model. Note that parameter recovery of all critical observation rate-related parameters was not very faithful (linear regression; r < 0.7). Data are represented as mean ± SEM. CTRL n = 13 and hM4d n = 15 fits of data from biologically independent animals. **P < 0.01.
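The weighting described in this legend can be illustrated with a minimal sketch. It is not the authors' model: it assumes a plain delta rule in which each observation rate acts as a step size, two cues that are each always followed by their own outcome, and a linear mixture of cached (MF) and map-derived (MB) cue values. The function names and all parameter values are hypothetical.

```python
import numpy as np

def delta_update(estimate, target, rate):
    """Move an estimate toward a target by a fraction `rate` (delta rule)."""
    return estimate + rate * (target - estimate)

def train(n_trials, eta_mf, eta_mb, n_cues=2, n_outcomes=2):
    """Assumed design: cue i is always followed by outcome i, worth 1.0."""
    v_mf = np.zeros(n_cues)                  # cached cue values (MF system)
    trans = np.zeros((n_cues, n_outcomes))   # cue -> outcome map (MB system)
    for _ in range(n_trials):
        for cue in range(n_cues):
            observed = np.zeros(n_outcomes)
            observed[cue] = 1.0
            v_mf[cue] = delta_update(v_mf[cue], 1.0, eta_mf)
            trans[cue] = delta_update(trans[cue], observed, eta_mb)
    return v_mf, trans

def hybrid_value(v_mf, trans, outcome_values, w_mf):
    """Mix cached values with values inferred through the learned map."""
    v_mb = trans @ outcome_values
    return w_mf * v_mf + (1.0 - w_mf) * v_mb

v_mf, trans = train(n_trials=50, eta_mf=0.2, eta_mb=0.2)
devalued = np.array([0.0, 1.0])  # outcome 0 is devalued after training
low = hybrid_value(v_mf, trans, devalued, w_mf=0.1)
high = hybrid_value(v_mf, trans, devalued, w_mf=0.9)
print("w_mf = 0.1:", low.round(3))
print("w_mf = 0.9:", high.round(3))
```

In this sketch, devaluation reaches the cue only through the MB term, so a larger wmf leaves more residual value on the cue paired with the devalued outcome. This is the kind of shift the legend says was expected but not found.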
Extended Data Fig. 2 Parameter recovery and correlations for the reinforcement learning model with association specificity deficit.
(a) Correlations (linear regression) between estimated and original parameters. Note that most parameters were recovered with r > 0.7, with the least faithfully recovered parameter being the state transition observation rate ηtm with r < 0.6. (b) Correlations between fitted parameters (linear regression). Note that only correlations between \(\nabla_{\mathrm{pell2cue}}\) and wmf (r = −0.54) in HB and between \(\nabla_{\mathrm{pell2cue}}\) and ηtm (r = −0.57) are substantial. CTRL n = 13 and hM4d n = 15 fits of data from biologically independent animals.
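For readers unfamiliar with parameter recovery, the general procedure behind panel (a) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it uses a single-parameter delta-rule learner, Gaussian response noise, and a least-squares grid search, all chosen here for brevity. Data are simulated from known rates, the model is refit, and the fitted rates are correlated with the originals.

```python
import numpy as np

def simulate(rate, n_trials, noise_sd, rng):
    """Noisy responses from a delta-rule learner approaching a value of 1."""
    v = 0.0
    responses = np.empty(n_trials)
    for t in range(n_trials):
        responses[t] = v + rng.normal(0.0, noise_sd)
        v += rate * (1.0 - v)
    return responses

def predict(rate, n_trials):
    """Noise-free learning curve: v_t = 1 - (1 - rate)^t."""
    return 1.0 - (1.0 - rate) ** np.arange(n_trials)

def fit_rate(responses, grid):
    """Least-squares grid search for the rate (a stand-in for a real optimizer)."""
    errors = [np.sum((responses - predict(r, len(responses))) ** 2) for r in grid]
    return grid[int(np.argmin(errors))]

rng = np.random.default_rng(0)
true_rates = rng.uniform(0.05, 0.6, size=30)   # "original" parameters
grid = np.linspace(0.01, 0.99, 99)
fitted = np.array([fit_rate(simulate(r, 20, 0.05, rng), grid) for r in true_rates])
recovery_r = np.corrcoef(true_rates, fitted)[0, 1]
print(f"recovery correlation r = {recovery_r:.2f}")
```

A high correlation means the fitting procedure can identify the parameter from data of this size and noise level. A low one, as the legend reports for ηtm, means fitted values of that parameter should be interpreted with caution.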
Extended Data Fig. 3 Replication of the results of Sias et al.22 with the imprecision model.
(a) Plots of the empirical data retrieved from the study by Sias et al.22 (Fig. 4 of that paper), where it was shown that inactivation of lOFC terminals in basolateral amygdala (ArchT group) during outcome-specific Pavlovian training did not impair Pavlovian acquisition (left panel) but did prevent subsequent PIT effects on the elevation ratio of lever pressing for congruent rewards (right panel), in relation to controls (eYFP group). (b) Modeling of the empirical results in a with the imprecision model. Note that the model fully recapitulates the observed effects. (c) Average values of the model parameters and their definitions. Note that the imprecision term χ was increased by ~60% in the model fits for the behavior of ArchT rats in comparison to eYFP controls. CTRL n = 13 and hM4d n = 15 fits of data from biologically independent animals. eYFP and ArchT n = 8 fits of data from biologically independent animals.
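One way to build intuition for an imprecision term such as χ is the toy sketch below. The exact formulation of the authors' imprecision model is not given in this text, so the rule here is an assumption: on each trial a fraction χ of the credit for the observed outcome is misassigned to the outcome that did not occur. The function names, the two-cue design, and the parameter values are hypothetical.

```python
import numpy as np

def learn_map(chi, n_trials=100, rate=0.1, n_cues=2):
    """Learn a cue -> outcome map; cue i is always followed by outcome i.

    Assumption: a fraction `chi` of each trial's credit leaks to the
    outcomes that were not delivered on that trial.
    """
    trans = np.zeros((n_cues, n_cues))
    for _ in range(n_trials):
        for cue in range(n_cues):
            target = np.full(n_cues, chi / (n_cues - 1))
            target[cue] = 1.0 - chi
            trans[cue] += rate * (target - trans[cue])
    return trans

def cue_values(trans, outcome_values):
    """Value of each cue inferred through the learned map."""
    return trans @ outcome_values

outcome_values = np.array([0.0, 1.0])  # outcome 0 devalued after training
precise = cue_values(learn_map(chi=0.0), outcome_values)
imprecise = cue_values(learn_map(chi=0.4), outcome_values)
print("chi = 0.0:", precise.round(3))
print("chi = 0.4:", imprecise.round(3))
```

With χ = 0, devaluation lowers only the cue paired with the devalued outcome. With χ > 0, the cue paired with the non-devalued outcome also loses value and the difference between the cues shrinks, which resembles a generalized rather than outcome-specific effect.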
Supplementary information
Supplementary Information
Supplementary Tables 1 and 2; Extended Data Figure legends 1–3
Cite this article
Costa, K.M., Scholz, R., Lloyd, K. et al. The role of the lateral orbitofrontal cortex in creating cognitive maps. Nat Neurosci 26, 107–115 (2023). https://doi.org/10.1038/s41593-022-01216-0
DOI: https://doi.org/10.1038/s41593-022-01216-0