A quantitative reward prediction error signal in the ventral pallidum

Ottenheimer, David J.; Bari, Bilal A.; Sutlief, Elissa; Fraser, Kurt M.; Kim, Tabitha H.; Richard, Jocelyn M.; Cohen, Jeremiah Y.; Janak, Patricia H.

doi:10.1038/s41593-020-0688-5

Article
Published: 10 August 2020

Nature Neuroscience volume 23, pages 1267–1276 (2020)

Abstract

The nervous system is hypothesized to compute reward prediction errors (RPEs) to promote adaptive behavior. Correlates of RPEs have been observed in the midbrain dopamine system, but the extent to which RPE signals exist in other reward-processing regions is less well understood. In the present study, we quantified outcome history-based RPE signals in the ventral pallidum (VP), a basal ganglia region functionally linked to reward-seeking behavior. We trained rats to respond to reward-predicting cues, and we fit computational models to predict the firing rates of individual neurons at the time of reward delivery. We found that a subset of VP neurons encoded RPEs and did so more robustly than the nucleus accumbens, an input to the VP. VP RPEs predicted changes in task engagement, and optogenetic manipulation of the VP during reward delivery bidirectionally altered rats’ subsequent reward-seeking behavior. Our data suggest a pivotal role for the VP in computing teaching signals that influence adaptive reward seeking.
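As a rough illustration of what an outcome history-based RPE looks like, the sketch below uses a simple delta rule: an expected value is updated after every trial by a learning rate α, and the RPE is the difference between the outcome and that expectation. The outcome coding (sucrose = 1, maltodextrin = 0), the starting value and the function name `rpe_trace` are assumptions made for illustration; this is not the authors' fitted model.

```python
def rpe_trace(outcomes, alpha, v0=0.5):
    """Return per-trial RPEs under a simple delta rule.

    outcomes: sequence of reward values, one per trial (hypothetical coding)
    alpha:    learning rate between 0 and 1
    v0:       initial expected value (an assumption, not from the paper)
    """
    value = v0
    rpes = []
    for reward in outcomes:
        delta = reward - value      # prediction error on this trial
        rpes.append(delta)
        value += alpha * delta      # update the expectation for the next trial
    return rpes


# Hypothetical session: 1 = preferred reward, 0 = less preferred reward.
outcomes = [1, 1, 0, 1, 0, 0]
rpes = rpe_trace(outcomes, alpha=0.5)
for trial, (reward, delta) in enumerate(zip(outcomes, rpes), start=1):
    print(f"trial {trial}: outcome={reward}, RPE={delta:+.3f}")
```

Under this rule a preferred reward that follows another preferred reward yields a smaller positive RPE than one that follows a less preferred reward, which is the kind of history dependence the abstract describes.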

**Fig. 1: A subset of VP neurons signal preference-based RPEs.**

**Fig. 2: RPE encoding is more prevalent and robust in the VP than in the NAc.**

**Fig. 3: An expanded value space reveals stronger RPE signaling in the VP.**

**Fig. 4: VP reward activity tracks changes in trial-by-trial task engagement.**

**Fig. 5: Manipulation of VP reward activity bidirectionally alters task engagement.**

**Fig. 6: VP RPE neuron signaling adapts across reward blocks.**

Data availability

The data generated and analyzed for this manuscript are available publicly at https://doi.org/10.12751/g-node.3lbd0c and ref. ⁵¹.

Code availability

The code used to analyze and visualize the data in this manuscript is available as Supplementary software and online at https://doi.org/10.12751/g-node.3lbd0c and ref. ⁵¹.

References

1. Sutton, R. S. & Barto, A. G. Introduction to Reinforcement Learning (MIT Press, Cambridge, MA, 1998).
2. Rescorla, R. A. & Wagner, A. R. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement, in Classical Conditioning II: Current Research and Theory, Vol. 2 (eds Black, A. H. & Prokasy, W. F.), 64–99 (Appleton-Century-Crofts, 1972).
3. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
4. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005).
5. Smith, K. S., Tindell, A. J., Aldridge, J. W. & Berridge, K. C. Ventral pallidum roles in reward and motivation. Behav. Brain Res. 196, 155–167 (2009).
6. Root, D. H., Melendez, R. I., Zaborszky, L. & Napier, T. C. The ventral pallidum: subregion-specific functional anatomy and roles in motivated behaviors. Prog. Neurobiol. 130, 29–70 (2015).
7. de Olmos, J. S. & Heimer, L. The concepts of the ventral striatopallidal system and extended amygdala. Ann. NY Acad. Sci. 877, 1–32 (1999).
8. Richard, J. M., Ambroggi, F., Janak, P. H. & Fields, H. L. Ventral pallidum neurons encode incentive value and promote cue-elicited instrumental actions. Neuron 90, 1165–1173 (2016).
9. Ottenheimer, D., Richard, J. M. & Janak, P. H. Ventral pallidum encodes relative reward value earlier and more robustly than nucleus accumbens. Nat. Commun. 9, 4350 (2018).
10. Fujimoto, A. et al. Signaling incentive and drive in the primate ventral pallidum for motivational control of goal-directed action. J. Neurosci. 39, 1793–1804 (2019).
11. White, J. K. et al. A neural network for information seeking. Nat. Commun. 10, 1–19 (2019).
12. Tindell, A. J., Berridge, K. C. & Aldridge, J. W. Ventral pallidal representation of Pavlovian cues and reward: population and rate codes. J. Neurosci. 24, 1058–1069 (2004).
13. Tachibana, Y. & Hikosaka, O. The primate ventral pallidum encodes expected reward value and regulates motor action. Neuron 76, 826–837 (2012).
14. Tian, J. et al. Distributed and mixed information in monosynaptic inputs to dopamine neurons. Neuron 91, 1374–1389 (2016).
15. Stephenson-Jones, M. et al. Opposing contributions of GABAergic and glutamatergic ventral pallidal neurons to motivational behaviors. Neuron 105, 921–933 (2020).
16. Kaplan, A., Mizrahi-Kliger, A. D., Israel, Z., Adler, A. & Bergman, H. Dissociable roles of ventral pallidum neurons in the basal ganglia reinforcement learning network. Nat. Neurosci. 23, 556–564 (2020).
17. Tooley, J. et al. Glutamatergic ventral pallidal neurons modulate activity of the habenula–tegmental circuitry and constrain reward seeking. Biol. Psychiatry 83, 1012–1023 (2018).
18. Faget, L. et al. Opponent control of behavioral reinforcement by inhibitory and excitatory projections from the ventral pallidum. Nat. Commun. 9, 849 (2018).
19. Sclafani, A., Hertwig, H., Vigorito, M. & Feigin, M. B. Sex differences in polysaccharide and sugar preferences in rats. Neurosci. Biobehav. Rev. 11, 241–251 (1987).
20. Mohebi, A. et al. Dissociable dopamine dynamics for learning and motivation. Nature 570, 65–70 (2019).
21. Roesch, M. R., Calu, D. J. & Schoenbaum, G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 10, 1615 (2007).
22. Takahashi, Y. K. et al. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat. Neurosci. 14, 1590 (2011).
23. Takahashi, Y. K., Langdon, A. J., Niv, Y. & Schoenbaum, G. Temporal specificity of reward prediction errors signaled by putative dopamine neurons in rat VTA depends on ventral striatum. Neuron 91, 182–193 (2016).
24. Sutton, R. S. Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9–44 (1988).
25. Nakahara, H., Itoh, H., Kawagoe, R., Takikawa, Y. & Hikosaka, O. Dopamine neurons can represent context-dependent prediction error. Neuron 41, 269–280 (2004).
26. Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).
27. Eshel, N. et al. Arithmetic and local circuitry underlying dopamine prediction errors. Nature 525, 243–246 (2015).
28. Keiflin, R. & Janak, P. H. Dopamine prediction errors in reward learning and addiction: from theory to neural circuitry. Neuron 88, 247–263 (2015).
29. Watabe-Uchida, M., Eshel, N. & Uchida, N. Neural circuitry of reward prediction error. Annu. Rev. Neurosci. 40, 373–394 (2017).
30. Matsumoto, M. & Hikosaka, O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature 447, 1111–1115 (2007).
31. Tian, J. & Uchida, N. Habenula lesions reveal that multiple mechanisms underlie dopamine prediction errors. Neuron 87, 1304–1316 (2015).
32. Jhou, T. C., Fields, H. L., Baxter, M. G., Saper, C. B. & Holland, P. C. The rostromedial tegmental nucleus (RMTg), a GABAergic afferent to midbrain dopamine neurons, encodes aversive stimuli and inhibits motor responses. Neuron 61, 786–800 (2009).
33. Hong, S., Jhou, T. C., Smith, M., Saleem, K. S. & Hikosaka, O. Negative reward signals from the lateral habenula to dopamine neurons are mediated by rostromedial tegmental nucleus in primates. J. Neurosci. 31, 11457–11471 (2011).
34. Niv, Y., Daw, N. D., Joel, D. & Dayan, P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology 191, 507–520 (2007).
35. Hamid, A. A. et al. Mesolimbic dopamine signals the value of work. Nat. Neurosci. 19, 117–126 (2016).
36. Bari, B. A. et al. Stable representations of decision variables for flexible behavior. Neuron 103, 922–933 (2019).
37. Beier, K. T. et al. Circuit architecture of VTA dopamine neurons revealed by systematic input–output mapping. Cell 162, 622–634 (2015).
38. Hong, S. & Hikosaka, O. Diverse sources of reward value signals in the basal ganglia nuclei transmitted to the lateral habenula in the monkey. Front. Hum. Neurosci. 7, 778 (2013).
39. Knowland, D. et al. Distinct ventral pallidal neural populations mediate separate symptoms of depression. Cell 170, 284–297 (2017).
40. Gale, S. D. & Perkel, D. J. A basal ganglia pathway drives selective auditory responses in songbird dopaminergic neurons via disinhibition. J. Neurosci. 30, 1027–1037 (2010).
41. Chen, R. et al. Songbird ventral pallidum sends diverse performance error signals to dopaminergic midbrain. Neuron 103, 266–276 (2019).
42. Kearney, M. G., Warren, T. L., Hisey, E., Qi, J. & Mooney, R. Discrete evaluative and premotor circuits enable vocal learning in songbirds. Neuron 104, 559–575 (2019).
43. Hnasko, T. S., Hjelmstad, G. O., Fields, H. L. & Edwards, R. H. Ventral tegmental area glutamate neurons: electrophysiological properties and projections. J. Neurosci. 32, 15076–15085 (2012).
44. Leung, B. K. & Balleine, B. W. Ventral pallidal projections to mediodorsal thalamus and ventral tegmental area play distinct roles in outcome-specific Pavlovian-instrumental transfer. J. Neurosci. 35, 4953–4964 (2015).
45. Prasad, A. A. et al. Complementary roles for ventral pallidum cell types and their projections in relapse. J. Neurosci. 40, 880–893 (2020).
46. Richard, J. M., Stout, N., Acs, D. & Janak, P. H. Ventral pallidal encoding of reward-seeking behavior depends on the underlying associative structure. eLife 7, e33107 (2018).
47. Ottenheimer, D. J., Wang, K., Haimbaugh, A., Janak, P. H. & Richard, J. M. Recruitment and disruption of ventral pallidal cue encoding during alcohol seeking. Eur. J. Neurosci. 50, 3428–3444 (2019).
48. Elber-Dorozko, L. & Loewenstein, Y. Striatal action-value neurons reconsidered. eLife 7, e34248 (2018).
49. Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
50. Nath, T. et al. Using DeepLabCut for 3D markerless pose estimation across species and behaviors. Nat. Protoc. 14, 2152–2176 (2019).
51. Ottenheimer, D. J. et al. Analysis of a reward prediction error signal in ventral pallidum. G-Node https://doi.org/10.12751/g-node.3lbd0c (2020).

Acknowledgements

This work was supported by the National Institutes of Health (grant nos. 5T32NS91018-17 (to D.J.O.), F30MH110084 (to B.A.B.), K99AA025384 (to J.M.R.), R01DA042038 and R01NS104834 (to J.Y.C.), and R01DA035943 (to P.H.J.)), by Klingenstein-Simons, MQ, NARSAD, and Whitehall (to J.Y.C.), by a NARSAD Young Investigator Award (to J.M.R.) and by the National Science Foundation Graduate Research Fellowship (grant no. DGE1746891 to D.J.O.). We thank K. Wang and X. Tong for technical assistance.

Author information

These authors contributed equally: David J. Ottenheimer, Bilal A. Bari.

Authors and Affiliations

Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA
David J. Ottenheimer, Bilal A. Bari, Elissa Sutlief, Jeremiah Y. Cohen & Patricia H. Janak
Brain Science Institute, Johns Hopkins University, Baltimore, MD, USA
Bilal A. Bari & Jeremiah Y. Cohen
Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
Kurt M. Fraser, Tabitha H. Kim, Jocelyn M. Richard & Patricia H. Janak
Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA
Jocelyn M. Richard
Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
Jeremiah Y. Cohen & Patricia H. Janak

Contributions

D.J.O., J.M.R. and P.H.J. designed the experiments. D.J.O. collected the electrophysiology data. D.J.O., K.M.F. and T.H.K. collected the optogenetic data. B.A.B. designed and fit the models in consultation with D.J.O. D.J.O., B.A.B. and E.S. analyzed and visualized the data. D.J.O., B.A.B., J.M.R., J.Y.C. and P.H.J. interpreted the data. D.J.O., B.A.B. and P.H.J. prepared the manuscript with comments from E.S., K.M.F., T.H.K., J.M.R. and J.Y.C.

Corresponding author

Correspondence to Patricia H. Janak.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Placements for random sucrose/maltodextrin, random sucrose/maltodextrin/water, and blocked sucrose/maltodextrin rats.

Recording locations for nucleus accumbens (left) and ventral pallidum (right) rats.

Extended Data Fig. 2 Evaluation of model fitting.

(a) Distribution of the learning rate, α, for RPE neurons in VP (green) and NAc (orange). (b) Likelihood (LH) per trial for RPE and Current outcome neurons (n = 72 RPE and 126 Current outcome neurons from 5 rats) for RPE and Current outcome models, relative to the LH per trial of the Unmodulated model. Lower (more negative) values indicate a better fit. Line represents median, box represents 25th and 75th percentiles, and whiskers extend to 1.5 times the interquartile range. Red highlights the AIC-selected model. Median [25th to 75th percentile; min to max] ∆LH/trial are: RPE neurons, RPE model −0.21 [−0.39 to −0.14; −3.16 to −0.05]; RPE neurons, Current outcome model −0.15 [−0.32 to −0.09; −3.03 to −0.02]; Current outcome neurons, RPE model −0.12 [−0.23 to −0.07; −0.174 to −0.03]; Current outcome neurons, Current outcome model −0.12 [−0.22 to −0.07; −1.73 to −0.03]. Median [25th to 75th percentile] LH per trial for RPE neurons was 2.29 [2.04 to 2.49] and for Current outcome neurons was 2.15 [1.92 to 2.37]. (c) Model recovery, plotted as the fraction of neurons simulated with each model recovered as that model. (d) Distribution of difference between the true value of the parameters used to simulate the neurons in (c) and the values recovered by MLE.
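To make the ∆LH/trial quantity in panel (b) concrete, here is a minimal sketch of how a per-trial likelihood difference relative to an unmodulated model could be computed. It assumes that LH denotes a negative log-likelihood (as the "lower is better" wording suggests) and that spike counts are Poisson distributed with a log-linear dependence on RPE; the spike counts, RPE values, intercept and slope are all hypothetical, and none of this is taken from the authors' code.

```python
import math


def poisson_nll_per_trial(counts, rates):
    """Mean negative log-likelihood of spike counts under per-trial Poisson rates."""
    total = 0.0
    for k, lam in zip(counts, rates):
        # log P(k | lam) = k*log(lam) - lam - log(k!)
        total -= k * math.log(lam) - lam - math.lgamma(k + 1)
    return total / len(counts)


# Hypothetical spike counts after reward delivery and hypothetical per-trial RPEs.
counts = [8, 7, 2, 9, 1, 3]
rpes = [0.5, 0.25, -0.875, 0.5625, -0.71875, -0.359375]

# Unmodulated model: one constant rate (here the mean count).
mean_rate = sum(counts) / len(counts)
unmodulated_rates = [mean_rate] * len(counts)

# RPE model: rate depends log-linearly on RPE (intercept and slope are made up).
intercept, slope = 1.5, 1.2
rpe_rates = [math.exp(intercept + slope * d) for d in rpes]

delta_lh = (poisson_nll_per_trial(counts, rpe_rates)
            - poisson_nll_per_trial(counts, unmodulated_rates))
print(f"Delta LH per trial (RPE model minus Unmodulated): {delta_lh:.3f}")
```

A negative difference means the RPE model assigns higher probability to the observed counts than the constant-rate model does. In practice the free parameters would be estimated by maximum likelihood rather than fixed by hand as they are here.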

Extended Data Fig. 3 Placements for optogenetic experiments.

(a) Expression of ArchT3.0:YFP and fiber tip placement for the rats included in the ArchT3.0 group for the optogenetic experiment in Fig. 5. (b) Expression of ChR2:GFP and fiber tip placement for the rats included in the ChR2 group. Pattern of results remained unchanged with or without inclusion of the rat with the most caudal placement.

Extended Data Fig. 4 Supplemental optogenetic data.

(a) Mean(+/−SEM) port occupancy in time surrounding reward delivery on laser and no laser trials for YFP (left, n = 7 rats) and ArchT (right, n = 7 rats) groups. (b) Mean(+/−SEM) port occupancy in time surrounding reward delivery on laser and no laser trials for GFP (left, n = 7 rats) and ChR2 (right, n = 11 rats) groups. To account for the disruption of port occupancy by laser stimulation, we ran our distance from port analysis on the time beyond 15 s past reward delivery and found the same pattern of results. (c) Additional optogenetic experiment in ChR2 rats and controls where the 2 s of laser stimulation was at the onset of the cue. (d) Mean(+/−SEM) distance from port in the ITI following laser stimulation did not differ from no laser trials for GFP (p = 0.94, Wilcoxon signed-rank test, two-sided, n = 7 rats) or ChR2 (p = 0.11, Wilcoxon signed-rank test, two-sided, n = 10 rats) groups. (e) The effect of laser was similar across both groups (median: 0.06 GFP, n = 7 rats; −0.09 ChR2, n = 10 rats; p = 0.36, Wilcoxon rank-sum test, two-sided).

Extended Data Fig. 5 Value encoding in VP at the time of cue onset in the random sucrose/maltodextrin task.

(a) Schematic of model-fitting and neuron classification process. For each neuron, the reward outcome and spike count following reward delivery on each trial were used to fit two models: Value and Unmodulated. Akaike information criterion (AIC) was used to select the best model (right). (b) Mean(+/−SEM) activity of neurons best fit by each of the models, plotted according to previous outcome (n = 39 Value and 397 Unmodulated neurons from 5 rats). (c) Coefficients(+/−SE) for outcome history linear regression for each class of neurons (n = 39 Value and 397 Unmodulated neurons). (d) Mean(+/−SEM) activity of all Value neurons with trials binned by model-derived Value. (e) Mean(+/−SEM) population activity of simulated and actual Value neurons according to each trial’s Value (V). (f) Model recovery, plotted as the fraction of neurons simulated with each model recovered as that model.

Extended Data Fig. 6 Value encoding at the time of cue onset in the random sucrose/maltodextrin/water task.

(a) Fraction of VP neurons best fit by the Value and Unmodulated models in the random sucrose/maltodextrin/water task. (b) Mean(+/−SEM) activity of neurons best fit by each of the models, plotted according to previous outcome (n = 38 Value and 216 Unmodulated neurons from 3 rats). (c) Coefficients(+/−SE) for outcome history linear regression for each class of neurons (n = 38 Value and 216 Unmodulated neurons). (d) Mean(+/−SEM) population activity of simulated and actual Value neurons according to each trial’s Value (V). (e) Mean(+/−SEM) activity of all Value neurons with trials binned by model-derived Value. (f) Distribution of correlations between individual VP neurons’ firing rates at cue onset on each trial and the distance from the port during the previous ITI. * = p = 0.00001 for negative shift in mean correlation coefficient (vertical line) compared to 1000 shuffles of data for Value neurons, Wilcoxon signed-rank test, two-sided, as well as p = 0.0000002 for more negative coefficients for Value neurons compared to Unmodulated neurons, Wilcoxon rank-sum test, two-sided. See also Fig. 4c,d.

Extended Data Fig. 7 Placements for predictable and random sucrose/maltodextrin rats.

Recording locations for rats from predictable and random sucrose/maltodextrin experiment in Extended Data Fig. 8.

Extended Data Fig. 8 Impact of specific cue-derived predictions on VP firing.

(a) Task schematic: three auditory cues indicated three trial types. (b) Median latency to enter reward port following onset of cue for each trial type, plotted as the mean(+/−SEM) across all sessions for each rat (gray lines, n = 8, 9, 10, and 10 sessions for the 4 rats) and the overall mean(+/−SEM) (n = 37 sessions). (c) Percentage sucrose of total solution consumption in a two-bottle choice, before (‘Initial’) and after (‘Final’) recording (n = 4 rats). (d) Mean(+/−SEM) lick rate relative to reward delivery for each trial type (n = 37 sessions from 4 rats). (e) Mean(+/−SEM) activity of all neurons recorded in the predictable and random sucrose/maltodextrin task, aligned to reward delivery (n = 487 neurons from 4 rats). (f) Schematic of cue model-fitting. The best model (of 6 total) was selected with Akaike information criterion. (g) Fraction of the population best fit by each model. (h) Coefficients(+/−SE) for outcome history regression for each class of neurons with no cue effect (n = 38 RPE, 135 Current outcome, and 204 Unmodulated neurons). (i) Mean(+/−SEM) activity of all RPE neurons with no cue effect (n = 38 neurons). The trials for each neuron are binned according to their model-derived RPE. (j) Population activity of simulated and actual VP RPE neurons with no cue effect according to each trial’s RPE value. (k) Scatterplot of each cue effect neuron’s weight for specific sucrose and maltodextrin cues (n = 7 RPE, 33 Current outcome, and 70 Unmodulated cells with cue effects). The percentage of neurons falling in each quadrant is indicated. The percentage in our quadrant of interest (positive value for sucrose and negative value for maltodextrin) did not differ from chance (p = 0.1 for exact binomial test compared to null of 25%). (l) Mean(+/−SEM) activity of neurons with sucrose values > 0 and maltodextrin values < 0, consistent with a value-based cued expectation modulation. 
(m) Neurons with cue effects for cue-evoked signaling, rather than reward-evoked signaling, as in (g). (n) As in (k), for activity at the time of the cue rather than time of reward (n = 143 neurons with cue effects). * = p = 0.00001 for exact binomial test compared to null of 25%. (o) As in (l), for activity at the time of the cue rather than time of reward.
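The exact binomial tests in panels (k) and (n) compare the share of neurons in one quadrant against a chance level of 25%. A minimal sketch of such a test is given below, using only the Python standard library. The one-sided (upper-tail) form is an assumption, since the caption does not state the sidedness, and the counts in the example are hypothetical rather than taken from the paper.

```python
from math import comb


def binom_tail_p(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 10 of 20 neurons fall in the quadrant of interest,
# tested against the 25% expected if quadrants were equally likely.
n_neurons = 20
n_in_quadrant = 10
p_value = binom_tail_p(n_in_quadrant, n_neurons, 0.25)
print(f"one-sided exact binomial p = {p_value:.4f}")
```

Summing the binomial probability mass directly keeps the test exact for small samples, where a normal approximation would be unreliable.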

Extended Data Fig. 9 Classifying neurons with BIC instead of AIC.

(a) Fraction of neurons classified as RPE, Current outcome, and Unmodulated in VP and NAc in the random sucrose/maltodextrin task using Bayesian information criterion (BIC) as the selection criterion. (b) Coefficients(+/−SE) for outcome history regression for VP neurons of each BIC subset (n = 37 RPE, 110 Current outcome, and 289 Unmodulated cells from 5 rats). (c) Population mean(+/−SEM) of all VP BIC RPE neurons, binned according to the model-derived RPE. (d) Mean(+/−SEM) population activity of simulated and actual BIC RPE neurons according to each trial’s RPE value for VP (left) and NAc (right). (e) Distribution of correlations between model-predicted and actual spiking for all RPE neurons from each region. (f) Distribution of α for RPE neurons in VP (green) and NAc (orange). (g) Mean(+/−SEM) activity of VP neurons classified as RPE by AIC but not BIC according to current and previous outcome (n = 35 neurons). (h) Coefficients(+/−SE) for outcome history regression for these neurons. (i) Mean(+/−SEM) activity of these neurons binned according to model-derived RPE on each trial.
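Panels (g) to (i) concern neurons that AIC classifies as RPE but BIC does not. The sketch below shows how that can happen: both criteria penalize extra parameters, but the BIC penalty grows with the number of trials, so a small gain in likelihood can justify the extra parameter under AIC and not under BIC. The log-likelihoods, parameter counts and trial count are hypothetical values chosen for illustration and are not from the paper.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_lik


# Hypothetical fits for one neuron over 60 trials: (log-likelihood, free parameters).
n_trials = 60
fits = {
    "Current outcome": (-130.0, 2),
    "RPE": (-128.5, 3),  # slightly better fit, one extra parameter
}

best_aic = min(fits, key=lambda m: aic(*fits[m]))
best_bic = min(fits, key=lambda m: bic(*fits[m], n_trials))
print("AIC selects:", best_aic)
print("BIC selects:", best_bic)
```

With these made-up numbers AIC favors the RPE model while BIC favors the simpler Current outcome model, mirroring the kind of disagreement the figure examines.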

Supplementary information

Reporting summary

Supplementary software

Software used to analyze the data in this manuscript.

About this article

Cite this article

Ottenheimer, D.J., Bari, B.A., Sutlief, E. et al. A quantitative reward prediction error signal in the ventral pallidum. Nat Neurosci 23, 1267–1276 (2020). https://doi.org/10.1038/s41593-020-0688-5

Received: 16 October 2019
Accepted: 07 July 2020
Published: 10 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1038/s41593-020-0688-5
