Abstract
Reinforcement learning models treat the basal ganglia (BG) as an actor–critic network. The ventral pallidum (VP) is a major component of the BG limbic system. However, its precise functional roles within the BG circuitry, particularly in comparison to the adjacent external segment of the globus pallidus (GPe), remain unexplored. We recorded the spiking activity of VP neurons, GPe cells (actor) and striatal cholinergic interneurons (critic) while monkeys performed a classical conditioning task. Here, we report that VP neurons can be classified into two distinct populations. The persistent population displayed sustained activation following visual cue presentation, was correlated with monkeys’ behavior and showed uncorrelated spiking activity. The transient population displayed phasic synchronized responses that were correlated with the rate of learning and the reinforcement learning model’s prediction error. Our results suggest that the VP is physiologically different from the GPe and identify the transient VP neurons as a BG critic.
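For readers unfamiliar with the term, the prediction error mentioned above can be illustrated with a minimal Rescorla-Wagner-style sketch. This is not the reinforcement learning model implemented in the study; the function name, the learning rate `alpha` and the outcome coding are assumptions chosen only for illustration.

```python
def rescorla_wagner(outcomes, alpha=0.2, v0=0.0):
    """Track the expected value of one cue across trials.

    outcomes: outcome magnitude per trial (hypothetical coding, e.g. 1.0 = reward).
    alpha:    learning rate (assumed value, not taken from the paper).
    Returns (values_before_update, prediction_errors), one entry per trial.
    """
    value = v0
    values, errors = [], []
    for outcome in outcomes:
        delta = outcome - value      # prediction error: outcome minus expectation
        values.append(value)
        errors.append(delta)
        value += alpha * delta       # critic-style update of the expectation
    return values, errors
```

With a constant outcome the prediction error shrinks from trial to trial, which is the qualitative signature a critic-like signal would be expected to show during learning.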
Data availability
Data from this study are available from the corresponding author upon reasonable request.
Code availability
The code related to this study is available from the corresponding author upon reasonable request.
Acknowledgements
We thank Y. Dagan and T. Ravins-Yaish for assistance with animal care; A. Bick and A. Payis for assistance with the MRI scan; and H. Gabbay, S. Freeman, U. Werner-Reiss and E. Singer for general assistance. We thank A. Shapochnikov for help in preparing the experimental setup. We also thank M. Deffains for helpful comments and discussion. This work was supported by grants from the European Research Council (grant 322495) and the Rosetrees Trust (grant M93-F1) to H.B.
Contributions
A.K., A.M.K. and H.B. designed the research. Z.I. performed surgery. A.K. and A.M.K. collected the data. A.K. analyzed the data and implemented the reinforcement learning model. A.A. collected and analyzed the ventral striatum data. A.K. and H.B. wrote the manuscript. All authors read and commented on the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information: Nature Neuroscience thanks Atsushi Nambu, Yoshihisa Tachibana and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 GPe and VP neurons display different discharge properties and spatial arrangements.
a, Distributions of the mean discharge rates of GPe (top) and VP (bottom) neurons, computed during the last 3 s of the inter-trial interval (ITI) period. Red vertical lines represent the average discharge rate of each structure. N = 97 GPe and 43 VP neurons. ***P = 3 × 10⁻¹⁹, two-sided Mann-Whitney U test. b, Distributions of the mean coefficient of variation (CV) of the inter-spike intervals (ISIs). Same conventions as in (a). N = 97 GPe and 43 VP neurons. **P = 0.0073, two-sided Mann-Whitney U test. c, Spike waveform durations, computed as the time difference between the first negative peak and the next positive peak. N = 97 GPe and 43 VP neurons. ***P = 4 × 10⁻¹¹, two-sided Mann-Whitney U test. d, Spatial distribution of GPe and VP neurons. Each dot represents a single cell. Brown, GPe cells; gray, VP cells. Abscissa: coordinates in a parasagittal plane (in mm); A, anterior; P, posterior; zero is coronal section AC0 (AC, anterior commissure); ordinate: coordinates in the horizontal plane (in mm); M, medial; L, lateral; zero is the central coordinate of both structures in our recordings; Z-axis: depth from entry to the cortex (in mm). N = 97 GPe and 43 VP neurons.
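The two discharge measures in panels b and c can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the population standard deviation is an assumed choice, and the global minimum of the waveform is taken as a stand-in for the "first negative peak".

```python
import statistics

def isi_cv(spike_times):
    """Coefficient of variation of the inter-spike intervals (SD / mean)."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes")
    # Population SD is an assumption; the legend does not say which estimator was used.
    return statistics.pstdev(isis) / statistics.fmean(isis)

def trough_to_peak_ms(waveform, fs_hz):
    """Time from the negative peak to the following positive peak, in ms."""
    # Assumption: the deepest sample is treated as the first negative peak.
    trough = min(range(len(waveform)), key=waveform.__getitem__)
    peak = max(range(trough, len(waveform)), key=waveform.__getitem__)
    return (peak - trough) / fs_hz * 1000.0
```

A perfectly regular spike train gives a CV near zero, whereas a Poisson-like train gives a CV near one, so the CV serves as a compact summary of firing irregularity.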
Extended Data Fig. 2 Absolute population responses of GPe and VP neurons.
a, Absolute population responses of GPe cells for the reward, neutral and aversive trials, N = 97 neurons. Abscissa, time (−0.5 – 2 s). Vertical dashed lines indicate the times of cue onset; ordinate, firing rate in Hz, normalized by the mean discharge rate during the last 3 s of the ITI period. Shaded regions represent SEMs. b, Absolute population responses of VP cells, N = 43 neurons. Same conventions as in (a).
Extended Data Fig. 3 VP neurons exhibit similar average responses across the different blocks.
a, Mean population responses of all VP neurons in the OTR-D block. Abscissa, time (−2 – 4 s). Vertical dashed lines at 0 and 2 s indicate the times of cue onset and outcome delivery, respectively; ordinate, firing rate in Hz, normalized by the mean discharge rate during the last 3 s of the ITI period. Shaded regions represent SEMs. b, Mean population responses of all VP neurons in the OTR-P block. Same conventions as in (a). c, Mean population responses of all VP neurons in the NOV-D block. Same conventions as in (a). d, Mean population responses of all VP neurons in the NOV-P block. Same conventions as in (a). Data for this figure were taken from all 43 VP neurons that met the inclusion criteria.
Extended Data Fig. 4 VP cells and TANs display higher TP indices than GPe neurons.
a, Transient-Persistent (TP) indices of GPe neurons, VP cells and TANs in the reward trials of all four blocks. The mean value of the absolute deviation from baseline on the interval [1-2 s] was subtracted from the maximum of the absolute deviation from baseline on the interval [0-1 s]. The result was divided by the sum of these two real numbers to obtain the TP index. Each dot represents the TP index of a single cell (N = 97, 43 and 24 for the GPe, VP and TAN, respectively). Bars indicate the mean TP indices. Error bars represent SEMs. **P = 0.0036 (VP and GPe), ***P = 6 × 10⁻⁴ (TAN and GPe), two-sided Mann-Whitney U test. b, TP indices of GPe neurons, VP cells and TANs in the neutral trials. Same conventions as in (a). **P = 0.0059 (VP and GPe), two-sided Mann-Whitney U test. c, TP indices of GPe neurons, VP cells and TANs in the aversive trials. Same conventions as in (a). *P = 0.0287 (TAN and GPe), two-sided Mann-Whitney U test.
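The TP index described in panel a can be written out as a short sketch. The function name, the argument layout and the handling of the interval boundaries (1 s is assigned to the late window here) are assumptions; only the formula itself follows the legend.

```python
def tp_index(times, rate, baseline):
    """Transient-Persistent index of one neuron's cue-aligned response.

    times:    bin centers in seconds relative to cue onset.
    rate:     firing rate in each bin.
    baseline: baseline firing rate (e.g. from the ITI).
    """
    dev = [abs(r - baseline) for r in rate]
    early = [d for t, d in zip(times, dev) if 0.0 <= t < 1.0]   # [0-1 s]
    late = [d for t, d in zip(times, dev) if 1.0 <= t <= 2.0]   # [1-2 s]
    peak_early = max(early)
    mean_late = sum(late) / len(late)
    denom = peak_early + mean_late
    # Assumption: a completely flat response is assigned an index of 0.
    return (peak_early - mean_late) / denom if denom else 0.0
```

Under this definition a purely phasic response that returns to baseline within the first second yields an index of 1, and a response that stays at its early peak throughout the second window yields 0.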
Extended Data Fig. 5 Transient VP average neuronal responses are not strongly affected by the number of clusters in the k-means algorithm.
Grouping of VP neuronal responses using different numbers of clusters in the k-means algorithm. The data points on the left depict the output of the PCA for the representative case of the reward trials. PC1 and PC2 are the first and second principal components, respectively. Clusters 1-4 represent the average VP responses in the different groups. Abscissa, time (-0.5 – 2 s). Vertical dashed lines indicate the times of cue onset; ordinate, firing rate in Hz, normalized by the mean discharge rate during the last 3 s of the ITI period. Shaded regions represent SEMs. N represents the number of VP cells in each cluster.
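The robustness check in this figure amounts to rerunning k-means with different numbers of clusters on the PCA scores. The following is a minimal sketch of Lloyd's k-means on low-dimensional points such as (PC1, PC2) pairs. It is not the authors' implementation: the initialization from the first k points and the Euclidean distance are assumptions made to keep the example deterministic.

```python
import math

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm. Returns (labels, centroids)."""
    # Naive deterministic initialization (assumption): the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # converged
        labels, centroids = new_labels, new_centroids
    return labels, centroids
```

Calling `kmeans(scores, k)` for several values of `k` and averaging the responses within each label would reproduce the kind of comparison shown here, assuming `scores` holds one point per neuron.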
Extended Data Fig. 6 Monkeys exhibit similar learning kinetics throughout the NOV-D block.
a, Learning curves for monkey D, for the reward, neutral and aversive trials. Abscissa, trial number; ordinate, mean number of licks minus the mean number of blinks (computed between 0-1 s; 0 is the time of cue onset), normalized between zero and one. Error bars represent SEMs across trials (N = 10 trials for each cue type, 117 repetitions per trial). All curves were smoothed with a 5-point moving average. b, Learning curves for monkey N. Same conventions as in (a). N = 10 trials for each cue type, 74 repetitions per trial.
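The behavioral measure plotted here (licks minus blinks, normalized between zero and one and smoothed with a 5-point moving average) can be sketched as below. The order of normalization and smoothing, and the shrinking window at the edges of the curve, are assumptions; the legend does not specify them.

```python
def min_max_normalize(xs):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0] * len(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def moving_average(xs, window=5):
    """Centered moving average; the window shrinks near the edges (assumption)."""
    half = window // 2
    out = []
    for i in range(len(xs)):
        seg = xs[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out

def learning_curve(licks, blinks, window=5):
    """Per-trial licks minus blinks, normalized and then smoothed."""
    diff = [l - b for l, b in zip(licks, blinks)]
    return moving_average(min_max_normalize(diff), window)
```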
Extended Data Fig. 7 Transient versus persistent VP neuronal responses do not reflect different discharge properties or spatial layout.
a, Distributions of the mean discharge rates of transient (top) and persistent (bottom) VP neurons (N = 29 and 14 neurons, respectively), computed during the last 3 s of the ITI period. Red vertical lines represent the average discharge rates. b, Distributions of the mean CV of the ISIs. Same conventions as in (a). c, Spike waveform durations, measured as the time difference from the first negative peak to the next positive peak. d, Spatial distribution of VP neurons. Each dot represents a single VP cell. Purple, transient VP cells; olive, persistent VP cells. Abscissa: coordinates in a parasagittal plane (in mm); A, anterior; P, posterior; zero is coronal section AC0 (AC, anterior commissure). Ordinate: coordinates in the horizontal plane (in mm); M, medial; L, lateral; zero is the center of the VP in our recordings. Z-axis: depth from entry to the cortex (in mm). e, Fraction of pairs of VP neurons belonging to the same transient/persistent cluster. “Same session”, cell pairs recorded from different electrodes in the same recording sessions; “different session”, cell pairs recorded in different sessions.
Extended Data Fig. 8 VS TANs display homogeneous responses to cue presentations, as well as correlational spiking activity.
a, TAN responses to cue events. Each row is the Z-score transformed PSTH of a single neuron to the presentation of cues that signal rewarding, neutral or aversive outcomes. Z-scores are color-coded. Abscissa, time (0 – 2 s), zero is the time of cue onset; ordinate, unit number. Cells are randomly ordered. N = 43 neurons. b, TAN average population responses to cue presentations. Abscissa, time (-0.5 – 2 s). The vertical dashed line at t = 0 indicates the time of cue onset; ordinate, normalized firing rate in Hz. Blue, reward trials; green, neutral trials; red, aversive trials. Shaded regions represent SEMs. N = 43 neurons for each trial type. c, Left: mean cross-correlation histogram of simultaneously recorded pairs of TANs (N = 8 pairs). Abscissa: time, ±1 s around the trigger spike (time = 0); ordinate: normalized firing rate in Hz. Shaded region represents SEM. Right: distribution of TAN signal correlations (N = 453 pairs).
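The Z-score transformation of a PSTH used in panel a can be illustrated with a short sketch. Normalizing by the mean and standard deviation of a separate baseline period is an assumption made here; the legend does not state which reference distribution was used.

```python
import statistics

def zscore_psth(psth, baseline):
    """Express each PSTH bin in units of baseline standard deviations.

    psth:     firing rate per bin around cue onset.
    baseline: firing rate per bin in a reference period (assumed, e.g. the ITI).
    """
    mu = statistics.fmean(baseline)
    sd = statistics.pstdev(baseline)
    if sd == 0:
        raise ValueError("baseline has zero variance")
    return [(x - mu) / sd for x in psth]
```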
Extended Data Fig. 9 Clustering analysis reveals three distinct populations of VS MSNs, which differ in their response profiles and correlation patterns.
a, MSN responses to cue events. Each row is the Z-score transformed PSTH of a single neuron to the presentation of cues that signal rewarding, neutral or aversive outcomes. Z-scores are color-coded. Abscissa, time (0–2 s), zero is the time of cue onset; ordinate, unit number. Cells are ordered by clusters. Within each cluster, cells are randomly ordered. N = 394 neurons. b, Top: MSN population responses to cue presentations. Abscissa, time (-0.5 – 2 s). The vertical dashed lines at t = 0 indicate the time of cue onset; ordinate, normalized firing rate in Hz. Blue, reward trials; green, neutral trials; red, aversive trials. Shaded regions represent SEMs. Bottom: MSN absolute population responses to cue presentations. N = 163, 85 and 146 neurons, from left to right. c, Top: mean cross-correlation histograms. N = 99, 34 and 70 pairs, from left to right. Abscissa: time, ±1 s around the trigger spike (time = 0); ordinate: normalized firing rate in Hz. Shaded regions represent SEMs. Bottom: distributions of signal correlations. N = 6771, 1776 and 5257 pairs, from left to right.
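A cross-correlation histogram of the kind shown in panel c counts the spikes of one neuron at each lag around the spikes of a simultaneously recorded trigger neuron. The sketch below is a minimal version under stated assumptions: the bin width is arbitrary, and the conversion to a conditional rate in Hz is one common normalization, not necessarily the one used in the study.

```python
def cross_correlogram(trigger, target, max_lag=1.0, bin_width=0.1):
    """Count target spikes at each lag in [-max_lag, max_lag) around trigger spikes."""
    n_bins = int(round(2 * max_lag / bin_width))
    counts = [0] * n_bins
    for t0 in trigger:
        for t1 in target:
            lag = t1 - t0
            if -max_lag <= lag < max_lag:
                idx = min(int((lag + max_lag) / bin_width), n_bins - 1)
                counts[idx] += 1
    return counts

def to_rate_hz(counts, n_trigger_spikes, bin_width):
    """Conditional firing rate of the target neuron given a trigger spike."""
    return [c / (n_trigger_spikes * bin_width) for c in counts]
```

A flat histogram indicates uncorrelated spiking, whereas a peak around zero lag indicates synchronized discharge of the pair.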
Extended Data Fig. 10 Transient VP neurons show evidence of reward prediction error encoding during learning, in contrast to persistent VP and GPe neurons.
a, Transient VP average population responses in the OTR-D and OTR-P blocks. In the OTR-P block, only trials in which the outcome was given are presented. Abscissa, time (-2 – 4 s). Vertical dashed lines at t = 0 and t = 2 s indicate the times of cue onset and outcome delivery, respectively; ordinate, firing rate in Hz, normalized by the mean discharge rate during the last 3 s of the ITI period. N = 29 neurons. Shaded regions represent SEMs. b, Transient VP, persistent VP and GPe average population responses in the first and last two trials of the NOV-D block. Top: reward trials; middle: neutral trials; bottom: aversive trials. N = 27, 12 and 94 neurons for the transient VP, persistent VP and GPe, respectively. Same conventions as in (a). c, Transient VP, persistent VP and GPe average population responses in the first and last trials of the NOV-P block, for trials in which the reward outcome was given (upper row) and in which it was omitted (lower row). N = 26, 12 and 93 neurons for the transient VP, persistent VP and GPe, respectively. Same conventions as in (a).
Supplementary information
Supplementary Information
Supplementary Figs. 1 and 2.
About this article
Cite this article
Kaplan, A., Mizrahi-Kliger, A.D., Israel, Z. et al. Dissociable roles of ventral pallidum neurons in the basal ganglia reinforcement learning network. Nat Neurosci 23, 556–564 (2020). https://doi.org/10.1038/s41593-020-0605-y
DOI: https://doi.org/10.1038/s41593-020-0605-y