Introduction

The dorsal striatum is a critical structure supporting reinforcement learning [1]. Classical intracranial self-stimulation (ICSS) studies in rats identified stimulation sites throughout the dorsal striatum that support operant responding [2, 3]. More recently, application of optogenetic techniques demonstrated that selective stimulation of nigrostriatal dopamine neurons is sufficient to support reinforcement learning [4, 5]. In addition, activation of striatal projection neurons (medium spiny neurons, MSNs) in either the medial (DMS) or lateral (DLS) subregion of the dorsal striatum reinforces actions [6,7,8]. Because MSN activity critically depends on glutamatergic inputs, it is necessary to consider how glutamatergic projections to the striatum support reinforcement learning. Currently, whether activation of glutamatergic inputs to the dorsal striatum is sufficient to drive reinforcement remains unclear.

MSNs receive glutamatergic inputs from cortical and thalamic regions [9,10,11]. Whereas corticostriatal involvement in reinforcement learning has been studied extensively [12], few studies have examined the contributions of thalamic projections [13]. Inputs to the striatum from rostral intralaminar thalamic nuclei and from the parafascicular nucleus provide excitatory inputs to MSNs [14,15,16,17]. In addition, thalamic inputs to striatal cholinergic interneurons (CINs) locally control dopamine release via acetylcholine-mediated activation of nicotinic receptors on dopamine neuron terminals [18, 19]. The combined actions of thalamostriatal pathways on MSNs and dopamine release in the striatum suggest that activation of these projections could support behavioral reinforcement. Concordantly, electrical stimulation of intralaminar thalamic nuclei supports ICSS [20]. Additionally, optogenetic stimulation of rostral intralaminar thalamic terminals in the dorsal striatum maintains operant responding in mice that were previously trained to respond for food [19]. However, it remains unknown whether stimulation of thalamostriatal terminals in the DMS is sufficient to reinforce a novel, self-initiated action.

Presynaptic G protein-coupled receptors (GPCRs) expressed on glutamatergic terminals in the striatum regulate synaptic transmission and exert substantial influence over reinforcement [21, 22]. Agonists of the Gi/o-coupled group II metabotropic glutamate (mGlu) receptors (mGlu2 and mGlu3) produce particularly robust presynaptic inhibition of glutamate release that is mediated by mGlu2 [23,24,25,26,27]. In addition, group II mGlu receptor activation reduces both basal and drug-evoked extracellular dopamine levels in the striatum [28,29,30,31,32]. Activation of mGlu2 produces strong inhibition of thalamostriatal glutamatergic transmission onto MSNs and CINs and reduces thalamically driven, acetylcholine-mediated dopamine release [23]. Thus, mGlu2 is poised to influence behavioral reinforcement associated with thalamostriatal transmission in the dorsal striatum.

Here we evaluated operant responding for optical stimulation of thalamic terminals in the DMS. We report that mice readily press a lever for thalamostriatal stimulation without prior training for an alternative outcome or manipulation of motivational state. Consistent with known effects of mGlu2 on thalamically driven glutamate and dopamine transmission [23], group II mGlu receptor activation reduces thalamostriatal self-stimulation, whereas mGlu2/3 receptor blockade enhances responding. These results support a role for thalamostriatal transmission in reinforcement learning.

Materials and methods

Animals

Male and female mice were 5–8 weeks old at the time of surgery and 10–14 weeks old when behavioral experiments began. C57BL/6J mice were purchased from the Jackson Laboratory (strain 000664; Bar Harbor, ME) and arrived in the housing facility at least 1 week prior to surgery. To produce mice in which ChR2 was expressed under the control of the Vglut2 promoter, Vglut2-IRES-Cre+/− mice (The Jackson Laboratory #028863) were bred with Ai32+/+ mice (The Jackson Laboratory #024109), which contain ChR2(H134R)-EYFP downstream of a loxP-flanked STOP cassette to express ChR2 in a Cre-dependent manner. This cross yielded Vglut2-IRES-Cre+/− and Vglut2-IRES-Cre−/− mice hemizygous for the Ai32 allele [33, 34]. Animals were housed in the Fishers Lane Animal Care facility managed by the National Institute on Alcohol Abuse and Alcoholism (NIAAA). Studies were carried out in accordance with the National Institutes of Health guide for the care and use of laboratory animals and were approved by the NIAAA Animal Care and Use Committee. Mice were housed on a 12-h light/dark cycle on ventilated racks in a temperature- and humidity-controlled room with ad libitum access to food and water, except in cases of food restriction. All experiments were carried out during the light phase.

Stereotaxic viral vector injection and optical fiber implantation

Mice were anesthetized with isoflurane and placed into a stereotaxic frame (David Kopf Instruments). Craniotomy and durotomy were performed above each injection or implantation site. C57BL/6J mice were injected bilaterally with 300 nL AAV-ChR2 or AAV-EGFP (see Supplementary Methods) at 60 nL/min using a Hamilton syringe with a 32-gauge needle. We targeted the anterior intralaminar nuclei of the thalamus using coordinates (in mm, relative to bregma) 1.3 posterior, 0.5 lateral, and 3.8 ventral from the brain surface. For self-stimulation experiments, tips of optical fibers (5 mm long, 205 µm core, 0.22 NA, with ceramic ferrule; Thorlabs) were implanted bilaterally in the DMS using coordinates (in mm, relative to bregma) 0.8 anterior, 1.4 lateral, and 2.2 ventral from the brain surface. Fibers were secured to the skull using Teets denture material (Co-oral-ite Dental Mfg., Diamond Springs, CA).

Optical self-stimulation task

Configurations of operant chambers and additional details are described in Supplementary Methods. Sessions began with illumination of the house light and insertion of the lever(s). Optical stimulation (5–20 Hz, 1 s, 5 ms pulse width) in response to active lever presses was delivered on a fixed-ratio 1 (FR1) schedule. Sessions lasted 30 min with an unlimited number of reinforcements. The end of the session was signaled by extinguishing the house light and retracting the lever(s). To encourage exploration of the area immediately surrounding the active lever during early training, we performed a shaping procedure beginning 20 min into the first training session (see Supplementary Methods). Reversal learning was assessed in a subset of mice by reversing the lever that produced light delivery. Learning of the new lever-stimulation contingency was assessed over seven training sessions. Once stable pressing was observed following reversal learning, mice underwent extinction training in which an active lever press no longer produced a stimulation. Responding under extinction conditions was measured during three sessions, followed by a single reacquisition session in which we restored the active lever-stimulation contingency. No cues, priming, or other means of reinstatement were used.
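The stimulation parameters above (5–20 Hz, 1 s, 5 ms pulse width) fix the number of pulses per train, which is why the trains are described elsewhere as 20 pulses at 20 Hz, 10 pulses at 10 Hz, or 5 pulses at 5 Hz. The following is a minimal sketch of that arithmetic, not the authors' control software. The `PulseTrain` class and its method names are hypothetical, and it assumes evenly spaced pulses with the first pulse starting at the lever press.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PulseTrain:
    """One optical stimulation train (hypothetical container for illustration)."""
    frequency_hz: float          # pulse rate within the train (5-20 Hz in the text)
    duration_s: float = 1.0      # train duration stated in the text
    pulse_width_ms: float = 5.0  # pulse width stated in the text

    def n_pulses(self) -> int:
        # Number of pulses delivered during one train at this rate.
        return round(self.frequency_hz * self.duration_s)

    def onsets_ms(self) -> List[float]:
        # Pulse onset times relative to the lever press.
        # Assumption: first pulse at t = 0, then evenly spaced.
        period_ms = 1000.0 / self.frequency_hz
        return [i * period_ms for i in range(self.n_pulses())]

    def light_on_ms(self) -> float:
        # Total time the light is on during one train.
        return self.n_pulses() * self.pulse_width_ms


for hz in (5, 10, 20):
    train = PulseTrain(frequency_hz=hz)
    print(hz, "Hz:", train.n_pulses(), "pulses,", train.light_on_ms(), "ms of light")
```

Under these assumptions, each train keeps the light on for only a small fraction of its 1 s duration, and changing the frequency changes both the pulse count and the total light delivered per press.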

Experimental timeline

Vglut2-Cre+/−;Ai32+/− mice underwent 6 days of training prior to assessment of drug effects on operant responding. Drug details are provided in Supplementary Methods. Experimental timelines for operant responding for food reinforcement and operant self-stimulation in the cohort of C57BL/6J mice expressing ChR2 in the thalamus are shown in Figs. S3a and S4a (see also Supplementary Methods).

Brain slice preparation, whole-cell patch clamp recording, and fast-scan cyclic voltammetry recording

Brain slice preparation, whole-cell voltage-clamp recordings, and fast-scan cyclic voltammetry (FSCV) recordings were conducted as previously described [23] (see Supplementary Methods).

Data analysis

Data visualization and statistical analyses were performed using GraphPad Prism or R. Lever pressing across training sessions was analyzed using two-way repeated measures (RM) analysis of variance (ANOVA) or a linear mixed effects model. Session and lever were treated as within-subject factors, and genotype and viral vector differences were analyzed as between-group factors. Significant factor effects were analyzed by the indicated post hoc multiple comparisons tests. Within-subject effects of drugs or changes in stimulation rate (compared to the preceding session) were analyzed using a paired t-test or two-way RM ANOVA when multiple doses or conditions of the same drug or manipulation were compared. For all statistical comparisons, alpha was set at 0.05. See Supplementary Methods for additional details.
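As an illustration of the within-subject drug comparisons described above, the paired t statistic can be computed from per-mouse differences between drug and vehicle sessions. This is a minimal sketch using only the Python standard library, not the GraphPad Prism or R analysis used in the study. The function name `paired_t` and the lever-press counts are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(vehicle, drug):
    """Return (t, df) for a paired t-test on matched observations."""
    if len(vehicle) != len(drug) or len(vehicle) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    # Each mouse serves as its own control: analyze drug minus vehicle.
    diffs = [d - v for v, d in zip(vehicle, drug)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1


# Hypothetical lever-press counts for six mice (illustrative values only).
vehicle_presses = [120, 95, 150, 80, 110, 130]
drug_presses = [60, 50, 70, 45, 40, 65]

t_stat, df = paired_t(vehicle_presses, drug_presses)
print(f"t({df}) = {t_stat:.3f}")
```

The degrees of freedom equal the number of mice minus one, which matches the reporting format used in the Results, for example t(6) for n = 7 mice. A p-value would then be read from the t distribution with that many degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) returns both values directly.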

Results

Optical stimulation of ChR2-expressing terminals in the DMS of Vglut2-Cre+/−;Ai32+/− mice reinforces operant responding

Glutamatergic thalamic neurons that project to the dorsal striatum express the vesicular glutamate transporter Vglut2 [35]. To examine the reinforcing properties of stimulation of Vglut2+ terminals in the DMS, we expressed ChR2 in Vglut2+ projections to the striatum by crossing Vglut2-IRES-Cre+/− mice with Ai32+/+ mice (Fig. 1a). We implanted bilateral optical fibers in the DMS of Vglut2-Cre+/−;Ai32+/− or Vglut2-Cre−/−;Ai32+/− mice (Figs. 1b and S1a, b), and then trained mice to perform a self-paced self-stimulation task. Lever presses were continuously reinforced with an optical stimulation train (20 pulses of 473 nm light delivered at 20 Hz). Across six training sessions, optical stimulation reinforced pressing in Vglut2-Cre+/−;Ai32+/− mice, but not in Vglut2-Cre−/−;Ai32+/− mice (two-way RM ANOVA, session × genotype interaction, F(5,45) = 8.105, p < 0.0001) (Fig. 1c). Qualitatively, we observed that mice responded in clusters interspersed with breaks from engaging with the lever (Fig. 1d). We also observed stimulation-related movements that emerged over the course of training (Supplementary Results, Supplementary Video 1).
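The degrees of freedom reported for the session × genotype interaction, F(5,45), follow from the design: 11 mice (n = 7 and n = 4, per Fig. 1c), two genotypes, and six sessions. The sketch below reproduces that arithmetic using the textbook formulas for balanced repeated-measures designs. It assumes complete data and no sphericity correction, and the function names are hypothetical.

```python
def mixed_design_df(n_subjects, n_groups, n_sessions):
    """Interaction df with one between-subject and one within-subject factor."""
    df_effect = (n_groups - 1) * (n_sessions - 1)
    df_error = (n_subjects - n_groups) * (n_sessions - 1)
    return df_effect, df_error


def within_within_df(n_subjects, n_levels_a, n_levels_b):
    """Interaction df when both factors are within-subject."""
    df_effect = (n_levels_a - 1) * (n_levels_b - 1)
    df_error = (n_subjects - 1) * df_effect
    return df_effect, df_error


# Session x genotype: 11 mice, 2 genotypes, 6 sessions.
print(mixed_design_df(n_subjects=11, n_groups=2, n_sessions=6))

# Session x lever with both factors within-subject: 6 mice, 6 sessions, 2 levers.
print(within_within_df(n_subjects=6, n_levels_a=6, n_levels_b=2))
```

The second function covers designs in which every mouse contributes to both factors, as in the session × lever analyses reported later in the Results.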

Fig. 1: Optical stimulation of Vglut2+ terminals in the DMS reinforces operant lever pressing.

a Breeding scheme. b Diagram of optical fiber placement in the DMS. c Acquisition of lever pressing for 20 pulses of 20-Hz optical stimulation in Vglut2-Cre+/−;Ai32+/− (n = 7) and Vglut2-Cre−/−;Ai32+/− (n = 4) mice. Data represent mean ± SEM. d Examples of lever presses from three representative Vglut2-Cre+/−;Ai32+/− mice during acquisition session 6. Tick marks represent individual lever presses. e Representative whole-cell recording of EPSCs in a DMS medium spiny neuron evoked by 20 pulse, 20-Hz optical stimulation. Scale bars: 100 pA, 0.25 s. f Representative FSCV traces of dopamine release measured in the DMS in response to 20 pulses of 20-Hz optical stimulation, before and after bath application of DHβE (1 µM). Scale bars: 0.5 µM dopamine, 0.5 s.

We examined responses to 20-Hz optical stimulation in brain slices to determine which synaptic mechanisms were activated. This protocol produced excitatory postsynaptic currents (EPSCs) in MSNs recorded from the DMS, albeit with some failures in response to later light pulses in each train (Fig. 1e). Stimulation at 20 Hz also evoked dopamine release in the DMS as measured by FSCV (Fig. 1f). Dopamine release was partially blocked by the nicotinic receptor antagonist DHβE (1 µM) (Figs. 1f and S1c), suggesting that about half of dopamine release was mediated by CIN-derived acetylcholine actions on dopamine terminals, while the remainder was attributable to direct activation of dopaminergic afferents or another indirect effect.

mGlu2 modulates responding for stimulation of Vglut2+ terminals in the DMS

Because mGlu2 robustly modulates striatal glutamatergic and dopaminergic transmission [23], we predicted that pharmacological manipulation of mGlu2 would alter the reinforcing properties of Vglut2+ terminal stimulation in the DMS. Systemic injection of the mGlu2/3-preferring antagonist LY341495 prior to the self-stimulation session did not alter the number of active lever presses per session (119.3 ± 14.3% of vehicle, t(6) = 1.28, p = 0.25, paired t-test) (Fig. 2a, c, d). Conversely, LY379268 (1 or 3 mg/kg) reduced responding (two-way RM ANOVA: main effect of treatment session (LY379268 vs. vehicle), F(1,12) = 13.57, p = 0.0031) (Fig. 2b, c, e), but we did not find a significant dose × treatment interaction (F(1,12) = 0.45, p = 0.51). To further understand the effects of mGlu2/3 activation on patterns of pressing, we quantified the number of clusters of pressing per session, the length of breaks from engaging with the lever, and within-cluster parameters including number of presses, duration, and press rate. Reduced lever pressing after 3 mg/kg LY379268 was primarily driven by an increase in the length of breaks between clusters of pressing (Fig. 2j). In DMS brain slices prepared from Vglut2-Cre+/−;Ai32+/− mice, optically evoked dopamine release was reduced by LY379268 (100 nM); prior application of DHβE occluded this effect, suggesting that mGlu2 exclusively attenuates dopamine release driven by glutamatergic inputs to CINs (Fig. S1d).
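The pressing-pattern measures named above (clusters per session, break length, and within-cluster presses, duration, and press rate) can all be derived from a list of press timestamps once a rule for separating clusters is chosen. The sketch below shows one way to do this. The criterion used here, a maximum inter-press gap of 10 s, is a placeholder assumption; the criterion actually used in the study is not stated in this section. The function names and the definition of press rate as intervals per second are likewise hypothetical.

```python
from statistics import mean
from typing import Dict, List


def split_into_clusters(press_times_s: List[float], max_gap_s: float) -> List[List[float]]:
    """Group presses into clusters; a gap longer than max_gap_s starts a new cluster."""
    clusters: List[List[float]] = []
    for t in sorted(press_times_s):
        if clusters and t - clusters[-1][-1] <= max_gap_s:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


def summarize_pressing(press_times_s: List[float], max_gap_s: float = 10.0) -> Dict[str, float]:
    """Per-session pattern measures; max_gap_s is an assumed placeholder threshold."""
    clusters = split_into_clusters(press_times_s, max_gap_s)
    if not clusters:
        raise ValueError("no lever presses in this session")
    durations = [c[-1] - c[0] for c in clusters]
    # Break = time from the last press of one cluster to the first press of the next.
    breaks = [nxt[0] - cur[-1] for cur, nxt in zip(clusters, clusters[1:])]
    # Rate = inter-press intervals per second; single-press clusters are skipped.
    rates = [(len(c) - 1) / d for c, d in zip(clusters, durations) if d > 0]
    return {
        "n_clusters": len(clusters),
        "mean_cluster_duration_s": mean(durations),
        "mean_presses_per_cluster": mean(len(c) for c in clusters),
        "mean_within_cluster_rate_hz": mean(rates) if rates else 0.0,
        "mean_break_s": mean(breaks) if breaks else 0.0,
    }


# Hypothetical press timestamps (seconds from session start).
example = [0.0, 1.0, 2.0, 30.0, 31.0, 100.0]
print(summarize_pressing(example))
```

Separating the measures this way distinguishes a drug that shortens or thins out bouts of pressing from one that lengthens the pauses between otherwise unchanged bouts.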

Fig. 2: mGlu2/3 activation reduces operant responding for optical stimulation of Vglut2+ terminals in the DMS.

a, b Within-subject comparisons of lever presses per session in Vglut2-Cre+/−;Ai32+/− mice (n = 7) after injection of vehicle or 3 mg/kg LY341495 (a), or vehicle or 1 or 3 mg/kg LY379268 (b). For b, * indicates a main effect of treatment session (vehicle vs. LY379268; p = 0.0031, two-way RM ANOVA). c Average lever presses per session (normalized to vehicle presses for each mouse) for each drug treatment. Bars represent mean ± SEM, and individual data points are overlaid. Doses of each drug (mg/kg) are indicated in parentheses. d, e Examples of lever presses from a representative mouse during vehicle or drug sessions. Tick marks represent individual lever presses. f–j Within-subject comparisons of patterns of pressing during vehicle or LY379268 (3 mg/kg) sessions. Parameters analyzed were clusters of pressing per session (f), mean duration of clusters (g), mean number of presses per cluster (h), mean within-cluster press rate (i), and the mean length of breaks between clusters of pressing (j). For j, *t(6) = 3.10, p = 0.0211, paired t-test.

We trained an additional cohort of Vglut2-Cre+/−;Ai32+/− mice to distinguish between an active lever and an inactive lever to receive a stimulation train consisting of 10 pulses at 10 Hz (Fig. S2a). This stimulation protocol reliably produced EPSCs in MSNs (Fig. S2b) and evoked dopamine release in brain slices (Figs. S1c and S2c). Across six training sessions, mice escalated pressing of the active lever but not the inactive lever (Fig. 3a). Two-way RM ANOVA revealed a significant session × lever interaction (F(5,25) = 18.76, p < 0.0001). To determine the specific contribution of mGlu2 to modulation of responding for stimulation of ChR2-expressing terminals in the DMS, we compared active lever presses following injection of the mGlu2-selective positive allosteric modulator BINA (15 mg/kg, i.p.) or vehicle. Like LY379268, BINA robustly decreased responding (46.6 ± 12.2% of presses during vehicle session, t(5) = 4.537, p = 0.0062, paired t-test) (Fig. 3b), supporting a specific role for mGlu2.

Fig. 3: The mGlu2-selective positive allosteric modulator BINA reduces operant responding for optical stimulation of Vglut2+ terminals in the DMS.

a Active and inactive lever pressing for 10 pulses of 10-Hz optical stimulation in Vglut2-Cre+/−;Ai32+/− mice (n = 6). Data represent mean ± SEM. b Within-subject comparison of active lever presses per session after injection of vehicle or 15 mg/kg BINA (n = 6). *p = 0.0062, paired t-test.

Previous studies have shown that mGlu2/3 activation modestly reduces operant responding for natural reinforcers [36, 37]. To assess the specificity of mGlu2/3 manipulations, we trained food-restricted C57BL/6J mice to press a lever for delivery of a food pellet (Fig. S3a, b). In contrast to the lack of effect on Vglut2+ terminal self-stimulation, the mGlu2/3 antagonist LY341495 markedly reduced lever pressing for a food reinforcer to 49.7 ± 5.0% of vehicle (Fig. S3c, e). Similar to previous reports in rats, LY379268 (1–3 mg/kg) modestly decreased pressing for a food reinforcer (1 mg/kg: 87.5 ± 3.3% of vehicle; 3 mg/kg: 87.1 ± 5.1% of vehicle) (Fig. S3d, e).

Optical stimulation of thalamic terminals in the DMS reinforces operant behavior

Although Vglut2+ terminals in the dorsal striatum are typically attributed to thalamic afferents [9, 35], and little co-localization with dopamine neuron markers has been observed in the adult SNc [38], our finding that optically evoked dopamine release in the DMS of Vglut2-Cre+/−;Ai32+/− mice is only partially sensitive to nicotinic receptor blockade suggests that ChR2 can stimulate dopamine release via direct effects on dopaminergic terminals. This is consistent with reports of broader Vglut2 expression in dopamine neurons during development [38, 39]. In addition, the basolateral amygdala and pedunculopontine nucleus are among other regions that could contribute Vglut2+ glutamatergic inputs to the DMS [33, 40]. To more selectively evaluate the reinforcing properties of thalamostriatal terminal stimulation, we virally expressed ChR2 or EGFP bilaterally in the intralaminar nuclei of the thalamus and implanted bilateral optical fibers in the DMS of C57BL/6J mice (Figs. 4a and S4b, c), then trained mice in a self-paced self-stimulation task with an active and inactive lever available (Fig. S4a). Pressing the active lever resulted in optical stimulation (10 pulses of blue light delivered at 10 Hz). ChR2-injected mice engaged in more active lever presses per session compared to EGFP-injected mice during later sessions (linear mixed effects model, t(10) = −4.028, p = 0.0024) (Fig. 4b, c). ChR2-injected mice performed more active lever presses per session during later sessions compared to the first session (t(193) = −5.60, p < 0.0001) and engaged in fewer inactive lever presses per session than active lever presses during later sessions (t(193) = −11.23, p < 0.0001) (Fig. 4b, Supplementary Results). Among ChR2-injected mice, there was a significant lever × session interaction (t(193) = −3.64, p = 0.0003).

Fig. 4: Optical stimulation of thalamic terminals in the DMS reinforces operant lever pressing.

a Diagram of AAV-ChR2 or AAV-EGFP injection in the anterior intralaminar nuclei of the thalamus and optical fiber placement in the DMS. b, c Acquisition of lever pressing for 10 pulses of 10-Hz optical stimulation of thalamostriatal terminals in the DMS of C57BL/6J mice expressing ChR2 (b, n = 8) or EGFP (c, n = 4). d Examples of lever presses from three representative mice expressing ChR2 in thalamostriatal terminals during acquisition session 9. Tick marks represent individual lever presses. e Representative whole-cell recording of EPSCs in a DMS medium spiny neuron in response to 10 pulses of 10-Hz optical stimulation. Scale bars: 100 pA, 0.25 s. f Representative FSCV traces of dopamine release measured in the DMS in response to 10 pulses of 10-Hz optical stimulation, before and after bath application of DHβE (1 µM). Scale bars: 0.5 µM dopamine, 0.5 s. g Lever presses at baseline and across seven sessions of reversal learning (n = 5). *p < 0.05, active lever vs. inactive lever, Sidak’s multiple comparisons test. h Lever presses at baseline, during three extinction sessions, and during one reacquisition session (n = 5). *p < 0.05, active lever presses during extinction/reacquisition sessions vs. baseline, Tukey’s multiple comparisons test. For b, c and g, h, data represent mean ± SEM.

To test the influence of stimulation rate on responding for thalamostriatal stimulation, we varied the stimulation train to either 20 pulses delivered at 20 Hz or 5 pulses delivered at 5 Hz (Fig. S4d). In both cases, varying the stimulation train reduced the number of presses per session (Fig. S4e).

Brain slice electrophysiology and FSCV experiments confirmed that optical stimulation in DMS evoked both glutamate and dopamine release (Fig. 4e, f). In MSNs, the 10-Hz stimulation train reliably produced EPSCs (Fig. 4e). Dopamine release was blocked by DHβE, consistent with dopamine release being driven by thalamic activation of CINs and subsequent acetylcholine actions on dopaminergic terminals (Fig. 4f) [18, 19].

Next, we evaluated the ability of mice to flexibly update rates of lever pressing in response to changes in the lever-stimulation contingency. First, we reversed the lever-stimulation contingency such that the previously inactive lever became the active lever. Upon reversal, mice decreased pressing of the formerly active lever and increased pressing of the newly active lever (two-way RM ANOVA, lever × session interaction, F(7,28) = 4.554, p = 0.0017) (Fig. 4g). By the sixth and seventh reversal training sessions, mice pressed the newly active lever more than the previously active lever (session 6, p = 0.0022; session 7, p = 0.0001, Sidak’s multiple comparisons test). We then evaluated responding under extinction conditions. When stimulation was no longer delivered in response to a press on the previously active lever, mice rapidly decreased responding; restoration of press-stimulation contingency during a single reacquisition session restored pressing to 52.1 ± 7.5% of baseline levels (Fig. 4h). Across baseline, extinction, and reacquisition sessions, two-way RM ANOVA revealed a significant lever × session interaction (F(4,16) = 23.87, p < 0.0001). Post hoc comparisons demonstrated that compared with baseline pressing, mice pressed the active lever fewer times during each extinction session (p < 0.0001, Tukey’s multiple comparisons test). When the lever-stimulation contingency was reacquired, mice pressed the active lever significantly more than during the final extinction session (p < 0.0001), but less than on the baseline day (p < 0.0001).

Group II mGlu receptors modulate lever pressing for thalamostriatal stimulation

In the same group of mice trained to press a lever for thalamostriatal stimulation, we evaluated the ability of pharmacological manipulations of group II mGlu receptors to modulate lever pressing. Pharmacological interventions were performed after the acquisition and stimulation rate manipulations but prior to reversal learning and extinction training (Fig. S4a). Injection of the group II mGlu receptor antagonist LY341495 (3 mg/kg) increased total lever presses to 209 ± 23.7% of vehicle pressing (t(5) = 2.666, p = 0.045, paired t-test) (Fig. 5a, c, d). Conversely, the agonist LY379268 dose-dependently reduced responding (Fig. 5b, c, e). Two-way RM ANOVA revealed a significant treatment session (vehicle vs. LY379268) × dose interaction (F(1,13) = 6.611, p = 0.0232). Whereas 1 mg/kg LY379268 did not significantly reduce responding, we observed a substantial decrease in responding with 3 mg/kg (1 mg/kg: 78.5 ± 10.2% of vehicle, p = 0.19; 3 mg/kg: 39.1 ± 10.7% of vehicle, p = 0.0002, Sidak’s multiple comparisons test). Analysis of pressing patterns revealed that LY341495 and LY379268 had opposing effects on the number of clusters of pressing per session, with LY341495 increasing and LY379268 decreasing the number of clusters (LY341495: t(5) = 5.069, p = 0.0039; LY379268: t(5) = 3.752, p = 0.0133, paired t-test) (Fig. 5f, k). Similar to the effects of LY379268 in Vglut2-Cre+/−;Ai32+/− mice, the architecture of pressing within clusters was not consistently altered (Fig. 5l–n). In addition, we did not observe a significant increase in break length (Fig. 5o). Within-cluster parameters and break length were similarly unaffected by LY341495 (Fig. 5g–j).
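The percent-of-vehicle values reported here are group summaries of per-mouse ratios: each mouse's drug-session presses are divided by its own vehicle-session presses, and the resulting percentages are averaged (mean ± SEM). A minimal sketch of that calculation follows. The function names and the press counts are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def percent_of_vehicle(vehicle, drug):
    """Normalize each mouse's drug-session presses to its own vehicle session."""
    if len(vehicle) != len(drug):
        raise ValueError("vehicle and drug lists must be matched by mouse")
    return [100.0 * d / v for v, d in zip(vehicle, drug)]


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of values."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical active lever presses for four mice (illustrative values only).
vehicle_presses = [100, 200, 50, 80]
drug_presses = [50, 100, 50, 40]

pct = percent_of_vehicle(vehicle_presses, drug_presses)
m, sem = mean_sem(pct)
print(f"{m:.1f} ± {sem:.1f}% of vehicle")
```

Normalizing within each mouse before averaging keeps animals with high baseline press rates from dominating the group mean.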

Fig. 5: mGlu2/3 activity constrains operant responding for thalamostriatal stimulation.

a, b Within-subject comparisons of lever presses per session in C57BL/6J mice expressing ChR2 in thalamostriatal terminals (n = 6–8) after injection of vehicle or 3 mg/kg LY341495 (a) or 1 or 3 mg/kg LY379268 (b). *p < 0.05, paired t-test (a) or Sidak’s multiple comparisons test (b). c Average active lever presses per session (normalized to vehicle presses for each mouse) for each drug treatment. Bars represent mean ± SEM, and individual data points are overlaid. Doses of each drug (mg/kg) are indicated in parentheses. d, e Examples of lever presses from a representative mouse during vehicle or drug sessions. Tick marks represent individual lever presses. f–o Within-subject comparisons of patterns of pressing during vehicle vs. 3 mg/kg LY341495 (f–j) or vehicle vs. 3 mg/kg LY379268 (k–o) sessions. Parameters analyzed were clusters of pressing per session (f, k), mean duration of clusters (g, l), mean number of presses per cluster (h, m), mean within-cluster press rate (i, n), and the mean length of breaks between clusters of pressing (j, o). For f, k, *p < 0.05, paired t-test.

Discussion

Recent studies examining the roles of thalamostriatal projections in reinforcement learning have identified roles for discrete components of this pathway (i.e., inputs from the rostral intralaminar or parafascicular nuclei) in behavioral flexibility, operant behaviors, and incubation of drug craving [13, 19, 41, 42]. Here we demonstrate that specific stimulation of thalamic terminals in the DMS supports operant conditioning in a self-paced task. Mice acquire thalamostriatal self-stimulation in the absence of predictive cues, without manipulation of motivational state (i.e., food restriction), and without prior training to respond for an alternative outcome. Our results extend previous findings that manipulations localized to the dorsal striatum, including stimulation of MSNs or nigrostriatal dopamine release, are sufficient to reinforce a self-initiated action [4,5,6,7,8].

The behavioral roles of locally regulated striatal dopamine release mediated by thalamostriatal transmission remain a major question. A recent report that D1 receptor antagonists decrease operant responding maintained by thalamostriatal stimulation supports a role for thalamically evoked dopamine in behavioral reinforcement [19]. This is consistent with our finding that mGlu2 activation, which robustly decreases thalamically driven dopamine release [23], reduces operant responding for thalamostriatal stimulation. Thalamostriatal projections support methamphetamine seeking following forced abstinence in a D1 receptor-dependent manner, further supporting the behavioral relevance of thalamic regulation of striatal dopamine transmission [41]. In Vglut2-Cre+/−;Ai32+/− mice, optical stimulation evokes dopamine release driven by CIN-mediated activation of nicotinic receptors on dopamine neurons as well as other mechanisms, most likely direct stimulation of dopaminergic terminals. Importantly, our demonstration that mice specifically expressing ChR2 in thalamic inputs to the DMS also acquire self-stimulation behavior confirms that selective activation of thalamic glutamatergic inputs is sufficient to drive reinforcement.

Non-dopaminergic effects of glutamate released from thalamic terminals could also contribute to behavioral reinforcement. Notably, reinforcement learning driven by optogenetic stimulation of DMS MSNs does not depend on dopamine receptor activation [6]. Thalamic inputs target both D1- and D2-expressing MSNs [16, 17] and drive excitation [14, 15], raising the possibility that thalamostriatal self-stimulation is at least partially supported by direct activation of MSNs. Of note, ablation of CINs surrounding the site of thalamostriatal self-stimulation only partially impairs responding, suggesting involvement of mechanisms independent of locally evoked dopamine release [19]. Direct optogenetic activation of D1-expressing MSNs in the DMS supports reinforcement learning [6, 8], whereas optical stimulation of D2-expressing MSNs promotes avoidance [6]. Thus, concurrent stimulation of excitatory inputs to both populations of MSNs could produce competing reinforcing and aversive signals that are reflected in intermittent patterns of lever engagement.

Given our findings that mGlu2 activation reduces thalamostriatal glutamatergic transmission in both MSNs and CINs, and in turn reduces locally evoked dopamine release mediated by acetylcholine [23], we predicted that pharmacological manipulation of mGlu2 would modify the reinforcing properties of light trains during our self-stimulation task. Supporting this, the mGlu2/3 agonist LY379268 reduced thalamostriatal self-stimulation in both Vglut2-Cre+/−;Ai32+/− mice and C57BL/6J mice expressing ChR2 in the thalamus. In Vglut2-Cre+/−;Ai32+/− mice, this effect was mimicked by the mGlu2-selective positive allosteric modulator BINA. Moreover, our finding that the mGlu2/3 antagonist LY341495 increased responding suggests that mGlu2 is endogenously activated during thalamostriatal terminal stimulation and constrains the reinforcing properties of stimulation.

Our finding that mGlu2 activation constrains operant responding for thalamostriatal stimulation identifies a unique neural substrate by which mGlu2 can modify the value of a stimulus during reinforcement learning. Activation of group II mGlu receptors is known to reduce both basal and psychostimulant-evoked dopamine release [28,29,30,31,32]. However, mGlu2/3 agonist administration does not reduce extracellular dopamine levels or locomotion evoked by midbrain electrical stimulation or L-DOPA administration [31]. These data are consistent with previous reports that failed to observe mGlu2 expression in nigrostriatal projections [43] and our current demonstration that nicotinic receptor blockade occludes LY379268-mediated inhibition of dopamine release in Vglut2-Cre+/−;Ai32+/− mice. Collectively, these findings are inconsistent with mGlu2 reduction of dopamine release via direct actions on dopaminergic terminals. It is likely that mGlu2 acting on glutamatergic inputs to CINs, and possibly MSNs, underlies the dampened reinforcing properties of Vglut2+ or thalamostriatal terminal stimulation following administration of LY379268 or the mGlu2-selective PAM BINA. However, mGlu2 (and possibly mGlu3) actions in downstream circuit elements that support operant responding could also contribute to these effects.

Consistent with the ability to decrease drug-enhanced extracellular dopamine levels, mGlu2/3 activation decreases self-administration of psychoactive drugs including cocaine, amphetamines, nicotine, and alcohol [29, 32, 37, 44,45,46,47,48,49,50,51,52]. However, the ability of these receptors to constrain reinforcement appears dependent on the nature of the reinforcer. In previous studies, mGlu2/3 agonists reduced responding for natural reinforcers such as sucrose, although such findings are inconsistent and typically involve higher doses than are required to decrease responding for psychoactive drugs [21, 36, 37]. We observed a modest decrease in responding for palatable food following LY379268 administration. Notably, administration of the mGlu2/3 antagonist LY341495 produced opposing effects on responding for thalamostriatal stimulation (increased responding) vs. food reinforcement (decreased responding). Similar to effects on thalamostriatal self-stimulation, previous studies have shown that LY341495 administration or genetic deletion of mGlu2 increases self-administration of reinforcing drugs such as alcohol, cocaine, and heroin [53,54,55,56]. These incongruent effects might reflect differential engagement of circuitry modulated by mGlu receptors depending on the stimulus. Future studies measuring thalamostriatal activity and striatal dopamine dynamics during reinforcement learning are necessary to determine the engagement of this pathway during acquisition of operant responding for various outcomes, including natural reinforcers and psychoactive drugs.

Funding and disclosures

This work was supported by NIAAA Division of Intramural Clinical and Biological Research ZIA AA000416 (to D.M.L.) and NIH grant K99 AA025403 (to K.A.J.). The authors declare no conflicts of interest.