Introduction

Midbrain dopamine (DA) neurons are important in many behavioral and cognitive functions of the brain, including movement, reward learning and motivation. DA neuron dysfunction has been associated with a number of brain disorders, including Parkinson’s disease, drug abuse and depression. Anatomical and functional studies over the last several decades have shown that DA neurons integrate diverse excitatory glutamatergic, inhibitory GABAergic and a multitude of other neuromodulatory inputs to form complex firing patterns that are critical for DA neuron function.1, 2, 3, 4, 5 Recent genetic and optogenetic studies have indicated that these different inputs may transmit overlapping but distinguishable signals to DA neurons.6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17

Among these inputs, glutamatergic afferents mediated by both AMPA- and NMDA-type ionotropic glutamate receptors (AMPARs and NMDARs, respectively), arising from a number of cortical and subcortical structures, are the major excitatory input and play important roles in the regulation of DA neuron activity and plasticity both in vitro and in vivo.3, 5, 18, 19, 20, 21, 22, 23 Early clues to the role of glutamatergic input onto DA neurons in behavior came from observations that local administration of glutamate receptor antagonists impairs behavioral sensitization and reward learning,24, 25, 26, 27 although these studies have limitations in achieving brain-region or cell-type specificity. Recent experiments to genetically delete the Grin1 allele, which encodes the NMDAR obligatory GluN1 subunit, or individual AMPAR subunit genes in DA neurons revealed important roles for specific receptor subtypes in psychostimulant sensitization, reward and learning.6, 7, 8, 17, 28, 29, 30, 31, 32 Interestingly, these studies also reported that genetic deletion of NMDARs in DA neurons caused a nearly two-fold increase of AMPAR-mediated synaptic transmission,6, 8 while mice lacking individual AMPAR subunits had essentially normal glutamatergic synaptic strength,6, 28 suggesting possible adaptations in glutamatergic synaptic transmission following partial perturbation of glutamate receptors in DA neurons. Here we developed a DA-specific quadruple knockout mouse in which NMDARs are absent and AMPAR signaling is severely impaired (10% residual). We then used this unique model to test which behaviors are most crucially dependent on glutamate-mediated fast excitatory drive onto DA neurons. We found that these mutant animals have prominent deficits in tasks requiring high levels of effort, while many other DA-related behaviors are unchanged.

Materials and methods

Mouse genetics

All mice were bred and housed in a conventional vivarium at the National Institutes of Health, Bethesda, MD, USA. Gria1-3fl/flGrin1fl/fl mice (F4 mice) were generated as described previously.33, 34 Gria1-3fl/flGrin1fl/fl/DAT-Cre (F4Cre mice) and Gria1-3fl/flGrin1fl/fl/DAT-Cre/tdTomato (F4Cre/tdTomato mice) mice were generated by crossing Gria1-3fl/flGrin1fl/fl mice with DAT-Cre (Slc6a3+/cre) mice35 or with Ai14 tdTomato reporter mice.36 Pups were kept with the dam until weaning at postnatal day 21. After weaning, juveniles were group housed by sex in standard plastic cages in groups not exceeding four per cage. Cages were maintained in ventilated racks in a temperature (20 °C) and humidity (55%) controlled vivarium on a 12 h circadian cycle, lights on from 0600 to 1800 h, and all behavioral testing took place during the light portion of this cycle. Standard rodent chow and reverse-osmosis water were available ad libitum unless otherwise noted. In addition to standard bedding, a cardboard tube and nesting material were provided in each cage. All behavioral testing was performed with male littermate mice aged 3–5 months. All procedures followed the Institute of Laboratory Animal Research guidelines and were approved by the Animal Care and Use Committee of the National Institute of Neurological Disorders and Stroke.

Electrophysiology

Horizontal slices (200 μm) containing the ventral tegmental area (VTA) were cut on a DTK Microslicer vibratome (Ted Pella, Redding, CA, USA) in chilled high-sucrose cutting solution, containing (in mM): KCl 2.5, CaCl2 0.5, MgCl2 7, NaH2PO4 1.25, NaHCO3 25, glucose 7, sucrose 210 and ascorbic acid 1.3. Freshly cut slices were placed in an incubating chamber containing carbogenated artificial cerebrospinal fluid (ACSF), containing (in mM) NaCl 119, KCl 2.5, NaHCO3 26, Na2PO4 1, glucose 11, CaCl2 2.5, MgCl2 1.3, and recovered at 32 °C for ~30–60 min. Slices were then maintained in ACSF at room temperature prior to recording. After 0.5–1 h of incubation at room temperature, slices were transferred to a submersion chamber on an upright Olympus microscope (Olympus USA, Center Valley, PA, USA) and perfused with ACSF containing the appropriate pharmacological reagents. Neurons were visualized by infrared differential interference contrast microscopy. The intracellular solution for AMPA and NMDA excitatory postsynaptic current (EPSC) recording contained (in mM) CsMeSO4 135, NaCl 8, HEPES 10, Na-GTP 0.3, Mg-ATP 4, EGTA 0.3, QX-314 5 and spermine 0.1. For recording action potentials, the intracellular solution contained (in mM) K-MeSO4 130, KCl 10, HEPES 10, NaCl 4, Mg-ATP 4, Na-GTP 0.3, EGTA 1. For recording spontaneous action potentials (Cre/tdTomato: n=19, F4Cre/tdTomato: n=22), whole-cell current clamp was performed. For high-frequency stimulus-induced action potentials, the stimulus electrode was placed in the rostral part of VTA and a train of 100 Hz stimuli (1 s) was applied. The duration of each stimulus was 10 μs, and the intensities ranged between 20 μA and 30 mA. GABAA receptor-mediated inhibitory postsynaptic potentials were blocked by 100 μM picrotoxin.
For recording evoked EPSCs (Cre/tdTomato, n=13, F4Cre/tdTomato, n=15; AMPA EPSCs were recorded at −70 mV and NMDA EPSCs were recorded at +40 mV), picrotoxin (100 μM) was added to the perfusing ACSF; DA neurons were identified by tdTomato fluorescence, and non-DA neurons were identified as neighboring non-fluorescent neurons in the VTA. Stimulation electrodes were placed at the rostral part of the VTA. All paired recordings (Cre/tdTomato n=14, F4Cre/tdTomato n=8) involved simultaneous whole-cell recordings from one tdTomato-positive (DA) neuron and a neighboring tdTomato-negative (non-DA) neuron. The stimulus was adjusted to evoke a measurable, monosynaptic EPSC in both cells. AMPA EPSCs were measured at a holding potential of −70 mV, and NMDA EPSCs were measured at +40 mV and at 150 ms after the stimulus, at which point the AMPA EPSC had completely decayed. Miniature EPSCs (mEPSCs) were acquired in the presence of 0.5–1 μM TTX and 100 μM picrotoxin and semiautomatically detected by offline analysis using in-house software in Igor Pro (Wavemetrics, Portland, OR, USA). Cells were recorded with 3- to 5-MΩ borosilicate glass pipettes, series resistance was monitored and not compensated, and cells in which series resistance varied by 25% during a recording session were discarded. Synaptic responses were collected with a Multiclamp 700B amplifier (Axon Instruments, Foster City, CA, USA), filtered at 2 kHz and digitized at 10 kHz.

Immunohistochemistry

Mice were transcardially perfused with 4% paraformaldehyde in 1 × phosphate-buffered saline. Brains were removed and post-fixed in 4% paraformaldehyde at 4 °C prior to being transferred to 30% sucrose in phosphate-buffered saline. Coronal sections (40 μm) were cut with a cryostat (Leica, Buffalo Grove, IL, USA), rinsed in 1 × phosphate-buffered saline, blocked for 1 h in 10% normal goat serum and incubated overnight at 4 °C with primary antibodies (GluA1, Millipore AB1504, 1:250; GluA2/3, Millipore AB1506, 1:100; GluA4, Millipore AB1508, 1:50; NR1, Millipore MAB363, 1:100; TH, Millipore AB152, 1:2000) in a buffer containing 0.25% Triton-X, 10% normal goat serum and 1 × phosphate-buffered saline. The sections were then incubated with Alexa-conjugated secondary antibodies (Alexa 594 and Alexa 488, 1:1000; Invitrogen, Carlsbad, CA, USA) at room temperature for 2 h. Brain sections were mounted in 4′,6′-diamidino-2-phenylindole dihydrochloride mounting medium (DAPI Fluoromount-G; SouthernBiotech, Birmingham, AL, USA). Fluorescent images were acquired on a Zeiss LSM 510 laser scanning confocal microscope using a × 10 air or × 40 oil objective with the pinhole set to 1 airy unit for all experiments. To quantify the fluorescence intensity of the GluA4 subunit, representative images were acquired on the Zeiss LSM 510 confocal microscope using a × 40 oil objective (Cre/tdTomato n=24, F4Cre/tdTomato n=24). Multiple z-sections of neuronal somas were collected at 1.0 μm intervals with a zoom factor of 4. ImageJ (NIH, Bethesda, MD, USA) was used to calculate the corrected total cell fluorescence based on density, area and background fluorescence around the region of interest, and intensity data were analyzed with an unpaired t-test.
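The corrected total cell fluorescence mentioned above is commonly computed as the integrated density of the cell minus the product of the cell area and the mean background fluorescence. The text does not spell out the formula, so the following Python sketch assumes that common definition; the function name and the numbers are hypothetical and are not data from this study.

```python
def corrected_total_cell_fluorescence(integrated_density, cell_area, mean_background):
    """Assumed definition: CTCF = integrated density - (cell area * mean background)."""
    return integrated_density - cell_area * mean_background


# Hypothetical ImageJ measurements for one soma (arbitrary units).
ctcf = corrected_total_cell_fluorescence(
    integrated_density=50000.0,  # sum of pixel values inside the region of interest
    cell_area=200.0,             # area of the region of interest
    mean_background=40.0,        # mean gray value of nearby background regions
)
print(ctcf)
```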

Fast-scan cyclic voltammetry

Sagittal brain slices (240 μm) from mice (12–14 weeks old) were obtained using a vibratome (VT-1200S Leica) and ice-cold cutting solution containing (in mM) 225 sucrose, 13.9 NaCl, 26.2 NaHCO3, 1 NaH2PO4, 1.25 glucose, 2.5 KCl, 0.1 CaCl2, 4.9 MgCl2 and 3 kynurenic acid. The slices were recovered for 30 min at 33 °C in ACSF containing (in mM) 124 NaCl, 1 NaH2PO4, 2.5 KCl, 1.3 MgCl2, 2.5 CaCl2, 20 glucose, 26.2 NaHCO3 and 0.4 ascorbic acid, and maintained at room temperature prior to recordings. For recordings, slices were submerged in a chamber with continuous perfusion (2 ml min−1) of ACSF and kept at 32 °C using an in-line heater.

Cylindrical carbon fiber (7 μm diameter) electrodes (~150 μm of exposed fiber) were inserted in the dorsal striatum or the nucleus accumbens. Carbon fiber electrodes were held at −0.4 V versus Ag/AgCl and a triangular voltage ramp (–0.4 to +1.2 and back to –0.4 V at 0.4 V ms−1) was delivered every 100 ms. DA transients were evoked by electrical stimulation. A glass pipette filled with ACSF was placed near the tip of the carbon fiber (~100–200 μm) and a rectangular pulse (0.1 ms) was applied every 2 min. For the input/output experiments the stimulus intensity was increased in 10 μA steps from 10 to 100 μA, then to 120, 150, 200, 300, 400 and 500 μA. For the rest of the experiments the amplitude of the current pulse (200 μA) was adjusted to use the minimal current needed to generate a maximal and stable response. For the train stimulations we delivered four pulses at 1, 5, 10, 20, 50 and 100 Hz. Data were collected with a retrofit headstage (CB-7B/EC with 5 MΩ resistor) using a Multiclamp 700B amplifier, low-pass filtered at 10 kHz and digitized at 100 kHz using pClamp10 software (all from Molecular Devices, Sunnyvale, CA, USA). For analysis, baseline voltammograms before stimulation were averaged and subtracted from the voltammograms during and after stimulation, and transients were calculated from the oxidation peak region using custom-written analysis software in Igor Pro (Wavemetrics). The peak current amplitude of the evoked DA transients was converted to DA concentration according to the post-experimental calibration of the carbon fiber electrodes with DA (1 μM) applied locally through a glass pipette in the recording chamber.
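Two pieces of arithmetic are implicit in this paragraph: the duration of each voltage sweep and the conversion of peak current to DA concentration. The Python sketch below illustrates both. It assumes a linear, single-point calibration through the origin, which the text does not state explicitly; the function names and current values are hypothetical.

```python
def ramp_duration_ms(v_hold, v_peak, scan_rate_v_per_ms):
    """Duration of one triangular sweep (hold -> peak -> hold) in ms."""
    return 2 * abs(v_peak - v_hold) / scan_rate_v_per_ms


def peak_current_to_da_um(peak_current_na, calib_current_na, calib_conc_um=1.0):
    """Convert an evoked peak current (nA) to DA concentration (uM).

    Assumption: the electrode response is linear through the origin, so the
    current measured at one known concentration (1 uM here) sets the scale.
    """
    if calib_current_na <= 0:
        raise ValueError("calibration current must be positive")
    return peak_current_na * calib_conc_um / calib_current_na


# Sweep from -0.4 V to +1.2 V and back at 0.4 V/ms: 3.2 V in total, about 8 ms,
# so each sweep occupies a small fraction of the 100 ms repetition interval.
print(ramp_duration_ms(-0.4, 1.2, 0.4))

# Hypothetical electrode: 10 nA at 1 uM DA, evoked peak of 5 nA.
print(peak_current_to_da_um(5.0, 10.0))
```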

Drugs and treatments

Cocaine hydrochloride (20 mg kg−1 IP) was obtained from the National Institute on Drug Abuse (NIDA) and dissolved in sterile 0.9% saline (5 mg ml−1). DHβE (dihydro-β-erythroidine hydrobromide) was purchased from Tocris (Bristol, UK), and all other pharmacological reagents were purchased from Abcam (Cambridge, MA, USA). All other chemicals were from Sigma (St. Louis, MO, USA). In the Place Preference, Operant and Effort procedures, 20 mg chocolate-flavored Dustless Precision Pellets (Bio-Serv, Flemington, NJ, USA) were used.

Open-field activity

Open-field activity was measured to evaluate unconditioned behavior of the two mouse lines (F4 n=12, F4Cre n=12). Locomotor activity was measured using an Omnitech Electronics Digiscan infrared photocell system (model RXYZCM; Omnitech Electronics, Columbus, OH, USA). Horizontal activity was measured with 16 pairs of infrared photocells located every 2.5 cm per side in a plane 2 cm above the floor of the arena. A second side-to-side array of 16 pairs of additional photocells located 5.5 cm above the arena floor measured vertical activity. The arena was divided into Center and Corner zones to assess exploratory behaviors. Animals were placed singly in a clear Plexiglas arena (40 × 40 × 30 cm) covered with a Plexiglas lid with multiple holes to ensure adequate ventilation. Data were obtained over 90-min sessions, with measurements broken down into 5-min intervals. Data were automatically gathered and transmitted to a computer via an Omnitech Model DCM-I-BBU analyzer. Activity data over the 18 5-min intervals were analyzed using a two-way repeated-measures analysis of variance (ANOVA), and activity totals over the 90-min session were analyzed with unpaired t-tests.

Rotarod

The rotarod test was used to assess motor coordination, balance and motor learning ability (F4 n=17, F4Cre n=17). The rotating drum (Ugo-Basile Mouse Rotarod 47600, Comerio, Italy) accelerated from a speed of 4–40 r.p.m. over 300 s. Falls from the drum were detected automatically by pressure on a plastic plate at the bottom of the apparatus, and the latency to fall off the rotarod within 300 s was recorded. If an animal stayed on the drum for the entire duration of the test, a score of 300 s was recorded. Each animal was given three trials per day with 5-min breaks in between for five consecutive days, followed by a recall test 1 week later. To control for situations when animals may have accidentally slipped from the drum immediately after being placed on it, the lowest score for each animal on each day was dropped and the two longest fall latencies on each day were averaged to give a daily score for analysis. Average fall latencies across the testing period were analyzed with a two-way repeated-measures ANOVA.
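The daily scoring rule described above (cap at 300 s, drop the shortest of three trials, average the remaining two) can be sketched in Python as follows. The function name and the example latencies are hypothetical.

```python
def daily_rotarod_score(latencies_s, max_latency_s=300.0):
    """Average of the two longest fall latencies out of three daily trials."""
    if len(latencies_s) != 3:
        raise ValueError("expected exactly three trials per day")
    # Animals that stay on for the whole test receive the maximum score.
    capped = [min(t, max_latency_s) for t in latencies_s]
    # Drop the lowest score to discount accidental early slips.
    best_two = sorted(capped)[1:]
    return sum(best_two) / 2


# Hypothetical day: one early slip (12 s) and two longer runs.
print(daily_rotarod_score([12.0, 150.0, 210.0]))
```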

Sucrose preference

The ability to respond to a naturally rewarding stimulus was measured using the sucrose preference task (F4 n=8, F4Cre n=8). Individually housed mice were habituated to a two bottle paradigm in their home cage over 24 h with both bottles containing water. Twenty-four hours later, mice were given a free choice between two bottles, one with 2% (wt/vol) sucrose solution and another with tap water. The intake of water and the sucrose solution were measured by weighing the bottles every 24 h for five consecutive days. To control for side preferences the location of the sucrose and water bottles was alternated every 24 h. To control for liquid spillage due to moving the bottles and cages, additional water and sucrose bottles were placed on empty cages and treated identically to the other bottles. The average amount of liquid lost from these control bottles each day was subtracted from the daily averages for each type of fluid. The sucrose preference score was calculated as the amount of sucrose solution consumed relative to the total amount of liquid consumed (sucrose solution intake/total intake). Sucrose preferences over the 5-day period were analyzed using a two-way repeated-measures ANOVA, and the overall average preference score was analyzed with an unpaired t-test.
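As an illustration of the preference score with the spillage correction, a minimal Python sketch is given below. It assumes intake is measured as bottle weight loss in grams and that corrected values are floored at zero; both the floor and the example numbers are assumptions, not details from the study.

```python
def sucrose_preference(sucrose_loss_g, water_loss_g,
                       sucrose_spill_g=0.0, water_spill_g=0.0):
    """Sucrose intake / total intake after subtracting control-bottle losses."""
    # Floor at zero so a spill estimate larger than the measured loss
    # cannot yield a negative intake (an assumption of this sketch).
    sucrose = max(sucrose_loss_g - sucrose_spill_g, 0.0)
    water = max(water_loss_g - water_spill_g, 0.0)
    total = sucrose + water
    if total == 0:
        raise ValueError("no fluid consumed; preference is undefined")
    return sucrose / total


# Hypothetical 24-h measurement: 8.5 g sucrose and 2.5 g water lost,
# with 0.5 g lost from each control bottle on an empty cage.
print(sucrose_preference(8.5, 2.5, 0.5, 0.5))
```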

Food preference

Mice (F4 n=11, F4Cre n=12) were individually housed and habituated to the testing procedure for 2 days during which they were allowed ad lib access to standard laboratory chow placed into two clear plastic dishes on the floor of the cage. Food was weighed daily, with care being taken to collect pieces that had been scattered on the floor of the cage. For preference testing, mice were given ad lib access to two food dishes, one containing standard laboratory chow and the other chocolate Bio-Serv food pellets. The food dishes were placed in the back of the cages, and the location (right or left) was counterbalanced to avoid side preferences. The amount of each type of food consumed was monitored daily, with food being weighed and replaced every 24 h. Food preferences over the 5-day period were calculated as the amount of chocolate pellets consumed relative to the total amount of food (pellets plus chow) consumed for each strain. Food preferences over the 5-day period were analyzed using a two-way repeated-measures ANOVA, and the overall average preference score was analyzed with an unpaired t-test.

Novel object recognition

The object recognition task was used to assess the animal’s ability to recognize an object with a novel physical appearance (F4 n=11, F4Cre n=12). This test utilizes the natural inclination of a rodent to spend more time interacting with a new object than with a familiar one. Object recognition testing was performed in an opaque white plastic box (50 × 50 × 30 cm) under low light levels (~120 lux). Animals were habituated to the empty test apparatus for 15 min one day prior to the start of recognition testing. Object recognition testing had two phases, the familiarization phase and the recognition test. During the familiarization phase two identical objects (A1 and A2) were placed in the apparatus 10 cm away from two adjacent corners of the box. Objects were of sufficient weight and size to ensure they could not be moved or knocked over by the animals. The mice were placed in the chamber facing the wall farthest from the objects and allowed to explore the apparatus and objects for 10 min. Animals were then returned to their home cage for a 1-h intertrial interval (ITI), and then were reintroduced to the apparatus for a 5-min recognition test. During this test the apparatus contained one object identical to those used in the familiarization trial (A3) and one novel object (B1), which were placed in the same spatial locations as the objects used in the familiarization phase. All objects used were approximately the same size and shape, but differed in color and textural characteristics. The identity and position of the novel and familiar objects were counterbalanced across groups. The AnyMaze video tracking system (Stoelting, Kiel, WI, USA) was used to monitor and score behaviors. Object exploration time during the familiarization and test phases was analyzed with a mixed-model ANOVA. The discrimination ratio was calculated as the time spent interacting with object B1 divided by total object exploration time (A3+B1) during the recognition test, and was analyzed with an unpaired t-test.
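The discrimination ratio defined above can be written as a short Python function. The name and the example exploration times are hypothetical; a ratio of 0.5 corresponds to equal exploration of the two objects.

```python
def discrimination_ratio(novel_time_s, familiar_time_s):
    """Time with novel object B1 / total exploration time (A3 + B1)."""
    total = novel_time_s + familiar_time_s
    if total <= 0:
        raise ValueError("no object exploration recorded")
    return novel_time_s / total


# Hypothetical 5-min test: 30 s on the novel object, 20 s on the familiar one.
print(discrimination_ratio(novel_time_s=30.0, familiar_time_s=20.0))
```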

Spontaneous alternation

The spontaneous alternation test was used to assess working memory and is based on the fact that mice prefer to visit less recently entered areas, which implies that the animal must recall which arm it visited last (F4 n=11, F4Cre n=12). Testing was conducted in a Y-maze constructed of opaque gray Plexiglas. Choice arms were 35 cm long, 10 cm wide and 15 cm high. Mice were allowed to freely explore the entire maze for 6 min, and the total number of arm entries as well as the number of complete alternations in which the animal entered each of the three arms in turn was tracked and analyzed with AnyMaze software. The percent alternation was calculated as the number of correct alternations (entry into all three arms on consecutive choices) compared with the total number of alternations (sequences of entering into any three arms). Total arm entries and percent alternation were analyzed with unpaired t-tests.
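One way to compute percent alternation from a recorded sequence of arm entries is sketched below in Python. It assumes overlapping triplets of consecutive entries are scored, which is a common convention but is not stated in the text; the arm labels and sequences are hypothetical.

```python
def percent_alternation(arm_entries):
    """Percent of consecutive entry triplets that cover all three arms.

    Assumption: triplets overlap, so a sequence of n entries yields n - 2
    opportunities to alternate.
    """
    n_triplets = len(arm_entries) - 2
    if n_triplets < 1:
        raise ValueError("need at least three arm entries")
    correct = sum(
        1
        for i in range(n_triplets)
        if len(set(arm_entries[i:i + 3])) == 3
    )
    return 100.0 * correct / n_triplets


# Hypothetical sequence: triplets ABC, BCA, CAC, ACB -> three of four alternate.
print(percent_alternation(list("ABCACB")))
```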

Spatial recall

Long-term spatial memory was assessed with a non-matching-to-sample task in the Y-maze apparatus described above. In this task (F4 n=11, F4Cre n=12), the three arms of the maze were designated as the Start Arm, Familiar Arm and Novel Arm. At the beginning of the test, a barrier was placed at the entrance to the Novel Arm. The mouse was then placed in the Start Arm and allowed to explore the Start Arm and Familiar Arm for 10 min. The mouse was then removed from the apparatus and placed in its home cage for 1 h. In the test stage, the barrier at the entrance to the Novel Arm was removed and the mouse was allowed to explore the entire apparatus for 10 min. Movement was tracked and analyzed with the AnyMaze software, and time spent in the Novel Arm relative to the Familiar Arm as well as latency to enter the Novel Arm were analyzed with unpaired t-tests.

Radial arm maze

The radial arm maze consisted of eight arms radiating from a central octagonal platform. The maze was constructed of opaque gray plastic with a 30 cm diameter center arena and arms that were 86 cm × 10 cm × 10 cm. Opaque food wells (2 cm in diameter and 0.5 cm deep) in which reward pellets could be placed were located at the end of each arm. The test room contained a variety of salient visual cues. Animals (F4 n=7, F4Cre n=8) were placed on a restricted feeding schedule beginning 2 days prior to the start of testing, and were maintained at 85–90% free-feeding body weight for the duration of the procedure. Animals were given 2 days to acclimate to the maze. During the acclimation phase, two cagemate mice were placed in the maze and allowed to freely explore the apparatus. Following acclimation, mice underwent 6 days of habituation. During habituation, mice were allowed to explore the apparatus for 10 min with food rewards scattered throughout the arms (Days 1 and 2), then at the ends of the arms (Days 3 and 4), and then only in the food cups (Days 5 and 6). In the testing phase, only four arms of the maze were baited with one food pellet each. For each mouse the same arms were baited on every trial, and the same pattern of baited and unbaited arms was used for all animals, with the orientation being rotated around the maze between mice to reduce any potential effects of location bias. At the beginning of each test trial, the mouse was placed in the center of the maze and allowed to explore the apparatus for 10 min or until all food rewards had been retrieved. This procedure was repeated for 10 days. The primary measures were number of correct entries into baited arms, number of entries into unbaited arms (reference memory errors), number of reentries into baited arms (working memory errors) and test duration. Arm entries were tracked with the AnyMaze software, and analyzed with two-way repeated-measures ANOVAs.
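The three entry-based measures defined above can be derived from an ordered list of arm entries. The following Python sketch is one possible scoring routine under those definitions; the arm numbering, function name and example trial are hypothetical.

```python
def score_radial_arm_trial(arm_entries, baited_arms):
    """Classify each arm entry in one trial.

    correct           : first entry into a baited arm
    reference_errors  : any entry into an unbaited arm
    working_errors    : re-entry into a baited arm already visited
    """
    baited = set(baited_arms)
    visited_baited = set()
    counts = {"correct": 0, "reference_errors": 0, "working_errors": 0}
    for arm in arm_entries:
        if arm not in baited:
            counts["reference_errors"] += 1
        elif arm in visited_baited:
            counts["working_errors"] += 1
        else:
            counts["correct"] += 1
            visited_baited.add(arm)
    return counts


# Hypothetical trial with arms 1, 3, 5 and 7 baited.
print(score_radial_arm_trial([1, 2, 3, 1, 5, 7], baited_arms=[1, 3, 5, 7]))
```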

Cocaine-induced locomotor activity

The role of glutamate input in cocaine-induced locomotor activity was assessed through a behavioral sensitization procedure. Behavior (F4 n=12, F4Cre n=12) was monitored in the apparatus used for open-field activity described above. Prior to drug administration animals were habituated to the apparatus for two consecutive days. On the third day, animals were given a 90-min habituation period and were then removed from the apparatus and injected with saline. After injection animals were immediately returned to the arena and activity was monitored for 90 min. Beginning the following day, mice were given five consecutive days of cocaine (20 mg kg−1) injections. Each day animals were given a 90-min habituation session, and were then injected with cocaine and behavior was monitored for an additional 90 min. Locomotor activity was measured as total distance travelled over 90 min following injection of saline or cocaine, and data were analyzed through a two-way repeated-measures ANOVA.

Conditioned place preference

The ability to form conditioned associations to both drug and non-drug reinforcers was measured through the conditioned place preference (CPP) procedure. The CPP apparatus consisted of two large compartments (20 × 20 × 20 cm) that differed in both visual and tactile cues, separated by a smaller center compartment (10 × 10 × 20 cm). Gateways between the compartments were equipped with a photo beam array (National Instruments NI cFP-2000 detector) and Labview software (developed by George Dold, NIH) to monitor when the animal transitioned between compartments. To allow assessment of the development of the place preference, a five-test design was used. In the first test session, animals were placed in the center compartment and allowed 20 min to explore the entire apparatus. Based on this test, a biased design was used in which each animal was assigned to receive the reinforcer (chocolate-flavored food pellets, Bio-Serv 20 mg Dustless Precision Pellets, or 20 mg kg−1 cocaine) in its initially least preferred side. On the day following the first test, in the morning animals were confined to their initially most preferred side of the apparatus for 15 min. In the cocaine CPP procedure, animals received an injection of saline prior to being placed in the chamber. Approximately 3 h later, animals were confined to their least preferred side of the apparatus for 15 min. In the food CPP procedure (F4 n=8, F4Cre n=7), 50 chocolate food pellets were scattered throughout the chamber. In the cocaine CPP procedure, animals (F4 n=8, F4Cre n=8) were injected with 20 mg kg−1 cocaine (IP) immediately before being placed in the apparatus. The following day, animals were allowed to explore the entire apparatus for 20 min (Test 2). This cycle of alternating test and conditioning days was repeated until animals had received a total of five tests with four intervening conditioning days. 
During the food conditioning procedure, animals were maintained at 85–90% free-feeding body weight for the duration of the experiment. The preference for the reinforcer-paired side was calculated as the time spent in the least preferred side divided by the total time spent in the most and least preferred sides. Side preferences across the five-test trials were analyzed using a two-way repeated-measures ANOVA.
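The side-preference score described above excludes time spent in the center compartment. A minimal Python sketch, with a hypothetical function name and hypothetical times, is:

```python
def cpp_preference(time_least_preferred_s, time_most_preferred_s):
    """Fraction of side time spent in the reinforcer-paired side.

    The reinforcer was paired with the initially least preferred side, and
    time in the center compartment is not part of the denominator.
    """
    total = time_least_preferred_s + time_most_preferred_s
    if total <= 0:
        raise ValueError("no time recorded in either side compartment")
    return time_least_preferred_s / total


# Hypothetical 20-min test: 480 s in the paired side, 320 s in the other side,
# with the remaining time spent in the center compartment.
print(cpp_preference(480.0, 320.0))
```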

Operant conditioning

Operant conditioning can be used to model a wide range of behaviors including learning and memory, attention, and reward. For all operant procedures, animals were maintained at 85–90% free-feeding body weight. Conditioning took place in sound-attenuated operant chambers (ENV-300; Med Associates, Fairfax, VT, USA). The chambers were illuminated by a house light that turned on at the beginning of each session, a fan provided white noise and ventilation, and each chamber was equipped with two nose-poke apertures on either side of a receptacle connected to a food pellet dispenser. One nose-poke aperture was designated as active and the other as inactive, with the locations counterbalanced across animals. Responding in the active port resulted in the delivery of a chocolate food pellet (20 mg; Bio-Serv), while responses in the inactive port were recorded but had no consequences. For two days prior to the start of conditioning, animals (F4 n=9, F4Cre n=12) underwent two pre-training sessions in which food pellets were delivered on a variable interval schedule (mean of 45 s, range 4–132 s) for 45 min, during which time responses in either of the nose-poke apertures had no consequence. Following the two pre-training sessions, animals began 45-min sessions for 10 consecutive days on a fixed ratio 1 (FR1) schedule of reinforcement, during which time each nose poke in the active aperture resulted in the delivery of a food reinforcer. Following the 10 days at FR1, animals were transferred to an FR5 schedule for 3 days, during which five responses in the active nose-poke aperture were needed to deliver one food pellet. Animals then underwent extinction, when responses in the active nose-poke aperture did not have any consequence, for 6 days. Once response levels in the active port reached similar levels to responses in the inactive port, a reinstatement phase began in which the response schedule was returned to FR1. During reinstatement the first food pellet was delivered after 1 min or the first response in the active nose-poke aperture, whichever came first.

To assess the motivation to work for a food reward, a separate group of mice (F4 n=10; F4Cre n=10) was trained for progressive ratio (PR) responding. Animals underwent 2 days of pre-training as described above, followed by training on an FR1 schedule until the mouse earned 50 pellets in two consecutive sessions. To familiarize animals with a schedule requiring more effort, a PR3 schedule was used for 5 days. During this time, a linear increase in responses (3, 6, 9, 12, 15, etc.) was needed for delivery of each subsequent food pellet. Animals then underwent 15 days of training on a PR7 schedule to assess motivation and willingness to work for a palatable food reward. In each PR session the break point, the final ratio completed that resulted in the delivery of a food pellet, was recorded. All PR sessions were open-ended, with the session continuing until the animal went 5 min without a response in the active nose-poke port. Differences between groups in time to acquire stable FR1 responding (number of days to reach 50 responses in a 45-min session) were analyzed with an unpaired t-test, and break points over the PR3 and PR7 sessions were analyzed separately with two-way repeated-measures ANOVAs.
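The linear PR schedule and the break point can be illustrated with a short Python sketch. The PR3 sequence (3, 6, 9, ...) is given in the text; treating PR7 as the analogous sequence 7, 14, 21, ... is an assumption, as are the function names and the example response counts.

```python
def pr_requirements(step, n_pellets):
    """Responses required for each successive pellet on a linear PR schedule."""
    return [step * (i + 1) for i in range(n_pellets)]


def break_point(total_active_responses, step):
    """Last ratio fully completed, given the active responses in a session.

    Returns 0 if the animal did not complete even the first ratio.
    """
    remaining = total_active_responses
    ratio = step
    completed = 0
    while remaining >= ratio:
        remaining -= ratio   # responses spent earning this pellet
        completed = ratio    # this ratio resulted in pellet delivery
        ratio += step        # next pellet costs `step` more responses
    return completed


print(pr_requirements(3, 5))   # PR3 requirements for the first five pellets
print(break_point(50, 7))      # hypothetical PR7 session with 50 responses
```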

Effort-based choice

To further assess the role of glutamate inputs on DA cells in motivation and effort, a concurrent choice task (modified from Cousins and Salamone37 and Cagniard et al.38) was used. All testing took place in sound-attenuated operant chambers (ENV-300; Med Associates) and animals (F4 n=10; F4Cre n=10) were maintained at 85–90% free-feeding body weight for the duration of the procedure. Testing lasted for 3 weeks and was conducted 5 days per week. Each week, on Days 1, 3 and 5, animals were allowed to either respond in a nose-poke aperture on an FR5 schedule for a highly palatable food (20 mg chocolate food pellets) or to consume standard rodent chow that was available in a dish on the floor of the operant chamber (‘Choice’ condition). On Days 2 and 4 only the food pellets delivered on the FR5 schedule were available (‘No choice’ condition). The number of nose-poke responses, quantity of standard chow consumed, total amount of food consumed (pellets plus chow) and the percentage of food obtained by nose-poke responding were recorded. Daily and average weekly scores for these measures were analyzed with two-way repeated-measures ANOVAs.
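The percentage of food obtained by responding can be sketched as follows in Python. The sketch assumes the percentage is computed by mass, using the stated 20 mg pellet weight; the function name and example values are hypothetical.

```python
PELLET_MASS_G = 0.020  # 20 mg pellets, as stated in the text


def percent_food_from_responding(pellets_earned, chow_consumed_g):
    """Percent of total food mass that was earned on the FR5 schedule.

    Assumption: food is compared by mass (earned pellets vs. freely
    available chow), which the text does not state explicitly.
    """
    pellet_g = pellets_earned * PELLET_MASS_G
    total_g = pellet_g + chow_consumed_g
    if total_g <= 0:
        raise ValueError("no food consumed")
    return 100.0 * pellet_g / total_g


# Hypothetical 'Choice' session: 100 pellets earned (500 responses on FR5)
# and 2.0 g of chow eaten from the dish.
print(percent_food_from_responding(100, 2.0))
```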

Data acquisition and analysis

Sample sizes were not computed in advance when the study was being designed. However, we performed post hoc power analyses using GPower 3.0. All positive findings fall within the accepted power range of 0.8–1.0. For animal studies, subject testing order and group assignments were pseudo-randomized to ensure strains and treatment groups were equally represented. For all replicated experiments, all groups were equally represented within each replicate. No blinding was done for animal studies. When possible, data were collected through automated systems (AnyMaze, Omnitech Electronics Digiscan, Ugo-Basile Mouse Rotarod 47600) to reduce potential effects of investigator bias. Acquisition and analysis of electrophysiology data were performed with Igor Pro (Wavemetrics) software. Voltammetry data were acquired and analyzed using custom-written analysis software in Igor Pro.39 Assessment of immunohistochemistry fluorescence intensity was performed with ImageJ (NIH). Statistical comparisons were performed using Igor Pro, GraphPad Prism or SPSS software. Comparisons between two groups were performed with t-tests, comparisons with three or more groups were performed with one-way ANOVAs and comparisons across multiple days or trials were performed with repeated-measures ANOVAs. Significance was set at P<0.05 for all tests (*P<0.05, **P<0.01, ***P<0.001, ****P<0.0001), and P>0.05 was considered not significant. Tukey’s or Sidak’s post hoc tests were used when appropriate. All statistical tests were two-sided. For all data, the estimate of variation was presented as the standard error of the mean (s.e.m.), and variances were found to be similar between groups. All data were presented as mean±s.e.m.
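For reference, the mean±s.e.m. summary used throughout can be computed as in the Python sketch below, which takes the s.e.m. as the sample standard deviation divided by the square root of n. The analyses in this study were run in Igor Pro, GraphPad Prism or SPSS; this snippet is only an illustration with hypothetical values.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, s.e.m.), with s.e.m. = sample SD / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical measurements from three animals.
m, s = mean_and_sem([1.0, 2.0, 3.0])
print(f"{m:.2f} ± {s:.2f}")
```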

Results

The vast majority of glutamatergic inputs onto DA neurons are lost in F4Cre mice

To study the role of glutamatergic input in DA-related behavior, we crossed a well-characterized DAT (DA transporter)-Cre knock-in mouse line (Slc6a3+/cre or DAT-Cre) that expresses Cre specifically in midbrain DA neurons35 with Gria1-3fl/flGrin1fl/fl (hereafter F4) mice, in which three genes encoding AMPAR subunits (GluA1, A2 and A3) plus the gene encoding GluN1 are all homozygous conditional alleles,33 to generate Gria1-3fl/flGrin1fl/fl/Slc6a3+/cre (hereafter F4Cre) mice (quadruple conditional knockout, KO) (Figure 1a). Immunohistochemical assays in DAT-Cre/Rosa26-tdTomato mice (Cre/tdTomato), generated by crossing DAT-Cre mice with Ai14 tdTomato reporter mice,36 confirmed Cre expression selectively in DA neurons, as indicated by colocalization of tdTomato (the product of Cre-mediated recombination) and tyrosine hydroxylase (TH), a cellular marker for DA neurons (Supplementary Figures S1A–C). In F4Cre mice, immunofluorescence labeling showed loss of the GluN1, GluA1 and GluA2/3 subunits in TH-positive neurons in the VTA (Figures 1b–d). There was no change in the expression of GluA4, the only remaining AMPAR subunit, in TH-positive VTA neurons, suggesting that this protein was not upregulated (Supplementary Figure S1D). These data indicate the specific loss of the targeted glutamate receptor subunits in VTA DA neurons in the F4Cre mice.

Figure 1

Generation and characterization of F4 and F4Cre mouse lines. (a) Schematic of Grin1 and Gria1-3 alleles in F4 and F4Cre mice. The DAT-Cre mouse line was crossed with a Gria1-3fl/flGrin1fl/fl (F4) line to generate Cre-positive Gria1-3+/flGrin1+/fl mice that were then crossed with Gria1-3fl/flGrin1fl/fl mice to produce the conditional knockout mouse line (F4Cre). (b–d) High-power (×40) images of staining for GluN1 and TH (b), for GluA1 and TH (c) and for GluA2/3 and TH (d) in ventral tegmental area (VTA) neurons. Scale bar: 25 μm. (e) mEPSC frequency but not amplitude in VTA DA neurons was strongly reduced in F4Cre/tdTomato mice (Cre/tdTomato, n=13, F4Cre/tdTomato, n=15). Scale bar: 10 pA, 0.5 s. The breeding scheme for all electrophysiological experiments is shown in Supplementary Figure S1B. (f) Stimulation-induced VTA DA neuron firing evoked by 100 Hz electrical stimulation (for 1 s) was lost in F4Cre/tdTomato mice (sample traces from stimulation at 0.5 mA). Scale bars: 25 mV/500 ms (top), 25 mV/200 ms (middle). (g) Representative DA transients (top) evoked by single pulse electrical stimulation in the dorsal striatum of brain slices from F4 (black) and F4Cre mice (red) measured by fast-scan cyclic voltammetry. Color plot voltammograms (bottom) and current–voltage plots (insets) showing the characteristic oxidation and reduction peaks for DA. (h) DA concentration and clearance properties of transients recorded in the dorsal striatum (dorsal, F4, n=7, F4Cre, n=6) and nucleus accumbens (NAc, F4, n=13, F4Cre, n=13) of F4 and F4Cre mice. (i) Representative DA transients evoked by single pulse electrical stimulation before (thin) and after (thick) a 30-min application of cocaine (1 μM) in the NAc of F4 (black, n=3) and F4Cre mice (red, n=3) (*P<0.05, ****P<0.0001 and P≥0.05, no significance). All data were presented as mean±s.e.m. DA, dopamine; EPSC, excitatory postsynaptic current.

To facilitate electrophysiological analysis, we also crossed Gria1-3fl/flGrin1fl/fl/DAT-Cre mice with the Ai14 Cre-mediated tdTomato reporter mouse line to generate Gria1-3fl/flGrin1fl/fl/DAT-Cre/tdTomato mice (hereafter F4Cre/tdTomato), in which Cre expression led to genetic deletion of NMDARs and AMPAR GluA1, A2 and A3 subunits as well as tdTomato expression in DA neurons as a fluorescent marker (Supplementary Figures S1B and C). Paired whole-cell voltage clamp recordings in VTA DA (tdTomato-positive) versus non-DA (tdTomato-negative) neurons in acute brain slices prepared from F4Cre/tdTomato mice confirmed the loss of synaptic NMDARs, as there were no measurable EPSCs recorded in DA neurons evoked by a single stimulation at 0.1 Hz at +40 mV holding potential (Supplementary Figures S1E and F). In contrast, NMDA EPSC amplitudes were similar between DA and non-DA neurons from Cre/tdTomato mice (Supplementary Figures S1E and F). In addition, paired whole-cell recordings of AMPAR-mediated evoked EPSCs at −70 mV demonstrated a nearly 90% reduction of AMPA EPSC amplitude in DA neurons in F4Cre/tdTomato mice (Supplementary Figures S1G and H). Furthermore, AMPA mEPSC recording at −70 mV showed a ~90% reduction of mEPSC frequency without change of mEPSC amplitude in DA neurons in F4Cre/tdTomato mice (Figure 1e). Taken together, these results indicate a complete loss of NMDA EPSCs and a dramatic reduction (~90%) of AMPA EPSCs in VTA DA neurons in F4Cre/tdTomato mice. These data also suggest that the remaining GluA4 subunit can traffic to a small subpopulation of synapses in VTA DA neurons in F4Cre/tdTomato mice and accounts for the residual AMPA EPSCs. Consistent with this, the decay constant of mEPSCs in Cre-positive VTA neurons in F4Cre/tdTomato was significantly faster than in control DA neurons in Cre/tdTomato mice, which is a characteristic of GluA4 homomeric receptors40 (Supplementary Figure S1I).

Current clamp recording of VTA DA neurons revealed similar rates of spontaneous firing between control and F4Cre/tdTomato mice (Figure 1f). Local stimulation at 100 Hz generated transient burst activity in VTA DA neurons across a wide range of stimulation intensities in control, but not in F4Cre/tdTomato mice (Figure 1f). These data indicate that loss of NMDAR- and most AMPAR-mediated synaptic transmission onto VTA DA neurons in F4Cre/tdTomato mice impairs stimulation-induced DA neuron firing in vitro.

We also characterized the DA signals in brain slices containing the striatum using fast-scan cyclic voltammetry, and found that DA transients evoked by single pulse stimulation (Figure 1g), input–output curves (Supplementary Figure S1J) and the effect of cocaine (Figure 1i) or nicotinic receptor antagonist DHβE (Supplementary Figure S1K) on DA transients in striatum were intact in F4Cre mice, although a small change in DA clearance was observed in F4Cre mice (Figure 1h). Furthermore, the DA signals evoked by trains showed no differences between genotypes when tested in basal conditions, but a modest decrease was detected at 50 and 100 Hz in the presence of DHβE (Supplementary Figure S1M). These results suggest that the dopaminergic system in F4Cre mice is largely preserved with a modest change in DA clearance and in DA release in response to high-frequency stimulation in the presence of DHβE.

Characterization of general behaviors in F4Cre mice

F4Cre mice were born and survived at the expected Mendelian ratio and had normal body weight (Supplementary Figures S2A–C). There were no differences between genotypes in either an open-field activity assessment (Figure 2a and Supplementary Figures 2D–H) or motor learning during a rotarod task (Figure 2b). In addition, the F4Cre mice performed normally in tests of memory including novel object recognition (Supplementary Figure 3A), spontaneous alternation (Supplementary Figure 3B), spatial recall (Supplementary Figure 3C) and radial arm maze-based reward learning (Supplementary Figures 3D–I). We also examined the response to natural reinforcers with 5-day preference tests for both sucrose and chocolate food pellets. Although the F4Cre mice showed a decreased preference for sucrose on day 1 (Figure 2c), this difference was not maintained and there was no difference in the average sucrose consumption (Figure 2c). Similarly, no differences were seen in preferences for chocolate-flavored food pellets compared with standard rodent chow (Figure 2d). These data show that the NMDAR-mediated and ~90% of AMPAR-mediated glutamatergic input onto DA neurons is dispensable for many general behavioral functions.

Figure 2

General behavioral characterization of F4Cre mice. (a) No differences were seen in open-field activity (F4 n=12, F4Cre n=12). (b) An accelerating rotarod test revealed no differences between groups (F4 n=17, F4Cre n=17). (c) In a 5-day sucrose preference test, the F4Cre mice displayed a lower preference for sucrose solution than the F4 animals on the first day (***P<0.001), although no other group differences were seen (F4 n=8, F4Cre n=8). (d) No differences were seen in preferences for chocolate food pellets (F4 n=11, F4Cre n=12). (e) Animals from both strains showed a similar significant increase in activity in response to 20 mg kg−1 cocaine (F4 n=12, F4Cre n=12). (f–h) Experimental set-up of conditioned place preference (CPP) experiments (f). No differences were seen in CPPs induced by food (F4 n=8, F4Cre n=7) (g) or 20 mg kg−1 cocaine (F4 n=8, F4Cre n=8) (h). Least preferred side (LPS). (***P<0.001 and P≥0.05, no significance). All data were presented as mean±s.e.m. FR, fixed ratio; KO, knockout.

F4Cre mice can form cue–reward associations

DA neurons are thought to play an important role in drug abuse.19, 41 Repeated injections of stimulant drugs induce behavioral sensitization that corresponds to certain aspects of drug reward and addiction-related behaviors.42 We monitored locomotor activity in response to five consecutive days of cocaine (20 mg kg−1) injections. Both mouse lines showed a significant increase in distance travelled following the cocaine injections, with no differences between genotypes (Figure 2e). The conditioned place preference (CPP) test is widely used to assess the ability to form stimulus-mediated associative memories.24 In a test of CPP learning, when either chocolate-flavored food or 20 mg kg−1 cocaine was used as the conditioned stimulus, both F4 and F4Cre mice showed a similar increase in preference for the stimulus-paired chamber (Figures 2f–h). These data indicate that these behavioral responses to rewards can be formed in F4Cre mice, in which the majority of glutamatergic input onto DA neurons, mediated by both NMDARs and AMPARs, is lost.

Given the importance of DA neurons in instrumental responding,43 we next tested how F4Cre mice would perform in a series of operant conditioning tasks. To explore the role of glutamatergic input onto DA neurons in operant conditioning (Figure 3a), we trained animals to respond in a nose-poke hole to earn a food reinforcer on both FR1 and FR5 schedules (Figures 3b and c). There were no differences between F4 and F4Cre mice in the number of responses or the rate of acquisition of these tasks. Animals then underwent an extinction phase, in which responses in the nose-poke holes did not result in the delivery of a reinforcer. Again, no differences were seen between genotypes (Figure 3d). When food pellets were returned on an FR1 schedule, both F4 and F4Cre animals rapidly reacquired responding at similar rates (Figure 3e). These data show that F4Cre mice can perform instrumental learning tasks.

Figure 3

F4Cre mice show normal instrumental learning and decreased motivation under the PR7 schedule. (a) Set-up of the chambers used for operant conditioning. (b–e) There were no differences between genotypes during (b) FR1, (c) FR5, (d) extinction or (e) reinstatement (F4 n=9, F4Cre n=12). (f, g) There were no differences between genotypes in time to acquire an FR1 operant task (the criterion for acquisition was 50 responses in a 45 min session) (f). Under a PR7 response schedule, genotype differences emerged at 7 days of responding, with the F4Cre animals having lower break points than the F4 mice on days 7–15 (g) (F4 n=10; F4Cre n=10) (*P<0.05, **P<0.01, ***P<0.001 and P≥0.05, no significance). All data were presented as mean±s.e.m. FR, fixed ratio; KO, knockout; PR, progressive ratio.

Glutamatergic inputs onto DA neurons are important for motivation to work for rewards

While operant responding on a fixed ratio schedule is a good assessment of the ability to learn to respond to earn a reinforcer, it does not necessarily measure the subject’s level of motivation or willingness to work to receive the reward. We thus trained mice on a progressive ratio (PR) schedule of reinforcement, in which each subsequent food pellet required a greater number of responses. Each session continued until the animal did not respond for 5 min, and the last response requirement that the animal completed to earn a reward was taken as the break point, a measure of motivation.44 Mice were initially trained to respond in the active nose-poke hole on an FR1 schedule until they earned at least 50 pellets over a 45-min session (Figure 3f, Supplementary Figures 4A and B). Once an animal met the acquisition criterion, it was trained on a PR3 schedule for 5 days (Supplementary Figures 4C and D). After this habituation to the PR task, it was tested under a PR7 schedule for 15 days to assess motivation to work for the chocolate food pellets. Beginning on Day 7 and continuing through Day 15, the F4Cre animals had lower break points (Figure 3g) and stopped responding sooner (Supplementary Figure 4E) than the F4 mice, indicating a decreased motivation to work for the food reinforcer.
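The break-point measure can be made concrete with a short Python sketch. The text does not give the exact progression of the PR3 and PR7 schedules, so the sketch assumes an arithmetic progression in which the requirement starts at the step size and grows by that step after each pellet (7, 14, 21, ... for PR7). The function name `break_point` and this progression are assumptions for illustration, not the schedule confirmed by the source.

```python
def break_point(total_responses: int, step: int):
    """Return (break_point, pellets_earned) for one PR session.

    Assumed schedule (not specified in the source): the n-th pellet
    requires n * step responses. `total_responses` is the number of
    nose-pokes made before the 5-min pause that ends the session.
    """
    if step <= 0:
        raise ValueError("step must be a positive integer")
    requirement = step      # responses needed for the next pellet
    remaining = total_responses
    last_completed = 0      # last ratio that ended in a delivered reward
    pellets = 0
    while remaining >= requirement:
        remaining -= requirement
        last_completed = requirement
        pellets += 1
        requirement += step  # each pellet costs more than the previous one
    return last_completed, pellets


if __name__ == "__main__":
    # Illustrative numbers only: 50 responses under the assumed PR7 progression.
    print(break_point(total_responses=50, step=7))
```

Under this assumed progression, an animal that stops after 50 responses has completed the ratios 7, 14 and 21, so its break point is 21; responses made toward the unfinished ratio of 28 do not count.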

Glutamatergic control of DA neurons plays a crucial role in high-effort behaviors

To further explore the role of glutamatergic input onto DA neurons in effort-related behaviors, we used a cost–benefit operant procedure.38 This procedure consisted of 3 weeks of 5-day behavioral testing (Figure 4a). On Days 1, 3 and 5 the animals could either respond in a nose-poke hole on an FR5 schedule to earn a chocolate food pellet, or consume standard laboratory chow that was freely available in the operant chamber (Figure 4b). On Days 2 and 4 of each week the animals could respond on the FR5 schedule for the chocolate food pellets, but the standard chow was not concurrently available (Figure 4g). When animals had a choice between free access to standard chow and FR5 access to chocolate food pellets, the F4Cre mice showed decreased motivation to work for the chocolate food (Figure 4c). Specifically, the F4Cre mice showed a decrease in responding for the chocolate food pellets (Figure 4d), but consumed a greater quantity of the freely available chow (Figure 4e) than the F4 mice. There were no differences between groups in total amount of food (pellets plus chow) consumed (Figure 4f). Importantly, when the chocolate food pellets at the FR5 schedule were the only source of food available, the F4Cre mice responded at similar levels as the F4 animals, indicating that these animals had no deficits in the ability to perform the task (Figure 4h). There were no differences in body weight between the F4 and F4Cre mice during the testing period (Supplementary Figure 4F), indicating that differences in food consumption patterns were not due to differences in body size.

Figure 4

Inhibition of glutamatergic input onto dopamine (DA) neurons suppresses animal motivation in an effort-based choice task. (a) Scheme for the 3-week effort assessment. (b) Set-up of the chambers on the choice days (Days 1, 3 and 5 each week). Animals had free access to standard lab chow and could also respond in the nose-poke port (FR5) for chocolate food pellets. (c) In the Choice sessions, the chocolate pellets constituted a lower percentage of total food consumed for F4Cre relative to F4 mice (****P<0.0001). (d) During the Choice sessions, F4Cre animals responded for fewer pellets than the F4 mice (***P<0.001). (e) F4Cre mice consumed a greater amount of chow during Choice days than the F4 animals (*P<0.05, ***P<0.001). (f) During the Choice sessions, no differences were seen in total food consumed. (g) Set-up of the operant chambers on the non-choice days (Days 2 and 4 each week). Animals could respond in the nose-poke port (FR5) for chocolate food pellets. No other food was available in the operant chambers. (h) On the non-choice days, the two groups showed similar rates of responding. (F4 n=10; F4Cre n=10). (*P<0.05, **P<0.01, ***P<0.001 and P≥0.05, no significance). All data were presented as mean±s.e.m. FR, fixed ratio; KO, knockout.

Discussion

Glutamatergic synaptic transmission provides the majority of excitatory drive in the brain and is important in many aspects of normal behavior and cognition. However, the behavioral relevance of glutamatergic excitatory drive as a whole onto a defined population of neurons has not been directly tested. This is likely due to the fact that there are two major subtypes of ionotropic glutamate receptors (AMPARs and NMDARs) containing multiple subunits mediating glutamatergic synaptic transmission, leaving pharmacological and genetic manipulation of glutamatergic input onto a defined population of neurons difficult to achieve. In this study we developed a quadruple KO mouse line to block glutamatergic synaptic transmission onto midbrain DA neurons and applied this tool to examine the role of excitatory glutamatergic inputs in DA-related functions. The contribution of our data is threefold. First, we introduce a new genetic approach that allows in vivo functional inactivation of excitatory glutamatergic synaptic transmission onto a defined population of neurons. Second, we provide genetic evidence that reward associations in several reward learning tasks (i.e., radial arm maze-based reward learning, CPP and instrumental learning) can be formed in animals in which the vast majority of glutamatergic drive of DA neurons is disrupted. Third, this work reveals an important role of glutamatergic afferents onto DA neurons in the control of incentive motivation.

Functional dissection of neural circuits often involves manipulation of synaptic transmission and examination of resulting behavioral consequences. Traditional approaches such as lesions, electrical stimulation and pharmacology have been foundational but generally lack precision in tissue or cell-type specificity. More recently, genetic deletion of vesicular transporters or inhibition of neurotransmitter release by disrupting SNARE machineries has been successfully employed to study neural circuit function,45, 46 although these methods are designed to probe the function of neuronal output regardless of neuronal firing patterns or input identities. New optogenetic and chemogenetic approaches47, 48 have been powerful but are less suited to examine the role of widespread afferents originating from multiple sources onto a discrete population of neurons. Consequently, it has been difficult to address the behavioral relevance of glutamatergic input as a whole onto DA neurons with the aforementioned methods. Our F4Cre mice, in which excitatory glutamatergic inputs onto a defined population of neurons can be specifically disrupted in a Cre-dependent manner, provide an alternative approach to dissect neural circuit function in vivo.

DA neurons discharge in two characteristic modes: tonic firing at 1–5 Hz and brief higher-frequency burst firing. Phasic burst firing has been proposed to act as a teaching signal that associates biological significance to otherwise neutral cues and underlies reward learning.49 Glutamatergic afferents modulate DA neuron phasic firing,3, 23 and thus it is important to determine the physiological relevance of glutamatergic inputs in DA-related functions. In our F4Cre mice, the NMDAR-mediated and ~90% of AMPAR-mediated glutamatergic inputs to DA neurons are eliminated. Behavioral analysis shows that the majority of glutamatergic input onto midbrain DA neurons is dispensable for ‘liking’ of natural rewards such as sucrose or chocolate-flavored food, as demonstrated by normal preference ratios in F4Cre mice. In addition, cocaine-induced locomotion as well as food- and cocaine-induced CPP developed despite the near complete loss of glutamatergic input to DA neurons in F4Cre mice. Furthermore, these mutant animals can perform instrumental learning tasks.

These data demonstrate that mice can experience reward and perform reward learning in the absence of modulation of DA neuron activity and plasticity mediated by NMDARs and by GluA1-, GluA2- and GluA3-containing AMPARs. Our data are consistent with previous studies of DA-deficient mice in that these animals can learn basic reward associations and display preferences for rewarding stimuli such as sucrose.17, 50, 51, 52, 53 It is worth pointing out that DA neuron burst firing was reduced in mice lacking the GluN1 subunit in DA neurons.7, 31 Currently, we have no evidence indicating whether DA neuron burst activity in vivo is altered in our F4Cre mice. Thus, our data do not necessarily suggest that burst firing of DA neurons is unimportant for reward learning. Although glutamatergic input to DA neurons plays important roles in the regulation of DA neuron firing,54, 55, 56 our data may point to other non-glutamatergic sources of input to DA neurons as key alternative drivers of DA bursting, for example, via GABAergic disinhibition or cholinergic excitation. Indeed, recent studies have indicated that VTA and rostromedial tegmental (RMTg) GABAergic neurons exert a powerful influence on DA neuron firing and DA-related behavior.15, 16, 57, 58 Cholinergic excitatory inputs also promote DA neuron burst firing.9, 59 Moreover, serotonergic and various peptide signals regulate DA neuron activity.1 It is also possible that the genetic deletion of glutamate receptors could lead to neural circuit adaptations, inducing up- or downregulation of other synaptic inputs, and we cannot exclude the possibility that such adaptations would mask the importance of glutamatergic input in reward learning while still revealing effects on effort-based tasks. In addition, the remaining GluA4 subunit can support ~10% of AMPAR-mediated synaptic transmission in F4Cre mice, which may contribute to glutamate-mediated activation of DA neurons in vivo (though see the in vitro data in Figure 1f).
Furthermore, in F4Cre mice we observed a modest change in DA clearance and DA release in response to high-frequency stimulation in the presence of DHβE. Although it remains unclear how glutamate receptors are functionally coupled to DA clearance, we cannot exclude potential developmental adaptations. Finally, it is notable that recent studies have identified substantial heterogeneity among DA neurons based on projection target, ability to co-release other transmitters and functional properties, which may confer on DA neuron subpopulations distinct, even opponent, roles.10, 60, 61, 62 Thus, deletion of glutamate receptors within discrete DA neuron subpopulations could provide insights into the role of ionotropic glutamate receptors in reward learning that are masked by their deletion in all DA neurons. In summary, although we show that animals lacking GluN1 and GluA1–3 subunits in DA neurons can perform several reward learning tasks, our data do not reject the importance of glutamatergic inputs to DA neurons in reward learning but rather highlight the potential importance of other types of inputs.

Previous work has shown that mice with genetic deletion of the obligatory NMDAR GluN1 subunit or individual AMPAR subunits in DA neurons can still form initial contextual or cue–reward associations in several DA-related tasks6, 7, 17, 30, 32 (but see ref. 8). Indeed, elegant work with mutant mice lacking GluN1 in DA neurons (GluN1 KO) has shown that these GluN1 KO mice can develop Pavlovian conditioning,7, 30 food- or cocaine-induced CPP (refs 6, 32; but see ref. 8) and instrumental conditioning.7, 17, 31 Importantly, loss of NMDARs in DA neurons strongly impaired DA neuron burst firing, indicating that NMDAR-mediated burst firing might not be necessary for the animal to form cue–reward associations.7, 31 Similarly, mice with genetic deletion of the AMPAR GluA1 or GluA2 subunit in DA neurons could also develop CPP.6 Notably, in each of these models AMPAR-mediated synaptic transmission was intact6, 28 or strongly elevated.6, 8 Thus, our results are consistent with and expand upon these reports, demonstrating that reward associations in the learning tasks that we have studied can be formed in mice with the combined lack of GluN1 and GluA1–3 subunits in DA neurons.
Interestingly, it has been shown that although the AMPAR GluA1 subunit and NMDARs in DA neurons are not required for development of cocaine-induced CPP, the AMPAR GluA1 subunit is important for extinction of CPP and the GluN1 subunit is critical for reinstatement of CPP.6 Similarly, mutant mice lacking NMDARs in DA neurons show normal levels of cocaine self-administration and normal extinction, but impaired cue-induced reinstatement of cocaine-seeking.17 Taken together, these studies indicate that glutamate receptor-mediated signaling in DA neurons is important for the persistence of drug seeking.53 In a different learning test, we found that both extinction and reinstatement of operant conditioning are normal in F4Cre mice, suggesting that glutamate receptor subunits are differentially involved in CPP, a form of classical conditioning, and operant conditioning. It is also worth mentioning that, unlike cocaine-induced CPP, nicotine-induced CPP is lost in mice lacking NMDARs in DA neurons.63 Nicotine directly stimulates DA neurons and can induce NMDAR-dependent LTP in DA neurons, and nicotine-induced synaptic plasticity has been proposed to play a role in nicotine-induced CPP.63, 64, 65 In addition, nicotine potentiates glutamate release onto DA neurons and thus enhances glutamate receptor activity in DA neurons.66, 67 Thus, it is possible that loss of NMDARs in DA neurons blocks nicotine-induced synaptic potentiation and diminishes the effect of nicotine-induced enhancement of glutamate release onto DA neurons, and consequently prevents CPP. Cocaine, on the other hand, blocks DAT and other monoamine transporters to directly increase extracellular monoamine concentration.68 The mechanistic differences between the actions of nicotine and cocaine on DA neurons may thus explain their distinct phenotypes in mice lacking NMDARs.

Our work reveals a critical role of glutamatergic modulation of DA neuron activity in the regulation of the animal’s willingness to exert effort for reward. Indeed, mice lacking the majority of glutamatergic input onto DA neurons discontinue instrumental responding after obtaining fewer food rewards and choose the less preferred food in a cost–benefit task. Early in vivo work in rats indicates that millisecond DA release plays a key role in the regulation of willingness to engage in goal-directed behavior.69, 70 Similarly, studies in DA-deficient mice suggest that DA is necessary for mice to seek rewards during goal-directed behavior,52 and chemically induced DA depletion in the accumbens in rats impairs effort-based choices.37 On the other hand, DAT knockdown mice exhibit increased extracellular DA levels and have an enhanced tendency to work for reward.38 Furthermore, in vivo pharmacological inhibition of DAT increases high-effort behaviors.71 Collectively, these data support the importance of midbrain DA neurons in motivational processes.72, 73, 74, 75 It is worth noting that previous work in mutant mice lacking GluN1 in DA neurons has shown that motivation to work for food is not impaired in these mutant animals, as the break point in a PR task is similar between KO and control mice during the first two days of testing.31 Consistent with this, in our PR test we find that the break point is similar between control and F4Cre mice during the first 6 days of testing (Figure 3g). Interestingly, we observe a delayed deficit in motivation to obtain food in F4Cre mice, as these mice have significantly lower break points from Day 7 through Day 15 (Figure 3g). In addition, these F4Cre mice exhibit a decreased motivation to work for the chocolate food in a cost–benefit operant task (Figure 4).
These data demonstrate that the glutamatergic drive of DA neurons plays a key role in generating and sustaining goal-directed behavior in high-effort tasks and underlies the coding of a motivational state that promotes reward seeking. It is possible that fast glutamatergic regulation of DA neuron activity may enable dynamic mesolimbic DA release that provides a value signal that influences the decision about whether to work for rewards.76 Thus, our results support the view that a key physiological function of DA neuron activity in vivo, driven by glutamatergic input as shown here, is to generate and sustain incentive motivation.74, 75, 77

Our findings have important clinical implications for understanding the etiology of drug addiction and neuropsychiatric disorders. We find that mice lacking NMDARs and the vast majority of AMPARs in DA neurons can perform cocaine CPP and instrumental learning. These data suggest that non-glutamatergic afferents to DA neurons may play a less-appreciated, but important, role in reward learning. In addition, our data show that glutamatergic modulation of DA neurons plays a critical role in the regulation of motivation. As impairment of motivation is associated with many psychiatric disorders such as depression and schizophrenia,77 our results suggest that therapeutic agents modulating glutamatergic input to DA neurons may represent effective clinical interventions against mental illnesses.