Ventral tegmental area glutamate neurons co-release GABA and promote positive reinforcement

In addition to dopamine neurons, the ventral tegmental area (VTA) contains GABA-, glutamate- and co-releasing neurons, and recent reports suggest a complex role for the glutamate neurons in behavioural reinforcement. We report that optogenetic stimulation of VTA glutamate neurons or terminals serves as a positive reinforcer on operant behavioural assays. Mice display marked preference for brief over sustained VTA glutamate neuron stimulation resulting in behavioural responses that are notably distinct from dopamine neuron stimulation and resistant to dopamine receptor antagonists. Whole-cell recordings reveal EPSCs following stimulation of VTA glutamate terminals in the nucleus accumbens or local VTA collaterals; but reveal both excitatory and monosynaptic inhibitory currents in the ventral pallidum and lateral habenula, though the net effects on postsynaptic firing in each region are consistent with the observed rewarding behavioural effects. These data indicate that VTA glutamate neurons co-release GABA in a projection-target-dependent manner and that their transient activation drives positive reinforcement.

The ventral tegmental area (VTA) is a heterogeneous brain region that serves as a critical hub in the control of motivated behaviours. VTA neurons send dense dopamine projections to the nucleus accumbens (NAc) and olfactory tubercle, and more modest projections to other limbic regions including the amygdala and prefrontal cortex (PFC) 1. Dopamine neurons tend to fire in response to rewards and reward-predicting cues, and dopamine signalling has profound impacts on reward-seeking and behavioural reinforcement through activation of G protein-coupled dopamine receptors. But VTA neurons also release other signalling molecules, including the major inhibitory and excitatory neurotransmitters in the brain, γ-aminobutyric acid (GABA) and glutamate [2][3][4][5][6][7][8][9][10][11]; though a systems-level understanding of the function of fast postsynaptic signals provoked by GABA or glutamate co-release has remained elusive 12,13.
Although the co-release of GABA or glutamate with dopamine is more common than originally suspected, significant populations of VTA neurons are devoid of dopamine markers but express markers indicative of GABA or glutamate release. Indeed, the canonical GABA markers, glutamic acid decarboxylase and the vesicular GABA transporter (VGAT), are virtually absent from immunochemically identified dopamine neurons in the VTA, but present in 20-30% of all VTA neurons [14][15][16][17]. VTA GABA neurons are important targets of drugs of abuse, and plasticity at inhibitory synapses to and from VTA contributes to maladaptive behaviours common in drug addiction [18][19][20]. Optogenetic approaches have recently demonstrated that VTA GABA neurons are responsive to reward-predicting cues and to aversive stimuli 21,22; and their activation is sufficient to disrupt reward consumption or induce avoidance behaviour 23,24.
Systematic efforts to characterize VTA glutamate neurons have begun only recently, in part because of their relatively delayed discovery, which required the in situ detection of mRNA encoding the vesicular glutamate transporter 2 (VGLUT2) 25,26. Also, some ambiguity persists regarding the prevalence of VGLUT2 in the VTA, with one quantitative stereological assessment indicating VGLUT2+ neurons comprise just 2-3% of the rat VTA (ref. 14). However, VGLUT2+ neurons are concentrated in medial VTA subregions where they can outnumber dopamine neurons; indeed, one study indicates that VGLUT2+ neurons represent ~35% of NAc-projecting neurons in VTA, as well as ~66% of those projecting to PFC (ref. 27). These findings are consistent with observations that most if not all neurons in the NAc receive glutamate input from VTA (refs 4,10,28) and multiple other lines of evidence suggesting that VGLUT2+ VTA neurons may prove markedly more prevalent than initially surmised [29][30][31][32][33].
Recent reports have used optogenetics to suggest that VTA neurons may convey either reward 34 or aversion 35,36 signals depending on local connectivity or projection target. In this report we selectively targeted VTA VGLUT2+ neurons using optogenetics and showed that brief photostimulation of cell bodies or three major terminal fields was sufficient to induce positive reinforcement in instrumental behavioural assays; but that their sustained stimulation was less preferred and could even manifest as apparent behavioural avoidance. These data are in stark contrast to the effects produced by VTA dopamine neuron stimulation, which induced reward responses across a wide range of behavioural assays and stimulation parameters. We found also that a subset of VGLUT2+ VTA neurons projecting to the ventral pallidum (VP) or lateral habenula (LHb) co-release glutamate and GABA. However, in the absence of other inputs and congruent with the behavioural reinforcement induced by short stimulus trains, train activation of VGLUT2+ VTA inputs was net excitatory in the VP, but net inhibitory in the LHb. We thus propose that transient activation of VGLUT2+ VTA neurons promotes behavioural reinforcement through their ability to differentially excite or inhibit postsynaptic cells dependent on projection target, but that the effects of sustained activity are less preferred.

Results
Optogenetic manipulation of VGLUT2+ VTA neurons. To selectively manipulate VGLUT2+ neurons in the VTA, we stereotactically injected a Cre recombinase-dependent adeno-associated viral vector engineered to express the fusion protein Channelrhodopsin-2:mCherry (AAV1-EF1α-DIO-ChR2:mCherry). The vector was infused unilaterally into the medial VTA of Slc17a6 IRES-Cre (VGLUT2-Cre) mice 37, where VGLUT2+ neurons are most abundant (Fig. 1a). A subset of sections were co-stained for the dopamine neuron marker tyrosine hydroxylase (TH); 11.4 ± 1.5% of clearly labelled ChR2:mCherry+ cell bodies co-labelled with TH (n = 4 mice; Fig. 1b and 1c), consistent with previous reports showing overlap between VGLUT2+ and TH+ cells 3,25,27,30,35,38. Mice used in behavioural experiments were also implanted with chronic indwelling optic fibres dorsal to the medial VTA; implant sites were verified by histology after behavioural experiments (Supplementary Fig. 1). Subsets of mice were also processed to demonstrate that photostimulation could induce a significant increase in the number of c-Fos labelled cells in the VTA (Fig. 1d).
To characterize the membrane properties of ChR2:mCherry-labelled cells and to demonstrate optogenetic control over their activity, whole-cell or cell-attached recordings were made from acute brain slices. Recordings showed that trains of photostimuli were sufficient to evoke action potentials with spike fidelity ≥1 up to at least 40 pulses delivered at 40 Hz (Fig. 1e). Similar to previous reports using BAC transgenic VGLUT2-EGFP mice 28, ChR2:mCherry-labelled cells in VTA were typically spontaneously active with firing rates of 5.2 ± 1.2 Hz (Supplementary Fig. 2).
Activation of VGLUT2+ VTA soma or terminals is reinforcing. Because optogenetic stimulation of dopamine neurons is sufficient to drive positive behavioural reinforcement 39,40, while activation of VTA GABA neurons can oppose reward 23,24, we tested whether photostimulation of VGLUT2+ VTA neurons can suffice as a reinforcer using operant behavioural tasks (Fig. 2a,b). Using a 2-hole nosepoke-operant discrimination assay where an active nosepoke triggered optical stimulation of VGLUT2+ VTA neurons, mice demonstrated single-session discrimination for the active nosepoke (Fig. 2c-e, Supplementary Video 1). Responding persisted over the course of 4 days and control littermates lacking ChR2:mCherry expression showed no discrimination and minimal responding overall. Similar results were observed using a different VGLUT2-Cre BAC transgenic mouse line 41 (Supplementary Fig. 3) or with a 2-bottle choice assay in water-deprived mice, even when optical stimulation was placed in competition with sucrose solution (Supplementary Fig. 4). Subsequent exposure to a 5-hole nosepoke apparatus, such that each nosepoke was coupled to light delivery at frequencies varying from 0-40 Hz (Fig. 2f), demonstrated that mice displayed a strong preference for light delivered at faster frequencies, with maximal responding for 40 Hz, the highest frequency tested (Fig. 2g-i). Several brain regions are densely innervated by VGLUT2+ VTA neurons, including the medial shell of the NAc, VP, and LHb (refs 28,30). To determine whether individual VTA neurons might target more than one of these non-contiguous structures we injected different coloured fluorescent retrobeads into pairs of these brain regions (that is, NAc/VP, NAc/LHb and VP/LHb; Supplementary Fig. 5a). We found many retrobead-labelled cells in the VTA, but only a small fraction (<1%) co-localized for both colours (Supplementary Fig. 5b-d).
These data indicate that separate populations of VTA neurons, including VTA glutamate neurons, target each of these structures. Changes in neural activity in each of these regions could support behavioural reinforcement observed following cell-body stimulation (Fig. 2), but with different expected valences. For example, increased excitatory drive to the NAc or VP can drive positive reinforcement 42,43, but activation of the LHb is negatively reinforcing 44. If release of glutamate at these varied efferent targets is expected to have opposing effects on behavioural reinforcement, why does photostimulation of VGLUT2+ VTA cell bodies produce robust positive reinforcement?
Surprisingly, photostimulation of VGLUT2+ VTA terminal fields in each of the NAc, VP and LHb initially failed to induce positive reinforcement on the 5-nosepoke and 2-bottle choice tasks (Supplementary Fig. 6a-c). However, when the mice were placed on a restricted feeding schedule and exposed to the 2-hole nosepoke-instrumental task we observed a modest but significant increase in responses for the active hole (Fig. 2j,k). These results indicate that activation of VGLUT2+ VTA terminals in each of these regions is capable of serving as a reinforcer, at least under certain conditions. The less robust reinforcement may relate to a reduced ability to efficiently recruit terminals on photostimulation, may indicate that synchronous activation of multiple targets is a more potent reinforcer, or may indicate that the reinforcing effects are state dependent.
Because a subset of VGLUT2+ neurons co-localize with TH (Fig. 1b,c) and co-release dopamine in the NAc (refs 3,4), the coincident release of dopamine is likely to contribute at least partly to the reinforcing properties of stimulating VGLUT2+ VTA neurons or NAc terminals. As an initial test of the relative contribution of photo-evoked dopamine to the observed behavioural reinforcement, we repeated the experiments following systemic administration of dopamine D1 receptor (D1R), dopamine D2 receptor (D2R) or combined antagonists. We found that at doses that can inhibit dopamine-dependent reinforcement 45,46, neither D1R (Supplementary Fig. 7a) nor D2R (Supplementary Fig. 7b) antagonism blunted responding on the two-bottle choice assay. Similar results were obtained when both antagonists were provided jointly using a counter-balanced design in mice stably responding on the two-hole nosepoke instrumental reinforcement assay (Supplementary Fig. 7c). Importantly, this treatment significantly attenuated responding for photostimulation of DAT+ VTA neurons (Supplementary Fig. 7d). These data suggest that self-stimulation of VGLUT2+ VTA neurons is a potent reinforcer relatively resistant to pharmacological blockade of coincident photo-evoked dopamine.
Projection target-specific GABA/glutamate co-release. To investigate alternate mechanisms by which VGLUT2+ VTA neurons may induce behavioural responses we assessed the postsynaptic effects induced by activation of VGLUT2+ VTA neurons locally within the VTA and from distal terminals in NAc, VP and LHb. Using whole-cell voltage clamp we identified and recorded from mCherry-negative VTA neurons. Photostimulation of VTA cell bodies led to DNQX-sensitive excitatory postsynaptic currents (EPSCs) in all ChR2:mCherry-negative VTA neurons that we recorded from (Fig. 3a), consistent with previous reports suggesting local excitatory connections within the VTA 34,47. Photostimulation of VGLUT2+ VTA inputs to medial NAc shell (Fig. 3b), VP (Fig. 3c) and LHb (Fig. 3d) also revealed high rates of connectivity and EPSCs were present in all connected cells. However, photostimulation of VGLUT2+ VTA terminals additionally revealed gabazine-sensitive inhibitory postsynaptic currents (IPSCs) in the VP and LHb, but not in VTA or NAc (Fig. 3). These IPSCs had short delays consistent with monosynaptic connections, and were blocked by tetrodotoxin and recovered in 4-AP (Supplementary Fig. 8), confirming their monosynaptic nature. Importantly, when the experiment was performed using Slc32a1 IRES-Cre (VGAT-Cre) mice, both IPSCs and EPSCs were detected in the LHb and VP (Supplementary Fig. 9), but only IPSCs were found in NAc and VTA; indicating that ectopic Cre expression is unlikely to account for these findings. Rather, these results suggest that a subset of VGLUT2+ neurons in the VTA co-release GABA in a projection-target specific manner.
Figure 2 legend (panels d-k only; earlier panels not recovered). (d) ChR2-expressing mice trigger more photostimuli than control littermates; P<0.001. (e) ChR2-expressing mice make more active nosepokes than controls; P<0.001. (f) Schematic illustrating the 5-choice nosepoke task, where each of 5 nosepokes is coupled to a 1-s photostimulus at varying frequency. Frequency-response histograms reveal that ChR2-expressing mice (g) display an ascending preference for faster stimulation; P<0.001, (h) trigger more stimuli; P<0.001, and (i) have higher response rates at nosepoke holes coupled to faster photostimulation; P<0.001 compared with responding on the control nosepoke hole (no stimulation). (j) Mice expressing ChR2 in VGLUT2+ VTA neurons were implanted instead with fibres in each of the NAc, VP or LHb to target presynaptic terminals. When placed on a restricted feeding schedule and assessed using the 2-nosepoke instrumental task, mice displayed an increase in responding; P<0.01 and (k) a preference for the active nosepoke compared with controls. Data in j and k represent average responses over 5 days.
To compare the relative impact of the GABA and glutamate signals on postsynaptic cells across regions, we first calculated the ratios of outward currents recorded at Vh = 0 mV to inward currents at Vh = −60 mV in the VP and LHb in response to photostimulation of VGLUT2+ VTA terminals. In the VP this GABA/AMPA ratio was equal to ~1, but was ~4 in the LHb (Fig. 4a), suggesting that the relative effects of GABA co-release may be stronger in the LHb compared to VP. We next tested whether the synaptic effects of terminal photostimulation could be sustained at high frequency. First, to assess how photostimulation of VGLUT2+ VTA inputs influences firing of postsynaptic cells, we made cell-attached recordings, taking advantage of the fact that VP and LHb neurons tend to fire spontaneously in acute slices. Photostimulation (40 Hz, 5 s) of VGLUT2+ VTA terminals in VP led to a consistent increase in firing in postsynaptic cells (Fig. 4b,c) that persisted for the duration of the 5-s photostimulus trains (Fig. 4d). On the other hand, photostimulation of VGLUT2+ VTA terminals in the LHb produced a persistent decrease in the firing rate of postsynaptic cells (Fig. 4b-d).
Together, these data suggest that the net effects of terminal stimulation in the VP are excitatory, but inhibitory in the LHb.
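The GABA/AMPA ratio described above is a simple quotient of two peak current amplitudes, and can be sketched as follows. This is a minimal illustration, not the analysis code used in the study; the function name and the example amplitudes are hypothetical, chosen only so that the ratios come out near the reported values of ~1 (VP) and ~4 (LHb).

```python
def gaba_ampa_ratio(outward_pa_at_0mv: float, inward_pa_at_minus60mv: float) -> float:
    """Ratio of the peak outward current (Vh = 0 mV) to the magnitude of the
    peak inward current (Vh = -60 mV). Hypothetical helper for illustration."""
    if inward_pa_at_minus60mv == 0:
        raise ValueError("inward current amplitude must be non-zero")
    # Inward currents are conventionally negative, so compare magnitudes.
    return abs(outward_pa_at_0mv) / abs(inward_pa_at_minus60mv)


# Made-up amplitudes (pA), NOT measured data.
vp_ratio = gaba_ampa_ratio(100.0, -100.0)   # ratio near 1, as reported for VP
lhb_ratio = gaba_ampa_ratio(200.0, -50.0)   # ratio near 4, as reported for LHb
```

A ratio above 1 means the outward (GABA-mediated) component is larger than the inward (AMPA-mediated) component under these recording conditions.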
Although the effects of terminal photostimulation on postsynaptic firing demonstrate an ability of these synapses to maintain functional transmission at 40 Hz for at least 5 s, we noted that sustained photostimulation led to marked depression at each synapse examined and for both EPSCs and IPSCs (Supplementary Fig. 10). The size of the second response (paired-pulse ratio) with inter-stimulus intervals ranging from 500 to 25 ms (that is, 2-40 Hz) showed mean reductions (that is, paired-pulse depression) in amplitude of up to 84%; with shorter inter-stimulus intervals and longer stimulus trains producing cumulatively greater depression (Supplementary Fig. 10). These data suggest a high probability of release, consistent with previous results examining glutamate co-release from dopamine terminals in the NAc (ref. 48), and could indicate that sustained stimulation would produce effects functionally distinct from brief stimulation.
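The relationship between inter-stimulus interval, stimulation frequency and paired-pulse depression can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper names are hypothetical and the response amplitudes are made-up values, not recorded data.

```python
def isi_ms_to_hz(isi_ms: float) -> float:
    """Convert an inter-stimulus interval in milliseconds to a frequency in Hz."""
    return 1000.0 / isi_ms


def paired_pulse_ratio(first_amp: float, second_amp: float) -> float:
    """Second response amplitude divided by the first; values below 1 indicate depression."""
    return second_amp / first_amp


def percent_depression(first_amp: float, second_amp: float) -> float:
    """Reduction of the second response relative to the first, in percent."""
    return 100.0 * (first_amp - second_amp) / first_amp


# The interval range in the text maps onto the stated frequency range.
slowest_hz = isi_ms_to_hz(500)   # 2.0 Hz
fastest_hz = isi_ms_to_hz(25)    # 40.0 Hz

# Hypothetical amplitudes (pA): a second response of 16 after a first of 100
# corresponds to the largest mean reduction mentioned in the text (84%).
ppr = paired_pulse_ratio(100.0, 16.0)
depression = percent_depression(100.0, 16.0)
```

A paired-pulse ratio well below 1 at short intervals is the pattern that the text interprets as consistent with a high probability of release.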
Brief stimulation of VGLUT2+ VTA neurons is preferred. To test the effects of sustained stimulation, we employed several additional behavioural tasks and compared the effects of stimulating VGLUT2+ VTA cell bodies with their terminal stimulation, and also to the effects of stimulating DAT+ VTA dopamine cell bodies. First, we used an ascending stimulus frequency (0-40 Hz, increasing across days) real-time place procedure (RTPP), where mice were provided free access to two compartments, one of which was coupled to photostimulation. Note that in the RTPP, mice control the duration of stimulation by exiting the photostimulus-paired (that is, 'active') compartment. When photostimulation was used to activate DAT+ VTA dopamine neurons we, as expected, observed a frequency-dependent increase in time spent in the active compartment, though no change in the number of side changes (that is, 'crossings') (Fig. 5a-c). Surprisingly, when we performed the same experiment to stimulate VGLUT2+ VTA neurons, we observed an apparent place avoidance at low frequencies that was mitigated at higher frequencies (Fig. 5a,b). However, this was accompanied by a frequency-dependent appetitive increase in the number of crossings (Fig. 5c; Supplementary Video 2) and a significant shift in the distribution towards shorter active-side visits (Fig. 5g).
Mice thus appeared to titrate the amount of time spent in the active compartment to maximize the number of brief photostimuli received. Photostimulation of VGLUT2+ VTA terminals showed a similar apparent avoidance for the active side and distribution shift; although there was a tendency towards increased crossings at higher frequencies, this trend was not significant (Fig. 5d-g). Similar results were observed using bilateral stimulation and a more typical single frequency RTPP design (Supplementary Fig. 6d-h).
The apparent avoidance observed in the RTPP seems to conflict with the strong positive reinforcement observed following VGLUT2+ VTA cell-body stimulation (Fig. 2 and Supplementary Figs 3 and 4), as well as with the positive reinforcement observed when stimulating their terminals in the instrumental task (Fig. 2j,k), and the absence of avoidance by two-bottle choice (Supplementary Fig. 6c). However, the increase in number of crossings in the RTPP (Fig. 5c) is consistent with the hypothesis that the mice are engaging in an appetitive behaviour, perhaps titrating time spent to achieve preferred short photostimulus trains. To directly compare short versus long stimulus trains, we employed 5- and 2-hole nosepoke instrumental choice tasks, now using constant frequency (40 Hz) with variable stimulus duration assigned to each nosepoke (Fig. 6a,d). Following 3 days of exposure to a 5-nosepoke apparatus comparing 0-40 s of 40 Hz stimulation (0, 40, 200, 800 or 1,600 pulses; Supplementary Fig. 11), we observed a significant difference in responding for DAT+ versus VGLUT2+ VTA stimulation. Mice showed a clear preference for brief (≤5 s) stimulation trains when coupled to VGLUT2+ VTA neuron stimulation compared with a preference for longer (≥5 s) trains when coupled to DAT+ VTA neuron stimulation (Fig. 6a-c). In a second experiment, we tested mice over 5 days on a 2-nosepoke task comparing 1- versus 20-s stimulation at 40 Hz (40 or 800 pulses). Mice receiving stimulation of VGLUT2+ VTA neurons showed a stable preference for the nosepoke coupled to 1-s trains (Fig. 6e). In contrast, following an initial preference for the shorter stimulus train, mice receiving stimulation of DAT+ neurons ultimately developed a preference for the nosepoke coupled to longer stimulation (Fig. 6f).
The results indicate that transient photostimulation of VGLUT2+ VTA cell bodies is preferentially reinforcing in operant tasks, and this preference for short intermittent trains of stimuli may contribute to an apparent place avoidance by RTPP.
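The pulse counts quoted for the duration-choice tasks follow directly from the fixed 40 Hz frequency, since the number of pulses equals frequency multiplied by train duration. The short sketch below checks that arithmetic; the function name is hypothetical and the code is an illustration only, not part of the experimental control software.

```python
def pulses_in_train(frequency_hz: float, duration_s: float) -> int:
    """Number of light pulses delivered in a train of the given frequency and duration."""
    return round(frequency_hz * duration_s)


# Train durations (s) implied by the 5-nosepoke task at 40 Hz.
durations_s = [0, 1, 5, 20, 40]
pulse_counts = [pulses_in_train(40, d) for d in durations_s]
# pulse_counts == [0, 40, 200, 800, 1600], matching the counts given in the text
```

The same relation gives 40 and 800 pulses for the 1-s and 20-s trains compared in the 2-nosepoke task.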
Consistent with the interpretation that prolonged activation of VGLUT2+ VTA neurons is neither potently rewarding nor aversive, we found that their sustained inhibition using Halorhodopsin did not alter behaviour in the RTPP, whereas inhibition of DAT+ VTA neurons led to strong avoidance (Supplementary Fig. 12). These data further distinguish the two cell populations and suggest that while tonic activity of VTA dopamine neurons provides a value signal, VTA glutamate neurons are either not tonically active or their tonic activity lacks such valence.

Discussion
The identification of the VGLUTs provided definitive markers of glutamate-releasing neurons 49 , and led to the subsequent discovery of VGLUT2-expressing neurons in the VTA 25,26 .
Much of the initial work targeting this population was aimed at deciphering the function of VGLUT2 in dopamine neurons; and conditional disruption of VGLUT2 from dopamine neurons resulted in reductions in psychostimulant-induced locomotion and evoked dopamine release 3,50-52. However, the majority of VGLUT2+ neurons detected in mouse VTA are not dopaminergic 29, suggesting that the conditional knockout of VGLUT2 selectively from dopamine neurons left glutamate signalling from VTA neurons largely intact.
In this report, we used VGLUT2-Cre mice and optogenetics to target VGLUT2+ VTA neurons directly, irrespective of their ability to co-release other signalling molecules. We found that photostimulation of VGLUT2+ projection neurons served as a potent reinforcer on operant assays, consistent with another recent report 34. Though this approach also recruits dopamine release and we cannot deny that dopamine signalling contributes to the behavioural results we observed, multiple lines of evidence suggest that VTA glutamate neurons impact reward in a manner that is distinct from dopamine neurons. First, we found only 11% of ChR2:mCherry-labelled neurons were TH+ in VGLUT2-Cre mice. Second, direct excitation of most neurons in the NAc and VP, and inhibition of LHb neurons (via GABA co-release), by VGLUT2+ VTA neurons are each plausible mechanisms to drive behavioural reinforcement [42][43][44]53. Third, photostimulation (or photoinhibition) of DAT+ dopamine versus VGLUT2+ glutamate neurons in the VTA led to divergent behavioural patterns, in particular suggesting that sustained high-frequency stimulation of VGLUT2+ neurons is less preferred compared with brief stimulus trains. Fourth, operant responding for VGLUT2+ VTA neuron stimulation is resistant to the effects of dopamine receptor antagonists at a dose that significantly reduced responding for DAT+ VTA neuron stimulation.
Despite the profound self-stimulation induced in the operant assays, we were unable to reliably induce a place preference using the RTPP for photostimulation of VGLUT2+ VTA neurons. Rather, terminal stimulation as well as low-frequency stimulation of cell bodies led to a pronounced decrease in the time spent in the active compartment, potentially indicating that their prolonged activation conveys an aversive signal. Others too have reported that 20 Hz stimulation of VGLUT2+ VTA terminals in the LHb or NAc induced avoidance using RTPP procedures 35,36,54. Moreover, Qi et al. 35 found that mice trained to lever press for food preferred a lever not paired to photostimulation (20 Hz) of VGLUT2+ terminals in the NAc, and preferred to spin a wheel that turned off ongoing photostimulation of the same. These data have been interpreted to indicate that VGLUT2+ VTA neurons signal reward via local activation of other VTA neurons, but signal aversion via distal activation of NAc interneurons or LHb projection neurons 34,35; though it is not clear whether local connections are made by a discrete population of excitatory VTA interneurons, collateralization of VTA projection neurons or both. However, excitatory inputs to NAc from hippocampus, amygdala and PFC have been shown to drive positive reinforcement 42,55. The excitatory input to NAc from VTA has the added ability to co-release dopamine 56 and data presented here and published 4,7,10,28,35 suggest that VTA glutamate inputs broadly target essentially all NAc shell cell types (though they may vary by synaptic strength/incidence). Similarly, our data as well as work published by others 57 suggest that the dominant effect of activating VGLUT2+ VTA terminals in the LHb is inhibitory, which is rather associated with positive reinforcement 44,53. Together these data suggest that VGLUT2+ VTA projections are likely to play a facilitatory role in positive reinforcement.
Indeed, we found that activation of VGLUT2+ VTA terminals in each of the LHb, VP and NAc was sufficient to drive self-stimulation in an operant assay when animals were placed on a restricted feeding schedule. Interestingly, water-restricted mice did not show a clear preference for a sipper coupled to terminal photostimulation in the two-bottle choice assay; though importantly no avoidance/aversion was detected either. Because these observations were made in multiple cohorts of mice, including mice individually tested in both operant and RTPP assays, we suspect that the apparent avoidance for cell-body or terminal stimulation that we and others observed does not represent aversion.
Why then might stimulation of VGLUT2+ VTA neurons induce apparent avoidance behaviour in the RTPP task? The first clue came from our observation that mice exhibited the unusual behaviour of repeatedly making brief entries into the stimulus-paired compartment (Supplementary Video 2), an apparent appetitive behaviour quantified by an increasing number of side crossings. This observation is in contrast to the effects of photostimulation of VTA dopamine neurons which did not lead to more crossings. We hypothesized that the mice titrated their exposure to increase the number of relatively brief stimulus epochs; thus the decrease in time spent on the active side, rather than indicating aversion associated with VGLUT2+ VTA neuron stimulation, reflects a preference for transient intermittent over more sustained stimulus trains. Indeed, when directly tested to determine a preferred stimulus train length using operant procedures, mice receiving VGLUT2+ VTA stimulation showed a clear preference for shorter duration trains, again in notable contrast to the effects of VTA dopamine neuron stimulation. These data suggest that the initial effects of stimulating VGLUT2+ VTA neurons are reinforcing, but rapid accommodations in signalling mitigate or extinguish the initial effect. Indeed, we observed pronounced short-term depression of both EPSCs and IPSCs following photoactivation of VGLUT2+ VTA terminals in each of the NAc, VP and LHb. Combined with their ability to co-release glutamate and GABA, such properties may allow these neurons to encode distinct types of temporally dynamic reward-related information; for example, satiation where an initial increase in VGLUT2+ VTA neuron activity could encode reward, but the reward signal loses potency or even reverses valence as activity is sustained.
Dopamine neurons projecting to NAc, striatum, PFC and other regions have been shown capable of co-releasing either glutamate or GABA 2,4-8,10,15 . The widespread co-release of fast excitatory or inhibitory signals along with the slower neuromodulatory dopamine signal is consistent with the idea that dopamine neurons also encode different types of information over different timescales 13,58 , with potentially important implications for the role of dopamine neurons in reward learning and in psychiatric illness 59 . However, the normal physiological and functional roles of VTA glutamate neurons cannot be determined by optogenetic stimulation alone, which tests sufficiency, is subject to caveats, and may not reflect in vivo firing patterns. Future studies assessing loss of function in more complex behavioural assays and direct observations of neural activity during behaviour will provide valuable insight.
A subset of TH-Cre+ neurons in the VTA release GABA, but little or no dopamine or glutamate, from terminals in the LHb (ref. 9). Questions have been raised regarding the use of TH-Cre mouse lines to selectively target midbrain dopamine neurons, due to the potential for 'ectopic' expression of TH transcript and Cre expression in neurons that contain little or no TH protein 54,60. Such questions raise important issues about what markers constitute a specific class of VTA neuron, whether such markers are stable across development and in the adult, and the use of Cre-expressing mouse lines to target specific cell types 61,62. For these reasons we validated our findings using multiple Cre lines. To target VGLUT2+ neurons we used both knock-in 37 and BAC transgenic VGLUT2-Cre lines 41, finding comparable anatomical, electrophysiological and behavioural results. We also used VGAT-Cre knock-in mice to target VTA GABA neurons, and showed that photostimulation of terminals in the VP and LHb led not only to the expected IPSCs, but also to glutamate-mediated EPSCs.
Other groups have shown that the LHb receives input from neurons with the potential to co-release both glutamate and GABA. The input from the entopeduncular nucleus contains neurons that may release both glutamate and GABA from single vesicles, their photostimulation can drive place avoidance, and the relative ratio of inhibition to excitation may be subject to presynaptic regulation by anti-depressants 63,64. Within VTA, a population of VGLUT2+ VTA neurons that project to the LHb co-express GABAergic and glutamatergic markers and single-pulse optogenetic stimulation led to both excitatory and inhibitory responses 57. Conversely, VTA cells projecting to NAc rarely labelled for both GABA and glutamate markers 35. Our data are consistent with such findings, but show that train stimulation of VGLUT2+ terminals in the LHb is reliably inhibitory and that GABA/glutamate co-release from VTA neurons is projection-target dependent. Indeed, GABA co-release was also observed from VTA terminals in the VP where the net effect of sustained stimulation was instead reliably excitatory, but we found no evidence for monosynaptic GABA release from VGLUT2+ projections to NAc or from local VTA collaterals. Importantly, these physiological findings are consistent with the behavioural observations that follow from optogenetic stimulation of VGLUT2+ cell bodies, because the inhibition of the LHb or activation of the VP, NAc and VTA might each be predicted to drive positive reinforcement 1,[42][43][44]53.
It is important to note that each of the target structures (excepting possibly the LHb) contains multiple cell types which may differentially alter behaviours in response to VGLUT2+ VTA input; however, we find high rates of synaptic connectivity ranging from 85% in the VP to 100% of connected cells in the NAc and VTA. These data strongly suggest that inputs do not qualitatively discriminate by cell type, though further studies will be needed to determine if there exist differing rates of synaptic incidence or synaptic strength by postsynaptic cell type. Finally, our data suggest that NAc-, LHb- and VP-projecting VTA neurons, including glutamate neurons, rarely target more than one of these structures but rather represent non-overlapping populations of neurons within VTA, consistent with earlier work comparing other projection target combinations 65. Future studies will be required to determine whether VTA neurons that release glutamate locally also project elsewhere or represent a population of excitatory interneurons, with important functional implications.
Though the phenomenon of glutamate and GABA co-transmission in the adult CNS has been observed for over a decade 58,66 , the concept remains perplexing. What is the purpose of presynaptic neurons transmitting such 'mixed messages'? It is interesting to note that interneurons appear to be scarce within the LHb (refs 67,68), suggesting GABA/glutamate co-release may be an alternate mechanism through which inputs can produce bidirectional effects. GABA/glutamate co-transmission may be homeostatic, consistent with the idea that the relative abundance of GABA and glutamate markers may be adaptive and under dynamic regulation in the presynaptic compartment 63,69 . However, the net effect of glutamate/GABA co-release will also depend on the instantaneous excitability of the postsynaptic cell, and on the relative expression and trafficking of postsynaptic receptors, which are also subject to dynamic regulation [70][71][72] . Understanding the mechanisms by which glutamate/GABA co-release is regulated in maladaptive states such as depression, anxiety and drug addiction will prove a fruitful area of future investigation. Our data provide important new insights into the role of a population of VTA glutamate neurons that differentially release multiple small-molecule transmitters across diverse efferent targets and can drive positive reinforcement.

Methods
Animals. Mice were bred at UCSD, group housed, and maintained on a 12 h light-dark cycle with food and water available ad libitum unless noted. Initial breeders were obtained from: Slc17a6 tm2(cre)Lowl (stock no: 016963), Slc32a1 tm2(cre)Lowl (stock no: 016962), Slc6a3 tm1.1(cre)Bkmn (stock no: 006660; The Jackson Laboratory) and BAC Tg. Slc17a6-Cre from Dr Ole Kiehn (Karolinska Institute). All mice were maintained fully back-crossed on to C57Bl/6, with the exception of Slc32a1 tm2(cre)Lowl, which were maintained as homozygous C57Bl/6 × 129 Sv hybrids. Control mice included wild-type littermates receiving the same viral treatment and/or Cre+ littermates receiving treatment with a control viral vector as described below. Both male and female mice were included and all experiments were performed in accordance with protocols approved by the University of California San Diego Institutional Animal Care and Use Committee.
For the colocalization of TH with mCherry, images of VTA were acquired using a 20× objective (Hamamatsu NanoZoomer) with identical acquisition settings across slides. Coronal sections were identified aligning to Paxinos & Watson Bregma points (−3.1, −3.4 and −3.8) from each of four mice. Counting was performed manually using NDP viewer software (Hamamatsu, Japan) to quantify mCherry+ VTA cells with clearly labelled soma, and then scored for co-labelling with TH.
For the quantification of c-Fos, mice were tested in a nosepoke discrimination task for 30 min (see below) and perfused 90 min after the beginning of the task. Immunohistochemistry was performed for c-Fos and TH. Images were acquired using a 10× objective (Olympus BX53) with identical acquisition settings across slides (n = 4 controls; 4 ChR2 mice) and aligned to reference atlas images (in mm relative to Bregma): VTA (−3.1, −3.4 and −3.8). ImageJ was used to manually count the number of c-Fos positive neurons by an experimenter blind to treatment groups.
To count the number of cells containing retrobeads, coronal images were acquired using a 10× objective (Zeiss AxioObserver) at the injection site and throughout the VTA; 9 VTA sections per animal between −2.9 and −3.9 mm relative to Bregma, VP/NAc n = 4; VP/LHb n = 2; LHb/NAc n = 1. Zen software (Zeiss) was used to count cells containing red-, green- or both-coloured beads.
2-nosepoke discrimination task. Mice were fed ad libitum or placed on a restricted feeding schedule as specified. Food restriction consisted of removing food the evening before the first day; access was then restricted to a 3-h period following the assay. At the beginning of the session, ferrules were connected to a 50-µm optical patch cable connected to an optical commutator (Doric Lenses, Canada) and mice were placed in operant chambers (Med Associates) controlled by MedPC IV software. The start of the session was signalled by a brief tone (2 kHz, 0.5 s), illumination of the overhead house light, and LED cue lights over the nosepoke holes; sessions were 60 min unless specified. The chamber contained two photobeam-equipped nosepoke holes which were each baited at the start of each session with a sucrose pellet (Bio-Serv, F0071). Beam-breaks on the active nosepoke led to a 0.5 s tone, the LED cue lights over the nosepokes turning off for the duration of the photostimulus unless specified, and the activation of a TTL-controlled DPSS laser (473 nm, Shanghai or OEM laser) set to deliver pulses at 10 mW (80 mW mm⁻² at the 200-µm fibre tip) at 20 or 40 Hz (1-20 s) with a 10-ms pulse width controlled by a Master-8 (A.M.P.I.) or customized Arduino stimulus generator. The output of the laser power was measured using a digital power meter (Thorlabs PM100D/S121C). Nosepokes that occurred during ongoing photostimulation were recorded but without effect; inactive nosepokes led to identical tone and cue light effects but did not trigger the laser.
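The pulse-train parameters above (20 or 40 Hz, 10-ms pulse width, 1-20 s trains) can be made concrete with a short Python sketch. This is an illustration only, not the code run on the Master-8 or Arduino generator; the function names `pulse_train` and `duty_cycle` are hypothetical.

```python
def pulse_train(frequency_hz, duration_s, pulse_width_ms=10.0):
    """Return (onset_s, offset_s) pairs for a fixed-frequency pulse train.

    Hypothetical helper: assumes evenly spaced pulses starting at t = 0.
    """
    period_s = 1.0 / frequency_hz
    width_s = pulse_width_ms / 1000.0
    if width_s >= period_s:
        raise ValueError("pulse width must be shorter than the inter-pulse period")
    n_pulses = round(frequency_hz * duration_s)
    return [(i * period_s, i * period_s + width_s) for i in range(n_pulses)]


def duty_cycle(frequency_hz, pulse_width_ms=10.0):
    """Fraction of the train during which the laser is on."""
    return frequency_hz * pulse_width_ms / 1000.0


# A 1-s train at 40 Hz corresponds to 40 pulses, each 10 ms long.
train = pulse_train(frequency_hz=40, duration_s=1.0)
print(len(train), duty_cycle(40))
```

Under these assumptions a 40 Hz train with 10-ms pulses keeps the laser on for 40% of the train, and a 20 Hz train for 20%.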
For behavioural pharmacology studies, naïve mice were exposed to the task until their responding appeared stable over 3 days. Over the subsequent 2 days mice were injected with either vehicle or a combination of SCH23390 (Tocris, 50 mg kg⁻¹ i.p.) and sulpiride (Tocris, 50 mg kg⁻¹ i.p.) 30 min before the beginning of the experiment, using a counter-balanced design.
5-choice nosepoke instrumental task. Mice were tethered to the patch cable as described for the 2-nosepoke discrimination task. Identical chambers, lasers and conditions were used but a 5-nosepoke wall (Med Associates) was used in place of the two nosepoke holes. Each of the five nosepoke holes led to stimulation that varied by pulse number (that is, duration) or frequency as specified, sessions were 45 min unless specified, and all other parameters were as described in the nosepoke discrimination task.
Two-bottle choice task. Before the first day of testing mice were water-deprived overnight and subsequently provided restricted access for 3 h daily at the end of each session. On day 1 (baseline) mice were provided water through two identical sippers and licks were recorded for 45 min using contact lickometers (Med Associates). On subsequent days one sipper was designated 'active' and the active sipper was assigned in a balanced manner such that on average no preference was present on baseline day. Every fifth lick on the active sipper led to a photostimulation of 40 pulses at 40 Hz (1 s) with a 10 ms pulse width at 10 mW. Licks on the inactive sipper were without effect. Licks during the photostimulation were recorded but did not contribute to triggering the next photostimulation; mice generally discontinued licking on photostimulation and often resumed shortly thereafter. On days 5-8 the inactive sipper was filled with escalating concentrations of sucrose. For behavioural pharmacology studies mice were injected with either SCH23390 (Tocris, i.p.) or sulpiride (Tocris, i.p.) 30 min before the beginning of the experiment. The total number of licks on the active and inactive sippers and the total number of photostimulations were recorded via Med-PC IV. Preference was calculated as active licks divided by the sum of all licks; on five occasions mice made zero licks on one of the two sippers (4 controls and 1 ChR2) and were excluded from preference calculations.
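The two-bottle contingency and the preference score can be sketched in Python as follows. This is a minimal illustration of the rules stated above (every fifth active lick triggers a 1-s stimulation, licks during stimulation do not count towards the next one, and sessions with zero licks on either sipper are excluded); it is not the Med-PC IV program, and the function names are hypothetical.

```python
def photostimulation_onsets(active_lick_times_s, licks_per_stim=5, stim_duration_s=1.0):
    """Return the times at which active-sipper licks would trigger the laser."""
    counted = 0
    stim_end = float("-inf")
    onsets = []
    for t in sorted(active_lick_times_s):
        if t < stim_end:
            continue  # lick during ongoing photostimulation: does not count towards the next one
        counted += 1
        if counted == licks_per_stim:
            onsets.append(t)
            stim_end = t + stim_duration_s
            counted = 0
    return onsets


def preference(active_licks, inactive_licks):
    """Active licks divided by all licks; None when either sipper has zero licks (excluded)."""
    if active_licks == 0 or inactive_licks == 0:
        return None
    return active_licks / (active_licks + inactive_licks)


licks = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.5, 1.6, 1.7, 1.8, 1.9]
print(photostimulation_onsets(licks))  # fifth lick at 0.4 s, then fifth counted lick at 1.9 s
print(preference(30, 10))
```

A preference of 0.5 indicates no bias between sippers; values above 0.5 indicate preference for the active (photostimulation-paired) sipper.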
Real-time place procedure. On a baseline (pre-test) day mice were placed on the border between two adjoining compartments (20 × 20 cm) and the amount of time spent in each compartment was recorded using video tracking software (ANY-maze). Most mice displayed no preference, but those with greater than 80% side preference on pre-test were excluded from further study. On subsequent days one side was designated active and entry to the active side triggered photostimulation (0-40 Hz, 5-10-ms pulse width, 10 mW), using the lasers as described above but controlled by an ANY-maze interface (San Diego Instruments). In some sessions a post-test day was included that was identical to the pre-test day. Sessions lasted for 25 min unless specified and the amount of time spent in each compartment and number of crossings was recorded. The Halorhodopsin experiment was performed identically except for the photostimulation (continuous 10 mW, 532 nm, Shanghai laser) and the duration of each session (30 min).
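The pre-test exclusion rule can be written as a short Python check. This sketch assumes that side preference is the fraction of session time spent in the more-visited compartment; the function names are hypothetical and this is not part of the ANY-maze software.

```python
def side_preference(time_side_a_s, time_side_b_s):
    """Fraction of total time spent in the more-visited compartment (0.5 to 1.0)."""
    total = time_side_a_s + time_side_b_s
    if total <= 0:
        raise ValueError("total time must be positive")
    return max(time_side_a_s, time_side_b_s) / total


def passes_pretest(time_side_a_s, time_side_b_s, threshold=0.80):
    """True when the mouse is retained, i.e. side preference does not exceed the threshold."""
    return side_preference(time_side_a_s, time_side_b_s) <= threshold


# A 25-min (1,500 s) pre-test split 1,300 s / 200 s exceeds 80% and would be excluded.
print(passes_pretest(1300, 200), passes_pretest(800, 700))
```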
Neurons were held in voltage-clamp at −60 mV to record AMPAR EPSCs and at 0 mV to record GABA_A receptor IPSCs in whole-cell configuration. For whole-cell voltage-clamp recordings, single-pulse (5-ms) photostimuli were applied every 55 s and 10 photo-evoked currents were averaged per neuron per condition. For cell-attached or current-clamp studies of spike fidelity and cell-attached studies on firing rate, photostimuli trains were delivered every 30 s and three responses averaged per neuron. Action potential frequency was averaged over the 5 s before, during and after the 5-s stimulation train. DMSO or H2O stock solutions of drugs were diluted 1,000-fold in ACSF and bath applied at the following concentrations: 6,7-dinitroquinoxaline-2,3-dione (DNQX, 10 µM, Sigma), picrotoxin (10 µM, Sigma), gabazine (10 µM, Tocris), tetrodotoxin (1 µM, Tocris) and 4-aminopyridine (500 µM, Tocris). Current sizes were calculated by using peak amplitude from baseline. Decay time constants (τ) were calculated by fitting an exponential function to each averaged current trace using the following formula: $f(t) = e^{-t/\tau} + C$.
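A decay time constant of this kind can be estimated with the following Python sketch, which uses only the standard library. Two points are assumptions rather than details from the recordings: an amplitude term A is added (so the model is A·exp(−t/τ) + C, which reduces to the stated formula for traces normalized to unit amplitude), and the residual is assumed to have a single minimum within the search bracket. The names `fit_decay`, `tau_min` and `tau_max` are hypothetical, and the acquisition software's own fitting routine may differ.

```python
import math


def _linear_fit(xs, ys):
    """Least-squares slope and intercept for y = a*x + c, plus residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    c = my - a * mx
    rss = sum((y - (a * x + c)) ** 2 for x, y in zip(xs, ys))
    return a, c, rss


def fit_decay(times, currents, tau_min, tau_max, tol=1e-6):
    """Fit currents ~ A*exp(-t/tau) + C and return (tau, A, C).

    For each candidate tau, A and C follow from linear least squares;
    tau itself is found by golden-section search on the residual.
    """
    def rss_for(tau):
        basis = [math.exp(-t / tau) for t in times]
        return _linear_fit(basis, currents)[2]

    inv_phi = (math.sqrt(5) - 1) / 2
    lo, hi = tau_min, tau_max
    while hi - lo > tol:
        m1 = hi - inv_phi * (hi - lo)
        m2 = lo + inv_phi * (hi - lo)
        if rss_for(m1) < rss_for(m2):
            hi = m2
        else:
            lo = m1
    tau = (lo + hi) / 2
    a, c, _ = _linear_fit([math.exp(-t / tau) for t in times], currents)
    return tau, a, c


# Synthetic, noise-free trace (times in ms, current in pA), starting at the peak.
t_ms = [0.5 * i for i in range(101)]
trace = [-120.0 * math.exp(-t / 8.0) - 5.0 for t in t_ms]
tau, amp, offset = fit_decay(t_ms, trace, tau_min=1.0, tau_max=50.0)
print(round(tau, 3), round(amp, 2), round(offset, 2))
```

In practice the fit would be applied from the peak of the averaged current onward, so that only the decay phase contributes.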
Statistics. To evaluate statistical significance, data were subjected to Student's t-test or ANOVA, in some experiments followed by post hoc analysis (KyPlot, Prism or Statistica) as described in Supplementary Tables 1 and 2. Statistical significance was set at P ≤ 0.05. All data are presented as means ± s.e.m. unless noted.
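For reference, the summary statistic and the unpaired Student's t statistic can be sketched in Python with the standard library alone. This is illustrative only: the analyses were run in KyPlot, Prism or Statistica, the sketch assumes equal variances (the classical pooled form), and it stops at the t statistic and degrees of freedom without computing a P value. The function names are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, s.e.m.), with s.e.m. = sample s.d. / sqrt(n)."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(group_a, group_b):
    """Unpaired Student's t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se, df


# Made-up numbers purely to exercise the functions; not data from this study.
print(mean_sem([1, 2, 3, 4, 5]))
print(student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```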
Data availability. Data are available from the corresponding author (T.S.H.) upon request.