Dissociable dopamine dynamics for learning and motivation

Mohebi, Ali; Pettibone, Jeffrey R.; Hamid, Arif A.; Wong, Jenny-Marie T.; Vinson, Leah T.; Patriarchi, Tommaso; Tian, Lin; Kennedy, Robert T.; Berke, Joshua D.

doi:10.1038/s41586-019-1235-y

Article
Published: 22 May 2019

Ali Mohebi¹^na1,
Jeffrey R. Pettibone¹^na1,
Arif A. Hamid²,
Jenny-Marie T. Wong³,
Leah T. Vinson⁴,
Tommaso Patriarchi⁵,
Lin Tian⁵,
Robert T. Kennedy³ &
…
Joshua D. Berke¹,⁴,⁶

Nature volume 570, pages 65–70 (2019)

A Publisher Correction to this article was published on 20 June 2019

This article has been updated

Abstract

The dopamine projection from ventral tegmental area (VTA) to nucleus accumbens (NAc) is critical for motivation to work for rewards and reward-driven learning. How dopamine supports both functions is unclear. Dopamine cell spiking can encode prediction errors, which are vital learning signals in computational theories of adaptive behaviour. By contrast, dopamine release ramps up as animals approach rewards, mirroring reward expectation. This mismatch might reflect differences in behavioural tasks, slower changes in dopamine cell spiking or spike-independent modulation of dopamine release. Here we compare spiking of identified VTA dopamine cells with NAc dopamine release in the same decision-making task. Cues that indicate an upcoming reward increased both spiking and release. However, NAc core dopamine release also covaried with dynamically evolving reward expectations, without corresponding changes in VTA dopamine cell spiking. Our results suggest a fundamental difference in how dopamine release is regulated to achieve distinct functions: broadcast burst signals promote learning, whereas local control drives motivation.

**Fig. 1: Dopamine release covaries with reward rate specifically in NAc core and ventral prelimbic cortex.**

**Fig. 2: Activity of identified VTA dopamine neurons does not change with reward rate.**

**Fig. 3: Bridging timescales of dopamine measurement.**

**Fig. 4: Phasic VTA dopamine firing does not account for NAc dopamine dynamics.**

**Fig. 5: Reward history affects VTA dopamine cell firing and NAc dopamine release differently.**

Data availability

The AAV.Synapsin.dLight1.3b virus used in this study has been deposited with Addgene (no. 125560; http://www.addgene.org). All data will be available through the Collaborative Research in Computational Neuroscience data sharing website (https://doi.org/10.6080/K0VQ30V9).

Code availability

Custom MATLAB code is available on request from J.D.B.

Change history

20 June 2019
Change history: In this Article, an extraneous label appeared in Fig. 4b, and has been removed in the online version.

References

1. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
2. Pan, W. X., Schmidt, R., Wickens, J. R. & Hyland, B. I. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J. Neurosci. 25, 6235–6242 (2005).
3. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012).
4. Steinberg, E. E. et al. A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973 (2013).
5. Hamid, A. A. et al. Mesolimbic dopamine signals the value of work. Nat. Neurosci. 19, 117–126 (2016).
6. Saunders, B. T., Richard, J. M., Margolis, E. B. & Janak, P. H. Dopamine neurons create Pavlovian conditioned stimuli with circuit-defined motivational properties. Nat. Neurosci. 21, 1072–1083 (2018).
7. Phillips, P. E., Stuber, G. D., Heien, M. L., Wightman, R. M. & Carelli, R. M. Subsecond dopamine release promotes cocaine seeking. Nature 422, 614–618 (2003).
8. Roitman, M. F., Stuber, G. D., Phillips, P. E., Wightman, R. M. & Carelli, R. M. Dopamine operates as a subsecond modulator of food seeking. J. Neurosci. 24, 1265–1271 (2004).
9. Wassum, K. M., Ostlund, S. B. & Maidment, N. T. Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol. Psychiatry 71, 846–854 (2012).
10. Howe, M. W., Tierney, P. L., Sandberg, S. G., Phillips, P. E. & Graybiel, A. M. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013).
11. Syed, E. C. et al. Action initiation shapes mesolimbic dopamine encoding of future rewards. Nat. Neurosci. 19, 34–36 (2016).
12. Morris, G., Nevet, A., Arkadir, D., Vaadia, E. & Bergman, H. Midbrain dopamine neurons encode decisions for future action. Nat. Neurosci. 9, 1057–1063 (2006).
13. da Silva, J. A., Tecuapetla, F., Paixão, V. & Costa, R. M. Dopamine neuron activity before action initiation gates and invigorates future movements. Nature 554, 244–248 (2018).
14. Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).
15. Patriarchi, T. et al. Ultrafast neuronal imaging of dopamine dynamics with designed genetically encoded sensors. Science 360, eaat4422 (2018).
16. Salamone, J. D. & Correa, M. The mysterious motivational functions of mesolimbic dopamine. Neuron 76, 470–485 (2012).
17. Schultz, W. Predictive reward signal of dopamine neurons. J. Neurophysiol. 80, 1–27 (1998).
18. Garris, P. A. & Wightman, R. M. Different kinetics govern dopaminergic transmission in the amygdala, prefrontal cortex, and striatum: an in vivo voltammetric study. J. Neurosci. 14, 442–450 (1994).
19. Frank, M. J., Doll, B. B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat. Neurosci. 12, 1062–1068 (2009).
20. St Onge, J. R., Ahn, S., Phillips, A. G. & Floresco, S. B. Dynamic fluctuations in dopamine efflux in the prefrontal cortex and nucleus accumbens during risk-based decision making. J. Neurosci. 32, 16880–16891 (2012).
21. Bartra, O., McGuire, J. T. & Kable, J. W. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412–427 (2013).
22. Ikemoto, S. Dopamine reward circuitry: two projection systems from the ventral midbrain to the nucleus accumbens-olfactory tubercle complex. Brain Res. Brain Res. Rev. 56, 27–78 (2007).
23. Breton, J. M. et al. Relative contributions and mapping of ventral tegmental area dopamine and GABA neurons by projection target in the rat. J. Comp. Neurol. (2018).
24. Ungless, M. A., Magill, P. J. & Bolam, J. P. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 303, 2040–2042 (2004).
25. Morales, M. & Margolis, E. B. Ventral tegmental area: cellular heterogeneity, connectivity and behaviour. Nat. Rev. Neurosci. 18, 73–85 (2017).
26. Morris, G., Arkadir, D., Nevet, A., Vaadia, E. & Bergman, H. Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43, 133–143 (2004).
27. Floresco, S. B., West, A. R., Ash, B., Moore, H. & Grace, A. A. Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat. Neurosci. 6, 968–973 (2003).
28. Grace, A. A. Dysregulation of the dopamine system in the pathophysiology of schizophrenia and depression. Nat. Rev. Neurosci. 17, 524–532 (2016).
29. Cohen, J. Y., Amoroso, M. W. & Uchida, N. Serotonergic neurons signal reward and punishment on multiple timescales. eLife 4, e06346 (2015).
30. Niv, Y., Daw, N. & Dayan, P. How fast to work: response vigor, motivation and tonic dopamine. Adv. Neural Inf. Process. Syst. 18, 1019 (2006).
31. Bayer, H. M., Lau, B. & Glimcher, P. W. Statistics of midbrain dopamine neuron spike trains in the awake primate. J. Neurophysiol. 98, 1428–1439 (2007).
32. Chergui, K., Suaud-Chagny, M. F. & Gonon, F. Nonlinear relationship between impulse flow, dopamine release and dopamine elimination in the rat brain in vivo. Neuroscience 62, 641–645 (1994).
33. Parker, N. F. et al. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat. Neurosci. 19, 845–854 (2016).
34. Menegas, W., Babayan, B. M., Uchida, N. & Watabe-Uchida, M. Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum in mice. eLife 6, e21886 (2017).
35. Trulson, M. E. Simultaneous recording of substantia nigra neurons and voltammetric release of dopamine in the caudate of behaving cats. Brain Res. Bull. 15, 221–223 (1985).
36. Glowinski, J., Chéramy, A., Romo, R. & Barbeito, L. Presynaptic regulation of dopaminergic transmission in the striatum. Cell. Mol. Neurobiol. 8, 7–17 (1988).
37. Zhou, F. M., Liang, Y. & Dani, J. A. Endogenous nicotinic cholinergic activity regulates dopamine release in the striatum. Nat. Neurosci. 4, 1224–1229 (2001).
38. Threlfell, S. et al. Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron 75, 58–64 (2012).
39. Cachope, R. et al. Selective activation of cholinergic interneurons enhances accumbal phasic dopamine release: setting the tone for reward processing. Cell Reports 2, 33–41 (2012).
40. Sulzer, D., Cragg, S. J. & Rice, M. E. Striatal dopamine neurotransmission: regulation of release and uptake. Basal Ganglia 6, 123–148 (2016).
41. Floresco, S. B., Yang, C. R., Phillips, A. G. & Blaha, C. D. Basolateral amygdala stimulation evokes glutamate receptor-dependent dopamine efflux in the nucleus accumbens of the anaesthetized rat. Eur. J. Neurosci. 10, 1241–1251 (1998).
42. Jones, J. L. et al. Basolateral amygdala modulates terminal dopamine release in the nucleus accumbens and conditioned responding. Biol. Psychiatry 67, 737–744 (2010).
43. Schultz, W. Responses of midbrain dopamine neurons to behavioral trigger stimuli in the monkey. J. Neurophysiol. 56, 1439–1461 (1986).
44. Berke, J. D. What does dopamine mean? Nat. Neurosci. 21, 787–793 (2018).
45. Bromberg-Martin, E. S., Matsumoto, M. & Hikosaka, O. Distinct tonic and phasic anticipatory activity in lateral habenula and dopamine neurons. Neuron 67, 144–155 (2010).
46. Pasquereau, B. & Turner, R. S. Dopamine neurons encode errors in predicting movement trigger occurrence. J. Neurophysiol. 113, 1110–1123 (2015).
47. Fiorillo, C. D., Newsome, W. T. & Schultz, W. The temporal precision of reward prediction in dopamine neurons. Nat. Neurosci. 11, 966–973 (2008).
48. Morita, K. & Kato, A. Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits. Front. Neural Circuits 8, 36 (2014).
49. Gershman, S. J. Dopamine ramps are a consequence of reward prediction errors. Neural Comput. 26, 467–471 (2014).
50. Nicola, S. M. The flexible approach hypothesis: unification of effort and cue-responding hypotheses for the role of nucleus accumbens dopamine in the activation of reward-seeking behavior. J. Neurosci. 30, 16585–16600 (2010).
51. Paxinos, G. & Watson, C. The Rat Brain in Stereotaxic Coordinates 5th edn (Elsevier Academic, 2005).
52. Witten, I. B. et al. Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron 72, 721–733 (2011).
53. Sugrue, L. P., Corrado, G. S. & Newsome, W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).
54. Wong, J. M. et al. Benzoyl chloride derivatization with liquid chromatography-mass spectrometry for targeted metabolomics of neurochemicals in biological samples. J. Chromatogr. A 1446, 78–90 (2016).
55. Chung, J. E. et al. A fully automated approach to spike sorting. Neuron 95, 1381–1394 (2017).
56. Kvitsiani, D. et al. Distinct behavioural and network correlates of two interneuron types in prefrontal cortex. Nature 498, 363–366 (2013).
57. Grace, A. A. & Bunney, B. S. The control of firing pattern in nigral dopamine neurons: burst firing. J. Neurosci. 4, 2877–2890 (1984).
58. Lerner, T. N. et al. Intact-brain analyses reveal distinct information carried by SNc dopamine subcircuits. Cell 162, 635–647 (2015).

Acknowledgements

We thank P. Dayan, H. Fields, L. Frank, C. Donaghue and T. Faust for their comments on an early version of the manuscript, and V. Hetrick, R. Hashim and T. Davidson for technical assistance and advice. This work was supported by the National Institute on Drug Abuse, the National Institute of Mental Health, the National Institute of Neurological Disorders and Stroke, the University of Michigan, Ann Arbor, and the University of California, San Francisco.

Reviewer information

Nature thanks Margaret Rice and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

These authors contributed equally: Ali Mohebi, Jeffrey R. Pettibone

Authors and Affiliations

Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
Ali Mohebi, Jeffrey R. Pettibone & Joshua D. Berke
Department of Neuroscience, Brown University, Providence, RI, USA
Arif A. Hamid
Department of Chemistry, University of Michigan, Ann Arbor, MI, USA
Jenny-Marie T. Wong & Robert T. Kennedy
Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA, USA
Leah T. Vinson & Joshua D. Berke
Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
Tommaso Patriarchi & Lin Tian
Weill Institute for Neurosciences and Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA, USA
Joshua D. Berke

Contributions

A.M. performed and analysed the electrophysiology and photometry, and applied the computational model. J.R.P. performed and analysed the microdialysis with assistance from J.-M.T.W. and supervision by R.T.K. A.A.H. developed the behavioural task and initial photometry setup, and performed the voltammetry. L.T.V. performed retrograde tracing and analysis. T.P. and L.T. developed the dLight sensor and shared expertise. J.D.B. designed and supervised the study, and wrote the manuscript.

Corresponding author

Correspondence to Joshua D. Berke.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Microdialysis subregions and the reward rate parameter.

a, Top left, anatomical definitions of the subregions examined with microdialysis. Brain atlas outlines in this figure were reproduced with permission from Paxinos and Watson, 2005⁵¹. Other panels map the correlation between dopamine release and reward rate at individual probe placements in coronal (mm from bregma, B) and sagittal (mm from midline) planes. Colour bar shows strength of correlation. b, Top left, regression analysis showing dependency of (log) latency on the outcome of recent trials, during microdialysis sessions (n = 26 sessions, 7,113 trials, from 12 rats; error bars show s.e.m.). Asterisks indicate average regression weights significantly different from zero (t-test, P < 0.05). Top right, illustration of how the reward rate definition depends on the time constant (τ) of the leaky integrator. Top middle, dopamine:reward rate correlations as a function of τ. In the main figures, τ was chosen (from a range of 1–1,200 s) to maximize the (negative) correlation between reward rate and (log) latency in each session. Thin lines represent individual sessions, with the best-fit τ used in regression analyses indicated by a dot. Thick lines indicate the average of all dopamine:reward rate correlations for a given τ within each subregion. Overall behavioural metrics were similar between sessions sampling from each of the seven subregions (mean rewards per min: range 1.42–1.77, ANOVA F(6,44) = 0.58, P = 0.746; mean attempts per min: range 3.32–3.97, F(6,44) = 0.40, P = 0.872; mean latency: range 5.99–8.02, F(6,44) = 0.27, P = 0.948).
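The leaky-integrator reward rate and the per-session choice of time constant described in this legend can be sketched roughly as follows. This is an illustration only, not the authors' MATLAB code: the exact update form (each reward adds 1/tau and the estimate decays exponentially), the function names and the toy session data are all assumptions.

```python
import math

def leaky_reward_rate(reward_times, query_times, tau):
    """Reward rate at each query time from a leaky integrator.

    Assumed form: every past reward contributes exp(-dt / tau) / tau,
    where dt is the time elapsed since that reward (seconds).
    """
    rates = []
    for t in query_times:
        rate = sum(math.exp(-(t - rt) / tau) / tau
                   for rt in reward_times if rt <= t)
        rates.append(rate)
    return rates

def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_tau(reward_times, trial_times, latencies, taus):
    """Pick the tau giving the most negative correlation between
    reward rate at trial start and log latency (one session)."""
    log_latency = [math.log(lat) for lat in latencies]
    best = None
    for tau in taus:
        rates = leaky_reward_rate(reward_times, trial_times, tau)
        r = pearson(rates, log_latency)
        if best is None or r < best[1]:
            best = (tau, r)
    return best

# Hypothetical toy session (times in seconds), for illustration only.
toy_rewards = [5.0, 12.0, 20.0, 70.0]
toy_trials = [10.0, 18.0, 25.0, 60.0, 75.0]
toy_latencies = [2.0, 1.5, 1.0, 6.0, 2.5]
tau, r = best_tau(toy_rewards, toy_trials, toy_latencies, [1, 10, 100, 1200])
print(f"selected tau = {tau} s, correlation = {r:.2f}")
```

Searching a coarse grid of candidate values keeps the sketch short; the legend states only that τ was drawn from the range 1–1,200 s, so the grid spacing here is arbitrary.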

Extended Data Fig. 2 Correlations between all neurochemicals and a range of behavioural factors.

Bars represent R² values for linear tests between each analyte (rows) and behavioural covariates (columns). In models with more than one covariate, bar length indicates the R² for the full model. Negative relationships are reported in blue and positive relationships are in red. P values are reported at three alpha levels (0.05, 0.0005 and 0.000005) after Bonferroni correction for multiple comparisons (7 subregions × 21 analytes × 12 measures). To calculate reward rate, we averaged the leaky-integrator-estimated reward rate in 1-min bins defined by the start and end of each dialysis sample. ‘Attempts’ is the number of initiated trials (including trials that resulted in an error) in each dialysis minute. Attempts, reward rate and an interaction term were combined in a single model (column 2) to examine whether adding attempts could explain additional variance in the analyte signal that could not be explained by reward rate alone. ‘Latency’ is the average of the (log) latency in each minute. ‘Exploit’ is the proportion of choices of the higher reward probability option, in the last half of blocks for which the two ports had different probabilities. ‘Rewards’ and ‘omissions’ were defined as the number of rewarded and unrewarded trials in each minute, respectively. ‘Cumulative rewards’ and ‘time’ were included in the same regression model to estimate progressive factors such as satiety, and possible slow timescale increases or decreases in analyte concentration across the session. Cumulative rewards represents the total number of rewards received by the end of the current dialysis minute, and time was simply the number of minutes elapsed since the session began. Bars in this column show colour when only the coefficient for the cumulative reward variable was significant. %Ipsi and %Contra represent the fraction of choices to ipsi- or contra-versive ports (relative to probe location in the brain) in each minute, independent of block probability. P(win-stay) is the probability of repeating the previous choice, given the previous choice was rewarded.
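As a worked illustration of the Bonferroni correction mentioned in this legend, the per-test thresholds implied by 7 subregions × 21 analytes × 12 measures can be computed as below. The helper names are hypothetical, and the sketch assumes the standard form of the correction (each alpha level divided by the number of comparisons); the legend does not spell out how the correction was implemented.

```python
# Family size taken from the legend: 7 subregions x 21 analytes x 12 measures.
N_SUBREGIONS, N_ANALYTES, N_MEASURES = 7, 21, 12
N_TESTS = N_SUBREGIONS * N_ANALYTES * N_MEASURES

# The three alpha levels reported in the legend.
ALPHAS = [0.05, 0.0005, 0.000005]

def bonferroni_thresholds(alphas, n_tests):
    """Per-test P value thresholds after Bonferroni correction."""
    return [alpha / n_tests for alpha in alphas]

def levels_passed(p_value, alphas, n_tests):
    """How many of the corrected alpha levels an uncorrected P value passes."""
    return sum(p_value < alpha / n_tests for alpha in alphas)

for alpha, threshold in zip(ALPHAS, bonferroni_thresholds(ALPHAS, N_TESTS)):
    print(f"alpha = {alpha:g}: uncorrected P must be below {threshold:.3g}")
```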

Extended Data Fig. 3 Histological analysis of electrophysiological recording locations.

Left, atlas locations and histology photomicrographs for each rat (IM-657, IM-1002, IM-1003, IM-1037 and IM-1078) from which opto-tagged dopamine cells were obtained. Red, TH staining; green, ChR2–eYFP; blue, DAPI. Scale bars, 1 mm. IM-1037 and IM-1078 brains were sliced horizontally, so fibre tracks appear as circles. Font colours for rat ID numbers correspond to colours of tick marks in coronal atlas sections, indicating estimated recording locations for opto-tagged dopamine cells. For IM-1078, virus was injected into NAc core, and retrogradely infected dopamine neurons were recorded in VTA. Right, retrograde tracing of CTb from NAc core (top) to VTA-l (bottom). Top panel shows approximate extent of NAc labelling in each of the three rats (each rat indicated by a different colour). Bottom left panels show close-ups of TH labelling (blue), CTb (green) and merged image. Bottom right panels show reconstructed locations of TH⁺ and double-labelled TH⁺CTb⁺ midbrain neurons, on horizontal atlas sections. Estimated optrode locations are shown by red circles (or orange circle, in the case of the retrograde tagging rat IM-1078). Labelled neurons were counted within the red rectangles that span the AP and ML extent of estimated recording locations. Percentages shown are the fraction of TH⁺ neurons that are also CTb⁺. Brain atlas outlines in this figure were reproduced with permission from Paxinos and Watson, 2005⁵¹.

Extended Data Fig. 4 Identification of light-responsive cells.

a, Average waveforms of optogenetically identified dopamine neurons (negative voltage upwards). Average light-evoked waveforms are shown in blue and session-wide average waveforms are in black. All spikes within 10 ms of laser onset were used to construct the light-evoked waveform average. Averaged waveforms are normalized to have similar total peak-valley voltages (see Supplementary Fig. 1 for individual voltage ranges). b, Session-wide average waveform for non-dopamine cells. c, Opto-tagging P values for all units plotted on a log scale, showing a strongly bimodal distribution. To classify cells as light-responsive we used a threshold of P < 0.001. d, Times to first spike after laser onset, showing the mean for each identified dopamine neuron and its standard deviation (jitter).
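The quantities in panels c and d can be illustrated with a short sketch. It takes the opto-tagging P value as given, since the statistical test that produces it is not described in this legend, and it reuses the 10-ms window from panel a as the search window for the first spike, which is an assumption. Function names and spike times are hypothetical.

```python
import statistics

def first_spike_latencies(spike_times, laser_onsets, window=0.010):
    """Latency (s) from each laser onset to the first spike within `window`.

    Pulses with no spike inside the window are skipped.
    """
    latencies = []
    for onset in laser_onsets:
        after = [s - onset for s in spike_times if 0.0 <= s - onset <= window]
        if after:
            latencies.append(min(after))
    return latencies

def latency_summary(latencies):
    """Mean first-spike latency and its standard deviation (jitter)."""
    return statistics.mean(latencies), statistics.stdev(latencies)

def is_light_responsive(p_value, threshold=0.001):
    """Apply the P < 0.001 classification threshold stated in the legend."""
    return p_value < threshold

# Hypothetical spike train (s) and laser pulse onsets at 2 Hz.
toy_spikes = [0.003, 0.21, 0.504, 0.9, 1.0035, 1.52]
toy_onsets = [0.0, 0.5, 1.0, 1.5]
toy_latencies = first_spike_latencies(toy_spikes, toy_onsets)
mean_latency, jitter = latency_summary(toy_latencies)
print(f"mean latency = {mean_latency * 1000:.2f} ms, jitter = {jitter * 1000:.2f} ms")
```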

Extended Data Fig. 5 Dopaminergic responses to Pavlovian cues.

a, Tone pips were followed by reward delivery (‘click’) with different probabilities (zero, medium or high) depending on the tone pitch. During prior training (average 15.6 sessions, range 2–26) rats had learned about these different probabilities, as indicated by their corresponding scaled likelihood of entering the food port during cue presentation. ‘Head entry %’ indicates proportion of trials for which the rat was at the food port at each moment in time, for one example session. Red and blue indicate rewarded and unrewarded trials, respectively. This rat was more likely to go to the food port during the cue that was highly (75%) predictive of rewards compared to the other cues (25% and 0%; one-way ANOVA, F = 11.1, P < 1.2 × 10⁻⁶). Unpredictable reward delivery (right) prompts rapid approach. Bottom, raster plots and peri-event time histograms from an identified dopamine neuron during that same session. b, Averaged firing for identified dopamine cells (n = 27) in this task. High/medium tones were either 75%/25% predictive of reward (n = 9 cells) or 100%/50% (n = 18), respectively. Data on each individual dopamine neuron are presented in Supplementary Fig. 1. c, Behaviour (top), cue response (middle) and click response (bottom) for all Pavlovian sessions with opto-tagged dopamine cells. Statistical comparisons were all one-way ANOVA, using food port head entry during the 0.3–3-s epoch relative to cue onset, and peak firing rate during 0.5-s epochs after cue onset or food-hopper clicks. d–f, Same as above except for dLight measurements (n = 10 sessions total). All dLight sessions used tones with 75, 25 and 0% reward probability, and ANOVA tests examined peak signal within 1 s of cue onset or food-hopper clicks.

Extended Data Fig. 6 Results from each dLight recording session.

Each row shows a distinct optic fibre placement, and the corresponding recording session that was included in data analyses. For two rats (IM-1066 and IM-1088) we obtained bilateral NAc dLight recordings. From left to right, panels show histologically determined NAc location of fibre tip (within horizontal brain atlas section, including atlas coordinates⁵¹), long timescale cross-correlation with reward rate (as in Fig. 3c), short timescale cross-correlation with reward rate (black), SMDP state value (green) and RPE (magenta; as in Fig. 3f); event-aligned averages (as in Fig. 4b, but including more events). For Light-on and Centre-in alignments data are split by latencies <1 s (light green) or >2 s (dark green; as in Fig. 4d); for other alignments, data are split by rewarded (red) and unrewarded (blue) trials. Brain atlas outlines in this figure were reproduced with permission from Paxinos and Watson, 2005⁵¹.

Extended Data Fig. 7 Comparing event-aligned activity between different signals.

Format is as in Fig. 4. dLight fluorescence is here shown separately for 470-nm and 405-nm (control) excitation. Of note, (1) rapid, behaviour-linked dLight fluorescence changes occur at 470 nm, as expected, not in the control 405-nm band; (2) distinct timing of spiking, dLight, and voltammetry responses to cue onsets; and (3) non-dopamine cell firing is much more variable (wider error bands) but on average shows activity during movements: starting just before Centre-in (irrespective of latency), just before Side-in, and just before Food-port-in.

Extended Data Fig. 8 Different methods for calculating reward expectation produce similar results.

Left column, average firing rate of dopamine cells around Side-in, broken down by terciles of reward expectation, based either on recent reward rate (top; same as Fig. 5a), number of rewards in previous ten trials, state value (V) of an actor-critic model or state value (Q_left + Q_right) of a Q-learning model. The actor-critic and Q-learning models were both trial-based, rather than evolving continuously in time. The actor-critic model estimated the overall probability of receiving a reward on each trial, V, using the update rule V′ = V + α(RPE), in which RPE = actual reward [1 or 0] − V. The Q-learning model kept separate estimates of the probabilities of receiving rewards for left and right choices (Q_left and Q_right) and updated Q for the chosen action (only) using Q′ = Q + α(RPE), in which RPE = actual reward [1 or 0] − Q. The learning parameter α was determined for each session by best fit to latencies, for V or (Q_left + Q_right) respectively. The subsequent columns show correlations between reward expectation and dopamine cell firing after Side-in, measuring peak firing rate (top; within 250 ms after rewarded Side-in), minimum firing rate (middle; within 2 s after unrewarded Side-in) or pause duration (bottom; maximum inter-spike interval within 2 s after unrewarded Side-in). For all histograms, light blue indicates cells with significant correlations (P < 0.01) before multiple comparisons correction, and dark blue indicates cells that remained significant after correction. Positive RPE coding is strong and consistent; negative RPE coding is less so.
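The two trial-based update rules stated in this legend can be written out as a minimal sketch. The update equations follow the legend; the initial values (0.5 per estimate), the function names and the example trial sequence are assumptions, and the per-session fitting of the learning parameter to latencies is not shown.

```python
def actor_critic_values(rewards, alpha, v0=0.5):
    """Trial-based state value V with V' = V + alpha * RPE, RPE = reward - V.

    Returns the value held before each trial and the RPE on each trial.
    v0 is an assumed starting value.
    """
    v = v0
    values, rpes = [], []
    for reward in rewards:
        rpe = reward - v
        values.append(v)
        rpes.append(rpe)
        v = v + alpha * rpe
    return values, rpes

def q_learning_values(choices, rewards, alpha, q0=0.5):
    """Separate estimates for left and right; only the chosen Q is updated.

    Returns the state value (Q_left + Q_right) before each trial and the
    RPE on each trial. q0 is an assumed starting value for both actions.
    """
    q = {"left": q0, "right": q0}
    state_values, rpes = [], []
    for choice, reward in zip(choices, rewards):
        state_values.append(q["left"] + q["right"])
        rpe = reward - q[choice]
        rpes.append(rpe)
        q[choice] = q[choice] + alpha * rpe
    return state_values, rpes

# Hypothetical session: 1 = rewarded trial, 0 = unrewarded trial.
toy_choices = ["left", "left", "right", "left", "right"]
toy_rewards = [1, 0, 1, 1, 0]
v_trace, v_rpes = actor_critic_values(toy_rewards, alpha=0.3)
q_trace, q_rpes = q_learning_values(toy_choices, toy_rewards, alpha=0.3)
print("V before each trial:", [round(v, 3) for v in v_trace])
print("Q_left + Q_right before each trial:", [round(s, 3) for s in q_trace])
```

Either trace could then be split into terciles to group trials by reward expectation, as in the left column of the figure.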

Supplementary information

41586_2019_1235_MOESM1_ESM.pdf

Supplementary Figure 1 Properties of each individual identified dopamine cell (one per page; last two pages are retro-tagged cells). a, Average light-evoked spike waveform (blue) and session-wide average waveform (black). b, Interspike interval histogram (during bandit task). c, Raster plot showing response to 5-ms laser pulses (delivered at 2 Hz). d, Raster plot with 10-ms laser pulses (for cells that were tested under this condition). e, Scatter plot (as Fig. 2b), with this neuron highlighted in yellow. f, Behaviour, and g, activity during the Pavlovian approach task. h, Firing rate, latency and reward rate during the bandit task. i, Average response of this cell to the bandit task Side-in event, broken down by reward rate terciles (as Fig. 5a). j, Spike rasters and firing rate histograms aligned to various bandit task events.

Reporting Summary

About this article

Cite this article

Mohebi, A., Pettibone, J.R., Hamid, A.A. et al. Dissociable dopamine dynamics for learning and motivation. Nature 570, 65–70 (2019). https://doi.org/10.1038/s41586-019-1235-y

Received: 25 May 2018
Accepted: 29 April 2019
Published: 22 May 2019
Issue Date: 06 June 2019
DOI: https://doi.org/10.1038/s41586-019-1235-y
