Dissociable dopamine dynamics for learning and motivation

A Publisher Correction to this article was published on 20 June 2019

This article has been updated

Abstract

The dopamine projection from ventral tegmental area (VTA) to nucleus accumbens (NAc) is critical for motivation to work for rewards and reward-driven learning. How dopamine supports both functions is unclear. Dopamine cell spiking can encode prediction errors, which are vital learning signals in computational theories of adaptive behaviour. By contrast, dopamine release ramps up as animals approach rewards, mirroring reward expectation. This mismatch might reflect differences in behavioural tasks, slower changes in dopamine cell spiking or spike-independent modulation of dopamine release. Here we compare spiking of identified VTA dopamine cells with NAc dopamine release in the same decision-making task. Cues that indicate an upcoming reward increased both spiking and release. However, NAc core dopamine release also covaried with dynamically evolving reward expectations, without corresponding changes in VTA dopamine cell spiking. Our results suggest a fundamental difference in how dopamine release is regulated to achieve distinct functions: broadcast burst signals promote learning, whereas local control drives motivation.

Fig. 1: Dopamine release covaries with reward rate specifically in NAc core and ventral prelimbic cortex.
Fig. 2: Activity of identified VTA dopamine neurons does not change with reward rate.
Fig. 3: Bridging timescales of dopamine measurement.
Fig. 4: Phasic VTA dopamine firing does not account for NAc dopamine dynamics.
Fig. 5: Reward history affects VTA dopamine cell firing and NAc dopamine release differently.

Data availability

The AAV.Synapsin.dLight1.3b virus used in this study has been deposited with Addgene (no. 125560; http://www.addgene.org). All data will be available through the Collaborative Research in Computational Neuroscience data sharing website (https://doi.org/10.6080/K0VQ30V9).

Code availability

Custom MATLAB code is available on request from J.D.B.

Change history

  • 20 June 2019

    In this Article, an extraneous label appeared in Fig. 4b, and has been removed in the online version.

References

  1. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).

  2. Pan, W. X., Schmidt, R., Wickens, J. R. & Hyland, B. I. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J. Neurosci. 25, 6235–6242 (2005).

  3. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012).

  4. Steinberg, E. E. et al. A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973 (2013).

  5. Hamid, A. A. et al. Mesolimbic dopamine signals the value of work. Nat. Neurosci. 19, 117–126 (2016).

  6. Saunders, B. T., Richard, J. M., Margolis, E. B. & Janak, P. H. Dopamine neurons create Pavlovian conditioned stimuli with circuit-defined motivational properties. Nat. Neurosci. 21, 1072–1083 (2018).

  7. Phillips, P. E., Stuber, G. D., Heien, M. L., Wightman, R. M. & Carelli, R. M. Subsecond dopamine release promotes cocaine seeking. Nature 422, 614–618 (2003).

  8. Roitman, M. F., Stuber, G. D., Phillips, P. E., Wightman, R. M. & Carelli, R. M. Dopamine operates as a subsecond modulator of food seeking. J. Neurosci. 24, 1265–1271 (2004).

  9. Wassum, K. M., Ostlund, S. B. & Maidment, N. T. Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol. Psychiatry 71, 846–854 (2012).

  10. Howe, M. W., Tierney, P. L., Sandberg, S. G., Phillips, P. E. & Graybiel, A. M. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013).

  11. Syed, E. C. et al. Action initiation shapes mesolimbic dopamine encoding of future rewards. Nat. Neurosci. 19, 34–36 (2016).

  12. Morris, G., Nevet, A., Arkadir, D., Vaadia, E. & Bergman, H. Midbrain dopamine neurons encode decisions for future action. Nat. Neurosci. 9, 1057–1063 (2006).

  13. da Silva, J. A., Tecuapetla, F., Paixão, V. & Costa, R. M. Dopamine neuron activity before action initiation gates and invigorates future movements. Nature 554, 244–248 (2018).

  14. Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).

  15. Patriarchi, T. et al. Ultrafast neuronal imaging of dopamine dynamics with designed genetically encoded sensors. Science 360, eaat4422 (2018).

  16. Salamone, J. D. & Correa, M. The mysterious motivational functions of mesolimbic dopamine. Neuron 76, 470–485 (2012).

  17. Schultz, W. Predictive reward signal of dopamine neurons. J. Neurophysiol. 80, 1–27 (1998).

  18. Garris, P. A. & Wightman, R. M. Different kinetics govern dopaminergic transmission in the amygdala, prefrontal cortex, and striatum: an in vivo voltammetric study. J. Neurosci. 14, 442–450 (1994).

  19. Frank, M. J., Doll, B. B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat. Neurosci. 12, 1062–1068 (2009).

  20. St Onge, J. R., Ahn, S., Phillips, A. G. & Floresco, S. B. Dynamic fluctuations in dopamine efflux in the prefrontal cortex and nucleus accumbens during risk-based decision making. J. Neurosci. 32, 16880–16891 (2012).

  21. Bartra, O., McGuire, J. T. & Kable, J. W. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412–427 (2013).

  22. Ikemoto, S. Dopamine reward circuitry: two projection systems from the ventral midbrain to the nucleus accumbens-olfactory tubercle complex. Brain Res. Brain Res. Rev. 56, 27–78 (2007).

  23. Breton, J. M. et al. Relative contributions and mapping of ventral tegmental area dopamine and GABA neurons by projection target in the rat. J. Comp. Neurol. (2018).

  24. Ungless, M. A., Magill, P. J. & Bolam, J. P. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 303, 2040–2042 (2004).

  25. Morales, M. & Margolis, E. B. Ventral tegmental area: cellular heterogeneity, connectivity and behaviour. Nat. Rev. Neurosci. 18, 73–85 (2017).

  26. Morris, G., Arkadir, D., Nevet, A., Vaadia, E. & Bergman, H. Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43, 133–143 (2004).

  27. Floresco, S. B., West, A. R., Ash, B., Moore, H. & Grace, A. A. Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat. Neurosci. 6, 968–973 (2003).

  28. Grace, A. A. Dysregulation of the dopamine system in the pathophysiology of schizophrenia and depression. Nat. Rev. Neurosci. 17, 524–532 (2016).

  29. Cohen, J. Y., Amoroso, M. W. & Uchida, N. Serotonergic neurons signal reward and punishment on multiple timescales. eLife 4, e06346 (2015).

  30. Niv, Y., Daw, N. & Dayan, P. How fast to work: response vigor, motivation and tonic dopamine. Adv. Neural Inf. Process. Syst. 18, 1019 (2006).

  31. Bayer, H. M., Lau, B. & Glimcher, P. W. Statistics of midbrain dopamine neuron spike trains in the awake primate. J. Neurophysiol. 98, 1428–1439 (2007).

  32. Chergui, K., Suaud-Chagny, M. F. & Gonon, F. Nonlinear relationship between impulse flow, dopamine release and dopamine elimination in the rat brain in vivo. Neuroscience 62, 641–645 (1994).

  33. Parker, N. F. et al. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat. Neurosci. 19, 845–854 (2016).

  34. Menegas, W., Babayan, B. M., Uchida, N. & Watabe-Uchida, M. Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum in mice. eLife 6, e21886 (2017).

  35. Trulson, M. E. Simultaneous recording of substantia nigra neurons and voltammetric release of dopamine in the caudate of behaving cats. Brain Res. Bull. 15, 221–223 (1985).

  36. Glowinski, J., Chéramy, A., Romo, R. & Barbeito, L. Presynaptic regulation of dopaminergic transmission in the striatum. Cell. Mol. Neurobiol. 8, 7–17 (1988).

  37. Zhou, F. M., Liang, Y. & Dani, J. A. Endogenous nicotinic cholinergic activity regulates dopamine release in the striatum. Nat. Neurosci. 4, 1224–1229 (2001).

  38. Threlfell, S. et al. Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron 75, 58–64 (2012).

  39. Cachope, R. et al. Selective activation of cholinergic interneurons enhances accumbal phasic dopamine release: setting the tone for reward processing. Cell Reports 2, 33–41 (2012).

  40. Sulzer, D., Cragg, S. J. & Rice, M. E. Striatal dopamine neurotransmission: regulation of release and uptake. Basal Ganglia 6, 123–148 (2016).

  41. Floresco, S. B., Yang, C. R., Phillips, A. G. & Blaha, C. D. Basolateral amygdala stimulation evokes glutamate receptor-dependent dopamine efflux in the nucleus accumbens of the anaesthetized rat. Eur. J. Neurosci. 10, 1241–1251 (1998).

  42. Jones, J. L. et al. Basolateral amygdala modulates terminal dopamine release in the nucleus accumbens and conditioned responding. Biol. Psychiatry 67, 737–744 (2010).

  43. Schultz, W. Responses of midbrain dopamine neurons to behavioral trigger stimuli in the monkey. J. Neurophysiol. 56, 1439–1461 (1986).

  44. Berke, J. D. What does dopamine mean? Nat. Neurosci. 21, 787–793 (2018).

  45. Bromberg-Martin, E. S., Matsumoto, M. & Hikosaka, O. Distinct tonic and phasic anticipatory activity in lateral habenula and dopamine neurons. Neuron 67, 144–155 (2010).

  46. Pasquereau, B. & Turner, R. S. Dopamine neurons encode errors in predicting movement trigger occurrence. J. Neurophysiol. 113, 1110–1123 (2015).

  47. Fiorillo, C. D., Newsome, W. T. & Schultz, W. The temporal precision of reward prediction in dopamine neurons. Nat. Neurosci. 11, 966–973 (2008).

  48. Morita, K. & Kato, A. Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits. Front. Neural Circuits 8, 36 (2014).

  49. Gershman, S. J. Dopamine ramps are a consequence of reward prediction errors. Neural Comput. 26, 467–471 (2014).

  50. Nicola, S. M. The flexible approach hypothesis: unification of effort and cue-responding hypotheses for the role of nucleus accumbens dopamine in the activation of reward-seeking behavior. J. Neurosci. 30, 16585–16600 (2010).

  51. Paxinos, G. & Watson, C. The Rat Brain in Stereotaxic Coordinates 5th edn (Elsevier Academic, 2005).

  52. Witten, I. B. et al. Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron 72, 721–733 (2011).

  53. Sugrue, L. P., Corrado, G. S. & Newsome, W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).

  54. Wong, J. M. et al. Benzoyl chloride derivatization with liquid chromatography-mass spectrometry for targeted metabolomics of neurochemicals in biological samples. J. Chromatogr. A 1446, 78–90 (2016).

  55. Chung, J. E. et al. A fully automated approach to spike sorting. Neuron 95, 1381–1394 (2017).

  56. Kvitsiani, D. et al. Distinct behavioural and network correlates of two interneuron types in prefrontal cortex. Nature 498, 363–366 (2013).

  57. Grace, A. A. & Bunney, B. S. The control of firing pattern in nigral dopamine neurons: burst firing. J. Neurosci. 4, 2877–2890 (1984).

  58. Lerner, T. N. et al. Intact-brain analyses reveal distinct information carried by SNc dopamine subcircuits. Cell 162, 635–647 (2015).

Acknowledgements

We thank P. Dayan, H. Fields, L. Frank, C. Donaghue and T. Faust for their comments on an early version of the manuscript, and V. Hetrick, R. Hashim and T. Davidson for technical assistance and advice. This work was supported by the National Institute on Drug Abuse, the National Institute of Mental Health, the National Institute of Neurological Disorders and Stroke, the University of Michigan, Ann Arbor, and the University of California, San Francisco.

Reviewer information

Nature thanks Margaret Rice and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Contributions

A.M. performed and analysed the electrophysiology and photometry, and applied the computational model. J.R.P. performed and analysed the microdialysis with assistance from J.-M.T.W. and supervision by R.T.K. A.A.H. developed the behavioural task and initial photometry setup, and performed the voltammetry. L.T.V. performed retrograde tracing and analysis. T.P. and L.T. developed the dLight sensor and shared expertise. J.D.B. designed and supervised the study, and wrote the manuscript.

Corresponding author

Correspondence to Joshua D. Berke.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Microdialysis subregions and the reward rate parameter.

a, Top left, anatomical definitions of the subregions examined with microdialysis. Brain atlas outlines in this figure were reproduced with permission from Paxinos and Watson, 2005 (ref. 51). Other panels map the correlation between dopamine release and reward rate at individual probe placements in coronal (mm from bregma, B) and sagittal (mm from midline) planes. Colour bar shows strength of correlation. b, Top left, regression analysis showing dependency of (log) latency on the outcome of recent trials, during microdialysis sessions (n = 26 sessions, 7,113 trials, from 12 rats; error bars show s.e.m.). *Average regression weights significantly different from zero (t-test, P < 0.05). Top right, illustration of how the reward rate definition depends on the time constant (τ) of the leaky integrator. Top middle, dopamine:reward rate correlations as a function of τ. In the main figures, τ was chosen (from a range of 1–1,200 s) to maximize the (negative) correlation between reward rate and (log) latency in each session. Thin lines represent individual sessions, with the best-fit τ used in regression analyses indicated by a dot. Thick lines indicate the average of all dopamine:reward rate correlations for a given τ within each subregion. Overall behavioural metrics were similar between sessions sampling from each of the seven subregions (mean rewards per min: range 1.42–1.77, ANOVA F(6,44) = 0.58, P = 0.746; mean attempts per min: range 3.32–3.97, F(6,44) = 0.40, P = 0.872; mean latency: range 5.99–8.02, F(6,44) = 0.27, P = 0.948).
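As a rough illustration of the reward rate parameter described in this caption, the following Python sketch implements one possible leaky integrator and the selection of τ by most negative correlation with (log) latency. The exact kernel is an assumption: here each reward adds 1/τ and the estimate decays exponentially with time constant τ. The function names (`leaky_reward_rate`, `pearson`, `best_tau`) are hypothetical and are not taken from the authors' MATLAB code.

```python
import math

def leaky_reward_rate(reward_times, query_times, tau):
    """Leaky-integrator reward rate at each query time.

    Both time lists are in seconds and sorted ascending.
    Assumption: each reward adds 1/tau, then decays as exp(-dt/tau).
    """
    rates = []
    level = 0.0     # integrator value just after the last processed reward
    last_t = None   # time of the last processed reward
    i = 0
    for q in query_times:
        # Fold in every reward delivered at or before this query time.
        while i < len(reward_times) and reward_times[i] <= q:
            t = reward_times[i]
            if last_t is not None:
                level *= math.exp(-(t - last_t) / tau)
            level += 1.0 / tau
            last_t = t
            i += 1
        if last_t is None:
            rates.append(0.0)
        else:
            rates.append(level * math.exp(-(q - last_t) / tau))
    return rates

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def best_tau(reward_times, trial_times, log_latencies, taus):
    """Pick the tau whose reward rate correlates most negatively with log latency."""
    return min(
        taus,
        key=lambda tau: pearson(
            leaky_reward_rate(reward_times, trial_times, tau), log_latencies
        ),
    )
```

In use, `taus` would span the 1–1,200 s range given in the caption, and `trial_times` would be the moments at which each latency was measured.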

Extended Data Fig. 2 Correlations between all neurochemicals and a range of behavioural factors.

Bars represent R² values for linear tests between each analyte (rows) and behavioural covariates (columns). In models with more than one covariate, bar length indicates the R² for the full model. Negative relationships are reported in blue and positive relationships are in red. P values are reported at three alpha levels (0.05, 0.0005 and 0.000005) after Bonferroni correction for multiple comparisons (7 subregions × 21 analytes × 12 measures). To calculate reward rate, we averaged the leaky-integrator-estimated reward rate in 1-min bins defined by the start and end of each dialysis sample. ‘Attempts’ is the number of initiated trials (including trials that resulted in an error) in each dialysis minute. Attempts and reward rate and an interaction term were combined in a single model (column 2) to examine whether adding attempts could explain additional variance in the analyte signal that could not be explained by reward rate alone. ‘Latency’ is the average of the (log) latency in each minute. ‘Exploit’ is the proportion of choices of the higher reward probability option, in the last half of blocks for which the two ports had different probabilities. ‘Rewards’ and ‘omissions’ were defined as the number of rewarded and unrewarded trials in each minute, respectively. ‘Cumulative rewards’ and ‘time’ were included in the same regression model to estimate progressive factors such as satiety, and possible slow timescale increases or decreases in analyte concentration across the session. Cumulative rewards represents the total number of rewards received by the end of the current dialysis minute, and time was simply the number of minutes elapsed since the session began. Bars in this column show colour when only the coefficient for the cumulative reward variable was significant. %Ipsi and %Contra represent the fraction of choices to ipsi- or contra-versive ports (relative to probe location in the brain) in each minute, independent of block probability. P(win-stay) is the probability of repeating the previous choice, given that the previous choice was rewarded.
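The Bonferroni correction mentioned in this caption can be sketched in a few lines of Python. The number of comparisons (7 × 21 × 12) and the three alpha levels come from the caption; the function names are hypothetical, and dividing each alpha by the number of tests is the standard Bonferroni procedure rather than a detail confirmed by the authors' code.

```python
N_TESTS = 7 * 21 * 12                 # subregions x analytes x measures
ALPHAS = [0.05, 0.0005, 0.000005]     # the three reporting levels

def bonferroni_thresholds(alphas, n_tests):
    """Per-test P value thresholds after Bonferroni correction."""
    return [a / n_tests for a in alphas]

def significance_level(p, alphas=ALPHAS, n_tests=N_TESTS):
    """Number of alpha levels (0 to 3) that an uncorrected P value passes."""
    return sum(p < t for t in bonferroni_thresholds(alphas, n_tests))
```

With 1,764 comparisons, an uncorrected P value has to fall below roughly 2.8 × 10⁻⁵ to be reported at the 0.05 level.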

Extended Data Fig. 3 Histological analysis of electrophysiological recording locations.

Left, atlas locations and histology photomicrographs for each rat (IM-657, IM-1002, IM-1003, IM-1037 and IM-1078) from which opto-tagged dopamine cells were obtained. Red, TH staining; green, ChR2–eYFP; blue, DAPI. Scale bars, 1 mm. IM-1037 and IM-1078 brains were sliced horizontally, so fibre tracks appear as a circle. Font colours for rat ID numbers correspond to colours of tick marks in coronal atlas sections, indicating estimated recording locations for opto-tagged dopamine cells. For IM-1078, virus was injected into NAc core, and retrogradely infected dopamine neurons were recorded in VTA. Right, retrograde tracing of CTb from NAc core (top) to VTA-l (bottom). Top panel shows approximate extent of NAc labelling in each of the three rats (each rat indicated by a different colour). Bottom left panels show close-ups of TH labelling (blue), CTb (green) and merged image. Bottom right panels show reconstructed locations of TH+ and double-labelled TH+CTb+ midbrain neurons, on horizontal atlas sections. Estimated optrode locations are shown by red circles (or orange circle, in the case of the retrograde tagging rat IM-1078). Labelled neurons were counted within the red rectangles that span the AP and ML extent of estimated recording locations. Percentages shown are the fraction of TH+ neurons that are also CTb+. Brain atlas outlines in this figure were reproduced with permission from Paxinos and Watson, 2005 (ref. 51).

Extended Data Fig. 4 Identification of light-responsive cells.

a, Average waveforms of optogenetically identified dopamine neurons (negative voltage upwards). Average light-evoked waveforms are shown in blue and session-wide average waveforms are in black. All spikes within 10 ms of laser onset were used to construct light-evoked waveform average. Averaged waveforms are normalized to have similar total peak-valley voltages (see Supplementary Fig. 1 for individual voltage ranges). b, Session-wide average waveform for non-dopamine cells. c, Opto-tagging P value for all units plotted in log-scale, showing a strong bimodal distribution. To classify cells as light-responsive we used a threshold of P < 0.001. d, Times to first spike after laser onset, showing mean for each identified dopamine neuron, and standard deviation (jitter).
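The latency and jitter measures in panel d could be computed along the following lines. This is a minimal sketch, not the authors' procedure: the 10-ms search window is borrowed from the waveform-averaging window in panel a, the use of the sample standard deviation is an assumption, and the function name is hypothetical. The opto-tagging P value itself is not reproduced here because the caption does not specify the test.

```python
import bisect
import statistics

def first_spike_latency_stats(spike_times, laser_onsets, window=0.010):
    """Mean first-spike latency and jitter (s.d.) after laser onset.

    spike_times: sorted spike times in seconds.
    laser_onsets: laser pulse onset times in seconds.
    window: assumed search window after each onset, in seconds.
    Returns (latencies, mean, sd); mean and sd are None if too few responses.
    """
    latencies = []
    for onset in laser_onsets:
        # Index of the first spike at or after this laser onset.
        i = bisect.bisect_left(spike_times, onset)
        if i < len(spike_times) and spike_times[i] - onset <= window:
            latencies.append(spike_times[i] - onset)
    mean = statistics.mean(latencies) if latencies else None
    sd = statistics.stdev(latencies) if len(latencies) >= 2 else None
    return latencies, mean, sd
```

Pulses with no spike inside the window are simply skipped, so the mean and jitter describe only the trials on which the cell responded.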

Extended Data Fig. 5 Dopaminergic responses to Pavlovian cues.

a, Tone pips were followed by reward delivery (‘click’) with different probabilities (zero, medium or high) depending on the tone pitch. During prior training (average 15.6 sessions, range 2–26) rats had learned about these different probabilities, as indicated by their correspondingly scaled likelihood of entering the food port during cue presentation. ‘Head entry %’ indicates proportion of trials for which the rat was at the food port at each moment in time, for one example session. Red and blue indicate rewarded and unrewarded trials, respectively. This rat was more likely to go to the food port during the cue that was highly (75%) predictive of rewards compared to the other cues (25% and 0%; one-way ANOVA, F = 11.1, P < 1.2 × 10⁻⁶). Unpredictable reward delivery (right) prompts rapid approach. Bottom, raster plots and peri-event time histograms from an identified dopamine neuron during that same session. b, Averaged firing for identified dopamine cells (n = 27) in this task. High/medium tones were either 75%/25% predictive of reward (n = 9 cells) or 100%/50% (n = 18), respectively. Data on each individual dopamine neuron are presented in Supplementary Fig. 1. c, Behaviour (top), cue response (middle) and click response (bottom) for all Pavlovian sessions with opto-tagged dopamine cells. Statistical comparisons were all one-way ANOVA, using food port head entry during the 0.3–3-s epoch relative to cue onset, and peak firing rate during 0.5-s epochs after cue onset or food-hopper clicks. d–f, Same as a–c except for dLight measurements (n = 10 sessions total). All dLight sessions used tones with 75, 25 and 0% reward probability, and ANOVA tests examined peak signal within 1 s of cue onset or food-hopper clicks.
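For readers unfamiliar with the one-way ANOVA used for these comparisons, the F statistic can be computed as in the Python sketch below. This is the textbook formula (between-group mean square divided by within-group mean square), shown for illustration only; the function name is hypothetical, and the P value is omitted because it requires the F distribution, which is outside the standard library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values.

    For example, each group could hold per-trial peak responses
    for one cue type (high, medium or zero reward probability).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```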

Extended Data Fig. 6 Results from each dLight recording session.

Each row shows a distinct optic fibre placement, and the corresponding recording session that was included in data analyses. For two rats (IM-1066 and IM-1088) we obtained bilateral NAc dLight recordings. From left to right, panels show histologically determined NAc location of fibre tip (within horizontal brain atlas section, including atlas coordinates; ref. 51); long timescale cross-correlation with reward rate (as in Fig. 3c); short timescale cross-correlation with reward rate (black), SMDP state value (green) and RPE (magenta; as in Fig. 3f); and event-aligned averages (as in Fig. 4b, but including more events). For Light-on and Centre-in alignments, data are split by latencies <1 s (light green) or >2 s (dark green; as in Fig. 4d); for other alignments, data are split by rewarded (red) and unrewarded (blue) trials. Brain atlas outlines in this figure were reproduced with permission from Paxinos and Watson, 2005 (ref. 51).

Extended Data Fig. 7 Comparing event-aligned activity between different signals.

Format is as in Fig. 4. dLight fluorescence is here shown separately for 470-nm and 405-nm (control) excitation. Of note, (1) rapid, behaviour-linked dLight fluorescence changes occur at 470 nm, as expected, not in the control 405-nm band; (2) distinct timing of spiking, dLight, and voltammetry responses to cue onsets; and (3) non-dopamine cell firing is much more variable (wider error bands) but on average shows activity during movements: starting just before Centre-in (irrespective of latency), just before Side-in, and just before Food-port-in.

Extended Data Fig. 8 Different methods for calculating reward expectation produce similar results.

Left column, average firing rate of dopamine cells around Side-in, broken down by terciles of reward expectation, based on recent reward rate (top; same as Fig. 5a), number of rewards in the previous ten trials, state value (V) of an actor-critic model or state value (Qleft + Qright) of a Q-learning model. The actor-critic and Q-learning models were both trial-based, rather than evolving continuously in time. The actor-critic model estimated the overall probability of receiving a reward on each trial, V, using the update rule V′ = V + alpha(RPE), in which RPE = actual reward [1 or 0] − V. The Q-learning model kept separate estimates of the probabilities of receiving rewards for left and right choices (Qleft and Qright) and updated Q for the chosen action (only) using Q′ = Q + alpha(RPE), in which RPE = actual reward [1 or 0] − Q. The learning parameter alpha was determined for each session by best fit to latencies, for V or (Qleft + Qright), respectively. The subsequent columns show correlations between reward expectation and dopamine cell firing after Side-in, measuring peak firing rate (top; within 250 ms after rewarded Side-in), minimum firing rate (middle; within 2 s after unrewarded Side-in) or pause duration (bottom; maximum inter-spike interval within 2 s after unrewarded Side-in). For all histograms, light blue indicates cells with significant correlations (P < 0.01) before multiple comparisons correction, and dark blue indicates cells that remained significant after correction. Positive RPE coding is strong and consistent; negative RPE coding is less so.
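The two trial-based update rules in this caption translate directly into code. The Python sketch below follows the stated rules, V′ = V + alpha(RPE) and Q′ = Q + alpha(RPE) for the chosen action only. The initial values (0.5), the function names and the choice to report the value held before each trial's outcome are assumptions made for illustration; the caption does not specify them.

```python
def actor_critic_values(rewards, alpha, v0=0.5):
    """State value V before each trial's outcome.

    rewards: sequence of 1 (rewarded) or 0 (unrewarded).
    v0: assumed initial estimate of reward probability.
    """
    v = v0
    values = []
    for r in rewards:
        values.append(v)
        rpe = r - v          # RPE = actual reward - V
        v = v + alpha * rpe  # V' = V + alpha * RPE
    return values

def q_learning_values(choices, rewards, alpha, q0=0.5):
    """Summed value Qleft + Qright before each trial's outcome.

    choices: sequence of "left" or "right".
    Only the chosen action's Q is updated on each trial.
    """
    q = {"left": q0, "right": q0}
    values = []
    for c, r in zip(choices, rewards):
        values.append(q["left"] + q["right"])
        rpe = r - q[c]           # RPE = actual reward - Q(chosen)
        q[c] = q[c] + alpha * rpe
    return values
```

For example, `actor_critic_values([1, 0], alpha=0.5)` gives 0.5 before the first trial and 0.75 before the second, since the first reward produces an RPE of 0.5. Fitting alpha to latencies, as the caption describes, would wrap these functions in a search over candidate alpha values.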

Supplementary information

41586_2019_1235_MOESM1_ESM.pdf

Supplementary Figure 1 Properties of each individual identified dopamine cell (one per page; last two pages are retro-tagged cells). a, Average light-evoked spike waveform (blue) and session-wide average waveform (black). b, Interspike interval histogram (during bandit task). c, Raster plot showing response to 5-ms laser pulses (delivered at 2 Hz). d, Raster plot with 10-ms laser pulses (for cells that were tested under this condition). e, Scatter plot (as Fig. 2b), with this neuron highlighted in yellow. f, Behavior, and g, activity during the Pavlovian approach task. h, Firing rate, latency and reward rate during the bandit task. i, Average response of this cell to the bandit task Side-in event, broken down by reward rate terciles (as Fig. 5a). j, Spike rasters and firing rate histograms aligned to various bandit task events.

Reporting Summary

Cite this article

Mohebi, A., Pettibone, J.R., Hamid, A.A. et al. Dissociable dopamine dynamics for learning and motivation. Nature 570, 65–70 (2019). https://doi.org/10.1038/s41586-019-1235-y

  • DOI: https://doi.org/10.1038/s41586-019-1235-y
