Article

Dopamine transients are sufficient and necessary for acquisition of model-based associations

Nature Neuroscience volume 20, pages 735–742 (2017)

This article has been updated

Abstract

Associative learning is driven by prediction errors. Dopamine transients correlate with these errors, which current interpretations limit to endowing cues with a scalar quantity reflecting the value of future rewards. We tested whether dopamine might act more broadly to support learning of an associative model of the environment. Using sensory preconditioning, we show that prediction errors underlying stimulus–stimulus learning can be blocked behaviorally and reinstated by optogenetically activating dopamine neurons. We further show that suppressing the firing of these neurons across the transition prevents normal stimulus–stimulus learning. These results establish that the acquisition of model-based information about transitions between nonrewarding events is also driven by prediction errors and that, contrary to existing canon, dopamine transients are both sufficient and necessary to support this type of learning. Our findings open new possibilities for how these biological signals might support associative learning in the mammalian brain in these and other contexts.
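The sensory preconditioning logic described above can be made concrete with a small error-driven learning sketch. This is an illustrative toy, not the authors' model: the learning rule, variable names, and parameters are all hypothetical. Stage 1 learns a stimulus–stimulus transition (A predicts B, no reward), stage 2 pairs B directly with reward, and the probe shows how a model-based agent can infer value for A, which was never itself rewarded, by chaining the learned transition with B's value.

```python
# Illustrative sketch of sensory preconditioning with model-based inference.
# All names and parameters here are hypothetical, not the paper's model.

alpha = 0.1  # learning rate

# Stage 1: stimulus-stimulus learning -- cue A comes to predict cue B.
# No reward is involved; this is the transition learning the paper argues
# is driven by dopamine prediction errors.
transition = {"A": {}}
for _ in range(100):
    p = transition["A"].get("B", 0.0)
    transition["A"]["B"] = p + alpha * (1.0 - p)  # error-driven update

# Stage 2: cue B is paired directly with reward.
value = {"B": 0.0}
for _ in range(100):
    value["B"] += alpha * (1.0 - value["B"])  # same error-driven rule

# Probe: A was never paired with reward, but a model-based agent can
# infer its value through the learned A -> B transition.
inferred_value_A = transition["A"]["B"] * value["B"]
print(round(inferred_value_A, 2))  # -> 1.0 after learning converges
```

Both updates drive their prediction errors toward zero over trials, so the inferred value of A approaches the reward value of B; a purely model-free (cached-value) agent would assign A no value at the probe, which is the behavioral signature the experiments exploit.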


Change history

  • 10 April 2017

    In the version of this article initially published online, the checkered and filled boxes were reversed in the keys to Figures 3a and 3b. The error has been corrected in the print, PDF and HTML versions of this article.

  • 04 May 2017

    In the version of this article initially published, the histogram in Figure 2c, center top graph, was duplicated from the panel below, and the remaining histograms accompanying the scatter plots in Figures 2c and 5c were slightly mis-scaled and misaligned relative to the scatterplots. The histograms, as well as the vertical scaling of Figure 5c, bottom right graph, have been adjusted. Also, one data point from the scatterplot in the top right panel of Figure 2c had originally been transformed from a negative value on the vertical axis to its absolute value. The errors have been corrected in the PDF and HTML versions of this article.

  • 17 July 2018

    In the version of this article initially published, the laser activation at the start of cue X in experiment 1 was described in the first paragraph of the Results and in the third paragraph of the Experiment 1 section of the Methods as lasting 2 s; in fact, it lasted only 1 s. The error has been corrected in the HTML and PDF versions of the article.


Acknowledgements

The authors thank K. Deisseroth and the Gene Therapy Center at the University of North Carolina at Chapel Hill for providing viral reagents and G. Stuber for technical advice on their use. We also thank B. Harvey and the NIDA Optogenetic and Transgenic Core, M. Morales and the NIDA Histology Core for their assistance, and P. Dayan and N. Daw for their comments. This work was supported by R01-MH098861 (to Y.N.) and by the Intramural Research Program at NIDA ZIA-DA000587 (to G.S.). The opinions expressed in this article are the authors' own and do not reflect the view of the NIH/DHHS.

Author information

Affiliations

  1. NIDA Intramural Research Program, Baltimore, Maryland, USA.

    • Melissa J Sharpe
    • , Chun Yun Chang
    • , Melissa A Liu
    • , Hannah M Batchelor
    • , Lauren E Mueller
    • , Joshua L Jones
    •  & Geoffrey Schoenbaum
  2. Department of Psychology and Neuroscience Institute, Princeton University, Princeton, New Jersey, USA.

    • Melissa J Sharpe
    •  & Yael Niv
  3. Departments of Anatomy and of Neurobiology and Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, USA.

    • Geoffrey Schoenbaum
  4. Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University, Baltimore, Maryland, USA.

    • Geoffrey Schoenbaum

Authors

  1. Melissa J Sharpe
  2. Chun Yun Chang
  3. Melissa A Liu
  4. Hannah M Batchelor
  5. Lauren E Mueller
  6. Joshua L Jones
  7. Yael Niv
  8. Geoffrey Schoenbaum

Contributions

M.J.S. and G.S. designed the experiments; M.J.S., M.A.L., H.M.B. and L.E.M. collected the data with technical advice and assistance from C.Y.C. and J.L.J. M.J.S. and G.S. analyzed the data. M.J.S., Y.N. and G.S. interpreted the data and wrote the manuscript with input from all authors.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Melissa J Sharpe or Geoffrey Schoenbaum.

About this article

DOI: https://doi.org/10.1038/nn.4538