Dopamine responses comply with basic assumptions of formal learning theory

Waelti, Pascale; Dickinson, Anthony; Schultz, Wolfram

doi:10.1038/35083500

Article
Published: 05 July 2001

Nature volume 412, pages 43–48 (2001)

Abstract

According to contemporary learning theories, the discrepancy, or error, between the actual and predicted reward determines whether learning occurs when a stimulus is paired with a reward. The role of prediction errors is directly demonstrated by the observation that learning is blocked when the stimulus is paired with a fully predicted reward. By using this blocking procedure, we show that the responses of dopamine neurons to conditioned stimuli were governed differentially by the occurrence of reward prediction errors rather than stimulus–reward associations alone, as was the learning of behavioural reactions. Both behavioural and neuronal learning occurred predominantly when dopamine neurons registered a reward prediction error at the time of the reward. Our data indicate that the use of analytical tests derived from formal behavioural learning theory provides a powerful approach for studying the role of single neurons in learning.
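The logic of blocking described in the abstract can be illustrated with a small simulation. The sketch below uses the Rescorla–Wagner rule, one standard formalization of error-driven learning that the abstract itself does not name. The stimulus labels (`A`, `B`, `X`, `Y`), the two-phase design with a control compound, the learning rate and the trial counts are all assumptions chosen for illustration, not values taken from the study.

```python
def rescorla_wagner(trials, weights=None, alpha=0.2):
    """Apply the Rescorla-Wagner update over a sequence of trials.

    trials  -- list of (stimuli, reward) pairs; stimuli is a tuple of labels
    weights -- starting associative strengths (absent labels count as 0.0)
    alpha   -- learning rate (illustrative value, not from the paper)
    """
    weights = dict(weights or {})
    for stimuli, reward in trials:
        # The prediction is the summed strength of all stimuli present.
        prediction = sum(weights.get(s, 0.0) for s in stimuli)
        # Prediction error: actual reward minus predicted reward.
        error = reward - prediction
        # Every stimulus present on the trial is updated by the same error.
        for s in stimuli:
            weights[s] = weights.get(s, 0.0) + alpha * error
    return weights


# Hypothetical design: A is pretrained with reward, B without reward.
pretraining = [(("A",), 1.0), (("B",), 0.0)] * 50
# Compound phase: X is added to A, Y is added to B; both compounds are rewarded.
compound = [(("A", "X"), 1.0), (("B", "Y"), 1.0)] * 50

weights = rescorla_wagner(pretraining)
weights = rescorla_wagner(compound, weights)

for label in ("A", "B", "X", "Y"):
    print(label, round(weights[label], 3))
```

Under these assumptions, the reward in the `A`+`X` compound is already predicted by `A`, so the error is close to zero and `X` gains almost no strength (it is blocked). The reward in the `B`+`Y` compound is unpredicted, so the error is large and `Y` acquires substantial strength.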

**Figure 1: Behavioural performance in the blocking paradigm and neuronal localizations.**

**Figure 2: Acquisition of dopamine responses to conditioned stimuli depends on prediction errors in the blocking paradigm.**

**Figure 3: Dopamine prediction error response at the time of the reward in the blocking paradigm.**

**Figure 4: Dopamine responses to unrewarded stimuli may reflect stimulus generalization rather than reward prediction.**

Acknowledgements

We thank B. Aebischer, J. Corpataux, A. Gaillard, B. Morandi, A. Pisani and F. Tinguely for expert technical assistance. The study was supported by the Swiss NSF, the European Union (Human Capital and Mobility, and Biomed 2 programmes), the James S. McDonnell Foundation and the British Council.

Author information

Authors and Affiliations

Institute of Physiology and Programme in Neuroscience, University of Fribourg, Fribourg, CH-1700, Switzerland
Pascale Waelti & Wolfram Schultz
Department of Experimental Psychology, University of Cambridge, Cambridge, CB2 3EB, UK
Anthony Dickinson

Corresponding author

Correspondence to Wolfram Schultz.

About this article

Cite this article

Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001). https://doi.org/10.1038/35083500

Received: 08 February 2001
Accepted: 17 May 2001
Issue Date: 05 July 2001
DOI: https://doi.org/10.1038/35083500
