Abstract
Theories of instrumental learning are centred on understanding how success and failure are used to improve future decisions [1]. These theories highlight a central role for reward prediction errors in updating the values associated with available actions [2]. In animals, substantial evidence indicates that the neurotransmitter dopamine might have a key function in this type of learning, through its ability to modulate cortico-striatal synaptic efficacy [3]. However, no direct evidence links dopamine, striatal activity and behavioural choice in humans. Here we show that, during instrumental learning, the magnitude of reward prediction error expressed in the striatum is modulated by the administration of drugs enhancing (3,4-dihydroxy-L-phenylalanine; L-DOPA) or reducing (haloperidol) dopaminergic function. Accordingly, subjects treated with L-DOPA have a greater propensity to choose the most rewarding action relative to subjects treated with haloperidol. Furthermore, incorporating the magnitude of the prediction errors into a standard action-value learning algorithm accurately reproduced subjects' behavioural choices under the different drug conditions. We conclude that dopamine-dependent modulation of striatal activity can account for how the human brain uses reward prediction errors to improve future decisions.
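As an illustration of what a "standard action-value learning algorithm" driven by reward prediction errors can look like, the following is a minimal sketch and not the authors' fitted model. It assumes a two-option task, a delta-rule value update and a softmax choice rule. The function names, the learning rate `alpha`, the temperature `beta`, the reward probabilities and the `reward_size` parameter (a stand-in for the idea that the effective magnitude of the prediction error could differ between drug conditions) are all hypothetical choices made for the example.

```python
import math
import random


def choice_prob_a(q_a, q_b, beta):
    """Softmax probability of picking option A given two action values.

    beta is a temperature (assumed parameter): smaller values make
    choices more deterministic.
    """
    return 1.0 / (1.0 + math.exp((q_b - q_a) / beta))


def update(q, action, reward, alpha):
    """Delta-rule update of the chosen action's value.

    Returns the reward prediction error (reward minus expected value).
    """
    delta = reward - q[action]
    q[action] += alpha * delta
    return delta


def simulate(n_trials=30, alpha=0.3, beta=0.2, reward_size=1.0,
             p_reward=(0.8, 0.2), seed=0):
    """Simulate one learner on a two-option probabilistic reward task.

    All default values are illustrative assumptions, not values
    reported in the article.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]          # action values start at zero
    choices = []
    for _ in range(n_trials):
        p_a = choice_prob_a(q[0], q[1], beta)
        action = 0 if rng.random() < p_a else 1
        reward = reward_size if rng.random() < p_reward[action] else 0.0
        update(q, action, reward, alpha)
        choices.append(action)
    return q, choices


if __name__ == "__main__":
    # Compare two hypothetical effective reward magnitudes.
    for size in (1.0, 0.5):
        q, choices = simulate(reward_size=size)
        frac_best = choices.count(0) / len(choices)
        print(f"reward_size={size}: final values={q}, "
              f"fraction choosing option 0={frac_best:.2f}")
```

In this sketch a larger `reward_size` produces larger prediction errors on rewarded trials, so the learned values separate more and the softmax rule tends to favour the richer option more strongly. This is only meant to make the abstract's reasoning concrete under the stated assumptions.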
References
Thorndike, E. L. Animal Intelligence: Experimental Studies (Macmillan, New York, 1911)
Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997)
Wise, R. A. Dopamine, learning and motivation. Nature Rev. Neurosci. 5, 483–494 (2004)
Everitt, B. J. et al. Associative processes in addiction and reward. The role of amygdala-ventral striatal subsystems. Ann. NY Acad. Sci. 877, 412–438 (1999)
Ikemoto, S. & Panksepp, J. The role of nucleus accumbens dopamine in motivated behavior: a unifying interpretation with special reference to reward-seeking. Brain Res. Brain Res. Rev. 31, 6–41 (1999)
Hollerman, J. R. & Schultz, W. Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neurosci. 1, 304–309 (1998)
Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001)
Nakahara, H., Itoh, H., Kawagoe, R., Takikawa, Y. & Hikosaka, O. Dopamine neurons can represent context-dependent prediction error. Neuron 41, 269–280 (2004)
Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005)
Smith, A. D. & Bolam, J. P. The neural network of the basal ganglia as revealed by the study of synaptic connections of identified neurones. Trends Neurosci. 13, 259–265 (1990)
Calabresi, P. et al. Synaptic transmission in the striatum: from plasticity to neurodegeneration. Prog. Neurobiol. 61, 231–265 (2000)
Tremblay, L., Hollerman, J. R. & Schultz, W. Modifications of reward expectation-related neuronal activity during learning in primate striatum. J. Neurophysiol. 80, 964–977 (1998)
Frank, M. J., Seeberger, L. C. & O'Reilly, R. C. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306, 1940–1943 (2004)
Hollerman, J. R., Tremblay, L. & Schultz, W. Influence of reward expectation on behavior-related neuronal activity in primate striatum. J. Neurophysiol. 80, 947–963 (1998)
Lauwereyns, J., Watanabe, K., Coe, B. & Hikosaka, O. A neural correlate of response bias in monkey caudate nucleus. Nature 418, 413–417 (2002)
Samejima, K., Ueda, Y., Doya, K. & Kimura, M. Representation of action-specific reward values in the striatum. Science 310, 1337–1340 (2005)
O'Doherty, J. et al. Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452–454 (2004)
Tanaka, S. C. et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neurosci. 7, 887–893 (2004)
Dickinson, A. Contemporary Animal Learning Theory (Cambridge Univ. Press, Cambridge, 1980)
O'Doherty, J. P., Deichmann, R., Critchley, H. D. & Dolan, R. J. Neural responses during anticipation of a primary taste reward. Neuron 33, 815–826 (2002)
Knutson, B., Taylor, J., Kaufman, M., Peterson, R. & Glover, G. Distributed neural representation of expected value. J. Neurosci. 25, 4806–4812 (2005)
Jueptner, M. & Weiller, C. A review of differences between basal ganglia and cerebellar control of movements as revealed by functional imaging studies. Brain 121, 1437–1449 (1998)
Lehericy, S. et al. Motor control in basal ganglia circuits using fMRI and brain atlas approaches. Cereb. Cortex 16, 149–161 (2006)
Alexander, G. E., DeLong, M. R. & Strick, P. L. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381 (1986)
Haber, S. N. The primate basal ganglia: parallel and integrative networks. J. Chem. Neuroanat. 26, 317–330 (2003)
Seymour, B. et al. Temporal difference models describe higher-order learning in humans. Nature 429, 664–667 (2004)
Ungless, M. A., Magill, P. J. & Bolam, J. P. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 303, 2040–2042 (2004)
Salamone, J. D. The involvement of nucleus accumbens dopamine in appetitive and aversive motivation. Behav. Brain Res. 61, 117–133 (1994)
Cook, L., Morris, R. W. & Mattis, P. A. Neuropharmacological and behavioral effects of chlorpromazine (thorazine hydrochloride). J. Pharmacol. Exp. Ther. 113, 11–12 (1955)
Molina, J. A. et al. Pathologic gambling in Parkinson's disease: a behavioral manifestation of pharmacologic treatment? Mov. Disord. 15, 869–872 (2000)
Acknowledgements
We thank K. Friston for discussions, B. Draganski for assistance in the double-blind procedure, and J. Daunizeau for assistance in the statistical analysis. This work was funded by the Wellcome Trust research programme grants. M.P. received a grant from the Fyssen Foundation.
Ethics declarations
Competing interests
Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests.
Supplementary information
Supplementary Notes
This file contains Supplementary Methods, Supplementary Tables and Supplementary Figures. (PDF 268 kb)
About this article
Cite this article
Pessiglione, M., Seymour, B., Flandin, G. et al. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442, 1042–1045 (2006). https://doi.org/10.1038/nature05051
DOI: https://doi.org/10.1038/nature05051