Theories of instrumental learning are centred on understanding how success and failure are used to improve future decisions1. These theories highlight a central role for reward prediction errors in updating the values associated with available actions2. In animals, substantial evidence indicates that the neurotransmitter dopamine might have a key function in this type of learning, through its ability to modulate cortico-striatal synaptic efficacy3. However, no direct evidence links dopamine, striatal activity and behavioural choice in humans. Here we show that, during instrumental learning, the magnitude of reward prediction error expressed in the striatum is modulated by the administration of drugs enhancing (3,4-dihydroxy-l-phenylalanine; l-DOPA) or reducing (haloperidol) dopaminergic function. Accordingly, subjects treated with l-DOPA have a greater propensity to choose the most rewarding action relative to subjects treated with haloperidol. Furthermore, incorporating the magnitude of the prediction errors into a standard action-value learning algorithm accurately reproduced subjects' behavioural choices under the different drug conditions. We conclude that dopamine-dependent modulation of striatal activity can account for how the human brain uses reward prediction errors to improve future decisions.
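The abstract refers to a "standard action-value learning algorithm" without giving its form. As a minimal sketch only, assuming a delta-rule (Q-learning style) update and a softmax choice rule, the idea that a drug-dependent reinforcement magnitude shapes choice can be illustrated as follows. The function names, the `reward_scale` parameter, and all numeric values (learning rate, temperature, reward probabilities, trial count) are illustrative assumptions, not values taken from the study.

```python
import math
import random


def q_update(q, reward, alpha):
    """Delta-rule update: move the action value towards the obtained reward.

    The term (reward - q) is the reward prediction error.
    """
    return q + alpha * (reward - q)


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing action A over action B under a softmax rule."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def simulate(reward_scale, p_reward=(0.8, 0.2), alpha=0.3, beta=3.0,
             n_trials=30, seed=0):
    """Simulate one learner choosing between two actions.

    reward_scale is a hypothetical stand-in for the effective reinforcement
    magnitude, assumed here to be larger under enhanced dopaminergic function
    and smaller under reduced function.
    Returns (fraction of trials on which action 0 was chosen, final values).
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]  # action values start at zero (assumption)
    n_best = 0
    for _ in range(n_trials):
        p_a = softmax_choice_prob(q[0], q[1], beta)
        action = 0 if rng.random() < p_a else 1
        rewarded = rng.random() < p_reward[action]
        reward = reward_scale if rewarded else 0.0
        q[action] = q_update(q[action], reward, alpha)
        n_best += (action == 0)
    return n_best / n_trials, q


if __name__ == "__main__":
    for label, scale in (("larger magnitude", 1.0), ("smaller magnitude", 0.3)):
        frac, values = simulate(scale)
        print(label, round(frac, 2), [round(v, 2) for v in values])
```

In this sketch a smaller `reward_scale` yields smaller prediction errors and therefore smaller differences between action values, which the softmax rule translates into weaker preference for the more rewarding action. Single runs are stochastic, so any comparison between conditions should average over many seeds.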
Thorndike, E. L. Animal Intelligence: Experimental Studies (Macmillan, New York, 1911)
Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997)
Wise, R. A. Dopamine, learning and motivation. Nature Rev. Neurosci. 5, 483–494 (2004)
Everitt, B. J. et al. Associative processes in addiction and reward. The role of amygdala-ventral striatal subsystems. Ann. NY Acad. Sci. 877, 412–438 (1999)
Ikemoto, S. & Panksepp, J. The role of nucleus accumbens dopamine in motivated behavior: a unifying interpretation with special reference to reward-seeking. Brain Res. Brain Res. Rev. 31, 6–41 (1999)
Hollerman, J. R. & Schultz, W. Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neurosci. 1, 304–309 (1998)
Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001)
Nakahara, H., Itoh, H., Kawagoe, R., Takikawa, Y. & Hikosaka, O. Dopamine neurons can represent context-dependent prediction error. Neuron 41, 269–280 (2004)
Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005)
Smith, A. D. & Bolam, J. P. The neural network of the basal ganglia as revealed by the study of synaptic connections of identified neurones. Trends Neurosci. 13, 259–265 (1990)
Calabresi, P. et al. Synaptic transmission in the striatum: from plasticity to neurodegeneration. Prog. Neurobiol. 61, 231–265 (2000)
Tremblay, L., Hollerman, J. R. & Schultz, W. Modifications of reward expectation-related neuronal activity during learning in primate striatum. J. Neurophysiol. 80, 964–977 (1998)
Frank, M. J., Seeberger, L. C. & O'Reilly, R. C. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306, 1940–1943 (2004)
Hollerman, J. R., Tremblay, L. & Schultz, W. Influence of reward expectation on behavior-related neuronal activity in primate striatum. J. Neurophysiol. 80, 947–963 (1998)
Lauwereyns, J., Watanabe, K., Coe, B. & Hikosaka, O. A neural correlate of response bias in monkey caudate nucleus. Nature 418, 413–417 (2002)
Samejima, K., Ueda, Y., Doya, K. & Kimura, M. Representation of action-specific reward values in the striatum. Science 310, 1337–1340 (2005)
O'Doherty, J. et al. Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452–454 (2004)
Tanaka, S. C. et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neurosci. 7, 887–893 (2004)
Dickinson, A. Contemporary Animal Learning Theory (Cambridge Univ. Press, Cambridge, 1980)
O'Doherty, J. P., Deichmann, R., Critchley, H. D. & Dolan, R. J. Neural responses during anticipation of a primary taste reward. Neuron 33, 815–826 (2002)
Knutson, B., Taylor, J., Kaufman, M., Peterson, R. & Glover, G. Distributed neural representation of expected value. J. Neurosci. 25, 4806–4812 (2005)
Jueptner, M. & Weiller, C. A review of differences between basal ganglia and cerebellar control of movements as revealed by functional imaging studies. Brain 121, 1437–1449 (1998)
Lehericy, S. et al. Motor control in basal ganglia circuits using fMRI and brain atlas approaches. Cereb. Cortex 16, 149–161 (2006)
Alexander, G. E., DeLong, M. R. & Strick, P. L. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381 (1986)
Haber, S. N. The primate basal ganglia: parallel and integrative networks. J. Chem. Neuroanat. 26, 317–330 (2003)
Seymour, B. et al. Temporal difference models describe higher-order learning in humans. Nature 429, 664–667 (2004)
Ungless, M. A., Magill, P. J. & Bolam, J. P. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 303, 2040–2042 (2004)
Salamone, J. D. The involvement of nucleus accumbens dopamine in appetitive and aversive motivation. Behav. Brain Res. 61, 117–133 (1994)
Cook, L., Morris, R. W. & Mattis, P. A. Neuropharmacological and behavioral effects of chlorpromazine (thorazine hydrochloride). J. Pharmacol. Exp. Ther. 113, 11–12 (1955)
Molina, J. A. et al. Pathologic gambling in Parkinson's disease: a behavioral manifestation of pharmacologic treatment? Mov. Disord. 15, 869–872 (2000)
We thank K. Friston for discussions, B. Draganski for assistance in the double-blind procedure, and J. Daunizeau for assistance in the statistical analysis. This work was funded by Wellcome Trust research programme grants. M.P. received a grant from the Fyssen Foundation.
Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests.
This file contains Supplementary Methods, Supplementary Tables and Supplementary Figures. (PDF 268 kb)