Individuals differ in how they learn from experience. In Pavlovian conditioning models, where cues predict reinforcer delivery at a different goal location, some animals—called sign-trackers—come to approach the cue, whereas others, called goal-trackers, approach the goal. In sign-trackers, model-free phasic dopaminergic reward-prediction errors underlie learning, which renders stimuli ‘wanted’. Goal-trackers do not rely on dopamine for learning and are thought to use model-based learning. We demonstrate this double dissociation in 129 male humans using eye-tracking, pupillometry and functional magnetic resonance imaging informed by computational models of sign- and goal-tracking. We show that sign-trackers exhibit a neural reward prediction error signal that is not detectable in goal-trackers. Model-free value only guides gaze and pupil dilation in sign-trackers. Goal-trackers instead exhibit a stronger model-based neural state prediction error signal. This model-based construct determines gaze and pupil dilation more in goal-trackers.
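As a rough illustration of the two learning signals contrasted above, the following Python sketch shows one common textbook form of each: a model-free reward prediction error that updates a cached cue value, and a model-based state prediction error that updates the learned probability of which state follows a cue. This is not the computational model fitted in the study. The function names, the learning rates `alpha` and `eta`, and the two-outcome example are assumptions made purely for illustration.

```python
def reward_prediction_error_update(value, reward, alpha=0.1):
    """Model-free update (illustrative): move a cached cue value toward the reward.

    Returns the updated value and the reward prediction error.
    """
    rpe = reward - value              # reward prediction error
    return value + alpha * rpe, rpe


def state_prediction_error_update(transition_probs, observed_state, eta=0.1):
    """Model-based update (illustrative): learn which state follows the cue.

    transition_probs maps each possible next state to its estimated
    probability. Returns the updated mapping and the state prediction error.
    """
    spe = 1.0 - transition_probs[observed_state]   # surprise about the observed state
    # Shrink every estimate, then add eta to the observed state;
    # this keeps the probabilities summing to 1.
    updated = {s: p * (1.0 - eta) for s, p in transition_probs.items()}
    updated[observed_state] = transition_probs[observed_state] + eta * spe
    return updated, spe


if __name__ == "__main__":
    # Hypothetical single trial: the cue is followed by reward.
    v, rpe = reward_prediction_error_update(value=0.0, reward=1.0, alpha=0.5)
    t, spe = state_prediction_error_update(
        {"reward": 0.5, "no_reward": 0.5}, observed_state="reward", eta=0.2
    )
    print(v, rpe, t, spe)
```

In this simplified picture, the first signal depends only on how much reward was received relative to expectation, whereas the second depends only on how unexpected the observed state transition was, regardless of its reward value.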
Source data are available for Figs. 1–6 and Supplementary Figs. 2–12. Data sharing will be based on acceptance by the study team that: (1) a valid and timely scientific question, based on a written protocol, has been posed by those seeking to access the data; and (2) the role of the original study team will be fully acknowledged. Please contact the corresponding author via email to request access to the data. Safeguarding of ethical standards will be ensured by submission of a study amendment to the Charité and Dresden ethics committees. Data access for questions of scientific integrity may additionally be regulated via the funder.
Experimental code is freely available on request to the corresponding author. Analysis code will be provided with data access.
This work was supported by the German Research Foundation (FOR 1617: grants SCHA 1971/1-2, HE 2597/13-1, HE 2597/13-2, HE 2597/15-1, SCHL 1969/2-2, SCHL 1969/4-1, SM 80/7-1, SM 80/7-2, WI 709/10-1, WI 709/10-2, ZI 1119/3-1, ZI 1119/3-2, RA 1047/2-1 and RA 1047/2-2, and in part by CRC-TR 265). E.F. is a participant in the BIH Charité Clinician Scientist Program funded by the Charité—Universitätsmedizin Berlin and Berlin Institute of Health. Q.J.M.H. acknowledges support from the UCLH NIHR BRC. S.N. received funding from the University of Zurich Grants Office (FK-19-020). We thank N. B. Krömer for helpful feedback and advice on the analyses, S. Kuitunen-Paul for helpful feedback, and M. Rothkirch for help with setting up eye-tracking at the Berlin site. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
The authors declare no competing interests.
Peer review information Primary Handling Editor: Marike Schiffer.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Schad, D.J., Rapp, M.A., Garbusow, M. et al. Dissociating neural learning signals in human sign- and goal-trackers. Nat Hum Behav 4, 201–214 (2020). https://doi.org/10.1038/s41562-019-0765-5