Behavioural evidence for parallel outcome-sensitive and outcome-insensitive Pavlovian learning systems in humans

Pool, Eva R.; Pauli, Wolfgang M.; Kress, Carolina S.; O’Doherty, John P.

doi:10.1038/s41562-018-0527-9

Article
Published: 25 February 2019

Nature Human Behaviour volume 3, pages 284–296 (2019)

Abstract

There is a dichotomy in instrumental conditioning between goal-directed actions and habits that are distinguishable on the basis of their relative sensitivity to changes in outcome value. It is less clear whether a similar distinction applies in Pavlovian conditioning, where responses have been found to be predominantly outcome-sensitive. To test for both devaluation-insensitive and devaluation-sensitive Pavlovian conditioning in humans, we conducted four experiments combining Pavlovian conditioning and outcome-devaluation procedures while measuring multiple conditioned responses. Our results suggest that Pavlovian conditioning involves two distinct types of learning: one that learns the current value of the outcome, which is sensitive to devaluation, and one that learns about the spatial localization of the outcome, which is insensitive to devaluation. Our findings have implications for the mechanistic understanding of Pavlovian conditioning and provide a more nuanced understanding of Pavlovian mechanisms that might contribute to a number of psychiatric disorders.

**Fig. 1: Schematic representation of the experimental design.**

**Fig. 2: Effect of conditioning during the learning phase of Experiment 1.**

**Fig. 3: Manipulation check of the outcome-devaluation procedure.**

**Fig. 4: Effects of the outcome-devaluation procedure on different conditioned responses during Experiment 1 and Experiment 2.**

**Fig. 5: Illustration of the sequence of events in a trial for Experiment 3.**

**Fig. 6: Illustration of the main effects during Experiment 3.**

**Fig. 7: Effect of conditioning during Experiment 4.**

Code availability

Code used to generate the figures and the results of the four studies reported in this manuscript is available through the Open Science Framework repository: https://osf.io/rve2p/

Data availability

Data from the four studies reported in this manuscript are available through the Open Science Framework repository: https://osf.io/rve2p/

Acknowledgements

This work was supported by a NIDA-NIH R01 grant (1R01DA040011-01A1) to J.P.O. and W.M.P. and by an Early Postdoctoral Mobility fellowship from the Swiss National Science Foundation (P2GEP1162079) to E.R.P. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. The authors thank O. D. Perez and V. Sennwald for insightful comments on this manuscript.

Author information

Authors and Affiliations

Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
Eva R. Pool, Wolfgang M. Pauli, Carolina S. Kress & John P. O’Doherty
Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
Wolfgang M. Pauli & John P. O’Doherty

Contributions

E.R.P., W.M.P., C.S.K. and J.P.O. designed the experiments. E.R.P. and C.S.K. collected and analysed the data. E.R.P., W.M.P., C.S.K. and J.P.O. wrote the paper. All authors discussed the results and implications and commented on the manuscript at all stages.

Corresponding author

Correspondence to Eva R. Pool.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Methods and Supplementary Notes.

Reporting Summary

About this article

Cite this article

Pool, E.R., Pauli, W.M., Kress, C.S. et al. Behavioural evidence for parallel outcome-sensitive and outcome-insensitive Pavlovian learning systems in humans. Nat Hum Behav 3, 284–296 (2019). https://doi.org/10.1038/s41562-018-0527-9

Received: 20 March 2018
Accepted: 21 December 2018
Published: 25 February 2019
Issue Date: March 2019
DOI: https://doi.org/10.1038/s41562-018-0527-9
