Behavioural evidence for parallel outcome-sensitive and outcome-insensitive Pavlovian learning systems in humans

Abstract

There is a dichotomy in instrumental conditioning between goal-directed actions and habits that are distinguishable on the basis of their relative sensitivity to changes in outcome value. It is less clear whether a similar distinction applies in Pavlovian conditioning, where responses have been found to be predominantly outcome-sensitive. To test for both devaluation-insensitive and devaluation-sensitive Pavlovian conditioning in humans, we conducted four experiments combining Pavlovian conditioning and outcome-devaluation procedures while measuring multiple conditioned responses. Our results suggest that Pavlovian conditioning involves two distinct types of learning: one that learns the current value of the outcome, which is sensitive to devaluation, and one that learns about the spatial localization of the outcome, which is insensitive to devaluation. Our findings have implications for the mechanistic understanding of Pavlovian conditioning and provide a more nuanced understanding of Pavlovian mechanisms that might contribute to a number of psychiatric disorders.

Fig. 1: Schematic representation of the experimental design.
Fig. 2: Effect of conditioning during the learning phase of Experiment 1.
Fig. 3: Manipulation check of the outcome-devaluation procedure.
Fig. 4: Effects of the outcome-devaluation procedure on different conditioned responses during Experiment 1 and Experiment 2.
Fig. 5: Illustration of the sequence of events in a trial for Experiment 3.
Fig. 6: Illustration of the main effects during Experiment 3.
Fig. 7: Effect of conditioning during Experiment 4.

Code availability

Code used to generate the figures and the results of the four studies reported in this manuscript is available through the Open Science Framework repository: https://osf.io/rve2p/

Data availability

Data from the four studies reported in this manuscript are available through the Open Science Framework repository: https://osf.io/rve2p/

Acknowledgements

This work was supported by a NIDA-NIH R01 grant (1R01DA040011-01A1) to J.P.O. and W.M.P. and by an Early Postdoctoral Mobility fellowship from the Swiss National Science Foundation (P2GEP1162079) to E.R.P. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. The authors thank O. D. Perez and V. Sennwald for insightful comments on this manuscript.

Author information

Contributions

E.R.P., W.M.P., C.S.K. and J.P.O. designed the experiments. E.R.P. and C.S.K. collected and analysed the data. E.R.P., W.M.P., C.S.K. and J.P.O. wrote the paper. All authors discussed the results and implications and commented on the manuscript at all stages.

Corresponding author

Correspondence to Eva R. Pool.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Methods and Supplementary Notes.

Reporting Summary

About this article

Cite this article

Pool, E.R., Pauli, W.M., Kress, C.S. et al. Behavioural evidence for parallel outcome-sensitive and outcome-insensitive Pavlovian learning systems in humans. Nat Hum Behav 3, 284–296 (2019). https://doi.org/10.1038/s41562-018-0527-9

  • DOI: https://doi.org/10.1038/s41562-018-0527-9
