Dopamine transients are sufficient and necessary for acquisition of model-based associations

Sharpe, Melissa J; Chang, Chun Yun; Liu, Melissa A; Batchelor, Hannah M; Mueller, Lauren E; Jones, Joshua L; Niv, Yael; Schoenbaum, Geoffrey

doi:10.1038/nn.4538

Article
Published: 03 April 2017

Melissa J Sharpe¹,²,
Chun Yun Chang¹,
Melissa A Liu¹,
Hannah M Batchelor¹,
Lauren E Mueller¹,
Joshua L Jones¹,
Yael Niv² (ORCID: orcid.org/0000-0002-0259-8371) &
Geoffrey Schoenbaum¹,³,⁴ (ORCID: orcid.org/0000-0001-8180-0701)

Nature Neuroscience volume 20, pages 735–742 (2017)

An Author Correction to this article was published on 17 July 2018

A Corrigendum to this article was published on 26 July 2017

This article has been updated

Abstract

Associative learning is driven by prediction errors. Dopamine transients correlate with these errors, which current interpretations limit to endowing cues with a scalar quantity reflecting the value of future rewards. We tested whether dopamine might act more broadly to support learning of an associative model of the environment. Using sensory preconditioning, we show that prediction errors underlying stimulus–stimulus learning can be blocked behaviorally and reinstated by optogenetically activating dopamine neurons. We further show that suppressing the firing of these neurons across the transition prevents normal stimulus–stimulus learning. These results establish that the acquisition of model-based information about transitions between nonrewarding events is also driven by prediction errors and that, contrary to existing canon, dopamine transients are both sufficient and necessary to support this type of learning. Our findings open new possibilities for how these biological signals might support associative learning in the mammalian brain in these and other contexts.
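
The blocking logic summarized above can be illustrated with a minimal sketch. This is not the authors' analysis or model: it assumes a Rescorla–Wagner-style error-driven update, which the abstract does not specify, and applies it to a stimulus–stimulus association (cues A and C predicting cue X). The function name, the learning rate `alpha`, the asymptote `lam`, and the `extra_error` term (a hypothetical stand-in for an artificially added error signal at the onset of X) are all assumptions introduced for illustration.

```python
def simulate_blocking(n_pre=50, n_compound=10, alpha=0.2, lam=1.0, extra_error=0.0):
    """Illustrative Rescorla-Wagner-style simulation of blocking (assumed model).

    Phase 1: cue A alone is followed by X (A -> X).
    Phase 2: the compound AC is followed by X (AC -> X).
    `extra_error` is a hypothetical additive error applied in phase 2 only.
    Returns the associative strength of each cue for predicting X.
    """
    strength = {"A": 0.0, "C": 0.0}

    # Phase 1: A alone comes to predict X, so the prediction error shrinks.
    for _ in range(n_pre):
        error = lam - strength["A"]
        strength["A"] += alpha * error

    # Phase 2: X is already predicted by A, so the shared error is near zero
    # and C acquires little strength unless an extra error term is added.
    for _ in range(n_compound):
        error = lam - (strength["A"] + strength["C"]) + extra_error
        strength["A"] += alpha * error
        strength["C"] += alpha * error

    return strength


blocked = simulate_blocking(extra_error=0.0)
unblocked = simulate_blocking(extra_error=0.5)
print(f"C without extra error: {blocked['C']:.4f}")
print(f"C with extra error:    {unblocked['C']:.4f}")
```

Under these assumptions, cue C gains almost no strength when X is already predicted by A, and gains substantially more when an additional error term is present during the compound trials. The parameter values are arbitrary and chosen only to make the contrast visible.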

**Figure 1: Immunohistochemical verification of Cre-dependent ChR2 and eYFP expression in TH⁺ neurons and fiber placements in the VTA.**

**Figure 2: Brief optogenetic activation of VTA dopamine neurons strengthens associations between cues.**

**Figure 3: Conditioned responding resulting from learning, supported by brief activation of VTA dopamine neurons, is sensitive to devaluation of the predicted reward.**

**Figure 4: Immunohistochemical verification of Cre-dependent NpHR and eYFP expression in TH⁺ neurons and fiber placements in the VTA.**

**Figure 5: Brief optogenetic inhibition of dopamine neurons reduces the strength of associations between cues.**

Change history

10 April 2017
In the version of this article initially published online, the checkered and filled boxes were reversed in the keys to Figures 3a and 3b. The error has been corrected in the print, PDF and HTML versions of this article.
04 May 2017
In the version of this article initially published, the histogram in Figure 2c, center top graph, was duplicated from the panel below, and the remaining histograms accompanying the scatter plots in Figures 2c and 5c were slightly mis-scaled and misaligned relative to the scatterplots. The histograms, as well as the vertical scaling of Figure 5c, bottom right graph, have been adjusted. Also, one data point from the scatterplot in the top right panel of Figure 2c had originally been transformed from a negative value on the vertical axis to its absolute value. The errors have been corrected in the PDF and HTML versions of this article.
17 July 2018
In the version of this article initially published, the laser activation at the start of cue X in experiment 1 was described in the first paragraph of the Results and in the third paragraph of the Experiment 1 section of the Methods as lasting 2 s; in fact, it lasted only 1 s. The error has been corrected in the HTML and PDF versions of the article.

Acknowledgements

The authors thank K. Deisseroth and the Gene Therapy Center at the University of North Carolina at Chapel Hill for providing viral reagents and G. Stuber for technical advice on their use. We also thank B. Harvey and the NIDA Optogenetic and Transgenic Core, M. Morales and the NIDA Histology Core for their assistance, and P. Dayan and N. Daw for their comments. This work was supported by R01-MH098861 (to Y.N.) and by the Intramural Research Program at NIDA ZIA-DA000587 (to G.S.). The opinions expressed in this article are the authors' own and do not reflect the view of the NIH/DHHS.

Author information

Authors and Affiliations

NIDA Intramural Research Program, Baltimore, Maryland, USA
Melissa J Sharpe, Chun Yun Chang, Melissa A Liu, Hannah M Batchelor, Lauren E Mueller, Joshua L Jones & Geoffrey Schoenbaum
Department of Psychology and Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
Melissa J Sharpe & Yael Niv
Departments of Anatomy and of Neurobiology and Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, USA
Geoffrey Schoenbaum
Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University, Baltimore, Maryland, USA
Geoffrey Schoenbaum

Contributions

M.J.S. and G.S. designed the experiments; M.J.S., M.A.L., H.M.B. and L.E.M. collected the data with technical advice and assistance from C.Y.C. and J.L.J. M.J.S. and G.S. analyzed the data. M.J.S., Y.N. and G.S. interpreted the data and wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Melissa J Sharpe or Geoffrey Schoenbaum.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 The sensory preconditioning effect is susceptible to blocking.

Plots show the number of magazine entries occurring during all phases of the blocking of sensory preconditioning task with wild-type rats: preconditioning (A), conditioning (B) and the probe test (C). A two-factor ANOVA (cue × group) revealed a significant difference in responding to cue F relative to cue D (F_(1,13) = 5.845, p = 0.031), whereas the same analysis revealed no difference in responding to D and C (F_(1,13) = 0.013, p = 0.911). ** indicates significance at p < 0.05. We have interpreted our basic sensory preconditioning effect in terms of an associative chaining or value inference mechanism. An alternative account, which has been employed in other recent studies using similar procedures^1,2, is that the conditioned responding to the pre-conditioned cue results from mediated learning that occurs during the conditioning phase of the experimental procedure³. Briefly, this account would argue that, during conditioning, presentations of X also activate a representation of any associated pre-conditioned cue in memory within relatively close temporal contiguity with the delivery of the sucrose pellets, resulting in the representation of the pre-conditioned cue becoming directly associated with this reward. If this were to occur, then at test, the conditioned responding to the pre-conditioned cue might reflect a direct association with sucrose, rather than requiring X to bridge the experiences of this cue and sucrose. While there is significant evidence within the literature for the phenomenon of mediated learning^4,5, several features of our behavioral design were chosen to bias strongly against the operation of this mechanism. First, we used forward rather than simultaneous or backward pairings of the pre-conditioned and conditioned cues. This is important because mediated learning in rodents has been suggested to operate primarily when the constituent elements are presented simultaneously³ or in reverse (i.e., backward sensory preconditioning)⁴.
The reason for this is intuitive: either of these temporal arrangements maximizes the chances that B will evoke a representation of A during the conditioning phase and concurrent with reward delivery, an arrangement that has obvious benefits in maximizing the ability of an evoked representation of A to become directly associated with reward. Our design avoids this issue by using forward pairing of the preconditioned and to-be-conditioned cues in the initial phase of training. This treatment is expected to render X relatively ineffective at subsequently conjuring up a memory of any of the preceding cues, thus making the contribution of mediated learning insubstantial⁶. Second, the amount of training given in conditioning, with X-reward pairings, was also designed to discourage mediated learning. As noted above, the presentation of X in conditioning could lead to mediated learning to the extent that it activates a representation of a pre-conditioned cue in memory. However, with repeated presentations of X without the other cues, the ability of X to evoke a representation of these other cues will extinguish. Conditioning consisted of 4 days of AM and PM training in which X was presented without other cues. This extensive training should further undermine the likelihood of mediated learning. In conclusion, we believe our specific behavioral parameters largely eliminate any potential contribution of mediated learning to the sensory preconditioning effect in our particular design, and favor the parsimonious interpretation of the sensory preconditioning effect in terms of an associative chaining or inference mechanism.
We would note that this interpretation is supported by our own prior report that OFC inactivation at probe test in this exact paradigm abolishes responding to the pre-conditioned cue and has no effect on responding to the conditioned cue⁷, since mediated learning is basically simple conditioning and OFC manipulations typically have no effect on expression of previously acquired conditioned responding.

1. Kurth-Nelson, Z., Barnes, G., Sejdinovic, D., Dolan, R. & Dayan, P. Temporal structure in associative retrieval. eLife 4 (2015).

2. Wimmer, G.E., Daw, N.D. & Shohamy, D. Generalization of value in reinforcement learning by humans. The European journal of neuroscience 35, 1092-1104 (2012).

3. Rescorla, R.A. & Freberg, L. The extinction of within-compound flavor associations. Learning and Motivation 9, 411-427 (1978).

4. Ward-Robinson, J. & Hall, G. Backward sensory preconditioning. Journal of Experimental Psychology: Animal Behavior Processes 22, 395-404 (1996).

5. Holland, P.C. Event representation in Pavlovian conditioning: image and action. Cognition 37, 105-131 (1990).

6. Hall, G. Learning about associatively activated representations: Implications for acquired equivalence and perceptual learning. Animal Learning & Behavior 24, 233-255 (1996).

7. Jones, J.L., et al. Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science 338, 953-956 (2012).

Supplementary Figure 2 Brief optogenetic inhibition of dopamine neurons reduces the strength of associations between cues when analyzing entries made into the food port.

Plots show the number of entries made into the magazine during cue presentation across all phases of the sensory preconditioning task: preconditioning (A), conditioning (B) and the probe test (C). Top panel shows data from the eYFP control group; bottom panel shows data from the experimental NpHR group. VTA dopamine neurons were inhibited by light delivery (yellow symbol) in the 500 ms before the offset of A and carried through the first 2 s of X. Error bars = SEM. As the second experiment involved a much higher amount of reward relative to experiment 1 (approximately double), the nature of the conditioned response changed. Rather than checking briefly many times for reward, the rats are more certain reward is coming and therefore they make fewer entries and spend more time in the food cup. As a result, we plot the conditioned responding as the amount of time spent in the food cup rather than number of entries in the main manuscript. Of course, both measures reflect a prediction that the food is coming. Further, the shift in the form of the response is expected based on the differences in reward in the two designs, and there is a history of researchers who have reported differences in these measures^8–12. Importantly, this figure shows that we still see the same overall pattern and direction of effects whether we look at the data in terms of the time spent in, or the number of entries made into, the food port. In order to confirm this statistically, we conducted a multivariate analysis of variance (MANOVA) in which we included both measures as dependent variables in a single analysis and assessed their significance as we have done previously. This multivariate analysis revealed a significant interaction between cue and group across both measures (F_(2,38) = 3.5, p = 0.04), which was due to a significant difference between A and B in the NpHR group (F_(2,38) = 5.0, p = 0.01) but not in the control eYFP group (F_(2,38) = 0.742, p = 0.483).
Thus, we obtained the same results when including both number and percent measures as dependent variables in the analyses. We also conducted a linear regression analysis that showed that percent responding to either cue significantly predicted the number of entries made towards that cue in the same animal (F_(1,80)=48.10, p<0.001). The correlation was 0.65 and the total variability in the number of responses made towards the cues that could be predicted by percent responding was ~40%. Further, when normalizing the numbers of responses according to the coefficients obtained in the linear regression to equate them with the percent data, we found that including response measure as a factor in our repeated-measures ANOVA did not produce any interactions between this factor and cue, group, or our critical cue by group interaction (data not shown). Thus, the two response measures were significantly correlated.
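
The relation between the reported correlation and the share of variance explained can be checked with a short calculation. The sketch below is not part of the original analysis: it relies on two standard identities for a regression with a single predictor (variance explained equals r squared, and R squared equals F / (F + error degrees of freedom)), and the function names are illustrative. The inputs are the rounded values quoted in the legend above.

```python
def variance_explained(r: float) -> float:
    """Share of variance explained by a single predictor with correlation r."""
    return r ** 2


def r_squared_from_f(f_stat: float, df_error: int) -> float:
    """Recover R^2 from the F statistic of a one-predictor regression."""
    return f_stat / (f_stat + df_error)


# Rounded values taken from the legend: r = 0.65 and F(1,80) = 48.10.
share_from_r = variance_explained(0.65)
share_from_f = r_squared_from_f(48.10, 80)
print(f"from r: {share_from_r:.2f}, from F: {share_from_f:.2f}")
```

Both routes give a value close to 0.4, in line with the approximately 40% figure quoted above; small differences between them are expected because the reported statistics are rounded.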

8. Holland, P.C. & Gallagher, M. Effects of amygdala central nucleus lesions on blocking and unblocking. Behavioral neuroscience 107, 235 (1993).

9. Holland, P.C. & Kenmuir, C. Variations in unconditioned stimulus processing in unblocking. Journal of Experimental Psychology: Animal Behavior Processes 31, 155 (2005).

10. Sharpe, M. & Killcross, S. The prelimbic cortex contributes to the down-regulation of attention toward redundant cues. Cerebral cortex 24, 1066-1074 (2014).

11. McDannald, M.A., Lucantonio, F., Burke, K.A., Niv, Y. & Schoenbaum, G. Ventral striatum and orbitofrontal cortex are both required for model-based, but not model-free, reinforcement learning. The Journal of Neuroscience 31, 2700-2705 (2011).

12. Burke, K.A., Franz, T.M., Miller, D.N. & Schoenbaum, G. The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards. Nature 454, 340-344 (2008).

Supplementary Figure 3 The difference between responding to cue A and cue B in the NpHR group in Experiment 2 is not caused by frequent responses to cue.

A. Plots show percent time spent in the magazine during the final probe test in Experiment 2. Consecutively removing the highest responders to cue A in the NpHR group did not reduce the magnitude of the difference between cue A and B. While the NpHR group showed lower levels of responding to cue B, it also exhibited elevated levels of responding to cue A. This likely reflects the distribution of learning (and responding) across available cues in a within-subject design. That is, if the learning (and responding) about one cue is compromised, it is sometimes elevated towards other available cues. In support of this suggestion, the critical difference in learning about cue A and B is not driven by heightened levels of responding to cue A. This is illustrated in this figure, which shows that consecutively removing the highest responders in the NpHR group does not affect the magnitude of the difference. As responding to A goes down, responding to B also decreases. Accordingly, a split mean analysis including high and low responding to A as a factor in an ANOVA on data from the NpHR group in the probe test revealed a main effect of cue (F_(1,15) = 10.6, p = 0.006) but no interaction with the level of responding (F_(1,15) = 1.9, p = 0.189). Thus, high responding to A is not responsible for the difference between cue A and B in Experiment 2 in the NpHR rats.

Supplementary Figure 4 Stimulation or inhibition of VTA dopamine neurons during preconditioning does not cause rats to enter or avoid the magazine.

Left: panels indicate data from the preconditioning phase of our blocking of sensory preconditioning procedure, where we stimulated dopamine neurons in our ChR2 group (left, bottom) at the beginning of X when preceded by AC trials. Rates of responding are represented as mean magazine entries (±SEM). Stimulation of dopamine neurons did not alter rates of responding in the magazine: a repeated-measures ANOVA revealed no cue by group (F_(4,140) = 0.180, p = 0.948), cue by session (F_(1,35) = 1.854, p = 0.182), or any three-way interaction between these terms (F_(4,14) = 0.887, p = 0.474). Right: panels indicate data from preconditioning during our basic sensory preconditioning procedure, where we inhibited VTA dopamine neurons in our NpHR group at the transition of B and Y. Note again that inhibition of dopamine neurons does not change the amount of responding in the magazine: a repeated-measures ANOVA revealed no cue by group (F_(3,117) = 0.425, p = 0.736), cue by session (F_(1,39) = 0.292, p = 0.831), or any three-way interaction between these terms (F_(3,117) = 0.591, p = 0.622).

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 (PDF 686 kb)

Supplementary Methods Checklist (PDF 428 kb)

About this article

Cite this article

Sharpe, M., Chang, C., Liu, M. et al. Dopamine transients are sufficient and necessary for acquisition of model-based associations. Nat Neurosci 20, 735–742 (2017). https://doi.org/10.1038/nn.4538

Received: 03 June 2016
Accepted: 28 February 2017
Published: 03 April 2017
Issue Date: May 2017
DOI: https://doi.org/10.1038/nn.4538
