Believing in dopamine

Gershman, Samuel J.; Uchida, Naoshige

doi:10.1038/s41583-019-0220-7

Perspective
Published: 30 September 2019

OPINION

Nature Reviews Neuroscience volume 20, pages 703–714 (2019)

Abstract

Midbrain dopamine signals are widely thought to report reward prediction errors that drive learning in the basal ganglia. However, dopamine has also been implicated in various probabilistic computations, such as encoding uncertainty and controlling exploration. Here, we show how these different facets of dopamine signalling can be brought together under a common reinforcement learning framework. The key idea is that multiple sources of uncertainty impinge on reinforcement learning computations: uncertainty about the state of the environment, the parameters of the value function and the optimal action policy. Each of these sources plays a distinct role in the prefrontal cortex–basal ganglia circuit for reinforcement learning and is ultimately reflected in dopamine activity. The view that dopamine plays a central role in the encoding and updating of beliefs brings the classical prediction error theory into alignment with more recent theories of Bayesian reinforcement learning.
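The idea of a reward prediction error computed under state uncertainty can be illustrated with a minimal sketch. It assumes a belief-state form of temporal-difference learning in which value is a linear function of the belief over hidden states. The function names, the learning rate `alpha`, the discount `gamma` and the example numbers are all hypothetical choices made for illustration and are not taken from the article.

```python
def bayes_update(prior, likelihood):
    """Posterior belief over hidden states: prior times likelihood, normalized."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def td_update(weights, belief, next_belief, reward, alpha=0.1, gamma=0.9):
    """One temporal-difference step with value linear in the belief state.

    Returns the prediction error and the updated weights.
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    value = sum(w * b for w, b in zip(weights, belief))
    next_value = sum(w * b for w, b in zip(weights, next_belief))
    # Reward prediction error: obtained plus discounted future value, minus predicted value.
    delta = reward + gamma * next_value - value
    # Credit is assigned to each hidden state in proportion to its belief.
    new_weights = [w + alpha * delta * b for w, b in zip(weights, belief)]
    return delta, new_weights


# Two hidden states, a flat prior, and an observation that favours state 0.
belief = bayes_update([0.5, 0.5], [0.8, 0.2])
# A reward arrives and the episode ends (next belief carries no value).
delta, weights = td_update([0.0, 0.0], belief, [0.0, 0.0], reward=1.0)
print(belief, delta, weights)
```

In this sketch the same prediction error updates both weights, but the state that the agent believes in more strongly receives the larger share of the update, which is one simple way uncertainty about the state can shape learning.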

**Fig. 1: The neural architecture for reinforcement learning under state uncertainty.**

**Fig. 2: Experimental evidence for reflections of state uncertainty in dopamine signals.**

**Fig. 3: Experimental evidence for uncertainty-dependent dopamine signals in a perceptual decision making task.**

**Fig. 4: Two forms of uncertainty have distinct effects on exploratory choice and are governed by distinct dopamine afferents.**

Author contributions

The authors contributed equally to all aspects of the article.

Reviewer information

Nature Reviews Neuroscience thanks J. Pearson and the other, anonymous, reviewers for their contribution to the peer review of this work.

Author information

Authors and Affiliations

Department of Psychology, Center for Brain Science, Harvard University, Cambridge, MA, USA
Samuel J. Gershman
Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA, USA
Naoshige Uchida

Corresponding author

Correspondence to Samuel J. Gershman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Active inference: The hypothesis that biological agents will take actions to reduce expected surprise.
Free-energy principle: The hypothesis that the objective of brain function is to minimize expected (average) surprise.
Posterior probability distribution: The probability distribution over latent variables (for example, hidden states) conditional on observed variables (for example, sensory data).
Sufficient statistic: A function of a data sample that completely summarizes the information contained in the data about the parameters of a probability distribution.
Value function: The mapping from states to long-term expected future rewards (typically discounted to reflect a preference for sooner over later rewards).
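
As an illustration of two of the entries above, the posterior probability distribution and the value function can be written in standard form. The notation is assumed here rather than taken from the article: s is a hidden state, o an observation, r_t the reward at time t and γ a discount factor.

```latex
P(s \mid o) = \frac{P(o \mid s)\, P(s)}{\sum_{s'} P(o \mid s')\, P(s')}
\qquad
V(s) = \mathbb{E}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0} = s \right],
\quad 0 \le \gamma < 1
```

The first expression is Bayes' rule applied to hidden states; the second is the discounted sum of expected future rewards starting from state s.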

About this article

Gershman, S.J., Uchida, N. Believing in dopamine. Nat Rev Neurosci 20, 703–714 (2019). https://doi.org/10.1038/s41583-019-0220-7

Accepted: 29 August 2019
Published: 30 September 2019
Issue Date: November 2019
DOI: https://doi.org/10.1038/s41583-019-0220-7
