Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation

A Corrigendum to this article was published on 01 May 2010

This article has been updated

Abstract

The basal ganglia support learning to exploit decisions that have yielded positive outcomes in the past. In contrast, limited evidence implicates the prefrontal cortex in the process of making strategic exploratory decisions when the magnitude of potential outcomes is unknown. Here we examine neurogenetic contributions to individual differences in these distinct aspects of motivated human behavior, using a temporal decision-making task and computational analysis. We show that two genes controlling striatal dopamine function, DARPP-32 (also called PPP1R1B) and DRD2, are associated with exploitative learning to adjust response times incrementally as a function of positive and negative decision outcomes. In contrast, a gene primarily controlling prefrontal dopamine function (COMT) is associated with a particular type of 'directed exploration', in which exploratory decisions are made in proportion to Bayesian uncertainty about whether other choices might produce outcomes that are better than the status quo. Quantitative model fits reveal that genetic factors modulate independent parameters of a reinforcement learning system.
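To make the idea of 'directed exploration' concrete, the following is a minimal sketch of a response-time adjustment that scales with relative uncertainty. It is not the authors' fitted model. It assumes, for illustration only, that beliefs about fast and slow responses are held as Beta distributions and that uncertainty is their standard deviation. The names `beta_std`, `exploration_bonus` and `epsilon` are hypothetical.

```python
import math


def beta_std(alpha: float, beta: float) -> float:
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))


def exploration_bonus(slow: tuple, fast: tuple, epsilon: float) -> float:
    """Illustrative RT shift proportional to relative uncertainty.

    slow, fast: (alpha, beta) parameters of the assumed belief
    distributions for slow and fast responses.
    epsilon: hypothetical free parameter scaling the exploration drive.
    A positive value shifts the response toward the slower, more
    uncertain option; a negative value shifts it toward the faster one.
    """
    return epsilon * (beta_std(*slow) - beta_std(*fast))


# Slow responses have been sampled rarely (wide belief), fast ones often.
bonus = exploration_bonus(slow=(1.0, 1.0), fast=(5.0, 5.0), epsilon=1000.0)
print(bonus > 0)  # uncertainty about slow responses favors slowing down
```

In this sketch a subject with `epsilon` near zero would ignore uncertainty entirely, which is one way to picture individual differences in exploration.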

Figure 1: Task conditions: decreasing expected value (DEV), constant expected value (CEV), increasing expected value (IEV) and constant expected value–reverse (CEVR).
Figure 2: Response times as a function of trial number, smoothed (with weighted linear least-squares fit) over a ten-trial window.
Figure 3: Relative within-subjects biases to speed RTs in DEV relative to CEV (DEVdiff = CEV − DEV) and to slow RTs in IEV (IEVdiff = IEV − CEV).
Figure 4: Trial-to-trial RT adjustments in a single subject.
Figure 5: Genetic effects on reinforcement model parameters.
Figure 6: Evolution of action-value distributions.
Figure 7: COMT gene predicts directed exploration toward uncertain responses.

Change history

  • 09 September 2009

    In the version of this article initially published, the last sentence of the second new paragraph on page 1065 read “that is, the following term was added to the RT prediction: ρ[σ_slow(s,t) – σ_fast(s,t)], where ρ is a free parameter.” A variable in the equation contained in this sentence was incorrect. The sentence should read “that is, the following term was added to the RT prediction: ρ[µ_slow(s,t) – µ_fast(s,t)], where ρ is a free parameter.” The error has been corrected in the HTML and PDF versions of the article.
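    The corrected term can be illustrated with a short sketch. The only thing taken from the notice above is the form ρ[µslow(s,t) – µfast(s,t)]. Treating µ as the mean of a Beta distribution is an assumption made here for illustration, and the names `beta_mean` and `mean_difference_term` are hypothetical.

```python
def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution (assumed form of mu)."""
    return alpha / (alpha + beta)


def mean_difference_term(slow: tuple, fast: tuple, rho: float) -> float:
    """rho * (mu_slow - mu_fast), the term added to the RT prediction.

    slow, fast: (alpha, beta) parameters of the assumed distributions
    for slow and fast responses in the current state and trial.
    rho: free parameter, as in the corrected sentence.
    """
    return rho * (beta_mean(*slow) - beta_mean(*fast))


# Slow responses have the higher mean, so the term is positive
# and the predicted RT increases.
term = mean_difference_term(slow=(3.0, 1.0), fast=(1.0, 3.0), rho=100.0)
print(term)
```

    The correction matters because a difference of means (µ) and a difference of standard deviations (σ) are different quantities, so the two versions of the sentence describe different terms.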

Acknowledgements

We thank S. Williamson and E. Carter for help with DNA analysis and administering cognitive tasks to participants, and N. Daw, P. Dayan, and R. O'Reilly for helpful discussions. This research was supported by US National Institutes of Mental Health grant R01 MH080066-01.

Author information

Contributions

M.J.F., B.B.D. and F.M. designed the study; M.J.F. conducted the modeling and analyzed the behavioral data; B.B.D. collected data; J.O.-T. and F.M. extracted the DNA and conducted genotyping; M.J.F., B.B.D. and F.M. wrote the manuscript.

Corresponding author

Correspondence to Michael J Frank.

Supplementary information

Supplementary Text and Figures

Supplementary Data Analysis (PDF 862 kb)

Supplementary Video 1

Single subject evolution of beta distributions, DEV. (MPG 1444 kb)

Supplementary Video 2

Single subject evolution of beta distributions, CEV. (MPG 1467 kb)

About this article

Cite this article

Frank, M., Doll, B., Oas-Terpstra, J. et al. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat Neurosci 12, 1062–1068 (2009). https://doi.org/10.1038/nn.2342

  • DOI: https://doi.org/10.1038/nn.2342
