Nicotinic receptors in the ventral tegmental area promote uncertainty-seeking

Naudé, Jérémie; Tolu, Stefania; Dongelmans, Malou; Torquet, Nicolas; Valverde, Sébastien; Rodriguez, Guillaume; Pons, Stéphanie; Maskos, Uwe; Mourot, Alexandre; Marti, Fabio; Faure, Philippe

doi:10.1038/nn.4223

Article
Published: 18 January 2016

Jérémie Naudé^1,2,3 (ORCID: orcid.org/0000-0001-5781-6498),
Stefania Tolu^1,2,3,
Malou Dongelmans^1,2,3,
Nicolas Torquet^1,2,3,
Sébastien Valverde^1,2,3,
Guillaume Rodriguez^1,2,3,
Stéphanie Pons^4,
Uwe Maskos^4,
Alexandre Mourot^1,2,3,
Fabio Marti^1,2,3 &
Philippe Faure^1,2,3

Nature Neuroscience volume 19, pages 471–478 (2016)

Abstract

Cholinergic neurotransmission affects decision-making, notably through the modulation of perceptual processing in the cortex. In addition, acetylcholine acts on value-based decisions through as yet unknown mechanisms. We found that nicotinic acetylcholine receptors (nAChRs) expressed in the ventral tegmental area (VTA) are involved in the translation of expected uncertainty into motivational value. We developed a multi-armed bandit task for mice with three locations, each associated with a different reward probability. We found that mice lacking the nAChR β2 subunit showed less uncertainty-seeking than their wild-type counterparts. Using model-based analysis, we found that reward uncertainty motivated wild-type mice, but not mice lacking the nAChR β2 subunit. Selective re-expression of the β2 subunit in the VTA was sufficient to restore spontaneous bursting activity in dopamine neurons and uncertainty-seeking. Our results reveal an unanticipated role for subcortical nAChRs in motivation induced by expected uncertainty and provide a parsimonious account for a wealth of behaviors related to nAChRs in the VTA expressing the β2 subunit.

**Figure 1: Decisions under uncertainty in a mouse bandit task using intracranial self-stimulations.**

**Figure 2: Model-based analysis of decisions shows motivation for expected uncertainty.**

**Figure 3: β2*-nAChRs in the VTA affect choices and locomotion.**

**Figure 4: Model-based analysis reveals a role for VTA β2-nAChR in uncertainty-driven motivation.**

**Figure 5: β2*-nAChRs affect decision-making under uncertainty in a dynamical foraging task.**

**Figure 6: New interpretation of behaviors related to VTA nAChRs using the uncertainty model.**

References

Everitt, B.J. & Robbins, T.W. Central cholinergic systems and cognition. Annu. Rev. Psychol. 48, 649–684 (1997).
Dani, J.A. & Bertrand, D. Nicotinic acetylcholine receptors and nicotinic cholinergic mechanisms of the central nervous system. Annu. Rev. Pharmacol. Toxicol. 47, 699–729 (2007).
Guillem, K. et al. Nicotinic acetylcholine receptor β2 subunits in the medial prefrontal cortex control attention. Science 333, 888–891 (2011).
Rangel, A., Camerer, C. & Montague, P.R. A framework for studying the neurobiology of value-based decision making. Nat. Rev. Neurosci. 9, 545–556 (2008).
Fobbs, W.C. & Mizumori, S.J. Cost-benefit decision circuitry: proposed modulatory role for acetylcholine. Prog. Mol. Biol. Transl. Sci. 122, 233–261 (2014).
Kolokotroni, K.Z., Rodgers, R.J. & Harrison, A.A. Acute nicotine increases both impulsive choice and behavioral disinhibition in rats. Psychopharmacology (Berl.) 217, 455–473 (2011).
Mendez, I.A., Gilbert, R.J., Bizon, J.L. & Setlow, B. Effects of acute administration of nicotinic and muscarinic cholinergic agonists and antagonists on performance in different cost-benefit decision making tasks in rats. Psychopharmacology (Berl.) 224, 489–499 (2012).
McGrath, D.S. & Barrett, S.P. The comorbidity of tobacco smoking and gambling: a review of the literature. Drug Alcohol Rev. 28, 676–681 (2009).
Schultz, W. Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 30, 259–288 (2007).
Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001).
Montague, P.R., Dayan, P. & Sejnowski, T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947 (1996).
Berridge, K.C. From prediction error to incentive salience: mesolimbic computation of reward motivation. Eur. J. Neurosci. 35, 1124–1143 (2012).
Maskos, U. et al. Nicotine reinforcement and cognition restored by targeted expression of nicotinic receptors. Nature 436, 103–107 (2005).
Mameli-Engvall, M. et al. Hierarchical control of dopamine neuron-firing patterns by nicotinic receptors. Neuron 50, 911–921 (2006).
Grace, A.A., Floresco, S.B., Goto, Y. & Lodge, D.J. Regulation of firing of dopaminergic neurons and control of goal-directed behaviors. Trends Neurosci. 30, 220–227 (2007).
Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B. & Dolan, R.J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
Frank, M.J., Doll, B.B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat. Neurosci. 12, 1062–1068 (2009).
Gittins, J.C. & Jones, D.M. A dynamic allocation index for the discounted multiarmed bandit problem. Biometrika 66, 561–565 (1979).
Scott, P.D. & Markovitch, S. Learning novel domains through curiosity and conjecture. IJCAI (US) 1, 669–674 (1989).
Kaelbling, L.P. Learning in Embedded Systems (MIT Press, 1993).
Meuleau, N. & Bourgine, P. Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Mach. Learn. 35, 117–154 (1999).
Yu, A.J. & Dayan, P. Uncertainty, neuromodulation, and attention. Neuron 46, 681–692 (2005).
Bach, D.R. & Dolan, R.J. Knowing how much you don't know: a neural organization of uncertainty estimates. Nat. Rev. Neurosci. 13, 572–586 (2012).
Oudeyer, P.-Y. & Kaplan, F. What is intrinsic motivation? A typology of computational approaches. Front. Neurorobot. 1, 6 (2007).
Fiorillo, C.D., Tobler, P.N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).
Schuck-Paim, C., Pompilio, L. & Kacelnik, A. State-dependent decisions cause apparent violations of rationality in animal choice. PLoS Biol. 2, e402 (2004).
Carlezon, W.A. Jr. & Chartoff, E.H. Intracranial self-stimulation (ICSS) in rodents to study the neurobiology of motivation. Nat. Protoc. 2, 2987–2995 (2007).
Kobayashi, T., Nishijo, H., Fukuda, M., Bureš, J. & Ono, T. Task-dependent representations in rat hippocampal place neurons. J. Neurophysiol. 78, 597–613 (1997).
Funamizu, A., Ito, M., Doya, K., Kanzaki, R. & Takahashi, H. Uncertainty in action-value estimation affects both action choice and learning rate of the choice behaviors of rats. Eur. J. Neurosci. 35, 1180–1189 (2012).
Anselme, P., Robinson, M.J.F. & Berridge, K.C. Reward uncertainty enhances incentive salience attribution as sign-tracking. Behav. Brain Res. 238, 53–61 (2013).
Sutton, R.S. & Barto, A.G. Reinforcement Learning: an introduction (MIT Press, 1998).
Kakade, S. & Dayan, P. Dopamine: generalization and bonuses. Neural Netw. 15, 549–559 (2002).
Herrnstein, R.J. Relative and absolute strength of response as a function of frequency of reinforcement. J. Exp. Anal. Behav. 4, 267–272 (1961).
Ishii, S., Yoshida, W. & Yoshimoto, J. Control of exploitation-exploration meta-parameter in reinforcement learning. Neural Netw. 15, 665–687 (2002).
Yeomans, J. & Baptista, M. Both nicotinic and muscarinic receptors in ventral tegmental area contribute to brain-stimulation reward. Pharmacol. Biochem. Behav. 57, 915–921 (1997).
Serreau, P., Chabout, J., Suarez, S.V., Naudé, J. & Granon, S. Beta2-containing neuronal nicotinic receptors as major actors in the flexible choice between conflicting motivations. Behav. Brain Res. 225, 151–159 (2011).
Krugel, L.K., Biele, G., Mohr, P.N., Li, S.-C. & Heekeren, H.R. Genetic variation in dopaminergic neuromodulation influences the ability to rapidly and flexibly adapt decisions. Proc. Natl. Acad. Sci. USA 106, 17951–17956 (2009).
Niv, Y., Edlund, J.A., Dayan, P. & O'Doherty, J.P. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J. Neurosci. 32, 551–562 (2012).
Balasubramani, P.P., Chakravarthy, V.S., Ravindran, B. & Moustafa, A.A. An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning. Front. Comput. Neurosci. 8, 47 (2014).
Granon, S., Faure, P. & Changeux, J.-P. Executive and social behaviors under nicotinic receptor regulation. Proc. Natl. Acad. Sci. USA 100, 9596–9601 (2003).
Picciotto, M.R. et al. Abnormal avoidance learning in mice lacking functional high-affinity nicotine receptor in the brain. Nature 374, 65–67 (1995).
Maubourguet, N., Lesne, A., Changeux, J.-P., Maskos, U. & Faure, P. Behavioral sequence analysis reveals a novel role for β2* nicotinic receptors in exploration. PLoS Comput. Biol. 4, e1000229 (2008).
Gordon, G., Fonio, E. & Ahissar, E. Emergent exploration via novelty management. J. Neurosci. 34, 12646–12661 (2014).
Payzan-LeNestour, E. & Bossaerts, P. Risk, unexpected uncertainty and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comput. Biol. 7, e1001048 (2011).
Redgrave, P. & Gurney, K. The short-latency dopamine signal: a role in discovering novel actions? Nat. Rev. Neurosci. 7, 967–975 (2006).
Bromberg-Martin, E.S. & Hikosaka, O. Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron 63, 119–126 (2009).
Rice, M.E. & Cragg, S.J. Nicotine amplifies reward-related dopamine signals in striatum. Nat. Neurosci. 7, 583–584 (2004).
Addicott, M.A., Pearson, J.M., Wilson, J., Platt, M.L. & McClernon, F.J. Smoking and the bandit: a preliminary study of smoker and nonsmoker differences in exploratory behavior measured with a multiarmed bandit task. Exp. Clin. Psychopharmacol. 21, 66–73 (2013).
Galván, A. et al. Greater risk sensitivity of dorsolateral prefrontal cortex in young smokers than in nonsmokers. Psychopharmacology (Berl.) 229, 345–355 (2013).
Paxinos, G. & Franklin, K.B. The Mouse Brain in Stereotaxic Coordinates (Gulf Professional Publishing, 2004).
Grace, A.A. & Bunney, B.S. Intracellular and extracellular electrophysiology of nigral dopaminergic neurons--1. Identification and characterization. Neuroscience 10, 301–315 (1983).
Rokosik, S.L. & Napier, T.C. Intracranial self-stimulation as a positive reinforcer to study impulsivity in a probability discounting paradigm. J. Neurosci. Methods 198, 260–269 (2011).
D'Acremont, M. & Bossaerts, P. Neurobiological studies of risk assessment: a comparison of expected utility and mean-variance approaches. Cogn. Affect. Behav. Neurosci. 8, 363–374 (2008).
Behrens, T.E.J., Woolrich, M.W., Walton, M.E. & Rushworth, M.F.S. Learning the value of information in an uncertain world. Nat. Neurosci. 10, 1214–1221 (2007).
Daw, N.D. Trial-by-trial data analysis using computational models. in Decision Making, Affect, and Learning: Attention and Performance XXIII (eds. Delgado, M.R., Phelps, E.A. & Robbins, T.W.) 3–38 (2011).
McClure, S.M., Daw, N.D. & Montague, P.R. A computational substrate for incentive salience. Trends Neurosci. 26, 423–428 (2003).

Acknowledgements

We thank E. Guigon for discussions, C. Prévost-Solié for technical support, and J.-P. Changeux, E. Ey, G. Dugué and A. Boo for comments on the manuscript. This work was supported by the Centre National de la Recherche Scientifique CNRS UMR 8246, the University Pierre et Marie Curie (Programme Emergence 2012 for J.N. and P.F.), the Agence Nationale pour la Recherche (ANR Programme Blanc 2012 for P.F., ANR JCJC to A.M.), the Neuropole de Recherche Francilien (NeRF) of Ile de France, the Foundation for Medical Research (FRM, Equipe FRM DEQ2013326488 to P.F.), the Bettencourt Schueller Foundation (Coup d'Elan 2012 to P.F.), the Ecole des Neurosciences de Paris (ENP) to P.F., the Fondation pour la Recherche sur le Cerveau (FRC et les rotariens de France, “espoir en tête” 2012) to P.F. and the Brain & Behavior Research Foundation for a NARSAD Young Investigator Grant to A.M. The laboratories of P.F. and U.M. are part of the École des Neurosciences de Paris Ile-de-France RTRA network. P.F. and U.M. are members of the Laboratory of Excellence, LabEx Bio-Psy, and P.F. is a member of the DHU Pepsy.

Author information

Authors and Affiliations

Sorbonne Universités, UPMC University Paris 06, Institut de Biologie Paris Seine, UM 119, Paris, France
Jérémie Naudé, Stefania Tolu, Malou Dongelmans, Nicolas Torquet, Sébastien Valverde, Guillaume Rodriguez, Alexandre Mourot, Fabio Marti & Philippe Faure
CNRS, UMR 8246, Neuroscience Paris Seine, Paris, France
Jérémie Naudé, Stefania Tolu, Malou Dongelmans, Nicolas Torquet, Sébastien Valverde, Guillaume Rodriguez, Alexandre Mourot, Fabio Marti & Philippe Faure
INSERM, U1130, Neuroscience Paris Seine, Paris, France
Jérémie Naudé, Stefania Tolu, Malou Dongelmans, Nicolas Torquet, Sébastien Valverde, Guillaume Rodriguez, Alexandre Mourot, Fabio Marti & Philippe Faure
Institut Pasteur, CNRS UMR 3571, Unité NISC, Paris, France
Stéphanie Pons & Uwe Maskos

Contributions

J.N. and P.F. designed the study. S.T. and J.N. performed the virus injections. M.D., N.T., G.R. and J.N. performed the behavioral experiments. S.V. and F.M. performed the electrophysiological recordings. S.P. and U.M. provided the genetic tools. J.N., F.M. and P.F. analyzed the data. J.N., A.M. and P.F. wrote the manuscript.

Corresponding author

Correspondence to Philippe Faure.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Analysis of locomotion in the ICSS-based bandit task

(a) Dwell times (see Methods) were shorter in the certain setting (CS) than in the uncertainty setting (US) (T_(18)=3.67, p=0.002, paired t-test). In the US, the reward probability of the target had no effect on dwell times (F_(2,18)=0.2, p=0.82, one-way ANOVA). (b) Variation of the instantaneous speed in the certain setting: in the CS, the maximal speed of WT mice depended on the ICSS intensity (F_(2,18)=13.2, p<0.001, one-way ANOVA), contrary to what was observed in the US with different probabilities of reward (see Fig. 1e).

Supplementary Figure 2 Comparison of models of decision-making and locomotion

(a) Bayesian Information Criterion (BIC) computed using three classical models of action selection that ignore uncertainty (matching law, epsilon-greedy, softmax; see Methods) and two alternative models that take uncertainty into account (softmax model with an uncertainty bonus, or with an uncertainty-modulated temperature parameter; see Methods). A smaller BIC value indicates a better fit, which was provided by the uncertainty-bonus model. (b) BIC derived from multiple linear regression (see Methods) for exploratory locomotion models embedding an increasing number of explanatory variables. The red star and crosses indicate the winning model, which incorporates the reward history (R_(t-1)) and the expected value (E(R)) and uncertainty (σ²(R)) of the chosen goal (indexed A), but not those of the alternative goal (indexed B).
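
The BIC used for model comparison has a standard closed form, BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations and L the maximized likelihood. The following minimal sketch shows how two fitted models could be ranked with it. The model names echo the caption, but the log-likelihoods, parameter counts and trial count are made-up values for illustration only, not results from the paper.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical fits: (maximized log-likelihood, number of free parameters).
fits = {
    "softmax": (-520.0, 1),
    "softmax + uncertainty bonus": (-500.0, 2),
}
n_trials = 800  # hypothetical number of observed choices

scores = {name: bic(ll, k, n_trials) for name, (ll, k) in fits.items()}
best = min(scores, key=scores.get)  # the smaller BIC is preferred
print(best)
```

Because the penalty term grows with the number of parameters, the richer model is preferred only when its gain in log-likelihood outweighs the extra k ln(n).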

Supplementary Figure 3 Robustness of the parameters derived from the decision-making model

(a) Comparison between transitions in the last two sessions (#9-10), displayed in the main part of the results, and transitions measured two sessions earlier (#7-8). Transitions from these two data sets were not significantly different for any gamble (G1: T_(18)=0.44, p=0.67; G2: T_(18)=-1.36, p=0.19; G3: T_(18)=-1.64, p=0.12, paired t-tests), indicating that the results were stable across sessions and that the mice's decisions had reached a steady state in this setting. (b, c, d) Proportions of exploitative choices (choice of the most valuable alternative) of the mice for the three gambles in different sets of reward probabilities: {25%, 50%, 75%} (b); {50%, 75%, 100%} (c); {25%, 75%, 100%} (d). (e) Parameters (ϕ and β) derived from the model-based analysis (uncertainty model) of the transition functions, for the probabilities used in the main text (black) and in the present panels (b, green; c, purple; d, light blue). In each case the uncertainty-seeking parameter was significantly positive, showing that the parameters derived from the model provide a robust characterization (across probability sets) of the influence of uncertainty on the decision-making process.
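
As a minimal sketch of how an uncertainty bonus could enter a softmax choice rule, assume each location delivers a Bernoulli reward with probability p, so that its expected value is p and its expected uncertainty (variance) is p(1 - p). The function names, the additive form of the bonus, the treatment of β as an inverse temperature and the parameter values below are illustrative assumptions, not the fitted model from the paper.

```python
import math

def bernoulli_stats(p):
    """Expected value and variance of a Bernoulli reward with probability p."""
    return p, p * (1.0 - p)

def softmax_uncertainty(probs, beta, phi):
    """Choice probabilities over options with reward probabilities `probs`.

    Assumed form: P(i) is proportional to exp(beta * (E_i + phi * var_i)).
    beta is treated here as an inverse temperature (assumption);
    phi > 0 favours uncertain options, phi = 0 gives a plain softmax.
    """
    utilities = []
    for p in probs:
        mean, var = bernoulli_stats(p)
        utilities.append(beta * (mean + phi * var))
    top = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical gamble between a certain (100%) and an uncertain (50%) option.
gamble = [1.0, 0.5]
plain = softmax_uncertainty(gamble, beta=3.0, phi=0.0)
bonus = softmax_uncertainty(gamble, beta=3.0, phi=1.0)
print(plain, bonus)
```

With a positive ϕ the uncertain option is chosen more often than under the plain softmax, which is the qualitative signature the caption refers to as uncertainty-seeking.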

Supplementary Figure 4 β2*-nAChRs are not involved in motivation by certain rewards

(a) Learning of the task in the deterministic setting (DS), with performance increasing across learning sessions for both groups (session effect: F=12.16, p<0.001) and no difference between β2KO and WT mice (genotype effect: F=0.04, p=0.84; genotype x session interaction: F=0.99, p=0.45, two-way ANOVA). (b) In the DS, the rate of ICSS behavior (number of ICSS per minute) scaled with the intensity of current pulses up to a plateau for both groups (intensity effect: p<0.001, two-way ANOVA). When tested with different intensities of current pulses, β2KO mice performed the task at the same level as WT mice (genotype effect: F=1.22, p=0.27; genotype x intensity interaction: F=0.73, p=0.74, two-way ANOVA). (c) In the DS, when the ICSS intensity increased (I(ICSS) = {40, 80, 120} µA), the speed profile of β2KO mice was affected, with higher maximal speed at higher ICSS intensities (F_(2,10)=36.35, p<0.001, one-way ANOVA). (d) Maximal speeds corresponding to the three ICSS intensities ({40, 80, 120} µA) did not differ significantly between β2KO and WT mice (genotype effect: F=0.86, p=0.36; genotype x intensity interaction: F=0.16, p=0.86). Note that the values given in (d) do not correspond to the peaks in the speed profiles, because the maximum of the average speed profile does not necessarily correspond to the average of the maximal speeds.

Supplementary Figure 5 Analysis of locomotion in β2KO and β2VEC mice

(a) The speed profile of β2KO mice was not significantly modified by the reward probability of the target (F_(2,10)=0.08, p=0.93). (b) β2KO mice travelled the same distance whatever the target probability (F_(2,10)=0.14, p=0.87); hence the relation between the reward probability of the target place and the cumulative distance travelled was altered in β2KO mice. (c) The speed profiles of β2VEC mice were similar irrespective of the probability of the next reward (F_(2,11)=0.21, p=0.81). (d) When going towards less likely ICSS, β2VEC mice tended to travel more (F_(2,11)=6.2, p=0.005), showing that β2*-nAChRs in the VTA are sufficient to restore the balance between exploiting the task and exploring the open field.

Supplementary Figure 6 Additional measures of restoration of functional β2*-nAChRs by the lentiviral injection

(a-d) Example of a recorded neuron: (a) neurobiotin, (b) eGFP and (c) tyrosine hydroxylase identify, respectively, DA cells (green), the neuron re-expressing the β2 subunit (red), and a recorded cell (blue). eGFP, enhanced green fluorescent protein. (e) Mean ± s.e.m. increase in DA cell firing frequency after injection of 30 µg/kg nicotine, in WT (n=46, gray), β2KO (n=20, red) and β2VEC (n=45, black) mice. (f) Same for the proportion of spikes within bursts (%SWB). The vertical dashed bar indicates the nicotine injection.

Supplementary Figure 7 Model comparison and robustness in β2KO and β2VEC mice

(a,b) Bayesian Information Criterion (BIC) computed using the four models of action selection (matching law, epsilon-greedy, softmax, softmax with an uncertainty bonus; see Methods) for (a) β2KO mice and (b) β2VEC mice. In each case, the uncertainty model provided the smallest BIC, which indicates the best fit. (c, d, e) Proportions of exploitative choices (choice of the most valuable alternative) of β2KO mice for the three gambles in different sets of reward probabilities: {25%, 50%, 75%} (c); {50%, 75%, 100%} (d); {25%, 75%, 100%} (e). (f) Parameters derived from the model-based analysis (uncertainty model) of the transition functions of β2KO mice, for the probabilities used in the main text (black) and in the present panels (c, green; d, purple; e, light blue). The model parameters did not significantly differ between probability sets (for ϕ, F_(3,37)=0.32, p=0.81; for β, F_(3,37)=0.26, p=0.85).

Supplementary Figure 8 Learning phase in the probabilistic task: experimental data and model comparison

(a,b) Evolution of the proportion of choices of the three rewarded locations in the uncertain setting, across the learning sessions, for WT (a) and β2KO (b) mice. (c,d) Difference in Bayesian information criterion (compared to the standard RL model) of models including an expected uncertainty bonus (“uncertainty”), an adaptive learning rate (“adaptive LR”) and an unexpected uncertainty bonus, for WT (c) and β2KO (d) mice. (e,f) Model fits of the experimental data shown in (a,b) for the winning models, i.e. expected uncertainty for WT mice, and standard model for β2KO mice.

Supplementary Figure 9 Model comparison in the dynamic foraging task

(a) Computational models of reinforcement learning and decision-making used to analyze the behavioral data, summarizing whether sensitivity to uncertain outcomes arises from learning, decision, or both processes. (b,c) Bayesian Information Criterion (BIC) for the standard reinforcement-learning model and alternative models: standard model with asymmetric learning (L) rates for positive and negative outcomes; uncertainty model with a single learning rate for value and uncertainty (bonus); uncertainty model with separate learning rates for value and uncertainty; and uncertainty model with three learning rates (for positive and negative outcomes, and for uncertainty). A smaller BIC value indicates a better fit; the best-fitting model was the uncertainty model with separate learning rates for value and uncertainty for WT mice (b) and the standard reinforcement-learning model for β2KO mice (c).
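
One way to picture an uncertainty model with separate learning rates for value and uncertainty is a pair of delta rules: one tracks the expected reward, and the other tracks the squared prediction error as a running estimate of outcome variance. The update equations, names and parameter values below are assumptions chosen for illustration; the caption does not spell out the exact equations used in the paper.

```python
import random

def update(value, uncertainty, reward, alpha_v, alpha_u):
    """One trial of delta-rule learning for a single option.

    value       : running estimate of expected reward
    uncertainty : running estimate of reward variance
    alpha_v, alpha_u : separate learning rates (assumed form)
    """
    delta = reward - value  # reward prediction error
    new_value = value + alpha_v * delta
    new_uncertainty = uncertainty + alpha_u * (delta ** 2 - uncertainty)
    return new_value, new_uncertainty

def simulate(p_reward, n_trials, alpha_v=0.05, alpha_u=0.05, seed=0):
    """Learn the value and uncertainty of a Bernoulli reward source."""
    rng = random.Random(seed)
    value, uncertainty = 0.0, 0.0
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        value, uncertainty = update(value, uncertainty, reward, alpha_v, alpha_u)
    return value, uncertainty

v, u = simulate(p_reward=0.5, n_trials=5000)
print(round(v, 2), round(u, 2))
```

Setting `alpha_u` equal to `alpha_v` collapses this sketch to the single-learning-rate variant, and ignoring the uncertainty estimate at choice time recovers a standard value-only learner.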

Supplementary Figure 10 Alternative models for the spatial learning and passive avoidance tasks

(a) Variations of the temperature parameter (β) in the simulation of the spatial learning task using the standard reinforcement-learning model. Original experimental data are represented (mean ± s.e.m.) by dots (black for WT, red for β2KO). The curves represent the modeling of the data with increasing values of β (from top to bottom, black to dark blue). (b) Variations of the initial value (V₀) of the rewarding arm in the simulation of the spatial learning task using the standard reinforcement-learning model. Same presentation as (a). (c) Variations of the learning rate (α) in the simulation of the spatial learning task using the standard reinforcement-learning model. Same presentation as (a). (d) In the simulation of the spatial learning task using the standard reinforcement-learning model, combined modifications of initial value and learning rate hardly explain the WT data. Data are shown as dots with error bars (mean + s.e.m.), simulations as stripes. (e) Variations of the temperature parameter (β) in the simulation of the passive avoidance task using a sequential reinforcement-learning model. Same presentation as (a). (f) Variations of the baseline activity (θ) in the simulation of the passive avoidance task using a sequential reinforcement-learning model. Same presentation as (a). Data in (a-d) adapted with permission from Ref. 42. Data in (e,f) adapted with permission from Ref. 43.

Supplementary Figure 11 Model simulation: open-fields without rewards and object recognition

(a) Decomposition of behavior in an open field. Locomotion in the open field is transformed into four states, resulting from the differentiation between active (A) or inactive (I) states (depending on the velocity) and periphery (P) or center (C) zones. (b) Discretized representation of the behavior based on the four-state decomposition, used for model simulation. Possible transitions are represented by solid arrows and forbidden transitions by dashed arrows. (c,d) Simulation of transition probabilities between “center-active” (CA) and “center-inactive” (CI) states (c), and between “periphery-active” (PA) and “center-active” (CA) states (d), for WT (black, model with uncertainty bonus) and β2KO (red, model without uncertainty bonus) mice. (e) Simulation of total time spent in inactive states (PI and CI) for WT (black) and β2KO (red) mice. (f) Object recognition in an open field. Two states represent the object areas; the rest of the open field is modeled as 25 discrete states. (g) Total time spent in the “object areas” states for WT (black, model with uncertainty bonus) and β2KO (red, model without uncertainty bonus) mice. Data in (c-e) adapted with permission from Ref. 13. Data in (g) adapted with permission from Ref. 42.
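
The four-state decomposition described in panel (a) can be sketched as a simple classifier over tracked samples. The speed threshold, the circular arena geometry and the function names below are hypothetical choices for illustration; the caption states only that states depend on velocity (active or inactive) and on zone (periphery or center).

```python
import math
from collections import Counter

def classify(x, y, speed, arena_radius=50.0, center_fraction=0.6,
             speed_threshold=2.0):
    """Map one tracked sample to one of four states: PA, PI, CA, CI.

    All thresholds are hypothetical: a sample is 'central' if it lies
    within center_fraction * arena_radius of the arena center (0, 0),
    and 'active' if its speed exceeds speed_threshold.
    """
    zone = "C" if math.hypot(x, y) <= center_fraction * arena_radius else "P"
    activity = "A" if speed > speed_threshold else "I"
    return zone + activity

def transition_counts(states):
    """Count transitions between consecutive, distinct states."""
    return Counter((a, b) for a, b in zip(states, states[1:]) if a != b)

# Hypothetical trajectory: (x, y, speed) samples.
samples = [(45, 0, 5.0), (40, 0, 0.5), (35, 0, 6.0), (10, 0, 6.0), (5, 0, 0.2)]
states = [classify(*s) for s in samples]
print(states)
print(transition_counts(states))
```

Dividing each transition count by the number of departures from its source state would give empirical transition probabilities of the kind compared between genotypes in panels (c,d).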

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–11 (PDF 2272 kb)

Supplementary Methods Checklist

(PDF 494 kb)

About this article

Cite this article

Naudé, J., Tolu, S., Dongelmans, M. et al. Nicotinic receptors in the ventral tegmental area promote uncertainty-seeking. Nat Neurosci 19, 471–478 (2016). https://doi.org/10.1038/nn.4223

Received: 09 October 2015
Accepted: 09 December 2015
Published: 18 January 2016
Issue Date: March 2016
DOI: https://doi.org/10.1038/nn.4223