Choosing whether to stick to a belief or to abandon it in the face of uncertainty is central to human behaviour. Modelling implicates brain chemicals called neuromodulators in adjudicating this essential decision.
Why did you choose to read this article? Perhaps you are a neuroscientist eager to refine your knowledge. Or perhaps you are keen to broaden your horizons outside your current discipline. These motivations reflect a fundamental trade-off in how we invest our time and effort: individuals must continually decide whether it would be better to pursue known sources of reward, or whether there is more to be gained by searching for new strategies or opportunities. In reinforcement learning, this dilemma is referred to as the trade-off between exploitation and exploration. There is growing evidence that the mechanisms used to resolve this trade-off are directly regulated by neuromodulators [1,2,3]. Yu and Dayan [4], writing in Neuron, extend this work by using simple principles from Bayesian probability theory to derive a sophisticated model of how neuromodulatory systems are central to the trade-off decision.
As their name suggests, neuromodulators such as dopamine, acetylcholine and noradrenaline seem to modify the effects of neurotransmitters — the molecules that allow communication between neurons. Neuromodulatory systems are implicated in almost every mental function, including attention, learning and emotion, and they are disturbed in major neurological and psychiatric disorders, from Alzheimer's disease and post-traumatic stress disorder to depression and schizophrenia.
The conventional view of neuromodulators has been that they have broad, nonspecific functions, such as signalling reward (dopamine) and regulating arousal (noradrenaline). But a current renaissance in the field is showing that neuromodulators have more-specific functions in learning and decision-making. For example, dopamine neurons are implicated in signalling errors in reward prediction, a role that is central to reinforcement learning [1,5]. Moreover, noradrenaline may be key in facilitating the responses to decision-making processes [6,7], and in regulating the balance between exploitation and exploration [3,8].
To examine the role of neuromodulators in such processes, Yu and Dayan [4] exposed subjects to a task involving a set of cues (differently coloured arrows pointing to the left or right), one of which points to where a target will subsequently appear. The participant must respond as quickly as possible to the target; if they work out what the predictive cue is, they tend to do better. In a typical set-up, the predictive cue stays the same for a number of trials, but then changes without warning.
The crux of the task is that there are two forms of uncertainty associated with the cues. Subjects must work out which cue predicts where the target will appear, as well as how reliably it does so. If the predictive cue were 100% accurate, the task would be trivial: most people would quickly discover which arrow reliably points to the target, and notice as soon as this changed. However, the cue is usually set to be only partly predictive (for example, 80% of the time). And herein lies the intrigue — and generality — of the problem. Suppose you have come to believe that a particular cue predicts the target, but in the last few trials it has failed to do so. How do you know whether this is because the cue is not a perfect predictor of the target (like most cues in the world), or because the relevant cue has changed? More generally, how do we decide whether to stand by our beliefs, even as we recognize their fallibility, or to abandon them in search of better ones?
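The two sources of uncertainty described above can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not the task code used by Yu and Dayan: the function name `simulate_task`, the number of cues, the per-trial switch probability `switch_prob` and the random seed are all hypothetical choices made for illustration. Only the 80% validity figure comes from the text.

```python
import random


def simulate_task(n_trials=200, n_cues=3, validity=0.8, switch_prob=0.05, seed=0):
    """Generate trials of a cueing task with a hidden predictive cue.

    Returns a list of (arrows, target, relevant) tuples, where `arrows`
    holds the direction of each cue (0 = left, 1 = right), `target` is the
    side on which the target appears, and `relevant` is the index of the
    cue that is currently predictive (hidden from the participant).
    """
    rng = random.Random(seed)
    relevant = rng.randrange(n_cues)
    trials = []
    for _ in range(n_trials):
        # Unexpected uncertainty: the predictive cue may change without warning.
        if rng.random() < switch_prob:
            relevant = rng.choice([c for c in range(n_cues) if c != relevant])
        # Every cue points left or right at random on each trial.
        arrows = [rng.randrange(2) for _ in range(n_cues)]
        # Expected uncertainty: the predictive cue is correct only
        # `validity` of the time, and otherwise points the wrong way.
        if rng.random() < validity:
            target = arrows[relevant]
        else:
            target = 1 - arrows[relevant]
        trials.append((arrows, target, relevant))
    return trials
```

From the participant's side only `arrows` and `target` are visible, so a failed prediction on any one trial is ambiguous: it may reflect the cue's ordinary unreliability or a hidden change in `relevant`.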
Yu and Dayan cast this dilemma in terms of a distinction between expected uncertainty (in their task, the less-than-perfect reliability of a cue) and unexpected uncertainty (a surreptitious switch in the relevant cue). They propose that information about these forms of uncertainty is coded in the brain by different neuromodulatory systems — with acetylcholine reflecting the degree of expected uncertainty, and noradrenaline gauging unexpected uncertainty.
Yu and Dayan develop a model of how acetylcholine and noradrenaline levels encode uncertainty, and how their interaction determines whether we should abide by or abandon an existing belief. Their analysis implies that if the optimal strategy were computed on each trial, the process would be so demanding as to be biologically unfeasible. However, they demonstrate that an alternative, biologically plausible probability algorithm can approximate the optimal strategy.
Their model allows Yu and Dayan to make detailed quantitative, and sometimes counterintuitive, predictions about neuromodulatory function and its influence on behaviour. For example, according to their theory the degree of unexpected uncertainty that causes you to abandon a belief should depend on the level of expected uncertainty; that is, if you know that the selected cue is not reliable, you will have a higher tolerance for its failure to predict the target. This dependence yields testable predictions about how disturbances of acetylcholine and noradrenaline levels will affect behaviour. Confirmation of these predictions, in turn, is likely to provide deeper insight into the patterns of behavioural deficits observed in clinical disorders involving disturbances of attention, decision-making and learning.
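The dependence of tolerance on cue reliability can be illustrated with a toy Bayesian calculation. This is a deliberately simplified sketch and not the algorithm of Yu and Dayan: it assumes that a cue which is no longer predictive matches the target at chance (probability 0.5), that any switch happened just before the run of failures began, and that the prior probability of such a switch is a fixed `hazard`. The function name `failures_before_switch` and all parameter values are hypothetical.

```python
def failures_before_switch(validity, hazard=0.05, max_run=1000):
    """Count consecutive failed predictions needed before 'the cue has
    changed' becomes more probable than 'the cue is merely unreliable'.

    validity: assumed probability that the trusted cue predicts the target
              (lower validity means higher expected uncertainty).
    hazard:   assumed prior probability that the relevant cue has changed.
    Returns None if failures never favour a switch within `max_run` trials.
    """
    if validity >= 1.0:
        # A perfectly reliable cue cannot fail, so one failure is decisive.
        return 1
    # Prior odds that the cue has changed versus stayed the same.
    odds = hazard / (1.0 - hazard)
    # Each failure is 0.5 likely if the cue is now irrelevant, and
    # (1 - validity) likely if the cue is still the predictive one.
    likelihood_ratio = 0.5 / (1.0 - validity)
    for k in range(1, max_run + 1):
        odds *= likelihood_ratio
        if odds > 1.0:
            return k
    return None


# Compare tolerance for failure across cues of differing reliability.
for v in (0.95, 0.8, 0.6):
    print(v, failures_before_switch(v))
```

Under these assumptions, a less reliable cue requires a longer run of failures before the evidence favours abandoning it, which is the qualitative pattern described above. For a cue at or below chance, failures carry no evidence of a change and the function returns `None`.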
It is perhaps in this regard that Yu and Dayan's work is most noteworthy. Research on the role of neuromodulatory disturbances in mental disorders has tended to focus on simple hypotheses concerning static excesses or deficits of activity in individual neuromodulatory systems, with little consideration of interactions between systems. A more profound understanding of these dynamics, and their relationship to cognition and behaviour, is crucial if we are to understand how disruption of these systems contributes to the clinical symptoms associated with neurological and psychiatric disorders, and ultimately, how to design effective treatments.
A future problem will be to bring this theory into contact with ones that have framed the question more specifically in terms of the trade-off between exploration and exploitation, and the maximization of reward (for instance, theories about reinforcement learning from neuroscience and utility maximization from economics). The theory also needs to address the high temporal specificity of neuromodulatory systems, which can show rapid phasic responses following task-relevant events (within 100–200 ms), and which may have an immediate impact on task performance [2,6,9,10,11]. Finally, an understanding of the biophysical and circuit mechanisms by which acetylcholine and noradrenaline interact to produce the proposed functions is needed to advance this view of neuromodulators in brain function.
Yu and Dayan's work is an impressive contribution to the evidence that neuromodulatory function is more specific than previously thought. We can say with some certainty that their theory represents a particularly promising direction of research, and that their paper is a highly rewarding read.
1. Schultz, W., Dayan, P. & Montague, P. R. Science 275, 1593–1599 (1997).
2. Hasselmo, M. E. & McGaughy, J. Prog. Brain Res. 145, 207–231 (2004).
3. Aston-Jones, G. & Cohen, J. D. Annu. Rev. Neurosci. 28, 403–450 (2005).
4. Yu, A. J. & Dayan, P. Neuron 46, 681–692 (2005).
5. Montague, P. R., Dayan, P. & Sejnowski, T. J. J. Neurosci. 16, 1936–1947 (1996).
6. Clayton, E. C., Rajkowski, J., Cohen, J. D. & Aston-Jones, G. J. Neurosci. 24, 9914–9920 (2004).
7. Brown, E. T. et al. Int. J. Bifurcat. Chaos 15, 803–826 (2005).
8. Usher, M., Cohen, J. D., Servan-Schreiber, D., Rajkowski, J. & Aston-Jones, G. Science 283, 549–554 (1999).
9. Hasselmo, M. E. Behav. Brain Res. 67, 1–27 (1995).
10. Aston-Jones, G., Rajkowski, J., Kubiak, P. & Alexinsky, T. J. Neurosci. 14, 4467–4480 (1994).
11. Hasselmo, M. E. Neuron 46, 526–528 (2005).