Biases in the Explore–Exploit Tradeoff in Addictions: The Role of Avoidance of Uncertainty

Morris, Laurel S; Baek, Kwangyeol; Kundu, Prantik; Harrison, Neil A; Frank, Michael J; Voon, Valerie

doi:10.1038/npp.2015.208


Original Article
Open access
Published: 15 July 2015

Neuropsychopharmacology volume 41, pages 940–948 (2016)

Abstract

We focus on exploratory decisions across disorders of compulsivity, a potential dimensional construct for the classification of mental disorders. Behaviors associated with the pathological use of alcohol or food, in alcohol use disorders (AUD) or binge-eating disorder (BED), suggest a disturbance in explore–exploit decision-making, whereby strategic exploratory decisions in an attempt to improve long-term outcomes may diminish in favor of more repetitive or exploitatory choices. We compare exploration vs exploitation across disorders of natural (obesity with and without BED) and drug rewards (AUD). We separately acquired resting state functional MRI data using a novel multi-echo planar imaging sequence and independent components analysis from healthy individuals to assess the neural correlates underlying exploration. Participants with AUD showed reduced exploratory behavior across gain and loss environments, leading to lower-yielding exploitatory choices. Obese subjects with and without BED did not differ from healthy volunteers but when compared with each other or to AUD subjects, BED had enhanced exploratory behaviors particularly in the loss domain. All subject groups had decreased exploration or greater uncertainty avoidance to losses compared with rewards. More exploratory decisions in the context of reward were associated with frontal polar and ventral striatal connectivity. For losses, exploration was associated with frontal polar and precuneus connectivity. We further implicate the relevance and dimensionality of constructs of compulsivity across disorders of both natural and drug rewards.

INTRODUCTION

Tricky decisions arise almost daily, from the mundane (should I try something new for lunch today?) to the more exotic (should I move to a different city?). To navigate a dynamic world, individuals must adapt behavior and consider the trade-off between exploring an uncertain environment for the potential to improve beyond the status quo and exploiting known reward sources, in the hope of maintaining optimal decision-making. Behaviors associated with the pathological use of alcohol or food, in alcohol use disorders (AUD) or binge-eating disorder (BED), might suggest a disturbance in explore–exploit decision-making, whereby strategic exploratory decisions in an attempt to improve long-term outcomes may diminish in favor of more repetitive or exploitatory choices. Here we aim to further characterize the trade-off between exploring the uncertain and exploiting the known in these groups.

Faced with an explore–exploit dilemma, one may initially randomly sample the environment and gradually reduce the probability of choosing each action with increasing outcome knowledge. However, such stochastic choice rules describe only random exploration and do not take into account the amount of information that could be gained by sampling an unknown choice. Instead, choices may be directed by the amount of information gained by an exploratory choice (Badre et al, 2012; Frank et al, 2009; Dayan and Sejnowski, 1996). Within this framework, the level of certainty that a choice will engender a better than expected outcome will influence exploratory choice. Using a temporal utility decision-making task, a recent study provided support for this assumption; the inclusion of an uncertainty term in computational modeling of trial-by-trial choices provided a superior description of exploratory choice (Frank et al, 2009). Thus, behavioral measures that are not accounted for by positive and negative prediction error updating can instead be explained as exploratory adjustments toward uncertainty (Badre et al, 2012; Cavanagh et al, 2012).

At a neural level, the frontopolar cortex (FPC) and intraparietal sulcus have been implicated in exploratory behaviors (Daw et al, 2006). With widespread cortical and subcortical anatomical and functional connectivity (Liu et al, 2013), the FPC sits at the top of a hierarchical behavioral control system, evaluating heterogeneous inputs for reward-related cognitive task integration in the pursuit of an advanced behavioral goal (Christoff and Gabrieli, 2000; Koechlin and Hyafil, 2007; Ramnani and Owen, 2004). Activity in FPC increases with exploratory decisions and decreases with exploitative decisions (Daw et al, 2006). In line with the role of uncertainty in driving exploratory choice, the lateral FPC has been shown to track the relative uncertainty of choices when exploratory choices are made, preferentially in those subjects who use an uncertainty-guided exploration strategy (Badre et al, 2012; Cavanagh et al, 2012). Striatal dopamine function, marked by functional polymorphisms in dopaminergic genes, has also been associated with exploitative decision-making by modulating learning from positive and negative prediction errors (Frank et al, 2009).

We focus on exploratory decisions across disorders of compulsivity, a potential dimensional construct for the classification of mental disorders in line with recent Research Domain Criteria strategies (Insel et al, 2010). Compulsivity can be described as repetitions of deleterious choices, which remain insensitive to changes in outcome contingencies and occur despite negative consequences (Robbins et al, 2012; Voon et al, 2014a). An outstanding question is to what extent exploratory choices are altered in disorders of compulsivity. We have recently shown that binge-eating, a compulsive pattern of food intake, presents similar behavioral characteristics to drug taking disorders including greater risk-taking for rewards (Voon et al, 2014c) and greater reliance on habitual learning strategies (Voon et al, 2014a). Binge-eating behavior provides a means of distinguishing crucial subtypes within obesity.

With a task previously shown to elicit uncertainty-driven exploratory decision-making behavior in humans (Badre et al, 2012; Frank et al, 2009; Cavanagh et al, 2012; Kayser et al, 2015), we compare, on a behavioral level, exploration vs exploitation across disorders of natural (obesity with and without BED) and drug rewards (AUD). We expect a trans-pathological marker of reduced strategic uncertainty-driven exploratory behaviors compared with healthy volunteers (HV). We separately acquired resting state functional MRI (rsfMRI) data from healthy individuals to assess the neural correlates underlying exploration. We use a novel multi-echo planar imaging sequence and independent components analysis (ME-ICA) to separate blood oxygen level dependent (BOLD) from non-BOLD activity. This acquisition and analysis greatly enhance signal-to-noise ratios compared with traditional single-echo sequences, thus allowing higher spatial resolution (Kundu et al, 2012). We focus on the connectivity of the FPC and hypothesize that connectivity with ventral striatum (reward valuation) and inferior parietal cortex (action implementation) is associated with exploratory behaviors in the context of reward. We secondarily assess exploration in the context of loss, expecting a similar network including FPC and inferior parietal cortex.

MATERIALS AND METHODS

Participants

We recruited HV from community and University-based advertisements in the East Anglia region, United Kingdom. The recruitment strategy for patient groups has been reported elsewhere (Voon et al, 2014b). For all patient groups, primary diagnoses were confirmed by a psychiatrist using the Diagnostic and Statistical Manual of Mental Disorders, Version IV criteria for substance dependence or Research Diagnostic Criteria for BED (American Psychiatric Association, 2000). Written informed consent was obtained and the study was approved by the University of Cambridge Research Ethics Committee. The same subjects completed the behavioral task outside of the scanner and underwent the rsfMRI scan. For further information see Supplementary Materials.

Task

We used a task previously shown to elicit exploratory decision-making behavior in humans (Badre et al, 2012; Frank et al, 2009). Participants viewed a clock arm that rotated at 5 s per revolution (Figure 1). Participants were instructed to press the space bar before the arm completed a full turn in order to win, and were informed that the time at which the arm was stopped would determine how much money was won. The outcome (£0–£200) was revealed for 1 s followed by an inter-trial interval of 300 ms. There were 40 trials per condition. An early key press did not affect the total time of the task, and subjects were instructed to stop the clock at different times to maximize their potential winnings.

In the previously described task, outcomes varied in probability and magnitude as a function of response times (RTs) such that expected value increased, decreased or remained constant with increasing RTs. In the current version of the task, only the conditions in which expected value was constant across the whole clock, which engender the most exploratory decisions, were used, but with different frequencies and magnitudes. The increasing and decreasing conditions were replaced with a duplicate set of constant expected value conditions (both CEV and CEVReverse), but for which the outcomes were losses instead of gains. This allows us to assess whether participants use the same uncertainty-driven exploration strategy in the domain of losses, whether they are more averse to uncertainty in that case and whether compulsive individuals show any difference not only in exploration but in its modulation by valence. Exploratory choices are those made for clock arm positions (coarsely, fast vs slow portions of the clock) for which reward outcomes were more uncertain given previous samples (Badre et al, 2012). The relationship between the probability of winning or losing, the outcome magnitude and clock position was random and hence was not associated with learning. Further task details and the computational model are reported in Supplementary Materials.
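As a rough illustration of what "more uncertain given previous samples" can mean, the sketch below assumes a formulation along the lines of Frank et al (2009): for each of the fast and slow response classes, a Beta distribution tracks how often outcomes were better vs worse than expected, and an exploration term scales the difference between their standard deviations. The function names, the pseudo-counts, and the value of epsilon are hypothetical; the model actually fitted here is described in the Supplementary Materials and may differ.

```python
import math


def beta_sd(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))


def explore_bonus(epsilon, counts_slow, counts_fast):
    """Exploration term: epsilon * (sd_slow - sd_fast).

    counts_slow and counts_fast are (alpha, beta) pseudo-counts of
    better-than-expected and worse-than-expected outcomes (assumed form).
    A positive result pushes responding toward the slow class, a negative
    result toward the fast class.
    """
    return epsilon * (beta_sd(*counts_slow) - beta_sd(*counts_fast))


# Hypothetical example: slow responses barely sampled, fast responses well sampled.
bonus_seeking = explore_bonus(1000.0, (1, 1), (5, 5))    # epsilon > 0: move toward uncertainty
bonus_averse = explore_bonus(-1000.0, (1, 1), (5, 5))    # epsilon < 0: move away from uncertainty
```

Under this assumed form, a negative epsilon corresponds to uncertainty avoidance, which is how a negative exploration parameter would be read.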

The model parameters were inspected for normality of distribution using the Shapiro–Wilk test. For the exploration parameter, each group was compared with their own matched HV and assessed using mixed measures ANOVA with within-subject factor of valence (gain, loss) and between-subject factor of group. The BED and obese subjects were also directly compared. Data that were skewed (learning rates, ρ) were analyzed using Mann–Whitney U-tests.
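For the skewed parameters, the Mann–Whitney U statistic can be computed directly from pairwise comparisons between the two groups. The following is a minimal sketch with made-up values; it returns only the U statistic, not a p-value, and is not the analysis code used in the study.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: count of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical learning-rate estimates for two groups (illustrative values only).
group_1 = [0.10, 0.35, 0.20, 0.80]
group_2 = [0.05, 0.15, 0.25]

u_1 = mann_whitney_u(group_1, group_2)
u_2 = mann_whitney_u(group_2, group_1)
# The two statistics always sum to len(group_1) * len(group_2).
```

In practice a statistics package would be used to obtain the p-value from U.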

Resting State Functional MRI

We employed a novel ME-ICA in which BOLD signals were identified as independent components having linear TE-dependent signal change and non-BOLD signals were identified as TE-independent components (Kundu et al, 2012). Spatial smoothing was conducted with a Gaussian kernel (full width half maximum=6 mm). CONN-fMRI Functional Connectivity toolbox (Whitfield-Gabrieli and Nieto-Castanon, 2012) for Statistical Parametric Mapping SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/) was used for functional connectivity analysis. A strictly defined region of interest (ROI) for the FPC was used based on strong a priori hypotheses (Daw et al, 2006), to compute ROI-to-voxel connectivity maps. These maps were entered into second level correlation analysis with exploration behavioral measures, using cluster extent threshold correction calculated at 15 voxels at p<0.001 whole brain uncorrected, which corrects for multiple comparisons at p<0.05 assuming an individual-voxel Type I error of p=0.01 (Slotnick et al, 2003). Further details are reported in Supplementary Materials.

RESULTS

The subject characteristics have been previously reported (Voon et al, 2014a,c; see Table 1). Thirty-two AUD subjects (weeks abstinent 16.62 (SD 16.72); years of dependence 13.67 (SD 9.40); units/day 27.28 (SD 13.95), on the following medications (acamprosate 2; disulfiram 1)), 31 obese with BED and 30 obese without BED were matched with their own age- and gender-matched HV (N=55 for each group). AUD and obese with BED had higher depression scores compared with HV. Obese with and without BED had higher body mass index (BMI) and obese with BED had higher Binge Eating Scale (BES) scores.

Table 1 Subject Characteristics


Behavioral Characterization of Explore–Exploit Dilemma Across Disorders of Natural and Drug Rewards

The data from one healthy volunteer and one AUD subject were removed as they were >3 SD above the group mean. The exploration indices for gain and for loss were square root transformed.

Exploration indices were compared between gain and loss separately for each subject group using repeated measures ANOVA with smoking status as a covariate of no interest. Greater exploratory behavior in the context of gain compared with loss was observed in HV (F(1,94)=511.77, p<0.001), AUD (F(1,29)=178.99, p<0.001), obese subjects (F(1,28)=109.17, p<0.001), and in BED (F(1,29)=72.10, p<0.001), supporting the interpretation that subjects are averse to uncertainty in the context of losses, possibly in the fear that their exploratory choices could yield yet worse outcomes.

In the AUD comparison with HV, there was a main valence effect (F(1,84)=36.00, p<0.001) and a group effect (F(1,84)=6.69, p=0.003) in which AUD subjects had lower exploration indices compared to HV with no interaction effect (F(1,84)=0.032, p=0.858) (Figure 1). With the addition of smoking status as a covariate of no interest, the group effect remained significant (p=0.035).

In the BED comparison with HV, there was a main valence effect (F(1,84)=4187.31, p<0.001) and no group (F(1,84)=0.46, p=0.499) or interaction effect (F(1,84)=1.50, p=0.224). In the obese comparison with HV, there was a main valence effect (F(1,83)=4105.23, p<0.001) and no group (F(1,83)=2.00, p=0.161) or interaction effect (F(1,83)=0.17, p=0.683). We then compared the BED and obese subjects, which showed a trend towards a group difference (F(1,56)=3.47, p=0.068). With the addition of age, gender and smoking status as covariates of no interest, we show a main valence effect (F(1,56)=58.39, p<0.001) and main group effect (F(1,56)=4.60, p=0.037) in which obese subjects had lower exploration indices than BED. There was a trend towards an interaction between group × valence (F(1,56)=3.48, p=0.068). Posthoc analysis revealed significant differences between groups in the loss (p=0.041) but not gain condition (p=0.405).

We also compared AUD with BED subjects with age, gender, and smoking status as covariates of no interest, showing a main group effect (F(1,54)=9.19, p=0.004) in which AUD subjects were less exploratory than BED subjects; a main valence effect (F(1,54)=50.94, p<0.001); and a group × valence interaction (F(1,54)=8.60, p=0.005). Posthoc testing revealed a significant group difference in the loss domain only (p=0.003) in which BED subjects were more exploratory compared with AUD subjects.

On an exploratory basis, we examined the influence of smoking status in HV. We identified 13 current smokers and 83 current non-smokers and compared these using mixed measures ANOVA. There was a main valence effect (F(1,94)=511.77, p<0.001) and a group × valence interaction (F(1,94)=5.02, p=0.027) in which smokers made more exploratory choices under gain and fewer exploratory choices under loss compared to non-smokers (Figure 1). There was no main group effect (F(1,94)=1.76, p=0.187).

The other parameter fits were also compared between AUD and HV and between obese subjects with and without BED. There were no differences in the other parameters (Table 2, Supplementary Figure S1). There were no correlations between the exploration indices and measures of alcohol severity, BMI, or BES.

Table 2 Best Fitting Model Parameters and Model Fit


Frontal Polar Cortex Connectivity and Exploration

Of the participants who completed the task, 37 HV (20 male; mean age 35, SD 15; verbal IQ 115, SD 11) underwent resting state fMRI with a multi-echo resting state sequence. This acquisition and analysis greatly enhance signal-to-noise ratios compared with traditional techniques and provide enhanced spatial resolution based on robust physical principles (Kundu et al, 2012). The explore/exploit task was tested out of the scanner. The FPC was carefully defined and used as a seed. Connectivity was quantified by calculating Pearson correlation coefficients between activity within the seed and the whole brain, producing seed-to-voxel whole-brain connectivity maps. These maps were then correlated with the behavioral measure of exploration. Age was included as a covariate of no interest.
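The two-step logic described here (a seed-to-voxel map per subject, then a brain-behavior correlation across subjects) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the CONN toolbox pipeline: the function names and toy data are hypothetical, the age covariate and cluster thresholding are omitted, and time series are plain Python lists.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def seed_to_voxel_map(seed_ts, voxel_ts):
    """First level: correlate the seed time series with each voxel time series."""
    return [pearson_r(seed_ts, ts) for ts in voxel_ts]


def brain_behavior_correlation(conn_maps, exploration):
    """Second level: per voxel, correlate connectivity across subjects with behavior."""
    n_voxels = len(conn_maps[0])
    return [
        pearson_r([subject_map[v] for subject_map in conn_maps], exploration)
        for v in range(n_voxels)
    ]


# Toy data: one seed time series and three voxel time series for one subject.
seed = [1.0, 2.0, 3.0, 4.0]
voxels = [[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0], [1.0, -1.0, -1.0, 1.0]]
subject_map = seed_to_voxel_map(seed, voxels)

# Toy second level: three subjects, two voxels, one exploration index per subject.
maps = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.3]]
behavior = [1.0, 2.0, 3.0]
voxelwise_r = brain_behavior_correlation(maps, behavior)
```

A real analysis would typically Fisher-transform the first-level correlations and fit a regression with covariates at the second level, rather than a bare correlation.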

Cluster-extent threshold analysis (calculated at 15 voxels at p<0.001 whole-brain uncorrected, correcting for multiple comparisons at p<0.05 assuming an individual-voxel Type I error of p=0.01 (Slotnick et al, 2003)) revealed that exploration in the context of reward was positively correlated with FPC and ventral striatal connectivity (peak coordinates x y z=−22, 21, −10 mm; cluster size=32; Z=4.38, Figure 2). In the context of loss, greater exploration was positively correlated with greater FPC and precuneus connectivity (peak x y z=−1, −41, 42 mm; cluster size=24; Z=3.61).

Finally, we map connectivity of the FPC to the whole brain. At whole-brain FWE p<0.05 we find that FPC is functionally connected with a network including dorsolateral prefrontal cortex, precuneus, inferior parietal cortex and, subcortically, the ventral striatum (Figure 3, Table 3).

Table 3 Statistics of FPC and Whole Brain Connectivity


DISCUSSION

We employed a choice task previously used to demonstrate strategic exploratory decision-making behavior in healthy humans (Badre et al, 2012; Frank et al, 2009). All groups show a conserved effect of valence such that exploration was higher in the reward domain compared with the loss domain. Indeed, in the loss domains, subjects showed a consistently negative exploration parameter, meaning that they were averse to uncertainty when there was some prospect of losing even more. These findings potentially reflect the asymmetrical influence of gains and losses on choice behavior (Kahneman and Tversky, 1979) imposed by the strength of loss aversion as a consistent mediator of choice (Tom et al, 2007).

Exploratory behavior in subjects with AUD was reduced across gain and loss environments, in favor of more repetitive or exploitative choices. Obese subjects with and without BED did not differ from HV in their exploratory choices. However, when compared with each other, there were greater exploratory behaviors in BED subjects compared with those without BED. There was a trend toward a group × valence interaction driven by greater exploratory behaviors to losses in BED subjects compared with those without BED. Similarly, BED subjects had greater exploratory behaviors, particularly to losses, compared with AUD. Furthermore, we investigated the influence of smoking in HV on a pilot basis: current smokers showed an enhancement of the influence of valence with greater exploration to gain outcomes and less exploration to loss outcomes compared with non-smokers. Exploratory behavior in HV was associated with an underlying network including FPC and ventral striatal connectivity in the context of reward and FPC and precuneus for losses.

Compared with HV, AUD subjects had restricted exploratory behaviors and were more likely to avoid uncertainty across both gain and loss stimulus-outcomes in a task that is independent of learning. AUD subjects have been shown to have abnormalities in decision making under ambiguity or uncertainty as measured using the Iowa Gambling Task (Goudriaan et al, 2005; Bechara et al, 2001). Our findings extend these results to suggest either intolerance/avoidance of uncertainty, or a reduced tendency to use a controlled strategy that searches for uncertain outcomes so as to maximize information gain. The current findings of reduced exploration in an unknown environment dovetail with findings suggesting that the effects of alcohol are selective for uncertainty-related anxiety rather than certainty-related fear (Hefner and Curtin, 2012), the former being hypothesized to drive the negative-reinforcement cycle of alcohol use (Edwards and Koob, 2010). An alternate explanation may be that changes in outcome sensitivity, rather than uncertainty avoidance, engender reluctance to explore. However, decreased sensitivity to outcome may be more likely to manifest as greater exploration to sample further stimulus-outcome contingencies. Although we do not explicitly measure the role of novelty, decreased exploration may relate to the possible presence of novel environments. Ethanol withdrawal in rodents indeed causes reduced exploration of brightly lit chambers (Hascoet et al, 2001).

Furthermore, like HV, AUD subjects had decreased exploratory behaviors to losses compared with gains, suggesting sensitivity to their differential influences. Current smokers also have an enhancement of this differential effect of valence with greater exploratory behaviors to gains and the opposite to losses relative to non-smokers. The enhancement in exploration for gains is in line with enhanced reward sensitivity related to nicotine use (Rose et al, 2013). This finding invites the suggestion that participants who are more likely to explore the potential hedonic benefits of smoking are those that become smokers. The findings in the loss domain suggest a potential role for enhanced loss aversion in smokers with greater avoidance of uncertainty in a loss context, perhaps facilitating sustained smoking in the presence of perceived small losses associated with immediate health consequences, rather than exploring alternative strategies that would require giving up smoking for potentially other (eg social) losses. Although losses in the form of social and health costs are difficult to model, the secondary reinforcer of money can act as a proxy. These findings in AUD and smokers may be consistent with the negative reinforcement model of addiction (Koob, 2013; Koob and Le Moal, 2005) whereby a negative context may drive exploitative repetitive behaviors to avoid losses. Reduced exploration, or more repetitive choices, in the face of losses is consistent with theories that neuroadaptive systems driving aversive states lead to repetitive drug-seeking behaviors (Edwards and Koob, 2010). Indeed, negative affect in smokers is associated with craving severity (Robinson et al, 2011). Together with the current findings, this may explain how particular environmental influences (ie negative outcomes in the form of financial, social, or health losses) may facilitate the repetition of behaviors with certain, known outcomes, such as pathological drinking and smoking behaviors. Although these findings are intriguing, we caution that the findings in smokers are preliminary as the sample size of current users is small, and we cannot rule out an impact of nicotine etc. on exploration rather than the other direction of causality.

That subjects display reduced exploration for losses contrasts with the observation of enhanced ambiguity seeking in the face of losses in healthy humans (Ho et al, 2002; Chakravarty and Roy, 2009). However, this discrepancy is also similar to the observation of ambiguity aversion in the face of gains, despite exploration toward uncertain options in that case. The main difference is that in a learning task, choosing an ambiguous option can serve to reduce subsequent ambiguity, ie exploration drives learning. In the case of losses, it is thus perhaps surprising that subjects do not seek uncertain options to reduce subsequent ambiguity. In addition, the current study deals with explicit and experienced uncertainty rather than hypothetical ambiguity. The effect of valence on risky choice has been shown to be reversed when choices are either experience or description-based, with the former reducing risk-seeking for losses (Ludvig and Spetch, 2011) consistent with our findings. Furthermore, there may be at least two strategies for approaching an explore–exploit dilemma: choice biased toward information seeking; and random exploratory decisions involving chance (Wilson et al, 2014) and perhaps subjects adopted a strategy to simply increase random choices in the case of losses rather than rely on uncertainty.

Our findings show decreased exploration in obese subjects without BED as compared with BED suggesting differences as a function of greater avoidance of uncertainty. BED subjects appear to be more biased toward exploratory behavior but particularly in the context of losses and not to gains, that is, the opposite profile from smokers. These findings are similarly evident in the comparison of AUD and BED subjects in which BED have greater exploratory behaviors and particularly in the loss domain. This dissociation of valence coincides with previous work showing that BED subjects demonstrate greater risk taking for high probability losses only (Voon et al, 2014c) possibly suggesting less of an influence of loss aversion. These findings suggest differences between AUD and BED subjects particularly in the loss domain. Whether the distinct rewards of choice (natural or drug) are responsible for causing increased or decreased exploration in the face of loss or whether they are a product of an inherent attraction or aversion to exploration, remains a question for future studies. The suggestion that neuroadaptive negative reinforcement systems are initiated or propagated by excessive reward system activation (Koob, 2013), may explain the current finding of heightened sensitivity to losses in smokers and individuals with AUD, but not in BED, whereby nicotine and alcohol hijack the reward system to a greater degree than food. Moreover, we note that the negative consequences of binge eating on weight gain are far more immediate than those of smoking, which are perceived to be delayed and subject to potential quitting.

Our findings further highlight a role for an intrinsic network of FPC connectivity in exploration biases. The FPC sits at the outermost periphery of the hierarchical prefrontal control regions (Christoff and Gabrieli, 2000; Koechlin and Hyafil, 2007), being well poised to mediate higher level strategic switches rather than behavioral sequence control. Accumulating evidence suggests that through interactions with social/emotional network (orbitofrontal cortex, amygdala), cognitive network (dorsolateral prefrontal cortex) and default mode network (precuneus, anterior cingulate cortex; Liu et al, 2013), the FPC orchestrates more flexible and self-relevant behavioral control in the pursuit of optimal decision-making (Koechlin and Hyafil, 2007). We show that FPC and ventral striatal connectivity is associated with exploration in the context of a rewarding environment. This coincides with the notion that the FPC coordinates voluntary and adaptive switching based on uncertainty and expected value (Badre et al, 2012; Daw et al, 2006). Exploration may depend on the probability that an explored choice will provide a better outcome than expected based on previous experiences (a positive prediction error; Frank et al, 2009). It is thus possible that the FPC-VS connectivity implies a reward value assignment to the potential for exploring. This would not be expected in the context of losses because the value of exploring is only to reduce loss values rather than provide a positive outcome.

We also show that FPC and precuneus connectivity positively correlates with exploration in the loss domain. Although the precuneus has been traditionally associated with integration of visuo-spatial imagery (Selemon and Goldman-Rakic, 1988), converging evidence suggests a role in integration of external and self-relevant information (Cavanna and Trimble, 2006). Furthermore, goal-directed hand movements (Karnath and Perenin, 2005) and voluntary attentional shifts between targets, even in the absence of an overt motor response (Culham et al, 1998), are mediated by the precuneus. Functional links between FPC and the default mode network (Liu et al, 2013) support its role in processing internal rather than external generation of information (Christoff and Gabrieli, 2000) to guide future-focused (Okuda et al, 2003) decision-making. The current findings suggest that although assignment of perceived agency to actions and encoding and organizing of intentions is mediated by the precuneus, it may interact with the FPC (Liu et al, 2013), which in turn processes internally-generated goals for behavioral control (Ramnani and Owen, 2004; Okuda et al, 2003). Further evidence of the role of the precuneus in exploratory choices comes from studies of foraging behavior. Humans may alternate between economic decisions and choices governed by sequential ‘engage or search elsewhere’ foraging choices (Kolling et al, 2012). Foraging choices (compared with decisions between two options) have been associated with activations in the precuneus extending to posterior cingulate cortex (PCC; Kolling et al, 2012) and PCC seems to be sensitive to riskier compared with safer choices (Kolling et al, 2014). That this region is associated with riskier choices suggests why it may be associated with exploratory choices for losses rather than rewards.

Although recent evidence implicates both FPC and inferior parietal cortex in exploratory choices (Daw et al, 2006; Boorman et al, 2009), we did not find significant correlations for inferior parietal cortex. In a previous study, activity in both FPC and the inferior parietal sulcus correlated with the ratio between an unchosen and chosen action probability, or the relative unchosen probability (Boorman et al, 2009). However, the inferior parietal sulcus was only recruited when a switch in choice occurred (Boorman et al, 2009). Therefore, the FPC seems to track information accumulation relevant to switching to an alternate choice—here to reduce uncertainty—but engages the parietal cortex immediately before switching, which implements the switch itself. In line with this hypothesis, a recent study examining negative outcomes implicated the inferior parietal cortex in encoding actions and outcome objects but a more medial region, similar to that implicated in the current study, in encoding the action × object interaction reflecting the appropriate or inappropriate action (Morrison et al, 2013).

Our findings suggest biases in exploratory behaviors in the context of an uncertain environment across the misuse of drug and natural rewards. We also highlight the conserved effect of valence on exploration across groups with enhanced uncertainty avoidance to losses possibly reflecting an interaction with underlying loss aversion tendencies. Although we do not currently examine the neural correlates of exploration in the pathological groups, we build upon the understanding of the role of the FPC in guiding higher order and flexible decision-making, illustrating the possible means through which it coordinates behavioral processes in HV. Together, the findings further the characterization of overlapping disorders of natural and drug rewards by maintaining the use of dimensional facets of compulsivity.

FUNDING AND DISCLOSURE

The study was funded by the Wellcome Trust Fellowship grant for VV (093705/Z/10/Z) and Cambridge NIHR Biomedical Research Centre. MJF is funded by NIMH and NSF grants, and is a consultant for Hoffmann–LaRoche pharmaceuticals. The remaining authors declare no conflict of interest.

References

American Psychiatric Association (2000). Diagnostic and Statistical Manual of Mental Disorders. 4th edn, Text rev. American Psychiatric Association: Washington, DC.
Badre D, Doll BB, Long NM, Frank MJ (2012). Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron 73: 595–607.
Bechara A, Dolan S, Denburg N, Hindes A, Anderson SW, Nathan PE (2001). Decision-making deficits, linked to a dysfunctional ventromedial prefrontal cortex, revealed in alcohol and stimulant abusers. Neuropsychologia 39: 376–389.
Boorman ED, Behrens TE, Woolrich MW, Rushworth MF (2009). How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron 62: 733–743.
Cavanagh JF, Figueroa CM, Cohen MX, Frank MJ (2012). Frontal theta reflects uncertainty and unexpectedness during exploration and exploitation. Cereb Cortex 22: 2575–2586.
Cavanna AE, Trimble MR (2006). The precuneus: a review of its functional anatomy and behavioural correlates. Brain 129: 564–583.
Chakravarty S, Roy J (2009). Recursive expected utility and the separation of attitudes towards risk and ambiguity: an experimental study. Theor Decis 66: 199–228.
Christoff K, Gabrieli JDE (2000). The frontopolar cortex and human cognition: evidence for a rostrocaudal hierarchical organization within the human prefrontal cortex. Psychobiology 28: 168–186.
Culham JC, Brandt SA, Cavanagh P, Kanwisher NG, Dale AM, Tootell RB (1998). Cortical fMRI activation produced by attentive tracking of moving targets. J Neurophysiol 80: 2657–2670.
Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ (2006). Cortical substrates for exploratory decisions in humans. Nature 441: 876–879.
Dayan P, Sejnowski TJ (1996). Exploration bonuses and dual control. Mach Learn 25: 5–22.
Edwards S, Koob GF (2010). Neurobiology of dysregulated motivational systems in drug addiction. Future Neurol 5: 393–401.
Frank MJ, Doll BB, Oas-Terpstra J, Moreno F (2009). Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat Neurosci 12: 1062–1068.
Goudriaan AE, Oosterlaan J, de Beurs E, van den Brink W (2005). Decision making in pathological gambling: a comparison between pathological gamblers, alcohol dependents, persons with Tourette syndrome, and normal controls. Brain Res Cogn Brain Res 23: 137–151.
Hascoet M, Bourin M, Nic Dhonnchadha BA (2001). The mouse light-dark paradigm: a review. Prog Neuropsychopharmacol Biol Psychiatry 25: 141–166.
Hefner KR, Curtin JJ (2012). Alcohol stress response dampening: selective reduction of anxiety in the face of uncertain threat. J Psychopharmacol 26: 232–244.
Ho JLY, Keller LR, Keltyka P (2002). Effects of outcome and probabilistic ambiguity on managerial choices. J Risk Uncertain 24: 47–74.
Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K et al (2010). Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am J Psychiatry 167: 748–751.
Kahneman D, Tversky A (1979). Prospect Theory—analysis of decision under risk. Econometrica 47: 263–291.
Karnath HO, Perenin MT (2005). Cortical control of visually guided reaching: evidence from patients with optic ataxia. Cereb Cortex 15: 1561–1569.
Kayser AS, Mitchell JM, Weinstein D, Frank MJ (2015). Dopamine, locus of control, and the exploration-exploitation tradeoff. Neuropsychopharmacology 40: 454–462.
Koechlin E, Hyafil A (2007). Anterior prefrontal function and the limits of human decision-making. Science 318: 594–598.
Kolling N, Behrens TEJ, Mars RB, Rushworth MFS (2012). Neural mechanisms of foraging. Science 336: 95–98.
Kolling N, Wittmann M, Rushworth MFS (2014). Multiple neural mechanisms of decision making and their competition under changing risk pressure. Neuron 81: 1190–1202.
Koob GF (2013). Negative reinforcement in drug addiction: the darkness within. Curr Opin Neurobiol 23: 559–563.
Koob GF, Le Moal M (2005). Plasticity of reward neurocircuitry and the 'dark side' of drug addiction. Nat Neurosci 8: 1442–1444.
Kundu P, Inati SJ, Evans JW, Luh WM, Bandettini PA (2012). Differentiating BOLD and non-BOLD signals in fMRI time series using multi-echo EPI. Neuroimage 60: 1759–1770.
Liu H, Qin W, Li W, Fan L, Wang J, Jiang T et al (2013). Connectivity-based parcellation of the human frontal pole with diffusion tensor imaging. J Neurosci 33: 6782–6790.
Ludvig EA, Spetch ML (2011). Of black swans and tossed coins: is the description-experience gap in risky choice limited to rare events? PLoS One 6: e20262.
Morrison I, Tipper SP, Fenton-Adams WL, Bach P (2013). ‘Feeling’ others' painful actions: the sensorimotor integration of pain and action information. Hum Brain Mapp 34: 1982–1998.
Okuda J, Fujii T, Ohtake H, Tsukiura T, Tanji K, Suzuki K et al (2003). Thinking of the future and past: the roles of the frontal pole and the medial temporal lobes. Neuroimage 19: 1369–1380.
Ramnani N, Owen AM (2004). Anterior prefrontal cortex: insights into function from anatomy and neuroimaging. Nat Rev Neurosci 5: 184–194.
Robbins TW, Gillan CM, Smith DG, de Wit S, Ersche KD (2012). Neurocognitive endophenotypes of impulsivity and compulsivity: towards dimensional psychiatry. Trends Cogn Sci 16: 81–91.
Robinson JD, Lam CY, Carter BL, Minnix JA, Cui Y, Versace F et al (2011). A multimodal approach to assessing the impact of nicotine dependence, nicotine abstinence, and craving on negative affect in smokers. Exp Clin Psychopharmacol 19: 40–52.
Rose EJ, Ross TJ, Salmeron BJ, Lee M, Shakleya DM, Huestis MA et al (2013). Acute nicotine differentially impacts anticipatory valence- and magnitude-related striatal activity. Biol Psychiatry 73: 280–288.
Selemon LD, Goldman-Rakic PS (1988). Common cortical and subcortical targets of the dorsolateral prefrontal and posterior parietal cortices in the rhesus monkey: evidence for a distributed neural network subserving spatially guided behavior. J Neurosci 8: 4049–4068.
Slotnick SD, Moo LR, Segal JB, Hart J Jr (2003). Distinct prefrontal cortex activity associated with item memory and source memory for visual shapes. Brain Res Cogn Brain Res 17: 75–82.
Tom SM, Fox CR, Trepel C, Poldrack RA (2007). The neural basis of loss aversion in decision-making under risk. Science 315: 515–518.
Voon V, Derbyshire K, Rück C, Irvine MA, Worbe Y, Enander J et al (2014a). Disorders of compulsivity: a common bias towards learning habits. Mol Psychiatry 20: 345–352.
Voon V, Irvine MA, Derbyshire K, Worbe Y, Lange I, Abbott S et al (2014b). Measuring ‘waiting’ impulsivity in substance addictions and binge eating disorder in a novel analogue of rodent serial reaction time task. Biol Psychiatry 75: 148–155.
Voon V, Morris LS, Irvine MA, Ruck C, Worbe Y, Derbyshire K et al (2014c). Risk-taking in disorders of natural and drug rewards: neural correlates and effects of probability, valence, and magnitude. Neuropsychopharmacology 40: 804–812.
Whitfield-Gabrieli S, Nieto-Castanon A (2012). Conn: a functional connectivity toolbox for correlated and anticorrelated brain networks. Brain Connect 2: 125–141.
Wilson RC, Geana A, White JM, Ludvig EA, Cohen JD (2014). Humans use directed and random exploration to solve the explore-exploit dilemma. J Exp Psychol Gen 143: 2074–2081.

Acknowledgements

VV and NAH are Wellcome Trust (WT) intermediate Clinical Fellows. LSM is in receipt of an MRC studentship. The BCNI is supported by a WT and MRC grant.

Author information

Authors and Affiliations

Institute of Behavioural and Clinical Neuroscience, University of Cambridge, Cambridge, UK
Laurel S Morris & Valerie Voon
Department of Psychiatry, University of Cambridge, Addenbrooke’s Hospital, Cambridge, UK
Kwangyeol Baek, Prantik Kundu & Valerie Voon
Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, USA
Kwangyeol Baek, Prantik Kundu & Valerie Voon
Department of Psychiatry, Brighton and Sussex Medical School, Brighton, UK
Prantik Kundu
Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Psychiatry and Human Behavior, Brown University, Providence, RI, USA
Neil A Harrison
Cambridgeshire and Peterborough NHS Foundation Trust, Cambridge, UK
Michael J Frank
NIHR Cambridge Biomedical Research Centre, Cambridge, UK
Valerie Voon

Authors

Laurel S Morris
Kwangyeol Baek
Prantik Kundu
Neil A Harrison
Michael J Frank
Valerie Voon

Corresponding author

Correspondence to Valerie Voon.

Additional information

Supplementary Information accompanies the paper on the Neuropsychopharmacology website

Supplementary information

Supplementary Figure S1 (JPG 118 kb)

Supplementary Figure Legends (DOCX 25 kb)

Supplementary Information (DOCX 113 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Morris, L., Baek, K., Kundu, P. et al. Biases in the Explore–Exploit Tradeoff in Addictions: The Role of Avoidance of Uncertainty. Neuropsychopharmacol 41, 940–948 (2016). https://doi.org/10.1038/npp.2015.208

Received: 19 March 2015
Revised: 03 July 2015
Accepted: 05 July 2015
Published: 15 July 2015
Issue Date: March 2016
DOI: https://doi.org/10.1038/npp.2015.208
