Abstract
Valuation is a key tenet of decision neuroscience, where it is generally assumed that different attributes of competing options are assimilated into unitary values. Such values are central to current neural models of choice. By contrast, psychological studies emphasize complex interactions between choice and valuation. Principles of neuronal selection also suggest that competitive inhibition may occur in early valuation stages, before option selection. We found that behavior in multi-attribute choice is best explained by a model involving competition at multiple levels of representation. This hierarchical model also explains neural signals in human brain regions previously linked to valuation, including striatum, parietal and prefrontal cortex, where activity represents within-attribute competition, competition between attributes and option selection. This multi-layered inhibition framework challenges the assumption that option values are computed before choice. Instead, our results suggest a canonical competition mechanism throughout all stages of a processing hierarchy, not simply at a final choice stage.
Change history
10 August 2015
In the version of this article initially published, a minus sign was missing before the coefficient β in equation (3), Online Methods. The error has been corrected in the HTML and PDF versions of the article.
Acknowledgements
We thank M. Rushworth, K. Tsetsos and M. Usher for helpful discussions, and H. Barron and P. Smittenaar for comments on the manuscript. This work was supported by the Wellcome Trust (Sir Henry Wellcome Fellowship 098830/Z/12/Z to L.T.H., Senior Investigator Award 098362/Z/12/Z to R.J.D. and Career Development Fellowship 088312AIA to T.E.J.B.). The Wellcome Trust Centre for Neuroimaging is supported by core funding from the Wellcome Trust (091593/Z/10/Z).
Author information
Contributions
All of the authors contributed to study design and preparation of the manuscript. L.T.H. acquired the data and built the model. L.T.H. and T.E.J.B. analyzed data.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Maximum likelihood parameter estimates for logarithm of γ parameter, testing for subjective preference for one attribute over another.
Values greater than 0 indicate that the subject used stimulus information more heavily; values less than 0 indicate that the subject used action information more heavily. Bar shows mean; dots show individual datapoints. Although some variation can be seen across the population, the distribution of log(γ) was not significantly different from zero (T(18)=0.46, p=0.65). See methods section (‘preference for using stimulus or action attribute’) for details of model.
Supplementary Figure 2 Schematic of the hierarchical competition network that best captures subject behavior.
Leaky, competing accumulators subserve competition at lower, within-attribute levels (LCAl) and higher, between-option levels (LCAh). Numbers in each of these nodes represent the three different alternatives. The amount of competition within-attribute (within LCAl nodes) itself forms the input for a competition for attention between stimulus and action attributes. Inputs to this competition are denoted by **; this competition is essentially the most novel part of the proposed model. This ‘attribute competition’ then feeds back to modulate attribute salience, such that attributes with a high within-attribute value difference capture attention and are upweighted in the decision, and attributes with a low within-attribute value difference are deemed less relevant and downweighted in the decision. The inputs to the lower-level competitions undergo divisive normalisation. See methods for mathematical description.
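The lower-level competition described above can be illustrated with a minimal sketch. It is a generic, noise-free leaky competing accumulator with divisively normalised inputs, not the authors' model: the update rule, the parameter names (`leak`, `inhibition`, `dt`, `sigma`) and the example values are all assumptions chosen for illustration, and the attribute-competition feedback and the higher-level stage are omitted.

```python
def divisive_normalise(values, sigma=0.0):
    """Divide each input by (sigma + sum of all inputs). sigma is a hypothetical constant."""
    total = sigma + sum(values)
    return [v / total for v in values]


def lca_step(x, inputs, leak=0.1, inhibition=0.2, dt=0.1):
    """One noise-free Euler step of a generic leaky competing accumulator."""
    total = sum(x)
    new_x = []
    for xi, inp in zip(x, inputs):
        others = total - xi  # summed activity of the competing units
        dx = (inp - leak * xi - inhibition * others) * dt
        new_x.append(max(0.0, xi + dx))  # activity is rectified at zero
    return new_x


def run_lca(inputs, n_steps=200, **kwargs):
    """Iterate the accumulator from zero activity for a fixed number of steps."""
    x = [0.0] * len(inputs)
    for _ in range(n_steps):
        x = lca_step(x, inputs, **kwargs)
    return x


# Hypothetical values of three options on a single attribute.
stimulus_values = [0.8, 0.6, 0.1]
activity = run_lca(divisive_normalise(stimulus_values))
winner = activity.index(max(activity))  # index of the most active unit
```

Because mutual inhibition exceeds the leak in these illustrative settings, small differences in input are amplified over time, which is the sense in which the units compete rather than simply accumulate.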
Supplementary Figure 3 Alternative choice models fail to capture within-attribute distractor effect.
In the main paper, we present the hierarchical LCA as an explanation of both behavioral and neural observations. Here, we show that the behavior of several alternative models fails to capture the within-attribute distractor effect. In each case, we simulated choices from the model, and ran the same analysis on model choices as was performed on subject behavior (in main figure 3). (a) Choice behavior of a simple race model, in which the accumulation rate on each option is determined by the integrated value (pO) on each option. The red and blue lines overlie one another, showing that the weight given to each stimulus/action is the same irrespective of the value of the third option. Similar results could be obtained using a softmax choice rule. This also confirms that there is no inherent selection bias in the analysis run in main figure 3. (b) Choice behavior of a feedforward inhibition model, in which the rate of evidence accumulation on each option was proportional to that option’s value minus the mean value of all other options. Note that the blue lines – where option 3 is higher in value – are here slightly steeper than the red lines – implying a ‘reverse’ distractor effect. The model selects between options 1 and 2 with slightly higher accuracy when the value of the third option is higher. This may be caused by an increase in network inhibition when the third option is higher in value – leading to slower, but more accurate, decisions. (c) Choice behavior of a feedforward inhibition model with divisive normalisation of integrated values. As expected from previous studies, this leads to a third-option distractor effect in the same direction as observed in subject behavior (red lines are steeper than blue lines; selection between option 1 and 2 is more accurate when option 3 is lower in value).
Importantly, however, because divisive normalisation is implemented at the level of integrated values, this distractor effect does not occur selectively within-attribute, but generalises across attributes (see lower two panels).
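The difference between panels (b) and (c) can be made concrete with a small static calculation. The sketch below computes the feedforward-inhibition drive exactly as worded in the caption (each option's value minus the mean value of the others), with and without divisive normalisation; the function names and the example values are hypothetical, and no accumulation dynamics or noise are simulated.

```python
def feedforward_inhibition_drive(values):
    """Drive for each option: its value minus the mean value of the other options."""
    n = len(values)
    total = sum(values)
    return [v - (total - v) / (n - 1) for v in values]


def divisive_normalise(values):
    """Scale each value by the sum of all values."""
    total = sum(values)
    return [v / total for v in values]


def drive_gap(values, normalise):
    """Difference in drive between option 1 and option 2."""
    if normalise:
        values = divisive_normalise(values)
    drive = feedforward_inhibition_drive(values)
    return drive[0] - drive[1]


# Hypothetical integrated values; only the third option differs between the two sets.
low_third = [0.7, 0.5, 0.1]
high_third = [0.7, 0.5, 0.4]

gap_plain_low = drive_gap(low_third, normalise=False)
gap_plain_high = drive_gap(high_third, normalise=False)
gap_norm_low = drive_gap(low_third, normalise=True)
gap_norm_high = drive_gap(high_third, normalise=True)
```

Without normalisation the gap between options 1 and 2 does not depend on the third option, so this static view does not reproduce the small ‘reverse’ effect in panel (b), which the caption attributes to network dynamics. With normalisation the gap shrinks as the third option's value rises, matching the direction of the distractor effect in panel (c).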
Supplementary Figure 4 Within-attribute divisive normalization (with integrated value comparison) accounts for within-attribute distractor effect, but not reaction time data.
Here we consider a feedforward inhibition model where comparison still occurs selectively on integrated values, but where these values are constructed from attributes that have first undergone divisive normalisation. (a) Choice behavior of this model accurately predicts the within-attribute distractor effect. Red lines are steeper than blue lines (i.e. choices between options 1 and 2 are more accurate when 3 is low rather than high) in the integrated value split (top panel), and selectively within-attribute in the stimulus split (middle panel) and action split (bottom panel). (b) However, the reaction times of this model are, unsurprisingly, driven by integrated value difference, as the model’s rate of evidence accumulation is driven by integrated values. This is also true of the other models considered in Supplementary Figure 3. The slightly unusual effect of increased within-attribute value differences causing a slowing of reaction time is likely due to a combined effect of within-attribute divisive normalisation and some (limited) correlation between explanatory variables; this was not present in other models under consideration.
Supplementary Figure 5 Raw choice probabilities (from subject choices) also reveal within-attribute distractor effect.
Probability of choosing option 1 over option 2 is shown as a joint function of pS1-pS2 (stimulus probability difference) and pA1-pA2 (action probability difference). As in main figure 3, option 3 is always an unchosen option. Trials are sorted into the four panels based upon the stimulus or action value of option 3, split into high/low values. Trials are selected such that pS1>pS2 and pA1<pA2 (in cases where pS1>pS2 and pA1>pA2, subjects very rarely choose option 2). Hence, if subjects choose option 1, they choose the option with the higher stimulus value out of 1 and 2. Bin labels are as follows: +++ denotes (pS1-pS2)>0.55; ++ denotes 0.55>(pS1-pS2)>0.25; + denotes 0.25>(pS1-pS2)>0; −−− denotes (pA1-pA2)<−0.55; −− denotes −0.55>(pA1-pA2)>−0.25; − denotes −0.25<(pA1-pA2)<0. Two features can be seen. First, the general slope of the surfaces reflects the main influence of the values of options 1 and 2 on choices – for instance, when pS1-pS2 is large (+++) but pA1-pA2 is small (−), then subjects will generally choose option 1. Second, however, the within-attribute distractor effect can be seen by comparing the shape of each of the four surfaces. Compare, for example, when stimulus evidence (pS1-pS2) is moderately in favour of option 1 (++), but action evidence (pA1-pA2) is equally moderately in favour of option 2 (−−). Subjects become more likely to choose option 2 (i.e. go with best action) if V3r is low than if V3r is high (compare top-left plot with top-right plot). Conversely, subjects become more likely to choose option 1 (i.e. go with best stimulus) if V3s is low than if V3s is high. Other points where evidence in favour of option 1 and option 2 are roughly equal (i.e. (pS1-pS2) ≍ (pA2-pA1)) are denoted by the [+++,−−−] and [+,−] points in each graph. The data from these points and the [++,−−] points are the same data which are collapsed together and plotted in main figure 3d.
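The bin labels defined in this caption can be written as a pair of small functions. This is a sketch only: the function names are hypothetical, ASCII hyphens stand in for the minus signs used in the labels, and the handling of values that fall exactly on a bin edge is an assumption, since the caption gives strict inequalities on both sides.

```python
def stimulus_bin(diff):
    """Label a positive stimulus probability difference pS1 - pS2."""
    if diff > 0.55:
        return "+++"
    if diff > 0.25:
        return "++"
    if diff > 0:
        return "+"
    raise ValueError("selected trials have pS1 > pS2")


def action_bin(diff):
    """Label a negative action probability difference pA1 - pA2."""
    if diff < -0.55:
        return "---"
    if diff < -0.25:
        return "--"
    if diff < 0:
        return "-"
    raise ValueError("selected trials have pA1 < pA2")


# Hypothetical trial: stimulus evidence moderately favours option 1,
# action evidence moderately favours option 2.
labels = (stimulus_bin(0.3), action_bin(-0.3))
```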
Supplementary Figure 6 Activations for the contrast shown in Figure 5, shown at an uncorrected threshold of |Z|>2.3 (p<0.01 uncorrected).
Note that the contrast shown here is flipped in sign compared to that shown in main figure 5; it reflects (pCh relevant – pBUnCh relevant) – (pCh irrelevant – pBUnCh irrelevant). Whilst intraparietal sulcus (IPS) is the only brain region that survives whole-brain correction, bilateral activation can also be seen in the superior frontal sulcus (SFS). An activation for the same contrast, but with the opposing sign to IPS/SFS, is seen in the lingual gyrus (Ling).
Supplementary Figure 7 Choice signals in IPS and dMFC survive inclusion of reaction time as co-regressor.
Analysis of fMRI timeseries from IPS (a) and dMFC (b), as in main figure 5b/7b, with log(reaction time) included as a coregressor in the analysis. In both regions, log(reaction time) had a positive effect on BOLD signal at the time of choice. However, the value effects shown in the main paper remain untouched by the inclusion of log(reaction time) as a coregressor.
Supplementary Figure 8 Learning signals relating to reward in classic reward prediction error-encoding regions, and learning signals relating to attribute in the intraparietal sulcus.
(a) A region in the ventromedial prefrontal cortex, extending into a portion of the nucleus accumbens, was activated by the contrast (reward outcome – reward probability_relevant) (FWE-corrected p<0.05, cluster-forming threshold Z>3.1; peak Z=5.03, MNI=4,18,−6 mm). (b) A timeseries analysis of this region revealed hallmarks of a reward prediction error – an initial coding of the expectation of reward (positive reward probability, on both relevant and irrelevant attributes), and a subsequent encoding of the reward outcome (blue line). As before, cross-validated ROIs were used for timeseries extraction (see methods). (c) A region in the intraparietal sulcus was activated by the contrast (reward probability_relevant – reward probability_irrelevant) (FWE-corrected p<0.05, cluster-forming threshold Z>3.1; peak Z=4.20, MNI=−40,−44,58 mm). (d) Timeseries analysis of the region in the intraparietal sulcus extracted from the decision epoch (see main figure 5a) revealed an attribute learning signal, with a negative correlate of reward outcome, a positive correlate of reward probability_relevant, and a negative correlate of reward probability_irrelevant.
Supplementary Figure 9 Choice signals in ventromedial prefrontal cortex.
In the present study, at the time of choice, VMPFC activity (on trials where stimulus and action attributes conflicted) was found to reflect the chosen reward probability on the relevant attribute (but not the best unchosen reward probability). VMPFC mask was taken from a recent meta-analysis.
Supplementary Figure 10 Collinearity of probabilities used in main experiment with frequency of experienced reward during training.
As mentioned in the methods, pS and pA were scaled from worst to best as (0.1,0.2,0.3,0.4,0.6,0.7,0.8,0.9) in the main experiment, but during training, deterministic feedback was used (to ensure equal learning on chosen and unchosen options, and to facilitate speed of learning). Here we show that the experienced reward probabilities during training, and the probability of receiving reward in the experiment, are essentially collinear (r>0.99). Datapoints show individual stimuli/actions; line shows best-fitting regression line.
Supplementary Figure 11 Contrast matrix applied to parameter estimates for binomial logistic regression model.
The logistic regression model used to analyse subjects’ choices (see methods) had the subject’s choice (i.e. chose option 2 coded as 1, chose option 1 coded as 0) as the dependent variable, and as independent variables contained separate indicator variables for each stimulus and each action for option 1 and 2 (valued 1 when that stimulus/action was present for option 1 or 2, and 0 otherwise). To finesse problems of rank deficiency, we removed the regressor corresponding to the eighth (best) stimulus and action for both options 1 and 2 (that is, we removed columns 8, 16, 24 and 32 from the design matrix), and added a single constant term. The parameter estimates from this model do not directly correspond to weights for each stimulus and action; instead, to recover the effects for each individual stimulus and action on subjects’ choices, we must multiply the vector of parameter estimates by a contrast matrix, C. This consists of 16 rows (one for each stimulus, one for each action) and 29 columns (one for each parameter in the regression model). The formulation of this contrast matrix is shown above; it is an extended form of the contrast matrix used to return mean effects when specifying an ANOVA, see http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FEAT/UserGuide. The weights plotted in main figure 3a-c/4a-c therefore correspond to the mean ± s.e. (across subjects) of Cβ, where β reflects the parameter estimates from the logistic model. See example MATLAB scripts (provided as supplementary material) for implementation.
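The column count of the reduced design matrix can be checked with a short sketch. It builds one row of indicator variables for a single trial, drops columns 8, 16, 24 and 32, and appends the constant term. The function name, the 1-based rank arguments and the ordering of the four blocks of eight columns are assumptions made for illustration; the contrast matrix C is not reconstructed here, and the authors' MATLAB scripts remain the reference implementation.

```python
N_LEVELS = 8  # eight stimuli and eight actions, ranked from worst (1) to best (8)


def design_row(stim1, act1, stim2, act2):
    """Build one design-matrix row from 1-based stimulus/action ranks for options 1 and 2."""
    full = [0] * (4 * N_LEVELS)
    # Assumed block order: stimulus (option 1), action (option 1),
    # stimulus (option 2), action (option 2).
    for block, level in enumerate((stim1, act1, stim2, act2)):
        full[block * N_LEVELS + (level - 1)] = 1
    # Drop the 1-based columns 8, 16, 24 and 32 (the best stimulus/action in each block).
    reduced = [v for i, v in enumerate(full, start=1) if i % N_LEVELS != 0]
    return reduced + [1]  # single constant term


# Hypothetical trial: option 1 pairs stimulus 1 with action 2,
# option 2 pairs stimulus 3 with action 4.
row = design_row(1, 2, 3, 4)
n_columns = len(row)
```

With 32 indicator columns, four removed and one constant added, each row has 29 entries, which matches the 29 columns of C described in the caption. A trial in which every attribute takes its eighth (best) level is represented by the constant term alone.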
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–11, Supplementary Tables 1–3, and Supplementary Note (PDF 1564 kb)
Supplementary Methods Checklist
(PDF 396 kb)
Supplementary Data
Matlab script for behavioural analysis (ZIP 3313 kb)
About this article
Cite this article
Hunt, L., Dolan, R. & Behrens, T. Hierarchical competitions subserving multi-attribute choice. Nat Neurosci 17, 1613–1622 (2014). https://doi.org/10.1038/nn.3836
DOI: https://doi.org/10.1038/nn.3836