The neural mechanisms mediating sensory-guided decision-making have received considerable attention, but animals often pursue behaviors for which there is currently no sensory evidence. Such behaviors are guided by internal representations of choice values that have to be maintained even when these choices are unavailable. We investigated how four macaque monkeys maintained representations of the value of counterfactual choices—choices that could not be taken at the current moment but which could be taken in the future. Using functional magnetic resonance imaging, we found two different patterns of activity co-varying with values of counterfactual choices in a circuit spanning the hippocampus, the anterior lateral prefrontal cortex and the anterior cingulate cortex. Anterior cingulate cortex activity also reflected whether the internal value representations would be translated into actual behavioral change. To establish the causal importance of the anterior cingulate cortex for this translation process, we used a novel technique, transcranial focused ultrasound stimulation, to reversibly disrupt anterior cingulate cortex activity.
The data that support the findings of this study are available from the corresponding author upon reasonable request.
The code to generate the results and the figures of this study is available from the corresponding author upon reasonable request.
Funding for this work was provided by the Wellcome Trust (grant nos. 203139/Z/16/Z, WT100973AIA, 103184/Z/13/Z and 105238/Z/14/Z), the Medical Research Council (grant nos. MR/P024955/1 and G0902373), the Bettencourt Schueller Foundation and the Agence Nationale de la Recherche (grant no. ANR-10-EQPX-15) and Christ Church, University of Oxford. We are very grateful for the care afforded to the animals by the veterinary and technical staff at the University of Oxford. We also thank J. Scholl for helpful comments on the manuscript.
The authors declare no competing interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated supplementary information
a, Bayesian model comparison using BIC scores revealed that the Maintain model explained the data better than a Decay model, in which a free parameter captured how much the animals would ‘forget’ the unavailable option, and better than two risk-sensitive models in which the animals could distort the probabilities. BICs are summed across 20 sessions. b, Distortion parameter for the simple distortion model for each animal and session. The blue line represents the mean distortion parameter across sessions (n = 20). c, To compare the Decay and Maintain models, we simulated choice data using the best-fitting parameters of each model and compared choice rates before and after reversals (around the 110th trial; left panel) as well as switch behaviors after a win or a loss (right panel); n = 20 sessions. The real data are represented in black.
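The comparison in panel a can be illustrated with a minimal sketch, assuming the standard definition BIC = k ln(n) + 2 × (negative log-likelihood) and a simple sum over sessions. The function names, parameter counts and likelihood values below are hypothetical placeholders, not the study's fitted values.

```python
import math

def bic(neg_log_likelihood: float, n_params: int, n_trials: int) -> float:
    """Standard BIC: k * ln(n) + 2 * NLL. Lower values indicate a better model."""
    return n_params * math.log(n_trials) + 2.0 * neg_log_likelihood

def summed_bic(sessions) -> float:
    """Sum BIC over sessions; each session is (neg_log_likelihood, n_params, n_trials)."""
    return sum(bic(nll, k, n) for nll, k, n in sessions)

# Hypothetical fits for two sessions (illustrative numbers only).
# The Decay model is assumed to have one extra free parameter (the 'forgetting' rate).
maintain_fits = [(100.0, 2, 200), (110.0, 2, 200)]
decay_fits = [(99.0, 3, 200), (109.5, 3, 200)]

# Positive difference: the Maintain model has the lower (better) summed BIC.
delta_bic = summed_bic(decay_fits) - summed_bic(maintain_fits)
print(f"summed BIC(Decay) - summed BIC(Maintain) = {delta_bic:.2f}")
```

In this toy case the Decay model fits slightly better in raw likelihood, but the penalty for its extra parameter outweighs that gain, which is the kind of trade-off a summed BIC comparison is meant to capture.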
Whole-brain analysis shows no significant relationship between BOLD activity and the expected value associated with the currently chosen (red color) and unchosen options (blue color). All statistical images are uncorrected at a lenient threshold of P < 0.05, whole-brain fMRI analysis, n = 25 sessions.
a, We contrasted four different accounts of activity found in ACC. b, We found that activity in the ACC, unlike in the hippocampus (Fig. 3), was not significantly modulated by the value of the unavailable option (red trace in leftmost panel). By contrast, activity in the ACC was significantly related to the value of the best counterfactual option (black trace in second panel from the left). Activity in the ACC could not be explained by the hypothesis that only options available for choice (and which might therefore affect the difficulty of choice selection) determined activity levels. Instead, even options unavailable for choice (which could not affect the difficulty of choice selection) affected activity. When options were coded as HV, LV and unavailable options, the LV option did not exert a consistent influence over the ACC (third panel from the left). Activity in the ACC could not be explained in a simple way as a function of the values assigned to particular object identities (rightmost panel). Shaded areas represent s.e.m. across 25 sessions. c, As a result, model comparison unambiguously supported the second hypothesis of ACC function. Note that ROI selection avoids double dipping in favor of the hypothesis we aimed to validate, since the ROIs were defined from Hypothesis 1, which we aimed to reject.
a, Neurostimulation site where TUS was applied for each animal (S1 and S2) for the behavioral part. The TUS transducer was set at a resonance frequency of 250 kHz and concentrated ultrasound in a cigar-shaped focal spot in the ACC. b, ROIs used in the rs-fMRI connectivity analysis (see full details in Supplementary Table 1).
a, While entropy is negatively related to value difference, there was no difference between the ACC-TUS and SHAM conditions (n = 18 sessions). A linear model was used and a one-sample t-test was performed on the resulting coefficients. b, Similarly, the negative relationship between cumulative stay and value difference was not different between the ACC-TUS and SHAM conditions. c, Maintain model’s parameters (left panel). There was no difference in temperature parameter between the fMRI, SHAM TUS and ACC-TUS sessions (right panel). A similar picture emerged for the learning rates. While the mean learning rate in the ACC-TUS session was higher than in the fMRI and SHAM sessions, this result was not significant. A linear regression model was used to determine the difference between conditions. d, Decision accuracy (selecting the option with the highest subjective value) plotted as a function of difficulty (difference in subjective value between the best (HV) and worst (LV) presented options). Subjective values were estimated using the RL model (Methods). Each bin contains data binned according to percentile, with each point corresponding to the [0–5%], [5–10%], [10–15%], …, [90–95%], [95–100%] of the value difference amplitudes. Accuracy is the rate at which the animal picked the subjectively better option. Because subjective value estimates are obtained by fitting a model to the data and reflect the animals’ actual patterns of decisions, the overall difference in performance levels is no longer apparent between the TUS and sham conditions. If the animal is poor at learning or choosing the best option to take, then this pattern of behavior will mean that the RL model will estimate that the animal’s subjective valuations of the choices are closer together than they objectively are. The RL model therefore ‘explains away’ differences in performance as differences in subjective valuations.
However, once again it is clear that there was no evidence of a TUS-induced impairment in performance that increased with difficulty (smaller HV – LV value differences on the left of the figure). Solid lines are linear fits to the data and the shaded area is the 95% confidence interval. Two-sample t-tests were performed between the ACC-TUS and control sessions (n = 18 sessions for each group, non-significant).
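The roles of the learning rate and temperature parameters mentioned in panel c can be sketched as follows. This is a minimal illustration that assumes a delta-rule value update and a softmax choice rule; the exact model is specified in the Methods, which are not reproduced here. The function names, the `decay` argument and the choice to decay toward zero are hypothetical.

```python
import math

def softmax_choice_prob(values, temperature):
    """Softmax probabilities over the options currently available for choice."""
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def update_values(values, chosen, reward, unavailable, learning_rate, decay=0.0):
    """Delta-rule update of the chosen option's value.

    With decay=0.0 the unavailable option's value is left untouched (a
    'Maintain'-style assumption); with decay>0 it shrinks toward zero
    (a hypothetical 'Decay'-style assumption).
    """
    new_values = list(values)
    new_values[chosen] += learning_rate * (reward - new_values[chosen])
    new_values[unavailable] *= (1.0 - decay)
    return new_values

# Three options; option 2 is unavailable on this trial, option 0 is chosen and rewarded.
values = [0.5, 0.5, 0.8]
maintained = update_values(values, chosen=0, reward=1.0, unavailable=2, learning_rate=0.2)
decayed = update_values(values, chosen=0, reward=1.0, unavailable=2, learning_rate=0.2, decay=0.1)

# Choice probabilities between the two available options on the next trial.
probs = softmax_choice_prob(maintained[:2], temperature=0.1)
print(maintained, decayed, probs)
```

A higher temperature flattens the choice probabilities, so noisier behavior can be absorbed either by that parameter or by value estimates that sit closer together, which is the sense in which a fitted model can ‘explain away’ performance differences.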
Supplementary Figure 6 TUS of lateral orbitofrontal cortex (lOFC) did not impair translation of counterfactual choice values into actual behavioral change.
a, Neurostimulation site where TUS was applied for each of the four animals. The TUS transducer was set at a resonance frequency of 250 kHz and concentrated ultrasound in a cigar-shaped focal spot in the lOFC. b, The significant difference between the influence of the better counterfactual option value on future switching behavior (in dark blue, as per Figs. 2f and 5c) was unaltered after lOFC TUS (in red, n = 20 sessions) compared to the control sham session (n = 20 sessions) collected on interleaved days (Cohen’s d = 0.21, t38 = 0.65, P = 0.51). Because of the position of the head post, it was only possible to apply TUS to ACC in two animals. If we focus just on the data collected from the same two individual animals during lOFC TUS and interleaved control sham days (n = 10 and 10 sessions), it is still the case that there is no evidence of a significant effect (Cohen’s d = 0.14, t18 = 0.59, P = 0.56). We can also compare the size of the effects produced by ACC TUS (for the better counterfactual) and the size of the effects produced by lOFC TUS (for the same predictor). This reveals that ACC TUS produced a significantly greater effect than lOFC TUS (Cohen’s d = 0.69, t36 = 2.08, P = 0.04). Moreover, there were no significant baseline differences in behavior in the SHAM testing days in the ACC-TUS and lOFC-TUS testing periods (Cohen’s d = 0.05, t36 = 0.14, P = 0.88; similarly if we only apply the test for the same two animals that had been examined in the ACC-TUS experiment: Cohen’s d = 0.08, t26 = 0.20, P = 0.84). There was also no difference in the effect of the worst counterfactual on switching behaviors (Cohen’s d = 0.27, t36 = 0.82, P = 0.42; similarly if we only test for the two animals that were tested with ACC TUS: Cohen’s d = 0.39, t26 = 1.00, P = 0.32). All analyses are mixed-effect models with sessions and animals as random effects. 
c, Entropy is strongly and negatively predictive of change in exploratory behavior (indexed by cumulative number of ‘stay’ choices—choices of the same option on one trial after another) in the control condition (blue) and this remains the case after lOFC TUS (Cohen’s d = −0.24, t28 = −0.64, P = 0.52; comparison between the lOFC effect against the ACC effect: Cohen’s d = −0.56, t64 = −2.23, P = 0.02).
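For readers who want to relate the effect sizes and t statistics reported in this legend, the following is a simplified sketch assuming two independent groups, a pooled standard deviation and a classical two-sample t-test. The reported analyses are mixed-effect models, so this is only an approximation of the relationship between d and t; the function names and data are hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled (sample) standard deviation of two groups."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def two_sample_t(group_a, group_b):
    """Student's two-sample t statistic and degrees of freedom (equal variances assumed)."""
    n1, n2 = len(group_a), len(group_b)
    t_stat = cohens_d(group_a, group_b) * math.sqrt(n1 * n2 / (n1 + n2))
    return t_stat, n1 + n2 - 2

# Hypothetical per-session regression coefficients for two conditions.
sham = [1.0, 2.0, 3.0, 4.0, 5.0]
tus = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = two_sample_t(sham, tus)
print(f"d = {cohens_d(sham, tus):.3f}, t({dof}) = {t_stat:.3f}")
```

Under these assumptions, t equals d multiplied by the square root of n1 × n2 / (n1 + n2), so two groups of 20 sessions give 38 degrees of freedom and t ≈ 3.16 × d.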
Supplementary Figure 7 Relationship between the unavailable option’s value and the efficiency of choice behavior.
a, Average choice behavior when choosing between the left and right options plotted as a function of the value of the unavailable option (low value, blue; high value, red) for each animal separately (data are averaged across sessions). Curves plot logistic functions fit to the choice data. b, A partial regression plot shows the uncontaminated effect of the unavailable option’s value on accuracy (y axis, accuracy residuals; x axis, residuals of the unavailable option’s value) for each animal during the fMRI sessions after partialling out the difference in value between the better and worse options in any given trial (which should rationally be the main determinant of decision-making). There was a linear relationship between the residual accuracy and the residual value of the unavailable option in three out of four animals (mixed-effect linear models, animal 1: t1160 = 2.28, P = 0.02; animal 2: t1358 = 3.85, P = 0.0001; animal 3: t1329 = 0.30, P = 0.76; animal 4: t950 = 5.85, P = 6.8 × 10⁻⁹). Each bin contains 25% of averaged data across sessions (±s.e.m.). c,d, Extra data from four animals (c) show the same effect as in the main fMRI experiment (d). e, The impact of the unavailable option on the current decision, which seems to be mediated by vmPFC, was not significantly altered by the application of TUS to ACC; there was no difference between the ACC-TUS sessions (n = 18) and the baseline sessions.
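The partial regression logic of panel b can be sketched in a few lines, assuming ordinary least squares with a single control variable (the value difference between the better and worse options). The reported analysis used mixed-effect linear models, so this is a simplified illustration; all names and data are hypothetical.

```python
from statistics import mean

def ols_fit(x, y):
    """Slope and intercept of a simple least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Residuals of y after regressing out x."""
    slope, intercept = ols_fit(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def partial_regression(control, predictor, outcome):
    """Residualize predictor and outcome on the control, then regress residual on residual."""
    predictor_resid = residuals(control, predictor)
    outcome_resid = residuals(control, outcome)
    slope, _ = ols_fit(predictor_resid, outcome_resid)
    return predictor_resid, outcome_resid, slope

# Hypothetical data: outcome depends on both the control and the predictor.
value_difference = [0.0, 1.0, 2.0, 3.0, 4.0]    # control (better minus worse option)
unavailable_value = [1.0, 0.0, 2.0, 1.0, 3.0]   # predictor of interest
accuracy = [2.0 * c + 3.0 * p for c, p in zip(value_difference, unavailable_value)]

x_resid, y_resid, partial_slope = partial_regression(value_difference, unavailable_value, accuracy)
print(f"partial slope = {partial_slope:.3f}")
```

Plotting `y_resid` against `x_resid` gives the partial regression plot; the slope of that residual-on-residual line matches the coefficient the predictor would receive in a multiple regression that also includes the control.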
About this article
Cite this article
Fouragnan, E.F., Chau, B.K.H., Folloni, D. et al. The macaque anterior cingulate cortex translates counterfactual choice value into actual behavioral change. Nat Neurosci 22, 797–808 (2019). https://doi.org/10.1038/s41593-019-0375-6