
Medial prefrontal cortical activity reflects dynamic re-evaluation during voluntary persistence


Deciding how long to keep waiting for future rewards is a nontrivial problem, especially when the timing of rewards is uncertain. We carried out an experiment in which human decision makers waited for rewards in two environments in which reward-timing statistics favored either a greater or lesser degree of behavioral persistence. We found that decision makers adaptively calibrated their level of persistence for each environment. Functional neuroimaging revealed signals that evolved differently during physically identical delays in the two environments, consistent with a dynamic and context-sensitive reappraisal of subjective value. This effect was observed in a region of ventromedial prefrontal cortex that is sensitive to subjective value in other contexts, demonstrating continuity between valuation mechanisms involved in discrete choice and in temporally extended decisions analogous to foraging. Our findings support a model in which voluntary persistence emerges from dynamic cost/benefit evaluation rather than from a control process that overrides valuation mechanisms.


Figure 1: Experimental task and timing conditions.
Figure 2: Behavioral results.
Figure 3: Theoretical subjective value of the awaited token as a function of elapsed time in each environment.
Figure 4: Model-based contrast results.
Figure 5: Model-free analysis of trial onset–locked BOLD time courses.
Figure 6: Regions in which BOLD signal differentiated reward-related and quit-related key presses, assessed on the basis of the event type (reward versus quit) by time point interaction.
Figure 7: Effects of task events on mean cardiac inter-beat interval (IBI; lower values correspond to faster heart rate).


  1. Mischel, W. & Ebbesen, E.B. Attention in delay of gratification. J. Pers. Soc. Psychol. 16, 329–337 (1970).
  2. Baumeister, R.F., Vohs, K.D. & Tice, D.M. The strength model of self-control. Curr. Dir. Psychol. Sci. 16, 351–355 (2007).
  3. Bartra, O., McGuire, J.T. & Kable, J.W. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412–427 (2013).
  4. Clithero, J.A. & Rangel, A. Informatic parcellation of the network involved in the computation of subjective value. Soc. Cogn. Affect. Neurosci. 9, 1289–1302 (2013).
  5. Liu, X., Hairston, J., Schrier, M. & Fan, J. Common and distinct networks underlying reward valence and processing stages: a meta-analysis of functional neuroimaging studies. Neurosci. Biobehav. Rev. 35, 1219–1236 (2011).
  6. Levy, D.J. & Glimcher, P.W. The root of all value: a neural common currency for choice. Curr. Opin. Neurobiol. 22, 1027–1038 (2012).
  7. Kable, J.W. & Glimcher, P.W. The neural correlates of subjective value during intertemporal choice. Nat. Neurosci. 10, 1625–1633 (2007).
  8. Kable, J.W. & Glimcher, P.W. An “as soon as possible” effect in human intertemporal decision making: behavioral evidence and neural mechanisms. J. Neurophysiol. 103, 2513–2531 (2010).
  9. Hare, T.A., Camerer, C.F. & Rangel, A. Self-control in decision-making involves modulation of the vmPFC valuation system. Science 324, 646–648 (2009).
  10. Mischel, W., Ayduk, O. & Mendoza-Denton, R. Sustaining delay of gratification over time: a hot-cool systems perspective. in Time and Decision: Economic and Psychological Perspectives on Intertemporal Choice (eds. Loewenstein, G., Read, D. & Baumeister, R.F.) 175–200 (Russell Sage Foundation, New York, 2003).
  11. Metcalfe, J. & Mischel, W. A hot/cool-system analysis of delay of gratification: dynamics of willpower. Psychol. Rev. 106, 3–19 (1999).
  12. McGuire, J.T. & Kable, J.W. Rational temporal predictions can underlie apparent failures to delay gratification. Psychol. Rev. 120, 395–410 (2013).
  13. McGuire, J.T. & Kable, J.W. Decision makers calibrate behavioral persistence on the basis of time-interval experience. Cognition 124, 216–226 (2012).
  14. Rachlin, H. The Science of Self Control (Harvard University Press, 2000).
  15. Dasgupta, P. & Maskin, E. Uncertainty and hyperbolic discounting. Am. Econ. Rev. 95, 1290–1299 (2005).
  16. Kim, H., Shimojo, S. & O'Doherty, J.P. Overlapping responses for the expectation of juice and money rewards in human ventromedial prefrontal cortex. Cereb. Cortex 21, 769–776 (2011).
  17. Hare, T.A., Malmaud, J. & Rangel, A. Focusing attention on the health aspects of foods changes value signals in vmPFC and improves dietary choice. J. Neurosci. 31, 11077–11087 (2011).
  18. Hampton, A.N., Bossaerts, P. & O'Doherty, J.P. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J. Neurosci. 26, 8360–8367 (2006).
  19. Daw, N.D., Gershman, S.J., Seymour, B., Dayan, P. & Dolan, R.J. Model-based influences on humans' choices and striatal prediction errors. Neuron 69, 1204–1215 (2011).
  20. Hutcherson, C.A., Plassmann, H., Gross, J.J. & Rangel, A. Cognitive regulation during decision making shifts behavioral control between ventromedial and dorsolateral prefrontal value systems. J. Neurosci. 32, 13543–13554 (2012).
  21. Casey, B.J. et al. Behavioral and neural correlates of delay of gratification 40 years later. Proc. Natl. Acad. Sci. USA 108, 14998–15003 (2011).
  22. Figner, B. et al. Lateral prefrontal cortex and self-control in intertemporal choice. Nat. Neurosci. 13, 538–539 (2010).
  23. Heatherton, T.F. & Wagner, D.D. Cognitive neuroscience of self-regulation failure. Trends Cogn. Sci. 15, 132–139 (2011).
  24. Charnov, E.L. Optimal foraging, the marginal value theorem. Theor. Popul. Biol. 9, 129–136 (1976).
  25. McNamara, J. Optimal patch use in a stochastic environment. Theor. Popul. Biol. 21, 269–288 (1982).
  26. Hayden, B.Y., Pearson, J.M. & Platt, M.L. Neuronal basis of sequential foraging decisions in a patchy environment. Nat. Neurosci. 14, 933–939 (2011).
  27. Fawcett, T.W., McNamara, J.M. & Houston, A.I. When is it adaptive to be patient? A general framework for evaluating delayed rewards. Behav. Processes 89, 128–136 (2012).
  28. Nickerson, R.S. Response time to the second of two successive signals as a function of absolute and relative duration of intersignal interval. Percept. Mot. Skills 21, 3–10 (1965).
  29. Griffiths, T.L. & Tenenbaum, J.B. Optimal predictions in everyday cognition. Psychol. Sci. 17, 767–773 (2006).
  30. Montague, P.R., Dayan, P. & Sejnowski, T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947 (1996).
  31. Fiorillo, C.D., Newsome, W.T. & Schultz, W. The temporal precision of reward prediction in dopamine neurons. Nat. Neurosci. 11, 966–973 (2008).
  32. Cui, X., Stetson, C., Montague, P.R. & Eagleman, D.M. Ready…go: amplitude of the fMRI signal encodes expectation of cue arrival time. PLoS Biol. 7, e1000167 (2009).
  33. Bueti, D., Bahrami, B., Walsh, V. & Rees, G. Encoding of temporal probabilities in the human brain. J. Neurosci. 30, 4343–4352 (2010).
  34. Tootell, R.B.H. et al. The retinotopy of visual spatial attention. Neuron 21, 1409–1422 (1998).
  35. Lacey, J.I. & Lacey, B.C. Some autonomic-central nervous system interrelationships. in Physiological Correlates of Emotion (ed. Black, P.) 205–227 (Academic Press, New York, 1970).
  36. Schweighofer, N. et al. Humans can adopt optimal discounting strategy under real-time constraints. PLoS Comput. Biol. 2, e152 (2006).
  37. Loewenstein, G. Anticipation and the valuation of delayed consumption. Econ. J. 97, 666–684 (1987).
  38. Duckworth, A.L., Gendler, T.S. & Gross, J.J. Self-control in school-age children. Educ. Psychol. 49, 199–217 (2014).
  39. Jimura, K., Chushak, M.S. & Braver, T.S. Impulsivity and self-control during intertemporal decision making linked to the neural dynamics of reward value representation. J. Neurosci. 33, 344–357 (2013).
  40. Helfinstein, S.M. et al. Predicting risky choices from brain activity patterns. Proc. Natl. Acad. Sci. USA 111, 2470–2475 (2014).
  41. Rushworth, M.F.S., Kolling, N., Sallet, J. & Mars, R.B. Valuation and decision-making in frontal cortex: one or many serial or parallel systems? Curr. Opin. Neurobiol. 22, 946–955 (2012).
  42. Shenhav, A., Straccia, M.A., Cohen, J.D. & Botvinick, M.M. Anterior cingulate engagement in a foraging context reflects choice difficulty, not foraging value. Nat. Neurosci. 17, 1249–1254 (2014).
  43. Blanchard, T.C. & Hayden, B.Y. Neurons in dorsal anterior cingulate cortex signal postdecisional variables in a foraging task. J. Neurosci. 34, 646–655 (2014).
  44. McClure, S.M., Berns, G.S. & Montague, P.R. Temporal prediction errors in a passive learning task activate human striatum. Neuron 38, 339–346 (2003).
  45. Hollerman, J.R. & Schultz, W. Dopamine neurons report an error in the temporal prediction of reward during learning. Nat. Neurosci. 1, 304–309 (1998).
  46. Berns, G.S., McClure, S.M., Pagnoni, G. & Montague, P.R. Predictability modulates human brain response to reward. J. Neurosci. 21, 2793–2798 (2001).
  47. Hare, T.A., O'Doherty, J., Camerer, C.F., Schultz, W. & Rangel, A. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 28, 5623–5630 (2008).
  48. Howe, M.W., Tierney, P.L., Sandberg, S.G., Phillips, P.E.M. & Graybiel, A.M. Prolonged dopamine signaling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013).
  49. Miyazaki, K.W. et al. Optogenetic activation of dorsal raphe serotonin neurons enhances patience for future rewards. Curr. Biol. 24, 2033–2040 (2014).
  50. Janssen, P. & Shadlen, M.N. A representation of the hazard rate of elapsed time in macaque area LIP. Nat. Neurosci. 8, 234–241 (2005).
  51. Brainard, D.H. The Psychophysics Toolbox. Spat. Vis. 10, 433–436 (1997).
  52. Pelli, D.G. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat. Vis. 10, 437–442 (1997).
  53. Luhmann, C.C., Chun, M.M., Yi, D.-J., Lee, D. & Wang, X.-J. Neural dissociation of delay and uncertainty in intertemporal choice. J. Neurosci. 28, 14459–14466 (2008).
  54. Ungemach, C., Chater, N. & Stewart, N. Are probabilities overweighted or underweighted when rare outcomes are experienced (rarely)? Psychol. Sci. 20, 473–479 (2009).
  55. Fitzgerald, T.H., Seymour, B., Bach, D.R. & Dolan, R.J. Differentiable neural substrates for learned and described value and risk. Curr. Biol. 20, 1823–1829 (2010).
  56. Hertwig, R., Barron, G., Weber, E.U. & Erev, I. Decisions from experience and the effect of rare events in risky choice. Psychol. Sci. 15, 534–539 (2004).
  57. Kaplan, E.L. & Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958).
  58. Gibbon, J. Scalar expectancy theory and Weber's law in animal timing. Psychol. Rev. 84, 279–325 (1977).
  59. Rakitin, B.C. et al. Scalar expectancy theory and peak-interval timing in humans. J. Exp. Psychol. Anim. Behav. Process. 24, 15–33 (1998).
  60. Jenkinson, M. & Smith, S. A global optimisation method for robust affine registration of brain images. Med. Image Anal. 5, 143–156 (2001).
  61. Smith, S.M. et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 23, S208–S219 (2004).
  62. Jenkinson, M., Beckmann, C.F., Behrens, T.E.J., Woolrich, M.W. & Smith, S.M. FSL. Neuroimage 62, 782–790 (2012).
  63. Jenkinson, M., Bannister, P., Brady, M. & Smith, S. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage 17, 825–841 (2002).
  64. Cox, R.W. AFNI: what a long strange trip it's been. Neuroimage 62, 743–747 (2012).
  65. Cox, R.W. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. 29, 162–173 (1996).
  66. Greve, D.N. & Fischl, B. Accurate and robust brain image alignment using boundary-based registration. Neuroimage 48, 63–72 (2009).
  67. Nichols, T.E. & Holmes, A.P. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum. Brain Mapp. 15, 1–25 (2002).



This research was supported by NIH grants DA030870 to J.T.M. and DA029149 to J.W.K.

Author information


J.T.M. and J.W.K. designed the experiment, developed the analysis procedures, and wrote the paper. J.T.M. collected and analyzed the data and developed the theoretical model.

Corresponding authors

Correspondence to Joseph T McGuire or Joseph W Kable.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Behavior over time across training and fMRI sessions.

A secondary behavioral analysis assessed the trajectory of persistence behavior over the course of the experiment in each condition. We estimated the timecourse of willingness to wait (WTW) in each condition across preliminary training, pre-scan practice, and fMRI runs (plot shows mean ± s.e.m.). Within each pair of adjacent runs (e.g., training runs 1–2), participants differed in whether the HP or LP condition was presented first. Cumulative minutes 0–20 are the preliminary training session, minutes 20–45 are from the day of scanning, and minutes 25–45 represent data collected in the scanner (the data used in all other analyses in the paper).

WTW timecourses were estimated using a nonparametric procedure described previously (McGuire & Kable, 2012). WTW at each point in the experiment was estimated as the longest time waited since the last quit trial. The estimate is necessarily only an approximation; we lack full moment-by-moment information about WTW because reward delivery events censor our observation of participants’ waiting times. This means there can be a lag before increases in WTW are reflected in the estimated timecourse (in particular, the gradual rise at the beginning of individual runs may be an artifact of the estimator).
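The estimator described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' published code; the function name and the (outcome, seconds-waited) input format are assumptions.

```python
def estimate_wtw(trials):
    """Nonparametric willingness-to-wait (WTW) estimate per trial.

    `trials` is a chronological list of (outcome, seconds_waited) pairs,
    where outcome is 'reward' or 'quit'. WTW at each point is the longest
    time waited since the last quit trial: a quit reveals the participant's
    actual limit, whereas a rewarded trial gives only a lower bound,
    because reward delivery censors the observation of waiting time.
    """
    wtw = []
    current = 0.0
    for outcome, waited in trials:
        if outcome == 'quit':
            # Quitting at time t demonstrates unwillingness to wait past t.
            current = waited
        else:
            # A rewarded trial can only raise the estimate (censored bound).
            current = max(current, waited)
        wtw.append(current)
    return wtw
```

The censoring noted in the text is visible here: after a long rewarded trial, the estimate cannot fall until the next quit is observed, which is why increases in true WTW may appear with a lag.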

Mean behavior was stable during scanning. We estimated subject-wise linear slopes in each condition across the fMRI runs (25–45 min), and neither differed significantly from zero (HP: median slope = 0.00, IQR −0.22 to 0.13, signed-rank p = 0.970; LP: median slope = −0.07, IQR −0.62 to 0.19, signed-rank p = 0.391). We also estimated slopes for each run and condition individually (10 tests); there was a significant negative slope in the first training run of the LP condition (signed-rank p < 0.001) but not in the other 9 run × condition combinations (0.06 < p < 0.97).

We further confirmed the stability of behavior within the fMRI experiment by calculating AUC for each run individually (cf. Fig. 2b). We observed strong Spearman correlations between an individual's two HP runs (ρ = 0.81, n = 20, p < 0.001) and between the two LP runs (ρ = 0.80, n = 20, p < 0.001).

The plot suggests a possible discontinuity between the training session and the fMRI session (at the 20-min point). One possible interpretation is that participants adopted a strategy of exploratory information-gathering during training, when tokens were less valuable, and shifted to a more exploitive strategy when token values increased from 10¢ to 30¢ on the day of the fMRI session. A second possibility is that it took several exposures for participants to learn to disambiguate the two environments.

Supplementary Figure 2 Individual terms in the theoretical model.

Our theoretical model computes an awaited reward’s subjective value (SV) as the sum of two terms: expected reward value (EV) and expected time cost (Panels A–C; see Methods, Eq. 2). The EV term factors in the subjective probability of obtaining the reward on the current trial (rather than quitting). The time cost term factors in the number of additional seconds the agent expects to spend waiting on the current trial (before either giving up or receiving the reward). Both terms vary as a function of elapsed delay time, and depend on the agent's temporal expectations and intended giving-up time. The values plotted in Panels A–C assume the ideal giving-up time of 20s in the LP condition. This ideal strategy can be identified by maximizing total SV (Methods, Eq. 3), but cannot be identified by maximizing the terms in B or C individually. (A strategy of always waiting would maximize EV, whereas a strategy of never waiting would minimize the time cost.)
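The structure of the model described above, with SV decomposed into an expected-reward term and a (negative) expected-time-cost term conditioned on elapsed time, can be illustrated schematically. This is not the authors' exact parameterization (see Methods, Eq. 2): the discrete delay distribution, the reward magnitude, and the opportunity-cost rate below are all hypothetical values chosen only to make the example concrete.

```python
import numpy as np

def subjective_value(t, T, delays, probs, reward=30.0, cost_rate=0.5):
    """Schematic SV of continuing to wait at elapsed time t, given an
    intended giving-up time T and a discrete scheduled-delay distribution
    (delays, probs). SV(t) = EV(t) + time cost(t), where the time-cost
    term is negative (cost_rate is the opportunity cost per second).
    """
    # Condition on the reward not yet having arrived (delay > t).
    alive = delays > t
    if not alive.any():
        return 0.0
    p = probs[alive] / probs[alive].sum()
    d = delays[alive]
    # EV term: probability the reward arrives before the giving-up time.
    ev = reward * p[d <= T].sum()
    # Expected additional seconds spent waiting on this trial
    # (until either reward arrival or giving up at T).
    extra_wait = (np.minimum(d, T) - t) @ p
    return ev - cost_rate * extra_wait
```

Consistent with the text, always waiting (large T) maximizes the EV term alone, and never waiting (T = 0) minimizes the time-cost term alone; an intermediate giving-up time is identified only by maximizing the full objective.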

In our paradigm, the different SV trajectories between the HP and LP environments were mainly driven by the EV term. As time passed in the LP environment, it became more likely that the scheduled delay would exceed the agent’s giving-up time and the reward would not be obtained. The dynamic estimate of time cost alone did not follow markedly different trajectories in the two environments.

Accordingly, the SV-related brain responses that we observed in VMPFC could be alternatively described as encoding an EV estimate that evolved dynamically on the basis of temporal expectations. To verify this, we extracted trial-onset-locked timecourses for HRF-convolved EV using the same methods as our main analysis (Panels D–E; see Methods; equivalent results for time cost are shown for reference in Panel F). Although the SV and EV timecourses look dissimilar, the predicted difference between the HP and LP conditions was nearly identical between SV and EV (Panels G–H; median r² = 0.85, IQR 0.83 to 0.88; time cost results shown for reference in Panel I). A whole-brain analysis of EV effects identified a significant VMPFC cluster similar to the SV-related cluster shown in Fig. 4a (257 voxels, corrected p = 0.020).
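Generating an HRF-convolved model regressor of the kind used above typically amounts to convolving the model-derived value timecourse with a canonical hemodynamic response function. The sketch below uses the widely adopted double-gamma form with common default parameters; this is an assumption for illustration, and the study's exact HRF is specified in its Methods.

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(dt=0.1, duration=30.0):
    """Canonical double-gamma HRF sampled every dt seconds: a positive
    response peaking near 5-6 s minus a smaller, later undershoot."""
    t = np.arange(0.0, duration, dt)
    peak = gamma.pdf(t, 6)            # main response
    undershoot = gamma.pdf(t, 16) / 6.0
    h = peak - undershoot
    return h / h.sum()                # unit-sum normalization

def hrf_convolve(signal, dt=0.1):
    """Convolve a model-derived value timecourse with the HRF, keeping
    only the first len(signal) samples (a causal filter: the predicted
    BOLD response lags the underlying value signal)."""
    return np.convolve(signal, double_gamma_hrf(dt))[: len(signal)]
```

A regressor built this way can then be resampled to the scanner's repetition time and entered into the GLM alongside nuisance terms.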

Therefore, although the responsiveness of VMPFC to delay-related costs is well established in other tasks, it remains for future work to establish definitively whether VMPFC encodes time costs per se in the willingness-to-wait paradigm. However, an effect either of total SV or of EV alone would support our main conclusion that VMPFC encodes a dynamic and context-dependent value signal in a foraging-like decision context.

Supplementary Figure 3 Amount of data at various delays.

Amount of data available at various lags from trial onset. A: Median number of trials available per subject for each timepoint (with IQR). Dashed line marks the 30s window analyzed. B: Number of subjects with data in both environments for each timepoint.

Supplementary Figure 4 Absence of BOLD effects varying inversely with subjective value.

Although our behavioral results demonstrated a direct relationship between persistence and the subjective value of the awaited reward, alternative theoretical frameworks might posit that persistence depends on control processes whose engagement varies inversely with value. For example, it might be necessary to engage control processes in order to sustain persistence when an awaited reward’s value is in doubt.

Our two-tailed model-based contrast (Fig. 4) could in principle have detected BOLD effects negatively related to subjective value, but no such effects were found. The strongest sub-threshold negative cluster was in left superior occipital cortex (37 voxels; corrected p = 0.226).

It is possible that such an analysis would be more sensitive if it were restricted to timepoints when participants went on to continue waiting. For example, control processes might have become more strongly engaged as the awaited reward's subjective value decreased, provided that persistence was indeed sustained further. To investigate this possibility we conducted a secondary analysis in which BOLD timecourses were estimated only from data that preceded the end of a trial by a margin of 5 s or more (in contrast to the margin of 1 s used in our main analyses). For example, the timepoint coefficient at 30 s was estimated only from trials that lasted at least 35 s. If control processes were engaged to sustain persistence as subjective value decreased, then cognitive-control-related activations might have emerged as negative BOLD effects of subjective value in this analysis.

A disadvantage of this analysis strategy was that it severely curtailed the amount of available data. The number of available trials at later timepoints was reduced by more than 30% for the median participant (compare Panel A to Supplementary Fig. 3a), and some participants were eliminated entirely (compare Panel B to Supplementary Fig. 3b). Accordingly, neither positive nor negative effects were significant when timecourses estimated in this manner were submitted to a two-tailed model-based contrast (cf. Fig. 4). As in our primary analysis, the strongest sub-threshold negative effect was in a left superior occipital cluster (27 voxels, corrected p = 0.29).

The paucity of data for this analysis was a direct consequence of participants’ overall pattern of successful, value-sensitive behavioral calibration. Participants tended not to persist very long in the absence of a valuable future prospect. An internal control process that enforced persistence in such circumstances would not have been beneficial, at least in the present experimental task.

Supplementary Figure 5 Expectancy effects on reward-related brain response.

Occipitoparietal cluster in which the amplitude of the reward-related brain response was positively modulated by the duration of the preceding delay in the HP condition (398 voxels; local peaks in left [-9,-81,12] and right [12,-78,9] calcarine sulci and posterior parietal cortex [12,-78,45]). This region showed a higher-amplitude response to rewards that arrived after longer delays. Rewards at longer delays theoretically involved higher levels of expectancy (Fig. 2c), and were also associated with faster reaction times (Fig. 2d) and larger changes in heart rate (Fig. 7b). The analysis did not find any regions in which reward-related BOLD amplitude decreased as a function of expectancy, a pattern that would be characteristic of a reward prediction error signal.

Supplementary Figure 6 Theoretical subjective value using subject-specific rates of reward.

We recalculated the predictions of our theoretical model using subject-specific empirical estimates of the richness of the environment. Environmental richness is used in our model to define the opportunity cost of time (i.e., the gains one might expect to attain by quitting, akin to abandoning a food patch to forage elsewhere). Because participants’ behavior fell short of optimality, it is reasonable to suppose they had a lower-than-optimal estimate of the richness of the environment.

Actual rates of reward were calculated from the fMRI sessions for each subject in each condition. Median reward rate was 1.10¢/s in the HP environment (range 0.80 to 1.20; optimal=1.22) and 0.63¢/s in the LP environment (range 0.40 to 0.73; optimal=0.82). We calculated performance-based theoretical subjective value trajectories for each subject and condition by using these observed reward rates to define the opportunity cost of time.
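The empirical reward rate that defines the opportunity cost of time in this analysis is a simple ratio of earnings to elapsed time. A minimal sketch, assuming per-trial records of cents earned and seconds elapsed (the input format and function name are hypothetical):

```python
def reward_rate(trials):
    """Empirical reward rate (cents per second) for one subject in one
    condition. `trials` is a list of (cents_earned, trial_seconds) pairs,
    with trial_seconds covering the full time attributed to the trial.
    The resulting rate is what the agent forgoes per second of waiting,
    i.e., the opportunity cost of time in the model."""
    earned = sum(cents for cents, _ in trials)
    elapsed = sum(seconds for _, seconds in trials)
    return earned / elapsed
```

A lower-than-optimal rate computed this way makes each second of waiting cheaper in the model, which is why the performance-based SV trajectories described below sit above the optimal ones early in the delay.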

Plotted is the median performance-based subjective value trajectory in each condition (with IQR; dotted lines represent optimal trajectories from Fig. 3a). Incorporating the lower-than-optimal empirical reward rates tended to increase the subjective value of waiting, especially early in the delay, because delay time was treated as incurring a smaller opportunity cost. Individual subjects' performance-based subjective value trajectories across 0–30 s were nonetheless highly correlated with the optimal trajectories (HP: median r² = 1.00, IQR 1.00 to 1.00; LP: median r² = 0.91, IQR 0.84 to 0.92).

The modified procedure for calculating subjective value did not change the results of our fMRI analyses. For each subject we generated synthetic BOLD timecourses encoding performance-based subjective value and passed these through our fMRI analysis to obtain predicted trial-onset-locked BOLD trajectories (analogous to Fig. 3d; see Methods for details). The resulting performance-based difference timecourses (HP minus LP) were highly correlated with the original difference timecourses (median r² = 1.00, IQR 0.99 to 1.00), and using this version of the model-derived regressor yielded the same pattern of whole-brain and ROI results described in the main text.

Using performance-based subjective value also did not alter the results for the stochastic behavioral choice model (Fig. 3b). The two variants of the model yielded equivalent fits to the data (difference of model deviances: median=-3.50, IQR -8.22 to 10.72, signed-rank p=0.852).



About this article


Cite this article

McGuire, J.T. & Kable, J.W. Medial prefrontal cortical activity reflects dynamic re-evaluation during voluntary persistence. Nat. Neurosci. 18, 760–766 (2015).

