Pain is a primary driver of learning and motivated action. It is also a target of learning, as nociceptive brain responses are shaped by learning processes. We combined an instrumental pain avoidance task with an axiomatic approach to assessing fMRI signals related to prediction errors (PEs), which drive reinforcement-based learning. We found that pain PEs were encoded in the periaqueductal gray (PAG), a structure important for pain control and learning in animal models. Axiomatic tests combined with dynamic causal modeling suggested that ventromedial prefrontal cortex, supported by putamen, provides an expected value–related input to the PAG, which then conveys PE signals to prefrontal regions important for behavioral regulation, including orbitofrontal, anterior mid-cingulate and dorsomedial prefrontal cortices. Thus, pain-related learning involves distinct neural circuitry, with implications for behavior and pain dynamics.
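As a minimal illustration of how a prediction error can drive reinforcement-based learning, the sketch below applies a simple delta-rule update to an expected probability of pain. The abstract does not specify the learning model, so the update rule, the function name `update_expected_pain`, the learning rate, and the outcome sequence are all assumptions made for illustration.

```python
def update_expected_pain(expected, outcome, learning_rate=0.3):
    """One delta-rule step (assumed model, not taken from the article).

    expected: current expected probability of pain (0..1)
    outcome:  1 if pain was delivered, 0 otherwise
    Returns (prediction_error, updated_expectation).
    """
    # Aversive PE is positive when the outcome is worse than expected.
    prediction_error = outcome - expected
    return prediction_error, expected + learning_rate * prediction_error


# Hypothetical outcome sequence: 1 = pain, 0 = no pain.
expected = 0.5
history = []
for outcome in [1, 1, 0, 1]:
    pe, expected = update_expected_pain(expected, outcome)
    history.append((pe, expected))
```

Each step moves the expectation a fraction of the way toward the observed outcome, so repeated unexpected pain raises the expectation and shrinks later prediction errors.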
McNally, G.P., Johansen, J.P. & Blair, H.T. Placing prediction into the fear circuit. Trends Neurosci. 34, 283–292 (2011).
Seymour, B. et al. Temporal difference models describe higher-order learning in humans. Nature 429, 664–667 (2004).
Hollerman, J.R. & Schultz, W. Dopamine neurons report an error in the temporal prediction of reward during learning. Nat. Neurosci. 1, 304–309 (1998).
O'Doherty, J.P., Hampton, A. & Kim, H. Model-based fMRI and its application to reward learning and decision making. Ann. NY Acad. Sci. 1104, 35–53 (2007).
Daw, N.D. in Decision Making, Affect and Learning. (eds. Delgado, M.R., Phelps, E.A. & Robbins, T.W.) 3–38 (Oxford Univ. Press, 2011).
Behrens, T.E.J., Hunt, L.T., Woolrich, M.W. & Rushworth, M.F.S. Associative learning of social value. Nature 456, 245–249 (2008).
Li, J. & Daw, N.D. Signals in human striatum are appropriate for policy update rather than value prediction. J. Neurosci. 31, 5504–5511 (2011).
Niv, Y., Edlund, J.A., Dayan, P. & O'Doherty, J.P. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J. Neurosci. 32, 551–562 (2012).
Rutledge, R.B., Dean, M., Caplin, A. & Glimcher, P.W. Testing the reward prediction error hypothesis with an axiomatic model. J. Neurosci. 30, 13525–13536 (2010).
Seymour, B. et al. Opponent appetitive-aversive neural processes underlie predictive learning of pain relief. Nat. Neurosci. 8, 1234–1240 (2005).
Seymour, B., Daw, N.D., Roiser, J.P., Dayan, P. & Dolan, R. Serotonin selectively modulates reward value in human decision-making. J. Neurosci. 32, 5833–5842 (2012).
Yacubian, J. et al. Dissociable systems for gain- and loss-related value predictions and errors of prediction in the human brain. J. Neurosci. 26, 9530–9537 (2006).
Ploghaus, A. et al. Learning about pain: the neural substrate of the prediction error for aversive events. Proc. Natl. Acad. Sci. USA 97, 9281–9286 (2000).
Li, J., Schiller, D., Schoenbaum, G., Phelps, E.A. & Daw, N.D. Differential roles of human striatum and amygdala in associative learning. Nat. Neurosci. 14, 1250–1252 (2011).
Schiller, D., Levy, I., Niv, Y., LeDoux, J.E. & Phelps, E.A. From fear to safety and back: reversal of fear in the human brain. J. Neurosci. 28, 11517–11525 (2008).
Delgado, M.R., Li, J., Schiller, D. & Phelps, E.A. The role of the striatum in aversive learning and aversive prediction errors. Phil. Trans. R. Soc. Lond. B 363, 3787–3800 (2008).
Hindi Attar, C., Finckh, B. & Büchel, C. The influence of serotonin on fear learning. PLoS ONE 7, e42397 (2012).
Johansen, J.P., Tarpley, J.W., LeDoux, J.E. & Blair, H.T. Neural substrates for expectation-modulated fear learning in the amygdala and periaqueductal gray. Nat. Neurosci. 13, 979–986 (2010).
Stephan, K.E. et al. Dynamic causal models of neural system dynamics: current state and future extensions. J. Biosci. 32, 129–144 (2007).
Schönberg, T., Daw, N.D., Joel, D. & O'Doherty, J.P. Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. J. Neurosci. 27, 12860–12867 (2007).
Gallistel, C.R. The importance of proving the null. Psychol. Rev. 116, 439–453 (2009).
Garrison, J., Erdeniz, B. & Done, J. Prediction error in reinforcement learning: a meta-analysis of neuroimaging studies. Neurosci. Biobehav. Rev. 37, 1297–1310 (2013).
Wimmer, G.E., Daw, N.D. & Shohamy, D. Generalization of value in reinforcement learning by humans. Eur. J. Neurosci. 35, 1092–1104 (2012).
Satpute, A.B. et al. Identification of discrete functional subregions of the human periaqueductal gray. Proc. Natl. Acad. Sci. USA 110, 17101–17106 (2013).
Beissner, F. & Baudrexel, S. Investigating the human brainstem with structural and functional MRI. Front. Hum. Neurosci. 8, 116 (2014).
Keay, K.A. & Bandler, R. Parallel circuits mediating distinct emotional coping reactions to different types of stress. Neurosci. Biobehav. Rev. 25, 669–678 (2001).
Schmidt, L., Lebreton, M., Cléry-Melin, M.-L., Daunizeau, J. & Pessiglione, M. Neural mechanisms underlying motivation of mental versus physical effort. PLoS Biol. 10, e1001266 (2012).
Millan, M.J. The induction of pain: an integrative review. Prog. Neurobiol. 57, 1–164 (1999).
Brooks, A.M. & Berns, G.S. Aversive stimuli and loss in the mesocorticolimbic dopamine system. Trends Cogn. Sci. 17, 281–286 (2013).
Price, J.L. Definition of the orbital cortex in relation to specific connections with limbic and visceral structures and other cortical regions. Ann. NY Acad. Sci. 1121, 54–71 (2007).
Rangel, A. & Hare, T. Neural computations associated with goal-directed choice. Curr. Opin. Neurobiol. 20, 262–270 (2010).
Herrero, M.T., Insausti, R. & Gonzalo, L.M. Cortically projecting cells in the periaqueductal gray matter of the rat. A retrograde fluorescent tracer study. Brain Res. 543, 201–212 (1991).
Shackman, A.J. et al. The integration of negative affect, pain and cognitive control in the cingulate cortex. Nat. Rev. Neurosci. 12, 154–167 (2011).
Krasne, F.B., Fanselow, M.S. & Zelikowsky, M. Design of a neurally plausible model of fear learning. Front. Behav. Neurosci. 5, 41 (2011).
Reynolds, S.M. & Berridge, K.C. Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens. Nat. Neurosci. 11, 423–425 (2008).
Tom, S.M., Fox, C.R., Trepel, C. & Poldrack, R.A. The neural basis of loss aversion in decision-making under risk. Science 315, 515–518 (2007).
Kim, H., Shimojo, S. & O'Doherty, J.P. Is avoiding an aversive outcome rewarding? Neural substrates of avoidance learning in the human brain. PLoS Biol. 4, e233 (2006).
Boll, S., Gamer, M., Gluth, S., Finsterbusch, J. & Büchel, C. Separate amygdala subregions signal surprise and predictiveness during associative fear learning in humans. Eur. J. Neurosci. 37, 758–767 (2013).
Paton, J.J., Belova, M.A., Morrison, S.E. & Salzman, C.D. The primate amygdala represents the positive and negative value of visual stimuli during learning. Nature 439, 865–870 (2006).
Linnman, C., Moulton, E.A., Barmettler, G., Becerra, L. & Borsook, D. Neuroimaging of the periaqueductal gray: state of the field. Neuroimage 60, 505–522 (2012).
Buhle, J.T. et al. Cognitive reappraisal of emotion: a meta-analysis of human neuroimaging studies. Cereb. Cortex 10.1093/cercor/bht154 (2013).
Buhle, J.T. et al. Common representation of pain and negative emotion in the midbrain periaqueductal gray. Soc. Cogn. Affect. Neurosci. 8, 609–616 (2013).
Wager, T.D. et al. Brain mediators of cardiovascular responses to social threat, part II: Prefrontal-subcortical pathways and relationship with anxiety. Neuroimage 47, 836–851 (2009).
Bartra, O., McGuire, J.T. & Kable, J.W. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412–427 (2013).
Roy, M., Shohamy, D. & Wager, T.D. Ventromedial prefrontal-subcortical systems and the generation of affective meaning. Trends Cogn. Sci. 16, 147–156 (2012).
Chib, V.S., Rangel, A., Shimojo, S. & O'Doherty, J.P. Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J. Neurosci. 29, 12315–12320 (2009).
Milad, M.R. et al. Recall of fear extinction in humans activates the ventromedial prefrontal cortex and hippocampus in concert. Biol. Psychiatry 62, 446–454 (2007).
Wunderlich, K., Dayan, P. & Dolan, R.J. Mapping value based planning and extensively trained choice in the human brain. Nat. Neurosci. 15, 786–791 (2012).
Glover, G.H. & Law, C.S. Spiral-in/out BOLD fMRI for increased SNR and reduced susceptibility artifacts. Magn. Reson. Med. 46, 515–522 (2001).
Caplin, A. & Dean, M. Axiomatic methods, dopamine and reward prediction error. Curr. Opin. Neurobiol. 18, 197–202 (2008).
Jepma, M., Jones, M. & Wager, T.D. The dynamics of pain: evidence for simultaneous site-specific habituation and site-nonspecific sensitization in thermal pain. J. Pain 15, 734–746 (2014).
We thank D. Abraham and A. Pingree for help with data collection and A. Krishnan, L. Schmidt and L. Atlas for help with data analyses. This work was supported by grants from the US National Institutes of Health to T.D.W. (R01DA035484 and R01MH076136) and by Canadian Institutes of Health Research and Fonds de Recherche en Santé du Québec fellowships to M.R. N.D. was supported by a Scholar Award from the James S. McDonnell Foundation.
The authors declare no competing financial interests.
Integrated supplementary information
Mean online pain ratings obtained during the behavioral session superimposed on the temporal profile of the thermal stimulus (number of participants = 23; 4-second plateau, 2.5-second ramp-up/ramp-down). Ratings begin to rise in the first second of the stimulation (left of the red vertical line).
Activity at outcome onset (1st second of outcome; number of participants = 23) related to (A) model-based aversive prediction error (outcome worse than expected) and (B) pain > no stimulus. Displayed activations are cluster-thresholded (p < 0.05, FWE, two-tailed) with cluster-defining thresholds of p < 0.001, p < 0.01 and p < 0.05. (C) Conjunction of model-based prediction error and pain effects. Conjunctions of positive/negative effects are in yellow/blue.
(A) On each trial, participants (n = 24) chose one of four face options. After a delay, the outcome ($0.25 or $0.00) was revealed. Faces are paired together such that the probability of receiving a reward on a given trial is the same for both faces of the pair. One example of option pairing is indicated in colored brackets (reproduced from Wimmer et al., 2012). (B) Comparison of pain aversive prediction errors (PE) and monetary appetitive PE in periaqueductal gray (PAG) and nucleus accumbens (NAcc) regions of interest (see Figure 2 in main article; PAG-aversive: t(22) = 3.07, p = 0.006; PAG-appetitive: t(20) = 1.54, p = 0.14; NAcc-aversive: t(22) = -0.80, p = 0.44; NAcc-appetitive: t(20) = 5.77, p < 0.001). (C) Drifting reward probability distribution defining the reward equivalence for one example pairing (reproduced from Wimmer et al., 2012). * = p < 0.05, ** = p < 0.01, *** = p < 0.001. Error bars represent standard errors of the mean.
(A) Experimental conditions. In both the control and placebo runs, participants (n = 50) are presented with two predictive cues. The low cue is followed 50% of the time by low pain (46°C) and 50% of the time by medium pain (47°C). The high cue is followed 50% of the time by high pain (48°C) and 50% of the time by medium pain (47°C). In placebo runs, the thermode is installed on a skin spot pre-treated with a cream that participants are told has analgesic properties. (B) Periaqueductal gray (PAG) region of interest (ROI). (C) Continuous pain ratings from 30 independent subjects for 11-s thermal stimulations at 46.5°C, 47.5°C and 48.5°C. The window of analysis for aversive prediction error signals was set between 4 and 10 seconds, i.e., between the time the temperatures can be differentiated and the peak of pain. (D) Axiomatic predictions for aversive prediction error. Axiom #1 stipulates that aversive prediction error signals should increase with temperature intensity, regardless of expectations. Axiom #2 stipulates that lower pain expectations should be associated with higher prediction errors, regardless of temperature. Because only the medium temperature follows both cues, axiom #2 can only be tested on the medium temperature. Moreover, if the same prediction error signals are also influenced by instruction-based expectations, we should observe higher activity for the placebo vs. control condition. (E) Activity in the PAG during the PE window. The left panel shows a clear effect of temperature (low < medium, medium < high, all p’s < 0.001). The right panel shows effects of cues and condition for stimulations at 47°C, which are in conformity with axioms #2 and 3. Activity in the PAG ROI for medium pain stimulations is higher for low vs. high pain cues (F(1,49) = 4.39, p < 0.05) and for the placebo analgesia condition vs. control (F(1,49) = 16.03, p < 0.001). Error bars represent standard errors of the mean.
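The ordering logic of axioms #1 and #2 described in panel D can be sketched as a check on condition means. This is an illustration only, not the statistical analysis reported in the caption (which uses significance tests); the function names, dictionary keys, and numeric values are hypothetical.

```python
def satisfies_axiom_1(signal_by_temperature):
    """Axiom #1: the signal increases with temperature (low < medium < high)."""
    low = signal_by_temperature["low"]
    medium = signal_by_temperature["medium"]
    high = signal_by_temperature["high"]
    return low < medium < high


def satisfies_axiom_2(medium_signal_by_cue):
    """Axiom #2: at the medium temperature, a lower pain expectation
    (low cue) gives a larger signal than a higher one (high cue)."""
    return medium_signal_by_cue["low_cue"] > medium_signal_by_cue["high_cue"]


# Hypothetical mean ROI responses (arbitrary units), not data from the study.
by_temperature = {"low": 0.10, "medium": 0.25, "high": 0.45}
medium_by_cue = {"low_cue": 0.30, "high_cue": 0.20}

axiom_1_ok = satisfies_axiom_1(by_temperature)
axiom_2_ok = satisfies_axiom_2(medium_by_cue)
```

A candidate prediction error signal would need to pass both checks; a signal that tracks temperature alone passes the first but not the second.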
Activity during decision-making (number of participants = 23) correlating positively (red) or negatively (blue) with the expected value of the chosen option (warm/cold colors indicate low/high subjective (model-based) probability of pain). Displayed activations are cluster-thresholded (p < 0.05, FWE, two-tailed) with cluster-defining thresholds of p < 0.001, p < 0.01 and p < 0.05.
Supplementary Figure 6 DCM optimizing the connectivity of the aversive prediction error structure: step 1a
(A) In all models, the driving inputs are the pain > no stimulus and expected value parametric modulators on outcome onsets. The models tested systematically varied the structure(s) receiving the expected value driving inputs and conveying this information to the midbrain. The model with the highest exceedance probability is highlighted in red. (B) Expected (expected posterior probability) and exceedance (probability compared with other tested models) probabilities associated with each model. Val = expected value, str = striatum, hipp = hippocampus, mb = midbrain. Number of participants = 23.
Supplementary Figure 7 DCM optimizing the connectivity of the aversive prediction error structure: step 1b
(A) In all models, the driving inputs are the pain > no stimulus and expected value parametric modulators on outcome onsets. The model selected in the previous step is the first one (top left). From left to right, the tested models varied hippocampus targets (vmPFC, PAG, or nothing), or added a link between the striatum and midbrain. Models in the second row additionally include expected value as a driving input to the vmPFC, and a link from the vmPFC to the striatum. (B) Expected (expected posterior probability) and exceedance (probability compared with other tested models) probabilities associated with each model. Val = expected value, put = putamen, hipp = hippocampus, PAG = periaqueductal gray. Number of participants = 23.
Supplementary Figure 8 DCM optimizing the connectivity of the aversive prediction error structure: step 2a
(A) In all models, the structures generating PE signals (put, vmPFC, hipp, PAG) are arranged according to the best model selected from the previous model selection steps. The links between these structures and the pain-specific PE structures are systematically varied. The model with the highest exceedance probability is highlighted in red. (B) Expected (expected posterior probability) and exceedance (probability compared with other tested models) probabilities associated with each model. Val = expected value, put = putamen, hipp = hippocampus, PAG = periaqueductal gray, OFC = orbitofrontal cortex, aMCC = anterior mid-cingulate cortex, dmPFC = dorsomedial prefrontal cortex. Number of participants = 23.
Supplementary Figure 9 DCM optimizing the connectivity of the aversive prediction error structure: step 2b
(A) Modulatory influences were systematically added to the connections from the striatum or midbrain to the OFC or aMCC. (B) Expected (expected posterior probability) and exceedance (probability compared with other tested models) probabilities associated with each model. Val = expected value, str = striatum, hipp = hippocampus, mb = midbrain, OFC = orbitofrontal cortex, aMCC = anterior mid-cingulate cortex, dmPFC = dorsomedial prefrontal cortex.
The four sets of random walks used in the current study. The probabilities associated with each option (blue and red lines) varied independently and slowly from trial to trial according to random walks. Probabilities were bounded between 20% and 80%, and the two walks had to cross at least once over the course of the experiment.
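One way to generate a pair of walks with the properties described in this caption (independent, slowly varying, bounded between 20% and 80%, crossing at least once) is sketched below. The caption does not give the step distribution, step size, number of trials, or how the bounds were enforced, so the Gaussian steps, `step_sd`, reflecting boundaries, rejection loop, and all function names are assumptions.

```python
import random


def bounded_walk(n_trials, rng, low=0.2, high=0.8, step_sd=0.03):
    """Gaussian random walk kept inside [low, high] by reflection (assumed scheme)."""
    p = min(max(rng.uniform(low, high), low), high)
    walk = []
    for _ in range(n_trials):
        walk.append(p)
        p += rng.gauss(0.0, step_sd)
        # Reflect off the boundary that was overshot, then clamp as a safeguard.
        if p > high:
            p = 2 * high - p
        elif p < low:
            p = 2 * low - p
        p = min(max(p, low), high)
    return walk


def crosses(walk_a, walk_b):
    """True if the sign of (a - b) flips between two consecutive trials."""
    diffs = [a - b for a, b in zip(walk_a, walk_b)]
    return any(d1 * d2 < 0 for d1, d2 in zip(diffs, diffs[1:]))


def make_walk_pair(n_trials, rng, max_attempts=1000):
    """Draw independent walk pairs until one pair crosses at least once."""
    for _ in range(max_attempts):
        a = bounded_walk(n_trials, rng)
        b = bounded_walk(n_trials, rng)
        if crosses(a, b):
            return a, b
    raise RuntimeError("no crossing pair found; try more trials or a larger step_sd")


# Hypothetical usage: 150 trials is an arbitrary choice, not from the study.
walk_a, walk_b = make_walk_pair(150, random.Random(0))
```

Rejecting pairs that never cross ensures that neither option stays better than the other for the whole session, which is what makes continued learning necessary.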
Supplementary Figures 1–11 (PDF 1123 kb)
Roy, M., Shohamy, D., Daw, N. et al. Representation of aversive prediction errors in the human periaqueductal gray. Nat Neurosci 17, 1607–1612 (2014). https://doi.org/10.1038/nn.3832