Associative learning and extinction of conditioned threat predictors across sensory modalities

The formation and persistence of negative pain-related expectations by classical conditioning remain incompletely understood. We elucidated behavioural and neural correlates involved in the acquisition and extinction of negative expectations towards different threats across sensory modalities. In two complementary functional magnetic resonance imaging studies in healthy humans, differential conditioning paradigms combined interoceptive visceral pain with somatic pain (study 1) and aversive tone (study 2) as exteroceptive threats. Conditioned responses to interoceptive threat predictors were enhanced in both studies, consistently involving the insula and cingulate cortex. Interoceptive threats had a greater impact on extinction efficacy, resulting in disruption of ongoing extinction (study 1), and selective resurgence of interoceptive CS-US associations after complete extinction (study 2). In the face of multiple threats, we preferentially learn, store, and remember interoceptive danger signals. As key mediators of nocebo effects, conditioned responses may be particularly relevant to clinical conditions involving disturbed interoception and chronic visceral pain.

Expectations shape our experience of reality, and are essential for adaptive behaviour in any complex environment. In the face of danger, negative expectations are formed and dynamically updated by experience, involving associative learning and memory processes. As a key emotional response during the expectation of threat, conditioned fear is essential to trigger adaptive escape or avoidance responses. Learned fear can, however, also turn maladaptive and contribute to pathology, as underscored by knowledge from fear conditioning research conducted in the context of anxiety, psychological trauma, and stress-related disorders 1 . More recently, the scope has been broadened to acute and chronic pain 2 , embedded within the fear-avoidance model 3 . In keeping with its biological salience, pain is a ubiquitous and fundamentally threatening experience. Interoceptive pain arising from visceral organs appears to be particularly threatening and fear-inducing 4,5 , resulting in much suffering in highly prevalent disorders of the gut-brain axis like the irritable bowel syndrome (IBS) 6 . Interoceptive, visceral pain is highly modifiable by cognitions and emotions 6,7 , including negative expectations during the anticipation of pain as key mediators of nocebo effects [8][9][10][11][12] . Despite broad clinical implications of nocebo effects reaching far beyond chronic pain [13][14][15] , the formation and persistence of negative pain-related expectations by classical conditioning remain incompletely understood, especially with respect to neurobiological mechanisms and their possible specificity to threat modality.
Human fear conditioning studies with experimental pain as unconditioned stimuli (US) have implemented exteroceptive, somatic [16][17][18][19] or interoceptive, visceral pain as salient threats [20][21][22][23] , but knowledge about common and distinct threat-specific neural mechanisms remains limited, especially regarding the extinction and retrieval of pain-related fear memories 24,25 . Combining interoceptive and exteroceptive threats from different sensory modalities, as accomplished herein, constitutes a unique opportunity to elucidate specificity to threat modality in a clinically-relevant context 4,26 , and is timely given recent conceptual advances regarding interoception 27,28 and interoceptive psychopathology 29 . The experience of multiple threats from different sensory modalities closely mimics the clinical reality of patients with diverse symptoms, especially those with complex comorbidities, which often characterise patients with chronic pain. In fact, any normal environment presents multiple salient threats, which may impact learning and memory processes relevant to nocebo effects that remain incompletely understood even in healthy individuals. Existing studies with multiple threats support the role of the insula, a key region of the salience network, in sensory modality-specific effects underlying aversive expectancy 26,[30][31][32] . The engagement of the salience network, together with regions of the fear and extinction networks, remains to be tested not only for the formation but especially for the extinction of conditioned responses to multiple threats. Conditioned negative expectations may be markedly resistant to extinction, as suggested by studies involving somatic pain stimuli 33,34 .
This may be particularly the case for interoceptive memory traces, as suggested by early classical interoceptive conditioning studies carried out by Soviet psychologists 35 , complemented by modern approaches on fear learning of interoceptive and exteroceptive cues 36 and on the partially distinct neural representation of aversive visceral signals 37 . Impaired extinction efficacy and other phenomena related to memory processes can reportedly facilitate the return of fear and increase the risk of relapse 38 , with broad implications for the chronicity and treatment of pain and fear-related disorders 39 .
We herein elucidated the behavioural and neural mechanisms involved in the acquisition and extinction of negative expectations towards different types of interoceptive and exteroceptive threats across sensory modalities. To this end, we analysed data from two independent differential fear conditioning studies with methodological and conceptual overlap, allowing us to assess reproducibility, and offering converging insight into conditioned anticipatory responses to threats from different sensory modalities. In both functional magnetic resonance imaging (fMRI) studies, visceral pain induced by rectal distension was implemented as clinically-relevant interoceptive US together with an exteroceptive US, which was either an equally painful thermal stimulus (study 1), or a non-nociceptive, yet equally aversive tone (study 2). In addition to threat modality-specific predictive cues (conditioned stimuli, CS), unpaired safety cues in both studies allowed us to compare conditioned differential responses to interoceptive versus exteroceptive threat predictors for different phases of conditioning. For the acquisition, we tested the general hypothesis that in the face of multiple threats, predictive learning is shaped by the salience of the US, as suggested by preparedness theory 40 and the evolutionary significance of the interoceptive modality, as illustrated by one-trial learning phenomena like conditioned nausea and taste aversion 41 . Given initial evidence that pain modality shapes not only the perception and processing of stimuli 4,5,42 , but also anticipatory responses including conditioned fear 26,36 , we expected greater differential conditioned responses involving regions of the fear and salience networks to cues predicting interoceptive threat. We further tested whether conditioned responses to interoceptive threat predictors are more resistant to effective extinction, involving regions of the extinction network.
The return of conditioned responses induced by reinstatement, i.e. unexpected re-exposure to the US, constitutes a promising translational tool to assess extinction efficacy 38 that has rarely been applied in brain imaging studies on pain-related fear 20 . To this end, our paradigms incorporated different reinstatement procedures following extinction phases, allowing us to test in reinstatement-test phases if the interoceptive CS-US association is more susceptible to reinstatement effects.
In sum, conditioned responses to interoceptive threat predictors were enhanced in both studies after the acquisition, consistently involving the insula and cingulate cortex as key regions of the salience network. Our results supported that unexpected exposure to interoceptive threats had a greater impact on extinction efficacy, resulting in disruption of ongoing extinction (study 1), and selective resurgence of interoceptive CS-US associations after complete extinction (study 2). Together, our findings are an important step towards unravelling how negative expectations are shaped by associative learning and memory processes in the face of multiple threats. A more refined understanding of conditioned nocebo effects in the context of clinically-relevant interoceptive and exteroceptive threats may ultimately contribute to an improved consideration of expectancy effects to the benefit of patients, broaden the rapidly evolving scope of the gut-brain axis in health neuroscience and disease [43][44][45] , and fit into a framework of the cognitive neurosciences interfacing between mind and body 27 .

Results
Participants. Out of a total of N = 77 healthy adults who participated, N = 12 were excluded due to technical difficulties with MRI data acquisition (N = 6), movement artefacts (N = 4), or failure to reach visceral pain threshold within predetermined maximal distension pressure (N = 2). As a result, we herein report on data from N = 42 volunteers for study 1 (all female, age 34.5 ± 2.0 years; BMI 22.7 ± 0.4 kg/m²), and N = 23 volunteers for study 2 ( […]
[…] (Fig. 1a) as well as compared to US AUD (Fig. 1b), involving aINS and dACC in both studies, and additionally the amygdala in study 1 (Table 1). Further, interoceptive US VISC consistently induced lower neural activation compared to both exteroceptive US SOM and US AUD within pINS (Table 1). Of note, these differences between US modalities remained largely unchanged when considering post-acquisition differences in US unpleasantness (studies 1 and 2) or US intensity ratings (study 1 only) as covariates of no interest (Supplementary Tables S1 and S2). Moreover, conjunction analyses (against global null) revealed shared neural activation in study 1 (US VISC ∩ US SOM ) in the aINS, whereas shared activation was detectable in study 2 only in uncorrected whole-brain analyses but not in FWE-corrected ROI analyses (US VISC ∩ US AUD , see Supplementary Table S3).
Fig. 1 Interoceptive threat (US VISC ) induced significantly greater neural activation in dACC and aINS compared to both exteroceptive threats (a study 1, compared to US SOM ; b study 2, compared to US AUD ; all P FWE < 0.05). Moreover, interoceptive and exteroceptive threats induced shared neural activation in different brain regions across studies (full results in Supplementary Table S3). Neural activations in regions of interest were superimposed on a structural T1-image and thresholded at P < 0.001 uncorrected for visualisation purposes; colour bars indicate t-scores. For details, see Table 1. For whole-brain results on differential activation, see Supplementary Fig. S4. aINS anterior insula, AUD auditory, dACC dorsal anterior cingulate cortex, FWE family-wise error, SOM somatic, US unconditioned stimuli, VISC visceral.
Conditioned stimuli: acquisition phase. Repeated CS + -US pairings during acquisition resulted in the conditioned negative valence of all threat predictors in both studies, as evidenced by significant rmANOVA time effects (Supplementary Tables S4 and S5). Interestingly, this increase in negative valence was consistently enhanced for interoceptive (ΔCS + VISC ) versus exteroceptive threat predictors (ΔCS + SOM and ΔCS + AUD , respectively), despite comparable contingency awareness between modalities ( […]
[…] compared to both ΔCS + SOM (Fig. 2c), as well as compared to ΔCS + AUD (Fig. 2d), a finding that was strikingly similar between both studies (P FWE < 0.05, Table 3). Of note, these findings were not appreciably altered when considering post-acquisition differences in US ratings as covariates of no interest (Supplementary Tables S9 and S10). Moreover, exploratory correlational analyses revealed significant correlations between CS and US valence, as well as between differential CS- and US-induced activations in peak activations in the insula and cingulate cortices (Supplementary analyses).
Table 2 Contingency awareness for threat-predictive conditioned stimuli (CS + ) and safety cues (CS − ) assessed with visual analogue scales (0-100 VAS, mm, representing % probability that a CS is followed by the specific US) after acquisition (ACQ), extinction (EXT), and reinstatement-test (RST-TEST) in study 1 (CS + VISC , CS + SOM ) and study 2 (CS + VISC , CS + AUD ). Note that actual CS-US contingencies for all CS + during ACQ were 80% in study 1 and 83% in study 2; all CS were presented without US in EXT and RST-TEST phases. CS − were never paired with US in any phase. Data are given as mean ± standard error of the mean. AUD auditory, CS conditioned stimuli, SOM somatic, US unconditioned stimuli, VISC visceral. *Exact P-values for between-modality paired t-tests assessing differences between CS + within each study, shown uncorrected for multiple testing.
Fig. 2 Conditioned predictors of interoceptive threat (ΔCS + VISC ) acquired significantly greater negative valence than conditioned predictors of exteroceptive threats after acquisition (a study 1, compared to ΔCS + SOM : ***P < 0.001; b study 2, compared to ΔCS + AUD : **P < 0.01; results of Bonferroni-corrected paired t-tests between modalities, full details in Supplementary Tables S4-S7). Individual delta (Δ) scores were computed for differential CS valence of each CS + relative to the CS − . Data are presented as individual data points, boxplots, and densities (raincloud plots 117 ). At the neural level, interoceptive threat predictors (ΔCS + VISC ) induced enhanced differential neural responses in MCC and pINS compared to both exteroceptive threat predictors (c study 1, ΔCS + VISC > ΔCS + SOM ; d study 2, ΔCS + VISC > ΔCS + AUD ; all P FWE < 0.05; details in Table 3), but threat predictors also induced shared differential activation in overlapping brain regions across studies (full results in Supplementary Table S8). For whole-brain results, see Supplementary […]
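The delta scores and Bonferroni-corrected between-modality comparisons described for CS valence can be illustrated with a minimal Python sketch. The function names, the ratings, and the number of comparisons used for the correction are illustrative assumptions; this is not the authors' analysis code, and none of the numbers are study data.

```python
import numpy as np
from scipy import stats


def delta_scores(cs_plus, cs_minus):
    """Differential valence: each CS+ rating minus the same participant's CS- rating."""
    return np.asarray(cs_plus, dtype=float) - np.asarray(cs_minus, dtype=float)


def compare_modalities(delta_a, delta_b, n_comparisons=1):
    """Paired t-test between two sets of delta scores with a Bonferroni adjustment."""
    t_stat, p_raw = stats.ttest_rel(delta_a, delta_b)
    # Bonferroni: multiply the raw P-value by the number of tests, capped at 1.
    p_adj = min(1.0, p_raw * n_comparisons)
    return float(t_stat), float(p_raw), float(p_adj)


# Made-up VAS valence ratings for five hypothetical participants (not study data).
cs_visc = [60, 70, 65, 80, 75]
cs_som = [50, 55, 52, 60, 58]
cs_minus = [40, 42, 41, 45, 43]

d_visc = delta_scores(cs_visc, cs_minus)
d_som = delta_scores(cs_som, cs_minus)

# n_comparisons=2 is an assumed number of planned tests, chosen only for illustration.
t_stat, p_raw, p_adj = compare_modalities(d_visc, d_som, n_comparisons=2)
print(f"t = {t_stat:.2f}, raw P = {p_raw:.4f}, Bonferroni-adjusted P = {p_adj:.4f}")
```

Computing the delta scores first means that each participant serves as their own control for baseline rating tendencies, so the paired test compares modalities on the differential response rather than on raw CS + ratings.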
Conditioned stimuli: reinstatement-test phase. In the reinstatement-test phase, unpaired CS presentations were implemented immediately after single threat reinstatement in study 1 [i.e. unexpected exposure to only US VISC in one subgroup (N = 22), and to only US SOM in another subgroup (N = 20)], or after multiple threat reinstatement (i.e. unexpected exposure to both US VISC and US AUD in all participants) in study 2. After single threat reinstatement (US VISC (N = 22); US SOM (N = 20)) in study 1, a significant time × modality × group interaction was observed for negative CS valence (F(1,1,40) = 5.56; P = 0.023; η 2 = 0.12, see Supplementary Table S4). After multiple threat reinstatement with both US VISC and US AUD in study 2, the interaction effect was not significant (P = 0.055; η p 2 = 0.16, full results in Supplementary Table S5). Planned between-modality comparisons for the POST RST-TEST time point (Supplementary Tables S6 and S7) revealed greater differential negative valence of interoceptive compared to exteroceptive threat predictors in those groups involving US VISC exposure during reinstatement (study 1, US VISC -subgroup, ΔCS + VISC versus ΔCS […] Table S6). The same comparisons for study 2 (PRE-POST) revealed a significant increase for ΔCS + VISC (P = 0.038 uncorrected), and no change for ΔCS + AUD (P > 0.05) (Supplementary Table S7).
Table 3 Differential neural activation induced by predictors of interoceptive (CS + VISC ) versus exteroceptive threat (study 1, CS + SOM ; study 2, CS + AUD ) relative to safety-predictive CS − , during acquisition. Results of second-level paired t-tests (study 1: […] and vice versa) are presented. Peak voxel indicates results of ROI analyses (cluster size k E ≥ 3; all P FWE < 0.05) and whole-brain analyses (in italic font; cluster size k E ≥ 10; all P uncorrected < 0.001). Exact unilateral P-values are provided. For a visualisation, see Fig. 2 and Supplementary Fig. S5. For analysis of shared responses, see Supplementary Table S8. For analyses controlling for US ratings, see Supplementary Tables S9 and S10. AUD auditory, CS conditioned stimuli, dACC dorsal anterior cingulate cortex, FWE family-wise error, H hemisphere, MCC midcingulate cortex, MNI Montreal Neurological Institute, pINS posterior insula, S1 primary somatosensory cortex, SOM somatic, VISC visceral, vmPFC ventromedial prefrontal cortex.
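As a consistency check on the effect size reported for the study 1 interaction, partial eta squared can be recovered from an F statistic and its degrees of freedom. The sketch below assumes that the reported F-test has 1 effect and 40 error degrees of freedom, and that the reported η 2 is a partial eta squared; neither assumption is stated explicitly in the text.

```latex
\eta_p^2
  = \frac{F \cdot df_{\mathrm{effect}}}{F \cdot df_{\mathrm{effect}} + df_{\mathrm{error}}}
  = \frac{5.56 \times 1}{5.56 \times 1 + 40}
  = \frac{5.56}{45.56}
  \approx 0.12
```

Under these assumptions, the result agrees with the reported value of 0.12.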
At the neural level, threat predictors induced shared differential neural activation within hippocampus and vmPFC in the US VISC -subgroup (Table 4), and in the pINS in the US SOM -subgroup (study 1, ΔCS + VISC ∩ ΔCS + SOM , Fig. 3e, Table 5). After multiple threat reinstatement in study 2, threat predictors induced shared activation in the hippocampus, aINS, pINS, dACC, and MCC (ΔCS […] Fig. 3f, Table 6). Differential activation induced by interoceptive versus exteroceptive threat predictors was enhanced within pINS (L: […] to ΔCS + SOM , Fig. 3d, Table 4). In contrast, other groups revealed no differences between modalities in differential neural activation induced by interoceptive compared to exteroceptive threat predictors at all (Tables 5 and 6).
Fig. 3 After reinstatement with unexpected US, negative valence was greater for interoceptive compared to exteroceptive threat predictors in reinstatement groups involving US VISC (a study 1, US VISC -subgroup, ΔCS + VISC versus ΔCS + SOM ; c study 2, ΔCS + VISC versus ΔCS + AUD ; *both P < 0.05, results of Bonferroni-corrected paired t-tests between modalities, full results in Supplementary Tables S4-S7), whereas no difference was observed in the US SOM -subgroup (b study 1, ΔCS + VISC versus ΔCS + SOM ). Individual delta (Δ) scores were computed for differential CS valence of each CS + relative to the CS − . Data are presented as individual data points, boxplots, and densities (raincloud plots 117 ). At the neural level, differential activation induced by interoceptive cues was enhanced in the US VISC -subgroup within dACC and pINS during reinstatement-test (d study 1, ΔCS + VISC compared to ΔCS + SOM , all P FWE < 0.05). While no differential activation was observed for ΔCS + VISC compared to ΔCS + SOM in the US SOM -subgroup in study 1 or compared to ΔCS + AUD in study 2, shared differential neural activation was induced by both threat predictors in regions of interest, such as in the insula and cingulate cortex (e ΔCS + VISC ∩ ΔCS + SOM in study 1, US SOM -subgroup; f ΔCS + VISC ∩ ΔCS + AUD in study 2). For full results, see Tables 4-6. For whole-brain results on differential activation, see Supplementary […]

Discussion
Adaptive human behaviour in complex environments with multiple threats is guided by evolutionary-driven survival strategies that are preserved across species. In the face of imminent threat, learning from experience is particularly fundamental to the ability to identify and remember predictors of danger to facilitate avoidance or escape. As a translational model at the interface of psychology and the neurosciences, Pavlovian conditioning has proven valuable to elucidating behavioural and neural mechanisms underlying conditioned fear during the expectation of threat 49-51 , with widely appreciated clinical implications for anxiety and stress-related disorders 1 . Herein, we broadened the scope to unravel learning and memory processes underlying negative expectations in the face of multiple threats from different sensory modalities, with a particular focus on pain. Pain is a ubiquitous and highly salient threat and a crucial part of the organism's survival system that evokes strong adaptive responses, including cognitive and emotional processes orchestrated within the brain. These guide behaviour not only in response to actual pain experience 52 , but more importantly also during pain expectation 8,53,54 . Interoceptive, visceral pain appears to be particularly threatening 4,5 , engages partly distinct neural representations 37 , and may have a specific functional role in shaping brain dynamics 27 . Given the evolutionary significance of aversive signals originating from within our bodies, interoceptive conditioning could evoke greater and more persisting conditioned responses relevant to nocebo mechanisms underlying hypervigilance and hyperalgesia.
Table 4 Shared and differential neural activation induced by predictors of interoceptive (CS + VISC ) and exteroceptive threat (CS + SOM ) relative to safety-predictive CS − , during reinstatement-test (RST-TEST) after single threat reinstatement with US VISC . For shared activation, results of conjunction analyses against global null are presented. For differential activation, results of second-level paired t-tests are presented ({CS + VISC < CS − } > {CS + SOM < CS − }; and vice versa). Peak voxel indicates results of ROI analyses (cluster size k E ≥ 3; all P FWE < 0.05) and whole-brain analyses (in italic font; cluster size k E ≥ 10; all P uncorrected < 0.001). Exact unilateral P-values are provided. For a visualisation, see […]
In two independent fMRI studies, we implemented specific, yet complementary differential conditioning paradigms to elucidate the acquisition and extinction of conditioned responses to predictors of interoceptive and exteroceptive threats. Experimental visceral pain as clinically-relevant interoceptive US was significantly more unpleasant when compared to the exteroceptive US from two sensory modalities, i.e. exteroceptive somatic pain in study 1 and aversive tone in study 2, despite careful matching to intensity and unpleasantness, respectively, supporting possible differences in habituation processes 4 . In addition to shared neural activation induced by US of different modalities, interoceptive US interestingly evoked greater neural activation within the anterior insula and dorsal anterior cingulate cortex as key regions of the salience network, with well-established roles in the central integration of interoceptive sensory signals with emotional and cognitive facets [55][56][57] . These findings, which appeared robust even when considering differences in US ratings as nuisance variables, reproduce and complement earlier efforts to elucidate the specificity of interoceptive visceral pain in shaping aversive anticipation and central pain processing, not only in direct comparison to an exteroceptive painful threat 4,26,58 , but also to a non-nociceptive, yet a priori equally aversive auditory threat. 
In line with a notable recent publication detailing a multivariate brain measure, the Neurologic Pain Signature (NPS), for visceral and somatic stimulation across different independent datasets 37 , our results from two independent studies support the unique salience of interoceptive pain as a US, above and beyond specific yet highly intertwined perceptual characteristics of intensity and unpleasantness, and underscore the suitability of this experimental model to elucidating the role of threat modality in associative learning and extinction processes in a clinically-relevant context.
To assess whether US modality distinctly shapes the formation of learned negative expectations, we accomplished analyses of differential conditioned responses to modality-specific threat predictors (CS). Results for the acquisition phases of both studies showed enhanced differential behavioural and neural responses to interoceptive threat predictors, suggesting preferential learning for the visceral modality. This was supported at the behavioural level by greater increases in the negative valence of interoceptive versus exteroceptive predictive cues, which were observed despite comparable contingency awareness, and greater visceral cue-induced SCR responses suggested by exploratory analyses of a subset of data. Within the brain, we documented shared differential activation to all threat predictors compared to safety cues in highly overlapping brain regions across studies, in line with a recent meta-analysis documenting a consistent and robust pattern of an 'extended fear network' across diverse fear conditioning paradigms 50 , as well as a meta-analysis supporting that pain-related and non-pain-related conditioned fear recruits overlapping but distinguishable neural networks 24 . Importantly, we also consistently demonstrated differences between interoceptive versus exteroceptive threat predictors in both studies. Specifically, conditioned interoceptive threat predictors induced greater differential neural activation in the posterior insula and midcingulate cortex. A recent meta-analysis focusing on pain anticipation supported an interplay of insular and cingulate regions in the representation of the affective qualities of sensory events, particularly applying to interoceptive signals 59 , in line with our own recent findings documenting the relevance of posterior insula in visceral compared to somatic pain expectation 26 . Given the well-established role of the posterior insula in restoring and maintaining homoeostasis in the face of imminent danger 60 , its distinct involvement may serve adaptive modulatory functions during the expectation of interoceptive threat. Our findings extend knowledge from other brain imaging studies on the modality-specific aversive expectancy that have compared predictors of somatic pain with aversive pictures 31 or disgusting odours 30 , and complement our own data on nocebo effects and underlying mechanisms in visceral pain 8,9,11,21 . These observations suggest a specific relevance of insular together with cingulate regions in the preferential acquisition of the presumably more salient interoceptive CS-US association. In keeping with the notion that expectations dynamically influence perception and learning 61,62 , the reciprocal impact of interoceptive threats and their predictors is further substantiated by our exploratory correlational results. These not only indicate that affective qualities of interoceptive versus exteroceptive threats shape conditioned negative expectations, but also suggest a tight link between differential neural responses during the expectation and experience of aversive interoceptive signals. Together, our findings regarding the formation of negative interoceptive expectancies by aversive conditioning support our hypothesis that in the face of multiple danger signals indicating bodily harm, visceral pain evokes preferential interoceptive fear learning. These findings could be viewed as a modern replication of classical interoceptive conditioning studies carried out by Soviet psychologists 35 , complemented herein by brain imaging techniques.
Table 5 Shared and differential neural activation induced by predictors of interoceptive (CS + VISC ) and exteroceptive threat (CS + SOM ) relative to safety-predictive CS − , during reinstatement-test (RST-TEST) after single threat reinstatement with US SOM . For shared activation, results of conjunction analyses against global null are presented […]
They are in keeping with preparedness theory 40 , and support its applicability to pain-related learning in a broader context of the affective neurosciences, with intriguing putative clinical relevance. The role of fear and hypervigilance is increasingly appreciated in the pathophysiology and treatment of multiple complex and overlapping clinical conditions, including anxiety and chronic pain 54 . Modality-specific conditioning could therefore contribute to unravelling nocebo mechanisms relevant to vulnerability to chronicity and treatment failure, especially when nocebo effects persist rather than extinguish.
Persisting or resurging fear constitutes a core target of cognitive-behavioural treatment approaches like exposure therapy, which is essentially built on the successful and robust extinction of conditioned responses including learned fear. When the threat is no longer present, extinction of conditioned responses to former threat predictors is adaptive, allowing behavioural flexibility in rapidly changing, complex environments. At the same time, the initially acquired memory trace is preserved and can be dynamically reactivated 63 , which can contribute to impaired extinction efficacy and to relapse in clinical contexts 64 . This may be particularly relevant for highly salient and fear-evoking threats that are crucial to avoid, like interoceptive pain, as essentially already suggested by the Soviet pioneers of classical conditioning 35 . To elucidate threat modality-specific extinction processes and their underlying neural mechanisms, we tested whether the visceral CS-US association is more resistant to extinction and more susceptible to memory reactivation or 'relapse', induced by unexpected US exposure (i.e. reinstatement). In an effort to model different aspects of extinction learning, including the clinical reality of patients with waxing and waning symptoms, we herein implemented different experimental extinction and reinstatement protocols. Interestingly, when omission of the US occurred directly after acquisition on the same day in study 1, behavioural results indicated persisting conditioned fear in response to visceral but not somatic pain predictors, despite comparable contingency awareness. While these results may indicate a greater resistance to extinction specifically for the interoceptive CS-US association, a cautious interpretation is warranted given the small number of extinction trials and an overall overestimation of reported CS-US contingency awareness. In study 2, with an extinction phase accomplished on a subsequent study day and more extinction trials, conditioned behavioural responses were no longer evident to either predictive cue, not even in a supplementary analysis of a smaller number of extinction trials, and hence do not support a modality-specific resistance to extinction.
Table 6 Shared and differential neural activation induced by predictors of interoceptive (CS + VISC ) and exteroceptive threat (CS + AUD ) relative to safety-predictive CS − , during reinstatement-test (RST-TEST) after multiple threat reinstatement with US VISC and US AUD . For shared activation, results of conjunction analyses against global null are presented. For differential activation, results of second-level paired t-tests are presented ({CS + VISC < CS − } > {CS + AUD < CS − }; and vice versa). Peak voxel indicates results of ROI analyses (cluster size k E ≥ 3; all P FWE < 0.05) and whole-brain analyses (in italic font; cluster size k E ≥ 10; all P uncorrected < 0.001). Exact unilateral P-values are provided. For a visualisation, see Fig. 3 and Supplementary Fig. S7. aINS anterior insula, CS conditioned stimuli, FWE family-wise error, dACC dorsal anterior cingulate cortex, dlPFC dorsolateral prefrontal cortex, H hemisphere, HIP hippocampus, MCC midcingulate cortex, MNI Montreal Neurological Institute, pINS posterior insula, ROI regions of interest, S2 secondary somatosensory cortex, VISC visceral. a Results also significant against conjunction null.
However, given that our earlier conditioning work repeatedly documented rapid and full extinction in 1-day paradigms with visceral threats only 20,65 , together the present findings could hence also indicate that full extinction of conditioned emotional responses to multiple threats requires more unreinforced trials, especially for threats of higher salience and immediate extinction learning. In light of increasing knowledge regarding the role of consolidation and reconsolidation in the context of conditioned fear 66,67 , our divergent findings in studies 1 and 2 call for more mechanistic studies on the temporal dynamics and boundary conditions of pain-related extinction learning in multi-day and multi-threat paradigms, ideally including objective, physiological measures derived from electrodermal activity or pupillometry recordings. Regarding brain imaging results for the extinction phase, no differences were observed in neural responses to threat predictors of different modalities in either study. Instead, shared neural activation induced by both threat predictors during extinction was evident in both studies, involving key areas of the extinction network, particularly the hippocampus, supporting its general role in extinction learning 68 irrespective of threat modality, or of the number and timing of extinction trials.
Although extinction efficacy is clearly relevant to chronicity and treatment failure in patients 69,70 , underlying mechanisms remain incompletely understood even in healthy individuals. Reinstatement constitutes a promising translational tool 38 , which has not been applied in human brain imaging studies in the context of multiple threats. While we implemented different reinstatement procedures in the two studies, all involved unexpected US exposure followed by unpaired cue presentations. This allowed us to analyse differential conditioned responses during a reinstatement-test phase for different (former) threat predictors as an indicator of extinction efficacy. Our behavioural findings provide at least partial support for the notion that extinction efficacy may be reduced for the interoceptive CS-US association, i.e. that reinstatement with the visceral US had a greater impact on differential responses to CS. After reinstatement involving unexpected exposure to the visceral US, either as a 'single threat' (US VISC subgroup of study 1) or as a 'multiple threat' (all participants in study 2), we observed greater negative valence of former interoceptive versus exteroceptive threat predictors. On the other hand, reinstatement with the somatic US alone (US SOM subgroup of study 1) did not induce differences between CS modalities. Our hypothesis is most clearly supported by the results of study 2, in which reinstatement induced a selective resurgence of the interoceptive CS-US association, consistent with a return of interoceptive fear. Interpretation of findings in study 1 is complicated by the fact that extinction was immediate and shorter, as explained above, and evidently did not lead to a complete resolution of conditioned responses. Herein, reinstatement can rather be conceptualised as a disruption of the ongoing extinction process, which would make a return (i.e. a de novo increase) of conditioned responses difficult to detect. 
While we did not plan for this, and results may be hampered by limited statistical power given relatively small sample sizes of reinstatement groups, applicability to real-life scenarios is intriguing: Unexpected and unsignaled threats like painful episodes can obviously occur at any time point during an ongoing extinction process. Understanding how easily such a process can be disrupted would hence inform our understanding of adaptive extinction learning with relevance to factors that may interfere with successful exposure-based treatment. Given these considerations, one interpretation of data from study 1 is that unexpected exposure to interoceptive threat disrupted or in fact halted the ongoing extinction of conditioned responses to interoceptive threat predictors, whereas the extinction process for the same predictors effectively continues after unexpected exposure to exteroceptive threat, as supported by within-modality comparisons. Interestingly, residual effects of cue aversiveness have been shown to predict a reinstatement effect in healthy volunteers 71 , consistent with evidence from a clinical setting 72 . Hence, residual fear responses after extinction may serve as a predictor of treatment outcome. In patients with persistent fear and/or pain, achieving a robust and sustained extinction of conditioned responses constitutes an important treatment goal, and may be particularly challenging for interoceptive memory traces based on our data in healthy individuals. In other words, we may be primed to preferentially learn, store, and remember cues that signal internal harm.
Within the brain, after reinstatement with visceral US alone or in combination with the somatic US, shared differential neural responses to both threat-predictive cues were consistently observed in the hippocampus as a core region of the extinction network. This finding is well in line with earlier studies implementing only one threat modality [73][74][75] , including visceral pain in healthy individuals 76 and in patients with chronic visceral pain 77,78 . In addition to the hippocampus, posterior insula and cingulate regions were differentially activated following reinstatement with visceral threats alone, a neural activation pattern that closely resembled neural responses detected during the acquisition, yet not observed during extinction. These enhanced differential responses within the insula and cingulate cortex during reinstatement-test may reflect the reactivation of the excitatory memory trace, presumably triggering preparatory responses in expectation of the reoccurrence of threat. Notably, single threat visceral reinstatement (study 1) resulted in enhanced differential neural responses for interoceptive compared to the exteroceptive threat predictors, whereas multiple threat reinstatement (study 2) led to shared differential activation of these regions to both interoceptive and exteroceptive threat cues. The latter finding may be explained by generalisation effects induced by the unexpected exposure to multiple threats in close temporal proximity. This may promote a generalisation of threat value from the more salient interoceptive to the less salient exteroceptive threat, ultimately resulting in the reactivation of neural responses to all former threat predictors regardless of their salience. 
While this is speculative, the inability to adequately differentiate between stimuli of different threat value, particularly when confronted with recent adversity, has previously been discussed as one mechanism contributing to enhanced relapse risk in clinical populations undergoing extinction-based treatment 74 . Together, these observations underline the critical importance of factors promoting either discrimination or generalisation of conditioned responses in the prediction and prevention of fear relapse 79,80 , extending the concept that as a result of conditioning, impaired discrimination of conditioned 81 and unconditioned responses 82 could play a role in the development of chronic pain.
Conditioned interoceptive fear should generally be considered adaptive, and an essential component of evolutionary-driven survival behaviour. However, when contextualised within a broader nocebo framework in which conditioned negative expectations drive maladaptive avoidance, hypervigilance and hyperalgesia 8,83,84 , putative clinical implications and future directions are noteworthy. Our findings support that interoceptive threat predictors may more readily evoke conditioned fear, which could drive the transition from acute to chronic pain as well as symptom chronicity, especially in vulnerable individuals. Furthermore, the risk for impaired extinction efficacy and relapse phenomena may be more pronounced in the context of aversive interoceptive signals, especially in combination with stress 83,85 , which demonstrably amplifies visceral nocebo effects 86 , and may contribute to a negative recall bias about aversive visceral experiences 87 . Given the evidence supporting altered extinction learning in patients with chronic pain, including IBS 77,78 , translational research in clinical populations is urgently needed. Our results are limited by the lack of complete SCR data as a biological marker of learning, and by the inability to more definitively exclude pain-modality-specific vascular artefacts induced by gasping or other respiratory or movement-related effects, which could have been inspected more closely had pulse oximetry or respiration been measured. Nevertheless, they do provide a more refined understanding of conditioned nocebo effects in the context of clinically-relevant interoceptive and exteroceptive threats. 
Merging our clinically-driven perspective with the rapidly expanding general literature on interoception and predictive processing provides opportunities for the development or refinement of computational models based on the precision of interoceptive versus exteroceptive signals, advancing not only the definition of salience itself but also clarification of its mechanistic basis [88][89][90][91][92][93] . The translation of this knowledge may ultimately help understand and minimise negative expectancy effects in patients with chronic pain 94,95 , especially in disorders of gut-brain interactions. This knowledge also adds a brain perspective to the eloquent claim that the gut is 'smart' due to its capability to learn and remember 96 , and supports further efforts towards extinction-based treatment approaches for these highly prevalent conditions 10,97 .

Methods
Participants. For the purposes of this report, we analysed unpublished data from healthy volunteers who were recruited to serve as controls in two conceptually connected fMRI conditioning studies conducted within a collaborative research unit (SFB 1280 'Extinction Learning', funded by the German Research Foundation). We utilised data from healthy volunteers recruited as part of a patient study (study 1), and included data from the placebo arm of a pharmacological study (study 2; German Clinical Trials Register, registration ID: DRKS00016706). Recruitment and screening of all healthy volunteers in both studies followed highly-standardised and established procedures in our line of visceral pain research 4,86 . An initial structured telephone screening was followed by a personal interview and a medical examination. Interview and examinations were accomplished in a medically-equipped room within a clinical research unit at the University Hospital Essen, Germany. Exclusion criteria common to both studies were <18 or >45 years of age, body mass index (BMI) < 18 or >30, and any known medical condition or regular medication use (except thyroid medication and hormonal contraceptives). The usual exclusion criteria for magnetic resonance imaging (MRI) applied, and structural brain abnormalities were ruled out upon structural MRI. Perianal tissue damage (e.g. haemorrhoids, fissures), which may interfere with rectal balloon distensions, was excluded by digital rectal examination. Pregnancy was excluded with a commercially available urinary pregnancy test (Biorepair GmbH, Sinsheim, Germany) on the day of the experiment. Participation in any previous or ongoing studies involving pain-related conditioning was also exclusionary. Standardised questionnaires were used to screen for recent gastrointestinal complaints 98 , symptoms of depression or anxiety (Hospital Anxiety and Depression Scale, HADS) 99 , as well as to confirm right-handedness 100,101 . 
As part of a comprehensive psychosocial questionnaire battery, chronic perceived stress was also assessed (Trier Inventory of Chronic Stress, TICS) 102 . All the participants reported normal hearing and normal or corrected-to-normal vision. The work was conducted in accordance with the Declaration of Helsinki, and studies were approved by the ethics committee of the University Hospital Essen (protocol numbers 10-4493 and 16-7237), and followed the relevant ethical guidelines and regulations. All volunteers provided written informed consent and were paid for their participation.
Overview of study designs and procedures. To elucidate the formation and extinction of conditioned responses to threat-predictive cues (conditioned stimuli, CS) in the face of multiple different biologically-salient threats, we implemented two differential delay conditioning studies with visual CS + predicting interoceptive threat (visceral pain: US VISC ) or exteroceptive threat (study 1: somatic thermal pain, US SOM ; study 2: aversive auditory stimulus, US AUD ), and unpaired CS − . All experimental procedures were conducted in the MRI-suite of the Institute of Diagnostic and Interventional Radiology and Neuroradiology at the University Hospital Essen, Germany. In both studies, perceptual thresholds for each US modality were initially assessed, and individual US stimulus intensities for implementation during conditioning were identified with a calibration and matching procedure (for details, see below). During conditioning, both studies implemented the same sequence of experimental learning phases, namely acquisition, extinction, and reinstatement-test phases (Fig. 4, details below). In study 1, all phases were accomplished consecutively on a single study day, whereas in study 2, extinction and reinstatement-test phases were implemented 24 h after acquisition. During all learning phases, fMRI was applied to assess shared and differential neural activation induced by US and CS. US- and CS-related behavioural measures were acquired for each phase using digitised visual analogue scales (VAS). Moreover, electrodermal activity was continuously recorded aiming for analysis of skin conductance responses (SCR) as a psychophysiological measure of learning using an MRI-compatible system (Biopac Systems, Inc., Goleta, CA, USA; MP100 in study 1, MP160 in study 2), but technical difficulties resulted in incomplete data. 
Results of exploratory analyses of skin conductance responses to CS for a subset of participants in studies 1 and 2 are provided as Supplementary Material (Supplementary Fig. S1). Note that participants were informed that the study goals were to investigate neural mechanisms underlying visceral pain-related learning and memory processes. Importantly, no detailed information was provided about experimental phases, or about the contingencies between CS and US.
Unconditioned stimuli (US). For interoceptive US (US VISC ), applied in both studies, pressure-controlled rectal distensions were carried out with a barostat system (modified ISOBAR 3 device, G & J Electronics, Toronto, ON, Canada). Graded distensions of the rectum with an inflatable balloon constitute a well-established experimental model to assess visceroception and visceral pain, especially in the context of IBS 103 . The distension model allows the controlled and finely-tuned application of distensions inducing mild, intermediate or strong sensations of urgency, discomfort, and pain that closely resemble aversive visceral sensations experienced by patients, which are also commonly but less frequently experienced by healthy persons. Building on our long-standing experimental expertise with different sensory modalities 4,47,48,86 , for the exteroceptive US, cutaneous thermal stimuli (US SOM ) were applied on the left ventral forearm with a thermode (PATHWAY model CHEPS; Medoc Ltd. Advanced Medical Systems, Ramat Yishai, Israel) in study 1. In study 2, an aversive tone (US AUD ) with a sawtooth waveform profile and a frequency of 1 kHz created using Audacity 1.3.10beta (http://www.audacity.sourceforge.net/) was presented by an MRI-compatible sound system (Amplifier mkll+S/N 2016-2-2-03, MR confon GmbH, Magdeburg, Germany) bi-aurally through headphones. Individual perceptual thresholds for US are reported herein only for the purpose of descriptive characterisation of the study samples, as they primarily served as anchors for US calibration and matching.
In a continuation of our previous work on specificity to pain modality 4,26,86 , US VISC and US SOM were matched to perceived pain intensity in study 1. In study 2, US VISC were compared to a non-nociceptive, yet equally aversive US AUD by matching to perceived unpleasantness. In both studies, visceral stimuli served as an anchor for calibration and matching of exteroceptive US, aiming to identify individual US stimulation intensities within a predefined perceptual range of 60-80 mm (assessed on 0-100 mm VAS, ends labelled not painful and extremely painful in study 1, and not unpleasant and very unpleasant in study 2). To this end, a distension pressure 5 mmHg below the individual rectal pain threshold was initially chosen and rated on VAS until pressure within the predefined range was identified. This was successfully accomplished in both studies (VAS ratings for US VISC : 70.1 ± 0.9 mm in study 1; 65.8 ± 3.6 mm in study 2). For matching, visceral stimuli were presented with thermal (study 1) or auditory stimuli (study 2), respectively, and participants were prompted to compare the stimuli on a response device with Likert-type response options indicating more, less, or equally painful (study 1) or unpleasant (study 2) stimuli. If the rating showed a deviation, the intensity of exteroceptive stimuli was successively adjusted until ratings indicated equal perception at least twice consecutively. Of note, stimulus durations for interoceptive and exteroceptive US were adjusted for each individual, aiming at matched durations of ascending and plateau phases of US stimulation (20 s study 1; 14 s study 2). For additional details, see Supplementary Methods.
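The matching logic described above (adjust the exteroceptive intensity until it is rated as equal to the visceral anchor on two consecutive comparisons) can be illustrated with a short sketch. This is not the procedure code used in the studies: the function name `match_intensity`, the fixed step size, the starting value, and the simulated participant are all hypothetical and serve only to make the stopping rule concrete.

```python
from typing import Callable


def match_intensity(
    rate: Callable[[float], str],
    start: float,
    step: float,
    max_trials: int = 50,
) -> float:
    """Adjust an exteroceptive intensity until it is rated 'equal' to the
    visceral anchor on two consecutive comparisons.

    rate(intensity) returns 'more', 'less', or 'equal', describing the
    exteroceptive stimulus relative to the visceral anchor.
    """
    intensity = start
    consecutive_equal = 0
    for _ in range(max_trials):
        response = rate(intensity)
        if response == "equal":
            consecutive_equal += 1
            if consecutive_equal == 2:
                return intensity
        else:
            consecutive_equal = 0
            # Lower the intensity if rated 'more', raise it if rated 'less'.
            intensity += -step if response == "more" else step
    raise RuntimeError("no match found within max_trials")


def simulated_rater(intensity: float, target: float = 45.0, tol: float = 0.25) -> str:
    """Hypothetical participant who perceives equality within +/- tol of target."""
    if intensity > target + tol:
        return "more"
    if intensity < target - tol:
        return "less"
    return "equal"


# Hypothetical thermode example (values in degrees Celsius).
matched = match_intensity(simulated_rater, start=43.0, step=0.5)
```

A real participant's responses are noisy, so an actual implementation would likely need a smaller or adaptive step and an upper safety limit on intensity; neither is modelled here.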
Note that after matching, in study 1 a short adaptation phase was accomplished in the MR scanner to accommodate for possible habituation effects previously observed for thermal pain stimuli 4 . This involved a short series of unsignaled US presentations (i.e. five visceral and five heat pain in pseudorandomized order), followed by another matching procedure when necessary. Supplementary control analysis of movement data (linear and degree movement) was accomplished for the unsignaled US delivered in the habituation phase conducted prior to the acquisition, in order to explore possible pain-modality-specific movements (e.g. due to gasping) potentially confounding differential neural activation in subsequent experimental phases ( Supplementary Fig. S3).
Based on these careful matching procedures, the following stimulus intensities for implementation during acquisition and reinstatement were identified: For US VISC distension pressures, 34.7 ± 1.6 mmHg in study 1, 35.9 ± 1.8 mmHg in study 2; for US SOM thermode temperature, 45.1 ± 0.3°C; for US AUD loudness, 94.74 ± 1.18 dB SPL (range: 89-108 dB SPL), all in line with our earlier results involving the application of the same pain 4,86 or auditory stimuli 47,48 .
Experimental phases. During acquisition, both studies involved three distinct conditioned stimuli (CS), which were contingently paired with the interoceptive US (CS + VISC ) and one exteroceptive US (CS + SOM or CS + AUD , respectively), or remained unpaired (CS − ). Visual geometric symbols served as CS, and allocation of a specific CS symbol to a specific US (US VISC , US SOM , or US AUD ) or designation as CS − was counterbalanced across participants. Within each study, participants were pseudo-randomly assigned to a different order sequence of CS-US pairings to avoid potential sequence effects. The programming of pairings aimed for an essentially pseudorandomized order, but avoided more than two successive pairings of one modality, and ensured that sequences alternatingly started with an interoceptive or exteroceptive CS-US pairing. All acquisition sequences were also balanced for the number of CS + and CS − presentations. The number of CS presentations and reinforcement schedules were similar in the two studies (study 1: 10 presentations per CS, 8 CS-US pairings, 80% reinforcement; study 2: 12 presentations per CS, 10 CS-US pairings, 83% reinforcement). All CS + were presented 6-12 s before US, with CS and US co-terminating. Inter-stimulus intervals consisted of a black screen with a white frame (durations: 5-8 s).
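As an illustration of the sequence constraints and reinforcement schedules described above, the following sketch generates a constrained pseudorandom trial order by rejection sampling. It is an assumption-laden simplification, not the programming used in the studies: the labels, the function names, and the choice to limit runs of any trial type (rather than of reinforced pairings only) are hypothetical.

```python
import random


def max_run(seq):
    """Length of the longest run of identical consecutive labels."""
    longest = current = 1
    for prev, cur in zip(seq, seq[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest


def make_sequence(n_per_cs, first_pairing, seed, max_attempts=100_000):
    """Shuffle trials until no label repeats more than twice in a row and
    the first CS+ trial belongs to the requested modality."""
    rng = random.Random(seed)
    trials = ["CS+VISC"] * n_per_cs + ["CS+EXT"] * n_per_cs + ["CS-"] * n_per_cs
    for _ in range(max_attempts):
        rng.shuffle(trials)
        first_cs_plus = next(t for t in trials if t != "CS-")
        if max_run(trials) <= 2 and first_cs_plus == first_pairing:
            return list(trials)
    raise RuntimeError("no valid sequence found")


# Reinforcement rates implied by the reported schedules.
reinforcement = {
    "study 1": 8 / 10,   # 8 pairings in 10 CS+ presentations
    "study 2": 10 / 12,  # 10 pairings in 12 CS+ presentations
}

# Hypothetical sequence in the style of study 1, starting with an interoceptive pairing.
sequence = make_sequence(n_per_cs=10, first_pairing="CS+VISC", seed=1)
```

Alternating `first_pairing` between participants would reproduce the alternating start modality mentioned in the text.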
During extinction and reinstatement-test phases, CS were presented without any US. Given earlier evidence of rapid extinction for CS-US VISC associations 20,65 and to ensure tolerability of total scanning time for participants in study 1 (all phases accomplished consecutively), the number of extinction and reinstatement-test trials, respectively, was lower (5 presentations per CS) than in study 2 (12 presentations per CS). Subsequent to the extinction phase, reinstatement procedures involving the unsignaled and unexpected re-exposure to the US were accomplished, followed by a reinstatement-test phase, consisting of the same number of CS presentations as during extinction in pseudorandomized order. Given a lack of human reinstatement studies involving multiple US, and unresolved methodological challenges in the field 38 , we implemented different reinstatement procedures in an effort to provide procedure-specific, yet complementary data. In study 1, participants were pseudo-randomly assigned to subgroups undergoing a reinstatement procedure with either interoceptive US alone (4 US VISC , N = 22) or the exteroceptive US alone (4 US SOM , N = 20). By doing so, we aimed to test for reinstatement effects after unexpected re-exposure to threat from one modality (single threat reinstatement) within each reinstatement subgroup. In study 2, all participants (N = 23) underwent a reinstatement procedure with both interoceptive and exteroceptive US (3 US VISC , 3 US AUD ), aiming to test for reinstatement effects after unexpected re-exposure to threats from multiple modalities (multiple threat reinstatement). Note that the number of unexpected US presentations was chosen based on work in the field 38 and our own earlier studies 20,77,104 . All US intensities and durations implemented as part of reinstatement procedures were identical to those applied during acquisition.
Behavioural measures. For the purposes of concise and parallelised US- and CS-related behavioural and neural analyses across different sensory modalities in two independent studies, we focussed our behavioural data analysis on unpleasantness ratings as a clinically-relevant indicator of emotional valence. Emotional valence is relevant to all types of threat, shapes the perception of aversive stimuli, including pain 105 , and drives threat-related behaviours like approach and avoidance 106 . It is highly relevant to the specificity of visceral pain 26,42 , and sensitive to modulation by placebo/nocebo mechanisms 8,10,107,108 . Prior pain-related conditioning studies from our own group (reviewed in refs. 8,109 ) and in the broader fear conditioning literature support the notion that conditioned changes in cue valence constitute a sensitive and relevant behavioural measure capturing the formation, as well as the extinction and return of fear responses in healthy adults 110 and clinical populations 106 .
Fig. 4 Schematic overview of study designs. All participants in studies 1 and 2 underwent acquisition (ACQ), extinction (EXT), and reinstatement-test (RST-TEST) phases. As for unconditioned stimuli (US), visceral pain (US VISC ) and either equally painful somatic pain (study 1, US SOM ), or equally unpleasant auditory stimuli (study 2, US AUD ) were implemented during acquisition (ACQ) and reinstatement (RST). As conditioned stimuli (CS), distinct visual geometrical symbols were paired with US (CS + VISC ; CS + SOM ; CS + AUD ) or were presented without US (CS − ) during acquisition (differential delay conditioning). All CS were presented without US during EXT and RST-TEST. RST procedures involved unsignalled US from one modality ('single threat reinstatement' in study 1: US VISC in one subgroup; US SOM in another subgroup) or from both modalities (multiple threat reinstatement in study 2: US VISC and US AUD in all participants). During all phases, functional magnetic resonance imaging (fMRI) was accomplished to assess shared and differential CS- and US-induced neural activation in regions of interest. Before and after each phase, behavioural measures were acquired with visual analogue scales (VAS).
All ratings were accomplished on digitised VAS in the scanner using an MRI-compatible hand-held fibre optic response system (LUMItouch™, Photon Control Inc., Burnaby, BC, Canada) before and after learning phases (for specific assessment time points, see Fig. 4; note that in study 2, an additional VAS rating was accomplished mid-extinction (after 6 trials), which we report in Supplementary Table S7). In study 1, VAS anchors were labelled 'very pleasant' (−100 mm) and 'very unpleasant' (+100 mm), with the word 'neutral' (0 mm) marked in the middle of the digitised VAS, as accomplished in our previous conditioning work involving painful US 20,21,65,77,86,111 . In study 2, VAS anchors were labelled 'not at all unpleasant' (0 mm) and 'very unpleasant' (100 mm), consistent with our prior work across sensory modalities 47,48 . For the purposes of this report and in light of differing scales, we exclusively analysed differential CS valence, computed as individual delta (Δ) scores for each CS + relative to the CS − for each learning phase and within each study group. This allows phase-specific comparisons of ΔCS + VISC vs. ΔCS + SOM in study 1 and ΔCS + VISC vs. ΔCS + AUD in study 2, in keeping with contrasts computed for brain imaging analyses (see below). Note that we additionally acquired perceived intensity of US VISC and US SOM in study 1; for findings dedicated to elucidating the contributions of intensity versus unpleasantness in the context of visceral pain specificity, see our earlier work 4,26 and Supplementary analyses herein using these ratings as a covariate of no interest for fMRI data analyses (Supplementary Tables S1-2, 9-10-S9).
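The delta scores described above amount to a within-participant subtraction of the CS − rating from each CS + rating. A minimal sketch follows; the function name `delta_scores` and all rating values are hypothetical and are not data from either study.

```python
def delta_scores(cs_plus, cs_minus):
    """Differential CS valence per participant: CS+ rating minus CS- rating."""
    if len(cs_plus) != len(cs_minus):
        raise ValueError("ratings must cover the same participants")
    return [plus - minus for plus, minus in zip(cs_plus, cs_minus)]


# Hypothetical post-acquisition valence ratings (mm) for three participants,
# on a study 1 style scale from -100 (very pleasant) to +100 (very unpleasant).
cs_plus_visc = [40.0, 55.0, 30.0]
cs_plus_som = [20.0, 35.0, 25.0]
cs_minus = [-10.0, 5.0, 0.0]

delta_visc = delta_scores(cs_plus_visc, cs_minus)
delta_som = delta_scores(cs_plus_som, cs_minus)
```

Because both ratings of a participant are taken on the same scale, any constant offset shared by the two ratings cancels in the subtraction; differences in scale range between studies do not, which is consistent with computing and comparing the scores within each study group only.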
To elucidate cognitive awareness of the specific CS-US pairings for each phase, we report contingency awareness as a secondary behavioural measure. To this end, at the conclusion of each experimental phase, for each CS a VAS with ends labelled 'never' (0 mm) and 'always' (100 mm) assessed the perceived probability (0-100%) of a US following a specific CS, as previously described 20,77,104 . Note that we herein report contingency awareness with a focus on differences between modalities. Statistical analyses of the accuracy of CS-US associations require more complex computations (for an approach, see ref. 112 ), which is beyond the scope herein.
Statistical analyses and reproducibility of behavioural data. Statistical analyses of behavioural data were accomplished separately for each study using IBM SPSS Statistics for Windows, version 20 (IBM Corp., Armonk, N.Y., USA). For ΔCS valence, 2 × 2 repeated-measures analyses of variance (rmANOVA) with the factors time (pre, post) and modality (interoceptive, exteroceptive) were computed for each experimental phase, applying the Greenhouse-Geisser correction when the assumption of sphericity was violated. Given our hypotheses and to ensure readability, we provide statistical details on time × modality interaction effects in the main manuscript; full rmANOVA results including all main and interaction effects are given in Supplementary Tables (Supplementary Tables S4 and S5). Two-tailed paired t-tests were computed as planned comparisons for two purposes: (1) To test for hypothesis-driven differences between modalities in ΔCS valence, US valence, and contingency awareness at specific time points within studies and experimental groups (all results reported in the main manuscript; further details in Supplementary Tables S6 and S7); and (2) to explore differences in ΔCS valence within modalities across time points (PRE-POST; full results reported in Supplementary Tables S6 and S7). Only Bonferroni-corrected P-values are reported within the main manuscript; full uncorrected results of all paired t-tests are provided in Supplementary Tables S6 and S7. For rmANOVA, effect sizes are reported as partial eta squared (ηp²); for t-tests, effect sizes are provided based on Cohen's d for correlated designs 113 . Correlational analyses were accomplished using Pearson's r. Results are reported as mean ± standard error of the mean (SEM).
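For readers who wish to reproduce the descriptive and planned-comparison statistics outside SPSS, the quantities named above can be sketched with the Python standard library. Several points are assumptions: the function names are hypothetical, the example values are invented, and the effect size shown is d_z (mean difference divided by the standard deviation of the differences), which is only one common variant of Cohen's d for paired data and may differ from the formula in ref. 113.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean, using the sample standard deviation."""
    return stdev(values) / sqrt(len(values))


def bonferroni(p_values):
    """Bonferroni-corrected P-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def paired_t_and_dz(x, y):
    """Paired t statistic and d_z effect size for two matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)
    t = mean_diff / (sd_diff / sqrt(n))
    d_z = mean_diff / sd_diff
    return t, d_z


# Hypothetical delta scores (mm) for four participants.
delta_interoceptive = [3.0, 5.0, 7.0, 9.0]
delta_exteroceptive = [1.0, 2.0, 3.0, 4.0]

t_value, d_z = paired_t_and_dz(delta_interoceptive, delta_exteroceptive)
corrected = bonferroni([0.01, 0.04, 0.5])
summary = (mean(delta_interoceptive), sem(delta_interoceptive))  # mean and SEM
```

The two-tailed P-value for `t_value` would come from a t distribution with n − 1 degrees of freedom, which the standard library does not provide; a statistics package would be needed for that step.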
All data are expected to be reproducible given the same settings and procedures as described herein.
Functional images were analysed with SPM software (SPM12, Wellcome Trust Centre for Neuroimaging, UCL, London, UK) implemented in Matlab (R2016b, Mathworks Inc., Sherborn, MA, USA). A standard realignment procedure as implemented in SPM12 was performed for the estimation of six parameters for translation (linear: x, y, z (mm)) and for rotation (degree: pitch, roll, yaw (°)) to describe the rigid body transformation between each image and a reference image.
Subsequently, functional images were co-registered to individual T1-weighted structural images used as reference images, with the origin set to the anterior commissure. Functional images were normalised to Montreal Neurological Institute (MNI) space using a standardised International Consortium for Brain Mapping (ICBM) template for European brains as implemented in SPM12, and smoothed using an isotropic Gaussian kernel of 8 mm. To correct for low-frequency drifts, a temporal high-pass filter with a cut-off set at 128 s was implemented. Serial autocorrelations were taken into consideration by means of an autoregressive model first-order correction.
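To make the two preprocessing parameters above more tangible, the following sketch converts them into equivalent quantities. It assumes that the 8 mm kernel size refers to the full width at half maximum (FWHM), as is conventional for SPM smoothing kernels; the function name is hypothetical.

```python
from math import log, sqrt


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation of a Gaussian kernel with the given FWHM."""
    return fwhm_mm / (2.0 * sqrt(2.0 * log(2.0)))


sigma_mm = fwhm_to_sigma(8.0)   # roughly 3.4 mm
cutoff_hz = 1.0 / 128.0         # a 128 s period corresponds to about 0.0078 Hz
```

In other words, signal fluctuations slower than about 0.0078 Hz are attenuated by the high-pass filter, and each voxel is spatially blurred with a Gaussian of roughly 3.4 mm standard deviation.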
First-level analyses were performed using a general linear model applied to the EPI images. The time series of each voxel was fitted with a corresponding task regressor that modelled a box car convolved with a canonical hemodynamic response function (HRF). As regressors, CS type (CS + VISC ; CS + SOM /CS + AUD ; CS − ) and US modality (US VISC ; US SOM /US AUD , only in analyses of acquisition phases) were included. For analyses of CS-induced activations, durations were used exactly as implemented in the experiments (jittered between 6 and 12 s before US presentation), for analyses of US-induced activations, ascending and plateau phases of US stimulation were included in analyses (20 s in study 1, 14 s in study 2). Six realignment parameters for translation and rotation were additionally implemented as multiple regressors for motion correction. After model estimation, the following first-level contrasts and respective reverse contrasts were computed for analyses of differential CS-related and US-related neural responses separately for each study group: CS + VISC > CS − , CS + SOM > CS − , US VISC > US SOM for study 1; CS + VISC > CS − , CS + AUD > CS − , US VISC > US AUD for study 2. CS contrasts were computed for each phase, US contrasts only for the acquisition phase.
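The construction of a task regressor as a boxcar convolved with a haemodynamic response function can be sketched as follows. This is an illustration only and not the SPM12 implementation: the double-gamma parameters (peak shape 6, undershoot shape 16, ratio 6) are assumed to approximate the canonical HRF, while the onsets, durations, and the 0.5 s sampling step are hypothetical.

```python
from math import exp, gamma


def gamma_pdf(t, shape, rate=1.0):
    """Gamma probability density, zero for non-positive t."""
    if t <= 0:
        return 0.0
    return rate ** shape * t ** (shape - 1) * exp(-rate * t) / gamma(shape)


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=6.0):
    """Double-gamma response: positive peak minus a scaled late undershoot."""
    return gamma_pdf(t, peak_shape) - gamma_pdf(t, undershoot_shape) / ratio


def boxcar(onsets, durations, n_samples, dt):
    """1.0 while a stimulus is on, 0.0 otherwise, sampled every dt seconds."""
    box = [0.0] * n_samples
    for onset, duration in zip(onsets, durations):
        start = int(round(onset / dt))
        stop = int(round((onset + duration) / dt))
        for i in range(start, min(stop, n_samples)):
            box[i] = 1.0
    return box


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


dt = 0.5  # hypothetical sampling step in seconds
kernel = [double_gamma_hrf(i * dt) * dt for i in range(int(32 / dt))]
# Two hypothetical CS presentations with jittered durations of 8 s and 12 s.
box = boxcar(onsets=[10.0, 60.0], durations=[8.0, 12.0], n_samples=240, dt=dt)
regressor = convolve(box, kernel)
```

In an actual analysis the regressor would be resampled at the acquisition times of the volumes and entered into the general linear model together with the six realignment parameters.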
On the second level, for analyses of US-induced neural activation, one-sample t-tests based on these differential first-level contrasts and paired t-tests were calculated. For analyses of CS-induced differential neural activation, paired t-tests were computed for each experimental phase to compare ΔCS + VISC versus ΔCS + SOM (study 1) or ΔCS + AUD (study 2). Further, extending our earlier findings revealing not only distinct but also shared neural activations for US 4 as well as CS 26 across modalities, conjunction analyses using first-level contrasts were carried out to identify joint activations (i.e. CS + VISC > CS − ∩ CS + SOM > CS − , US VISC ∩ US SOM for study 1; CS + VISC > CS − ∩ CS + AUD > CS − , US VISC ∩ US AUD for study 2). Conjunction analyses were computed (a) using the minimum statistic to the conjunction null to test for shared activation within all tested subjects, and (b) using the minimum statistics to the global null to test for shared activation within some subjects 114,115 . For correlational analyses exploring associations between differential CS and differential US activation in specific ROI (provided in Supplementary analyses), parameter estimates were extracted for peak-voxels in significant regions of interest (ROIs) as identified by one-sample t-tests.
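The minimum statistic underlying both conjunction tests can be sketched in a few lines. The example below is a toy illustration with invented t values and an arbitrary threshold, not an FWE-corrected analysis; the function names are hypothetical.

```python
def minimum_statistic(t_map_a, t_map_b):
    """Voxel-wise minimum of two t maps of equal length."""
    if len(t_map_a) != len(t_map_b):
        raise ValueError("maps must cover the same voxels")
    return [min(a, b) for a, b in zip(t_map_a, t_map_b)]


def conjunction(t_map_a, t_map_b, threshold):
    """True where the minimum statistic exceeds the chosen threshold."""
    return [m > threshold for m in minimum_statistic(t_map_a, t_map_b)]


# Hypothetical t values at four voxels for two differential contrasts.
t_visc = [4.2, 1.0, 3.5, 5.0]  # e.g. CS+ VISC > CS-
t_ext = [3.9, 4.5, 2.0, 5.5]   # e.g. CS+ SOM > CS- or CS+ AUD > CS-

min_map = minimum_statistic(t_visc, t_ext)
shared = conjunction(t_visc, t_ext, threshold=3.1)
```

The two tests named in the text use the same minimum statistic but compare it against different reference thresholds: under the conjunction null the threshold is that of a single contrast, whereas under the global null it is more lenient.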
All analyses focused on a priori defined ROIs of the salience and extinction networks 26,51,[55][56][57]66,68 , including the insula (anterior, aINS; posterior, pINS), subregions of the cingulate cortex (midcingulate cortex, MCC; dorsal anterior cingulate cortex, dACC), amygdala, hippocampus, and ventromedial prefrontal cortex (vmPFC). All ROI analyses were carried out using unilateral anatomical templates constructed from the WFU Pick Atlas (Version 2.5.2), as implemented in SPM12. Segmentation of the insula (aINS, pINS) and cingulate cortex (dACC, MCC) was accomplished with masks based on the previous literature 116 within the borders of the Wake Forest University (WFU) Pick Atlas. For all reported ROI analyses, family-wise-error (FWE) correction for multiple testing was used with statistical significance set at P FWE < 0.05, and coordinates refer to the MNI space. Supplementary whole-brain analyses (uncorrected P < 0.001) were additionally carried out (Tables 1, 3-6; Supplementary Tables S1-3, S8-12; Supplementary Figs. S4-S7 for visualisation). Note that the results presented within the main manuscript all focus on ROI analyses unless explicitly specified otherwise.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All fMRI data analysed for the current study are available in the neurovault repository (https://neurovault.org/collections/GPPGVZAT/). Behavioural and SCR data are provided in the main manuscript or its Supplementary Information; additional data and information are available upon request.

Code availability
No custom code or mathematical algorithms were used in the study. All software used for statistical analyses has been declared in the manuscript.