arising from M.E. Hoeppli et al. Nature Communications https://doi.org/10.1038/s41467-022-31039-3 (2022)

In a recent article, Hoeppli et al. 1 reported that sociodemographic and psychological factors were not associated with interindividual differences in reported pain intensity. In addition, these interindividual differences could not be detected from thermal pain-evoked brain activity measured with functional magnetic resonance imaging (fMRI). Their comprehensive analyses provided convincing evidence for the null findings, but here we take another look at their conclusions by analyzing a new large-scale fMRI dataset involving thermal pain (N = 124) and re-analyzing their behavioral and fMRI data (N = 101). Our main findings are as follows. First, a multiple regression model incorporating all available sociodemographic and psychological measures significantly predicted the interindividual differences in reported pain intensity. The key to achieving a significant prediction was to include multiple individual difference measures in a single model. Second, in a new fMRI dataset of 124 participants, we identified brain regions and a multivariate pattern-based predictive model that were significantly correlated with the interindividual differences in reported pain intensity. Our results, along with Hoeppli et al.’s findings, highlight the challenge of predicting interindividual differences in pain but also suggest that it is not an impossible task.

Developing neuroimaging biomarkers of pain has the potential to improve pain assessment and management by providing objective measures of the subjective experience of pain2. However, it remains challenging to develop such biomarkers due to the substantial interindividual variability in brain systems for pain processing3 and pain-expressive behaviors4. This interindividual variability is influenced by multiple factors, including biological, psychological, and social ones, requiring systematic investigation of how these multiple factors and components are associated with interindividual variability in pain. In this endeavor, Hoeppli et al.’s recent report could be discouraging to those hoping for progress in brain-based pain biomarker development. The study provided two main conclusions, one for the sociodemographic and psychological factors and the other for the fMRI signal. They showed that both data types failed to explain interindividual differences in pain sensitivity and reported pain intensity.

First, they reported that individual differences in pain could not be explained by sociodemographic and psychological factors, which is somewhat inconsistent with what has been known about their effects on pain5. For example, previous studies have shown that pain experience can be influenced by age6, ethnicity7, sex8, and emotional states9, among many others. Potentially, the inconsistency may come from the fact that they did not consider the complex interactions among the sociodemographic and psychological factors. Though they examined the relationship between pain ratings and each of those factors separately, it is widely recognized that sociodemographic and psychological factors are intercorrelated, and their influences on pain are likely to come from their complex interactions. For example, a previous study reported that the ethnic differences in pain were mediated by perceived discrimination10, and another study showed that psychological and personality factors identified to be important for chronic pain were also associated with the socioeconomic status of patients with chronic pain11. These highlight that sociodemographic and psychological factors interact with each other to influence pain. Thus, to better understand the effects of sociodemographic and psychological factors on interindividual differences in pain, it is crucial to test their combined effects, for example, by incorporating them into a single model12.

To test this idea, we reanalyzed the behavioral data from Hoeppli et al. to examine the complex interactions between sociodemographic and psychological factors and their combined effects on interindividual differences in self-reported pain intensity. Specifically, we first examined the correlations between the sociodemographic and psychological measures and then performed a multiple regression with cross-validation using the sociodemographic and psychological measures as independent variables. The result showed that the sociodemographic and psychological measures were highly inter-correlated; 20% of all pairs (11 out of 55 pairs) showed significant correlations at q < 0.05, false discovery rate (FDR) corrected (Fig. 1a). This supports the idea that the sociodemographic and psychological variables are highly inter-connected. When we examined the combined effects of the sociodemographic and psychological measures on pain ratings using multiple regression with leave-one-subject-out cross-validation (LOSO-CV), the prediction-outcome correlation was significant, r = 0.260, p = 0.026. To account for potential issues related to multicollinearity, we also tested a principal component regression with LOSO-CV, and the results were comparable, r = 0.270, p = 0.021. Importantly, when we reduced the number of predictors to two or one, the cross-validated correlations were reduced to negative values (mean r = −0.005 and −0.1608 for two variables and one variable; Fig. 1b). Overall, different from Hoeppli et al.’s report, our findings show that the sociodemographic and psychological factors can explain individual differences in pain ratings, but only when their combined effects were accounted for.
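
The cross-validated multiple regression described above can be sketched as follows. This is a minimal illustration on simulated data, not the pipeline used for the reanalysis: the function names, the simulated effect sizes, and the use of ordinary least squares via NumPy are all assumptions made for the example.

```python
import numpy as np


def loso_cv_predict(X, y):
    """Leave-one-subject-out predictions from ordinary least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        # Intercept plus all measures, fitted on the n - 1 training subjects only
        A = np.column_stack([np.ones(train.sum()), X[train]])
        beta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        # Predict the held-out subject from the training-set coefficients
        preds[i] = beta[0] + X[i] @ beta[1:]
    return preds


def prediction_outcome_r(y, preds):
    """Pearson correlation between actual and cross-validated predicted values."""
    return float(np.corrcoef(y, preds)[0, 1])


def simulate_measures(n_subjects=73, n_measures=11, noise_sd=1.0, seed=0):
    """Hypothetical data: every measure carries a small, equal effect."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_subjects, n_measures))
    y = X @ np.full(n_measures, 0.3) + noise_sd * rng.standard_normal(n_subjects)
    return X, y
```

One property worth keeping in mind when reading cross-validated correlations: with no informative predictor at all (an intercept-only model), each leave-one-out prediction is the mean of the other subjects, which is perfectly anti-correlated with the held-out value. Weak models can therefore yield negative prediction-outcome correlations under LOSO-CV without this indicating a reversed effect.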

Fig. 1: Reanalysis of data from Hoeppli et al.

We reanalyzed behavioral (N = 73) and fMRI data (N = 101) of Hoeppli et al. For the behavioral data analysis, we included participants’ data with no missing values (N = 73). Details of the reanalysis methods can be found in the Methods section. a The correlations between the demographic and psychological measures showed a large number of significant correlations (11 out of 55 pairs) after correcting for multiple comparisons with false discovery rate (FDR) q < 0.05. b Results of multiple regression analysis with leave-one-subject-out cross-validation (LOSO-CV). (Left) We examined the cross-validated prediction performance of linear regression models with varied numbers of independent variables—one, two, and all eleven variables. The cross-validated prediction-outcome correlations were negative for the model with a single independent variable, mean ± SD of r = −0.161 ± 0.334, two-tailed, and for the model with two independent variables, r = −0.005 ± 0.176, two-tailed. However, the multiple regression model with all 11 variables showed a significant cross-validated prediction-outcome correlation, r = 0.260, p = 0.0263 (uncorrected), two-tailed. Each dot represents the mean prediction-outcome correlation, and the error bar represents its standard deviation. (right) The scatter plot shows the actual versus cross-validated predicted pain ratings based on the multiple regression model with all 11 independent variables. The error band represents a 95% confidence interval of the regression line. c (left) The brain activation map illustrates the main effect of the high-intensity heat stimulation (48 °C), thresholded at FDR q < 0.05, two-tailed. (right) No brain regions survived FDR correction (q < 0.05) for multiple comparisons in the analysis of the relationship between average pain ratings and brain activity in response to the high-intensity heat stimulation (48 °C). We show the brain activation map with an uncorrected threshold at p < 0.05 for additional reference. 
L and R indicate the left and right hemispheres, respectively. d We performed Lasso-regularized Principal Component Regression (Lasso-PCR) to predict individual differences in average pain intensity ratings. (left) The scatter plot shows the actual versus predicted pain ratings with LOSO-CV. The prediction-outcome correlation was not significant, r = 0.1764, p = 0.0776, two-tailed. (right) The map shows the voxels that reliably contributed to the prediction of mean pain ratings based on bootstrap tests (thresholded at uncorrected p < 0.001, two-tailed). Though the weight map was distinct from the predictive map based on our own data (spatial similarity, r = 0.0247), the periaqueductal gray (PAG) was consistently identified as an important region across both datasets. e We also tested an a priori fMRI multivariate pattern-based marker of pain, Neurologic Pain Signature (NPS), and the results showed a non-significant correlation between NPS responses and actual pain ratings, r = 0.0017, p = 0.9869, two-tailed. Error bands denote the 95% confidence interval of the regression lines. Source data are provided as a Source Data file.

Second, Hoeppli et al. reported that the individual differences in pain ratings could not be detected by fMRI signal, which is also somewhat inconsistent with what has been known. For example, two recent studies have shown that multivariate whole-brain functional connectivity patterns can predict the interindividual differences in pain of patients with chronic pain13 or healthy participants14. One of the functional connectivity models also showed a significant association with the number of pain sites in a large-scale dataset in an independent study15. In addition, another study reported that multivariate functional connectivity patterns related to pain-predictive psychological traits were associated with the socioeconomic status of patients with chronic pain11. Note that these studies cannot provide direct comparisons because, unlike Hoeppli et al., which utilized fMRI activation signals, these studies employed fMRI connectivity measures. However, a previous study showed that the striatal fMRI activity could explain increased pain ratings in African Americans compared to Hispanics and non-Hispanic whites10, which was also associated with perceived discrimination. All these findings suggest that fMRI signals can explain interindividual differences in pain to some extent.

To reconcile the discrepancy, we analyzed a large-scale pain fMRI dataset (N = 124) that included a similar number of participants to Hoeppli et al. We aimed to replicate their main findings that there was no brain region or multivariate pattern that could explain interindividual differences in pain ratings. We first examined the distribution of average pain intensity ratings for the high-intensity heat stimuli (47.5 °C). As shown in Fig. 2a, the ratings showed a wide range of distribution, which was consistent with Hoeppli et al.—the averaged pain ratings ranged from “Moderate” to near “Strongest imaginable” on the general Labeled Magnitude Scale (gLMS)16. Then, we analyzed fMRI data using the same methods that Hoeppli et al. used in their study: (1) univariate general linear model, where we included interindividual variations in pain ratings as a covariate, and (2) multivariate lasso-regularized principal component regression (LASSO-PCR) to predict interindividual variations in pain ratings. Though the analysis methods were the same, the results were different. The univariate analysis identified multiple brain regions (Fig. 2b, bottom), including periaqueductal gray (PAG) and supplementary motor area (SMA), significantly correlated with interindividual variations in pain ratings at q < 0.05, FDR corrected. The multivariate predictive modeling with LASSO-PCR also showed a significant prediction-outcome correlation with LOSO-CV, r = 0.252, p = 0.0047 (Fig. 2c). Ventrolateral prefrontal cortex and anterior insula, in addition to the brain regions identified in the univariate analysis, such as the PAG and SMA, appeared to be important contributors to the prediction. In addition, different from Hoeppli et al., the Neurologic Pain Signature (NPS)17, an a priori multivariate pattern-based fMRI marker of pain, was able to explain the individual differences in pain with a significant correlation, r = 0.202, p = 0.0243 (Fig. 2d). 
Although the effect size is small, there is a qualitative difference between a model that explains minimal variance and one that explains none at all. The former can contribute to a composite model capable of accounting for significant variance, as exemplified in boosting algorithms in machine learning, thereby potentially offering clinical utility. Overall, our results suggest that the brain activity patterns in both univariate and multivariate analyses can capture the interindividual differences in pain ratings.

Fig. 2: Brain activation patterns correlated with interindividual differences in pain ratings during high-intensity heat stimulation.

We replicated a series of analyses performed by Hoeppli et al. using a new fMRI dataset (N = 124), which included a comparable number of participants to Hoeppli et al. a The plot shows the distribution of average pain ratings for high-intensity heat stimulation (47.5 °C, the number of repeats per participant = 16). The error bars represent the standard error of the mean for each individual’s pain ratings. The ratings are sorted in ascending order. Dashed horizontal lines indicate anchors of the general Labeled Magnitude Scale (gLMS)16. b Univariate analysis results with a general linear model (GLM) using individual differences in averaged pain ratings as a covariate. The input images were the beta coefficient maps for high-intensity heat stimulation. (top) The brain activation map shows the main effects of high-intensity heat stimulation (47.5 °C). L and R indicate the left and right hemispheres, respectively. (bottom) The brain activation map shows regions with significant correlations between average pain ratings and brain activation associated with high-intensity heat stimulation (see Supplementary Table 1 for the list of suprathreshold regions). The brain maps were thresholded with FDR q < 0.05, two-tailed. c Multivariate analysis results with Lasso-regularized Principal Component Regression (Lasso-PCR) to predict interindividual variations in pain ratings. (left) The scatter plot shows the actual versus predicted pain ratings with leave-one-subject-out cross-validation (LOSO-CV) based on the Lasso-PCR model. The prediction-outcome correlation was 0.252, p = 0.0047, two-tailed. (right) The map shows the voxels that reliably contributed to the prediction of mean pain ratings based on bootstrap tests (thresholded at uncorrected p < 0.001, two-tailed; Supplementary Table 2). Thresholding was performed for the purpose of display; all weights were used in the prediction. d We tested the NPS17.
The results showed a significant prediction-outcome correlation between the NPS response and the individual differences in pain ratings, r = 0.202, p = 0.0243, two-tailed. The scatter plot shows the actual pain ratings versus NPS responses. NPS response was calculated with the dot-product between the NPS weights and brain response to high-intensity heat stimulation. Error bands represent the 95% confidence interval of the regression lines. Source data are provided as a Source Data file.

There can be many reasons for the discrepancy between our results and those of Hoeppli et al., including the sample differences (e.g., sociodemographic background and psychological status) and the experimental and analysis factors, such as MR scanner, fMRI sequence, experimental design, procedure, rating scale, preprocessing steps, etc. To assess whether this difference could be attributed to the differences in the analysis tools and pipelines, we reanalyzed the fMRI data from Hoeppli et al.—the authors generously shared their preprocessed data with us for this article, allowing us to re-analyze the data with our own tools (for details, see “Methods”). The results were largely consistent with those reported in Hoeppli et al. 1 (Fig. 1c–e), suggesting that the null results observed in Hoeppli et al. cannot be attributed to differences in analysis tools or small changes in the analysis pipeline.

One plausible explanation for the discrepancy may be the limited sample size, particularly given the small to medium effect size, which would require a substantially larger sample to detect such effects reliably18. Thus, determining the predictive accuracy of fMRI-based pain models will require the analysis of additional large-scale samples or repeated validation across multiple datasets. In addition, incorporating intricate interactions among sociodemographic, psychological, and neurobiological factors—collectively referred to as biopsychosocial factors—into the analysis may be crucial. This idea is supported by a recent study showing that the brain-phenotype models reliably failed in individuals who deviated from stereotypical profiles19, underscoring the importance of considering sociodemographic and psychological factors to make generalizable predictive models.

Thus, although Hoeppli et al.’s null findings may initially appear discouraging, they offer deep insights into the intricate interplay among a multitude of biological, psychological, and sociodemographic factors that contribute to the interindividual variability of pain. Overall, these findings shed light on how to approach the understanding and modeling of the multifaceted nature of pain.

Methods

Participants

A total of 137 healthy and right-handed participants were recruited from the Suwon area in South Korea. Eligibility was assessed through online questionnaires, including pain and MRI safety-screening questions. We excluded participants with psychiatric, physiological, or pain disorders, neurological conditions, or MRI contraindications. Additionally, thirteen participants were excluded from the analysis due to technical issues with the thermal stimulus equipment, voluntary requests to quit the scanning session, or the presence of abnormal brain structures (e.g., arachnoid cysts). Thus, we included the remaining 124 participants in the current study (61 female; mean age = 22.17 years, SD = 2.69 years). We obtained written consent from all participants, who also received financial compensation for their participation. The present study was approved by the Sungkyunkwan University Institutional Review Board. The same dataset was used as an independent test dataset in a previously published study3, which addressed a different research question from the current study. The sex information of participants was collected by self-report. We did not perform a sex-based analysis.

fMRI experimental paradigm

In the MRI scanner, participants experienced contact heat stimuli on their left forearm and rated pain intensity after each thermal stimulation. There were eight heat pain task runs with twelve trials per run, resulting in 96 trials in total and 16 trials per participant for each of the six temperatures. Each trial consisted of a series of events: (1) watching a 20-second movie clip (pre-heat), (2) experiencing thermal stimulation (12 s; ramp-up: 2.5 s; plateau: 7 s; ramp-down: 2.5 s), and (3) rating pain intensity (5 s). Two of the heat pain task runs did not include the pre-heat movie clip, resulting in 12 trials with thermal stimulation and pain ratings. Participants rated the intensity of pain on a general Labeled Magnitude Scale16 after a randomly jittered interval of 3–5 seconds following the stimulus offset. The scale ranged from 0 to 1 with anchors of “No sensation” (0), “Weak” (0.061), “Moderate” (0.172), “Strong” (0.354), “Very strong” (0.533), and “Strongest imaginable” (1). We explained how to use the scale with anchors before the experiment, and we removed the anchors during the actual task to prevent potential influences of anchors on ratings. Thermal stimulation was delivered to the volar surface of the left forearm using a Pathways system (Medoc Ltd) with a 16-mm ATS thermode endplate. The temperatures of thermal stimulation ranged from 45 °C to 47.5 °C (in 0.5 °C increments, totaling six intensities) from a baseline of 32 °C. The stimulus order was pseudorandomized. A single stimulus at the highest temperature (47.5 °C) was delivered before each heat pain task run to avoid the initial habituation of the skin site to contact heat. In this study, we only used the data from the highest temperature stimuli (i.e., 47.5 °C). We collected behavioral data using MATLAB (version 2018b, MathWorks) with the Psychophysics Toolbox 3 (http://www.psychtoolbox.org/).

fMRI acquisition and preprocessing

The whole-brain fMRI images and high-resolution T1-weighted structural images were obtained using a 3-Tesla Siemens Prisma scanner with a 64-channel head coil at the Center for Neuroscience Imaging Research (CNIR), Sungkyunkwan University, Suwon, South Korea. We obtained functional echo-planar images (EPI) using the following sequence parameters: TR of 460 ms, TE of 27.20 ms, multiband acceleration factor of 8, field of view of 220 mm, voxel size of 2.7 × 2.7 × 2.7 mm³, and interleaved slice acquisition. The preprocessing of the functional EPI images was performed using Statistical Parametric Mapping 12 (SPM12) and FMRIB Software Library (FSL). To ensure image intensity stability, the initial 18 volumes (approximately 8 s) were removed from each run. Then, the functional EPI images were corrected for motion (i.e., realignment). Distortion caused by magnetic field inhomogeneity was also corrected using FSL’s topup function. Next, the functional EPI images were co-registered and spatially normalized into the Montreal Neurological Institute normative atlas with voxel interpolation at 2 × 2 × 2 mm³. We then smoothed the images with a 5-mm full width at half-maximum (FWHM) kernel. To reduce motion-related artifacts, we applied the Independent Component Analysis-based strategy for Automatic Removal Of Motion Artifacts (ICA-AROMA)20. In addition, we excluded data from some runs based on the following two criteria regarding framewise displacement (FD), which quantifies the volume-to-volume displacement of images: (1) the mean FD of a run exceeding 0.2 mm, and (2) the FD of any volume of a run exceeding 5.0 mm21,22.
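
The two FD-based run exclusion criteria can be expressed as a short check. The sketch below assumes that a run is dropped when either criterion is met and that a per-volume FD series (in mm) has already been computed for each run; the function and variable names are illustrative only.

```python
from statistics import mean

MEAN_FD_LIMIT_MM = 0.2  # criterion (1): mean FD of the run
MAX_FD_LIMIT_MM = 5.0   # criterion (2): FD of any single volume


def run_is_excluded(fd_mm):
    """Return True if the run violates either FD criterion (assumed 'either')."""
    return mean(fd_mm) > MEAN_FD_LIMIT_MM or max(fd_mm) > MAX_FD_LIMIT_MM


def runs_to_keep(fd_by_run):
    """Return the identifiers of runs that pass both criteria.

    fd_by_run maps a run identifier to its list of per-volume FD values in mm.
    """
    return [run for run, fd in fd_by_run.items() if not run_is_excluded(fd)]
```

For example, a run with a low mean FD but a single 6-mm spike would be excluded under criterion (2) even though it passes criterion (1).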

Single-trial fMRI data analysis

We utilized a single-trial design approach to model the brain responses to heat stimulation. In this approach, the response magnitude of each voxel for each trial was estimated using a general linear model (GLM). This model included separate regressors for each pain trial, as in the ‘beta series’ approach23. Additional regressors, event boxcars convolved with a canonical hemodynamic response function, were created to model the periods of pre-stimulus (movie-viewing), anticipation, heat stimulation, and pain rating. Given that we already removed motion-related artifacts through ICA-AROMA during preprocessing, only five principal components of WM and CSF signal and a linear trend were included as nuisance covariates. We used SPM12 with a 180-second high-pass filter to perform the first-level analysis on this design matrix. Subsequently, we calculated variance inflation factors (VIFs) on a trial-by-trial basis. VIFs are a measure of design-induced uncertainty caused by collinearity with nuisance regressors. This step aimed to identify and exclude trials where the estimates could be significantly influenced by artifacts occurring during the trials. Any trials with VIFs exceeding 3 were excluded from the analyses. On average, 0.1371 trials were excluded due to high VIFs, with a standard deviation of 0.7686. Finally, single-trial beta images were obtained and served as input images for predictive modeling. The original number of trials was 96, but because of errors in stimulus delivery, high variance inflation factors (> 3), and runs excluded for high mean FD values, the average number of trials used in the analyses was 91.6 (standard deviation = 10.6).
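
To make the VIF criterion concrete, the sketch below computes a variance inflation factor for each column of a design matrix by regressing that column on all the others, and flags columns above the threshold of 3. It is a generic illustration using NumPy, not the CanlabCore routine; the function names and the choice to regress each column on every other column are assumptions.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF of each column of X given all other columns (plus an intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        # VIF = 1 / (1 - R^2), which equals SS_total / SS_residual
        vifs[j] = ss_tot / ss_res
    return vifs


def flag_high_vif(vifs, threshold=3.0):
    """Indices of regressors whose VIF exceeds the threshold."""
    return np.flatnonzero(np.asarray(vifs) > threshold)
```

A VIF near 1 indicates that a regressor is nearly orthogonal to the rest of the design, whereas a trial regressor that overlaps strongly with nuisance covariates will have a large VIF and an unstable beta estimate.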

Pain intensity rating analysis

We calculated the mean pain ratings for each participant by averaging pain ratings from the trials with the highest temperature stimulus (i.e., 47.5 °C). The average pain intensity was near the “Very Strong” anchor (0.533), indicating that, on average, participants experienced a very strong level of pain for the highest temperature condition. The average number of trials included in this analysis was 15.5806 (standard deviation = 0.9967).
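
As a small illustration of this step, the sketch below averages each participant's ratings and reports the closest gLMS anchor, using the anchor values listed in the experimental paradigm section. The nearest-anchor rule (smallest absolute difference on the 0 to 1 scale) and the function names are assumptions made for the example.

```python
from statistics import mean

# Anchor positions of the general Labeled Magnitude Scale on the 0-1 scale
GLMS_ANCHORS = {
    "No sensation": 0.0,
    "Weak": 0.061,
    "Moderate": 0.172,
    "Strong": 0.354,
    "Very strong": 0.533,
    "Strongest imaginable": 1.0,
}


def nearest_anchor(rating):
    """Label of the anchor closest to a rating (absolute difference)."""
    return min(GLMS_ANCHORS, key=lambda label: abs(GLMS_ANCHORS[label] - rating))


def summarize_ratings(ratings_by_participant):
    """Map each participant to (mean rating, nearest anchor label)."""
    summary = {}
    for participant, ratings in ratings_by_participant.items():
        avg = mean(ratings)
        summary[participant] = (avg, nearest_anchor(avg))
    return summary
```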

Univariate analysis with average pain ratings as a covariate

With the average pain ratings obtained from the previous analysis step, we conducted a GLM analysis on the average beta images for the highest temperature using the average pain ratings as a covariate. The main effect of the 47.5 °C heat stimulation is shown in the top panel of Fig. 2b, and the correlates of the average pain intensity ratings are shown in the bottom panel of Fig. 2b.

Multivariate analysis with the Neurologic Pain Signature (NPS)

We tested an a priori fMRI multivariate pattern-based marker of pain, the Neurologic Pain Signature (NPS)17, to examine whether NPS responses could predict the individual differences in pain intensity ratings. To obtain the NPS response, we calculated the dot-product of the NPS pattern weights and the average beta estimates for the highest temperature stimulation. We then calculated the correlations between the NPS response and the individual differences in the pain intensity ratings (Figs. 1e and 2d).
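
The pattern-response computation amounts to one dot product per participant followed by a correlation across participants. The sketch below shows this with NumPy on arrays that are assumed to be already masked and vectorized to the same voxel order; the names and toy inputs are hypothetical and do not reproduce the actual NPS weights.

```python
import numpy as np


def pattern_response(weights, beta_maps):
    """Dot product of a weight map with each participant's beta map.

    weights:   array of shape (n_voxels,)
    beta_maps: array of shape (n_participants, n_voxels)
    Returns one scalar response per participant.
    """
    weights = np.asarray(weights, dtype=float).ravel()
    beta_maps = np.asarray(beta_maps, dtype=float)
    return beta_maps @ weights


def pearson_r(x, y):
    """Pearson correlation between pattern responses and pain ratings."""
    return float(np.corrcoef(x, y)[0, 1])
```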

Multivariate analysis with LASSO-PCR

To develop an fMRI-based predictive model of the average pain intensity rating, we used lasso-regularized principal component regression (LASSO-PCR) with leave-one-subject-out cross-validation (LOSO-CV). The input features for the modeling were the average beta estimates of the highest temperature condition, and the outcome variable was the average pain intensity rating for the highest temperature stimulation. The cross-validated prediction performance of the predictive model was assessed with a prediction-outcome correlation, which refers to the correlation between the predicted and actual pain intensity ratings. To identify brain voxels that reliably contribute to the prediction, we thresholded the predictive weight map using p-values from a bootstrap test with 5000 iterations.
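
A compact way to see how LASSO-PCR with LOSO-CV fits together is sketched below. It relies on the fact that principal component scores are mutually orthogonal, so the lasso solution on the scores reduces to soft-thresholding the least-squares coefficients. This is an illustrative sketch under that simplification, with a hypothetical fixed penalty `lam`; it is not the CanlabCore implementation, which may select the penalty and estimate the final weights differently.

```python
import numpy as np


def fit_lasso_pcr(X, y, lam):
    """Fit lasso on principal component scores; return voxel weights and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # PCA via SVD: scores are U * s, component loadings are the rows of Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-10 * s.max()  # drop numerically null components
    U, s, Vt = U[:, keep], s[keep], Vt[keep]
    c = (U * s).T @ yc  # inner product of each score column with the outcome
    # Minimizer of 0.5 * ||yc - T b||^2 + lam * ||b||_1 for orthogonal scores T
    b = np.sign(c) * np.maximum(np.abs(c) - lam, 0.0) / s**2
    w = Vt.T @ b  # project component coefficients back to voxel space
    return w, y_mean - x_mean @ w


def loso_cv_lasso_pcr(X, y, lam):
    """Leave-one-subject-out predictions; PCA and lasso are refit in every fold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        w, intercept = fit_lasso_pcr(X[train], y[train], lam)
        preds[i] = X[i] @ w + intercept
    return preds
```

Refitting the PCA inside each fold keeps the held-out participant from influencing the components, which matters for an unbiased prediction-outcome correlation. With `lam` set to zero the model reduces to ordinary principal component regression, and a very large `lam` shrinks all weights to zero so that each prediction is the training mean.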

Re-analysis of fMRI data from Hoeppli et al. 1

To investigate whether different tools and analysis pipelines resulted in the discrepancy between our findings and those of Hoeppli et al. 1, we reanalyzed the fMRI data (N = 101) from Hoeppli et al. 1 with our analysis pipelines. The preprocessing steps done by Hoeppli et al. before data sharing included motion correction, slice timing correction, spatial smoothing (FWHM = 5 mm), and ICA-FIX (for details of the preprocessing, please see ref. 1). We then normalized functional EPI images to the Montreal Neurological Institute standard brain space with the interpolation to 2 × 2 × 2 mm³ voxels using FSL flirt. We performed a first-level GLM with a ‘beta series’ approach23 using SPM12, modeling brain responses to the heat stimulus of each trial with a 180-second high-pass filter. We also modeled brain activity during pain rating by including one regressor for the pain rating period. Event boxcars were convolved with a canonical hemodynamic response function. Given that the step of ICA-FIX, which was already done before the data sharing, was supposed to denoise non-neuronal artifacts, we additionally included only five principal components of WM and CSF signals and a linear trend as nuisance covariates. Finally, we conducted a GLM with averaged beta images for the highest temperature by including average pain intensity ratings as a covariate and a multivariate analysis to predict average pain intensity ratings.

Statistical analysis and software

All analyses in the present study were performed with MATLAB (version R2020b, MathWorks). More specifically, we used SPM12 and in-house behavioral and neuroimaging analysis tools (CanlabCore [https://github.com/canlab/CanlabCore] and cocoanCORE [https://github.com/cocoanlab/CocoanCore]). All statistical tests are two-tailed unless otherwise noted.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.