Introduction

Obsessive compulsive disorder (OCD) is a common and disabling psychiatric illness, occurring in 2–3% of the population. It begins during childhood or adolescence in at least 50% of patients [1, 2] and is associated with notable heterogeneity in symptom presentation, with common domains including excessive checking, attempts to produce symmetry, and cleanliness [3]. However, while much research has highlighted heterogeneity in phenotypic presentation in OCD [4], less work has focused on underlying heterogeneity in neural activity [5].

A number of findings provide a foundation to investigate neural heterogeneity in OCD. Broadly, recent models of psychopathology suggest that disruptions of three large-scale brain networks characterize many aspects of psychopathology, including OCD [6,7,8]. One is the frontoparietal network (FPN; also referred to as the central executive network), which is anchored in the dorsolateral prefrontal cortex and the posterior parietal cortex. The FPN is involved in cognitive control, defined as the ability to flexibly adapt behavior in varying conditions. Dysfunction of this network may underlie the tendency of patients with OCD to become stuck in compulsive rituals. Another relevant network is the cingulo-opercular network (CON, also referred to as the salience network), which is anchored in the dorsal anterior cingulate cortex/posterior medial frontal cortex (dACC/pMFC) and anterior insula/frontal operculum. The CON is hypothesized to play several roles such as continuous performance monitoring and integration of internal emotional states with task execution [9, 10]. In OCD, patients often do not feel a sense of completeness following compulsive rituals and continue to repeat them in an attempt to achieve an emotionally satisfied state, possibly due to dysfunction in this circuit [11]. The FPN and CON, which are collectively involved in external goal-directed behavior, act reciprocally with the default mode network (DMN). The DMN is involved in internal, self-referential processes and is anchored in the ventromedial prefrontal cortex and posterior cingulate cortex (PCC). The DMN is typically deactivated during cognitive control tasks [12]. Patients with OCD, however, have been shown to deactivate the DMN to a lesser extent than control participants and show atypical connectivity of the DMN with the CON and FPN [13,14,15]. The failure of the DMN to disengage during task execution could possibly drive the repetitive thought patterns observed in OCD because of failure of the CON to mediate the reciprocal interaction between the FPN and DMN [15,16,17,18].

Several experimental paradigms have focused on tasks that tap functions of these three networks, frequently through tests of cognitive control sub-processes including cognitive interference, response inhibition, and error processing. In prior work in OCD employing these approaches, notable heterogeneity has been found. Some neuroimaging findings have demonstrated that patients with OCD have reduced FPN activation during interference and inhibition processing [19,20,21,22], when control is required to engage a weaker response set to override or inhibit a competing response set [23, 24]. However, other investigations on this topic have reported no difference [25] or increased activation [26,27,28]. In error processing, patients with OCD show an increased amplitude of the error-related negativity (ERN) [29, 30], which refers to a response-locked electroencephalographic component that follows an error and localizes to midline frontal cortex. The ERN findings in OCD have been followed up with neuroimaging work showing increased neural activity in error-related regions, specifically the pMFC/dACC [15, 26, 31, 32]. The finding is thought to reflect either a greater tendency to perceive “erroneous behavior” that leads to increased compulsive rituals, or a compensatory response to override habitual behaviors that drive compulsive ritualizing. However, as with studies examining interference processing, not all groups have reported a greater error-related signal in patients with OCD [28, 33]. While this work suggests the neural substrates that might underlie the symptoms of OCD, the inconsistency in the published literature hinders the identification of reliable biomarkers of the disorder.

Many approaches have been considered to address failures to replicate findings across studies. One solution has been to rely upon findings from meta-analyses, which gain added power from pooling small effects across multiple studies. This approach was exemplified by Norman and colleagues, who used the originating statistical parametric maps, as opposed to published voxel peaks passing a given threshold [34]. In this work, they showed that OCD patients had a greater error-related signal in the pMFC, as well as the right anterior insula (aIns), but reduced activation during inhibition tasks in rostral and ventral anterior cingulate cortex, thalamus/caudate, right aIns, medial frontal cortex, and supramarginal gyrus. While the meta-analytic work provides a good picture of the findings that are most consistent on average, the very nature of relying on group averages obscures possible differences between patients, which could explain discrepancies in prior literature. For example, if half of patients exhibit a strong positive signal in a region, whereas the other half exhibit an equally strong negative signal, the group average will appear as zero. Moreover, subjects in the high- and low-activation groups may respond differently to intervention. This problem of heterogeneity is compounded in imaging data, in which differences in the magnitude of the signal, as well as its spatial location, could conceal different patterns of response, an issue that is likely accentuated between individuals at different stages of brain development [35]. Heterogeneity measures are routinely obtained in meta-analyses, but such variance is rarely explained comprehensively by moderators, and variance in single imaging studies is rarely analyzed separately to determine if regional heterogeneity exists.
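The cancellation problem described above can be illustrated with a toy calculation. The following is a minimal sketch using made-up activation values (not study data); all names and numbers are hypothetical and serve only to show how a group mean can hide two opposite subgroups.

```python
from statistics import mean, pstdev

# Hypothetical ROI activation estimates in arbitrary units (NOT study data):
# half of the patients show a strong positive signal and the other half an
# equally strong negative signal.
positive_half = [1.0] * 10
negative_half = [-1.0] * 10
group = positive_half + negative_half

group_mean = mean(group)      # the two halves cancel in the average
group_spread = pstdev(group)  # the spread still reflects the split

print(f"group mean = {group_mean:.2f}")
print(f"group SD   = {group_spread:.2f}")
print(f"subgroup means = {mean(positive_half):.2f}, {mean(negative_half):.2f}")
```

In this toy case the group mean alone would suggest no effect, whereas the subgroup means show two distinct patterns.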

To address this problem, we applied unsupervised machine learning (ML) to task-based fMRI data to search for clusters of patients that display distinctive patterns of neural activation. This approach is particularly well suited to identifying neuropsychiatric heterogeneity. Traditional regression-based models often address neural targets one at a time, which ignores heterogeneity and neglects important multivariate patterns. As a result, traditional regression-based results represent average effects that are less useful for informing diagnosis and treatment selection for individual patients [36]. In contrast, an unsupervised ML approach can simultaneously integrate multiple inputs across components of activated networks, leading to personalized brain-based profiles. The brain-based profiles could also inform treatment recommendations by predicting differential treatment response to varying intervention approaches. We used the Incentive Flanker Task (IFT) [37], which activates critical nodes of the FPN and the CON while deactivating the DMN. The IFT achieves this end when participants make a speeded response to stimuli requiring cognitive control to resist interference from non-target flanking stimuli. Task difficulty was titrated to yield error rates of around 15%, so that we could also analyze error processing. We hypothesized that multiple brain-based patient clusters would emerge based on task-based fMRI findings, and we sought to contrast these clusters with a brain-based profile derived from unaffected controls. We also tested post-hoc hypotheses that the brain-based patient subgroups would vary with regard to demographic and clinical correlates as well as treatment response to cognitive behavioral therapy (CBT).

Materials and methods

Participants and procedure

The data used for the present investigation came from a clinical trial examining neural correlates of cognitive behavioral therapy compared to an active comparison intervention for OCD [38]. Participants (N = 192) included clinical patients (n = 128 patients with OCD; 61.7% female) and healthy control participants (n = 64; 65.6% female). Clinical participants included adults (ages 24–45; 55% of clinical participants) and adolescents (ages 12–17); these age brackets were selected to reflect more stable versus more plastic periods of brain development. In all analyses involving age, age group was treated as a categorical variable. Participants were assessed for a diagnosis of OCD and related psychopathology using age-appropriate semi-structured assessments [39, 40]. Severity of OCD symptoms was measured using the adult/child versions of the Yale-Brown Obsessive Compulsive Scale (Y-BOCS) [41, 42]. Clinical participants were also assessed with the Obsessive Compulsive Inventory-Revised [43], the Hamilton Anxiety Scale [44], and global assessments of functioning [45, 46], and subscales of the Wechsler Abbreviated Scale of Intelligence-II [47] were used to establish IQ. Control participants were required to have no history of past or current mental illness (except simple phobias). Full inclusion and exclusion criteria for clinical and control participants can be found in the Supplementary Materials.

Following baseline scanning and clinical assessment, block randomization was used to assign clinical participants to either cognitive behavioral therapy (CBT) or stress management therapy (SMT), stratified by medication, sex, and age group. OCD severity was reassessed at week 12 of treatment. All clinical assessment was performed by independent evaluators blind to intervention group. All predictive and machine learning models used full-information maximum likelihood estimation as a method to address missing data. Non-clinical control participants took part in baseline neuroimaging, but they did not participate in clinical assessments beyond OCD diagnosis and did not receive an intervention. Control participants were recruited to match clinical participants on age and sex. Study data were derived from trial NCT02437773 preregistered at clinicaltrials.gov, and CONSORT flowchart materials can be found in Supplementary Fig. S1. Data were collected at the University of Michigan Department of Psychiatry, and informed consent was obtained from adult participants and legal guardians following procedures approved by the University of Michigan Institutional Review Board.

fMRI task procedure, image processing and feature extraction

During fMRI scanning, subjects performed the IFT. The IFT consists of pressing one of two buttons to identify one of 4 target letters (S, K, H, C) surrounded by 4 flankers, which are mapped either to the same button response as the target (low interference) or to the opposite response (high interference). Each trial was preceded by a monetary incentive cue indicating how much money a subject would earn on a correct trial (0¢, 10¢, or 50¢), and a feedback signal indicated whether or not the response was correct. A correct response had to be made within a response interval that was adjusted to keep error rates around 15%. Accuracy and reaction time were measured during the task. For additional specifics on our IFT implementation, please see Norman et al. [38].

Our scanning procedures are detailed in the Supplementary Materials. First-level contrasts compared activation during correct high interference relative to correct low interference trials (interference contrast), and error relative to correct trials during interference (errors contrast). These contrasts were collapsed across incentive levels, which were not a focus of the current study. All subjects were combined in a “super-group” analysis to obtain statistical parametric maps, which were thresholded for brain-wise correction according to random field theory (p < 0.05). We then extracted BOLD signal from regions of interest representative of the CON, FPN, and DMN using 6 mm radius spheres centered on significant peaks of activation and deactivation for the interference (high minus low) and error contrasts (obtained during the interference condition). In addition, because we had a significant peak of error-related activation in the putamen, and subcortical structures have been implicated in OCD [48], we included this area as well. This resulted in a matrix of 18 regions by 192 participants, which was submitted to the latent profile analysis described below.
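The spherical ROI extraction step can be sketched in a few lines. This is an illustrative sketch only, not the authors' pipeline: the 2 mm isotropic voxel size, the nested-list image representation, and the function names `sphere_offsets` and `roi_mean` are all assumptions introduced here.

```python
def sphere_offsets(radius_mm, voxel_mm):
    """Voxel index offsets whose centers lie within radius_mm of the center voxel."""
    reach = int(radius_mm // voxel_mm)
    offsets = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                dist_sq = (dx * voxel_mm) ** 2 + (dy * voxel_mm) ** 2 + (dz * voxel_mm) ** 2
                if dist_sq <= radius_mm ** 2:
                    offsets.append((dx, dy, dz))
    return offsets


def roi_mean(volume, peak, radius_mm=6.0, voxel_mm=2.0):
    """Average a 3-D contrast map (nested lists) over a sphere centered on a peak voxel.

    voxel_mm=2.0 is an assumed voxel size, not a value taken from the study.
    """
    px, py, pz = peak
    values = []
    for dx, dy, dz in sphere_offsets(radius_mm, voxel_mm):
        x, y, z = px + dx, py + dy, pz + dz
        # Skip voxels that fall outside the image bounds.
        if 0 <= x < len(volume) and 0 <= y < len(volume[0]) and 0 <= z < len(volume[0][0]):
            values.append(volume[x][y][z])
    return sum(values) / len(values)


# Toy 9 x 9 x 9 contrast map with a constant value (hypothetical data).
toy_volume = [[[0.5 for _ in range(9)] for _ in range(9)] for _ in range(9)]
print(len(sphere_offsets(6.0, 2.0)), "voxels in the sphere")
print("ROI mean:", roi_mean(toy_volume, peak=(4, 4, 4)))
```

Repeating such an extraction for each of the 18 ROIs and each participant would yield one row of features per participant, which is the shape of input that a clustering model expects.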

IFT activation

The extracted data included sets of large-scale brain network regions of interest (ROIs) for each contrast (derived from the CON, FPN, and DMN; see Supplementary Table S1). For errors, this included the cingulo-opercular network (pMFC, dACC, bilateral aIns), as well as deactivation in the anterior and posterior default mode network (DMN). Bilateral putamen was also noted to deactivate to errors. For the interference contrast, activation foci also occurred in the CON, along with bilateral frontoparietal nodes and deactivation in the anterior and posterior DMN.

Analytic plan

To identify clusters of patients defined by distinct patterns of activation across extracted ROIs, we implemented unsupervised machine learning through the use of latent profile analysis (LPA, also known as Gaussian mixture modeling). Latent profile analysis was used to evaluate whether participants could be classified into a discrete number of latent groups based on observed interference- and error-related activations extracted for each individual participant. These latent groups are denominated by multiple synonyms in the literature, including profiles, classes, and clusters; we rely predominantly on the term “clusters” throughout this manuscript because it aligns with terminology used in similar unsupervised ML approaches that have been applied to OCD [49]. Latent profile analysis is based on a mixture modeling approach, which uses a categorical latent variable to allow for the possibility of multiple underlying distributions to explain the observed pattern of responses, as opposed to assuming that a variable (or set of variables) is properly represented by a single distribution based on an overall average. In this investigation, we evaluated whether there were discrete clusters of patients based on multivariate neural activation patterns. Models were estimated in Mplus version 8 [50] using full-information maximum likelihood estimation with robust standard errors.
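To make the mixture-modeling idea concrete, the sketch below fits a small Gaussian mixture by expectation-maximization in plain Python. It is a simplified illustration, not the Mplus model used in the study: it assumes cluster-specific means and variances with no within-cluster covariances, uses fixed starting means, and runs on made-up two-feature data. The function name `fit_lpa` and all values are hypothetical.

```python
import math


def log_normal(x, mu, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def fit_lpa(data, init_means, n_iter=50, min_var=1e-6):
    """EM for a Gaussian mixture with independent features within each cluster."""
    n, d, k = len(data), len(data[0]), len(init_means)
    means = [list(m) for m in init_means]
    variances = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each participant.
        resp = []
        for row in data:
            log_p = [
                math.log(max(weights[c], 1e-300))
                + sum(log_normal(x, means[c][j], variances[c][j]) for j, x in enumerate(row))
                for c in range(k)
            ]
            top = max(log_p)  # subtract the maximum for numerical stability
            unnorm = [math.exp(lp - top) for lp in log_p]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing weights, means, and variances.
        for c in range(k):
            nc = max(sum(r[c] for r in resp), 1e-12)
            weights[c] = nc / n
            for j in range(d):
                means[c][j] = sum(r[c] * row[j] for r, row in zip(resp, data)) / nc
                var = sum(r[c] * (row[j] - means[c][j]) ** 2 for r, row in zip(resp, data)) / nc
                variances[c][j] = max(var, min_var)
    return weights, means, variances, resp


# Hypothetical activation values for 8 participants and 2 ROIs (NOT study data).
toy_data = [[0.1, -0.2], [-0.3, 0.2], [0.2, 0.1], [0.0, -0.1],
            [3.1, 2.8], [2.9, 3.2], [3.2, 3.0], [2.8, 3.1]]
toy_init = [[0.0, 0.0], [3.0, 3.0]]

weights, means, variances, resp = fit_lpa(toy_data, toy_init)
labels = [max(range(len(toy_init)), key=lambda c: r[c]) for r in resp]
print("cluster labels:", labels)
print("mixing weights:", [round(w, 3) for w in weights])
```

Each participant receives a posterior probability for every cluster rather than a hard label, which is what allows classification uncertainty to be quantified afterwards.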

To establish the proper number of patient clusters, an iterative process was used. A 1-cluster model was first fit to the data, and then a 2-cluster model was fit and its fit was compared with that of the 1-cluster model. Subsequently, a 3-cluster model was fit to see if it fit better than the 2-cluster model, and this sequence continued by comparing each k-cluster model to the respective (k-1)-cluster model, up through a 6-cluster model. Model selection and evaluation criteria are detailed in the Supplementary Materials; overall, we relied on suggestions by Masyn [51] and employed both quantitative metrics and qualitative model interpretability when making final model decisions. Because a primary goal was to identify pathological profiles of difference between OCD and healthy subjects, healthy control participants were included in ML model estimation by treating their data as training data [50]. This process automatically assigned all controls to a single cluster that was separate from clinical participants, while clinical participants were free to be assigned to clusters based on empirical model estimation. This approach allowed data from control participants to aid in overall variance estimation and allowed for direct comparison of model results between clinical patients and controls, while not contaminating cluster formation results for clinical patient clusters.
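The quantitative side of this enumeration sequence can be sketched with the standard BIC formula, BIC = -2 log L + p ln(n). The example below is a hedged illustration: the log-likelihood values are invented, and the parameter count assumes cluster-specific means and variances with no within-cluster covariances, which may differ from the authors' specification. The function names are hypothetical.

```python
import math


def n_free_parameters(k, d):
    """Free parameters for k clusters and d indicators: k*d means,
    k*d variances, and k-1 mixing proportions (assumed parameterization)."""
    return k * d + k * d + (k - 1)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def compare_models(log_likelihoods, d, n_obs):
    """Compute BIC for each k and report the change relative to the k-1 model."""
    bics = {k: bic(ll, n_free_parameters(k, d), n_obs)
            for k, ll in sorted(log_likelihoods.items())}
    for k in sorted(bics):
        if k - 1 in bics:
            delta = bics[k] - bics[k - 1]
            print(f"{k} vs {k - 1} clusters: BIC change = {delta:+.1f}")
    return bics, min(bics, key=bics.get)


# Invented maximized log-likelihoods for 1- to 4-cluster models (NOT study results).
hypothetical_ll = {1: -5000.0, 2: -4800.0, 3: -4700.0, 4: -4650.0}
bics, best_k = compare_models(hypothetical_ll, d=18, n_obs=192)
print("lowest BIC at k =", best_k)
```

The lowest BIC is only one input to the decision; as described above, the final choice also weighed qualitative interpretability of the resulting clusters.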

We also evaluated demographic and clinical covariates associated with brain-based patient clusters. Because clusters were identified empirically, analyses evaluating covariate differences based on cluster membership reflected a post-hoc approach. Accordingly, p-values for these analyses were evaluated using false discovery rate (FDR) procedures [52]. In this case, each single predictor/outcome was considered as a separate family of hypotheses for evaluation (e.g., evaluating age differences across three different clinical clusters resulted in three hypothesis tests). We used the BCH method of covariate and distal outcome evaluation developed by Bakk and Vermunt [53].
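As an illustration of FDR control within one family of three pairwise tests, the sketch below implements the Benjamini-Hochberg step-up procedure. Whether this exact variant corresponds to the procedure cited in [52] is an assumption, and the p-values shown are made up.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank r such that p_(r) <= (r / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical family: three pairwise cluster comparisons on one covariate.
family = [0.03, 0.06, 0.40]
print(benjamini_hochberg(family))
```

Note that in this made-up family, p = 0.03 would pass an uncorrected 0.05 threshold but is not rejected after correction, because the smallest p-value in a family of three must fall at or below 0.05/3.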

Results

Cluster enumeration results

Based on evaluation of Bayesian Information Criterion (BIC) [54] values and qualitative cluster interpretation, we chose a 4-cluster model as the best fit to the data (see Supplementary Table S2 for detail on BIC and entropy values for all models considered). While a 5-cluster model showed a lower BIC value (ΔBIC = 19.06 compared to the 4-cluster model), the additional cluster overlapped substantially with clusters found in the 4-cluster model and provided limited information. The 5-cluster model also produced multiple clusters that were very small, limiting the quality of inference from these clusters. Confidence in cluster assignment was strong for the 4-cluster model (entropy = 0.97).
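The entropy value reported above summarizes how crisply participants are assigned to clusters. The sketch below computes the normalized (relative) entropy statistic commonly reported for mixture models, 1 - [sum over i and k of -p_ik ln(p_ik)] / (n ln K); that this matches the authors' software output exactly is an assumption, and the posterior probabilities are invented.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy: 1.0 means perfectly crisp assignment,
    0.0 means assignment no better than chance."""
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the term 0 * ln(0) is taken as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n * math.log(k))


# Invented posterior cluster probabilities for four participants (NOT study data).
toy_posteriors = [
    [0.98, 0.01, 0.01],
    [0.02, 0.97, 0.01],
    [0.01, 0.01, 0.98],
    [0.60, 0.30, 0.10],
]
print(f"relative entropy = {relative_entropy(toy_posteriors):.2f}")
```

Values near 1 indicate that most participants have a posterior probability close to 1 for a single cluster, which is the sense in which cluster assignment can be called confident.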

Brain activation patterns for identified patient clusters are depicted numerically in Table 1 and graphically in Fig. 1. Across clusters, errors elicited strong activation in the CON and deactivation of the DMN, while interference elicited activation of both CON and FPN, as well as deactivation in DMN nodes. However, these patterns were more pronounced in some patients than others, as demonstrated by the emergence of three different clinical patient clusters. One cluster tracked the pattern of activation and deactivation seen in the control participants (a “normative” cluster accounting for 65.9% of participants with OCD). Another cluster was characterized by stronger activation of CON and FPN to interference tasks, but normal activation of these networks to error processing tasks (an “interference hyperactivity” cluster accounting for 15.2% of participants with OCD). In the DMN, patients in this second cluster exhibited greater deactivation during error processing, along with a failure to deactivate during interference, showing slight activation instead. A third cluster was characterized by greater CON activity and less deactivation in the anterior DMN during error processing (an “error hyperactivity” cluster accounting for 18.9% of participants with OCD). Interestingly, this third cluster showed deactivation across the CON and FPN nodes during interference processing, as well as greater deactivation in the DMN.

Table 1 BOLD activation observed in patient clusters and unaffected controls.
Fig. 1: Neural activation patterns are depicted for a number of OCD-relevant brain areas in study participants.

Results are reported separately for each of the three empirical subgroups identified in our clinical sample, as well as unaffected control participants. Error = error contrasts (shaded in blue); interference = interference contrasts (shaded in yellow). ACC anterior cingulate cortex, SMA supplementary motor area, AIFO anterior insula/frontal operculum, DMN default mode network, MFC posterior medial frontal cortex, DLPFC dorsal lateral prefrontal cortex, parietal parietal cortex, d dorsal, l left, r right, ant anterior, post posterior.

Demographic and clinical correlates of patient clusters

Demographic and clinical correlates of patient clusters can be found in Table 2. After false discovery rate correction at the p < 0.05 level, the only significant between-cluster differences in covariates were found for IFT interference reaction time, with the interference hyperactivity cluster showing longer reaction times than each of the other patient clusters. Notable findings that were nonsignificant after FDR correction include a lower proportion of adults in the interference hyperactivity cluster relative to the normative cluster (36.4% vs. 62.7%; p = 0.05), higher rates of past major depressive disorder (MDD) in the interference hyperactivity cluster relative to the error hyperactivity cluster (42.1% vs. 28.9%; p = 0.16), and lower IFT accuracy in the normative cluster relative to the interference hyperactivity cluster (65% vs. 72%; p = 0.05) and the error hyperactivity cluster (65% vs. 72%; p = 0.03). Additionally, differential response to treatment was compared across the clusters (CBT x Y-BOCS change). In this case, a nonsignificant difference was observed, in which the normative cluster showed a stronger response to CBT relative to the error hyperactivity cluster (p = 0.07). This approach reflects a comparison of cluster types for each therapy arm and is graphically depicted in Fig. 2.

Table 2 Clinical and demographic covariates associated with study participants.
Fig. 2: Yale-Brown Obsessive Compulsive Scale (Y-BOCS) scores at pre- and post-treatment for each patient cluster.

Observed changes in OCD symptoms are displayed, stratified by patient cluster and treatment group.

Discussion

We found neural heterogeneity in patients with obsessive compulsive disorder in the large-scale neurocircuits engaged in cognitive control and performance monitoring. In this relatively large sample of participants, we identified three brain-based patient clusters relative to the normative sample of control subjects. The largest cluster, a normative patient cluster, showed activation during interference and error processing that closely replicated the pattern seen in the control participants. In contrast, an error hyperactivity cluster of patients exhibited increased activation in the CON, along with a failure to deactivate both DMN and putamen. This was not a cluster that simply had more “positive” activation overall, as these patients tended to show little to no activation during the interference task along with greater deactivation. On the other hand, a third cluster exhibited interference hyperactivity in the CON and FPN, along with a reversal of deactivation seen in the anterior DMN. During error processing, this cluster showed increased deactivation in the DMN.

These findings identify potentially important differences in the neurocircuits of persons with OCD during interference and error processing. They suggest that each process taps a different set of deviations from normative function, differing based on the individuals being studied. This finding contrasts with the general tendency to use diagnostic processes, research designs, and analytic approaches that assume relative phenotypic homogeneity of patients with OCD. This assumption likely leads to contradictory findings and failures to replicate when differing mixtures of patients are included in different samples.

Multiple considerations are important for interpreting the differing patient clusters that were identified. One hypothesis about OCD pathology is that the DMN fails to disengage during tasks requiring externally-directed attention, possibly due to defective mediation by the CON of reciprocity between task-positive networks (CON and FPN) and DMN, thereby driving persistent obsessions, leading to compulsive behaviors [15,16,17,18]. The profiles of the error hyperactivity and interference hyperactivity clusters both support this hypothesis, insofar as both of these clusters failed to deactivate or showed reduced deactivation in the DMN. However, each cluster showed opposite patterns of deactivation for error and interference contrasts, such that in the other condition the cluster exhibited greater DMN deactivation. Although selecting a response in the presence of interference and monitoring for errors are often lumped together in the broad class of “executive function”, our findings show that these distinct processes tap into different components of OCD pathophysiology, while still sharing common functional nodes such as the DMN and CON. Most importantly for new treatment development, interventions targeting networks may need to identify the profile type of a patient in order to apply the appropriate intervention (e.g., neuromodulation or neurofeedback).

The clinical and demographic characteristics which we tested did not yield strong differences between the clusters, although we saw some patterns which failed to reach significance after FDR adjustment. The exception was the reaction time interference effect, which was significantly larger in the interference hyperactivity cluster. This could reflect a longer time-on-task, which has been associated with greater activation magnitude [55]. This cluster of subjects also had a (non-significantly) higher percentage of adolescent participants, which may also reflect developmental factors. While adults and adolescents have shown several differences in OCD-relevant neural activation [56,57,58,59], we did not find age to significantly differentiate clusters. Regarding treatment response, there was an intriguing visual trend suggesting that the normative patient cluster had a stronger treatment response to CBT relative to SMT. It is possible that this cluster represents a more common form of OCD with fewer biological deficits or abnormalities and is associated with improved CBT response, but these findings would require a fully powered a priori study to elucidate.

These neural findings are placed in the context of our analytic approach, which provides information that is not frequently reported in contemporary unsupervised ML in biobehavioral research. We chose a model-based approach to unsupervised ML, which fits a model that incorporates not only means (such as in traditional k-means approaches), but also estimates variance (accounting for uncertainty and noise in the data) and reports uncertainty in the degree of cluster separation and the effectiveness of each variable/feature in differentiating the clusters (via entropy). This model-based approach creates a direct connection to the broader population of patients with OCD, and it provides information that can limit overconfidence in specific results. For example, our entropy results showed that we could confidently assign patients into specific clusters, but despite observed between-cluster differences, no single feature was an outstanding biomarker. Instead, the full set of biomarkers working in concert was necessary to create patient clusters.

One tradeoff of our model-based ML approach is that it increases robustness and generalizability at the expense of the number of brain areas that we could investigate. Mixture modeling estimates multiple parameters relevant to each brain area, and because there is a limited amount of information in the data, we were limited in how many brain areas we could evaluate. While we selected brain networks that are relevant to OCD, a more exploratory approach may find that additional networks reflect task- and OCD-relevant heterogeneity, though it would have a higher risk of nonreplicability.

Additional limitations of this work are to be noted. While our use of the FDR procedure helps retain power relative to alternative post-hoc corrections, fully powered a priori comparisons are needed to better characterize demographic and clinical correlates of brain-based clusters. This is particularly relevant for age, where early onset cases of OCD can present with a distinct clinical profile [60], and future work may seek to enrich the sample across a broader range of onset age. Our inclusion of patients across age groups provided valuable information, but could also have increased error variance. Future studies with even larger sample sizes may be able to further subdivide the patient clusters that we observed and allow for increased statistical power for between-cluster comparisons.

This study is the first to use unsupervised ML to identify brain-based patient clusters in OCD based on activation of cognitive control and performance monitoring neurocircuits. We identified several discrete neural patterns that are associated with clinical presentation in OCD. Our approach provides unique insight beyond clinical phenotyping, which may only identify one phenotype when there are truly multiple underlying physiological patterns that lead to the condition. Parsing this heterogeneity may allow for greater precision in patient characterization and reframe prior neurobehavioral research in OCD. It can also provide a starting point for neuroimaging-guided treatment selection, where differential treatment response may eventually be detected based on these characteristics.