Identifying and validating subtypes within major psychiatric disorders based on frontal–posterior functional imbalance via deep learning

Converging evidence increasingly implicates shared etiologic and pathophysiological characteristics among major psychiatric disorders (MPDs), such as schizophrenia (SZ), bipolar disorder (BD), and major depressive disorder (MDD). Examining the neurobiology of the psychotic-affective spectrum may greatly advance biological determination of psychiatric diagnosis, which is critical for the development of more effective treatments. In this study, ensemble clustering was developed to identify subtypes within a trans-diagnostic sample of MPDs. Whole brain amplitude of low-frequency fluctuations (ALFF) was used to extract the low-dimensional features for clustering in a total of 944 participants: 581 psychiatric patients (193 with SZ, 171 with BD, and 217 with MDD) and 363 healthy controls (HC). We identified two subtypes with differentiating patterns of functional imbalance between frontal and posterior brain regions, as compared to HC: (1) Archetypal MPDs (60% of MPDs) had increased frontal and decreased posterior ALFF, and decreased cortical thickness and white matter integrity in multiple brain regions that were associated with increased polygenic risk scores and enriched risk gene expression in brain tissues; (2) Atypical MPDs (40% of MPDs) had decreased frontal and increased posterior ALFF with no associated alterations in validity measures. Medicated Archetypal MPDs had lower symptom severity than their unmedicated counterparts; whereas medicated and unmedicated Atypical MPDs had no differences in symptom scores. Our findings suggest that frontal versus posterior functional imbalance as measured by ALFF is a novel putative trans-diagnostic biomarker differentiating subtypes of MPDs that could have implications for precision medicine.


Introduction
Definitive biomarkers have remained elusive in psychiatry, while other fields of medicine have amassed an armory of biomarkers for diagnosis and treatment. This is not entirely surprising as studies have primarily utilized nosology that differentiates neuropsychiatric disorders based on clinical phenomenology in the absence of any biological determinant, albeit the Diagnostic and Statistical Manual of Mental Disorders (DSM) has revolutionized the field and advanced it to its current state. Long conceptualized as distinct diagnostic categories, major psychiatric disorders (MPDs), consisting of schizophrenia (SZ), bipolar disorder (BD), and major depressive disorder (MDD), share substantial core features as implicated by converging lines of evidence from genetic, molecular, histological, and neuroimaging studies [1][2][3][4][5][6]. Thus, there appears to be a greater continuum between psychotic and affective disorders than previously thought. Consequently, understanding the core changes in MPDs is critical for mapping the principal neural pathways resulting in psychopathology and the crossroads at which divergent paths lead to varying clinical phenomenology within and across diagnoses.
Several studies have adopted alternative approaches to identifying brain-based biomarkers that transcend traditional diagnostic boundaries [7,8]. Recently, Clementz et al. conducted a k-means clustering analysis of cognitive and electrophysiological measures using trans-diagnostic data generated from the Bipolar-Schizophrenia Network for Intermediate Phenotype consortium [7]. They identified three "biotypes" that were largely orthogonal to the DSM-IV diagnoses and significantly different with respect to external validating measures such as brain structure and function [9,10]. Their approach has been touted as an important step toward a more neurobiologically based understanding of psychosis [11]. Subsequently, pioneering work by Drysdale et al. identified four biotypes in depression using canonical correlation analysis of the Hamilton Depression Rating Scale (HAMD) to characterize connectivity features [8]. Their work presented yet another strategy for refining classification within clinically heterogeneous diagnoses, as well as identifying individuals who may be more responsive to transcranial magnetic stimulation.
Neuroimaging has offered a wealth of potential biomarkers for neuropsychiatric disorders. Abnormal brain function has been proven to be useful in the assessment of pain [12] and shows great promise for application to neuropsychiatric illnesses. Resting-state functional magnetic resonance imaging is well-established and has been widely performed for noninvasive exploration of the brain's intrinsic functional architecture using measurements of spontaneous low-frequency fluctuations (LFFs) in the blood oxygenation level-dependent (BOLD) signal [13,14]. Although their underlying mechanism is not exactly clear, LFFs appear to arise from neurovascular activity [15] and have been associated with glutamatergic/ GABAergic synaptic currents and glial activity [16,17]. Furthermore, the amplitude of BOLD signal fluctuations is proportional to regional cerebral blood flow, which is an established marker of brain metabolic activity [18]. The amplitude of low-frequency fluctuations (ALFF; generally in the range of 0.01-0.08 Hz) appears to be an efficient index of local spontaneous neuronal activity at rest [19]. Regional variability in ALFF reflects spontaneous fluctuations in a given voxel independent of its neighboring, regional, or network connectivity. Moreover, ALFF exhibits moderate to substantial test-retest reliability [20] ensuring a high upper bound for its validity as a regional functional measure to detect individual differences [21]. Prior studies, including a multi-center study and our previous work, have shown significant alterations in ALFF across MPDs compared to healthy controls (HC), most prominently in frontal, subcortical, and temporal regions, as well as in visual regions (precuneus and cuneus); however, inconsistencies have been reported [6,22,23].
In this study, we present a novel clustering method utilizing deep learning to identify subtypes across the psychotic-affective disorder spectrum in a trans-diagnostic sample of MPDs. We used a deep stacked AutoEncoder to extract low-dimensional features of ALFF followed by an ensemble clustering method to identify ALFF-based subtypes that were maximally dissimilar from each other in MPDs. We then validated the resulting subtypes using cortical thickness, white matter integrity as measured by fractional anisotropy (FA), polygenic risk scores (PRS), and risk gene expression tissue profile. We also examined the effects of medication status on symptom severity to elucidate possible pharmacologic effects within each of the subtypes.

Samples and measures
The study included a total of 944 participants consisting of 581 patients with MPDs (193 with SZ, 171 with BD, 217 with MDD) and 363 HC, who were recruited and scanned at a single site with identical inclusion and exclusion criteria. MPD participants were recruited from the inpatient and outpatient services at Shenyang Mental Health Center and Department of Psychiatry, The First Affiliated Hospital of China Medical University, Shenyang, China. HC participants were recruited from the local community by advertisement. Behavioral symptoms were assessed using the HAMD and Brief Psychiatric Rating Scale (BPRS). Cognitive function was assessed using the Wisconsin Card Sorting Test (WCST). Demographic and clinical characteristics are detailed in Supplementary Tables 1 and 2. Whole blood samples (243 patients and 193 HC) were collected. All participants provided written informed consent after receiving a detailed description of the study. The study was approved by the Institutional Review Board of China Medical University.
Functional MRI, structural MRI, and diffusion tensor imaging (DTI) were acquired in a GE Signa HD 3.0T scanner with a standard 8-channel head coil at the First Affiliated Hospital of China Medical University, Shenyang, China. Functional images were collected with a gradient echo planar imaging (EPI) sequence for ALFF measures. Three-dimensional, high-resolution, T1-weighted images were collected using a 3-D fast spoiled gradient echo sequence to measure cortical thickness. DTI used a singleshort spin-EPI sequence to measure FA for assessing white matter integrity. For all scanning sequence parameters and image preprocessing, please see "Methods" in the Supplementary information.
Genotyping, imputation and calculation of PRS, and risk gene expression were also performed. Details about how they were performed can be found in "Methods" in the Supplementary information.

Ensemble clustering method based on deep learning
Clustering algorithms group data points (i.e., participants) based on their similarity in dimensional space. For highdimensional data, such as whole brain ALFF, the number of dimensions must be reduced to avoid the "curse of dimensionality" [24]. Consequently, clustering results are dependent on the dimensional representation selected for analyses; however, there is no established standard for selecting an appropriate dimensional representation. Thus, our algorithm was designed to perform clustering analyses for multiple dimensional representations and then use consensus group assignment followed by a robustness optimization protocol to achieve the most reliable and stable subtype assignment. Subtypes were identified in n = 581 patients with MPDs using a novel ensemble clustering method based on deep stacked AutoEncoder in the following steps ( Fig. 1).
Step one: dimension reduction To identify principal ALFF alterations, we extracted voxels with significantly different ALFF between MPDs and HC using a general linear model (GLM). For GLM analyses, gender (male/female) and group (MPDs/HC) were included as discrete factors and age as a continuous factor, and the effect of group on ALFF was the primary interest of the analysis. The significance level was set at voxel p < 0.001 with the Gaussian random field (GRF) correction for cluster p < 0.05. A total of 2175 voxels were identified as significantly different in ALFF between MPDs and HC from the whole brain of 42,185 voxels. We then used Auto-Encoder [25], a deep artificial neural network, to further reduce the dimensions of the input data to d ∈ [2,10]. AutoEncoder included an encoder and a symmetric decoder. The encoder compressed the 2175 voxels obtained as described above into a low-dimensional representation consisting of nine layers with sizes 2175-2048-1024-512-256-128-64-32-d and the symmetrical reconstruction by the decoder as the output. Mean square error was used as the loss function to minimize the differences between the input of 2175 voxels and the reconstructed voxels at the output layer. Compared to the conventional dimensionality reduction methods such as principal component analysis, AutoEncoder is capable of learning intrinsic, nonlinear relationships in the input data and therefore better suited for high-dimensional nonlinear data [25].
Step two: ensemble clustering A common problem of clustering high-dimensional data is that inappropriate low-dimensional representations of data will lead to unreliable clustering results. To avoid this problem, we designed a new ensemble method to integrate hierarchical clustering results from multiple d-dimensional representations (d ∈ [2, 10]) (Supplementary Fig. 1) which obtained from the autoencoder. The Euclidean distance was used to compute the distance between participants and the complete linkage method was used to compute the distance between clusters. For each d-dimensional representation (d ∈ [2, 10]), we obtained a set of clustering results for all participants. Therefore, each participant was subsequently assigned nine class labels, one for each of the clustering result based on nine d-dimensional representations. A consensus was determined by the majority of the nine class labels and served as the cluster assignment for each participant. Therefore, the ensembled result can better reflect the inherent clustering characteristics of the data, because it integrates the results from multiple low-dimensional representations of participants.
Step three: optimization of clustering robustness While the ensemble clustering method was effective, it was relatively sensitive to the low-dimensional representations obtained from the autoencoder. To improve the robustness of the clustering results, we merged some clusters based on multiple runs of the clustering method. To this end, we first introduced a new index to quantify the robustness of a cluster. The robustness index R i of cluster i was calculated as where C j i is the j-th run of the clustering method on cluster i. A larger robustness index R i means a more stable cluster i. We then adopted an iterative, hierarchical scheme to merge clusters with low robustness. Specifically, we iteratively combined the two clusters with the lowest robustness indices until all clusters were robust enough, i.e., their robustness indices were greater than a threshold δ. In our analysis, δ was set to 0.8 based on our experiment on the brain image data. Step one: identification of significant functional alterations in MPDs and using AutoEncoder to reduce the dimension of the identified alterations to d ∈ [2,10].
Step two: for each of the nine low-dimensional data from step one, we obtained nine different class labels based on clustering analyses, and five clusters (cluster A, B, C, D, and E) were identified.
Step three: we performed the clusters merging process according to six runs of clustering and obtained two final subtypes. Furthermore, the subtypes varied in patterns of amplitude of low-frequency fluctuation alterations as compared to HC (voxel p < 0.001 with Gaussian random field correction for cluster p < 0.05). MPD major psychiatric disorder; HC healthy control; L left; R right; d dimension.

Subtype-related validation across multi-level biological data
Comparison of ALFF alterations with HC Group comparisons of ALFF values were performed by DPABI [26]. For each voxel, GLM was performed to examine the difference in ALFF between each subtype and HC. For GLM analyses, gender and group were included as discrete factors and age as a continuous factor, and the effect of group on ALFF was of primary interest. Statistical significance was determined by combining individual voxel p (uncorrected) < 0.001 with GRF correction for clusterlevel inference of p < 0.05.

Cortical thickness and white matter integrity
Group comparisons of cortical thickness were performed vertex-wise on the cortical surface by Freesurfer (MRI_glmfit), and FA values were calculated in SPM8 (http://www.fil.ion.ucl.ac.uk/spm). For each vertex or voxel, GLM was used to examine differences in cortical thickness and FA between each subtype and HC. GLM design and statistical significance were the same as those in ALFF analyses.

Genetic loading analysis
Association of PRS (PRS-SZBD and PRS-MDD) with each subtype was performed with logistic regression, and Nagelkerke's pseudo-R 2 was calculated to measure the proportion of variance explained. We estimated and analyzed high-resolution PRS at 105 different levels of P T (ranging from 0 to 0.5 with increments of 0.005 plus 10 −6 , 10 −5 , 10 −4 , 0.001, and 1). To correct for multiple comparison, a significance threshold of p = 0.004 was adopted as suggested by Euesden et al. [27].

Clinical and cognitive measures
Two-sample t-tests were used to examine differences in HAMD and BPRS total and factor scores and WCST scores between subtypes. Statistical significance was set at p < 0.05 with FDR correction for multiple comparisons. HAMD and BPRS factor scores were identified from exploratory factor analysis using the principal component factor method in MPDs (n = 581) ("Methods" in Supplementary information, Supplementary Tables 3  and 4). Subsequently, the resulting HAMD and BPRS factors were used in a group analysis where we performed two-sample t-tests (p < 0.05 with FDR correction) to examine the effects of medication status for each subtype.

Clinical diagnosis-related validation across multilevel biological data
We also performed analogous analyses on ALFF, cortical thickness, white matter integrity, PRS, risk gene expression, and effects of medication status based on clinical diagnosis.

Identified subtypes and relation to clinical diagnoses
The novel ensemble clustering method identified two subtypes in the MPDs sample (n = 581), Archetypal MPDs (cluster A, 60% of the MPDs sample) and Atypical MPDs ( Fig. 1 and Supplementary Table 5). The distribution of clinical diagnosis (SZ, BD, and MDD) varied between Archetypal and Atypical MPDs. A greater proportion of SZ appeared in Archetypal MPDs (40%) than Atypical MPDs (16%). BD and MDD represented 27% and 33%, respectively, in Archetypal MPDs and 35% and 49%, respectively, in Atypical MPDs (Supplementary Fig. 2a). From the perspective of clinical diagnoses, there were more SZ belong to Archetypal MPDs (86%). While the proportion of BD and MDD subtyped as Archetypal MPDs were relatively smaller (65 and 61%, respectively), BD and MDD comprised much larger portions of Atypical MPDs than SZ. 86% of SZ were subtyped asArchetypal Mods (Supplementary Fig. 2b).

Subtype-related gene expression
Combining GWAS data and frontal cortex eQTL, we identified 173 genes significantly associated with Archetypal MPDs (n = 143) and 138 genes with Atypical MPD (n = 100) (Supplementary Excel 1). These genes were then used as input to an expression enrichment analysis on the web-based tool, FUMA. The two sets of genes showed a b  differential expression profiles across 53 human tissues from GTEx [28]. Archetypal MPDs-associated genes were significantly expressed in 21 tissues; about half (11 tissues) represent brain tissues (Fig. 4a). The genes associated with Atypical MPDs were predominantly expressed in nonbrain tissues including the heart, prostate, pituitary, pancreas, thyroid, and liver (Fig. 4b).

Clinical characteristics within subtypes
Medicated Archetypal MPDs had significantly decreased HAMD and BPRS factor scores than their unmedicated counterpart (Fig. 5) (Fig. 5). No significant differences in HAMD and BPRS factor scores were observed between medicated and unmedicated Atypical MPDs (Fig. 5).

Discussion
In this study, we developed a novel clustering method utilizing deep learning to identify two major ALFF-based subtypes, Archetypal MPDs and Atypical MPDs, that differed in genetic, multimodal MRI, and clinical characteristics. Archetypal MPDs (60% of the MPDs sample) had significantly increased ALFF in frontal areas (prefrontal cortex, limbic, paralimbic, and striatum) and significantly decreased ALFF in posterior areas (primary sensory and motor cortices and unimodal association cortices), significantly higher genetic vulnerability with increased PRS-SZBD, enriched risk gene expression in brain regions including the frontal cortex, limbic system, basal ganglia, hypothalamus, cerebellum, and substantia nigra, and significantly decreased cortical thickness and white matter integrity in multiple brain regions, compared to HCs. Medicated Archetypal MPDs had significantly decreased HAMD and BPRS factor scores than unmedicated Archetypal MPDs, suggesting the effect of medication status on symptom severity in this subtype. In contrast, Atypical MPDs (40%) were defined by significantly decreased ALFF in frontal regions and significantly increased ALFF in the posterior brain without associated differences in PRS scores, cortical thickness, or white matter integrity compared to HC. Risk gene expression was prominent in nonbrain tissues, such as heart, liver, pancreas, and pituitary, which are regarded as somatic and endocrine-related tissues. No significant differences in HAMD and BPRS factor scores were observed between medicated and unmedicated Atypical MPDs, suggesting the lack of medication status effects on symptom severity in the subtype. Collectively, our findings implicated functional imbalance between the frontal and posterior regions as a core and differentiating feature across the psychotic-affective disorder continuum. Furthermore, the subtypes, Archetypal and Atypical MPDs, delineated by this feature were differentially associated with genetic vulnerability, risk gene expression, cortical thickness, and white matter integrity. Interestingly, Archetypal and Atypical MPDs were also distinct in the effects of medication status on symptom severity, suggesting possible differential pharmacologic effects in the two subtypes. Additionally, multimodal biological characterization based on clinical diagnosis showed continuum alterations across SZ, BD, and MDD. These findings further support that there is a greater neurobiological overlap than previously thought among these clinical diagnoses.
The observed ALFF pattern in Archetypal MPDs was consistent with our prior findings of increased frontal and decreased posterior ALFF as a shared feature across SZ, BD, and MDD [6,29,30]. Significantly increased ALFF appeared in frontal regions including the prefrontal cortex, limbic, paralimbic, and striatum and significantly decreased ALFF in the posterior primary cortices in MPDs [6]. Moreover, we also found that the ALFF ratio between these regions in slow-4 was negatively correlated with measures of negative and disorganized symptoms across SZ, BD, and MDD [6]. Altogether, these findings suggest impaired balance between regions conventionally known for emotional perception and processing and the visual cortices in MPDs.
Studies utilizing trans-diagnostic approaches are emerging [5,6,9,10] as converging evidence indicates core features across MPDs and increasing focus on the brain and neuropsychiatric disorders from a systems perspective (i.e., National Institute of Mental Health Research Domain Criteria). Our findings in this study defined two subtypes across clinical diagnostic boundaries. Each clinical diagnosis, SZ, BD, or MDD, was represented in each subtype reported herein. We also performed multimodal biological characterization based on clinical diagnosis and found continuum alterations across SZ, BD, and MDD (for details regarding methods and results, see Supplementary information). Altogether, these findings further support that there is a greater neurobiological overlap than previously thought among the three clinical diagnoses. The mismatch between subtypes and clinical diagnoses may in part explain frequent inconsistent results among studies based on clinical diagnosis. The constraints of our current diagnostic system are apparent [7,9,10,31]. Refining the current diagnostic system with relevant biological measures (e.g., frontal and posterior ALFF imbalance) would yield more biologically homogeneous groups, which are critically important for developing more effective and personalized treatment. Along these lines, we developed and compared two classification models using a 3D convolutional neural network that categorized participants based on (1) subtypes, Archetypal and Atypical MPDs and (2) clinical diagnoses, SZ, BD, and MDD. The accuracy and precision for the subtypebased model were significantly higher than the model based on clinical diagnoses, underscoring that clinical diagnoses share more similar features and are less distinguishable from each other in the classification models. For full details regarding methods and results, please see Supplementary information and Supplementary Fig. 8.
The differences between the two subtypes, Archetypal and Atypical MPDs, in association with PRS for BD and SZ, risk gene expression, cortical thickness, and white matter integrity are of significant interest. Compared to HC, Archetypal MPDs had increased PRS-SZBD, indicating a greater genetic vulnerability in this subtype. Archetypal MPDs had enriched risk gene expression in the brain (e.g., frontal cortex, limbic system, basal ganglia, hypothalamus, cerebellum, and substantia nigra), whereas Atypical MPD had risk gene expression more prominent in somatic and endocrine-related tissues such as the heart, liver, pancreas, and pituitary. Significant decreases in cortical thickness and white matter integrity were found broadly in Archetypal MPDs but not in Atypical MPDs. Findings in Archetypal MPDs are consistent with previous studies of MPDs [4,5,[32][33][34]. Decreased neuronal and glial density, and genetic and neurotransmitter alterations have been found in multiple brain regions including the anterior cingulate cortex, dorsolateral prefrontal cortex, and nucleus accumbens across SZ, BD, and MDD [35]. Genetic imaging studies in SZ and BD individuals and their relatives suggest decreased gray matter volume [32,36] and white matter integrity [33,37] as potential heritable biomarkers. Altogether, our findings support the Archetypal MPDs as a genetic and neurodevelopmental subtype of neuropsychiatric disorders. Further studies are needed to determine the nature of Atypical MPDs (e.g., stress-induced or stress diathesis) and better understand the biological implications of this group.
Intriguingly, medicated Archetypal MPDs had significantly decreased symptom severity as measured by the HAMD and BPRS than their unmedicated counterpart but no significant differences in symptom severity were observed in medicated versus unmedicated Atypical MPDs, suggesting differential pharmacologic effects between the subtypes. This is further supported by the findings that the associated decreases in cortical thickness and white matter integrity in Archetypal MPDs could represent pathological processes that are responsive to medications [38,39] or direct compensatory effects of medications [40,41]. Moreover, we found enriched risk gene expression in the brain in Archetypal MPDs but not in Atypical MPDs. Wang et al. previously identified a genetic profile in SZ similar to our Archetypal MPDs that consisted of brain-expressed, high-risk genes enriched for targets of approved drugs [42]. Altogether, these findings raise questions as to whether conventional pharmacologic treatment may be more effective in Archetypal than Atypical MPDs. Further studies are needed to examine differences in treatment response between the two subtypes.

Limitations
There are several limitations in this study. We used a single biomarker (ALFF) approach in our clustering method to identify subtypes within MPDs. There are likely other relevant biomarkers for clustering and subtyping, and multimodal data could capitalize on cross-information of the existing data [43]. Future studies are needed to identify other relevant biomarkers and determine how best to combine different measures of multimodal brain imaging features in clustering analyses for psychiatric disorders. As well as the potential use of the identified neuroimaging markers for individualized prediction of clinical or cognitive measures [44]. Further, while there appears to be a biological mechanism underlying ALFF, the exact nature of the ALFF alterations observed herein are not clear. They could relate to factors such as the number of prior depressive/manic/psychotic episodes. Unfortunately, we did not collect specific data about prior illness episodes and were not able to examine the relationship between ALFF alterations and prior depressive/manic/psychotic episodes. The ALFF alterations could also relate to other fMRI and biological measures aside from cortical thickness and white matter integrity. For greater biological validity and predictive utility, other biological or clinical measures should be included in future studies of the subtypes described here. Moreover, some studies have found that dynamic functional features are more conducive to information related to different mental activity than static features [45]. These cannot be measured using static parameters. Future work should consider applying dynamic functional features to investigate the abnormal activity. In addition, factors such as participant inclusion criteria, the size of our clusterdiscovery data set, and the ordinal nature of our clinical measures could have restricted our ability to identify other subtypes.

Conclusion
In summary, our findings implicated functional imbalance between frontal and posterior regions as a core and differentiating feature among MPDs. These findings could have significant contributions to the development of biologically informed diagnostic classifications and treatment guidelines across the psychotic-affective disorder continuum.