Introduction

Cognitive impairment is common to different psychiatric disorders, in particular depression and psychosis. At the same time, these disorders show marked heterogeneity in the level of cognitive impairment [1, 2]. Cognitive deficits in processing speed and verbal learning are proposed to have a central role in the pathogenesis of psychotic illness [3,4,5]; they appear prior to the first episode of psychosis in individuals at clinical high-risk (CHR) [6] and can also be observed in patients with mood disorders [7, 8]. Recent unsupervised machine learning (ML) studies investigating cognitive deficits in psychosis spectrum disorders, major depression, and bipolar disorder found subgroups of individuals exhibiting different degrees of cognitive impairment, ranging from cognitively spared to severely impaired profiles [1, 2, 9,10,11,12,13,14]. This suggests that individuals with different diagnoses might share similar cognitive characteristics.

Mixed samples of patients with schizophrenia and schizo-affective disorder [15], schizophrenia and bipolar disorder [16, 17], depression and bipolar disorder [18, 19], and first-episode psychosis and CHR for psychosis [20] can be subgrouped across diagnoses, which supports this notion. In this context, neurocognition may serve as an interface across psychiatric diagnoses to identify more homogeneous subgroups that show similarities in clinical symptoms and functioning. Although recent psychopathological models conceptualize cognitive impairment as a transdiagnostic dimension of psychopathology [21], it is unclear to what extent individual diagnoses overlap in severity of cognitive impairment and whether unsupervised ML can identify transdiagnostic subgroups with similar cognitive burden and potentially similar psychopathological pathways.

Neuro-endocrino-immunological alterations early in development are also suggested to be shared between depressive [22] and psychotic syndromes [23]. According to the neurodevelopmental hypothesis, particularly in psychosis, pathophysiological processes early in development affect neuronal circuits that impact cognition and social experiences and eventually increase vulnerability to the illness. Cognitive subgroups in depression and chronic psychosis are altered in brain structure [15, 24,25,26] and functional brain connectivity [27, 28], suggesting differences in their underlying neurobiological constitution. Supervised ML has shown sensitivity in identifying widespread and interrelated brain patterns, which might help to detect the complex patterns underlying cognitive subgroups [29]. To date, no study has investigated whether transdiagnostic cognitive subgroups in the early stages of the illness map onto underlying structural and functional neurobiological signatures. Such findings would further clarify the association between early, potentially premorbid, neurobiological alterations and cognitive dysfunction common across disorders.

We aim to investigate whether cognitive subgroups are shared between patients with recent-onset psychosis (ROP), recent-onset depression (ROD) and those at CHR for psychosis using unsupervised ML. We compare single-disease and transdiagnostic subgroup solutions to determine whether the transdiagnostic clustering renders single-disease subgrouping obsolete. We consider this confirmed if the cognitive characteristics of the transdiagnostic subgroups overlap with those of the subgroups identified in the single-disease clusterings. Individuals assigned to more impaired subgroups should show more pronounced brain structural (sMRI) and resting-state functional MRI (rsfMRI) alterations relative to healthy controls (HC) and spared cognitive subgroups as a consequence of early pathophysiological processes.

Materials and methods

Sample

A discovery sample included 465 individuals with psychotic or affective illness or at risk for psychosis and 286 HC, between 15 and 40 years, recruited through the PRONIA study (Personalized Prognostic Tool for Early Psychosis Management, www.pronia.eu; German Clinical Trials Register: DRKS00005042) from seven sites (supplementary material). A replication dataset acquired by the same consortium included 433 patients and 178 HC from the same seven sites (Fig. S1). Written informed consent was obtained from the subjects. Each Local Research Ethics Committee declared their ethical approval for the study [30].

Study group-specific inclusion criteria were used [30] in addition to general inclusion and exclusion criteria (supplementary material). ROP were included in the study if they fulfilled DSM-IV-TR criteria for a psychotic episode present in the last 3 months, lasting longer than one week, and with a first onset in the last 24 months. ROD were included if they fulfilled DSM-IV-TR criteria for a first manifestation of a depressive episode present in the last 3 months and with onset in the last 24 months. CHR status for psychosis was defined as attenuated psychotic symptoms [31], brief limited intermittent psychotic symptoms, cognitive disturbances [32] or positive family history (1st degree relatives) for psychosis/schizotypal personality disorder according to DSM-IV-TR alongside a drop in functioning in the last 6 months. HC volunteers were included if they did not fulfill criteria for any current or past DSM-IV-TR axis I or II diagnosis and/or CHR status for psychosis.

After quality control, the final discovery data set consisted of 668 participants (ROP = 140, ROD = 130, CHR = 128, HC = 270; mean age (yrs; SD) = 25.3 (6.0), females = 353, 52.8%). For the imaging analyses, an additional 15 discovery participants were excluded due to excessive head movement during the MRI and missing images. The final replication data set consisted of 409 participants (ROP = 108, ROD = 81, CHR = 100, HC = 120; mean age (yrs; SD) = 24.54 (5.7), females = 215, 52.6%; Table 1; Fig. S1; supplementary material).

Table 1 Demographic and clinical characteristics of the discovery and replication sample.

Clinical and cognitive assessment

We assessed clinical symptoms and functioning using the General Assessment of Functioning Scale (GAF) split into a ‘disability’ (D) and ‘symptom’ (S) score [33], the Positive and Negative Syndrome Scale (PANSS) [34] and the Beck Depression Inventory (BDI) [35].

We characterized cognitive performance using a battery of cognitive tests spanning eight cognitive domains according to the MATRICS Consensus Cognitive Battery (MCCB) [36, 37]: visual memory (Rey-Osterrieth Complex Figure Test [38, 39]), social cognition (Diagnostic Analysis of Nonverbal Accuracy [40]), working memory (auditory digit span [41]; self-ordered pointing task [42]), processing speed (verbal fluency test [43]; Trail Making Test A [44]; Digit Symbol Substitution Test [41]), verbal learning and memory (Rey Auditory Verbal Learning Test [45]), executive functioning (Trail Making Test B [44]), attention/vigilance (Continuous Performance Test-Identical Pairs [46]) and salience [47] (Table S1, S2). We assessed intelligence using proxies from the Wechsler Adult Intelligence Scale (WAIS-IV), the vocabulary subtest, and the matrix subtest [41].

Preprocessing of cognitive variables

Preprocessing and clustering analysis of cognitive variables followed a pipeline similar to one recently established [14] (supplementary material). After quality control, the final analysis data set consisted of 84 cognitive variables showing on average 0.6% missing values (SD cognitive variables = 0.3%; SD participant = 5.3%). We retained the identical set of cognitive variables in the replication data, where preprocessing and statistical analysis followed the same steps (Fig. S1). The final replication data set showed on average 0.6% missing values (SD cognitive variables = 0.8%; SD participant = 2.4%) prior to the statistical analysis (Fig. S2).

Dimensionality reduction and K-means clustering analyses

Identical pipelines were conducted for the transdiagnostic clustering and the individual study group clustering and contained the following steps. Cognitive data were imputed and effects of age, sex, years of education, and study site were regressed out. To reduce the dimensionality of the cognitive features, we conducted cognitive domain-wise principal component analyses (PCA) (Table S2; Fig. S3) retaining the first component of each cognitive domain (N = 8). The eight cognitive domain scores were used in a K-means clustering analysis embedded in a resampling procedure to determine the optimal number of clusters and cluster stability (supplementary material). HC were used as a comparison group to the obtained cognitive subgroups and not part of the clustering procedure.
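As a rough illustration of the final clustering step only, the following Python sketch runs a plain K-means on eight-dimensional domain scores. It is not the pipeline used in this study, which was run in R with a resampling procedure to select the number of clusters; imputation, covariate regression and PCA are omitted, the number of clusters is fixed at two, and the toy scores, the function name `kmeans` and the farthest-point initialisation are assumptions made for this example.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    # Assumption for this sketch: start from the first point, then add the
    # point farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each participant goes to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical domain scores (8 per participant): low scores vs. high scores.
low = [[-1 + 0.1 * ((i + j) % 3 - 1) for j in range(8)] for i in range(6)]
high = [[1 + 0.1 * ((i * j) % 3 - 1) for j in range(8)] for i in range(9)]

centroids, labels = kmeans(low + high, k=2)
print(labels)
```

On this toy input the first six participants fall into one cluster and the remaining nine into the other, mirroring an impaired/spared split.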

Statistical analyses for cluster characterization

We calculated one-factorial Analyses of Variance (ANOVAs) with the factor ‘cluster + HC’ to characterize cognitive, demographic, clinical and functioning differences between clusters and HC. To characterize cognitive differences between subgroups of individual clusterings (ROP, ROD, CHR), we calculated two-factorial ANOVAs with ‘cluster’ and ‘study group’ as between-factors. P-values were Benjamini-Hochberg false discovery rate (FDR) corrected [48] within their domain (cognitive, demographic and clinical/functioning) and FDR corrected pairwise t tests/chi-squared tests (for nominal scales) were calculated for individual comparisons.
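For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be sketched in a few lines of Python. The p-values below are hypothetical and the function name `benjamini_hochberg` is an assumption for this example; in R, the same adjustment is available through `p.adjust(p, method = "BH")`.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five tests within one domain.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
print([round(p, 5) for p in adj])
```

In this toy example the third and fourth tests are nominally significant (p < 0.05) before correction but not afterwards.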

Analyses were conducted in R version 3.6.1 (https://cran.r-project.org/bin/windows/base/). We used the ‘clusterboot’ [49] and ‘kmeansruns’ functions contained in the ‘fpc’ package [50] for cluster stability assessment and cluster number estimation.

Preprocessing of neuroimaging data

Preprocessing and quality control of the gray matter (GM) images followed the protocols established previously [30] and the CAT12 manual (www.neuro.uni-jena.de/cat12/CAT12-Manual.pdf), respectively. The rsfMRI data preprocessing followed the protocol established in ref. [51]. In brief, rsfMRI data were parcellated into 160 regions of interest according to the Dosenbach functional atlas [52] and mean signal was extracted from 10 mm spheres around each region. We calculated pairwise Pearson’s correlations of the average time series between ROIs using in-house scripts running in Matlab R2015 resulting in connectivity matrices of 12720 resting-state functional connectivity (rsFC) features per participant (supplementary material).
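The feature count follows from taking each unordered pair of the 160 regions once: 160 × 159 / 2 = 12720. A minimal Python sketch of this step is given below; the original analysis used in-house Matlab scripts, so the function names and the four toy time series here are assumptions for illustration only.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def connectivity_features(time_series):
    """Correlate every unordered ROI pair once (upper triangle, no diagonal)."""
    pairs = combinations(range(len(time_series)), 2)
    return [pearson(time_series[i], time_series[j]) for i, j in pairs]


# Number of unique ROI pairs for the 160-region atlas.
n_features = math.comb(160, 2)

# Toy example with 4 hypothetical ROI time series -> 6 features.
toy = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 4, 3, 2, 1],
    [1, 3, 2, 5, 4],
]
features = connectivity_features(toy)
print(n_features, len(features))
```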

Neuroimaging classification analyses

We built supervised ML models to assess the discriminability of the transdiagnostic clusters with respect to the GM volume and rsFC brain features. Classification performance between the obtained cognitive subgroups was assessed in two separate supervised ML pipelines each embedded in 10 × 10 repeated nested cross-validation using NeuroMiner (http://www.pronia.eu/neurominer) running in a MATLAB 2019a environment (MathWorks Inc.). We used an optimized linear support vector machine (SVM) algorithm and assessed classification performance based on balanced accuracy (BAC). We applied permutation testing (Nperm = 100; alpha = 0.05) to assess the final model significance (supplementary material). Using the same pipelines, we conducted classification analyses between individual study groups and HC, e.g., all ROP individuals against HC, to investigate whether the transdiagnostic subgrouping increased prediction accuracy relative to the individual study group classification.
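Balanced accuracy is the mean of sensitivity and specificity, which makes it robust to unequal group sizes. The Python sketch below illustrates this with a hypothetical degenerate classifier that labels every participant as HC; the counts are illustrative only (loosely based on the discovery group sizes) and the function name is an assumption, not part of NeuroMiner.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical classifier that predicts "HC" for everyone:
# 146 patients are all missed, 270 HC are all correct.
tp, fn, tn, fp = 0, 146, 270, 0

accuracy = (tp + tn) / (tp + fn + tn + fp)
bac = balanced_accuracy(tp, fn, tn, fp)
print(round(accuracy, 3), bac)
```

Plain accuracy rewards the majority class here (about 65%), whereas BAC stays at chance level (50%).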

External validation analyses

We projected the centroids of the transdiagnostic discovery sample cluster solution into the data spaces of the transdiagnostic replication sample. Similarly, we projected the centroids of the individual study group cluster solutions into the data spaces of the respective study groups of the replication sample. Participants were assigned to the closest cluster centroid using Euclidean distance. We evaluated the validity of the external replication of the clusters relative to the effects obtained for the discovery solution (supplementary material).
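The assignment rule can be sketched as follows in Python, assuming the replication participants are already expressed as the same eight domain scores as the discovery centroids. The centroid values, the sample values and the function name `assign_to_centroids` are hypothetical.

```python
import math


def assign_to_centroids(samples, centroids):
    """Return, for each sample, the index of the nearest centroid (Euclidean)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(s, centroids[c]))
        for s in samples
    ]


# Hypothetical discovery centroids in the 8-dimensional domain-score space.
discovery_centroids = [
    [-0.8] * 8,  # index 0: impaired
    [0.5] * 8,   # index 1: spared
]

# Hypothetical replication participants.
replication_samples = [
    [-0.9] * 8,
    [0.4] * 8,
    [-0.1] * 8,
]

assignments = assign_to_centroids(replication_samples, discovery_centroids)
print(assignments)
```

No new clustering is fitted on the replication data in this scheme; the discovery centroids alone determine subgroup membership.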

Results

We identified a highly stable two-cluster solution (supplementary material, Fig. S4). Cluster 1 (N = 146) consisted of 79 (54%) ROP, 30 (21%) ROD and 37 (25%) CHR. Cluster 2 (N = 252) consisted of 61 (24%) ROP, 100 (40%) ROD and 91 (36%) CHR. The proportion of ROD was higher in cluster 2 and the proportion of ROP was higher in cluster 1 (χ²(2, N = 398) = 37.195, both p < 0.001; Table S3).
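The reported chi-squared statistic can be reproduced from the cluster-by-study-group counts given above. The Python sketch below is an illustration rather than the original R code; the function name is an assumption, and the p-value uses the closed form exp(−x/2), which holds only for two degrees of freedom.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: cluster 1, cluster 2; columns: ROP, ROD, CHR (counts from the text).
counts = [
    [79, 30, 37],
    [61, 100, 91],
]

stat, dof = chi_squared(counts)
# Survival function of the chi-squared distribution with exactly 2 dof.
p_value = math.exp(-stat / 2)
print(round(stat, 3), dof, p_value < 0.001)
```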

Differences in cognitive performance between transdiagnostic clusters

Analyses revealed reduced cognitive performance in cluster 1 when compared to cluster 2 across all cognitive domains, i.e., social cognition, working memory, processing speed, executive functioning, attention, visual memory, verbal memory and salience (main effect: cluster 1/cluster 2/HC: F(2,665) > 5.822, p < 0.01; individual cluster comparisons: p < 0.001 for all cognitive domains except for salience: p < 0.01). Whereas cluster 1 performed significantly worse than HC across all cognitive domains (p < 0.01) except for salience (p = 0.110), cluster 2 performed comparably to HC with respect to social cognition (p = 0.920), processing speed (p = 0.170), executive functioning (p = 0.370), verbal memory (p = 0.71), attention (p = 0.130), visual memory (p = 0.640) and salience (p = 0.056). Additionally, cluster 2 performed better than HC with respect to working memory (p < 0.05). We found the same pattern for the WAIS vocabulary and matrix scores, which were not part of the clustering procedure (Table 2; Fig. 1).

Table 2 Cognitive characteristics of the transdiagnostic cluster solution in discovery and replication sample.
Fig. 1: Cognitive characteristics of clusters based on the transdiagnostic and individual clustering analyses in the discovery sample.

A represents the cognitive performances of impaired (blue) and spared clusters (green) for the transdiagnostic cluster solution. B represents the cognitive performances of impaired (shades of blue) and spared (shades of green) clusters for the clusterings based on recent-onset depression patients (ROD), recent-onset psychosis patients (ROP) and clinical high-risk individuals (CHR) separately. For comparison impaired and spared clusters of the transdiagnostic cluster solution are shown in gray. For both sections: High principal component (PCA) scores represent high performance. Abbreviations: vismem visual memory, soccog social cognition, wm working memory, proc processing speed, exfun executive functioning, att attention, sal salience, verbmem verbal memory.

Hereafter, cluster 1 is referred to as the impaired cluster, and cluster 2 as the spared cluster.

Differences in clinical and functioning characteristics between transdiagnostic clusters

The impaired cluster in comparison to the spared cluster showed significantly lower functioning with respect to the GAF symptom scale in the last month (main effect: cluster 1/cluster 2/HC: F(2, 660) = 810.080, p < 0.001; impaired vs spared: p < 0.001) and the last year (main effect: cluster 1/cluster 2/HC: F(2, 660) = 253.120, p < 0.001; impaired vs spared: p < 0.001) before study entry as well as across the lifespan (main effect: cluster 1/cluster 2/HC: F(2, 660) = 92.854, p < 0.001; impaired vs spared: p < 0.001). Both the impaired and the spared cluster showed significantly lower functioning with respect to the GAF symptom scale in the last month (impaired: p < 0.001, spared: p < 0.001) and the last year (impaired: p < 0.001, spared: p < 0.001) before study entry as well as across the lifespan (impaired: p < 0.001, spared: p < 0.001) when compared to HC. Effects between clusters and HC were similar with respect to the GAF disability scale (Table S3; Fig. 2).

Fig. 2: Functional characteristics and characteristics with respect to symptoms of the clusters based on the transdiagnostic and individual clustering analyses in the discovery sample.

Functioning differences (A) and psychotic and depressive symptom differences (B, C) of impaired (shades of blue) and spared (shades of green) clusters for the clusterings based on recent-onset depression patients (ROD), recent-onset psychosis patients (ROP) and clinical high-risk individuals (CHR) separately. For comparison impaired and spared clusters of the transdiagnostic cluster solution are shown in gray. Abbreviations: GAF S global assessment of functioning (symptom scale), PANSS positive and negative syndrome scale, BDI Beck’s depression inventory.

The impaired cluster in comparison to the spared cluster showed significantly higher positive (t(245.28) = 4.728, p < 0.001), negative (t(213.82) = 3.955, p < 0.001) and general (t(245.56) = 2.585, p < 0.05) symptoms on the PANSS scale. Both the impaired and the spared cluster showed significantly higher depressive symptoms on the BDI as compared to HC (main effect: cluster 1/cluster 2/HC: F(2, 621) = 323.526, p < 0.001; impaired vs HC: p < 0.001; spared vs HC: p < 0.001), while the spared subgroup showed higher depressive symptoms in comparison to the impaired subgroup (p < 0.001).

Differences in cognitive performance between single-disease clusterings

Similar to the transdiagnostic cluster solution, the individual clusterings (ROP: impaired: N = 42 [30%], spared: N = 98 [70%]; ROD: impaired: N = 45 [35%], spared: N = 85 [65%]; CHR: impaired: N = 59 [46%], spared: N = 69 [54%]) showed an impaired subgroup with widespread reductions in cognitive performance relative to HC and a spared subgroup often performing similar to or better than HC (supplementary material, Tables S4–S6).

When comparing impaired and spared subgroups across clusterings, we found that the impaired and spared subgroups of the ROP clustering performed significantly worse than the impaired and spared subgroups of ROD and CHR in the domains of working memory (main effect study group: F(2, 392) = 12.687, p < 0.001; impaired ROD: p < 0.001; impaired CHR: p < 0.001), processing speed (main effect study group: F(2, 392) = 26.603, p < 0.001; impaired ROD: p < 0.001; impaired CHR: p < 0.001), attention (main effect study group: F(2, 392) = 16.453, p < 0.001; impaired ROD: p < 0.001; impaired CHR: p < 0.001), and verbal memory (main effect study group: F(2, 392) = 19.371, p < 0.001; impaired ROD: p < 0.001; impaired CHR: p < 0.001). Additionally, we obtained a significant interaction for visual memory (interaction effect: F(2, 392) = 14.324, p < 0.001), showing that the impaired ROP group performed significantly worse than impaired ROD (p < 0.001) and CHR (p < 0.001), whereas the spared ROP group performed comparably to the spared ROD (p = 0.058) and CHR group (p = 0.652). We obtained a significant interaction for executive functioning (interaction effect: F(2, 392) = 19.273, p < 0.001; impaired ROD: p < 0.001; impaired CHR: p < 0.001), showing that the impaired ROP group performed significantly worse than impaired ROD and CHR (p < 0.001), whereas reductions between the spared ROP group and the spared ROD (p < 0.05) and CHR group (p < 0.05) were less pronounced (Table S7).

Differences in clinical and functional characteristics between single-disease clusterings

Impaired and spared cognitive subgroups of the individual clusterings were less distinct with respect to functional impairments (Tables S8–S10, Fig. 2). Whereas impaired and spared subgroups in ROP followed a pattern similar to the transdiagnostic cluster solution, i.e., higher functional impairment in the cognitively impaired in comparison to the cognitively spared subgroup, impaired ROD subgroups and impaired CHR subgroups showed less functional impairment. Additionally, we found significantly higher negative symptoms for impaired as compared to spared ROP (t(72.823) = 3.006, p < 0.01) as well as significantly higher depressive symptoms for spared ROD as compared to impaired ROD and HC (main effect: cluster 1/cluster 2/HC: F(2, 377) = 281.619, p < 0.001; spared vs impaired: p < 0.01; spared vs HC: p < 0.001).

Differences in GM volume between transdiagnostic clusters

The SVM classification model based on GM volume separated the cognitively spared cluster from HC (BAC = 53.1%, Sensitivity (Sens) = 55.6%, Specificity (Spec) = 50.6%, positive predictive value (PPV) = 51.7%, negative predictive value (NPV) = 54.5%; p = 0.04), while it could separate neither the cognitively impaired cluster from HC (BAC = 51.4%, Sens = 47.9%, Spec = 54.8%, PPV = 36.9%, NPV = 65.6%; p = 0.37) nor the cognitively impaired cluster from the cognitively spared cluster (BAC = 53.0%, Sens = 42.4%, Spec = 63.7%, PPV = 40.4%, NPV = 65.6%; p = 0.14) (Figs. 3, S5). Classification of the transdiagnostic subgroups provided no or no substantial gain in accuracy over the classification of individual study groups from HC (Table S11).

Fig. 3: Reliability maps of significant sMRI and rsFC classification models as measured by the cross-validation ratio (CV ratio).

The upper row in panel (A) shows the predictive connectivity patterns for the ‘impaired vs HC’ rsFC model. The lower row in (A) shows the predictive connectivity patterns for the ‘spared vs HC’ rsFC model. The ten most predictive connectivity patterns for HC status are marked in blue and the ten most predictive connectivity patterns for impaired/spared cluster status are marked in red. A list containing the predictive features is given in supplementary Table S12. Figures were generated using the BrainNet Viewer. B shows the voxel reliability maps for the significant ‘spared vs HC’ sMRI model. Voxels predictive of spared status are represented by positive CV ratio (warm colors) and voxels predictive of HC status are represented by negative CV ratio (cool colors). Reliability maps are thresholded at the 99th percentile for both positive and negative CV ratio.

Differences in functional connectivity between transdiagnostic clusters

The SVM classification model based on rsFC separated the cognitively impaired cluster (BAC = 58.5%, Sens = 73.6%, Spec = 43.3%, PPV = 41.7%, NPV = 74.8%; p < 0.01) and the cognitively spared cluster (BAC = 61.7%, Sens = 62.5%, Spec = 60.9%, PPV = 60.3%, NPV = 63.1%; p < 0.01) from HC. The model classifying the cognitively spared and impaired clusters was not significant (BAC = 55.9%, Sens = 49.3%, Spec = 62.5%, PPV = 43.3%, NPV = 68.0%; p > 0.05) (Table S12, Figs. 3, S5). Classification of the transdiagnostic subgroups provided no or no substantial gain in accuracy over the classification of individual study groups from HC (Table S11).

External validation

Differences in cognitive, functional and clinical characteristics between transdiagnostic clusters

Similar to the findings in the discovery sample, the transdiagnostic cluster solution of the replication sample showed an impaired subgroup with widespread reductions in cognitive performance relative to HC and a spared subgroup often performing similar to or better than HC (Table 2, Fig. S6).

Transdiagnostic cluster effects with respect to functioning were less pronounced in the replication sample (Table S13, Fig. S7). The impaired cluster in comparison to the spared cluster showed significantly higher positive (t(270.63) = 4.603, p < 0.001) symptoms on the PANSS scale. The spared subgroup showed higher depressive symptoms in comparison to the impaired subgroup (main effect: cluster 1/cluster 2/HC: F(2, 346) = 128.659, p < 0.001; spared vs impaired: p < 0.05).

Differences in cognitive, functioning, and clinical characteristics between single-disease clusters

Individual clusterings showed an impaired subgroup with widespread reductions in cognitive performance relative to HC and a spared subgroup often performing comparable to or better than HC. When comparing impaired and spared subgroups across clusterings, we found that the impaired and spared subgroups of the ROP clustering performed significantly worse in comparison to the impaired and spared subgroups of ROD and CHR (Tables S7, S14–S16, Fig. S6).

Impaired ROP, ROD and CHR subgroups did not show significantly more functional impairment than their spared subgroups. Impaired and spared clusters were not distinct with respect to BDI and symptoms on the PANSS (Tables S17–S19, Fig. S7).

Discussion

The current study identified spared and impaired cognitive subgroups across a transdiagnostic sample and in patients with affective and psychotic illness and CHR state. Impaired subgroups showed widespread cognitive impairment, while the spared subgroups showed cognitive performance comparable to HC. Single-disease clustering analyses indicated that ROP were characterized by more impairment in both impaired and spared subgroups than the ROD and CHR groups and provided a more refined picture of functional impairments and symptoms associated with cognitive subgroups than the transdiagnostic clustering solution. We found a higher discriminability of the transdiagnostic cognitive subgroups based on rsfMRI than on sMRI. Analyses based on rsfMRI showed that transdiagnostic clusters were significantly differentiated from HC.

Previous studies of patients with established psychiatric illness and at CHR identified subgroups showing a ‘severe cognitive deficit’ as well as a subgroup with ‘preserved cognitive performance’ [1, 2, 9, 10, 13,14,15,16, 18,19,20]. We showed that this cognitive heterogeneity is also present at an early stage in psychotic and depressive disorders and at CHR. Cognitive impairments in impaired subgroups were most strongly pronounced in working memory, verbal memory, processing speed and attention. Consistently, meta-analytic findings show that processing speed, verbal memory and working memory count among the most strongly impaired cognitive domains in schizophrenia and depression [21, 53]. As evident from single-disease clustering, ROP showed significantly more impairment with respect to these cognitive domains than impaired ROD or CHR. Further, processing speed and verbal memory represent the most predictive cognitive domains for transition to psychosis in CHR [54]. Our findings indicate that although cognitive impairment is a transdiagnostic phenomenon which shows substantial heterogeneity, individuals with impaired cognition in psychosis, depression and CHR state vary in severity [21]. Differences in pharmacological treatment [21] and in illness duration [55] might play only a limited role in explaining these variations in the current study, as individuals showed only short-term exposure to pharmacological treatment and ROP showed shorter illness duration than ROD. Therefore, illness-specific psychopathological characteristics have likely contributed to the differences [21].

Cognitively impaired subgroups across individual psychiatric conditions have been associated with greater deficits in functioning [20, 56,57,58,59,60,61,62] and higher burden in clinical symptoms [57, 59, 61]. The impaired transdiagnostic subgroup was associated with more functional impairment compared to the transdiagnostic spared subgroup and HC. Consistent with our cognitive findings in the single-disease clustering, we found that functional impairment in the cognitively impaired subgroup is more pronounced in ROP than in ROD and CHR. The transdiagnostic clustering indicated significantly higher positive, negative, and general symptoms in the impaired subgroup as compared to the spared subgroup and HC. The effect for negative symptoms seemed to be driven by the cognitively impaired subgroup in ROP, which is consistent with previous literature [61, 62]. The spared transdiagnostic subgroup showed significantly higher depressive symptoms as compared to the impaired subgroup and HC, which seemed to be mainly driven by spared ROD. Meta-analyses in depression show inconsistent findings for the association of cognition and depressive symptom severity [63,64,65].

Both GMV and functional connectivity alterations have been identified in schizophrenia, CHR and depression [66,67,68]. SVM classifiers based on rsFC showed superior accuracy in differentiating the impaired and spared transdiagnostic subgroups which is in line with findings reporting higher sensitivity of ML algorithms in the functional resting-state compared to the neuroanatomical brain modality [29]. Transdiagnostic spared and impaired clusters were classified with relatively low accuracy in both imaging modalities. We found no or no substantial gains in classification accuracy when classifying the transdiagnostic subgroups as compared to individual study groups against HC. In sum, this suggests that transdiagnostic cognitive impairment in early illness accounts for a small amount of variance in structural and resting-state functional brain measurements.

We validated the discovery sample results regarding the transdiagnostic spared and impaired subgroup in the PRONIA replication sample and confirmed the finding that cognitive dysfunction in the impaired cognitive subgroups is more pronounced in ROP as compared to ROD and CHR. Cognitive subgroups in the replication sample were less associated with differences in general functioning and symptoms.

There are limitations to our findings. First, different numbers of cognitive variables per cognitive domain were used for clustering, leading to an underrepresentation of certain cognitive domains, e.g., social cognition (Table S2) [14]. Second, we used SVM algorithms due to their high interpretability, whereas non-linear classification algorithms, such as deep neural networks [69], might have yielded higher classification accuracies. Further, focusing the analyses on a priori defined brain networks [70] might have increased the sensitivity to differentiate the groups. Third, classification performance in the imaging analyses might have been limited by differences in sample size between groups, although we weighted the hyperplane of the SVM algorithm in favor of the minority group. Fourth, due to low SVM classification accuracies in the discovery sample, we did not apply our brain models to the replication sample.

We provide evidence that ROP, ROD and CHR differ in cognitive heterogeneity, while cognitive subgroups in individual study groups map onto different general functioning and symptom characteristics. Transdiagnostic impaired subgroups did not reveal increased classification performance in the investigated structural and functional brain modalities relative to the spared subgroup and HC, indicating heterogeneity in underlying neurobiological patterns for the identified transdiagnostic subgroups. Study-group-specific cognitive subgroups might be more informative than transdiagnostic subgroups.