Introduction

Bipolar disorder (BD) is a heterogeneous mood disorder that is divided into two phenotypically distinct subtypes based on DSM (Diagnostic and Statistical Manual of Mental Disorders) diagnostic criteria [1]. While subtype I (BD-I) is characterized by periods of mania, which may alternate with depressive episodes, subtype II (BD-II) is diagnosed when hypomanic and depressive episodes are present but no episodes of full mania. Beyond diagnostic criteria, differences in the long-term disease course between subtypes have been found, with more (hypo-)manic episodes [2, 3] and hospitalizations [4, 5] in BD-I and more frequent and prolonged depressive episodes [3, 4, 6] in BD-II. Depending on the subtype, different treatment approaches may be more beneficial [4, 7,8,9,10]. Further, differential diagnosis of subtypes is commonly based on phenotypic characteristics and clinical assessment of the degree of impairment, with a clear differentiation only being possible if certain characteristics are present [11]. Therefore, a key challenge for clinicians remains the distinction of BD subtypes [12, 13]. This underlines the importance of investigating other, potentially meaningful markers to enhance subtype diagnosis and treatment response. Studying neuronal [14] and genetic [15, 16] correlates contributes to a more detailed subtype characterization and could provide additional clinical and therapeutic benefits [12, 17].

Previous neuroimaging research showed widespread white and gray matter abnormalities in BD patients compared with healthy controls (HC) [18,19,20,21,22,23]. However, some studies included one subtype only [18, 23,24,25,26] or did not differentiate between subtypes in their analyses [19, 27,28,29,30,31]. Those few studies that directly compared subtypes yielded mainly inconclusive results [20, 21, 32,33,34,35,36,37,38,39,40]: Diffusion tensor imaging (DTI) studies, which usually focus on fractional anisotropy (FA) as a quantitative measure of WM microstructure or integrity, have yielded conflicting results when conducting direct comparisons between subtypes, with some finding reduced FA in the temporal and frontal pathways in BD-I [34, 35, 38] and others in BD-II [32, 39]. Although these studies only allow tentative conclusions due to consistently small samples (mostly n < 30 per group) and varying methods, current evidence rather points to more severe WM microstructural impairments in BD-I compared with BD-II [41]. Regarding gray matter volumes (GMV), some studies found lower volumes in temporal [37, 40, 42], (pre-)frontal [37, 40, 42] and posterior cingulate regions [37] and in the putamen [33] in BD-I compared with BD-II, while others found no GMV differences between subtypes [20, 21, 36, 43, 44]. In conclusion, although there seems to be preliminary data attesting more pronounced WM and GM changes in BD-I compared with BD-II e.g. [35, 37, 38, 42], the evidence is inconsistent and it remains unclear whether existing neuroimaging findings on BD can be generalized to both subtypes.

Going one step further, the issue is whether neurobiological alterations support the categorical classification into clinical subtypes according to DSM, which has so far been based on phenotypic markers only [45]. Even though these phenotypic features are dimensional in nature, they are categorized based on the severity of their expression resulting in the two discrete BD subtypes. Therefore, neurobiological alterations may form a continuum, representing the subtypes as one clinical entity with varying severity. This would be captured by a dimensional diagnostic approach, where neurobiological alterations are associated with dimensional clinical features rather than discrete diagnostic categories [46, 47].

Underlying genetics provide evidence that fits both approaches. Based on the latest genome-wide association study (GWAS), BD subtypes share a large amount of genetic composition, showing a strong between-subtypes correlation of r = 0.85, while correlations with other mental disorders such as schizophrenia (SZ) or major depressive disorder (MDD) were weaker (all r ≤ 0.66) [15]. In contrast, the strength of the GWAS correlations with other mental disorders differed between subtypes, with a greater correlation between BD-II and MDD, and between BD-I and SZ [15, 48, 49]. Moreover, both heritability estimates based on twin studies [48, 50] and GWAS-based SNP heritability (h2SNP) were higher for BD-I than for BD-II [15, 51]. Beyond differences in measures of heritability, identification of distinct loci might be related to subtype-specific symptomatology [15]. BD-I thus might have a stronger genetic component than BD-II.

A higher genetic load, which was also related to a more severe course of BD across subtypes [52], could also be linked to brain structural alterations [53,54,55]. Despite evidence for subtype-specific neuroanatomy and genetics, the relationship between these features in BD subtypes remains uninvestigated. Exploring the effects of genetic factors on brain structure altered in BD subtypes may provide insight into the neural mechanisms by which genetic variation has an impact on the disease at the psychopathological level.

To date, we are not aware of any study investigating subtype-specific differences in GMV and WM within the same sample subdivided by BD type. This study focuses on WM and GMV differences among BD-I, BD-II, and HC, aiming to investigate the neurobiological underpinnings of conventional BD subtype categories. Given previous heterogeneous neuroimaging findings [20, 21, 32,33,34,35,36,37,38,39,40], we employed a whole-brain approach. Rather than adhering to diagnostic categories, we also took a dimensional approach, exploring other potential subtype-specifying factors in relation to neurobiology. Adding this perspective, we aim to contribute to a more nuanced understanding of the neurobiological basis of the subtypes of BD.

First, we expected a decrease in WM integrity and GMV in BD patients, resulting in the following pattern of group differences: BD-I < BD-II < HC (categorical perspective). Second, we exploratively examined potential relationships between white and gray matter, BD polygenic risk scores (PRS) and various clinical characteristics (dimensional perspective).

Materials and methods

Participants

This study included n = 136 patients with BD and n = 136 HC from the Marburg-Münster-Affective-Cohort-Study (MACS; see ref. [56] for a general description). The data in the current study are a subsample of a previously published analysis by our group on WM microstructural differences between healthy, depressed, and bipolar individuals regardless of subtype [22]. Recruitment was conducted via newspaper advertisement and flyers, and in psychiatric hospitals. Inclusion criteria for HC were the absence of any current or lifetime psychiatric disorder, whereas BD patients required a current or lifetime diagnosis of bipolar disorder. The presence or absence of mental disorders was assessed by trained personnel with the Structured Clinical Interview (SCID-I) [57] according to the DSM-IV-TR criteria [58]. Based on this, BD patients were grouped into Bipolar I (BD-I, n = 73; n = 42 female, Mage = 41.77, SDage = 11.51) and Bipolar II (BD-II, n = 63; n = 33 female, Mage = 40.48, SDage = 12.56) subtypes. General exclusion criteria comprised usual magnetic resonance imaging (MRI) contraindications, head trauma, and any history of neurological, cardiovascular, or other severe medical conditions (e.g., cancer, infections, and autoimmune disease). A further exclusion criterion for BD patients was a lifetime diagnosis of alcohol or substance dependence (other than tetrahydrocannabinol dependence), while for HC any current intake of psychotropic medication resulted in exclusion from the study. All participants were aged between 18 and 65 years, and HC and BD patients were matched according to age, sex, and study site using the MatchIt package in R [59].

During the clinical interview, information about current symptomatology, lifetime course of disease, and psychopharmacological treatment was collected. The 21-item Hamilton Depression Rating Scale [60] and the Young Mania Rating Scale [61] were used to measure current depressive and (hypo-)manic symptoms, respectively. Disease course variables including number and cumulative duration of depressive and (hypo-)manic episodes and psychiatric hospitalizations, time since first symptoms (measured by age minus age of onset), and time since first psychiatric hospitalization were assessed by patients’ self-reports. Current psychopharmacological medication intake was assessed through a previously used composite score, the medication load index [62] (Supplement 1). PRS for bipolar disorder were calculated using genome-wide genotype data and the summary statistics of a recent GWAS of BD (more information in Supplement 2) [15]. Sociodemographic, clinical, and genetic information are provided in Table 1.

Table 1 Demographic and clinical characteristics of BD-I and BD-II patients and HC.

The study was approved by the Ethics Committees of the Medical Faculties, University of Marburg (AZ: 07/14) and University of Münster (2014-422-b-S), in accordance with the Declaration of Helsinki. All participants received financial compensation and provided written informed consent prior to participation.

Image acquisition and preprocessing

3 T whole body MRI scanners (Marburg: Tim Trio, Siemens, Erlangen, Germany; Münster: Prisma fit, Siemens, Erlangen, Germany) were used for acquisition of MRI data. All images underwent quality checks according to the quality check protocol of the MACS study [63]. Due to a body-coil change at the Marburg site, we controlled for three different scanner settings (Münster, Marburg body-coil pre, Marburg body-coil post) in all analyses using two dummy-coded variables with Münster as reference category as previously recommended [63].
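The dummy coding of the three scanner settings described above can be illustrated with a minimal sketch. The label strings and the function name `dummy_code` are hypothetical and chosen for illustration only; the study's actual variable names are not given in the text.

```python
# Hypothetical labels for the three scanner settings; Münster is the reference.
REFERENCE = "Muenster"
LEVELS = ("Marburg_pre", "Marburg_post")


def dummy_code(setting):
    """Return (marburg_pre, marburg_post) indicators for one participant.

    The reference category maps to (0, 0), so two columns are enough
    to represent three scanner settings.
    """
    if setting != REFERENCE and setting not in LEVELS:
        raise ValueError(f"unknown scanner setting: {setting!r}")
    return tuple(int(setting == level) for level in LEVELS)


# Toy example with four participants (not study data).
settings = ["Muenster", "Marburg_pre", "Marburg_post", "Muenster"]
design_columns = [dummy_code(s) for s in settings]
print(design_columns)
```

Using Münster as the all-zero reference means each dummy coefficient is read as the offset of the respective Marburg setting relative to Münster.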

White matter microstructure (DTI)

DTI data acquisition, preprocessing and quality assurance followed published protocols and have been described extensively elsewhere [22, 63]. Preprocessing and analyses were implemented in FSL 6.0.1 (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/) [64,65,66]. For details on DTI acquisition parameters, quality assurance of the data and preprocessing steps see Supplement 3. As the last step, a diffusion tensor model was fitted at each voxel using “DTIFIT” within FMRIB’s Diffusion Toolbox (FDT) [67], and FA, mean diffusivity (MD), radial diffusivity (RD) and axial diffusivity (AD) were calculated for each voxel per participant. FA is a measure of the directionality of water diffusion on a scale from 0 (indicating isotropic diffusion) to 1 (indicating completely anisotropic diffusion) [68]. See Supplement 3 for more information on MD, RD, and AD measures.
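For orientation, the four scalar metrics can be computed from the three eigenvalues of the fitted diffusion tensor. The sketch below uses the standard textbook definitions, which are an assumption here because the text defers the exact definitions to Supplement 3; the function name `dti_metrics` is hypothetical and this is not the DTIFIT implementation.

```python
import math


def dti_metrics(l1, l2, l3):
    """Scalar DTI metrics from tensor eigenvalues sorted as l1 >= l2 >= l3.

    Standard definitions (assumed, not taken from the study's supplement):
    MD = mean eigenvalue, AD = largest eigenvalue,
    RD = mean of the two smaller eigenvalues,
    FA = normalized dispersion of the eigenvalues, ranging from 0 to 1.
    """
    md = (l1 + l2 + l3) / 3.0
    ad = l1
    rd = (l2 + l3) / 2.0
    norm = math.sqrt(l1 ** 2 + l2 ** 2 + l3 ** 2)
    if norm == 0.0:
        fa = 0.0  # no diffusion at all: treat as isotropic
    else:
        spread = math.sqrt((l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2)
        fa = math.sqrt(0.5) * spread / norm
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


# Isotropic diffusion gives FA = 0; diffusion along a single axis gives FA = 1.
print(dti_metrics(1.0, 1.0, 1.0)["FA"])
print(dti_metrics(1.0, 0.0, 0.0)["FA"])
```

Under these definitions, a drop in FA together with a rise in RD and unchanged AD corresponds to more diffusion perpendicular to the main fiber direction.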

Gray matter volumes (GMV)

High-resolution T1-weighted structural images were collected using three-dimensional fast gradient echo sequences (MPRAGE). Details on acquisition of T1 data for GMV analyses are provided in Supplement 4 and have been described elsewhere [69]. Preprocessing of T1-weighted images was conducted using a default pipeline implemented in the CAT12-toolbox (v1720). Steps included bias-correction, tissue classification, realignment and spatial normalization to MNI space using the Geodesic Shooting algorithm. Data were smoothed with an 8 mm full width half maximum Gaussian kernel.
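The 8 mm smoothing kernel is specified by its full width at half maximum (FWHM), whereas Gaussian filters are often parameterized by their standard deviation. The following sketch shows the standard conversion; the function name `fwhm_to_sigma` and the 1.5 mm voxel size in the example are illustrative assumptions and are not stated in the text.

```python
import math


def fwhm_to_sigma(fwhm_mm, voxel_size_mm=1.0):
    """Convert a Gaussian FWHM (in mm) to a standard deviation.

    Uses sigma = FWHM / (2 * sqrt(2 * ln 2)). With voxel_size_mm = 1.0
    the result is in mm; otherwise it is expressed in voxel units.
    """
    sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return sigma_mm / voxel_size_mm


# 8 mm FWHM as used for the GMV maps.
print(round(fwhm_to_sigma(8.0), 3))       # sigma in mm
print(round(fwhm_to_sigma(8.0, 1.5), 3))  # sigma in voxels for an assumed 1.5 mm grid
```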

Statistical analyses

Descriptive and clinical variables between BD patients and HC were analyzed in IBM SPSS Statistics 27 (SPSS Inc., Chicago, IL, USA; Table 1).

White matter microstructure (DTI)

Analysis of DTI data was performed using tract-based spatial statistics (TBSS) [70], a technique designed to reduce registration misalignment (Supplement 3). Voxel-wise statistical analyses were performed using the nonparametric permutation testing implemented in FSL’s “randomise” [71] with 5000 permutations. Threshold-Free Cluster Enhancement (TFCE) was applied to obtain cluster-wise statistics corrected for multiple comparisons [72]. Significance was determined using the 95th percentile of the null distribution of the maximum TFCE scores across permutations of the input data, which allows estimated cluster sizes to be corrected for family-wise error (FWE) at p < 0.05. The significant effect mask was placed over the raw diffusion metric maps of each subject, and the values of the respective voxels were extracted and averaged using “fslstats” from FSL. The mean diffusion metric value for each subject across all voxels of the significant cluster was used for visualization in scatterplots.
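The logic of FWE correction via the permutation distribution of the maximum statistic can be sketched on toy data. This is a deliberately simplified illustration, not the randomise implementation: it uses a one-sided difference in group means instead of TFCE scores, ignores covariates, and all names (`mean_diff`, `max_stat_fwe`) and the simulated data are hypothetical.

```python
import random


def mean_diff(data, labels):
    """Per-voxel difference in means (group 0 minus group 1)."""
    n_vox = len(data[0])
    sums = {0: [0.0] * n_vox, 1: [0.0] * n_vox}
    counts = {0: 0, 1: 0}
    for row, lab in zip(data, labels):
        counts[lab] += 1
        for v, x in enumerate(row):
            sums[lab][v] += x
    return [sums[0][v] / counts[0] - sums[1][v] / counts[1] for v in range(n_vox)]


def max_stat_fwe(data, labels, n_perm=500, seed=0):
    """FWE-corrected p-values from the null distribution of the maximum statistic."""
    rng = random.Random(seed)
    observed = mean_diff(data, labels)
    shuffled = list(labels)
    max_null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels
        max_null.append(max(mean_diff(data, shuffled)))  # maximum across voxels
    # Share of permutation maxima at least as large as the observed value;
    # the +1 terms count the observed labelling as one permutation.
    p_fwe = [(1 + sum(m >= obs for m in max_null)) / (n_perm + 1) for obs in observed]
    return observed, p_fwe


# Simulated toy data: 20 subjects per group, 5 voxels, true effect only in voxel 0.
rng = random.Random(1)
n_per_group, n_vox = 20, 5
labels = [0] * n_per_group + [1] * n_per_group
data = [[rng.gauss(0.0, 1.0) for _ in range(n_vox)] for _ in labels]
for row, lab in zip(data, labels):
    if lab == 0:
        row[0] += 3.0

observed, p_fwe = max_stat_fwe(data, labels)
print([round(p, 3) for p in p_fwe])
```

A voxel with p_fwe < 0.05 exceeds the 95th percentile of the permutation maxima, which is what controls the error rate across all voxels simultaneously.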

The results for WM focus on FA. However, to support the interpretation of these results, the same registration steps and analyses were also performed for the other DTI metrics MD, RD, and AD (Supplement 5).

Gray matter volumes (GMV)

Statistical analyses of GMV were performed on the whole-brain level using Statistical Parametric Mapping (SPM12, Wellcome Department of Cognitive Neurology, London, UK, v7771) with an absolute threshold masking of 0.1. TFCE, as implemented in the TFCE-toolbox, with 5000 permutations per test was applied (http://dbm.neuro.uni-jena.de/tfce, Version r210). The significance threshold was set to p < 0.05 FWE corrected.

All DTI and GMV analyses included the covariates age, sex, total intracranial volume (TIV), site, and scanner settings (body-coil pre, body-coil post with Münster as reference category). The following analyses were conducted:

  1. To examine brain structural and microstructural differences between the diagnostic groups (HC, BD-I, BD-II), we first performed F-tests. Subsequently, post-hoc pairwise t-contrasts were calculated. Effect sizes were calculated based on the mean t-value of all significant voxels provided by FSL or SPM and respective sample sizes [73].

  2. To investigate putative effects of clinical characteristics, we performed additional analyses:

    a. In case of significant effects in the contrast BD-II > BD-I in the main analyses (step 1), we repeated the group comparisons by separately including clinical variables (e.g. number of depressive and (hypo-)manic episodes, psychiatric hospitalization, psychopharmacological medication and PRS for BD) potentially related to differences between BD subtypes as additional nuisance variables. An overview of all included clinical variables can be found in Table 1.

    b. To further determine which clinical characteristics influence brain structural alterations, associations between the clinical variables mentioned in a) and FA or GMV were calculated for the whole BD sample and per subtype (see Tables S5 and S6) using linear regression models. These analyses were calculated irrespective of significant group differences in step 1. Bonferroni correction for the ten regression analyses was applied, resulting in a significance threshold of p < 0.005.
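Two small calculations from the analysis steps above can be made explicit. The Bonferroni threshold follows directly from the text (0.05 divided by ten tests). The conversion from a t-value to Cohen's d is an assumption: the text only states that effect sizes were derived from the mean t-value and the sample sizes [73], so the common two-sample formula d = t * sqrt(1/n1 + 1/n2) is used here as a plausible sketch, and the t-value in the example is hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def cohens_d_from_t(t_value, n1, n2):
    """Cohen's d from an independent-samples t-value.

    Assumed formula, not confirmed by the text: d = t * sqrt(1/n1 + 1/n2).
    """
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)


# Ten regression analyses at alpha = 0.05 give the reported threshold of 0.005.
print(bonferroni_threshold(0.05, 10))

# Hypothetical mean t-value of 2.0 with the subtype group sizes from the text
# (BD-I n = 73, BD-II n = 63).
print(round(cohens_d_from_t(2.0, 73, 63), 2))
```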

Results

Brain structural differences between HC, BD-I and BD-II

White matter microstructure (DTI)

The F-contrast revealed a significant main effect of diagnosis in FA (ptfce-FWE < 0.001, total k = 7028 voxels in seven clusters, Fig. 1, Table S1). Pairwise post-hoc t-contrasts revealed significantly lower FA values in BD-I patients compared with HC in one large bilateral cluster (d = 0.25, ptfce-FWE < 0.001, k = 45712 voxels, Fig. 2A) as well as compared with BD-II patients (d = 0.36, ptfce-FWE = 0.006, k = 6418 voxels in seven clusters, Fig. 2B). Both effects were most probably located in the forceps minor of the corpus callosum (CC), with almost all other major fiber tracts also affected when BD-I and HC were compared (Table S2). BD-II patients also showed significantly lower FA values compared with HC in two small clusters in the left body of the CC (d = 0.56, ptfce-FWE = 0.049, k = 27 voxels in two clusters, Fig. 2C). There was also a significant main effect of diagnosis for RD and MD, reflected in RD by significantly higher values for BD-I compared with HC and BD-II, and in MD by significantly higher values only for BD-I compared with HC. No effects were found for AD (Supplement 5).

Fig. 1: Main effect of diagnosis on FA.

A Mean fractional anisotropy (FA) across healthy controls (HC), bipolar disorder type I (BD-I) and bipolar disorder type II (BD-II). The mean FA value was obtained from FA values of all the voxels that showed a significant main effect of diagnosis (ptfce-FWE < 0.05). Error bars represent 95% confidence intervals. p values were obtained from pairwise post-hoc t-contrasts. B Density estimation plots of FA values showing distributional overlap between HC, BD-I and BD-II.

Fig. 2: Differences in FA between HC, BD-I and BD-II.

Differences in fractional anisotropy (FA) between healthy controls (HC), bipolar disorder type I (BD-I), and bipolar disorder type II (BD-II). A Higher FA in HC compared with BD-I. B Higher FA in BD-II compared with BD-I. C Higher FA in HC compared with BD-II. D Differences in FA between the three groups HC, BD-I and BD-II (F-Test). Coordinates are given in MNI space. Highlighted areas represent voxels (using FSL’s ‘fill’ command for better visualization), where significant differences between groups (ptfce-FWE < 0.05) were detected.

Gray matter volumes (GMV)

In the whole-brain analysis no significant main effect of diagnosis was found (ptfce-FWE = 0.509). Exploratory pairwise comparisons between diagnoses on the whole-brain level, uncorrected at p < 0.001 with a cluster threshold of k = 50 voxels pointed towards putative GMV alterations in parietal, frontal and parahippocampal/fusiform regions in the expected pattern of BD-I < BD-II < HC (Table S3).

Additional analyses

Brain structural differences between BD-I and BD-II subtypes correcting for clinical variables and polygenic risk

White matter microstructure (DTI)

Analyses revealed consistently lower FA in BD-I vs. BD-II even when additionally correcting for number of depressive episodes (d = 0.36, ptfce-FWE = 0.004, k = 6832 voxels), number of (hypo-)manic episodes (d = 0.36, ptfce-FWE = 0.005, k = 8061 voxels), number of psychiatric hospitalizations (d = 0.31, ptfce-FWE = 0.003, k = 16683 voxels), time since first symptoms (d = 0.36, ptfce-FWE = 0.007, k = 6689 voxels), time since first psychiatric hospitalization (d = 0.32, ptfce-FWE = 0.007, k = 17880 voxels), and lifetime psychotic symptoms (d = 0.37, ptfce-FWE = 0.013, k = 5202 voxels). Similarly, the observed pattern of results did not change when correcting the models for PRS for BD (d = 0.38, ptfce-FWE = 0.020, k = 4742 voxels), childhood adversity (d = 0.32, ptfce-FWE = 0.005, k = 17248 voxels), body mass index (d = 0.53, ptfce-FWE = 0.038, k = 307 voxels) or medication load (d = 0.36, ptfce-FWE = 0.005, k = 6745 voxels; additional analyses for different types of medication are provided in Table S4). Affected tracts consistently included the forceps minor and major of the CC as well as the anterior thalamic radiation (Table S2). The significant increase in RD for BD-I compared with BD-II also proved stable in these additional analyses, except when PRS for BD, body mass index and psychotic symptoms were included as covariates (Supplement 5).

Associations with clinical variables and polygenic risk

White matter microstructure (DTI)

There was a significant positive association between the time since first psychiatric hospitalization and FA (ptfce-FWE = 0.021), which, however, did not survive Bonferroni correction (Table 2). RD, MD and AD showed a negative association with this variable, which only survived Bonferroni correction in the case of MD (Supplement 5).

Table 2 Associations between clinical variables and FA within all BD patients.

Gray matter volumes (GMV)

Regression analyses between clinical variables and GMV revealed no significant associations (all ptfce-FWE > 0.072, Table 3). The results of the additional regression analyses between clinical variables and brain structure in both subtype groups separately can be found in Table S5 (DTI metrics) and Table S6 (GMV).

Table 3 Associations between clinical variables and GMV within all BD patients.

Discussion

This study examined brain structural differences between patients with BD types I and II and HC, including white and gray matter, PRS for BD and disease course data, attempting to provide a more profound comparison of the subtypes. As the main result, our analyses revealed group differences with the expected pattern of BD-I < BD-II < HC regarding FA as a measure of WM integrity, whereas no group differences were found for GMV. Secondly, the group differences in WM microstructure were not significantly affected by prior disease course or PRS for BD and thus seem to be nearly independent of other clinical factors or underlying genetics. Finally, we found no associations between brain structure and these clinical or genetic parameters in either BD-I or BD-II.

Bringing some clarity to the heterogeneity of previous neuroimaging findings [20, 21, 32, 34, 35, 42], we demonstrated significantly lower FA, along with higher RD, in BD-I patients compared with BD-II. This finding was accompanied by a markedly different extent of impairment when compared with HC: whereas BD-I patients showed widespread alterations in WM affecting all major WM tracts (including the CC), BD-II patients differed from HC only in a small local cluster in the CC. The FA reduction in the CC in both subtypes compared with HC matches existing reports of reduced FA levels in this tract as one of the most robust effects in BD [19, 22, 74]. As a crucial connecting pathway between brain hemispheres, the CC plays a central role in interhemispheric integration. Meta-analyses [75,76,77,78] have shown reduced WM integrity in the CC across MDD, BD, and SZ, suggesting that disrupted interhemispheric WM connectivity is a common pathophysiological pathway in major mental disorders, related to deficits in various emotional and cognitive processes, such as executive functions, attention, working and visual memory [75, 77, 79,80,81]. Specifically, alterations in the forceps minor of the CC, composed of fibers extending laterally from the genu of the CC and connecting the cerebral hemispheres anteriorly, primarily indicate impaired interhemispheric communication within the prefrontal cortex, whose involvement in BD has been elaborated in several studies, e.g. [82,83,84].

The GMV analyses yielded no significant group effect, refuting our hypothesis. This aligns with findings from two large-cohort studies by the ENIGMA bipolar consortium [20, 21], indicating no subtype-specific (sub-)cortical GMV differences. In contrast, prior literature found subtype-specific abnormalities [33, 37, 40, 42, 44], but results varied, potentially due to factors such as sample characteristics (e.g. small sample sizes or current mood state) or methodological differences (e.g. voxel-wise comparisons vs. region-specific parcellation). Our study adopted a whole-brain approach to capture this heterogeneity comprehensively, which, given a more stringent statistical threshold, could explain the lack of subtype-specific effects. Accordingly, the uncorrected exploratory analyses (Table S3) suggest the anticipated pattern of BD-I < BD-II < HC in GMV in parietal, frontal, occipital and parahippocampal/fusiform areas. These regions partly overlap with regions identified in studies reporting subtype-specific differences, e.g. [37, 40]. Neurobiological disparities in GMV between BD-I and BD-II may be less distinct than previously thought, necessitating larger, more homogeneous samples for precise investigations and mapping of potential effects. Given this evidence, the brain structural alterations in BD and its subtypes appear to relate less to the volume of structurally independent areas than to impaired fiber connections between these brain regions.

We found significantly lower FA in BD-I compared with BD-II, accompanied by global FA alterations in BD-I compared with HC. The separate diagnosis of BD-II was introduced in DSM-IV [85] to acknowledge the disorder’s heterogeneity and to classify less severe symptoms of mania. The significant neurobiological differences between BD-I and BD-II initially support this categorical classification into subtypes [45]. However, the strong distributional overlap in FA reductions between both subtypes challenges a clear-cut classification based on the extent of WM alterations (Fig. 1B). From the latter argument, our data rather suggest a bipolar spectrum with varying severity [46, 47], mirrored by gradual transitions at the clinical level. Differences in only a few criteria, together with subjective assessments, determine which subtype a patient is diagnosed with [11, 45, 47]. The distinction between bipolar subtypes may be clearer in the presence of critical severity markers, e.g., psychotic symptoms or hospitalization [47], where subtypes differed in our sample (Table 1). However, our dimensional regression analyses revealed no significant linear associations between brain structure and psychotic symptoms, in contrast to previous findings [86,87,88]. Subtype differences were largely unaffected by various variables, although single effects lost significance when controlling for body mass index, PRS or psychotic symptomatology, which calls for a more detailed analysis of these variables in future studies. Nevertheless, these factors provide no obvious explanation for the subtype differences found and challenge the notion of neuroprogressive effects in BD [89,90,91,92]. Given this, we can only cautiously interpret our results of distinct subtypes as a neurobiological correlate of overall disease severity in BD-I versus BD-II.

Considering the overlap in genetic compositions and heritability of BD [15], we examined associations between brain structure and PRS for BD. While we observed a significant difference in PRS between subtypes (Table 1), it was not related to gray or white matter features. The PRS employed in this study [15], predominantly derived from BD-I patients, may contribute to the substantial difference observed between subtypes. There is an ongoing debate about the suitability of genome-wide PRS as a genetic tool for investigating psychiatric disorders [93]. While a genome-wide PRS enhances generalizability by capturing a broad range of common genetic factors for BD [55], its inclusion of alleles with potentially diverse associations with brain structure may mask effects of specific gene sets [93]. This debate is reflected in conflicting findings: on the one hand, some studies failed to establish significant associations between BD or SZ PRS and brain structure [35, 93]. On the other hand, a recent study in bipolar adolescents, benefiting from fewer confounding factors such as illness course or medication, demonstrated that higher bipolar PRS correlated with both GM structure and WM diffusion [55].

Overall, the inclusion of larger BD-II samples, deep phenotyping, and the investigation of more specific genetic associations may provide relevant insights into the genetic composition of subtypes and putative relationships to brain structural features [94, 95].

This study has several strengths, including a large, well-characterized sample of BD patients and comprehensive analyses combining clinical, genetic and neuroimaging data. Several limitations should be noted. First, the cross-sectional nature of our analyses entails interpretative difficulties and lower statistical power than longitudinal data. In particular, the question of whether neurobiological abnormalities are a result or a precursor of the bipolar subtype cannot be addressed. Second, the cross-sectional acquisition of information on previous disease course by self-reports is often biased by inaccuracies and memory deficits [96,97,98]. Therefore, the regression analyses including these clinical variables should be interpreted cautiously. Third, the composite score representing psychopharmacological treatment considered current medication only, without taking medication history into account, so that possible effects of previous medication use or duration of use cannot be excluded. Fourth, while TBSS offers the advantage of fully automated voxel-wise analysis of the whole brain, revealing changes beyond predefined pathways, the addition of newer tractography techniques may be beneficial for hypothesis-driven identification and detailed analysis of specific pathways [70, 99]. Fifth, we examined only a selection of dimensions of psychopathology relevant to BD subtypes, but the inclusion of other dimensions, such as affective lability [100, 101], emotion regulation deficits [102], or affective temperaments [103], could also provide important insights. Sixth, we decided to include site as a covariate to control for potential scanner effects between Marburg and Münster, as suggested previously [63]. However, this approach may over- or under-correct for the site effect and only accounts for linear mean differences [104]. Other approaches to data harmonization between different scanners (e.g. ComBat harmonization) could be considered as an alternative [105].

In summary, BD-I showed widespread alterations in WM connectivity, with no subtype-specific effects on GMV. The results suggest that BD-I phenomenology may be related to brain structural integrity impairment, which seems mainly independent from disease trajectories and genetics. Our findings may improve our understanding of the pathophysiology underlying the clinical and neurobiological spectrum of BD. Thus, microstructural alterations should be included when discussing the categorical or dimensional classification of bipolar subtypes. Similarly, clinicians may consider the degree of impairment in microstructural integrity when classifying BD patients within the spectrum of affective disorders and adjust treatment selection accordingly, although no potential cutoff regarding the extent of WM alterations has yet been established. Future studies should investigate other biological correlates of BD, for example functional connectivity or inflammatory processes, by more consistently taking subtypes into account.