Introduction

Affective disorders, comprising unipolar (UD) and bipolar (BD) disorders, share common depressive symptomatology, associated morphological brain abnormalities, and genetic liability [1]. The genetic correlation between BD and UD has been estimated at 0.65, with heritability rates of up to 49% for UD and 85% for BD [2, 3]. The most consistent morphological changes shared between UD and BD are reduced gray-matter volume in widespread cortical regions, most notably the prefrontal cortex (PFC) and hippocampus [4, 5]. Accordingly, many of the functions associated with the PFC and hippocampus are impaired in UD and BD, including memory, attention, regulation of emotions, and stress response [6,7,8].

Family studies have revealed that unaffected first-degree relatives of UD and BD patients may show a similar pattern of brain morphological changes, potentially marking risk for disorder onset [9]. Mothers with UD and their healthy daughters showed reduced gray-matter volume and cortical thickness in the dorsomedial PFC (dmPFC) compared to matched participants [10]. Peterson et al. [11] observed cortical thinning across the lateral surface of the right cerebral hemisphere in first- and second-degree offspring of individuals with UD compared to age-matched individuals with no personal or family risk of affective disorder. Reductions in hippocampal volume have also been reported in healthy dizygotic co-twins of patients with UD [12] and in first-degree relatives of patients with UD [13] compared to age- and sex-matched low-risk controls. In contrast to these findings, however, prefrontal cortical thickening [11], increased volume [14, 15], and larger hippocampi [14, 16] have also been reported in first-degree relatives of patients with UD and BD.

Given the discrepant findings in unaffected relatives, changes associated with risk should be differentiated from changes related to compensatory adaptation, possibly marking resilience, by investigating differences between unaffected, affected, and low-risk individuals. To address this, Wiggins et al. [17] suggested a model where neural changes shared between affected and unaffected relatives compared to low-risk individuals are markers of risk, whereas neural changes differentiating affected and unaffected relatives may reflect compensatory, protective, or resilience markers. Accordingly, lower cortical gray-matter volume (a result of, e.g., abnormal cortical development or accelerated neuronal loss) present in both UD and BD patients and their unaffected relatives may indicate a neural mechanism of increased risk by virtue of its effect on cognitive, social, and emotional processes [18]. Conversely, greater gray-matter volume (i.e., greater neural and/or vascular elaboration) may compensate for inherited vulnerabilities in such functions, providing resilience. The Wiggins et al. [17] model further suggests that neural changes in affected individuals relative to low-risk individuals and unaffected relatives may constitute illness-related sequelae.

The hippocampus includes several subfields with distinct function, morphology and differential vulnerability to disease [19]. Advances in the resolution and accuracy of automatic hippocampal segmentations have enabled recent MRI studies to associate affective disorders with morphological abnormalities in specific hippocampal subfields [20]. Based on this method, both UD and BD have been associated with reduced volumes of cornu ammonis 1 (CA1), and the granule cell layer of the dentate gyrus [21]. More extensive subfield volume reductions have also been reported in BD [22].

Hippocampal subfield morphology in unaffected relatives has not been systematically addressed. It therefore remains unclear whether abnormal hippocampal subfield and prefrontal morphologies in patients with affective disorders are scars of illness or whether they are also present in unaffected relatives, marking risk or resilience to disorder onset. To address these questions, we recruited a large sample of monozygotic twins through the unique Danish registers: affected twins (AT) who were diagnosed with UD or BD, unaffected co-twins (UT), and low-risk twins (LT). Monozygotic twins have the same genetic make-up leading to high heritability rates in affective disorders [2, 3]. Comparison of discordant monozygotic twin pairs, i.e., AT vs UT, therefore constitutes a unique methodological approach with high sensitivity to differentiate between morphological abnormalities in UT that mark risk or resilience. Epidemiological and genome-wide linkage studies in UD and BD show consistent overlap in genetic risk factors [23] and in hippocampus and PFC morphology changes [4, 5] in these disorders. We therefore included both UD and BD twins and their UT to investigate morphological abnormalities related to a continuum of affective disorders. Due to the high average discordant time in our monozygotic twin sample (12 years) and previous evidence of compensatory emotional processing in the current UT sample [24], we primarily expected to find morphological abnormalities that mark resilience, i.e., distinct morphology in UT compared to AT and LT, possibly characterized by larger volumes of specific hippocampal subfields and PFC.

Patients and methods

Participants and clinical assessment

The MRI study included 137 monozygotic twin participants: 67 AT, 39 UT, and 31 LT (Table 1). This is the subgroup that agreed to the MRI investigation and met the eligibility criteria out of a sample of 215 monozygotic twins initially recruited [25]. The twins were identified by linking the nationwide record of the Danish Twin Registry [26] and the Danish Psychiatric Central Register [27]. When uncertain, zygosity was confirmed using pairwise DNA tests. Eligibility criteria for AT were a personal history of UD or BD (ICD-10 codes F31-33). Among the BD patients, 13 were diagnosed with BD-I, 7 with BD-I with psychosis, and 5 with BD-II. UT had a co-twin with a history of UD or BD and no personal history of affective episodes. The LT had neither a personal nor a co-twin history of affective spectrum diagnosis from January 1995 to June 2014. All participants were between 18 and 50 years of age. The exclusion criteria were: birth weight under 1.3 kg, current severe somatic illness, history of brain injury, current substance abuse, current mood episode defined as scores >14 on either the 17-item Hamilton Depression Rating Scale [HDRS; [28]] or the Young Mania Rating Scale [YMRS; [29]], pregnancy, or dizygosity established by pairwise DNA tests. To ensure familial low-risk of major psychiatric disorders, UT were excluded if they reported other first-degree relatives with organic mental disorder, schizophrenia spectrum or affective disorders. History of psychiatric disorders in first-degree relatives was assessed using the Family History Method using diagnostic criteria [30].

Table 1 Demographic and clinical comparison of affected (UD or BD), unaffected co-twins, and low-risk twins (n = 137).

Participants underwent a clinical assessment prior to the MRI investigation. Life-time diagnoses of psychiatric illness were assessed using the Schedules for Clinical Assessment in Neuropsychiatry [SCAN; [31]]. Observer-based rating instruments included HDRS, YMRS, and the Major Depression Inventory [MDI; [32]]. Experience of childhood trauma related to physical, sexual or emotional abuse, and physical or emotional neglect, was screened using the self-reported retrospective Childhood Trauma Questionnaire [CTQ; [33]]. Cognitive impairment was assessed using the Screen for Cognitive Impairment in Psychiatry [SCIP; [34]]. The Edinburgh 10-item Inventory [35] was used to assess handedness. If only one twin from a twin pair was included, data from the Danish Central Research Register were used to determine familial risk status. Discordant status of twin pairs was defined as one twin with a life-time history of UD or BD and one twin without such a history, assessed retrospectively with the SCAN interview. Discordant time was calculated as the time period between onset of illness for the AT and the date of the interview for the UT. All assessors were blinded to participants’ group membership. All participants gave informed consent to the study, which was conducted according to the Helsinki declaration. The study was approved by the local ethics committee (H-3-2014-003) and the Danish data protection agency (2014-331-0751).

Statistical analyses of demographic and clinical data

Group differences in continuous demographic and clinical variables were assessed in the SPSS 25 statistical software (IBM, Armonk, New York, United States) using mixed-model analysis of variance with group (AT, UT, LT) as fixed effect and twin-pair index as random effect. Differences in dichotomous demographic and clinical variables (sex, medication status, handedness, and presence of other non-affective disorders) were assessed using Pearson’s chi-square tests. As post-hoc analyses following imaging findings, we also compared UD with BD patients in relation to demographic and clinical data. We report statistics of the group effect without correction for multiple comparisons to aid the interpretation of the imaging findings.

Structural data processing

The MRI acquisition protocols are presented in the Supplementary Materials and Methods. Cortical and subcortical reconstruction and volumetric segmentation were performed using the FreeSurfer image analysis suite v6.0.0 (http://surfer.nmr.mgh.harvard.edu/). For each subject, T1- and T2-weighted images corrected for B0 field geometric distortions were entered in the image analysis pipeline, which included correction for intensity inhomogeneity and motion, intensity normalization, and automatic segmentation of the cortical and subcortical gray-matter structures as documented in [36, 37]. An additional segmentation algorithm able to reliably delineate hippocampal subfields was executed [20]. The volumes of the whole hippocampus and its 11 constituent subfields were obtained: hippocampal tail, subiculum, presubiculum, parasubiculum, cornu ammonis subfields (CA1-4), molecular layer, dentate gyrus, fimbria and HATA. Since we had acquired two T1 structural images, we ran the processing pipeline for both of them and used the T2-weighted image for an improved pial surface reconstruction. We calculated the average and the absolute difference between the estimated hippocampal volumes from the two T1-weighted images. When the estimated volume difference was an outlier within its group according to the criterion of 2.2 times the interquartile range [38], i.e., when the two estimations of the same volume differed markedly, we did not use the average but the estimated value closest to the group mean (n = 3 for left and n = 4 for right hippocampus). Surface-based data created by the processing pipeline included representations of cortical surface area, volume, and thickness. The volumetric segmentations and cortical reconstructions were visually inspected for accuracy by two assessors, and smaller errors of the pial surfaces were corrected for five reconstructions.
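The rule for combining the two T1-based volume estimates can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the quartile method (`inclusive`), and the use of the group mean of the per-subject averages as the reference value are assumptions made for illustration. Only the 2.2 multiplier and the general logic come from the text.

```python
from statistics import mean, quantiles


def upper_fence(values, k=2.2):
    """Upper outlier fence: Q3 + k * IQR (k = 2.2 as stated in the text)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")  # quartile method is an assumption
    return q3 + k * (q3 - q1)


def consolidate(run1, run2, k=2.2):
    """Combine two volume estimates per subject within one group.

    Uses the average of the two runs, unless their absolute difference is an
    outlier within the group; then the estimate closest to the group mean is kept.
    """
    diffs = [abs(a - b) for a, b in zip(run1, run2)]
    averages = [(a + b) / 2 for a, b in zip(run1, run2)]
    fence = upper_fence(diffs, k)
    group_mean = mean(averages)  # assumed reference value
    combined = []
    for a, b, d, avg in zip(run1, run2, diffs, averages):
        if d > fence:
            combined.append(min((a, b), key=lambda v: abs(v - group_mean)))
        else:
            combined.append(avg)
    return combined


# Hypothetical volumes in mm^3 for six subjects of one group; the last pair disagrees strongly.
run1 = [3500, 3600, 3550, 3480, 3620, 3000]
run2 = [3510, 3580, 3565, 3485, 3590, 3560]
volumes = consolidate(run1, run2)
```

In this hypothetical example only the last subject exceeds the fence, so one of the two raw estimates is kept for that subject and the average is used for everyone else.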

Statistical analysis of the MRI data

Hippocampus volume analysis

Group differences in hippocampal volumes were assessed in SPSS 25 using a mixed-model analysis of variance with group, age, sex, and estimated total intracranial volume (TIV) as fixed effects, and the twin-pair identification number as random effect. We first assessed volume differences for the left and right hippocampi using a statistical threshold of p < 0.05 after Bonferroni correction for the family-wise error (FWE) rate. Significant differences in whole hippocampal volumes between two groups were followed by pairwise comparisons across all hippocampal subfields (22 comparisons of the left and right subfields). The ensuing p-values were corrected for multiple comparisons using the false discovery rate (FDR) method across all 22 subfield tests. To aid interpretation of the group findings, we also performed pairwise subfield comparisons between the groups that did not differ significantly in whole hippocampal volumes.
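The two corrections mentioned above can be sketched in a few lines of Python. The text does not name the specific FDR procedure, so the Benjamini-Hochberg step-up method is assumed here. The function names and the example p-values are hypothetical, and the size of the Bonferroni family is left to the caller.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: each p multiplied by the family size, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure).

    The adjusted value for the p-value of rank r (ascending) is the minimum of
    p_(j) * m / j over all ranks j >= r, which keeps the adjustment monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical uncorrected p-values for four subfield tests (the study used 22).
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
fwe = bonferroni(raw)
```

A subfield would then be reported as significant when its adjusted value falls below 0.05.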

Prefrontal cortical surface analysis

To explore abnormalities in the prefrontal cortical morphology in UT and AT, a surface-based analysis was performed within the PFC. For this, left and right hemisphere PFC masks were constructed by combining the superior frontal, rostral and caudal middle frontal, pars opercularis, pars triangularis, pars orbitalis, lateral and medial orbitofrontal, precentral, and frontal pole regions from the automated cortical parcellation according to the Desikan–Killiany Atlas [39]. To test for group differences in prefrontal volumes, we set up GLM models for the left and right hemispheres using group as factor and age, sex, and TIV as covariates. We further added a regressor for each twin pair to account for twin variance correlations. The significance level was set at p < 0.05 after correcting for multiple comparisons within the constructed PFC search volumes. We also performed a whole-brain exploration for group differences in cortical gray-matter volume outside the PFC. Since cortical volume is estimated as cortical thickness × surface area, significant regional volume differences were followed by post-hoc tests to establish which of the two cortical measures had the largest contribution to the observed difference in volume.
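As an illustration of how such a search mask can be assembled, the sketch below marks every vertex whose parcellation label belongs to one of the ten listed regions. The label strings follow the naming commonly used in FreeSurfer's Desikan-Killiany (`aparc`) parcellation, and the function name and input format (one label name per vertex) are assumptions for illustration. The authors' actual mask construction is not described at this level of detail.

```python
# Region names as listed in the text, written in the assumed aparc label style.
PFC_LABELS = {
    "superiorfrontal",
    "rostralmiddlefrontal",
    "caudalmiddlefrontal",
    "parsopercularis",
    "parstriangularis",
    "parsorbitalis",
    "lateralorbitofrontal",
    "medialorbitofrontal",
    "precentral",
    "frontalpole",
}


def pfc_mask(vertex_labels):
    """Return a boolean mask that is True for vertices inside the PFC search region."""
    return [label in PFC_LABELS for label in vertex_labels]


# Hypothetical per-vertex labels for a handful of vertices in one hemisphere.
labels = ["superiorfrontal", "precuneus", "frontalpole", "insula", "precentral"]
mask = pfc_mask(labels)
```

Restricting the multiple-comparison correction to vertices where the mask is true corresponds to the "PFC search volume" described above.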

Planned follow-up analyses

Significant group differences in gray-matter volume were followed up by post-hoc analyses (see Table S1 for overview). (I) We tested whether group differences were dependent on mood symptoms (HDRS) and childhood trauma (total CTQ) by adding the respective scores as covariates to the original statistical model. (II) We explored possible interactions between group and sex, and group and diagnosis, by adding the interaction terms in different models. (III) The effect of medication (currently medicated vs. non-medicated) was tested in the AT. (IV) Using partial correlation tests, we assessed the association between gray-matter volumes and (a) duration of illness and age of illness onset in AT, (b) discordant time in UT, and (c) childhood trauma and long-term memory performance according to the Verbal Learning Test-Delayed (VLT-D) component of SCIP across the entire cohort. (V) We tested whether the absolute within twin-pair difference in hippocampal volumes was different between groups, or correlated with discordant time in UT. All statistical models were adjusted for age, sex, and TIV.
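A partial correlation of the kind used in step (IV) can be sketched with the standard recursive formula, which removes one covariate at a time. The sketch below is illustrative only: the function names and the toy data are hypothetical, and it returns the coefficient without a p-value.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y after adjusting for a list of covariates.

    Recursion: r_xy.Zz = (r_xy.Z - r_xz.Z * r_yz.Z) / sqrt((1 - r_xz.Z^2) * (1 - r_yz.Z^2)).
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y are related only through the shared covariate z.
z = [-1, 1, -1, 1, -1, 1, -1, 1]
e1 = [1, 1, -1, -1, 1, 1, -1, -1]
e2 = [1, 1, 1, 1, -1, -1, -1, -1]
x = [2 * a + b for a, b in zip(z, e1)]
y = [3 * a + b for a, b in zip(z, e2)]
raw_r = pearson(x, y)
adjusted_r = partial_corr(x, y, [z])
```

In the toy data the raw correlation is high while the adjusted one vanishes, which is the situation the adjustment for age, sex, and TIV is meant to guard against.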

Results

Demographic and clinical evaluations

The analysis included 67 AT, 39 UT, and 31 LT. The groups were comparable in demographic variables related to age, sex distribution, years of education, and handedness (Table 1). The AT had significantly higher HDRS and MDI scores than the LT and UT and showed a higher frequency of other non-affective disorders compared to the LT. Childhood trauma experiences were also more severe in both AT and UT than in LT. Post-hoc tests revealed a longer duration of illness and a higher number of currently medicated individuals in the BD group compared to UD (Table S2). No significant differences in SCIP cognitive scores were observed across the groups.

Hippocampus analysis

Group differences in whole hippocampal volumes

Whole hippocampus analysis revealed a significant group effect for both left (F(2,77) = 6.912, p = 0.002) and right (F(2,74) = 3.320, p = 0.042) hippocampus. The group effects were primarily driven by larger hippocampal volumes in UT compared to AT (left pFWE = 0.002; right pFWE = 0.043; Fig. 1, top). Post-hoc tests showed that these differences persisted after adjustment for mood symptoms (left p = 0.003; right p = 0.040) and childhood trauma experiences (left p = 0.005; right p = 0.033). There were no significant differences in hippocampal volumes between UT and LT, but there was a trend for a larger left hippocampus in UT compared to LT (left p = 0.053; right p = 0.217). Unexpectedly, there were no significant differences when comparing hippocampal volumes of AT and LT.

Fig. 1: Hippocampal subfield volumes in low-risk, unaffected, and affected monozygotic twins.

Top panels—box-and-whisker plot of whole hippocampal volumes in the three groups with values adjusted for age, sex and total intracranial volume. Mean group differences with 95% CI of the pairwise group tests are presented below the box plots. Bottom panels—forest plot showing mean difference between unaffected and affected twins with 95% CI. Right column shows FDR corrected (n = 22) p-values.

Planned follow-up analysis

There was a significant group-by-sex interaction effect with females showing smaller hippocampi compared to males in LT but not in the UT and AT (left p = 0.004; right p = 0.016). We further found a significant group-by-diagnosis effect with BD patients showing smaller hippocampi than UD patients, an effect not seen in their UT (left p = 0.001; right p = 0.003). This effect was, however, not significant when accounting for differences in illness duration and medication between the UD and BD subgroups.

In AT, there was a significant effect of current medication status with lower hippocampal volume in medicated vs. non-medicated patients (left p = 0.014; right p = 0.042). We further found hippocampal volumes in AT to show a negative correlation with duration of illness (left r = −0.360, p = 0.004; right r = −0.403, p = 0.001) and a positive correlation with age of illness onset (left r = 0.360, p = 0.004; right r = 0.403, p = 0.001). In UT, whole hippocampal volumes did not correlate significantly with the twins’ discordant time, but there was a negative trend for the left hippocampus (left r = −0.330, p = 0.056; right r = −0.155, p = 0.383). Across all participants we found a positive correlation between hippocampal volumes and long-term memory performance (left r = 0.197, p = 0.025; right r = 0.216, p = 0.014; five outlying SCIP VLT-D values excluded), and no significant correlation with childhood trauma experience (left r = 0.019, p = 0.835; right r = −0.023, p = 0.804). For the discordant twin pairs, the within-pair volume differences did not correlate significantly with their discordant time (left r = 0.273, p = 0.244; right r = 0.273, p = 0.244). The within twin-pair volume difference was comparable between groups (group effect left p = 0.648; right p = 0.822).

Subfield volumes

Subfield comparison between UT and AT revealed that the higher hippocampal volumes in the UT were primarily driven by larger left CA1-4, subiculum, and dentate gyrus subfields and right hippocampal tail and subiculum subfields (Fig. 1, bottom). Exploratory group comparisons revealed larger left and right presubiculum in the UT compared to LT (Fig. S1).

Prefrontal cortical surface analysis

Group differences in prefrontal cortical surface

The AT showed lower regional cortical volume in the left dmPFC compared to LT (BA10; MNI x = −10, y = 60, z = 10; corrected p = 0.021; Fig. 2), a difference that persisted after adjustment for mood symptoms and childhood trauma experiences. The dmPFC thickness and surface area measurements underlying the volumetric estimation were not significantly lower, but the surface area showed a trend towards reduction (thickness p = 0.155, area p = 0.051). The UT showed no significant morphological differences compared to LT or AT. Outside the PFC, no other cortical regions showed significant group differences in gray-matter volume.

Fig. 2: Left dorsomedial prefrontal cortex (dmPFC) volumes in low-risk, unaffected, and affected monozygotic twins.

Left panel: cortical volume estimates showing significant effects between affected and low-risk twins. Right panel: statistical map showing the significant cluster from the low-risk > affected comparison.

Planned follow-up analysis

There was a significant group-by-sex interaction effect (p = 0.049) with females showing smaller dmPFC compared to males across all participants, with a larger difference observed in UT compared to the other groups. There was no significant group-by-diagnosis interaction effect. In AT, medicated patients showed larger dmPFC than unmedicated patients (p = 0.015). There was no association between dmPFC volume and duration of illness or age of illness onset in AT, or childhood trauma across all participants.

Discussion

The current monozygotic twin study implemented a novel hippocampal subfield segmentation approach and surface-based PFC analysis to investigate morphological differences related to risk and resilience in UT with an index co-twin diagnosed with UD or BD. We found larger hippocampal volumes bilaterally in UT compared to their remitted AT. This effect was attributed to significantly greater sub-regional volume in left CA1-4, subiculum, and dentate gyrus subfields and right hippocampal tail and subiculum subfields. The AT displayed no differences in hippocampal volumes but showed smaller left dmPFC volume compared to LT.

The UT displayed larger hippocampi than the AT and a trend towards larger left hippocampus relative to LT. Since AT and LT had similar hippocampal volumes, this difference in statistical significance is likely due to the higher sensitivity of the UT vs AT comparison that accounted for within discordant twin-pair variance correlations. Yet, exploratory subfield analysis revealed a larger presubiculum volume in the UT compared to LT. Voxel-based morphometry studies have revealed regional increases in hippocampal volume in first-degree relatives of patients with UD [14] and offspring of patients with BD [16] vs low-risk controls. No previous study in unaffected relatives has systematically investigated hippocampal subfields using the methodological approach implemented here. However, corroborating our findings, the clusters showing increased bilateral volumes reported by [14] were localized to CA, a region involved in spatial navigation and memory formation and retrieval [40]. Hippocampal volume reductions have also been reported in preadolescent daughters [41] and dizygotic twins [12] of patients with UD compared to low-risk controls, and interpreted as a risk marker of disorder onset. The inconsistency in the reported hippocampal volumes in unaffected relatives may, however, be reconciled when differentiating between markers of risk (changes shared between patients and their relatives) and markers of resilience (changes unique to unaffected relatives) [17]. This differentiation is further corroborated by accounting for demographic, clinical, and family relation data. According to this model, since hippocampal volumes are typically found to be lower in affective disorders, similar reductions in unaffected relatives may mark increased risk, especially in young relatives, whereas larger hippocampal volumes in unaffected individuals compared to their affected relatives suggest a compensatory adaptation to familial risk indicating resilience. Such changes may buffer against adverse experiences to a certain extent.

In the current study we found differences in hippocampus subfield morphology when contrasting UT with AT who had a long discordant time with an average duration of 12 years. Hippocampal volume reduction was most pronounced in AT who had a longer illness duration. Across all participants, hippocampal volumes were positively correlated with verbal memory performance. Taken together, these findings suggest that the greater hippocampal volume in UT represents a marker of resilience to affective disorders rather than a marker of risk [17, 42]. The resilience hypothesis is consistent with our previous observation made in the same twin cohort showing better task-oriented coping strategies in UT than AT [43]. From a mechanistic perspective, such adaptation may involve enhanced hippocampal recruitment leading to neurogenesis [44] or increased dendritic spine density [45] in specific hippocampal subfields involved in mnemonic function and regulation of stress response. For instance, the subiculum showed larger volumes bilaterally in the UT compared to the AT. This subfield plays a major regulatory role in inhibition of the hypothalamic–pituitary–adrenal (HPA) axis [46], which is commonly found to be over-active in depressed patients [47]. Further supporting the resilience hypothesis in UT, patients with UD and BD show volumetric reductions in the same subfields where we found larger volumes in UT, namely the CA1, dentate gyrus, presubiculum and subiculum in UD [21], and CA1-4 and hippocampal tail in BD patients [21, 22]. The hippocampal size in the UT and the within-pair difference between discordant co-twins did not correlate with the discordant time. This suggests hippocampal abnormalities may develop early in life, stabilizing over time. In line with this hypothesis, the within-pair hippocampal volume difference was similar regardless of whether the co-twins were both healthy, both affected, or discordant.

We found no differences in hippocampal volumes between AT and LT. In contrast, hippocampal volume reductions have repeatedly been demonstrated in both UD [48] and BD [5, 21]. However, an absence of hippocampal abnormalities is also commonly reported, especially in young adults or patients in the early course of disorder [49, 50]. In support of our findings, a meta-analysis of 32 MRI studies in UD patients reported smaller hippocampal volumes only among patients with a duration of illness longer than two years or who had more than one affective episode [49], suggesting a progressive reduction of hippocampal volumes correlating with morbidity in affective disorders. In AT we found hippocampal volumes to show a negative correlation with duration of illness and a positive correlation with age of illness onset, supporting the suggested negative impact of recurrent depressive episodes and early age of onset on hippocampal volumes [49, 51]. BD patients showed lower hippocampal volumes compared to UD patients, a difference that was accounted for by longer illness duration and more frequent medication in BD compared to UD. In comparison, data from a recent meta-analysis showed more substantial cortical gray-matter reductions in UD compared to BD in widespread cortical areas including the left hippocampus [4].

The exploratory PFC surface analysis identified a regional decrease in gray-matter volume in the left dmPFC in AT relative to LT. A smaller surface area had the largest contribution to this effect, which was similar in UD and BD patients, a finding corroborated by the common pattern of gray-matter volume alterations found in UD and BD [4]. According to the model suggested by Wiggins et al. [17], this cortical abnormality in AT may represent an illness-related sequela, an interpretation we suggest cautiously since duration of illness or age of illness onset did not correlate with dmPFC volume. Our finding of lower prefrontal gray-matter volume in AT is consistent with theories of mood dysregulation in affective disorders suggesting that affective instability may result from reduced involvement of regions implicated in the regulation of emotional reactivity such as the dmPFC [8]. No prefrontal markers of increased risk were observed in UT.

Strengths of the study include recruitment based on nation-wide registers resulting in a large sample of monozygotic twins that was investigated by blinded assessors. Integration of structural information from three MRI sequences in each individual (two T1 and a T2 contrast) yielded reliable hippocampal subfield segmentations, which were validated by comparing the subfield volumes generated from two runs of the segmentation algorithm, each with a different input T1 image of the same individual.

As a limitation, the study employed a cross-sectional design, which precludes causal inference about the relation between abnormalities in hippocampal morphology in UT and resilience. Our twin sample included a small number of young individuals still undergoing cortical development. Since it is not known by what age cortical resilience markers are developed, it is possible that some of the unaffected twins did not exhibit such cortical changes and may still be considered at increased risk.

In conclusion, this large MRI study in monozygotic twins identifies specific regional structural changes in unaffected discordant twins compared to their co-twins in remission for UD or BD. The distinctive hippocampal morphology in unaffected twins may implicate an enhanced recruitment of specific hippocampal subfields involved in memory and regulation of stress response that could promote resilience to affective disorders. This suggestion warrants prospective follow-up of the current sample. In addition, lower dmPFC volume in AT emerged as a sequela of illness.

Funding and disclosure

The current study was supported by the Lundbeck Foundation (Grant R108-A10015) and DIS Copenhagen. HRS holds a 5-year professorship in precision medicine at the Faculty of Health Sciences and Medicine, University of Copenhagen, which is sponsored by the Lundbeck Foundation (Grant No. R186-2015-2138). KWM holds a 5-year Lundbeck Foundation Fellowship (Grant No. R215-2015-4121). The authors declare no competing financial interest in relation to this work. During the last three years the following authors report having received honoraria with no association to the current study: LVK as consultant for Lundbeck; HRS as speaker from Sanofi Genzyme, Denmark and Novartis, Denmark, as consultant from Sanofi Genzyme, Denmark, as senior editor (Neuroimage) and editor-in-chief (NeuroImage-Clinical) from Elsevier Publishers, Amsterdam, The Netherlands, and received royalties as book editor from Springer Publishers, Stuttgart, Germany; KWM received consultancy fees from Allergan and Janssen; MV received consultancy fees from Lundbeck A/S, Sunovion, and Janssen-Cilag.