Bipolar disorder (BD), characterised by alternating episodes of mania and depression, has a lifetime prevalence of ~1% and is a substantial contributor to disability throughout the world [1]. Nevertheless, reliable data concerning the aetiology of BD remain scarce. The heritability of BD is estimated to be above 70% [2,3,4], thus demonstrating an important genetic component in the development of the disorder. Genome-wide association studies (GWAS) in case/control samples have reported that single-nucleotide polymorphisms (SNP) with minor allele frequencies (MAF) of ≥ 1% explain a substantial proportion of the genetic risk for BD [5,6,7,8,9,10,11,12]: the heritability explained by such common variants (i.e., the SNP heritability) is estimated to be 0.17–0.23 on a liability scale [12]. Common variants also make a substantial contribution to the development of schizophrenia (SCZ) and major depressive disorder (MDD) [13, 14]. These three psychiatric disorders have a shared genetic component, whereby relatives of patients with BD have, in addition to BD, an increased risk for MDD and SCZ [15]. In fact, GWAS have shown that many genetic risk variants are associated with all three disorders [16,17,18,19,20,21].

Besides common variants with small individual effects, rare variants with larger effects may also contribute to BD development [22, 23]. In theory, such rare variants should be enriched in families with a high prevalence of illness, termed multiplex families, in comparison to unrelated BD cases. However, it remains unclear whether and to what extent disease incidence in multiplex families is caused by rare variants, a high load of common variants, or a combination of both.

To elucidate the molecular genetic causes of BD, we established the Andalusian Bipolar Family (ABiF) study in 1997, which recruited BD multiplex families [24,25,26]. In the present analyses, we first investigated whether common genetic variants make a significant contribution to the occurrence of BD in ABiF families. Next, we examined whether BD development was attributable to (a) BD-specific risk variants, (b) variants conferring risk to all three disorders BD, MDD, and SCZ, or, (c) a combination of both. To this end, polygenic risk scores (PRS) based on GWAS of BD, MDD, and SCZ were calculated for and compared between ABiF family members and unrelated BD cases and unrelated controls from the same population. Because of the strong genetic correlation between BD, SCZ, and MDD, standard PRS for BD cannot distinguish between BD-specific risks and factors shared between these disorders. To differentiate between genetic risk shared across and specific to any of the three disorders, we calculated PRS of disorder-specific risk variants using genome-wide inferred statistics (GWIS) and PRS of shared risk variants. To evaluate the possibility that population or technical differences between cohorts biased the results on psychiatric PRS, PRS for late-onset Alzheimer’s disease (LOAD) and simulated PRS were analysed as negative controls. Assuming a polygenic model with a contribution of common risk variants, we expected increased psychiatric PRS in the ABiF family members compared to unrelated samples and increased psychiatric PRS in patients compared to controls.

Materials and methods

Sample description

The ABiF study recruited BD multiplex families in Andalusia, Spain [24,25,26]. The present analyses included 395 members of 33 ABiF families. Diagnoses were assigned by two trained clinicians according to the Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV criteria using the best estimate approach [24]. Diagnoses comprised (Table 1 and Supplementary Table S1): BD, n = 166 (families (FAM)BD; BD type I (BD-I): n = 115; BD type II (BD-II): n = 41; not otherwise specified (NOS) BD: n = 10); MDD, n = 78 MDD (FAMMDD); no history of an affective disorder, n = 151 (FAMunaffected). Six unaffected individuals with a history of substance abuse were excluded from the analyses. Forty-four subjects married into the families and had no parent in the ABiF cohort (36 unaffected; 8 MDD). Furthermore, an independent, previously reported Spanish BD case/control (CC) sample was analysed. Here, BD cases (CCBD) were recruited from consecutive clinical admissions and BD was diagnosed, as in the ABiF families, using DSM-IV [9]; unrelated control individuals (CCcontrols) were recruited in the framework of the longitudinal European Community Respiratory Health Survey (ECRHS) study. Blood for genotyping was acquired at the ECRHS2 assessment in 2000–2001. After quality control (QC), the combined data set of both cohorts comprises data from 384 FAM (163 FAMBD, 73 FAMMDD, 142 FAMunaffected, and 6 FAMunaffected with a history of substance abuse) and 438 CC subjects (161 unrelated BD cases; BD-I: n = 156; BD-II: n = 5) and 277 unrelated controls. Of the 161 CCBD cases, 59 (36.6%) reported a family history of BD. However, in contrast to the data collection in the ABiF families, this information relied only on the self-report by the respective CCBD patient, and was not validated via an interview of further family members. BD diagnoses were not available for the unrelated controls, but the self-reported prevalence of current depression in this cohort was 3.3% at the time of genotyping and the self-reported prevalence of lifetime depression was 14.4% at the follow-up 10 years after genotyping, indicating that the cohort is fairly representative of a typical population in regard to the prevalence of depression [27].

Table 1 Characteristics of 389 individuals from the 33 ABiF families, the 161 unrelated bipolar cases (CCBD), and the 277 unrelated controls (CCcontrols)

Note that, while all subjects passed QC in the family-only sample, 11 family members were excluded during QC of the joint sample because they showed significant differences in autosomal heterozygosity from the mean. Reported numbers of subjects thus differ slightly for different comparisons. The joint data set contained 35 unaffected, married-in family members who were excluded from analyses using the combined sample (unless specified otherwise). A detailed description of QC procedures is provided in the Supplementary Methods.

The study was approved by the respective local ethics committees (Comités de ética de la investigación provincial de Cádiz, Córdoba, Granada, Jaén and Málaga), and all participants provided written informed consent. For five adolescents (age 15–17 years), written informed consent was also obtained from the parents.

Genotyping and imputation

Genome-wide genotyping of the FAM sample was carried out using the Illumina Infinium PsychArray BeadChip (PsychChip). QC and population substructure analyses were performed in PLINK v1.9 [28], as described in the Supplementary Methods. Genotyping and basic QC of the CC sample were conducted previously and are described elsewhere [9]. The study used two genotype data sets: Analyses of family members by themselves used variants genotyped on the PsychChip. For analyses on the combined FAM + CC sample, the genotype data of the CC data set were, for the variants genotyped in both samples, merged with the genotype data of the FAM sample. Both genotype data sets (family-only and combined) were imputed independently to the 1000 Genomes phase 3 reference panel using SHAPEIT and IMPUTE2 [29,30,31]. After imputation and post-imputation QC, the combined data set of both cohorts contained 6,862,461 variants with an INFO metric of ≥ 0.8 and a MAF of ≥ 1%. The imputed FAM data set without the CC subjects contained 8,628,089 variants.

Calculation of polygenic risk scores

PRS were calculated in R v3.3 [32] using imputed genetic data. For each PRS, the effect sizes of variants below a selected p-value threshold, both obtained from large GWAS (training data), were multiplied by the imputed SNP dosage in the test data and then summed to produce a single PRS per threshold. Test statistics and alleles in the GWAS training data were flipped so that effect sizes were always positive. Thus, the PRS represent cumulative, additive risk. PRS were scaled to represent the relative risk load (minimum possible cumulative risk load = 0, maximum = 1). For each disorder, ten PRS based on different GWAS p-value thresholds (<5 × 10−8, <1 × 10−7, <1 × 10−6, <1 × 10−5, <1 × 10−4, <0.001, <0.01, <0.05, <0.1, <0.2) were calculated. The number of SNPs used for each PRS is shown in Supplementary Table S2. For additional details, see the Supplementary Methods.

For BD, MDD, and SCZ diagnoses, summary statistics of GWAS by the Psychiatric Genomics Consortium (PGC) were used as training data. For BD, the data freeze contained 20,352 cases and 31,358 controls [12]. As selected index patients from the ABiF families and the unrelated Spanish BD case/control data set were part of this BD GWAS, we recalculated summary statistics for this PGC GWAS without these Spanish samples, to prevent false-positive results caused by sample overlap between training and test samples. For MDD and SCZ, published data sets were used. These contained 130,664 cases and 330,470 controls for MDD [14] and 33,640 cases and 43,456 controls for SCZ [13]. There was no overlap between the subjects included in those GWAS and the ABiF and Spanish case/control samples. Variants with an INFO metric of < 0.6 in the GWAS summary statistics were removed.

Shared psychiatric PRS were generated using all variants showing an association at p < 0.05 in the GWAS of BD, SCZ, and MDD and for which effect sizes pointed in the same direction across studies. For this shared set of variants, p-values and effect sizes, used as weights in the PRS, were obtained using random-effects meta-analysis. PRS were then calculated using the meta-analysis summary statistics. We generated disorder-specific summary statistics to assess genetic risk unique to each disorder. To this end, genome-wide inferred statistics (GWIS) were calculated as explained in detail elsewhere [33]. For example, we calculated BD GWAS summary statistics corrected for the MDD GWAS results (BD-MDD). These BD-MDD GWIS results are similar to results obtained from a conditional analysis for BD corrected for MDD. They represent a genetic unique BD liability, which is estimated based on the heritability of BD and the coheritability of BD and MDD, both estimated using LD score regression [34]. As recommended for this method, variants with an INFO metric of <0.9 or >1.1 were removed. Disorder-specific PRS, e.g., BD-MDD PRS, were then calculated based on the corresponding GWIS summary statistics.

To confirm whether family members and BD cases had an increased PRS specifically for the tested psychiatric disorders but not because of population or technical differences between cohorts, PRS for late-onset Alzheimer’s disease (LOAD) were calculated as a negative control, based on a GWAS by the International Genomics of Alzheimer’s Project (IGAP) with 17,008 cases and 37,154 controls [35]. For additional details, see the Supplementary Methods. Furthermore, 10,000 simulated PRS for each of the ten p-value thresholds were calculated as negative controls. To this end, random variants from across the genome were drawn, using the same number of variants as for the BD PRS at each threshold and random effect sizes from the pool of all available BD, SCZ, and MDD effects. The code for simulating PRS is available at:

Statistical analysis

PRS analyses on binary variables (e.g., diagnoses and comparisons between cohorts) were conducted in R with the function glmm.wald of the package GMMAT, using a logistic mixed model, fitted by maximum likelihood using Nelder–Mead optimisation [36] to account for family structure. For logistic models, PRS underwent Z-score standardisation to generate comparable odds ratios (OR). Family structure was modelled as a random effect, with a genetic relationship matrix calculated on pruned genotype data in GEMMA [37].

Linear mixed models (LMMs) taking family structure into account were calculated using the function polygenic of the package GenABEL [38] for analyses of quantitative variables (anticipation and age at onset). In these analyses, test statistics, including 95% confidence intervals (CI), were calculated using bootstrapping (package boot [39, 40]) and p-values were validated using permutation analysis (10,000 permutations). In these permutation analyses, the null distribution of test statistics was empirically determined by repeating regression analyses 10,000 times with random sampling of phenotype data. To calculate a p-value, the number of tests were counted where a model with a random genotype-phenotype association showed the same or a more extreme p-value than the correct, non-randomised model and this number was divided by the total number of tests (10,000).

For each analysis of PRS, all ten PRS p-value thresholds were analysed. In analyses of the combined FAM and CC data set, sex was used as a covariate. In the analysis of FAM data alone, sex and age at the time of the interview were used as fixed effects covariates; whether an individual had married into the family was incorporated as a second random effect. Following the hypothesis that family members or subjects with a psychiatric diagnosis have increased PRS for psychiatric disorders, one-sided p-values were calculated for all PRS-based analyses. In all analyses, p-values below the significance threshold α = 0.05 were considered as nominally significant. Unless otherwise stated, this threshold was corrected for 10 × 6 = 60 tests using the Bonferroni method (α = 0.05/60 = 8.33 × 10−4). For further details, see the Supplementary Methods.

To determine whether population or technical differences might have influenced the observed effects independently of diagnosis groups, simulated PRS, generated as described above, were analysed. For each model, association statistics of the 10,000 simulated PRS were calculated for the ten p-value thresholds; the disorder PRS at the threshold showing the lowest mean association p-value was analysed further: The number of simulated PRS at this threshold that showed the same or a stronger association was counted and compared to the association of the disorder PRS. This count was used as the number of successes in a binomial test to estimate the probability of success. For computational efficiency, models were fitted using restricted maximum likelihood estimation and the average information optimisation algorithm for this analysis.


Figures 1 and 2 show the test statistics for the PRS with the training GWAS p-value threshold pPRS that showed the strongest association per PRS type. Full results for all ten pPRS per PRS type calculated using logistic mixed models are provided in Supplementary Figs. S1S12 and in Supplementary Tables S3–S11.

Fig. 1
figure 1

Comparison of PRS between FAM and CC samples. Married-in family members were excluded from these analyses. The plots show one-sided p-values, following the hypothesis that family members have higher PRS than individuals from the CC samples. All PRS have been normalised using Z-score standardisation. a, b Comparison of FAMBD cases to CCcontrols. a FAMBD cases had higher BD PRS across all ten pPRS thresholds. The plot shows odds ratios (OR, y-axis, filled circles) and 95% confidence intervals (CI); pPRS thresholds are shown on the x-axis. Results for each threshold are coloured by their degree of significance (one-sided p-values): red = not significant, orange = nominally significant, green = significant after Bonferroni correction for multiple testing (α = 0.05/60 = 0.00083). The top-associated PRS (pPRS = 0.1) is indicated in bold font and was marked by a magenta circle (also in b). b For ten different PRS, this plot shows association statistics for the top-associated pPRS thresholds. The x-axis shows ORs. BD, SCZ, MDD: Standard PRS using the respective PGC GWAS summary statistics. Shared: Shared psychiatric PRS (SNPs with BD, MDD, SCZ p < 0.05, random-effects meta-analysis). BD-SCZ, BD-MDD: BD-specific GWIS PRS corrected for SCZ and MDD, respectively. SCZ-BD and MDD-BD: GWIS PRS for SCZ and MDD, each corrected for BD. LOAD: PRS for late-onset Alzheimer’s disease. Simulated: Mean and CI of the 10,000 simulated PRS at the pPRS with the lowest mean association p-value of all simulated PRS. The column to the left of the plot: pPRS with the strongest association. Supplementary Fig. S2 shows plots for all pPRS. Column to the right: pone-sided = one-sided p-value. For full association test statistics, see Supplementary Table S3. Bonferroni = significant after Bonferroni correction for multiple testing; nominal = nominally significant (p < 0.05); n.s. = not significant. c, d Comparison of FAMBD cases and unrelated CCBD cases. See Supplementary Fig. S3 and Table S4 for more detailed plots and full association test statistics. e, f Comparison of FAMunaffected and CCcontrols. See Supplementary Fig. S4 and Table S5 for more detailed plots and full association test statistics

Fig. 2
figure 2

a, b Comparison of PRS between FAMBD cases and FAMunaffected. The plots show one-sided p-values, following the hypothesis that BD cases have higher PRS than unaffected individuals. Further details of the plots are as described in the legend for Fig. 1. See Supplementary Fig. S5 and Table S6 for more detailed plots and full association test statistics. c, d Analyses of assortative mating (c) and anticipation (d). These plots were not adjusted for covariates; n = sample size. The y-axis shows the PRS values. c: Assortative mating. The plot shows violin- and boxplots of the BD PRS (pPRS = 0.05), comparing unaffected, married-in individuals with no parent among the ABiF families to other FAM and CC subjects. At pPRS = 0.05, married-in family members showed the highest BD PRS compared to CCcontrols (p = 6.5 × 10−5, Supplementary Fig. S6A and Table S7). The BD PRS of married-in individuals was not significantly higher than the PRS of FAMunaffected at any pPRS (p ≥ 0.167, Supplementary Fig. S6B and Table S7). Covariate used: sex. One-sided p-values were calculated, following the hypothesis that married-in individuals have higher PRS than other unaffected subjects. Note that, in the context of assortative mating, the boxplots of affected BD cases are displayed for reference only and have not been included in the analysis. d Anticipation: the BD PRS did not increase across generations. The plot shows violin- and boxplots of the BD PRS (pPRS = 1 × 10−5) across different generations of the FAM sample for the three diagnosis groups. At pPRS = 1 × 10−5, the association of the BD PRS with generation was strongest but not significant (p = 0.45; Supplementary Fig. S7A and Table S8). Married-in family members were excluded from this analysis. Covariates used: sex, age at the interview, diagnostic group. One-sided p-values were calculated, following the hypothesis that the PRS increase across generations

FAMBD cases had higher psychiatric PRS than controls from the general population

On average, familial FAMBD cases had higher BD PRS than unrelated CCcontrols across the pPRS thresholds (Fig. 1a, b; Supplementary Figs. S1 and S2 and Supplementary Table S3). The most substantial support for an increased BD PRS was found with the threshold pPRS = 0.1 (OR = 2.97, one-sided p = 1.9 × 10−11). FAMBD cases also had significantly higher SCZ PRS than CCcontrols; the increase of the MDD PRS was nominally significant (Fig. 1b).

Shared PRS generated from the variants associated jointly with BD, SCZ, and MDD were significantly increased at pPRS ≥ 0.01 in FAMBD cases compared to CCcontrols. The GWIS BD-MDD PRS—the BD PRS corrected for associations shared with MDD—were significantly increased in FAMBD cases compared to CCcontrols. All other disorder-specific GWIS PRS were not significantly higher in FAMBD cases after correction for multiple testing.

No significant increase was found for the negative-control PRS for late-onset Alzheimer’s disease (LOAD) and associations of the PRS for BD and SCZ and of the Shared PRS were significantly stronger than simulated PRS in FAMBD compared to CCcontrols (Table 2).

Table 2 The psychiatric disorder PRS can distinguish better between groups than simulated PRS

FAMBD cases had higher BD PRS than unrelated CCBD cases

The BD PRS was significantly higher in FAMBD than in CCBD cases at pPRS ≥ 0.05, but no other type of PRS was increased in FAMBD compared to CCBD cases (Fig. 1c, d). The association of the BD PRS was significantly stronger than simulated PRS (Table 2).

Unaffected family members showed higher psychiatric PRS than CC controls

In the comparison of FAMunaffected to CCcontrols, PRS for BD and SCZ were significantly higher in unaffected family members (Fig. 1e, f). The increases of the MDD, Shared, BD-MDD, and SCZ-BD GWIS PRS were nominally significant. The associations of BD and SCZ PRS were significantly stronger than the associations of simulated PRS (Table 2).

FAMBD cases had an increased PRS specifically for BD

In comparison to FAMunaffected, the BD PRS and the BD-MDD disorder-specific PRS were significantly higher in FAMBD (Fig. 2a, b). The Shared and the BD-SCZ PRS were increased at nominal significance.

Effects of assortative mating on BD PRS in family members

Eight of the 44 individuals who had married into the families had a diagnosis of MDD and none of BD (Table 1). While the unaffected married-in individuals had higher BD PRS than CCcontrols (p = 6.5 × 10−5), their BD PRS was not higher than the PRS of other FAMunaffected (Fig. 2c; Supplementary Fig. S6 and Supplementary Table S7). We also examined possible anticipation of BD in the families: neither did the BD PRS increase significantly over generations nor did the age at onset decrease over time (Fig. 2d; Supplementary Fig. S7 and Supplementary Table S8).

FAMMDD cases had higher psychiatric PRS than CCcontrols

In comparison to CCcontrols, FAMMDD cases had significantly higher BD and MDD PRS, increases of the Shared and SCZ PRS were nominally significant (Supplementary Figs. S8 and S9 and Supplementary Table S9). Both the BD and MDD PRS were increased at nominal significance when comparing FAMMDD to FAMunaffected (Supplementary Fig. S10 and Supplementary Table S10). Notably, in both comparisons, FAMMDD showed a nominal increase in SCZ-MDD PRS, but not in SCZ-BD PRS.


Genome-wide association studies in large samples of unrelated patients and controls have unravelled the polygenic nature of BD, i.e., many common variants, each with a small effect size, contribute to BD. It has also been consistently shown that BD, MDD, and SCZ share many risk-conferring variants. The aim of the present study was to investigate whether common variants also contribute to BD in families with a high density of the disorder and if so, whether these variants are specific to BD.

We found that, compared to CCcontrols, unrelated subjects from the general population unscreened for BD, affected and unaffected ABiF family members had an elevated genetic risk for the tested psychiatric disorders, mainly for BD but also for SCZ. FAMBD cases were characterised by a particularly high load of BD-specific risk variants: The strongest association observed across all comparisons was the increase of the BD PRS in FAMBD compared to CCcontrols. In addition, the BD but not the SCZ and MDD PRS of FAMBD were significantly higher than the PRS of unrelated CCBD cases and unaffected family members. Together with the disorder-specific GWIS PRS, these results support the major contribution of BD-associated variants to the high density of the disorder in the investigated families.

An increased polygenic psychiatric risk has also been described in other studies of BD multiplex families [41,42,43]. However, the scope and results of these studies differed from the present study to some extent: Fullerton et al. [41] described an increased BD PRS in affected family members compared to unrelated controls and, when selecting families with a high polygenic BD risk load, also to unaffected family members. They constructed PRS only based on a small set of 32 SNPs from an older GWAS [10], and no other PRS were investigated. De Jong et al. [43] focused their analyses in a large Brazilian family with BD and MDD on assortative mating and anticipation and found BD and SCZ PRS to be increased at nominal significance in affected compared to unaffected members. In a large Swedish pedigree with mainly BD but also some SCZ cases, Szatkiewicz et al. [42] reported increased SCZ PRS in affected family members compared to family-level and population controls, as well as BD PRS increased at nominal significance in affected family members compared to family controls. However, no differences were observed between unaffected family members and population controls. Of note, none of these studies investigated differences in PRS between families and unrelated BD cases.

Compared to the CCBD in our study, FAMBD displayed, apart from an earlier age at onset, signs of a less severe clinical picture, i.e., less frequent impairment and less psychosis. This could be explained by the fact that CCBD cases were almost all BD-I patients recruited from consecutive admissions to a hospital, while most of the FAMBD cases were reached through other family members in the context of the study. Apart from this, the FAMBD did not display any striking differences in clinical features compared to the CCBD. Thus, we consider it likely that the increased PRS in the FAMBD is linked to the familial aggregation and not to clinical characteristics.

It appears striking that none of the ABiF family members have been diagnosed with SCZ. However, this can most likely be attributed to ascertainment bias as the recruitment strategy focused on BD multiplex families. With respect to this lack of SCZ diagnoses in the ABiF families, it is of interest that the family members showed not only an increased BD PRS but also increased SCZ and Shared PRS compared to unrelated controls. This increase could be an indirect consequence of the genetic correlation between BD and SCZ [14, 16, 18,19,20,21]. Furthermore, affected family members also had higher Shared PRS than CCcontrols. Of the psychiatric disorder GWAS data sets (i.e., SCZ, BD, and MDD) used in the present analysis, the SCZ GWAS both identified the largest amount of risk loci (108, 30, and 44, respectively) and the corresponding PRS explained the highest amount of case/control variance (7%, 4%, and 2% on a liability scale, respectively) [12,13,14]. Taking this and the genetic correlations between the disorders into account, the SCZ PRS might have included more cross-disorder signals with smaller effects than the PRS of BD and MDD. If family members had an increased Shared risk burden, this cross-disorder risk might have rendered them vulnerable to psychiatric disorders in general, with the high BD PRS then shaping the final BD diagnosis outcome. Of note, the analyses of FAMMDD cases are discussed in the Supplementary Data.

Our study furthermore indicates that assortative mating may have contributed to the increased BD PRS in the ABiF families: in their study, de Jong et al. [43] found no increased PRS in married-in subjects, but an increase of polygenic risk and a decrease in age at onset over generations. We observed that individuals who married into the ABiF families had higher BD PRS than CCcontrols, and their BD risk load was similar to other FAMunaffected. At the time of the interview, none of the married-in family members had a diagnosis of BD. Nevertheless, their increased BD PRS suggests that assortative mating may have occurred. Unaffected individuals with an above average BD PRS may display sub-threshold characteristics of BD, such as a broader range of emotions [44,45,46]. Consistent with the observation that married-in subjects did not have higher BD PRS than the other FAMunaffected, no increase in BD PRS was found across generations. However, assortative mating may have contributed to the establishment and maintenance of a high genetic risk load for BD in these families. Furthermore, assortative mating may have already occurred in previous generations, for which no DNA was available. Of note, DNA was not available for all ABiF family members of the current generations, limiting the scope of the analysis of assortative mating.

Although both the FAM and CC samples were recruited in Spain [9], minor population differences may have influenced the present results. Even if such minor differences existed, it is unlikely that they caused the highly significant associations observed for the psychiatric PRS, given that the pairwise genetic relationship matrix was used as a random effect in the association analyses. Additionally, results from three further analyses support our assumption that systematic differences between the genotype data of FAM, CCcontrols, and CCBD samples did not distort our findings: First, we did not find significant differences between the cohorts in a population substructure analysis (see Supplementary Fig. S11 and Supplementary Methods). Second, PRS for LOAD were not significantly increased in family members in any analysis. Since LOAD shows no genetic correlation with BD, MDD, or SCZ [14, 47, 48], this result further supports the specificity of our analyses. Third, when a psychiatric disorder PRS was significantly increased in family members, this association was stronger than for simulated PRS. While these findings cannot entirely exclude the influence of unknown confounders on our results, we consider them as strong evidence that the high psychiatric PRS observed in family members compared to controls cannot be attributed to population or technical differences between the cohorts.

The lower a pPRS threshold in the GWAS training data, the fewer SNPs were included in the calculation of the corresponding PRS. In most cases, significant differences between groups were not observed for these low pPRS but the higher thresholds based on thousands of variants. This is commonly observed and in line with the polygenic nature of psychiatric disorders as complex disorders, with genome-wide significant SNPs only accounting for a small share of the polygenic signal. The training GWAS used for BD, SCZ, and MDD, the largest available for these phenotypes, differ in the number of included subjects, their statistical power, and the number of identified signals. Therefore, the derived PRS also differ in the number of SNPs used in the calculation of each threshold (see Supplementary Table S2). However, even though the BD GWAS was based on the smallest number of subjects and contained the lowest number of genome-wide-associated loci among the three GWAS, the BD PRS showed the strongest associations with BD case status or family membership, underlining the substantial contribution of BD risk variants to the development of BD in the ABiF families.

One limitation of the study is that the subjects of the unrelated control cohort were not systematically screened for psychiatric disorders. The lifetime prevalence of unipolar depression in this cohort (up to 14.4% until the time of the interview) was in line with typically observed numbers [27], the prevalence of BD was not assessed. However, as BD has a lifetime prevalence of ~1%, we expect up to three BD cases among the 277 controls, a number we consider unlikely to have markedly influenced our results. Moreover, using controls unscreened for BD instead of “super-healthy” controls as a comparison to family members and unrelated BD cases represents a conservative approach and thereby strengthens the observed group differences in psychiatric PRS.

Similarly, around one third of the CCBD reported a family history of BD. The CCBD thus do not represent a sample of truly sporadic BD cases. However, the aim of our study was to investigate how members of multiplex BD families differ from typical BD cases regarding the polygenic contribution to their disorder. The observation that ABiF multiplex cases showed a higher polygenic psychiatric risk than CCBD, despite part of the CC cases also reporting a family history for BD, thus rather strengthens the validity of our findings.

The present study generated substantial evidence that members of the ABiF families, including unaffected subjects, carried a higher risk burden of common genetic risk variants than an unrelated control sample mainly for the psychiatric disorders BD and SCZ and, at least the FAMMDD cases, for MDD. In line with previous theoretical assumptions [49] and preliminary results from a pilot study in a single ABiF family [26], our results suggest that a high polygenic load of common risk variants is a major contributor to the increased risk for BD and MDD in families with a high density of BD. However, given the modest effect sizes of the PRS, they explained only a fraction of the phenotypic variance, and rare mutations such as copy number variants [50] or rare single-nucleotide variants likely also play an important role in each of the families. Sequencing studies carried out in multiplex families have suggested rare variants are involved in the aetiology of BD [51,52,53]. To date, however, it has proven difficult to identify replicable causal associations between rare variants and BD susceptibility. In a pilot study that analysed a single ABiF pedigree, we did not identify any rare causal variants for BD [26]. The analysis of rare variants in the remaining ABiF families using next-generation sequencing technologies is envisioned for the future, including integrative analyses in international consortia such as the Bipolar Sequencing Consortium [54]. Of note, the present analyses did not assess single families separately, but integrated PRS associations across all examined 33 ABiF families. Thus, the degree to which common and rare variants shaped the emergence of psychiatric disorders may vary between families.

Furthermore, PRS are commonly based on and applied to sets of unrelated individuals, and polygenic risk might act differently in the case of familial genetic background. Moreover, a broad range of environmental factors have been shown to influence the risk of psychiatric disorders and might act on top of the increased genetic risk in these families. However, environmental factors have not been systematically assessed in the present study. To further enhance our understanding regarding the aetiology of BD, integrated analyses of common and rare variants, as well as of environmental risk in the ABiF families are warranted in the future.