Many psychiatric disorders have moderate to high heritability; however, the genetics of psychiatric disorders are complex and highly polygenic, with each risk variant only conferring a small effect1. Psychiatric disorders also have a high level of overlapping clinical heterogeneity, with shared genetic risk explaining some of the clinical overlap, and certain combinations of alleles may contribute to the same psychopathological symptoms in multiple psychiatric disorders. Furthermore, some psychiatric disorders may lie on a continuum rather than being disorders with distinct genetics and biological mechanisms2,3.

To accommodate this genetic complexity, investigations of psychiatric disorders have increasingly relied on polygenic risk scores (PRSs), leveraging knowledge from prior large genome-wide association studies (GWASs) to predict genetic risk of particular disorders in a new sample4. When the PRS for one disorder is predictive of a second disorder, this indicates a common polygenic contribution to the two disorders5.

Bipolar disorder (BD) is a complex illness with heterogeneous clinical presentation, and apparent sub-phenotypes often have a different course of illness, prognosis, and treatment response6,7,8,9. In order to personalize treatment, it is crucial to better understand biological underpinnings of BD clinical sub-phenotypes. One approach is to examine potential relationships of clinical phenotypes to different genetic profiles.

Historically, the relationship between schizophrenia (SCZ) and BD has shaped classification systems in psychiatry10. The corresponding link between phenotype and genetics was recently established with the demonstration that BD patients with a history of psychosis, particularly mood incongruent psychosis and psychosis during mania, have increased genetic risk for SCZ11,12,13,14,15,16. However, it is well recognized that BD overlaps genetically, and has high clinical comorbidity, with other major psychiatric conditions, including major depressive disorder (MDD), attention-deficit/hyperactivity disorder (ADHD), anxiety disorders, post-traumatic stress disorder (PTSD), obsessive compulsive disorder (OCD), borderline personality disorder, and substance use disorders17,18,19,20,21,22,23,24. While significant advances have been made in understanding the genetic relationship between BD psychotic sub-phenotypes and SCZ11,12,13,14,15,16, little is known about how genetic risks for other psychiatric disorders, as well as for important personality and lifestyle traits such as body mass index (BMI), risk-taking, and neuroticism, relate to psychosis or other BD clinical sub-phenotypes.

The goal of this study was to systematically test whether PRSs for major psychiatric conditions and other traits related to BD are predictors of distinct BD sub-phenotypes, in particular with regard to psychosis, age-of-onset, rapid cycling, and suicidal behavior. Understanding the shared genetic risk factors between BD clinical sub-phenotypes and other comorbid conditions may help make psychiatric clinical classification systems more biologically informed25.

Methods and materials


Mayo Clinic Bipolar Disorder Biobank

The Mayo Clinic Bipolar Disorder Biobank collection has been described in previous papers9,11,26. We restricted our analyses to cases with European ancestry (N = 968), because PRSs derived from GWASs of participants with European ancestry perform much worse in individuals of non-European ancestry27. Sub-phenotypes were determined using the Structured Clinical Interview for DSM-IV (SCID)28 as well as a patient questionnaire (Supplementary Table 1). These were administered by research coordinators who were trained and certified in the use of these tools. Any discrepancy between SCID diagnoses and diagnoses in medical records was reviewed by a licensed psychiatrist.

Genetic Association Information Network

The Bipolar Disorder Genome Study Consortium conducted a GWAS of BD as part of the Genetic Association Information Network (GAIN)29. We obtained the data from dbGaP (phs000017.v3.p1), and restricted our analyses to cases with European ancestry (N = 1001). All cases met criteria for DSM-IV-defined bipolar I disorder (BD-I). Subjects recruited at different times were interviewed once by study coordinators using the Diagnostic Interview for Genetic Studies version 2, 3, or 4 (DIGS 2, 3, 4) (Supplementary Table 1).

Genotyping and quality control

Mayo Clinic Bipolar Biobank

Genotyping and genetic data quality control of this sample were previously described as part of a larger case-control study11. Briefly, the Illumina® HumanOmniExpress platform (Illumina®, San Diego, CA, USA) was used to genotype 1046 BD cases. For quality control purposes, we excluded subjects with <98% call rate and related subjects. Single-nucleotide polymorphisms (SNPs) with call rate <98%, MAF < 0.01, and those not in Hardy–Weinberg Equilibrium (HWE; P < 1e-06) were removed. After these steps, 643,011 SNPs and 968 subjects remained.


Genotyping and quality control procedures for the GAIN-BD data were previously described by Smith et al.30. Briefly, the Affymetrix® Genome-Wide Human SNP Array 6.0 platform (Thermo Fisher Scientific, Waltham, MA, USA) was used to genotype cases. After excluding SNPs with call rate <98%, MAF < 0.01, and those not in HWE, 726,315 SNPs and 1001 subjects of European ancestry remained.
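The SNP-level filters described for the two samples (call rate, MAF, and HWE) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function names, the 0/1/2 genotype coding with `None` for missing calls, and the use of a 1-df chi-square test for HWE (rather than an exact test) are all assumptions made for illustration. The default thresholds follow the values stated above.

```python
import math

def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0/1/2 copies of one allele."""
    called = [g for g in genotypes if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)

def hwe_chi2_pvalue(genotypes):
    """P-value of a 1-df chi-square test of Hardy-Weinberg proportions.

    Assumption: a chi-square approximation stands in for whatever HWE
    test the original pipeline used.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: no departure from HWE is testable
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(genotypes, min_call_rate=0.98, min_maf=0.01, min_hwe_p=1e-6):
    """Keep a SNP only if it clears all three filters."""
    if call_rate(genotypes) < min_call_rate:
        return False
    if minor_allele_frequency(genotypes) < min_maf:
        return False
    return hwe_chi2_pvalue(genotypes) >= min_hwe_p
```

Checking the call rate first means that a SNP with no called genotypes is rejected before any allele frequency is computed.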


Genotypes in both the GAIN and Mayo Clinic samples were imputed to the 1000 Genomes reference panel, as previously described for the GAIN sample9. Specifically, SHAPEIT31 was used for haplotype phasing and imputation was performed using IMPUTE2.2.232 with the 1000 Genomes Project reference data (phase 1 data, all populations). Dosage data were converted to best-guess genotypes for the well-imputed (dosage R2 > 0.8) and common (MAF > 0.01) SNPs, resulting in more than 5 million SNPs in both datasets.
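As a rough illustration of the dosage-to-genotype step, the sketch below rounds an expected allele count to the nearest integer genotype and applies the two stated filters. The function names are hypothetical, ties are rounded upward by assumption, and imputation software may instead pick the genotype with the highest posterior probability, so this should be read as one plausible reading of the step rather than the procedure actually used.

```python
def best_guess_genotype(dosage):
    """Round an expected allele count in [0, 2] to the nearest of 0, 1, 2.

    Assumption: ties (0.5, 1.5) are rounded upward.
    """
    if not 0.0 <= dosage <= 2.0:
        raise ValueError(f"dosage out of range: {dosage}")
    return min(2, int(dosage + 0.5))

def convert_snp(dosages, r2, min_r2=0.8, min_maf=0.01):
    """Return best-guess genotypes for one SNP, or None if it is filtered out.

    The SNP is kept only if its imputation R2 and its dosage-based MAF
    both strictly exceed the thresholds quoted in the text.
    """
    freq = sum(dosages) / (2 * len(dosages))
    maf = min(freq, 1 - freq)
    if r2 <= min_r2 or maf <= min_maf:
        return None
    return [best_guess_genotype(d) for d in dosages]
```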

Polygenic risk scores

PRSs were included in the analysis if: (1) there was evidence of significant genetic correlation of the trait with BD and (2) we had at least 80% power to detect PRS association in a general case-only analysis of our data assuming 50% prevalence of the sub-phenotype. We began by considering PRSs for major psychiatric disorders (BD33, SCZ34, MDD35, ADHD36, anxiety37, PTSD19, OCD38, anorexia nervosa39, alcohol use disorder40, and insomnia41) and personality and lifestyle traits related to BD (alcohol consumption40, educational attainment (EA)42, risk-taking43, subjective well-being44, neuroticism45, anhedonia46, and body mass index (BMI)47). The GWAS summary statistics were restricted to well-imputed variants (INFO > 0.9) when information on imputation quality was available.

Using linkage disequilibrium (LD) score regression48, we estimated the genetic correlation of the above traits with BD33 (Supplementary Table 2). Insomnia and alcohol consumption did not have significant genetic correlation with BD and were therefore excluded from further analysis.

Using the R package AVENGEME49, we estimated that training sample sizes of 20,000 would achieve at least 80% power in our analysis assuming moderate overlap of the trait with the sub-phenotype (genetic covariance = 0.1), high polygenicity (# of independent SNPs = 20,000), and 0.005 α-level to account for multiple testing. The study of OCD included an effective sample size of <4000 and was thus excluded from further analysis. The final list of PRSs that were tested for association with BD sub-phenotypes is shown in Supplementary Table 2.

For traits that satisfied our inclusion criteria, the PRS-continuous shrinkage (CS)50 auto setting was applied to estimate SNP weights using a fully Bayesian shrinkage approach that shrinks SNP effects with a continuous shrinkage prior. This setting allows the algorithm to learn the global shrinkage parameter from the data to create one set of weights per PRS and therefore does not require a validation dataset. This setting also reduces the multiple testing of standard PRS analyses that search over many p-value thresholds51. PLINK version 1.952 was used to create PRSs using the shrunken SNP weights. The PRSs were then standardized to have a mean of zero and standard deviation (SD) of one.
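The final scoring and standardization steps amount to a weighted sum of allele counts followed by a z-transformation. The sketch below shows that arithmetic only; it does not reproduce PRS-CS or PLINK, and the SNP identifiers, weights, and function names are hypothetical.

```python
from statistics import mean, stdev

def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts over SNPs that have a weight.

    allele_counts: dict mapping SNP id -> 0, 1, or 2 for one subject.
    weights: dict mapping SNP id -> shrunken effect size (illustrative).
    """
    return sum(w * allele_counts[snp]
               for snp, w in weights.items() if snp in allele_counts)

def standardize(scores):
    """Rescale scores to mean zero and sample standard deviation one."""
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical weights and genotypes for three subjects.
weights = {"rs1": 0.1, "rs2": -0.2}
subjects = [
    {"rs1": 2, "rs2": 0},
    {"rs1": 1, "rs2": 1},
    {"rs1": 0, "rs2": 2},
]
raw = [polygenic_score(s, weights) for s in subjects]
z = standardize(raw)
```

After standardization, an odds ratio from a regression on the PRS is interpreted per one SD increase in the score, which is how the results are reported.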

Statistical analyses

In each dataset, we performed principal components (PCs) analysis of the genotyped SNPs and kept the first four PCs to be used as within-study nested covariates in subsequent PRS association analyses. In all models, study indicator and an interaction of study and within-study PCs were included as covariates to control for population stratification. All 14 PRSs were individually modeled using a multivariable logistic regression model with each sub-phenotype (psychosis, early-onset BD, rapid cycling, and attempted suicide) as the outcome.
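One way to read "within-study nested covariates" is that each study gets its own block of PC columns, so a subject's PCs enter the model only through the coefficients for that subject's study. The sketch below builds one such design-matrix row; the column ordering, the study labels, and the function name are assumptions for illustration, and the row would still need to be passed to a logistic regression routine.

```python
def design_row(prs, study, pcs, studies=("GAIN", "Mayo")):
    """Build [intercept, study indicator(s), per-study PC blocks, PRS].

    prs: standardized polygenic risk score for the subject.
    study: label of the subject's study (labels here are illustrative).
    pcs: the subject's within-study principal components.
    """
    if study not in studies:
        raise ValueError(f"unknown study: {study}")
    row = [1.0]
    # Indicator columns for every study except the first (reference) one.
    row += [1.0 if study == s else 0.0 for s in studies[1:]]
    # One block of PC columns per study; only the subject's own block is
    # non-zero, which is the study-by-PC interaction.
    for s in studies:
        row += [pc if study == s else 0.0 for pc in pcs]
    row.append(prs)
    return row
```

With two studies and four PCs this gives 11 columns per subject: an intercept, one study indicator, two blocks of four PCs, and the PRS.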

We used 10,000 permutations to find the significance threshold to control the false positive rate testing for association with each sub-phenotype with 14 PRSs (α = 0.005) as well as the family-wise error rate (α = 0.001). For each sub-phenotype, we also included all significant PRSs (p < 0.005) in a joint model, to estimate the relative contribution of the PRSs after adjusting for other important PRSs. We report the variance explained in the sub-phenotype by each PRS after adjustment for other PRSs using Nagelkerke’s pseudo-R2 statistic. All statistical analyses were performed in R 3.5.2.
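A permutation-derived significance threshold can be sketched with the common minimum-p approach: shuffle the sub-phenotype labels, record the smallest p-value across all PRSs in each permutation, and take a low quantile of those minima as the threshold. The text does not spell out the exact procedure, so this is an assumption, as are the function names and the illustrative target error rate of 0.05. To stay self-contained, the sketch also swaps the covariate-adjusted logistic regression for a simple Welch z-test on group means.

```python
import math
import random

def two_sided_p(scores, labels):
    """Stand-in association test: Welch z-test of mean PRS between groups."""
    a = [s for s, y in zip(scores, labels) if y == 1]
    b = [s for s, y in zip(scores, labels) if y == 0]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    z = (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))
    # Two-sided normal tail probability.
    return math.erfc(abs(z) / math.sqrt(2))

def minp_threshold(prs_columns, labels, fwer=0.05, n_perm=1000, seed=0):
    """P-value threshold controlling the family-wise error rate at `fwer`.

    prs_columns: one list of scores per PRS, all in the same subject order.
    labels: 0/1 sub-phenotype indicator per subject.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any PRS-outcome association
        min_ps.append(min(two_sided_p(col, shuffled) for col in prs_columns))
    min_ps.sort()
    # The fwer-quantile of the null distribution of the minimum p-value.
    k = max(0, int(fwer * n_perm) - 1)
    return min_ps[k]
```

Because the labels are shuffled jointly for all PRSs, the correlation among the scores is preserved, which is the main advantage of this approach over a Bonferroni correction.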


Sample description

Table 1 summarizes the demographic and sub-phenotype information of each study. There was a difference in the sex distribution between the two samples. The GAIN study only included BD type I cases, and the distributions were also significantly different for all sub-phenotypes besides attempted suicide. The GAIN-BD cases had a higher rate of psychosis and early-onset BD, while Mayo Clinic cases had higher rates of rapid cycling, which is more prevalent in women53.

Table 1: Sub-phenotypes and sex for each study.

Figure 1 shows a forest plot of the significant PRS associations with each sub-phenotype further broken down by study. Further detailed results for each sub-phenotype can be found in Supplementary Tables 3–6.

Fig. 1: Forest plot of significant PRS associations with each sub-phenotype stratified by study (black = combined; green = GAIN; maroon = Mayo Clinic).

Each bar represents a 95% CI of the increased log (odds) in the sub-phenotype associated with one SD increase in the PRS. The P-values for each PRS included in the model by itself (P.m) or with other significant PRSs (P.j) and adjusted Nagelkerke’s R2 (R2) are listed in the margins for each PRS.


Cases with psychosis versus no psychosis had higher PRSs for SCZ (OR = 1.3, 95% CI 1.15–1.48; p-value = 3.5e-5), but lower PRSs for anhedonia (OR = 0.87, 95% CI 0.79–0.95; p-value = 0.003), and BMI (OR = 0.87, 95% CI 0.79–0.95; p-value = 0.004). These three PRSs explained 2.6% of the variation in psychosis in the joint model. While anhedonia is a component of MDD and the two PRSs are positively correlated (r = 0.41), the PRS for MDD was not associated with psychosis in BD (OR = 0.96, 95% CI 0.87–1.06; p-value = 0.45).
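The reported odds ratios and confidence intervals can be related to the log(odds) scale of Fig. 1 with a short calculation. The sketch below assumes the intervals are symmetric Wald intervals on the log-odds scale (an assumption, not stated in the text) and recovers the coefficient, its standard error, the z statistic, and a two-sided p-value. Because the published values are rounded, p-values recovered this way will match the reported ones only approximately.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

def wald_from_or(odds_ratio, ci_low, ci_high):
    """Recover (beta, se, z, p) from an odds ratio and its 95% CI.

    Assumes a Wald interval: exp(beta +/- Z_975 * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, z, p

# SCZ PRS and psychosis, using the rounded values reported above.
beta, se, z, p = wald_from_or(1.3, 1.15, 1.48)
```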

Early-onset BD

Higher PRSs for risk-taking (OR = 1.21, 95% CI 1.09–1.35; p-value = 0.0005; adj. Nagelkerke’s R2 = 0.8%) and anhedonia (OR = 1.16, 95% CI 1.05–1.29; p-value = 0.0047; adj. Nagelkerke’s R2 = 0.8%) were observed in cases with early-onset BD compared to cases that developed BD after age 18.

Rapid cycling

Cases with rapid cycling versus those without rapid cycling had higher ADHD PRS (OR = 1.23, 95% CI 1.11–1.36; p-value = 7e-5; adj. Nagelkerke’s R2 = 0.8%), MDD PRS (OR = 1.23, 95% CI 1.11–1.36; p-value = 4e-5; adj. Nagelkerke’s R2 = 0.5%), PTSD PRS (OR = 1.28, 95% CI 1.14–1.44; p-value = 4e-5; adj. Nagelkerke’s R2 = 0.7%), and PRS for anxiety (OR = 1.19, 95% CI 1.07–1.33; p-value = 0.001; adj. Nagelkerke’s R2 = 0.1%). Cases with rapid cycling also had lower BD PRSs (OR = 0.80, 95% CI 0.68–0.93; p-value = 0.004; adj. Nagelkerke’s R2 = 0.9%). The five PRSs explained 3.9% of the variation in rapid cycling when included in one model.

Attempted suicide

The genetic risk for MDD (OR = 1.26, 95% CI 1.15–1.39; p-value = 1e-6; adj. Nagelkerke’s R2 = 0.7%) and anhedonia (OR = 1.22, 95% CI 1.12–1.34; p-value = 2e-5; adj. Nagelkerke’s R2 = 0.3%) was higher in cases with at least one suicide attempt versus those with none. Cases with an attempted suicide also had a lower PRS for EA (OR = 0.87, 95% CI 0.79–0.96; p-value = 0.0036; adj. Nagelkerke’s R2 = 0.2%). The three PRSs explained a total of 2.3% of the variation when included in one model, but only the MDD PRS remained significant after accounting for the other PRS associations.


To our knowledge, this is the first PRS dissection of clinical sub-phenotypes in BD that is comprehensive with respect to the range of psychiatric, personality, and lifestyle phenotypes for which genetic liabilities were estimated and used to predict the BD sub-phenotypes. Previous analyses of BD sub-phenotypes of psychosis, early-onset BD, and suicide focused on genetic liability to the major psychiatric diagnoses of BD, SCZ, and MDD6,11,12,16,54. Here, we took an expanded agnostic approach to PRS analysis by using many different PRSs beyond just these three to more systematically test for PRS association with clinically important sub-phenotypes of BD, including rapid cycling. Importantly, the contribution of each PRS to a sub-phenotype was assessed after adjusting for the other PRSs’ contributions, thereby assessing how predictive a genetic risk is above and beyond other correlated genetic risks (Supplementary Fig. 1). Our results were highly comparable in the two cohorts, lending greater confidence to the conclusions (Fig. 1). Overall, we find that the different BD clinical sub-phenotypes have different profiles of PRS associations with major psychiatric conditions.

BD with psychosis

Previously, using the Mayo Clinic sample, we showed that BD patients with a history of psychosis during mania had higher genetic risk for SCZ11. Here, this finding is replicated in the GAIN cohort. This finding was also reported by Ruderfer et al.16 in a larger study that included both the Mayo Clinic and GAIN-BD cohorts. In addition to this relationship, in the present study we also found that BD cases who had not experienced psychotic symptoms had higher genetic scores for anhedonia and BMI. Association of higher genetic risk for anhedonia with a subtype of BD without psychotic features implies that a patient with greater genetic predisposition for anhedonia during major depressive episodes is less likely to have episodes with psychotic features. In fact, rates of psychotic features are higher in BD compared with MDD, and familial studies show a greater heritability of psychotic features in BD than in other mood disorders55. Interestingly, we did not observe a significant association with MDD PRS despite a strong genetic correlation between anhedonia and MDD. This may underscore the importance of relying on core symptoms in these analyses, instead of using more complex and syndromal entities like MDD. The relationship between BMI and psychosis is complex and influenced by heritable, environmental, and iatrogenic factors. Over the course of illness, most patients with BD and psychosis gain weight, which contributes to morbidity and mortality56,57. Our finding that BD patients with psychosis have lower genetic predisposition to elevated BMI than BD patients without psychosis suggests that weight gain in those with psychosis may be a side effect of medications, which is in line with historic observations predating discovery of antipsychotic medications58. However, the complex relationship between BD and greater body weight needs to be further explored in the context of sub-phenotypes and use of atypical antipsychotics or lithium. Likewise, future studies should further elaborate on the association between BD psychosis sub-phenotypes, e.g. BD with mood congruent vs incongruent psychosis, and the PRSs investigated here.

Early-onset BD

We found evidence that higher genetic liability for risk-taking behavior was associated with early-onset BD, but no evidence that genetic risk for SCZ or BD was associated with age of onset of illness. A previous study of polygenic associations with age-of-onset of BD also showed no association of SCZ or BD genetic risk with either a dichotomous sub-phenotype, as defined in our study, or continuous age-of-onset59. Risk-taking is a hallmark feature of normative adolescence but is also commonly seen in mania. There are several possible explanations for the risk-taking PRS and early-onset BD association found in this study. Perhaps the simplest explanation is that youth with particularly high propensity for risk-taking behaviors come to clinical attention earlier and subsequently have BD identified at an earlier age. However, there are several potential limitations that may have affected these findings. Due to the way the data were collected, age of onset was dichotomized in this study based on a cutoff age of 18, which may have reduced power. Also, our early-onset BD definition did not differentiate between age of first manic and first depressive episodes. Furthermore, given the high genetic and clinical overlap between BD and other conditions investigated here (e.g. ADHD), a study of age-of-onset of any psychiatric disorder/symptom rather than just BD could be informative. It is of note that earlier onset MDD is associated with more pronounced aggressive/impulsive traits60. Nevertheless, the observed association of risk-taking PRS with early vs. late onset BD is intriguing and warrants further investigation.

BD with rapid cycling

Previous clinical studies have shown a strikingly higher clinical comorbidity rate of ADHD in BD patients with rapid cycling compared to non-rapid cycling BD patients61. While the general genetic association between ADHD and BD has been described before17, our study is the first to show possible genetic underpinnings for this specific association between rapid cycling BD and ADHD. We also found a strong association of MDD genetic risk with rapid cycling. This implies that genetic variation related to ADHD and MDD may also be related to episode frequency in BD, and that comorbid ADHD and more depressive episodes would be clinically associated with the rapid cycling form of BD, though the predominant directionality of mood episodes was not discernible from the available data. Rapid cycling BD has been reported to have more episodes of major depression and a higher rate of parental MDD compared with non-rapid cycling BD62, which is consistent with our PRS association findings. Finally, rapid cycling cases had lower BD PRS, as reported in a previous investigation15. This could simply reflect that the prevalence of rapid cycling in cases ascertained for the sample used in the GWAS of BD by the Psychiatric Genomics Consortium (PGC) was lower than in the two samples included here, but it still demonstrates a systematic difference in the genetics of rapid cycling and non-rapid cycling BD.

BD with a history of a suicide attempt

Our finding of increased MDD genetic load in BD patients with a history of suicide attempts is consistent with a recent study that included both the Mayo Clinic and GAIN data, which showed that genetic risk factors for MDD increase the risk for suicide trans-diagnostically54. The higher MDD genetic liability in BD with a history of suicide attempt is consistent with the clinical observation that suicide attempts are most common during major depressive episodes or mixed states and rare during manic episodes or while euthymic63,64. Interestingly, even after adjusting for MDD PRS, we also found that genetic liability for anhedonia is marginally associated with suicide, suggesting that anhedonia may be a particularly relevant factor contributing to suicidality, compared to other components that comprise the MDD syndrome. This is consistent with findings from non-genetic studies, which found that the association of anhedonia with suicidality is independent of the association with depression and psychotic features65.

Methodological limitations

The PRSs used in this study are based on data from previously published large scale investigations and are limited by the diagnostic accuracy, recruitment criteria, and methodology of previous studies. The most recent PGC study of BD33 included the cases and controls from the GAIN and Mayo Clinic samples. Sample overlap of testing datasets with training data can create substantial biases in PRS analyses. Here, however, we studied genetic differences within cases, so the sample overlap is not expected to bias our results, because there should be little correlation between the case–control status used to build the training models and the within-case sub-phenotypes. Nevertheless, this is an open research question that requires further methodological study, and caution is needed when interpreting associations of BD-PRS with sub-phenotypes. Also, because sample sizes in training GWASs vary, the power to detect PRS associations with sub-phenotypes in our analyses was not uniform.

Another limitation is that data collection using SCIDs and patient questionnaires was cross-sectional and does not consider developmental trajectories of psychopathology over time. The data also lack the number, duration, and severity of major depressive and manic episodes needed to more precisely map the clinical picture onto the PRS profile. Furthermore, sub-phenotypes such as rapid cycling and psychosis could be substance-induced, but this was not assessed in this study. Finally, it is important to note that no PRS explained a large amount of variation in our analysis. Thus, while the associations identified in this study provide evidence of genetic differences that may underlie clinical subtypes of BD, these PRSs cannot yet be used for purposes of personalized psychiatry.


Our findings contribute to the understanding of the underlying genetic causes of clinical heterogeneity of BD and of comorbidity between BD and other major psychiatric conditions. We find evidence that psychopathologic components of BD, including psychotic symptoms, rapid cycling, and suicidal behavior are linked to the PRSs for related disorders including schizophrenia, ADHD, and MDD, respectively. Finally, larger studies are needed to more precisely map genetic risk factors to clinical sub-phenotypes. Harmonization of sub-phenotypes across studies is a well-recognized challenge. Nevertheless, such efforts are critical in helping to classify psychiatric disorders more accurately and identify risk of suicide, psychosis, and other adverse outcomes in patients.