Introduction

Bipolar disorder is a complex psychiatric disorder typically diagnosed in young adulthood [1]. It affects mood, energy, activity, and concentration and is characterized by episodes of mania, hypomania and major depression. Bipolar disorder is ranked as one of the top causes of lost years of life and health in 15- to 44-year-olds [2] and affects 2–3% of the population worldwide [3]. Despite the characteristic phenotype, bipolar disorder is often misdiagnosed, leading to inappropriate or delayed treatment, which contributes to the high burden of morbidity and mortality [3, 4]. Observational studies suggest that early intervention might improve disease course and outcome, and that delay to first treatment is associated with poorer outcomes [3, 5]. Earlier identification of warning signs and improved intervention efforts would therefore be of great importance [6].

Bipolar disorder aggregates in families, and twin studies report heritability of 60–80% [7, 8]. Molecular genetic studies have led to valuable insights about the genetic influence on bipolar disorder [9,10,11]. Genome-wide association studies (GWAS) indicate that a substantial part of the genetic liability is conferred by common single nucleotide polymorphisms (SNPs) [12, 13]. In the most recent GWAS, approximately 64 SNPs with genome-wide significant associations with bipolar disorder were identified [10]. The variance in bipolar disorder explained by additive effects of common SNPs (SNP heritability) has been estimated to be 18.6% [10].

Polygenic risk scores (PRS) use data from large-scale GWAS and combine the effects of many common SNPs to capture the cumulative effect of risk alleles [14]. Risk alleles for bipolar disorders have been reported to overlap with other psychiatric diagnoses such as schizophrenia, major depressive disorders, autism spectrum disorder, anxiety disorders, and traits like intelligence [15,16,17].

Genetic risk for bipolar disorder might manifest in clinical and sub-clinical neurodevelopmental, emotional, or behavioral traits in the general population. The timing and nature of such associations could be informative for early identification efforts. High-risk prospective studies of the offspring of adults with bipolar disorder show elevated rates of behavioral disorders, attention-deficit/hyperactivity disorder (ADHD), anxiety and depression as well as bipolar spectrum disorder in the offspring [18]. Bipolar PRS are also associated with risk of bipolar disorder in high-risk offspring [19]. These studies suggest that bipolar PRS may manifest early in development.

Prospective population-based cohorts, like the Norwegian Mother, Father and Child Cohort Study (MoBa), represent a unique framework for studying the associations between development, behavior and emotions, and genetic risk from an early age [20]. Only a few studies have used PRS to investigate associations between genetic risk for bipolar disorder and traits related to mental health in adulthood [21, 22], and as far as we know only two studies have investigated such associations in childhood [23, 24]. How PRS for bipolar disorder is associated with development, behavior and emotions during early childhood in the general population therefore remains uncertain [24].

The potential for bipolar disorder PRS to be differently associated between sexes has been under-explored. The prevalence of bipolar disorder is similar in males and females, but females are more likely to experience rapid cycling and mixed states, and to have patterns of comorbidity that differ from males [1].

Access to the largest and most recent bipolar disorder GWAS provides increased statistical power. In the present study, we aimed to investigate whether, when, and how PRS for bipolar disorder is associated with development, behavior and emotions during early to middle childhood in the general population, in both males and females.

Materials and methods

Participants and measures

MoBa is a longitudinal prospective pregnancy cohort including approximately 114,500 children, their mothers, and fathers [25, 26]. Between 1999 and 2008 pregnant women were recruited to the study and gave written informed consent to participation in 41% of the pregnancies. Blood samples were collected from the children’s umbilical cord at birth [27]. The establishment of MoBa and initial data collection was based on a license from the Norwegian Data Protection Agency and approval from The Regional Committees for Medical and Health Research Ethics. MoBa is currently regulated by the Norwegian Health Registry Act. The current study was approved by The Regional Committees for Medical and Health Research Ethics (14140) and has undergone a Data Protection Impact Assessment.

Mother-reported questionnaire data were collected at different time points, during pregnancy and after birth. In our analyses we use data collected when the children were aged 6 and 18 months, and 3, 5, and 8 years. These questionnaires include several measures of developmental traits, as described elsewhere [28]. The current study is based on version 12 of the quality-assured data files released for research in January 2022.

The phenotools package v0.2.7 (https://github.com/psychgen/phenotools) in R was used to prepare the outcome variables and to ensure reproducibility. For each instrument, a mean score was computed across the items, requiring at least half of the items to be non-missing. This mean was then multiplied by the number of items in the instrument to give a representative score on the scale of the instrument. Items were reverse coded where necessary so that higher values consistently indicated higher levels of the measured trait.
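The scoring rule described above can be illustrated with a minimal sketch. This is not the phenotools implementation (which is written in R). The function names and the use of `None` for a missing item are assumptions made only for illustration.

```python
from typing import Optional, Sequence


def prorated_scale_score(items: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the answered items, rescaled to the full instrument length.

    Returns None when fewer than half of the items were answered.
    Missing items are represented as None (an assumption of this sketch).
    """
    answered = [value for value in items if value is not None]
    # Require at least half of the items to be non-missing.
    if len(items) == 0 or 2 * len(answered) < len(items):
        return None
    mean_item = sum(answered) / len(answered)
    # Multiply by the number of items so the score sits on the instrument's scale.
    return mean_item * len(items)


def reverse_code(value: float, minimum: float, maximum: float) -> float:
    """Flip an item so that a higher value means more of the measured trait."""
    return maximum + minimum - value
```

For a hypothetical four-item instrument with one missing answer, `prorated_scale_score([1, 2, None, 3])` gives the mean of the three answered items (2.0) multiplied by four items.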

Social communication difficulties were derived from the Ages and Stages Questionnaire (ASQ) scale at 6 months [29]. Both social communication difficulties and repetitive behavior were assessed using items from the Modified Checklist for Autism in Toddlers [30, 31] at 18 months and items from the Social Communication Questionnaire [32] at 3 and 8 years. The mothers reported on a short version [33] of the Childhood Autism Spectrum Test [34] when the children were 5 years old.

Inattention and hyperactivity/impulsivity measures were derived using the Diagnostic and Statistical Manual of Mental Disorders (DSM)-oriented ADHD problems scale of the Child Behavior Check List (CBCL) [35] at 18 months and 3 years. At 5 years the revised Conners’ Parent Rating Scale [36] was used, and items from the Parent/Teacher Rating Scale for Disruptive Behavior Disorders (RS-DBD) [37] were used at 8 years.

Disruptive behaviors were assessed using items from the Aggressive behavior syndrome scale of the CBCL at age 18 months, 3 and 5 years for aggression, and by the RS-DBD at age 8 years, divided into measures of oppositional defiant and conduct difficulties.

Language difficulties were measured at 18 months, 3 and 5 years using the ASQ [29] and the Children’s Communication Checklist-2 [38] at 8 years. Motor difficulties were measured using the ASQ at 6 and 18 months, and the Children’s Development Inventory [39] at 5 years.

Emotional difficulties were assessed at 18 months, 3 and 5 years using the CBCL [40]. Anxiety and depressive signs were measured separately by the 5-item Screen for Child Anxiety Related Disorders [41] and the Short Mood and Feelings Questionnaire [42] at 8 years.

Temperamental/personality traits were measured at 6 months using the Infant Characteristics Questionnaire [43]. At 18 months, 3 and 5 years the Emotionality, Activity and Shyness Temperament Questionnaire [44] was used, and the Norwegian short form of the Hierarchical Personality Inventory for Children [45] was used at 8 years for the personality traits neuroticism, imagination, extraversion, conscientiousness and benevolence.

Details about each instrument are given in Supplementary Text 1.

As a secondary set of outcomes to contextualize manifestations of bipolar disorder PRS in clinical terms, we extracted diagnostic outcomes from the Norwegian Patient Registry, which contains International Classification of Diseases, Tenth Revision (ICD-10) diagnoses for inpatients and outpatients reported from all hospitals and specialized health care services in Norway from 2008 to 2019 [46]. We created the following diagnostic groups: ADHD without conduct disorder (ICD-10 codes F900, F908 and F909, n = 1738), Disruptive behavior disorders (ICD-10 codes F91, F901 and F92, n = 348), Autism spectrum diagnosis (ICD-10 code F84, n = 332), Affective disorders (ICD-10 codes F31-F39, n = 164) and Anxiety disorders (ICD-10 codes F40, F41 and F93, n = 649).

Genotyping and polygenic scores

The genotyping, imputation and quality control of the genetic data are described in Supplementary Text 2. Genotype data passing quality control filters were available for 28,001 unrelated children (47.9% female) of European genetic ancestry. Information on sex was retrieved from the Medical Birth Registry, which is a national health registry containing information about all births in Norway.

PRS for bipolar disorder were generated, using PRSice2 [47], for each child in our analytic sample, based on summary statistics from European samples from the most recent Psychiatric Genomics Consortium (PGC) GWAS for bipolar disorder (41,917 cases and 371,549 controls) [10]. Summary statistics from the PGC were subject to standardized quality control as outlined in the original paper [10].

The PRS was adjusted for the covariates genotyping batch and population stratification. We used PRS built on 10 p-value thresholds for inclusion of SNPs with progressively weaker associations with the disorder in the original GWAS. To ensure that there could be no overlap between the discovery sample and our target sample, the Norwegian sub-sample (1,883 adults with bipolar disorder and 47,237 controls) was removed from the PGC GWAS.

To guard against inflated Type I error from overfitting, we performed a principal component analysis on the set of 10 PRS, using the prcomp function in R. The first principal component reweights the variants included to achieve maximum variation over all 10 PRS thresholds used [48]. The first PRS principal component was used as the exposure variable in the regression models. All statistical analyses therefore reflect associations with this PRS principal component, but the abbreviation PRS is used in the text.
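As a rough illustration of how a single exposure can be derived from PRS computed at several thresholds, the sketch below extracts first-principal-component scores by power iteration. It is not the authors' code, which used `prcomp` in R. Standardizing each threshold's PRS before the decomposition is an assumption of this sketch (`prcomp` centers but does not scale by default), and all names are hypothetical.

```python
import math
from typing import List, Sequence


def standardize(column: Sequence[float]) -> List[float]:
    """Center a column and scale it to unit (sample) standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (len(column) - 1))
    return [(x - mean) / sd for x in column]


def first_principal_component(scores: Sequence[Sequence[float]],
                              iterations: int = 500) -> List[float]:
    """Return first-PC scores; rows are individuals, columns are PRS thresholds."""
    n_rows, n_cols = len(scores), len(scores[0])
    columns = [standardize([row[j] for row in scores]) for j in range(n_cols)]
    # Correlation matrix of the standardized PRS columns.
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / (n_rows - 1)
             for j in range(n_cols)] for i in range(n_cols)]
    # Power iteration converges to the leading eigenvector (the PC loadings).
    weights = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        product = [sum(corr[i][j] * weights[j] for j in range(n_cols))
                   for i in range(n_cols)]
        norm = math.sqrt(sum(v * v for v in product))
        weights = [v / norm for v in product]
    # Orient the component so that higher scores mean higher PRS overall.
    if sum(weights) < 0:
        weights = [-w for w in weights]
    return [sum(weights[j] * columns[j][i] for j in range(n_cols))
            for i in range(n_rows)]
```

The returned vector has one value per individual and could stand in for the exposure variable in a regression. The sign of a principal component is arbitrary, which is why the sketch fixes its orientation explicitly.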

Statistical analyses

The lavaan-package v0.6-7 [49] was used to run multi-group linear regression models in R v3.6, estimating associations between the PRS and each of the outcome measures. Sex differences were investigated by including sex as a grouping variable. PRS and outcome measures were standardized to zero mean and unit variance prior to analyses.
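The following is a simplified sketch of the idea behind a sex-stratified estimate: standardize the PRS and the outcome across the whole sample, then estimate a separate regression slope in each group. It uses ordinary least squares rather than the multi-group models fitted in lavaan, it does not compare a sex-difference model against a sex-invariant one, and all names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Rescale to zero mean and unit (sample) variance."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


def slopes_by_group(prs: Sequence[float], outcome: Sequence[float],
                    group: Sequence[str]) -> Dict[str, float]:
    """Least-squares slope of outcome on PRS, estimated separately per group."""
    z_prs, z_out = standardize(prs), standardize(outcome)
    slopes: Dict[str, float] = {}
    for label in sorted(set(group)):
        x = [a for a, g in zip(z_prs, group) if g == label]
        y = [b for b, g in zip(z_out, group) if g == label]
        mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        sxx = sum((a - mean_x) ** 2 for a in x)
        slopes[label] = sxy / sxx
    return slopes
```

Because both variables are standardized before the data are split, the two slopes are on a common scale and can be compared directly between groups.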

To account for multiple testing, we applied a Bonferroni correction to the critical p-value threshold corresponding to an alpha level of 5%, based on the number of effective tests run. To determine the number of effective tests, we ran a principal component analysis on all 51 outcome measures [50]. The number of effective tests was defined as the number of principal components needed to explain 80% of the variance, which was 30. An alpha level of 5% therefore corresponds to a corrected p-value threshold of 0.0017 (0.05/30).
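The arithmetic of this correction can be sketched as follows. The sketch assumes the eigenvalues of the outcome correlation matrix are already available (for example from a principal component analysis). The function names are hypothetical.

```python
from typing import Sequence


def effective_number_of_tests(eigenvalues: Sequence[float],
                              variance_share: float = 0.80) -> int:
    """Count the leading components needed to reach the given variance share."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= variance_share:
            return count
    return len(ordered)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Critical p-value after dividing alpha by the number of tests."""
    return alpha / n_tests


# With the 30 effective tests reported in the text:
print(round(bonferroni_threshold(0.05, 30), 4))  # 0.0017
```

Counting components rather than raw outcomes makes the correction less conservative when outcomes are correlated, since correlated measures do not represent fully independent tests.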

We used equivalence testing to examine whether estimated effects could be considered equivalent to zero in practical terms. This is done by setting bounds around the point null to create a region of practical equivalence to zero, with the bounds based on a selected smallest effect size of interest (SESOI). The equivalence testing procedure involves performing two one-sided tests to determine whether effects at least as extreme as the SESOI can be rejected [51]. Objective SESOI-setting depends upon the intended application of the estimate (for example, as a tool for clinical stratification) and the strength and nature of the existing evidence for the effect in question. In the absence of an objectively agreed-upon SESOI, equivalence testing using a pre-specified SESOI can provide useful additional context when used in conjunction with null hypothesis significance testing (NHST). We therefore used a benchmark value of half a ‘small effect’ (i.e., Cohen’s d = 0.1) [52] as our SESOI. This is intended as an agnostic starting point, with the aim that more informed SESOI will become possible as this approach becomes more commonplace in the field.

The equivalence testing was also performed with an alpha level of 5%, after multiple testing correction (as described above). However, because equivalence testing consists of two one-sided tests, both of which need to be significant to support a conclusion of practical equivalence to zero, this alpha level is preserved by assessing whether 90% (and not 95%) confidence intervals fall within the region of practical equivalence to zero. The equivalence test results are therefore presented with multiple testing-corrected 90% confidence intervals for straightforward interpretation in relation to the null region defined by the SESOI.
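The link between two one-sided tests and the confidence interval can be sketched as below. For a per-test alpha, the matching interval has coverage 1 - 2 × alpha, so an uncorrected alpha of 5% gives a 90% interval. The sketch assumes a normal approximation for the estimate, and the estimate and standard error passed in the usage note are made-up illustration values, not results from the study.

```python
from statistics import NormalDist
from typing import Tuple


def equivalence_interval(estimate: float, standard_error: float,
                         alpha: float) -> Tuple[float, float]:
    """Confidence interval with coverage 1 - 2*alpha (normal approximation).

    Two one-sided tests at level alpha are both significant exactly when
    this interval lies inside the equivalence bounds.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    return (estimate - z * standard_error, estimate + z * standard_error)


def within_region(interval: Tuple[float, float], sesoi: float) -> bool:
    """True when the whole interval lies inside (-sesoi, +sesoi)."""
    low, high = interval
    return -sesoi < low and high < sesoi
```

With a hypothetical estimate of 0.01 and standard error of 0.01, `within_region(equivalence_interval(0.01, 0.01, 0.05 / 30), 0.1)` is true: the interval lies inside the region defined by a SESOI of 0.1, so practical equivalence to zero would be supported.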

As a secondary set of analyses, to see if any patterns from the analyses of dimensional measures could be replicated for categorically defined diagnoses, we ran logistic regression models estimating the associations between the PRS and psychiatric diagnoses in childhood. These analyses are considered as secondary analyses due to power issues preventing sex stratification as per the primary analyses.

Results

Descriptive statistics for the developmental outcomes are presented in Table 1 for males and females separately, including number of items and Cronbach’s alpha for each measure. Supplementary Fig. 1 shows the correlation between these outcomes.

Table 1 Descriptive statistics for measures of neurodevelopmental traits at all ages for children with genotype data.

PRS for Bipolar Disorder and developmental outcomes

The associations between bipolar disorder PRS and developmental outcomes at different time points and in males and females are shown in Fig. 1. Only two outcomes passed multiple testing correction, and we found no robust evidence of sex differences. There was robust evidence for a sex invariant association with conduct difficulties (β = 0.041, CI = 0.020–0.062, P = 0.0001) and oppositional defiant difficulties (β = 0.032, CI = 0.014–0.051, P = 0.0006).

Fig. 1: Bipolar disorder PRS and developmental outcomes.

PRS for bipolar disorder and measures of repetitive behavior, social communication difficulties, language and motor difficulties, hyperactivity, inattention, anxiety, depression, emotional difficulties, fussiness, and trait measures of emotionality, activity, shyness and sociability and personality trait measures benevolence, conscientiousness, extraversion, imagination, and neuroticism. Estimates are from linear regression models with sex as a grouping variable in a multi-group framework. The darker fill intensity indicates which model (sex difference or no sex difference) provided the better fit. Estimates from the better-fitting model also have 95% confidence interval bars, whilst those from the poorer-fitting model are presented only as point estimates for reference. Results shown as triangles passed multiple testing correction with a p value < 0.0017, corresponding to an alpha of 5%.

The results from the equivalence testing on the associations between bipolar disorder PRS and developmental outcomes are shown in Fig. 2. Effects are categorized as either (a) null effects (estimated entirely within the region of practical equivalence to zero); (b) non-null effects (estimated at least partly outside the region of practical equivalence to zero and entirely distinct from the point null); or (c) effects about which we must remain undecided (estimated at least partly outside the region of practical equivalence to zero but not distinct from the point null). See supplementary information for equivalence testing results with the SESOI bounds and multiple testing-adjusted 90% confidence intervals (Supplementary Fig. 2). Figure 2 shows the categorizations in the context of the most extreme absolute value within the multiple testing-corrected 90% confidence interval range for each measure, in relation to the SESOI. The majority of effect estimates could be declared null on the basis of the equivalence test results. We did not identify any sex differences in the equivalence testing. Estimates for oppositional defiant and conduct difficulties at age 8 were non-null. The data were inconclusive for motor difficulties at age 3, activity levels at age 5, and benevolence, inattention, and hyperactivity at age 8. Results were also inconclusive for social communication difficulties at 5 years and extraversion at 8 years, but it is worth noting that effects in these domains were estimated substantially less precisely owing to low sample size and sex differentiation, respectively. This means that although practical equivalence to zero was not supported for these outcomes, their point estimates were not particularly extreme.
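One literal reading of the three categories above can be written as a small decision rule. This is a sketch of the wording only, and the authors' actual procedure may differ in detail. The function and argument names are hypothetical, and the numbers in the usage note are illustrative rather than study results.

```python
def classify_effect(ci_low: float, ci_high: float, p_value: float,
                    sesoi: float, alpha: float) -> str:
    """Label an estimate as 'null', 'non-null' or 'undecided'.

    ci_low, ci_high: bounds of the equivalence-test confidence interval.
    p_value: p value from the null hypothesis significance test.
    alpha: multiple-testing-corrected significance threshold.
    """
    inside_region = -sesoi < ci_low and ci_high < sesoi
    if inside_region:
        # (a) entirely within the region of practical equivalence to zero
        return "null"
    if p_value < alpha:
        # (b) partly outside the region and distinct from the point null
        return "non-null"
    # (c) partly outside the region but not distinct from the point null
    return "undecided"
```

For example, `classify_effect(-0.03, 0.12, 0.2, sesoi=0.1, alpha=0.0017)` returns `"undecided"`: the interval extends beyond the SESOI, yet the estimate is not distinguishable from zero.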

Fig. 2: Equivalence test results for all developmental outcomes.

This figure shows the categorizations in the context of the most extreme absolute value within the multiple testing-corrected 90% confidence interval range for each measure, in relation to the SESOI. Where these values are less than the SESOI, the effect is estimated within the region of practical equivalence to zero. Bipolar disorder PRS and developmental outcomes of repetitive behavior, social communication difficulties, language and motor difficulties, hyperactivity, inattention, anxiety, depression, emotional difficulties, fussiness, emotionality, activity, shyness, sociability, benevolence, conscientiousness, extraversion, imagination, and neuroticism. The null hypothesis in the table refers to a composite null hypothesis of the NHST plus equivalence test. Triangles indicate that the composite null hypothesis could be rejected, circles that it could not be rejected, and squares that the result remains undecided.

Figure 3 shows how the bipolar disorder PRS associates with conduct and oppositional defiant difficulties as the burden of risk variants increases. The plots demonstrate the increased risk among individuals from the bottom to the top deciles of PRS, relative to individuals with PRS in the middle of the distribution. For conduct difficulties, our results show increased risk in the top decile compared with the bottom decile, with non-overlapping confidence intervals, but neither of these deciles differed from individuals in the middle of the distribution. For oppositional defiant difficulties, the confidence intervals overlap between all deciles, although individuals in the top decile had a higher mean score than those in the bottom decile.

Fig. 3: Decile plot for PRS for bipolar disorder and conduct difficulties and oppositional defiant difficulties.

Decile plots with confidence intervals of the mean at each decile. The plots demonstrate the increased risk among individuals from the bottom to the top deciles of bipolar disorder PRS, relative to individuals in the middle of the distribution.

PRS for bipolar disorder and diagnostic outcomes

The associations between bipolar disorder PRS and grouped diagnostic measures are shown in Fig. 4. None of the outcomes passed multiple testing correction. However, the results from the equivalence testing indicated that effects larger than the SESOI could not be ruled out for any of the grouped diagnostic measures (Supplementary Table 1), suggesting that more data are needed to draw conclusive inferences about the presence or absence of effects for these outcomes.

Fig. 4: Bipolar disorder PRS and diagnostic groups.

PRS for Bipolar disorder and grouped diagnostic measures. ADHD_noconduct; ADHD without conduct disorder (combined F900, F908, F909), Affective; affective disorder (combined F31-F39), Anxiety; anxiety disorders (combined F40, F41, F93), autism; (F84), DisruptiveBD; Disruptive behavior disorder (combined F91, F901, F92). Estimates from logistic regression model. Estimates shown with 95% confidence interval bars.

As a post hoc analysis, we investigated whether the criteria listed in DSM-5 for oppositional defiant difficulties (irritable mood, defiant behavior and vindictiveness) and conduct difficulties (aggression, deceitfulness, destruction, violation of rules), measured dimensionally in MoBa, were associated with our bipolar disorder PRS. These results are presented in Supplementary Fig. 3. We identified associations between bipolar disorder PRS and defiant behavior, vindictiveness and aggression.

Discussion

We investigated the associations between bipolar disorder PRS and a range of developmental outcomes from infancy to middle childhood. Using results from the largest and most recent GWAS on bipolar disorder we calculated PRS in the largest genotyped population-based pregnancy cohort to date. We found robust evidence for an association with oppositional defiant and conduct difficulties at 8 years. These associations were large enough to be categorized as non-null, falling outside a pre-specified region of practical equivalence to zero based on a SESOI of 0.1 SDs. We remain undecided on whether bipolar disorder PRS is associated with motor difficulties at age 3, activity levels at age 5, and benevolence, inattention, and hyperactivity at age 8, and the grouped diagnostic measures. Most other observed associations were equivalent to zero.

Our main finding is that bipolar disorder PRS manifests in childhood oppositional defiant and conduct difficulties. This is in line with findings from family studies showing that offspring of parents with bipolar disorder are at an increased risk of developing oppositional defiant or conduct disorder, compared to offspring of parents without bipolar disorder [18, 53, 54]. Our observation is also supported by clinical studies reporting lifetime comorbidity of bipolar and conduct disorder [55, 56]. These studies suggest that conduct disorder might be predictive of future bipolar disorder or account for the failure of early detection of bipolar disorder.

We could not find any studies investigating the association between bipolar disorder PRS and oppositional defiant or conduct difficulties specifically. We explored whether the criteria listed in DSM-5, measured dimensionally in MoBa, drive the association observed in our sample. The results showed associations with defiant behavior and vindictiveness, but we were unable to identify an association with irritable mood. According to DSM-5 [1], it is not unusual for children with oppositional defiant difficulties to show the behavioral features without irritable mood.

Categorical definitions of psychiatric disorders may not be optimal for investigating associations of genetic risk [57]. The equivalence testing indicated that effects larger than the SESOI could not be ruled out. There were few individuals with a diagnosis in our sample, which most likely explains why we were unable to find any robust associations.

Like other large cohort studies, we identified only a few robust associations with childhood development. In the Avon Longitudinal Study of Parents and Children (ALSPAC), which included 6,105 children aged 7–9 years, PRS based on a smaller discovery sample for bipolar disorder had a robust association with ADHD, while no strong evidence was found for association with emotional or behavioral difficulties [24]. In a meta-analysis including 42,998 individuals aged 6–17 years, no strong evidence for associations between PRS for bipolar disorder and any measured childhood emotional, behavioral or neurodevelopmental trait was identified [23]. Bipolar disorder has been suggested to be a neurodevelopmental disorder by some [58, 59], and according to a Dutch twin study [60] one would then expect the first signs of the illness to manifest early in development and before the first manic or depressive episode. In the Pittsburgh Bipolar Offspring study, 75.6% of children of parents with bipolar disorder who themselves also developed bipolar disorder had onset prior to age 12 [18]. Our PRS associations with disruptive behaviors were only robust at age 8 years, suggesting that a certain degree of maturation is required before the genetic vulnerability is expressed in observable behavior. Our analyses should be followed up when data from the 14-year questionnaire have been released from MoBa. It will be important to examine whether the PRS associations are then more robust and broader, and whether any sex differences are detected.

Our study is not without limitations. Some limitations are unique to analyses using PRS based on GWAS. For example, the GWAS sample size affects how many SNPs are identified as risk SNPs, and the accuracy of individual predictions relies on the size of the GWAS [61]. So far GWAS yield small effect sizes, and ascertainment methods in both the GWAS sample and the target sample will affect the accuracy of PRS [10]. GWAS do not capture rare genetic variants, so only associations with common variants are investigated. Other limitations are specific to using data from MoBa. It is important to note that the measures available at various ages are not identical across time points. They are therefore not analyzed as developmental trajectories, and each estimate is specific to the given measure at the given time point. We used one of the largest prospective population-based pregnancy cohorts worldwide, but the current subsample may not be adequately powered to identify small effects in some of the measured domains. MoBa is subject to attrition, just like all longitudinal studies [62]. Previous studies have shown that predictors of attrition include the presence of behavioral difficulties in the study child [63]. Selective attrition could lead to bias in our estimates, likely in the form of an underestimation of associations between the PRS and the developmental traits. Future studies should investigate this, but power may remain a limitation until sample sizes increase or the predictive power of PRS is substantially enhanced. PRS combined with cognitive performance tests, cortical thickness measures and gray matter density maps might have increased classification performance, but these findings need to be investigated further [64].

In conclusion, our results suggest that genetic risk for bipolar disorder, as indexed by PRS, might manifest as disruptive behaviors in childhood in the general population. In the case of motor difficulties, activity levels, social communication difficulties, benevolence, inattention, extraversion, and hyperactivity the data were inconclusive. It will be important to examine if the PRS associations are more robust and if any sex differences are detected with a bigger sample and measures available at an older age.