Introduction

Monogenic epilepsies are a heterogeneous group of rare, severe developmental epileptic encephalopathies characterized by frequent epileptic activity combined with developmental impairment [1]. Large-scale genetic screens have helped delineate the phenotypic spectrum of monogenic epilepsies and have identified >150 causal genes [1,2,3]. Many of these genes encode ion channel proteins and play a role in neuronal excitability and action potential generation [1]. Individuals carrying pathogenic variants in the same monogenic epilepsy gene can differ in the severity of their phenotype and in their comorbid disorders (e.g., autism, ataxia, intellectual disability, or non-neurological disorders) [1].

Several established monogenic epilepsy genes cause mild to severe epilepsy forms [4]. In particular, loss-of-function (LoF) variants in SCN1A are the most frequent cause of developmental epileptic encephalopathies [4]. At the same time, other variants in SCN1A confer clinically non-actionable risk for common forms of epilepsy with typically milder outcomes [5]. Many other epilepsy genes cause comorbid disorders that significantly impact the quality of life and life expectancy of affected individuals [6]. However, not every patient expresses the full spectrum of comorbidities, and for many well-established epilepsy genes, new phenotypes are discovered years after the first association with epilepsy [7, 8].

Genotype-phenotype associations for epilepsy genes are typically established through genetic testing of people with a neurological disease, which can introduce strong ascertainment bias. We hypothesized that coding variants in epilepsy-associated genes may lead to other recognizable clinical traits in a general population that is not enriched for individuals with epilepsy. Therefore, we sought to identify associations between established monogenic epilepsy genes and non-epilepsy but disease-related phenotypes in the general population represented by the UK Biobank (UKB) [9].

Methods

To select a high-confidence set of established monogenic epilepsy genes, we merged the gene lists from three studies [1,2,3] (Supplementary Table 1). Variant-level genotype and association data for 3700 phenotypes were obtained for all selected epilepsy genes from GeneBass [9] (accessed in August 2021), a resource of rare variant association statistics from single-variant and gene-based tests in 281,850 individuals with whole-exome sequencing (WES) data from the UKB project. The UKB provides medical and genetic data for over 500,000 participants living in the UK [10]. We collected association statistics from SKAT-O, gene-based burden, and single-variant association tests of LoF variants (nonsense, frameshift, splice-site) in all 127 high-confidence epilepsy genes. SKAT-O, gene-based burden, and single-variant tests were performed using a generalized mixed model as implemented in SAIGE-GENE, which can account for population structure and sample relatedness in large cohorts [11]. We explored phenotype associations only for LoF variants, as their functional consequence can be interpreted more reliably than that of missense variants. The gene-based burden test provides the highest power when all variants are causal and have the same effect direction, while SKAT-O retains power when only a proportion of the tested variants are causal and in the presence of trait-increasing and trait-decreasing variants [12]. To identify significant associations, we filtered the dataset according to the Bonferroni-corrected thresholds for significance as defined by the developers (α = 2.5 × 10⁻⁸ for SKAT-O; α = 6.7 × 10⁻⁷ for burden tests). The associations were categorized by trait type. We then explored all significant associations to identify the LoF variants driving the signals.
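The significance filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the record fields (`gene`, `test`, `p_value`), the test labels, and the function names are hypothetical, and the use of a strict inequality is also an assumption. Only the two thresholds come from the text. Two of the example rows reuse P values reported in the Results; the third row is a clearly labeled hypothetical placeholder.

```python
# Thresholds quoted in the Methods; the dictionary keys are hypothetical labels.
THRESHOLDS = {"skato": 2.5e-8, "burden": 6.7e-7}


def is_significant(test: str, p_value: float) -> bool:
    """Return True if p_value is below the threshold for the given test.

    Assumption: a strict inequality; the text does not say whether
    P values equal to the threshold count as significant.
    """
    return p_value < THRESHOLDS[test]


def filter_significant(records):
    """Keep only records that pass their test-specific threshold."""
    return [r for r in records if is_significant(r["test"], r["p_value"])]


# Two P values reported in the Results plus one hypothetical row
# (HYPOTHETICAL_GENE is invented purely to show a rejected record).
records = [
    {"gene": "GABRA1", "test": "skato", "p_value": 7.52e-9},
    {"gene": "GRIN2D", "test": "burden", "p_value": 2.63e-7},
    {"gene": "HYPOTHETICAL_GENE", "test": "skato", "p_value": 1e-3},
]

significant = filter_significant(records)
print([r["gene"] for r in significant])
```

Because each test has its own threshold, the same P value can pass one test and fail the other: 2.63 × 10⁻⁷ is below the burden threshold but above the SKAT-O threshold.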

Results

We generated a list of 127 high-confidence epilepsy genes from the three selected studies [1,2,3]. This dataset of 127 genes had 786–2903 genotype-phenotype associations per gene, each with two test statistics, equaling 485,593 genotype-phenotype associations. After multiple testing correction, we found exome-wide significant associations with non-epilepsy phenotypes for seven of the 127 epilepsy genes (5.5%) (Fig. 1A, B). Altogether, there were ten significant associations, all involving continuous traits (Table 1). Continuous phenotypes are known to provide more power for association studies than categorical phenotypes [13]. Five associations (50%) in four genes were related to mental health phenotypes based on responses from a subset of 157,366 UKB participants who completed a mental health questionnaire (detailed description of phenotypes in Supplementary Table 2) [9]. Variants in SLC25A12, ADSL, LMNB2, and GABRA1 were associated with mental health phenotypes related to mood and alcohol use (Fig. 1C). The significant associations were between LoF variants in SLC25A12 and “recent changes in speed or amount of moving or speaking” (P_SKAT-O = 3.00 × 10⁻¹⁵) and “recent lack of interest or pleasure in doing things” (P_SKAT-O = 1.49 × 10⁻¹⁰); ADSL and “frequency of failure to fulfill normal expectations due to drinking alcohol in last year” (P_SKAT-O = 1.57 × 10⁻¹³); LMNB2 and “recent changes in speed or amount of moving or speaking” (P_SKAT-O = 5.26 × 10⁻¹²); and GABRA1 and “frequency of memory loss due to drinking alcohol in last year” (P_SKAT-O = 7.52 × 10⁻⁹). We identified one association related to brain morphology, between LoF variants in GRIN2D and “volume of gray matter in the right frontal medial cortex” (P_BURDEN = 2.63 × 10⁻⁷). Four associations were related to blood assay results. CASR was associated with calcium blood levels and CHD2 with lymphocyte percentage (P_SKAT-O = 9.99 × 10⁻⁹), neutrophil percentage (P_BURDEN = 1.29 × 10⁻⁷), and lymphocyte count (P_BURDEN = 5.55 × 10⁻⁷). For two of the ten identified gene-phenotype associations (SKAT-O or burden tests), only one LoF variant had a nominally significant P value in the SAIGE single-variant tests. Variants contributing to each association are detailed in Fig. 1C and Supplementary Table 3.
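As a cross-check of the counts reported above, the ten associations can be tallied by gene and by category. The list below is transcribed from this section; the category labels and variable names are shorthand introduced here for illustration and do not come from GeneBass.

```python
from collections import Counter

# (gene, category) pairs transcribed from the Results text.
associations = [
    ("SLC25A12", "mental health"),
    ("SLC25A12", "mental health"),
    ("ADSL", "mental health"),
    ("LMNB2", "mental health"),
    ("GABRA1", "mental health"),
    ("GRIN2D", "brain morphology"),
    ("CASR", "blood assay"),
    ("CHD2", "blood assay"),
    ("CHD2", "blood assay"),
    ("CHD2", "blood assay"),
]

N_GENES_SCREENED = 127  # size of the high-confidence gene set

# Count associations per category and collect the distinct genes involved.
by_category = Counter(category for _, category in associations)
genes = {gene for gene, _ in associations}

share_of_genes = 100 * len(genes) / N_GENES_SCREENED
share_mental = 100 * by_category["mental health"] / len(associations)

print(len(associations), len(genes))
print(f"{share_of_genes:.1f}% of genes, {share_mental:.0f}% mental health")
```

The tally reproduces the figures in the text: ten associations across seven genes (5.5% of 127), of which five (50%) are mental health phenotypes, one is brain morphology, and four are blood assay results.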

Fig. 1: Significant genotype-phenotype associations in epilepsy-associated genes.

A Volcano plot illustrating the significant associations identified by the gene-based SKAT-O test. The horizontal dashed line indicates the threshold for a significant association (α = 2.5 × 10⁻⁸). The significantly associated genes are labeled according to the phenotype category in GeneBass (details listed in Table 1). B Volcano plot illustrating the significant associations identified by the gene-based burden test. The horizontal dashed line indicates the threshold for a significant association (α = 6.7 × 10⁻⁷). The significantly associated genes are labeled according to the phenotype category in GeneBass (as in Table 1). C Heatmap showing the association results of all LoF variants with single-variant P value ≤ 0.05 that were collapsed for each gene (human genome build GRCh38). *GABRA1 does not appear in (A, B) as a beta value was not available.
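Volcano plots conventionally place −log10(P) on the vertical axis, in which case the dashed threshold lines in panels A and B would sit at the heights computed below. This is a sketch under that assumption only: the caption does not state the axis scale, and the function and variable names are hypothetical.

```python
import math


def neg_log10(p_value: float) -> float:
    """Convert a P value to the -log10 scale commonly used for volcano plots."""
    return -math.log10(p_value)


# Thresholds quoted in the caption for panels A (SKAT-O) and B (burden).
skato_line = neg_log10(2.5e-8)
burden_line = neg_log10(6.7e-7)

# A point lies above the dashed line when its P value is below the threshold.
grin2d_burden = neg_log10(2.63e-7)  # burden P value reported in the Results

print(round(skato_line, 2), round(burden_line, 2), grin2d_burden > burden_line)
```

On this scale, a smaller P value maps to a higher point, so significant genes appear above the dashed line.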

Table 1 Summary of significant genotype-phenotype associations.

Discussion

There has been an increasing interest in the phenotypic variability associated with monogenic epilepsies [4]. We used gene-based association summary statistics for 3700 phenotypes in 281,850 individuals from the general population to screen 127 established monogenic epilepsy genes for association with non-epilepsy phenotypes. We identified seven genes associated with disease-related phenotypes in the general population (GRIN2D with brain morphology; CHD2 with immune function; SLC25A12, LMNB2, ADSL, and GABRA1 with mental health; CASR with calcium metabolism).

Gain-of-function (GoF) variants in GRIN2D cause severe developmental epileptic encephalopathy [14]. Our study provides evidence that haploinsufficiency of GRIN2D is associated with gray matter volume in a general population cohort. Several GRIN genes are reported as dosage-sensitive, and both GoF and LoF variants are implicated in GRIN-related disorders [15]. Previously, variants in GRIN1 and GRIN2B were found to cause malformations of cortical development (i.e., polymicrogyria) [7, 15]. Our finding is in line with evidence from rodent knockdown models that LoF variants in GRIN genes cause brain malformations; however, human malformations generally represent GoF pathophysiology [7].

Three of the seven pleiotropic epilepsy gene associations add to a body of evidence for heterogeneous clinical effects of these genes. LoF variants in CHD2 were significantly associated with white blood cell measures (lymphocyte percentage, neutrophil percentage, and lymphocyte count). Truncating mutations in CHD2 cause a heterogeneous group of epilepsies varying in severity, from febrile seizures plus to more severe epileptic encephalopathies, including a Dravet-like syndrome and Lennox-Gastaut syndrome [8]. LoF variants in CHD2 have also been found to cause non-neurological cancer phenotypes, including chronic lymphocytic leukemia [16]. The evidence provided by our study may indicate that patients with CHD2-related epilepsy have a higher risk of cancers such as lymphomas. However, further research is needed on this topic. In line with previous findings, we found that LoF variant burden in SLC25A12 and LMNB2 was associated with non-epilepsy phenotypes. Variants in SLC25A12 can cause mitochondrial dysfunction, resulting in neurological disease. Mitochondrial dysfunction has a negative impact on synaptic plasticity and neurogenesis associated with depression [17]. We found LoF variants in SLC25A12 to be associated with the UKB phenotype “recent lack of interest or pleasure in doing things”, or anhedonia, a core phenotype of major depressive disorder. The pathogenicity of LMNB2 in disease development has not been studied extensively. However, dysfunction of LMNB2 results in diseases with various inheritance patterns, namely autosomal recessive progressive myoclonus epilepsy and autosomal dominant microcephaly and lipodystrophy [18]. Individuals with LMNB2-related progressive myoclonus epilepsy suffer from progressive ataxia and poor gait [18]. We identified a significant association between variants in LMNB2 and “recent changes in speed or amount of moving or speaking”, pointing to a deleterious effect of LoF variants in LMNB2 that is potentially independent of the seizure phenotype.

For two of the seven epilepsy genes, we found associations with previously known phenotypes: GABRA1 with alcohol use [19], and CASR with calcium metabolism, which plays a vital role in brain development, regulation of neuronal excitability, and synaptic transmission [20]. Finally, the association of the LoF variant burden in ADSL with mental health issues related to alcohol use is new.

One limitation of our study is that the large sample size of the UKB cohort enables the identification of small gene-level effects as statistically significant, especially for continuous phenotypes. Similarly, risk-increasing variants identified in few or single carriers will be of low clinical significance at the population level because of their very low frequency. However, such variants may be important for individual carriers and place them at the extreme end of the trait distribution. Overall, a more detailed clinical follow-up of individual variant carriers is needed to understand the implications of our findings. In addition, the association analysis is based on WES data only from middle-aged individuals of European ancestry. Therefore, generalization of these findings to individuals of other ancestries and age groups is limited. Also, the 3700 phenotypes in the UKB are an extensive selection but may not encompass all relevant phenotypes for the investigated monogenic epilepsy genes. The mental health phenotypes are derived from questionnaires; thus, the symptoms are subjective and nonspecific. However, although health surveys are subject to biases, they can still be used for association estimates [10]. Exploration of WES data in additional biobanks is needed to validate our findings. Finally, our results should be interpreted considering that the UKB does not represent a broad distribution across all health outcomes and is enriched for healthy volunteers [10]. The healthy volunteer selection bias is particularly strong for neurological and psychiatric disorders, where disorder status may influence participation in research [10].

Overall, our study adds to our understanding of the phenotypic spectrum of several monogenic epilepsy genes and suggests that, in otherwise healthy individuals, LoF variants in these genes are likely to cause recognizable clinical traits rather than severe epilepsy forms. The evidence provided by the present study may help to elucidate the phenotypic heterogeneity associated with monogenic epilepsy genes, promote further research and genetic screening of patients with atypical presentation, and inform clinical care of comorbid disorders in individuals with monogenic epilepsies.