Introduction

Epilepsy is one of the most common neurological conditions in children, and has substantial impact on patients’ quality of life and social integration. Epileptic encephalopathy is characterized by refractory seizures, cognitive dysfunction, and poor prognosis. Despite the recent progress in technology, molecular diagnosis of children suffering from possible epileptic seizures can be challenging, due to genetic and phenotypic heterogeneities. A large number of specific pathogenic variations have been related to various forms of epilepsies.1 Next-generation sequencing (NGS) has significantly improved the molecular diagnosis for rare diseases. NGS focusing on genes known to be associated with human diseases is a practical approach as a first-tier assessment for patients with heterogeneous genetic background.2,3,4

In addition, current medical therapy for epilepsy is based on clinical manifestations rather than etiology, and its main purpose is to reduce the likelihood of seizures rather than to correct the underlying disease process.5

In this study, we performed NGS on 733 children with epilepsy onset before 1 year of age, to detect and quantify genetic variants, and assess existing therapeutic effects. Our findings have important implications for the development of precision medicine strategies.

Materials and methods

Patient cohort

Patients were enrolled between January 1, 2014, and December 31, 2016 with the following inclusion criteria: (1) severe seizures in neonates or generalized epilepsy or intractable epilepsy in infancy with generalized tonic–clonic seizures, (2) seizure onset before 1 year of age, and (3) epileptic syndromes/epileptic encephalopathies with unknown etiology. Patients were excluded if they had trauma, central nervous system infections, hypoxic–ischemic encephalopathy, vascular events, systemic infections, or diagnosed metabolic disorders, or if pathogenic copy-number variants were identified using array-based comparative genomic hybridization (CGH).

Clinical manifestations, laboratory results, head computed tomography (CT) and magnetic resonance imaging (MRI) findings, antiepileptic drug use, and prognosis were reviewed. Median follow-up duration was 6 months. Seizure outcome was assessed using three categories: seizure free, seizure reduction, and no change in seizure frequency.

All samples in this study were collected with appropriate informed consent and approval of the ethics committee of Children’s Hospital, Fudan University. The methods used in this study were carried out in accordance with the approved guidelines. The clinical trial registration number is NCT02552511.

Library preparation and sequencing

We performed capture-based targeted resequencing of 2742 genes between January 1, 2014, and October 31, 2015, and exome sequencing (ES) between November 1, 2015, and December 31, 2016. Genomic DNA fragments from patients were enriched for panel sequencing using the Agilent (Santa Clara, CA, USA) ClearSeq Inherited Disease panel kit or for exome sequencing using the Agilent SureSelectXT Human All Exon 50-Mb kit. DNA fragments were ligated with adaptors and two paired-end DNA libraries with an insert size of 500 bp were formed for all samples. After enrichment by polymerase chain reaction (PCR), DNA libraries were sequenced on the HiSeq2000/2500 sequencer according to the manufacturer’s instructions (Illumina, San Diego, CA), yielding 90-bp paired-end sequencing reads with at least 100-fold average sequencing depth for each sample.

Alignment and mapping

Reads containing adaptors, reads in which unknown bases (Ns) made up more than 10% of the read, and low-quality reads (reads in which more than 50% of bases had a sequencing quality of no more than 5) were discarded from the raw data to generate clean reads. Clean reads were aligned to the reference human genome (UCSC hg19) with the Burrows–Wheeler Aligner (BWA) (v.0.5.9-r16). Subsequent sorting, merging, and duplicate removal for the BAM files were performed using SAMtools and Picard (http://picard.sourceforge.net/index.shtml). Variants that differed from the reference sequence were called with GATK.
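The read-level filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used in the study: the function name `is_clean_read`, the placeholder adaptor string, and the Phred+33 quality encoding are all assumptions. Only the thresholds (more than 10% Ns; more than 50% of bases with quality no more than 5) come from the text.

```python
def is_clean_read(seq, qual, adaptor="AGATCGGAAGAGC",
                  max_n_frac=0.10, low_q=5, max_low_frac=0.50,
                  phred_offset=33):
    """Return True if a read passes the three filters described in the text.

    seq  : base calls, e.g. "ACGTN..."
    qual : per-base quality string (Phred+33 encoding is an assumption)
    adaptor : placeholder adaptor sequence (assumption, for illustration only)
    """
    if not seq or len(seq) != len(qual):
        return False
    # 1. Discard reads that contain adaptor sequence.
    if adaptor in seq.upper():
        return False
    # 2. Discard reads in which unknown bases (N) make up more than 10%.
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # 3. Discard reads in which more than 50% of bases have quality <= 5.
    low = sum(1 for ch in qual if ord(ch) - phred_offset <= low_q)
    if low / len(qual) > max_low_frac:
        return False
    return True
```

Both thresholds are strict inequalities, so a read with exactly 10% Ns or exactly 50% low-quality bases is kept, matching the wording "more than".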

Variant annotation and filtering

Variants with suboptimal quality scores were removed from consideration. The remaining variants were annotated with ANNOVAR and VEP6,7 and compared computationally with the list of reported pathogenic variations in the Human Gene Mutation Database (HGMD, professional version). Variants in HGMD were retained. Among variants not in HGMD, synonymous variants, intronic variants more than 15 bp from exon boundaries (which are unlikely to affect messenger RNA splicing), and common variants (minor allele frequency >1%) were discarded. Missense variants were assessed with SIFT, PolyPhen-2, and MutationTaster.8,9,10 Our data were shared on Figshare (https://figshare.com/s/2673f93c2398e8d0ee62).
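The filtering order described above (HGMD variants kept first, then exclusions applied to everything else) can be expressed as a small predicate. The sketch below is illustrative only: the `Variant` class, its field names, and the consequence labels are hypothetical and do not correspond to ANNOVAR or VEP output fields. The quality-score filter is assumed to have been applied upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    # All field names are hypothetical, chosen for this illustration.
    in_hgmd: bool                        # listed in HGMD as reported pathogenic
    consequence: str                     # e.g. "missense", "synonymous", "intronic"
    maf: Optional[float] = None          # minor allele frequency; None if unobserved
    intron_offset: Optional[int] = None  # bp from nearest exon boundary (intronic only)


def keep_variant(v: Variant) -> bool:
    """Apply the retention rules described in the text to one variant."""
    # Variants reported in HGMD are retained regardless of the other filters.
    if v.in_hgmd:
        return True
    # Synonymous changes are discarded.
    if v.consequence == "synonymous":
        return False
    # Intronic variants more than 15 bp from an exon boundary are discarded.
    if (v.consequence == "intronic"
            and v.intron_offset is not None
            and v.intron_offset > 15):
        return False
    # Common variants (minor allele frequency > 1%) are discarded.
    if v.maf is not None and v.maf > 0.01:
        return False
    return True
```

Checking `in_hgmd` first means a reported pathogenic variant survives even if it is synonymous or common, which is how the text reads.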

Criteria for classifying pathogenic and likely pathogenic variants

Pathogenic variants: (1) the variant likely explains the indication for testing and may be responsible for the individual's clinical presentation; (2) same nucleotide and amino acid change as a previously established pathogenic variant from both published studies and the internal database.

Likely pathogenic variants: (1) the variant likely explains the indication for testing and may be responsible for the individual's clinical presentation; (2) same amino acid change as a previously established pathogenic variant regardless of nucleotide change; or null variant (nonsense, frameshift, canonical +/−1 or 2 splice sites, initiation codon, single or multiexon deletion) in a gene where loss of function (LOF) is a known mechanism of disease; (3) de novo (both maternity and paternity confirmed) in a proband with a negative family history; or inherited from an affected parent.

Positive results: (1) one heterozygous pathogenic or likely pathogenic variant in an autosomal dominant or X-linked dominant gene that explains the indication for testing; (2) one homozygous or two heterozygous pathogenic or likely pathogenic variants (compound heterozygous) in an autosomal recessive gene that explains the indication for testing; (3) one pathogenic or likely pathogenic variant in an X-linked recessive gene in a male patient that explains the indication for testing.
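The three "positive result" rules can be written as a simple decision function. This is a sketch under stated assumptions: the function name, the inheritance codes ("AD", "XLD", "AR", "XLR"), and the genotype labels are hypothetical. It assumes every listed variant is already classified as pathogenic or likely pathogenic, that two heterozygous variants in a recessive gene are in trans, and that the clinical judgment "explains the indication for testing" has been made separately.

```python
def is_positive_result(inheritance, genotypes, sex):
    """Decide whether the P/LP variants found in ONE gene give a positive result.

    inheritance : "AD", "XLD", "AR", or "XLR" (hypothetical codes)
    genotypes   : list with one entry per P/LP variant: "het", "hom", or "hemi"
    sex         : "male" or "female"
    """
    if not genotypes:
        return False
    # Rule 1: one P/LP variant suffices in a dominant gene.
    if inheritance in ("AD", "XLD"):
        return True
    # Rule 2: recessive genes need one homozygous variant or two
    # heterozygous variants (assumed here to be compound heterozygous).
    if inheritance == "AR":
        return "hom" in genotypes or genotypes.count("het") >= 2
    # Rule 3: one variant in an X-linked recessive gene counts in males only.
    if inheritance == "XLR":
        return sex == "male"
    return False
```

For example, a single heterozygous variant in a recessive gene returns `False`, whereas the same variant in a dominant gene returns `True`.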

Results

Patient characteristics

Our cohort consisted of 733 patients. Among them, 411 (56.1%) were males and 322 (43.9%) were females. The demographic and basic clinical characteristics are shown in Table 1. We classified patients by age at seizure onset into a neonatal group (n = 305) and a group with onset after the neonatal period but before 1 year of age (n = 428). In this cohort, 476 patients were sequenced using the 2742-gene panel, and 257 patients were screened using exome sequencing.

Table 1 Demographic and clinical characteristics of 733 patients

Capture-based targeted sequencing

For patients undergoing capture-based targeted sequencing, on average, 25.26 million effective reads were generated and the average on target sequencing depth was 209.84× with 98.95% regions covered at greater than 20×, and 99.65% covered at greater than 10×. At least 99.37% of targeted regions were covered. For ES, 71.34 million effective reads were generated and the average on target sequencing depth was 125.47×, with 97.34% covered at greater than 20×, and 99.13% covered at greater than 10×. At least 99.48% of target regions were covered. These results suggested that both capture-based and exome sequencing are sufficient to yield high-quality data for further analysis (Supplementary Table 1).

Genetic spectrums in this cohort

Collectively, 275 variants spanning 90 genes were identified and classified as pathogenic or likely pathogenic in 235 of 733 patients (32.1%) (Supplementary Table 2). Among them, 130/275 variants (47.3%) were previously reported as pathogenic, and 145/275 variants (52.7%) were novel or resulted in an amino acid change different from that of previously reported pathogenic variants. Among the 275 variants, 155 were missense variants, 49 were frameshift variants, 45 were nonsense variants, 22 were splicing variants, and 4 were nonframeshift variants.

One hundred forty-nine pathogenic or likely pathogenic variants in 127 of 476 patients (26.7%) were identified by capture-based targeted sequencing covering 2742 genes, and 126 pathogenic or likely pathogenic variants in 108 of 257 patients (42.0%) were identified by ES.
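The per-method and overall diagnostic yields are arithmetically consistent with one another, as the short check below illustrates. The function name `yield_pct` and the dictionary layout are illustrative choices; the counts are those reported in the text.

```python
def yield_pct(positive, tested):
    """Diagnostic yield as a percentage, rounded to one decimal place."""
    return round(100 * positive / tested, 1)


# (patients with a positive result, patients tested), as reported in the text
cohorts = {"panel": (127, 476), "exome": (108, 257)}

# Yield for each sequencing method
per_method = {name: yield_pct(pos, n) for name, (pos, n) in cohorts.items()}

# Pooling both methods should recover the cohort-level figures
total_pos = sum(pos for pos, _ in cohorts.values())
total_n = sum(n for _, n in cohorts.values())
overall = yield_pct(total_pos, total_n)
```

The pooled counts come to 235 positive patients out of 733, which is the cohort-level figure given above.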

Genes with pathogenic/likely pathogenic variants identified in more than four patients were ABCC8, CDKL5, DEPDC5, KCNQ2, MECP2, MUT, PCDH19, PRRT2, SCN1A, SCN2A, STXBP1, and TSC2, accounting for 48.7% (134/275) of all pathogenic or likely pathogenic variants in our cohort.

In this cohort, we identified five patients with pathogenic or potentially pathogenic variants in multiple genes. Patient 43 carried a paternally inherited GABRA1 missense variant and a de novo SCN2A missense variant. Patient 367 carried compound heterozygous deleterious variants in GLDC and a reported CACNA1H missense variant. Patient 633 carried two reported nonsense pathogenic variations, one in PCDH19 and one in SCN1A. In patients 347 and 602, an additional reported splice-site variant in the VWF gene and a reported nonsense pathogenic variation in the NF1 gene were identified, respectively.

Genetic spectrum in subgroups with different onset age of seizures

The subgroups with different seizure onset ages showed different pathogenic variant (PV) spectra (Fig. 1a, b). Of the 40 patients with SCN1A variants, only 3 (7.5%) had seizure onset within the first month of life, and the distribution between onset groups differed significantly (p < 0.0001) (Fig. 1c). Of the 21 patients with KCNQ2 variants, 20 (95.2%) had seizure onset in the neonatal period, again a significant difference (p < 0.0001) (Fig. 1c). Of the 23 patients with TSC2 variants, only 4 (17.4%) had seizure onset in the neonatal period, which was also a significant difference (p = 0.0427).
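The text does not state which statistical test produced these p-values. As one plausible way to compare a gene's onset distribution with that of the rest of the cohort, the sketch below implements a two-sided Fisher's exact test on a 2×2 table using only the standard library. The function name and the table layout (carriers versus the remaining cohort, with 305 neonatal and 428 later-onset patients overall) are assumptions, and the result need not match the reported values exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    n = a + b + c + d
    row1 = a + b          # e.g. carriers of the gene
    col1 = a + c          # e.g. neonatal-onset patients
    denom = comb(n, row1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# SCN1A: 3 neonatal-onset and 37 later-onset carriers; the remaining
# cohort is 305 - 3 = 302 neonatal and 428 - 37 = 391 later-onset patients.
p_scn1a = fisher_exact_two_sided(3, 37, 302, 391)
```

Fisher's exact test is chosen here because some cells are small (for example, 3 neonatal SCN1A carriers); a chi-square test would be an equally reasonable reading of the text.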

Fig. 1: Genetic disorder spectrum in the subgroups with different onset ages.

a Gene distribution in subgroups with different onset ages; the two groups were patients with neonatal seizures and patients with seizures or epilepsy onset within the first year of life. Patients with developmental delay are denoted by red; patients with normal intellectual development are denoted by dark blue; patients whose development could not be assessed or who were lost to follow-up are denoted by yellow. b Bar plot of the number of genes observed in each onset age group (neonatal and less than 1 year). Each bar in each subfigure represents the counts for the genes labeled above. c Pie chart of the counts of the recurrent genes (SCN1A, KCNQ2) in the two onset age groups. The radius of each pie is proportional to the total number of observations for each gene.

Neonatal seizures

One hundred four pathogenic or likely pathogenic variants were identified in 86/305 patients (28.2%) with neonatal seizures, with KCNQ2 (20/86 cases, 23.3%) and STXBP1 (7/86, 8.1%) being the two most frequently mutated genes (Fig. 1a, b).

Among these patients, 35/86 (40.7%) had a poor prognosis, including 32 patients with developmental delays and 3 deaths, due to CPT II deficiency, lethal neonatal (MIM 608836, CPT2); immunodeficiency 34, mycobacteriosis, X-linked (MIM 300645, CYBB); and methylmalonic aciduria, mut(0) type (MIM 251000, MUT). The developmental status of 34 patients (72, 36.1%) could not be assessed because of their young age (Fig. 1a, b).

In the neonatal group, 36 of 86 patients (41.9%) had medically actionable disorders based on molecular diagnosis, involving variants in ALDH7A1, CPT2, DEPDC5, KCNQ2, KCNQ3, KCNT1, SCN1A, and SCN2A. The implications for treatment decisions include avoiding trigger factors, feeding with a special formula diet, and specific medication. Detailed information is listed in Supplementary Table 2.

Onset less than 1 year of age

One hundred seventy-one pathogenic or likely pathogenic variants were identified in 149/428 patients (34.8%) with onset before 1 year of age. The two most frequently mutated genes were SCN1A (37/149, 24.8%) and TSC2 (19/149, 12.8%).

In this subgroup, 95/149 patients (63.8%) had poor prognosis, including 94 patients with developmental delays, and 1 death due to propionic acidemia (MIM 606054, PCCA) (Fig. 1a, b).

In the group with onset within 1 year of age, 74 of 149 patients (49.7%) had medically actionable disorders based on molecular diagnosis, involving variants in CDKL5, DEPDC5, GRIN2A, KCNQ2, MECP2, MTHFR, PCCA, SCN1A, SCN2A, TSC1, and TSC2. The implications for treatment decisions include avoiding trigger factors, feeding with a special formula diet, and specific medication.11,12 Detailed information is listed in Supplementary Table 2.

Collectively, our results revealed that patients with neonatal seizures were less likely to have developmental delays (p = 0.000614) than patients in the within-1-year group. There was no statistically significant difference in the positive rate between the two subgroups (p = 0.092).

Intervention and effectiveness

In this study, we identified 40 variants in the SCN1A gene. Treatment effects in this patient group are shown in Fig. 2a. Nineteen of the patients achieved remission or were seizure free after treatment; however, 6 of these 19 patients had developmental delay. Twenty-one patients did not respond to the current treatment. In total, seven antiepileptic drugs were used in this group, and 28 patients (70%) used more than two drugs simultaneously. The therapeutic effect of each drug did not differ significantly between the seizure remission group and the treatment-ineffective group. Of these 40 patients with SCN1A variants, 6 were treated with oxcarbazepine (OXC) and 1 with lamotrigine (LTG); both drugs are sodium channel blockers, which may increase seizure frequency and should be used with caution in SCN1A-positive patients.13 Stiripentol, an aromatic allylic alcohol, has been recommended for SCN1A-positive epilepsy patients.14

Fig. 2: Treatment in patients with SCN1A, KCNQ2, and TSC2 gene pathogenic variation.

a Treatment in patients with SCN1A gene pathogenic variation. b Treatment in patients with KCNQ2 gene pathogenic variation. c Treatment in patients with TSC2 gene pathogenic variation. The horizontal coordinate represents specific drugs. The ordinate represents the percentage of the number of cases. LEV levetiracetam, VPA valproate, BDZ benzodiazepines, TPM topiramate, LTG lamotrigine, OXC oxcarbazepine, KD ketogenic diet, PB phenobarbital, VGB vigabatrin

In this cohort, 21 patients were identified with KCNQ2 gene variants. Treatment effects in these patients are shown in Fig. 2b. Eight of the patients achieved remission or were seizure free after treatment, and 3 were lost to follow-up; however, 5 of these 8 patients had developmental delay. Eight patients did not respond to the current treatment. In total, seven antiepileptic drugs were used in this group, and 12 patients (66.7%) used more than two drugs simultaneously. The therapeutic effect of each drug did not differ significantly between the seizure remission group and the treatment-ineffective group. Case 51 had seizure onset within the first 4 h of life, and combination therapy with phenobarbital (PB), levetiracetam (LEV), and valproate (VPA) did not lead to remission. At the age of 1 year, retigabine was started and reduced the seizure frequency, but this patient had motor developmental delay and speech delay at the age of 19 months.

We identified 23 variants in the TSC2 gene. Treatment effects in this patient group are shown in Fig. 2c. Fourteen of the patients achieved remission or were seizure free after treatment, and 4 patients were lost to follow-up; however, 10 of these 14 patients had developmental delay. Only 5 patients did not respond to the current treatment. In total, seven antiepileptic drugs were used in this group, and 12 patients (52.2%) used more than two drugs simultaneously. The therapeutic effect of each drug did not differ significantly between the seizure remission group and the treatment-ineffective group. The mTOR inhibitor rapamycin is an established precision medicine approach for TSC1 and TSC2 variants. In this study, 9 of 19 patients (47.4%) were treated with rapamycin. Eight of them received combination therapy, with 4 becoming seizure free and 3 achieving remission.

Discussion

NGS has significantly changed the approach to PV identification for rare diseases. Several studies have focused on the application of NGS as a diagnostic tool for children with epilepsy.15,16,17,18 Mercimek-Mahmutoglu et al.18 reported that targeted NGS panels for epileptic encephalopathies identified the underlying genetic causes in 28% of 110 patients with epileptic encephalopathy. In a very recent large cohort study, the authors summarized the genetic results of 327 children with epilepsy onset before 3 years of age. Pathogenic variants were identified in 31 of 114 (27.2%) with epilepsy panels and 11 of 33 (33.3%) with exome sequencing. To the best of our knowledge, this is the largest Chinese study interrogating the genetic spectrum in infants with epilepsy onset within the first year of life, and the first cohort study to describe the genetic spectrum at different onset stages of epilepsy. In our study, the diagnostic rate was 26.7% for targeted sequencing using a panel of 2742 genes, and 42% for exome sequencing. There are two potential reasons for the robust performance of ES in our cohort. Firstly, among the 257 patients who underwent ES, the parents of 233 patients also underwent ES as trios. In contrast, only probands were tested in the cohort that underwent targeted sequencing. Secondly, ES revealed four pathogenic genes that were not covered by the targeted-sequencing panel: KCNH1, TUBG1, UNC80, and WDR45. These genes have recently been associated with seizures and epilepsy,19,20,21,22,23 and will be included in the next version of the targeted panel.

Our results revealed no statistically significant difference in the positive rate between the neonatal seizures group (28.2%) and the 1-year age group (34.8%). During the neonatal period, inborn errors of metabolism can lead to seizures, poor prognosis, and death. For seizure onset within the first year of life, the two genes contributing most to this group were SCN1A and TSC2. In the 1-year group, the percentage of patients with developmental delays (95/149, 63.8%) was higher than in the neonatal group (p = 0.000614). In this cohort, 145 variants were novel or exhibited different amino acid changes from previously reported pathogenic variants, expanding the variant database of 65 genes (Supplementary Table 2). The three most frequently mutated genes in this cohort were SCN1A, KCNQ2, and TSC2, accounting for 84 of the 235 patients with positive results (35.7%). We identified 12 genes (ABCC8, CDKL5, DEPDC5, KCNQ2, MECP2, MUT, PCDH19, PRRT2, SCN1A, SCN2A, STXBP1, and TSC2), each with multiple affected patients, that as a group cover approximately 50% of diagnosed cases. KCNQ2 is the most frequently mutated gene in patients with neonatal seizures, so it should be a priority in genetic screening for neonatal seizures. These genes should be considered part of the essential panel for early diagnosis of epilepsy if ES is not feasible as a first-tier analysis.

In this cohort, three patients were identified with variants in two epilepsy-related genes. Patient 43 was a boy who had seizures starting at 13 days of age, with frequent episodes. The video electroencephalogram (VEEG) showed slow waves and spikes, and the head MRI showed bilateral cerebral ventricular asymmetry. After treatment with combined antiepileptic drugs, seizures were relieved. At follow-up at 3 months of age, no significant motor developmental delay was found. The father and an uncle had a history of suspected epilepsy. Targeted panel sequencing identified a paternally inherited GABRA1 missense variant and a de novo SCN2A missense variant. In patient 367, compound heterozygous deleterious variants in GLDC and a reported CACNA1H missense variant were identified. In patient 633, two reported nonsense pathogenic variations, one in PCDH19 and one in SCN1A, were detected. For these 3 patients, multiple seizure-related genes were detected, and further follow-up and assessment are needed for the final interpretation of these variants. In patients 347 and 602, an additional reported splice-site variant in the VWF gene and a reported nonsense pathogenic variation in the NF1 gene were identified, respectively. However, these two patients were lost to follow-up, and further specific tests and assessments were not available.

NGS methods have helped to reveal the causes of epilepsy, improve the use of current antiepileptic drugs, and identify new therapeutic targets.24 There is increasing evidence that some types of epilepsy respond to particular antiepileptic medications, which supports personalized therapeutic strategies.5 The three most frequently mutated genes in this cohort were SCN1A, KCNQ2, and TSC2. Pathogenic variations in SCN1A are present in epilepsy, generalized, with febrile seizures plus, type 2 (MIM 604403) and epileptic encephalopathy, early infantile, 6 (Dravet syndrome) (MIM 607208). It has been reported that SCN1A is the most clinically relevant of all the known epilepsy genes,25 and some of these patients react to sodium channel blockers with a paradoxical increase in seizure frequency.13 In this study, about 50% of patients with SCN1A variants did not become seizure free or achieve remission with current therapy, and seven patients were treated with a sodium channel blocker. Recent data suggest that fenfluramine and a combination of stiripentol, valproic acid, and clobazam may be effective in patients with Dravet syndrome. KCNQ2 is the most frequently mutated gene in our neonatal seizure group. Retigabine has been suggested as a targeted precision medicine for KCNQ2 pathogenic variation, and treatment started at an older age was less successful.26,27 Case 51 had no remission with PB, LEV, and VPA combination therapy. At the age of 1 year, retigabine was started and reduced the seizure frequency, but this patient had motor developmental delay and speech delay at the age of 19 months. Pathogenic variations in TSC2 are present in tuberous sclerosis-2 (MIM 613254). The mTOR inhibitor rapamycin is an established precision medicine approach in tuberous sclerosis.28 Nine of 19 patients (47.4%) were treated with rapamycin; 8 of them received combination therapy, with 4 becoming seizure free and 3 achieving remission. Currently, treatment of epilepsy remains largely empirical. Our findings of new genetic variants and potential treatment strategies will help to establish personalized precision medicine and treatment stratification for individual patients.

Our data revealed that 41.9% and 49.7% of patients in the two subgroups, respectively, had medically actionable disorders based on molecular diagnosis. Early genetic testing for epilepsy provides treatment options at disease onset. Therefore, genes with treatment implications should also be included in the first-tier panel for early diagnosis of epilepsy, rather than the panel being limited to genes of diagnostic value only. In two patients, we identified compound heterozygous pathogenic variations in PCCA and diagnosed propionic acidemia. One of them died at 3 months of age from an upper respiratory infection followed by acute metabolic decompensation, before the molecular diagnosis was made. The other patient, a 1-year-old girl, suffered from epilepsy and mild developmental delay. After diagnosis, this patient followed an appropriate diet. Carnitine treatment improved the prognosis, and she received specific treatments for acute attacks. Epilepsy caused by other metabolic abnormalities, such as citrullinemia, fructose-1,6-bisphosphatase deficiency, biotinidase deficiency, and methylmalonic aciduria, can be treated according to the molecular diagnosis. Epilepsy resulting from ALDH7A1 gene pathogenic variation is responsive to pyridoxine treatment.29 Epilepsy resulting from KCNT1 gain of function may be reversed by quinidine.30,31 These precision medicine treatments are guided directly by molecular findings, and they aim to reverse or circumvent the changes caused by the specific gene pathogenic variation.

In conclusion, the subgroups with different onset ages showed a diverse spectrum of genetic disorders. The 12 most commonly implicated genes in this cohort and the genes with treatment options should be considered as part of the essential panel for early diagnosis of epilepsy onset, if large medical exome analyses or ES are not feasible as first-tier analysis. Genetic results are beginning to improve therapy via antiepileptic medication selection and precision medicine approaches.