Introduction

The development of the cerebral cortex is a complex, multistep process that is controlled and influenced by genetic and environmental factors. The formation of the cerebral cortex starts with the generation of stem cells in the ventricular zone, is followed by their differentiation into neurons and glia, and then by migration to the surface. Once the cells reach their destination, they organize into functional layers. Disruption of any of these steps leads to structural brain anomalies and resultant neurological symptoms [1]. The clinical consequences of malformations of cortical development (MCDs) include cognitive impairment, seizures, and cerebral palsy. The classification of MCDs employs brain imaging studies, neuropathological findings, and results of molecular studies [2]. The neuroradiologic and pathological classifications distinguish several categories of MCDs, including lissencephaly, heterotopia, schizencephaly, polymicrogyria, and focal cortical dysplasia [3].

Molecular studies have contributed immensely to our understanding of brain development and have enabled categorization of MCDs based on the primary molecular mechanism of the disease [4,5,6]. It has been demonstrated that disturbances of many cellular processes, including cellular movement, signal transduction involving reelin-mediated and PI3K/AKT/mTOR pathways, protein glycosylation, and neurotransmission can result in abnormal positioning of neuronal cells during brain development. MCDs have remarkable clinical and genetic heterogeneity—single-gene defects may affect different stages of cortical development and result in distinctive brain anomalies [1, 4].

Identification of novel genes controlling brain development and MCDs enables nosological classification of these disorders and provides critical starting points for elucidating the biological mechanisms underlying disease. It has been estimated that the molecular basis of only half of all known Mendelian phenotypes has been delineated with the application of contemporary technologies, including whole-exome sequencing (WES) and chromosomal microarrays (i.e., array comparative genomic hybridization, aCGH, and genome-wide SNP arrays) leaving a substantial number of potential ‘disease genes’ that remain to be discovered [7]. The clinical classification of genetic disorders is further challenged by the incidence of phenotypes resulting from variation at more than one locus, also referred to as dual molecular diagnoses, and by phenotypic expansion, where variants in one locus are identified in patients demonstrating clinical features beyond the characteristic clinical findings usually observed for a particular clinical entity [8,9,10,11].

To identify potential novel candidate genes and to investigate the possible contribution of dual molecular diagnoses and phenotypic expansion to the etiology of MCDs, we investigated a cohort of 54 patients presenting with disorders of cerebral cortical development. We interrogated their genomes with aCGH and WES for changes: both copy number variants (CNV) and single-nucleotide variants (SNV) in known disease-causing genes and potential candidate genes not previously associated with human pathology.

Materials and methods

Study subjects

We studied 54 Polish pediatric patients in collaboration with the Institute of Mother and Child in Warsaw, Poland. Subjects were selected on the basis of their phenotype and demonstrated radiologic evidence of abnormal cortical development. Many patients underwent cytogenetic and/or molecular evaluation that was unrevealing. Study subjects were investigated via protocols approved by the institutional review boards for the protection of human subjects at the Baylor College of Medicine (Houston, TX) and at the Institute of Mother and Child (Warsaw, Poland).

Genomic DNA preparation

DNA was isolated from clotted whole blood by using the Clotspin Baskets and the Gentra PureGene Blood kit (Qiagen) according to the manufacturer’s instructions.

Whole-exome sequencing

WES was performed at the Human Genome Sequencing Center (HGSC) at Baylor College of Medicine through the Baylor-Hopkins Center for Mendelian Genomics (BHCMG) initiative [12]. Using 1 µg of DNA, an Illumina paired-end pre-capture library was constructed according to the manufacturer’s protocol (Illumina Multiplexing_SamplePrep_Guide_1005361_D) with modifications as described in the BCM-HGSC Illumina Barcoded Paired-End Capture Library Preparation protocol. Pre-capture libraries were either pooled into 4-plex library pools and hybridized in solution to the HGSC-designed Core capture reagent (52 Mb, NimbleGen), or pooled into 6-plex library pools and hybridized to the custom VCRome 2.1 capture reagent (42 Mb, NimbleGen), according to the manufacturer’s protocol (NimbleGen SeqCap EZ Exome Library SR User’s Guide) with minor revisions. The sequencing run was performed in paired-end mode using the Illumina HiSeq 2000 platform, with sequencing-by-synthesis reactions extended for 101 cycles from each end and an additional 7 cycles for the index read. With a sequencing yield of 8.6 Gb, the sample achieved 94% of the targeted exome bases covered to a depth of 20× or greater. Illumina sequence analysis was performed using the HGSC Mercury analysis pipeline (https://www.hgsc.bcm.edu/software/mercury), which moves data through various analysis tools from the initial sequence generation on the instrument to annotated variant calls (SNPs and intra-read in/dels). The ACMG guidance for interpretation of sequence variants identified in known disease genes was applied (Supplementary Table 1) [13]. Variants in candidate genes were considered pathogenic or potentially pathogenic based on: (i) variant frequency in the in-house and public mutation databases, (ii) bioinformatics analysis with application of predictive programs, (iii) genotype–phenotype correlation analysis, (iv) familial segregation studies, and (v) functional studies, if available.
Identified variants were deposited into the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/); consecutive accession numbers SCV000598581–SCV000598612.
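The coverage statistic quoted above (the fraction of targeted bases covered at 20× or greater) can be illustrated with a minimal sketch. This is not the Mercury pipeline's implementation; the function name `fraction_covered` and the per-base depth list are hypothetical and assume that a depth value is available for every targeted base.

```python
from typing import Iterable


def fraction_covered(depths: Iterable[int], min_depth: int = 20) -> float:
    """Fraction of target bases with read depth >= min_depth.

    `depths` holds one read-depth value per targeted base (hypothetical
    input; a real pipeline would derive it from the aligned reads).
    """
    values = list(depths)
    if not values:
        raise ValueError("no target bases supplied")
    covered = sum(1 for d in values if d >= min_depth)
    return covered / len(values)


# Toy example: three of five bases reach the 20x threshold.
toy_depths = [25, 30, 19, 20, 0]
toy_fraction = fraction_covered(toy_depths)
```

On real data the same ratio would be computed over all bases in the capture design rather than a toy list.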

CNV derivation from whole-exome data

Genomic coordinates for copy number variants (CNVs) were extracted from WES data using CoNIFER and XHMM [10, 14]. As an input for both tools, all 54 WES samples from the MCDs cohort were processed together. In addition, to identify small homozygous or hemizygous deletion CNVs, HMZDelFinder (https://github.com/BCM-Lupskilab/HMZDelFinder) was run on a larger data set of 4866 BHCMG WES samples, which included the 54 samples from this MCDs cohort [15]. As a definition of target regions, we used the intersection of the two designs (i.e., HGSC Core and Baylor Human Genome Sequencing Center VCRome 2.1) that were used for exome capture and sequencing of the MCDs cohort [9]. The output of CoNIFER was subjected to manual inspection aimed at excluding likely false positive calls (i.e., where the signal-to-noise ratio was low) and common CNVs (i.e., observed more than three times in the Database of Genomic Variants, DGV). For the remaining calls, genotype–phenotype correlation analysis (based on OMIM and a literature search) was performed for all of the genes encompassed by these CNVs.
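Two of the steps described above, intersecting the two capture designs and excluding CNVs observed more than three times in DGV, can be sketched as follows. This is an illustrative sketch only, not the code used in the study: the function names, the tuple layout, and the `dgv_count` field are hypothetical, and intervals are assumed to be half-open (BED-style) and non-overlapping within each design.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def intersect_targets(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions present in both capture designs."""
    result: List[Interval] = []
    shared_chroms = sorted({c for c, _, _ in a} & {c for c, _, _ in b})
    for chrom in shared_chroms:
        xs = sorted((s, e) for c, s, e in a if c == chrom)
        ys = sorted((s, e) for c, s, e in b if c == chrom)
        i = j = 0
        while i < len(xs) and j < len(ys):
            start = max(xs[i][0], ys[j][0])
            end = min(xs[i][1], ys[j][1])
            if start < end:  # the two intervals overlap
                result.append((chrom, start, end))
            # Advance whichever interval ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return result


def drop_common_cnvs(calls: List[Dict], max_dgv_count: int = 3) -> List[Dict]:
    """Keep calls seen at most `max_dgv_count` times in DGV.

    `dgv_count` is a hypothetical field holding the number of DGV
    observations for each call.
    """
    return [c for c in calls if c["dgv_count"] <= max_dgv_count]


# Toy data, not taken from the study.
core = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 60)]
vcrome = [("chr1", 150, 350), ("chr3", 1, 10)]
shared = intersect_targets(core, vcrome)

calls = [{"id": "cnv1", "dgv_count": 0}, {"id": "cnv2", "dgv_count": 12}]
rare_calls = drop_common_cnvs(calls)
```

The signal-to-noise review is left out because the text describes it as manual inspection without a stated numeric threshold.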

Array comparative genomic hybridization (array CGH)

Array CGH was performed using CytoSure Constitutional v3 +LOH (4 × 180k) arrays from Oxford Gene Technology (CytoSure ISCA, v3). The array used in this study contains 178,032-mer oligonucleotide probes covering the entire reference haploid human genome at an average spacing of ~30 kb. Patient and reference DNA were labeled with Cy3 and Cy5, respectively. DNA denaturation, labeling, and hybridization were performed according to the manufacturer’s instructions. Microarrays were scanned using an Agilent scanner (Agilent Technologies). All scanned images were quantified using Agilent Feature Extraction software (V10.0) and analysed using CytoSure (OGT) software. All genomic coordinates are based on the February 2009 assembly of the human reference genome (GRCh37/hg19).

Results

We investigated 54 unrelated subjects demonstrating radiological evidence of disorders of cerebral cortical development on brain magnetic resonance imaging (MRI). Abnormal head circumference was observed in 29/54 patients and included microcephaly (defined as a fronto-occipital circumference, FOC, z-score <−2) in 26/54 and macrocephaly (z-score >+2) in 3/54. The remaining 25 subjects either showed a normal head circumference or had no documented FOC measurements. Brain anomalies were accompanied by a number of neuropsychiatric sequelae, including epilepsy, developmental delay/intellectual disability (DD/ID), abnormal tone, and ataxia. In 10/54 patients, systemic involvement was documented, suggesting that the potential etiology was an underlying genetic syndrome (Supplementary Table 2).
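The head-circumference definitions used above amount to a simple threshold rule on the FOC z-score. A minimal sketch, assuming the z-score has already been computed against an age- and sex-matched reference (the function name and the category labels are illustrative, not taken from the study):

```python
from typing import Optional


def classify_foc(z_score: Optional[float]) -> str:
    """Classify head size from an FOC z-score using the cut-offs in the text."""
    if z_score is None:
        return "not documented"  # no FOC measurement available
    if z_score < -2:
        return "microcephaly"
    if z_score > 2:
        return "macrocephaly"
    return "normal"


# Toy values, not patient data.
labels = [classify_foc(z) for z in (-3.1, 0.4, 2.5, None)]
```

Values of exactly −2 or +2 fall in the normal range under this rule, since the text defines both categories with strict inequalities.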

Molecular variation affecting known MCD-associated genes was identified in 16/54 patients including exclusive SNV or CNV in 11/54 and 2/54 subjects, respectively. Three individuals were found with both an SNV and CNV (Table 1). In aggregate, SNV were found in 14/54 patients, with de novo SNV being most prevalent and present in 8/14 subjects. Half (7/14) of SNVs were classified as likely pathogenic by ACMG criteria while the other half were categorized as variants of unknown significance (Table 1). All patients but one, 39785IMID, who was found to have a FLNA NM_001456.1:c.7333+1G>A (NG_011506.1:g.153578398G>A) variant inherited from her affected mother, represented sporadic cases of MCDs with reportedly unaffected parents. In three patients, evidence for possible phenotypic expansion was appreciated. Patient 1793IMID, who tested positive for a VUS in TUBB, was found with lissencephaly accompanied by neurologic and systemic findings. Although TUBB has not previously been associated with a more severe lissencephaly phenotype, other tubulin genes have been well established to have an association with this phenotype [16]. Two patients, 40628IMID and 39969IMID, who had a history of polymicrogyria, microcephaly and neurological problems including epilepsy (40628IMID) and spastic tetraparesis (39969IMID) were found with VUS in RELN and MEF2C, respectively. Both genes have been associated with human diseases, RELN with lissencephaly and MEF2C with corpus callosum and white matter abnormalities but not polymicrogyria [17, 18].

Table 1 Clinical and molecular characteristics of MCDs patients with variants in known disease-associated genes

WES data were mined for CNV, enabling the identification of likely pathogenic changes in 5 individuals; all computationally identified CNV were subsequently confirmed experimentally by array CGH. These CNVs ranged in size from 0.5 to 12 Mb. Two subjects, 40726IMID and 41983IMID, were found with known CNV associated with MCDs and no SNV (Table 1). Patient 40726IMID, who presented with diffuse pachygyria, agenesis of the corpus callosum, facial dysmorphia, and neurological problems, was found to have a 0.8 Mb deletion at 17p13.3, which is the cytogenetic signature of Miller–Dieker syndrome (MIM#247200), a classical lissencephaly syndrome [19]. The second patient, 41983IMID, who had diffuse pachygyria and neurological deficits, was found to have a 3.11 Mb deletion in 15q11.1q11.2, which represents a common deletion in patients with Angelman syndrome (MIM#105830). Interestingly, pachygyria has not been associated with Angelman syndrome [20].

Three individuals, 39175IMID, 39785IMID, and 40689IMID, were found to have complex genetic abnormalities involving both CNV and SNV at more than one locus (Table 1). Patient 40689IMID demonstrated a complex brain anomaly involving nodular gray matter heterotopia, hypoplastic cerebellum, brain stem abnormalities, enlarged ventricles, and thinned corpus callosum (Fig. 1). He had global developmental delay, epilepsy, retinal disease, axial hypotonia, and limb hypertonia. We detected a complex chromosomal abnormality comprising a 12 Mb deletion within 15q11.2q13.3 (Angelman syndrome region) and a 1.75 Mb deletion at 6q27. This complex chromosomal abnormality represents a sequela of an unbalanced translocation, 45,XY,der(6)t(6;15)(q27;q12)mat,-15, inherited from a reportedly clinically normal mother with a balanced translocation. Both CNV are pathogenic and reported to be associated with brain anomalies and neuropsychiatric symptoms [20, 21]. Additional analysis of 40689IMID WES data identified a maternally inherited L1CAM NM_000425.4: c.3163G>A (p.(Gly1055Arg)) VUS. L1CAM variants are associated with a number of X-linked genetic syndromes that involve structural brain anomalies [22]. The clinical relevance of this SNV is unknown; however, its absence from population databases and the patient’s medical history support a potential association of this variant with the phenotype, despite the nonpathogenic status predicted by bioinformatics algorithms. The complex brain abnormality observed in 40689IMID could be attributed to the two CNVs and potentially to the L1CAM variant, which may add to the mutational burden: nodular gray matter heterotopia has been reported in patients with CNV involving 6q27 (Fig. 1b), brain stem abnormalities are observed frequently in patients with 15q11 deletion and Angelman syndrome (Fig. 1c), and L1CAM has been associated with ventriculomegaly (Fig. 1a).

Fig. 1

Patient 40689IMID demonstrating a complex brain abnormality due to a synergistic effect of CNV and SNV. Brain imaging studies demonstrated ventriculomegaly (a) that can be explained by L1CAM variant, gray matter heterotopia associated with 6q27 deletion (a, b) and brain stem abnormalities (b, c) caused by 15q11 deletion

A female patient, 39785IMID, was found to have neuroradiologic features of nodular gray matter heterotopia and abnormal corpus callosum accompanied by facial dysmorphism and neuropsychiatric problems, including epilepsy, ataxia, and intellectual disability. WES studies revealed a FLNA NM_001456.1: c.7333+1G>A (NG_011506.1:g.153578398G>A) variant that likely affects splicing and was predicted to be disease-causing. The patient inherited the FLNA SNV from her mother, who demonstrated the same neuroradiologic findings but did not have any neurologic or intellectual deficits. This subject was also found to have two maternally inherited duplication CNVs: one in 8p11.21q11.21 (5 Mb) and another at 11p15.4 (0.5 Mb). We concluded that the patient’s clinical findings can be explained partially by the FLNA defect, which is associated with periventricular heterotopia, seizures, and delayed development. Additional features may be associated with the two duplication CNVs affecting chromosomes 8p11.21–q11.21 and 11p15.4. The latter duplication has been associated with a neuropsychiatric phenotype and mild dysmorphic features [23].

Patient 39175IMID presented with abnormal gyration, suspected hemimegalencephaly, abnormal EEG studies, and delayed development. He was found to have a variant of unknown significance in TSC1, NM_000368.1:c.2194C>T (p.(His732Tyr)), and a 16p11.2 duplication CNV. Patient 39175IMID lacks the classical MRI stigmata of tuberous sclerosis (MIM#191100), but this presentation varies among patients affected with this condition. Some of his symptoms can be attributed to the 16p11.2 duplication (MIM#611913), which is known to be associated with intellectual disability and behavioral problems in childhood, but also with structural brain anomalies [24,25,26].

The WES data were investigated for sequence variants in potential candidate genes. This gene category included genes likely contributing to the observed MCDs as evidenced by (i) functional and/or animal studies, (ii) gene variants reported in multiple unrelated subjects sharing the same clinical phenotype, and (iii) compelling bioinformatics evidence of conservation and pathogenicity for the variant allele. Using these criteria, potentially etiologic, rare variants were identified in CDH4 and ASTN1, both previously reported as candidate disease genes associated with brain malformation in a Turkish cohort with a high rate of consanguinity (Table 2) [4]. In this Polish MCDs cohort, a female patient, 16IMID, presented with a medical history of microcephaly, simplified gyral pattern, and dysgenesis of the corpus callosum. The neurological examination demonstrated axial hypotonia with limb hypertonia. The patient was compound heterozygous for CDH4 variants: NM_001794.4: c.[1351G>A]; [2554G>A] (p.(Glu451Lys), p.(Ala852Thr)) (Fig. 2a). Both variants are rare and predicted to be disease causing by available bioinformatic tools (Supplementary Table 2). We screened our internal exome variant database of patients with brain anomalies and identified a 7-year-old male Turkish subject with a homozygous rare variant that was likely pathogenic: CDH4 NM_001794.4: c.1976G>C (p.(Arg659Pro)) [4]. His clinical presentation was reminiscent of that observed in the Polish patient 16IMID and included microcephaly, hypoplasia of the corpus callosum, evidence of frontotemporal atrophy, and scoliosis. The Turkish subject with the homozygous CDH4 NM_001794.4: c.1976G>C (p.(Arg659Pro)) variant was born to consanguineous parents who were apparently healthy, and his brother, who died during childhood, had a similar clinical presentation.

Table 2 Clinical and molecular summary of patients with MCDs candidate genes
Fig. 2

Segregation analysis of disease-associated variants, facial appearance, and brain imaging studies in patients with disorders of cerebral cortical development. a Patient 16IMID, a compound heterozygote for CDH4 variants: NM_001794.4: c.[1351G>A]; [2554G>A] (p.(Glu451Lys), p.(Ala852Thr)), demonstrating microcephaly, simplified gyral pattern and dysgenesis of corpus callosum. b Patient 43066IMID, with ASTN1 NM_004319.2: c.[3283A>C]; [2770C>T], (p.(Met1095Leu), p.(His924Tyr)) variants and diffuse polymicrogyria on brain MRI

ASTN1 variants were identified in three individuals from two families; a Polish patient reported here and siblings from a Turkish family (Table 2). Patient 43066IMID, with diffuse polymicrogyria, spastic tetraplegia, epilepsy, and developmental delay was compound heterozygous for ASTN1 NM_004319.2: c.[3283A>C]; [2770C>T], (p.(Met1095Leu), p.(His924Tyr)) variants (Fig. 2b). Sequence variants in the same gene were also identified in two sisters, BAB3419 and BAB3420 born to consanguineous parents from Turkey, who were previously found with homozygous ASTN1 NM_004319.2: c.2224G>C (p.(Gly742Arg)) SNV and hypoplastic corpus callosum on brain MRI (Table 2) [4].

MACF1, CEP85L, LINGO4, LAMA2, and LAMA5 were other potential candidate genes implicated in brain development and MCDs (Supplementary Table 3). SNV in these genes: MACF1 NM_012090.5: c.[16131T>G]; [2555C>G] (p.(Ser852*), p.(Phe7335Leu)), CEP85L NM_001178035.1: c.191C>T (p.(Ser64Phe)), LINGO4 NM_001004432.3: c.[1262G>A]; [851C>T] (p.(Arg421Gln), p.(Ser284Phe)), LAMA2 NM_000426.3: c.[5179G>C]; [5530C>A] (p.(Glu1727Gln), p.(Arg1844Ser)) and LAMA5 NM_005560.4: c.[10726G>A]; [7114G>A] (p.(Glu367Lys), p.(Asp2372Asn)) were each identified in only a single family with MCDs despite an extensive variant database search. The aforementioned genes encode proteins expressed in the CNS that are involved in various cellular processes. We consider them potential candidate genes, as further evidence of their association with MCDs is needed to affirm their ‘disease gene’ status.

Discussion

Comprehensive genomic analysis of 54 families with disorders of cerebral cortical development was performed with the application of WES and chromosomal microarray analysis. These laboratory investigative approaches yielded a definitive (9/16) or presumptive (7/16) molecular diagnosis in 16/54 (30%) of enrolled subjects. When combined with sequence variants in candidate MCDs genes, the potential molecular diagnostic solved rate increased to 43%. Nevertheless, the majority of cases remained unsolved suggesting there are more genes involved in brain development to be found.
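The yields quoted above follow from simple proportions. The short check below is a sketch under one stated assumption: that each of the seven candidate genes named in the Results (CDH4, ASTN1, MACF1, CEP85L, LINGO4, LAMA2, and LAMA5) accounts for exactly one additional patient in the cohort, which is inferred from the text rather than stated explicitly.

```python
COHORT_SIZE = 54
KNOWN_GENE_DIAGNOSES = 16      # definitive (9) + presumptive (7)
CANDIDATE_GENE_PATIENTS = 7    # assumption: one cohort patient per candidate gene


def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


known_yield = percent(KNOWN_GENE_DIAGNOSES, COHORT_SIZE)
combined_yield = percent(
    KNOWN_GENE_DIAGNOSES + CANDIDATE_GENE_PATIENTS, COHORT_SIZE
)
print(f"known genes: {known_yield}%, with candidate genes: {combined_yield}%")
```

Under that assumption, 16/54 rounds to 30% and 23/54 rounds to 43%, matching the figures reported in the text.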

It has been estimated that clinical WES analysis of a clinical population of patients with a multitude of phenotypes achieves a molecular diagnosis in around 25% of patients with suspected genetic disorders [8, 9]. Interestingly, the solved rate is even higher, reaching 30–50%, when patients with early-onset pediatric/congenital disorders are studied [8, 9]. When comprehensive genomic analysis is performed in a research setting, utilizing exome data from additional family members, applying expanded diagnostic criteria, and considering candidate genes, a diagnostic yield above 50% can be achieved [5].

Our study of Polish patients with clinically diagnosed disorders of cerebral cortical development demonstrated that this diagnostic outcome was not evenly distributed among all classes of patients with MCDs; 3 out of 4 patients with heterotopia and 6 out of 10 with lissencephaly were found with variants in known disease genes, but only 2/19 with polymicrogyria and 5/20 with complex brain anomalies. This observation suggests potential for new gene discovery, particularly in the latter two patient groups. However, our observations could also be explained by a role for non-genetic factors, or for variants not as readily detected by WES (e.g., indels and mosaic variants), in the development of brain anomalies in groups with a low rate of molecular diagnoses [27]. The role of mosaicism, especially in cases of isolated and focal brain changes (e.g., local polymicrogyria), needs to be more rigorously investigated [27].

In 2/54 families, we identified CNV that can solely explain the observed brain anomaly. This low molecular diagnostic yield for CNV may be due to the fact that most patients had undergone karyotype or clinical aCGH analysis as part of their prior clinical work-up, so that patients with positive findings were not included in this MCDs cohort. Three patients, 39175IMID, 39785IMID, and 40689IMID, who presented with brain anomalies and syndromic features, were found with complex genetic abnormalities involving both SNV and CNV affecting distinct genetic loci. These three patients illustrate the concept of dual molecular diagnoses resulting in blended phenotypes [8,9,10,11]. In these cases the subjects presented with overlapping phenotypes, wherein a complex clinical presentation results from variants in 2 or more loci; for one case, 40689IMID, the three contributing variants, one SNV and 2 CNV, could be specifically associated with the constituent neurological traits comprising the patient’s clinical phenotype: ventriculomegaly, heterotopia, and brain stem abnormalities [11]. It has been reported in the literature that up to 8% of patients with an established molecular diagnosis have variants in 2 or more Mendelian loci [8,9,10,11]. This underscores the need for comprehensive molecular evaluation of patients with MCDs that involves both CNV and SNV analysis, as we observe a growing number of examples with both SNV and CNV contributing to mutation burden [28].

We provide evidence for the association of two novel disease genes, CDH4 and ASTN1, with brain development abnormalities. Patients 16IMID and BAB4860 demonstrated microcephaly and corpus callosum abnormalities. They were found with biallelic (compound heterozygous or homozygous) variants in CDH4, consistent with autosomal recessive Mendelian expectations. CDH4 encodes cadherin 4, which is a member of a large superfamily of surface glycoproteins. Cadherins are involved in cell-to-cell adhesion and are implicated in different human diseases, including cancer, MIM#137215, (CDH1), ectodermal dysplasia, ectrodactyly and macular dystrophy syndrome, EEMS, MIM#225280, (CDH3) or developmental delay, MIM#612580, (CDH15) among many others [29]. The precise role of CDH4 is yet to be determined; however, functional studies demonstrated that CDH4 is highly expressed in fetal brain. It has been suggested that it plays an important role in brain segmentation and in neuronal growth and targeting, as CDH4 may provide guidance for layer formation in the mammalian cortex [30,31,32].

Astrotactin 1 (ASTN1) variants were identified in three subjects from two families: 43066IMID, of Polish descent, and the sisters BAB3419 and BAB3420, from a Turkish family reported recently in the literature [4]. Patient 43066IMID demonstrated diffuse polymicrogyria, while BAB3420 had an abnormal corpus callosum. The gene has been extensively studied in an animal model that demonstrated evidence of abnormal neuronal migration in Astn1−/− animals [33]. ASTN1 encodes a neuronal adhesion molecule that is crucial for glial-guided migration of neuronal cells in cortical regions of the developing brain; therefore, it is considered an excellent candidate MCDs gene [34].

In aggregate, we provide additional evidence for the involvement of two novel genes: ASTN1 and CDH4, in brain development and the association of variant alleles in these genes with neuronal migration disorders. We document the contribution of both SNV and CNV to this phenotype and to multilocus mutational burden. We demonstrate how a blended phenotype, caused by overlapping neurological traits, can be potentially dissected into its constituent contributing loci. Moreover, we provide evidence for 5 additional candidate genes, MACF1, CEP85L, LINGO4, LAMA2, and LAMA5 that may be important for cerebral cortex development. The identification of novel genes adds to our understanding of brain development and its associated diseases. It also further informs genetic testing and facilitates genetic counseling in affected families.