INTRODUCTION

Microcephaly is a clinical finding defined as an occipitofrontal head circumference (OFC) of >2 SDs below the mean for age, sex, and ethnicity, which affects approximately 2–3% of the population worldwide.1 Individuals with microcephaly, especially those with an OFC <–3 SD, can manifest neurological features that require medical attention and a search for the underlying etiology among environmental or, more commonly, genetic factors.2

Microcephaly is classified into primary (PM) if present at birth, and secondary (SM) if developing thereafter.3 Accordingly, PM has been shown frequently to result from early defects in neurogenesis due to abnormal regulation of mitotic division, while SM has been often linked to disruptions of later developmental processes such as myelination and synapse formation owing to abnormal endosome regulation, vesicle membrane transport, or synaptic structural support.4,5 However, neuronal migration, DNA repair, and transcription regulation–related pathways are among those affected in both PM and SM.4,5

Microcephaly can be nonsyndromic or present as an associated feature in a variety of genetic syndromes.2,4 Currently, there are over 900 OMIM phenotype entries and almost 800 genes linked to microcephaly with variable expressivity. In particular, 18 of these genes constitute a distinct PM subclass, termed autosomal recessive primary microcephaly or microcephaly primary hereditary (MCPH), a form of microcephaly that is relatively consistent and thus far better characterized.6,7 In contrast, SM and non-MCPH PM show considerable heterogeneity, which has not been properly studied so far and has hence remained largely elusive.

Previous studies on patients with microcephaly using clinical and radiological information as well as metabolic and targeted genetic testing were able to identify causes in a small fraction of the patients (<20%) (refs. 2,8). Since the advent of next-generation sequencing (NGS), mainly mixed cohorts of neurodevelopmental disorders (NDDs) have been assessed where microcephalic patients accounted for ~15–41% of the cases and on average ~47% of them were identified with a definite cause using exome (ES) or genome sequencing (GS).9,10,11,12,13,14 Until now, there are only two studies that used Mendeliome sequencing or ES to evaluate known disease-causing or candidate genes in exclusive microcephaly cohorts. The first study determined a molecular diagnosis for ~29% of the cases (11/38), but did not differentiate between PM and SM.15 The other study was focused on PM and MCPH from mainly consanguineous families showing the difficulties in their clinical definitions and common overlap with microcephalic primordial dwarfism, and proposed reconsideration of phenotypic boundaries.7

Here, we performed a comprehensive genetic study on a cohort of 62 unselected clinically well-characterized patients with syndromic or nonsyndromic microcephaly of different onset using combined high-resolution chromosomal microarray analysis (CMA) and ES. Our approach sheds light on the genetic landscape of PM and SM and delineates their respective clinical and molecular characteristics. In addition to novel clinical and molecular findings in known disease genes, we identified several novel NDD/microcephaly candidate genes.

MATERIALS AND METHODS

Patient recruitment

Sixty-two unrelated patients, including both syndromic and nonsyndromic cases, were recruited from 2015 to 2017, clinically assessed in detail, and subjected to defined genetic evaluations (Figure S1). Inclusion criteria consisted of (1) an OFC >2 SDs below the mean at birth or later, based on World Health Organization (WHO) and established growth charts; (2) no clear evidence for an acquired etiology or history of perinatal infection; and (3) no unequivocal etiological diagnosis after clinical assessment by pediatricians and clinical geneticists (Figure S1). We performed CMA and ES for all patients, and conventional karyotyping for 45 patients including all those who remained undiagnosed after CMA and ES analysis. Genetic testing was performed as part of a research study approved by the ethics commission of the Canton of Zurich or referral centers. Written informed consent for genetic testing and for publication of clinical information and/or photographs was obtained.
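Inclusion criterion (1) amounts to a threshold on the OFC standard deviation score. The following is a minimal sketch of that calculation, not the procedure used in the study: the function names are ours, and the reference mean and SD are hypothetical placeholders, since real values depend on age, sex, and the growth chart consulted.

```python
def ofc_sd_score(ofc_cm: float, ref_mean_cm: float, ref_sd_cm: float) -> float:
    """Return the OFC standard deviation score (z-score) against a reference."""
    if ref_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (ofc_cm - ref_mean_cm) / ref_sd_cm


def is_microcephalic(sd_score: float, threshold: float = -2.0) -> bool:
    """True when the score lies more than 2 SDs below the reference mean."""
    return sd_score < threshold


# Hypothetical measurement and reference values, for illustration only.
score = ofc_sd_score(ofc_cm=31.0, ref_mean_cm=34.5, ref_sd_cm=1.25)
print(round(score, 2), is_microcephalic(score))
```

The comparison is strict, so a score of exactly -2 SD is not counted as microcephaly under this reading of ">2 SDs below the mean".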

CMA

CMA for evaluation of rare coding copy-number variants (CNVs) was performed on DNA extracted from peripheral blood using Affymetrix Cytoscan HD or cytogenetic 2.7 M arrays as previously described.16

ES and Sanger sequencing

ES was performed on DNA extracted from peripheral blood using Agilent SureSelect XT Clinical Research Exome Kit (V5) or Human All Exon (V6) on a HiSeq 2500 System (Illumina, CA, USA) with 125-bp paired-end reads as described elsewhere.17 ES was done as trios (index patient and parents) in 58 families and duos (index patient and mother due to the lack of paternal DNA) in 4 families. ES coverage for targeted bases and off-target mitochondrial bases, and their distribution among diagnosed and undiagnosed patients, are shown in Fig. 1a. Coding plus flanking intronic (±6 bp) regions as well as 666 previously reported mitochondrial DNA variants in 37 mitochondrial genes from the MITOMAP database were analyzed using the NextGENe Software (SoftGenetics, PA, USA) (Figure S1). A second allele search for all de novo variants in recessive OMIM morbid genes or in high-level candidate genes was performed (Supplementary Materials and Methods). Selected variants from ES were confirmed by Sanger sequencing using an AB3730 capillary sequencer (Applied Biosystems, CA, USA).
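The two coverage metrics reported for the ES data (average depth and the fraction of bases covered by at least 20 reads) can be illustrated with a short sketch. This is an assumption-laden toy example rather than the pipeline used here: the function name and the per-base depth values are invented for illustration, and real depths would come from an alignment-based depth report.

```python
from statistics import mean


def coverage_summary(depths, min_depth=20):
    """Return (mean depth, fraction of bases covered by >= min_depth reads)."""
    if not depths:
        raise ValueError("no depth values supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return mean(depths), covered / len(depths)


# Toy per-base read depths (hypothetical values, not study data).
toy_depths = [250, 180, 35, 20, 19, 0, 310, 96]
avg_depth, frac_20x = coverage_summary(toy_depths)
print(avg_depth, frac_20x)
```

Reporting both numbers is useful because a high mean depth can hide poorly covered positions, which matters in particular for the off-target mitochondrial reads.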

Fig. 1

Exome sequencing (ES) coverage, growth parameters, and genetic evaluations of 62 patients with microcephaly. (a) Average coverages of targeted regions (left) and 20-fold average coverages (right) of ES data for all or mitochondrial genes. ES yielded an average coverage of 222-fold (range: 92–419 fold), covered about 96% of the targeted bases with ≥20 sequence reads, and achieved an average off-target mitochondrial read depth of 43.6-fold (range: 3.9–163.9 fold) with a 20× average coverage of 64.4% (range: 1.9–99.4%). Distribution of average sequencing depth and 20× coverage of the targeted region was indistinguishable among patients with P/LP variants (red dots), high-level candidate variants (yellow dots), or others (VUS, [suspected] candidate, no candidate) (black dots). Mitochondrial genes exhibited significantly lower average coverages and 20-fold average coverages (Welch t test) with a higher variability in the 20-fold average coverages. P/LP pathogenic or likely pathogenic, VUS variant of uncertain significance. (b) SD distributions of growth parameters measured at birth and at the time of last investigation (variable ages). Connected lines represent individual cases. SDs below –2 (dotted line) were considered microcephaly. Dark green dots: primary microcephaly (PM, 36 [58.1%] patients); light green dots: secondary microcephaly (SM, 17 [27.4%] patients); gray dots: unknown onset (9 [14.5%] patients). Note that the distributions for OFC consistently show SD reductions at the last follow-up, suggesting progressiveness of microcephaly with a statistically significantly higher OFC reduction in PM compared with that in SM patients (p < 0.001, Wilcoxon rank-sum test). However, 61.3% of PM and 70.6% of SM patients did not show a decline in length or height similar to that in OFC, indicating a disproportionate microcephaly in the majority of our patients. OFC occipitofrontal head circumference, SD standard deviation (given as standard deviation score).
(c) Distribution of (potentially) relevant genetic findings in the total cohort. Inner circle shows percentages of diagnostic and uncertain findings in established disease genes, as well as likely deleterious findings in candidate genes. Middle and outer circles show the distribution of CNVs and SVs, and the inheritance pattern in the respective categories of the inner circle, respectively. P/LP variants were identified in almost 50% of the patients. Most of these variants are SVs with comparable proportions of de novo (DN) occurrence and recessive inheritance. CNV copy-number variant, SV sequence variant. (d) Genetic findings in PM and SM. Diagnostic yields between PM (n = 36) and SM (n = 17) were comparable (left panel). Predominantly recessive inheritance was identified in diagnosed PM patients (~69%) and dominant de novo variants in all diagnosed SM patients (middle panel). Likely gene-disrupting (LGD) variants represented the most common disease alleles (~80%) among the diagnosed PM patients, while LGD and missense variants were equally observed among the diagnosed SM patients (right panel). CH compound heterozygous. Numbers on graphs are given as percentages.

Variant classification

Rare coding CNVs were classified according to Miller et al.18 Rare (minor allele frequency [MAF] ≤2%) sequence variants (SVs) affecting genes known to cause Mendelian disorders were classified according to the American College of Medical Genetics and Genomics (ACMG) guidelines.19 De novo, X-linked maternal, or biallelic variants affecting other genes were classified as suspected candidates, candidates, or high-level candidates according to our defined criteria (Figure S1).
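The routing of sequence variants described above can be sketched as a simple decision function. This is only an illustration of the sentence-level logic under our own assumptions: the class, field, and label names are hypothetical, the gene names are placeholders, and the detailed candidate criteria (Figure S1) are not reproduced here.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.02  # "rare" as stated in the text: MAF <= 2%
CANDIDATE_INHERITANCE = {"de_novo", "x_linked_maternal", "biallelic"}


@dataclass
class Variant:
    gene: str
    maf: float
    inheritance: str            # hypothetical labels, see set above
    known_mendelian_gene: bool  # gene already known to cause a Mendelian disorder


def triage(v: Variant) -> str:
    """Route a variant to the classification scheme named in the text."""
    if v.maf > MAF_CUTOFF:
        return "not_rare"
    if v.known_mendelian_gene:
        return "acmg_classification"
    if v.inheritance in CANDIDATE_INHERITANCE:
        return "candidate_assessment"
    return "not_prioritized"


# Placeholder variants, not study data.
examples = [
    Variant("GENE_A", 0.0001, "de_novo", True),
    Variant("GENE_B", 0.0, "biallelic", False),
    Variant("GENE_C", 0.05, "de_novo", False),
    Variant("GENE_D", 0.001, "paternal_heterozygous", False),
]
labels = [triage(v) for v in examples]
print(labels)
```

The sketch applies the frequency filter first, so a common variant is set aside regardless of gene or inheritance.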

Functional evaluations of selected variants

Structural modeling, cell culture, reverse transcriptase polymerase chain reaction (RT-PCR) and quantitative RT-PCR (qRT-PCR), immunoblotting, immunofluorescence, and imaging were performed to evaluate functional consequences of selected variants (Supplementary Materials and Methods).

RESULTS

Cohort characteristics

We enrolled 62 unrelated patients (29 females, 33 males) with microcephaly of unknown etiology from 62 families (Table S1). PM and SM were determined in 36 (58.1%) and 17 (27.4%) patients, respectively (Table 1 and Fig. 1b). In the other 9 (14.5%) patients, the onset of microcephaly could not be determined. The median age at last investigation was 5.4 years (mean: 6.5 years, range: 0.8–18 years), for PM 4.5 years (mean: 5.3 years), and for SM 4.3 years (mean: 5.7 years). The majority were of European descent (77.4%) and the remaining were of Middle Eastern/North African (12.9%) or Indian (9.7%) ancestry. Nine (14.5%) patients were born to consanguineous parents. Seven patients had one or more affected siblings. Notably, follow-up OFC measurements showed a pattern of progressive microcephaly in both PM and SM with a statistically significantly higher OFC reduction in PM than in SM patients (p < 0.001, Wilcoxon rank-sum test) (Fig. 1b). However, 61.3% of PM and 70.6% of SM patients did not show a decline in length or height similar to that in OFC, indicating a disproportionate microcephaly in the majority of our patients (Fig. 1b). Apart from microcephaly, varying degrees of different neurological signs were reported, among which abnormal developmental milestones (developmental delay [DD] or intellectual disability [ID]) and abnormal cerebral magnetic resonance imaging (MRI) findings represented the most common associated features (Table 1). Importantly, we observed that the severity of DD/ID was significantly correlated with the severity of microcephaly among our PM patients (Figure S2A, r = –0.43, p = 0.01, Spearman rank correlation with Bonferroni correction) but not among SM patients or the total cohort (Figure S2B–D). In addition, we found a significant correlation between the severity of DD/ID and abnormal cerebral MRI among the total cohort (Figure S2E, F, p < 0.01, Fisher’s exact test with Bonferroni correction).

Table 1 Summary of main clinical features in our cohort of 62 patients

Genetic findings

We identified pathogenic or likely pathogenic (P/LP) causative variants in 48.4% of the patients (Table 2), and variants of uncertain significance (VUS) in another 4.8% of the patients (Fig. 1c and Table S1). Furthermore, we found likely deleterious variants affecting our novel high-level candidate genes in another 8.1%, affecting our novel (suspected) candidate genes in another 17.7%, and we found no (candidate) causative variant in 21% of the patients (Fig. 1c). We did not find a second disease allele for any patient with inherited heterozygous likely gene-disrupting (LGD) variants in established genes known to cause recessive disorders by our alternative methods (Supplementary Materials and Methods). In six (9.7%) patients, we found P/LP inherited heterozygous variants as secondary findings (Supplementary Results).

Table 2 Summary of main clinical features and genetic findings in patients with P/LP or high-level candidate variants

P/LP variants

We identified pathogenic CNVs in six (9.7%) patients (4 deletions, 2 duplications; 5 [assumed] de novo, 1 X-linked recessive inheritance) (Fig. 1c and Table 2). In one of these patients (ID74601) who had a de novo pathogenic ~1.5-Mb duplication, we identified an additional pathogenic de novo sequence variant (SV) c.3555_3556insA, p.(Ala1186Serfs*5) in KAT6A (NM_001099412.1), which likely contributes to the severity of his NDD phenotype (Table 2 and S1). Furthermore, we identified possible additional hits that may contribute to the expressivity of microcephaly in another patient (ID70688) who was identified with a pathogenic Xp11.22 microduplication affecting HUWE1, PHF8, and FAM120C, a CNV known to cause X-linked ID (MIM 300705) without microcephaly (Table 2 and S1). These hits include an additional microcephaly-related 16p11.2 microduplication (MIM 614671) and an unreported hemizygous nonsense variant c.901C>T, p.(Arg301*) in the last exon of ASB11 (NM_080873.2), which has not yet been linked to any disorder, but encodes an E3 ubiquitin protein ligase with an established role in canonical Notch signaling to regulate proper neurogenesis.20 Therefore, it is possible that these two additional variants contribute to the manifestation of microcephaly in this patient.

Among the other 56 patients, we identified P/LP SVs affecting 22 different genes in 24 patients (CDK5RAP2 and PLK4 each in two patients), adding up to a total diagnostic yield of 48.4% (Fig. 1c). Among the diagnosed patients (n = 30), only one (~3%) PM patient was born to consanguineous parents. Considering the two microcephaly subclasses, we found comparable diagnostic yields of 44.4% in PM and 47.1% in SM (Fig. 1d). Notably, we observed recessive inheritance in 68.8% and dominant de novo variants in 31.2% of the diagnosed PM patients (n = 16), but dominant de novo variants in all of the diagnosed SM patients (n = 8). In PM, we observed mainly LGD disease alleles (~80%), while in SM, LGD and missense disease alleles were equally detected (Fig. 1d and Table 2). The affected genes in our PM subgroup belong to a variety of pathways including centrosome-associated pathways, regulation of mitotic division, transcriptional regulation, mitochondria-related function, NF-kappa-B signaling, endosome regulation, and DNA repair, whereas the affected genes in the SM subgroup encode proteins playing roles in transcriptional regulation, cell growth and differentiation, protein ubiquitination, mitotic progression, and DNA repair (Table 2).
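The yields in this paragraph follow from the patient counts given in the text (6 patients with pathogenic CNVs plus 24 with P/LP SVs in a cohort of 62; 16 diagnosed of 36 PM patients; 8 diagnosed of 17 SM patients). As a simple arithmetic check, a minimal sketch with a helper name of our own choosing:

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * part / whole, 1)


cohort_size = 62
diagnosed_total = 6 + 24  # pathogenic CNVs + P/LP sequence variants

total_yield = pct(diagnosed_total, cohort_size)
pm_yield = pct(16, 36)  # diagnosed PM patients / all PM patients
sm_yield = pct(8, 17)   # diagnosed SM patients / all SM patients
print(total_yield, pm_yield, sm_yield)
```

With subgroups of 36 and 17 patients, a single additional diagnosis shifts the PM or SM yield by roughly 3 or 6 percentage points, which is worth keeping in mind when calling the two yields comparable.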

Of the P/LP SVs, seven (25%) were recurrent variants previously reported, and 21 (75%) were novel. Among the novel variants, we found a de novo noncanonical splice-site variant c.665-4del in DYRK1A (NM_001396.3, NG_009366.1), which was not predicted to have a splice effect, but which we demonstrated to cause aberrant splicing at the messenger RNA (mRNA) level (exon 6 deletion, r.665_951del, p.[Ile222Aspfs*22]) (Figure S3). We also found in an aborted fetus (ID74812, Fig. 2e, f) a pathogenic nonsense PLK4 variant c.1111C>T, p.(Arg371*) in trans with a likely pathogenic serine substitution c.881T>G, p.(Ile294Ser). The latter variant, which was absent in an unaffected sibling, is the first to be located in the phosphodegron element of PLK4 and is predicted to create an additional phosphorylation site, likely leading to a reduced protein level via accelerated autodestruction (Table 2 and S2, Figure S4). Phenotypically, this patient presented with previously unreported organ anomalies found at autopsy (ID74812, Table 2 and S2). In an unrelated child (ID77804) with different PLK4 causative variants, we found a novel MRI finding of a large cerebellum and brain stem relative to the supratentorial region (Table 2 and S2). Other novel clinical findings in our study include uvula bifida in a patient (ID53792) with TRMT10A-related microcephaly, short stature, and impaired glucose metabolism 1 (MIM 616033). In addition, our observation of pituitary hypoplasia in one patient (ID65891) with MECP2-related X-linked mental retardation 13 (MIM 300055) supports a forgotten concept of smaller pituitary glands in Rett syndrome patients21 (Table 2 and S1). Additionally, one patient (ID73824) was identified with a recurrent pathogenic missense variant c.923A>G, p.(Asn308Ser) in PTPN11 (NM_002834.3), a gene known for Noonan syndrome, and had a history of perinatal asphyxia, which may contribute as an environmental factor to her microcephaly as an unusual presentation of Noonan syndrome (Table S1).

Fig. 2

Facial photographs of selected patients with expanding clinical features or harboring high-level candidate genes. (a–d) Two phenotypically similar patients (ID32410 and 76870) with likely pathogenic variants in mitochondria-related genes MT-ATP6 and KARS at 15 years 3 months and 16 years 2 months, respectively. Note apparently closely spaced eyes, long nose with bulbous tip, apparently narrow mouth with crowded teeth, and large chin. (e, f) Patient 74812 with P/LP biallelic variants in PLK4, aborted at gestational week 23. Note sloping forehead, upslanting palpebral fissures, retrognathia, and apparently large ears with increased posterior angulation. (g, h) Patient 68629 with biallelic variants in a high-level candidate gene TEDC1 at 5 months (g) and 5 years 8 months (h). Note apparently broad forehead at young age, facial scoliosis (asymmetry with curvatures in relation to the vertical axis of the face), mild ptosis, beaked nose, apparently short ears, and micrognathia. (i, j) Patient 60361 with a de novo variant in a high-level candidate gene ZNRF3 at 4 years 9 months. Note sparse hair, left-sided microphthalmia with the secretions around both eyes due to lacrimal duct obstruction, narrow nose and nares, apparently large protruding ears, deep philtrum, thin lip vermilion (i), and oligodontia with conically shaped teeth (j). (k, l) Patient 74091 with homozygous variants in a high-level candidate gene DDX1 at 6 months. Note round face with mildly upslanting palpebral fissures, retrognathia, and apparently large ears with increased posterior angulation. P/LP pathogenic or likely pathogenic.

Importantly, in 1 of the 62 patients we identified a likely pathogenic variant m.9185T>C, p.(Leu220Pro) in the mitochondrial gene MT-ATP6 (NC_012920.1) from ES data (ID32410, Table 2 and S1, Fig. 2a, b). This variant was observed in 59% of the reads (Figure S5A), which was also detected by a targeted panel of mitochondrial disease genes in 82% of urothelial cells (data not shown). In a phenotypically similar patient (ID76870, Table 2 and S1, Fig. 2c, d), we found a homozygous deleterious missense variant c.1772A>T, p.(Asn591Ile) in a nuclear gene KARS (NM_001130089.1) which encodes a mitochondria-related protein. Our structural modeling revealed that the isoleucine substitution likely affects the protein structure and/or stability (Figure S5B). We also identified a likely pathogenic variant in another nuclear gene DHTKD1 that encodes a mitochondrial protein (Table 2 and S1). Altogether, we identified three patients with likely pathogenic variants in mitochondria-related genes, accounting for 4.8% of the total cohort.

High-level candidate genes

We identified likely deleterious variants affecting five different high-level candidate genes in five (8.1%) patients without P/LP variants or VUS in established disease genes (Table 2). Four of them (SPAG5, TEDC1, VPS26A, DDX1) were affected by biallelic variants, and one (ZNRF3) by a de novo variant.

SPAG5 (sperm associated antigen 5) encodes a mitotic spindle–associated protein and has been shown to be required for regulation of mitotic spindles and for recruitment of CDK5RAP2, the product of a known microcephaly gene, to the centrosome during mitosis.22 In a patient (ID81652) with PM, mild speech delay, and short stature, we found an unreported de novo frameshift variant c.1223_1224insAC, p.(Lys409Profs*19) in SPAG5 (NM_006461.3) and, by a second allele search, a maternally inherited synonymous variant c.3189C>T, p.(Gly1063Gly) with extremely low MAF (Fig. 3a and Table S1). Sequencing of mRNA from the patient’s fibroblasts showed a deletion of 11 exonic bp resulting in a predicted premature stop codon (r.3189_3198del, p.[Gly1064Glufs*3]) (Fig. 3b). Cycloheximide (CHX) rescue treatment showed that both aberrant alleles were subjected to nonsense-mediated mRNA decay (NMD) with some leakiness of the splicing effect (Fig. 3b). Consistently, qRT-PCR (~75 ± 22%) and immunoblotting (~80 ± 26%) revealed a significantly reduced amount of the wild-type SPAG5 at both mRNA and protein levels (Fig. 3c, d). We also observed a reduced SPAG5 intensity mainly in the centrosomal regions where it normally appears more condensed during prophase to telophase (Fig. 3e). However, the morphology of the patient’s fibroblasts during different cell cycle phases showed no obvious abnormality in the majority of cells (>95%) (Fig. 3e), with apparently unaffected localization of the SPAG5 interacting partner CDK5RAP2 (Figure S6). Nonetheless, since we observed higher mRNA expression levels of SPAG5 in normal human induced pluripotent stem cell–derived neural progenitor cells (NPCs) compared with fibroblasts and other cell types (Fig. 3f), SPAG5 reduction may only pose deleterious effects on highly proliferative NPCs during embryonic development, which could lead to the clinical manifestations in the patient.

Fig. 3

Functional evaluations of high-level candidate variants in SPAG5 and TEDC1. (a) Determination of the allelic location of the de novo frameshift SPAG5 variant c.1223_1224insAC. A portion of SPAG5 sequence containing the frameshift variant and a nearby single-nucleotide polymorphism (SNP, rs113667723) was analyzed by Sanger sequencing of the patient’s blood DNA, which confirmed that the frameshift SPAG5 variant was located in the paternal allele by a distinct frameshift pattern of three bases around the SNP position. Blue sequence, paternal; pink sequence, maternal; black and underlined, variants. (b) Sanger sequencing of messenger RNA (mRNA) from the patient’s fibroblasts (ID81652) showed a reduced amount of an aberrantly spliced transcript (due to the synonymous SPAG5 variant c.3189C>T with splice effect), which lacks the last 11 bp of exon 20, resulting in an out-of-frame mutation and a premature stop codon p.(Gly1064Glufs*3). In the magnified electropherogram of CHX, asterisk indicates rescued frameshift allele (nucleotide C in blue), leaky splice-site variant allele (nucleotide T in red), and rescued aberrantly spliced allele (nucleotide G in black). This means that the frameshift allele and the aberrantly spliced allele were rescued upon CHX treatment. CHX cycloheximide, DMSO dimethyl sulfoxide, WT wild type. (c) Quantitative reverse transcription polymerase chain reaction (qRT-PCR) showed significantly reduced SPAG5 mRNA levels (~75%) in the patient’s fibroblasts (untreated and vehicle DMSO, p < 0.05, Welch t test), which were rescued upon treatment with CHX. The experiment was done in triplicate. (d) Immunoblotting against the C-terminus of SPAG5, detecting the two SPAG5 isoforms (full-length and short) and β-actin on protein extracts showed a significant reduction (~80%) of SPAG5 protein in the patient’s fibroblasts (ID81652) (p < 0.05, Welch t test).
Note that the short isoform lacks a small portion of the N-terminus, the function of which has not yet been characterized. The experiment was done in triplicate. (e) Immunostaining against SPAG5, PCNT, and α-Tubulin shows a reduced SPAG5 intensity mainly in the centrosomal regions where it is more condensed in the control during prophase to telophase. However, the morphology of the patient’s fibroblasts shows no obvious abnormality in the majority of cells (>95%). The nuclei were visualized by DAPI staining (in blue). The scale bar represents 10 μm. (f) RT-PCR showed higher expression levels of SPAG5 in normal human induced pluripotent stem cell–derived neural progenitor cells (NPCs) compared with fibroblasts and other cell types including testis (positive control), heart (negative control), HeLa cell line (highly proliferative control), and NPC-derived neuronal culture at 3 (NC3wks) or 5 (NC5wks) weeks. (g) Sanger sequencing of mRNA from the patient’s fibroblasts (ID68629) showed a reduced amount of an aberrantly spliced transcript (due to the noncanonical splice-site TEDC1 variant c.227-5C>G that increases the activity of the cryptic splice acceptor), which lacks the first 41 bp of exon 3, resulting in an out-of-frame mutation and a premature stop codon p.(Glu76Glyfs*11). The levels of the aberrant transcript were rescued upon CHX treatment, indicating that the aberrant transcript was subjected to nonsense-mediated decay (NMD) (see also Figure S7). On the other hand, the sequencing of the other TEDC1 variant c.1111del, which is located in the last exon, did not show a reduced amount of the aberrant transcript. Nevertheless, this variant leads to a frameshift and premature stop codon p.(Ala371Glnfs*12) that removes the last 50 amino acids, likely leading to a deleterious effect on the function of the TEDC1 protein, which remains to be characterized. Bar graphs show the mean ± SEM.

TEDC1 (tubulin epsilon and delta complex 1), previously known as C14ORF80, has been shown to be required for centriole stability.23 In a patient (ID68629, Fig. 2g, h and Table 2) with PM, primordial dwarfism, and moderate global DD, we identified a noncanonical splice variant c.227-5C>G (intron 2) in trans with a frameshift variant c.1111del, p.(Ala371Glnfs*12) (last exon) in TEDC1 (NM_001134875.1) (Table 2 and Fig. 3g). Sequencing of mRNA from the patient’s fibroblasts showed a deletion of the first 41 bp of exon 3 (r.227_267del) predicted to result in a truncated protein (p.[Glu76Glyfs*11]) and CHX rescue treatment confirmed NMD of the aberrantly spliced transcript (Fig. 3g and S7). The other variant was not affected by NMD, but likely results in a C-terminally truncated protein (Fig. 3g and S7).
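Whether an aberrantly spliced transcript is expected to shift the reading frame can be read off the RNA-level deletion span: a deletion whose length is not a multiple of three is out of frame. The following minimal sketch (our own helper names, handling only simple `r.START_ENDdel` descriptions with exonic numbering, not the full HGVS grammar) applies this to two deletions reported in this study.

```python
import re

# Matches only simple RNA-level range deletions such as "r.227_267del".
_RANGE_DEL = re.compile(r"^r\.(\d+)_(\d+)del$")


def deletion_length(hgvs_r: str) -> int:
    """Number of nucleotides removed by a simple r.START_ENDdel description."""
    match = _RANGE_DEL.match(hgvs_r)
    if match is None:
        raise ValueError(f"unsupported description: {hgvs_r!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError("end position precedes start position")
    return end - start + 1  # positions are inclusive


def shifts_frame(n_deleted: int) -> bool:
    """True when the deletion length is not a multiple of three."""
    return n_deleted % 3 != 0


tedc1_len = deletion_length("r.227_267del")   # TEDC1, start of exon 3
dyrk1a_len = deletion_length("r.665_951del")  # DYRK1A, exon 6
print(tedc1_len, shifts_frame(tedc1_len))
print(dyrk1a_len, shifts_frame(dyrk1a_len))
```

Both spans are out of frame, in line with the frameshift protein consequences given for these transcripts.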

Our other high-level candidate variants, which were identified in three patients with PM and mild to severe DD, affected ZNRF3 (patient 60361, Fig. 2i, j), a negative regulator of Wnt signaling;24 VPS26A, a mediator of Wnt transport;25 and DDX1 (patient 74091, Fig. 2k, l), a DEAD box RNA helicase, respectively (Table 2). Structural modeling for these missense variants predicts a variety of adverse consequences, including loss of binding affinity to the interacting protein R-spondin for ZNRF3, loss of the ability to form a water-mediated interaction with neighboring residues for VPS26A, and steric clashes with adjacent residues for DDX1, all likely affecting the protein domain stability and therefore probably contributing to the patients’ clinical presentation (Table 2 and Figure S8). Notably, via GeneMatcher,26 we found an additional patient with unreported biallelic variants (c.133-8T>C, p.[?]; c.839C>G, p.[Thr280Arg]) affecting DDX1. The effects of the splice-site variant remain unknown because no other sample from this patient was available. However, our predictions based on the UniProt and PhosphoSitePlus databases suggest that the Thr280Arg change may cause the loss of a phosphorylation site and also interfere with posttranslational modifications of the adjacent residue Lys281, which likely affects the regulation of DDX1 interaction and/or degradation. Moreover, both patients with the recessive DDX1 variants presented with comparable neurological features including severe global DD, spastic quadriparesis, abnormal sleeping pattern, and abnormal movements/seizures, providing additional support for their pathogenicity. Nonetheless, severe microcephaly was only present in the first patient (ID74091), probably due to the contribution of possible other recessive variants in his multiple large runs of homozygosity (Table S1).

Candidate and suspected candidate genes

We found a total of 22 candidate and 26 suspected candidate genes in our cohort. Of these genes, 9 candidate (RNF113A, CEP350, SIK2, RFX7, C2CD5, KIF23, IRS2, UNC13A, PRTG) and 5 suspected candidate (NMI, LARP4B, SEC14L5, PHB2, RAB40AL) genes were identified in 11 (17.7%) patients without P/LP, VUS, or high-level candidate variants.

DISCUSSION

We have elucidated the phenotypic spectrum and genetic landscape including novel findings in PM and SM by detailed clinical assessment and combined CMA and ES of 62 unselected microcephalic patients.

In our cohort, we confirm previous findings2,7 of features commonly associated with microcephaly, including DD/ID, abnormal cerebral MRI, seizures, and short stature, but in addition frequently found movement disorders and behavioral problems. With reference to our total cohort, we corroborate previous studies showing no correlation between the degree of microcephaly and developmental performance.27,28 However, when stratifying patients by PM and SM, we show here for the first time such a correlation among patients with PM. This implies that prenatal onset of OFC deceleration may have a stronger adverse effect on the developmental outcome. Nevertheless, our evidence of the correlation between abnormal cerebral MRI and the severity of DD/ID substantiates a previous observation of abnormal brain scans as a better reflection of developmental performance in microcephalic patients.29 Interestingly, Shaheen et al.7 observed two patterns of head growth in congenital microcephaly, with severe and progressive microcephaly (pattern A) in the majority of their patients, and largely stable microcephaly (pattern B) in some patients. However, we observed only pattern A in PM, which might be explained by different sets of genes identified or different time points of OFC measurement. This may implicate postnatal functions of the affected genes in addition to their prenatal roles in proliferation of neural progenitor cells.

Etiologically, we identified P/LP variants in almost half of the cohort (~48%), accounting for a diagnostic yield that is within the higher range achieved by NGS studies on NDDs,9,10,11,12,13,14 but is more than three times that of the previous study evaluating 680 microcephalic children (15%) using non-NGS methods,2 further supporting the effectiveness of ES for routine diagnostic testing. In addition, we have identified VUS and candidate variants in ~31% of the patients. Therefore, our diagnostic yield will likely increase over time as further supporting evidence for the affected genes becomes available.

We also highlighted the importance of evaluating relevant noncanonical splice-site variants through our examples of a synonymous exonic variant in SPAG5, and a –4 intronic variant in DYRK1A, both of which caused aberrant splicing and subsequent NMD. Therefore, it is crucial to investigate such variants, and to validate those with benign in silico predictions that might be false negative due to the complexity of splicing control.

Previously, inborn errors of metabolism including mitochondriopathies have been identified in 3% of microcephalic patients.2 However, the specific percentage of molecularly diagnosed mitochondrial disorders in microcephalic patients has not been reported so far. Our identification of LP variants in mitochondrial and mitochondria-related nuclear genes in ~5% of the patients highlights the significance of mitochondrial disorders even in PM where mitochondriopathies may have been underdiagnosed. Notwithstanding, due to the mitochondrial heteroplasmy and highly variable coverage of mitochondrial genes in ES data (Fig. 1a),30 a targeted assessment of the mitochondrial DNA should be considered.

Despite comparable diagnostic yields in PM (~44%) and SM (~47%) in our cohort, the two groups differ in their predominant modes of inheritance and types of causative variants. Our observation of predominantly recessive inheritance and biallelic LGD variants in PM patients suggests that complete protein absence may represent the most common cause of PM, which is in line with the findings in MCPH genes.6 In contrast, the dominant de novo LGD or assumed loss-of-function (LoF) missense variants that we frequently observed in SM patients suggest haploinsufficiency as a frequent pathomechanism in SM. This difference in inheritance pattern is not explained by a consanguinity bias in diagnosed PM patients, since only 1 of 16 diagnosed PM patients is the offspring of consanguineous parents. Consistent with previous studies,4,5 the disease-causing genes identified in our cohort encode proteins of various pathways, among which transcriptional regulation and DNA damage response are the most frequent in both PM and SM. However, centrosome-associated pathways are exclusively implicated in PM with autosomal recessive inheritance, which highlights their crucial function in cell division during neurogenesis.4,5 Notably, we observed progression of microcephaly not only in SM patients but also in all our PM patients, which implicates postnatal defects in neural maintenance and synaptogenesis in both microcephaly subgroups.

Among the undiagnosed patients, we identified five high-level candidate genes, all in patients with PM. Of these five genes, two (SPAG5 and TEDC1) encode centrosomal proteins, two (ZNRF3 and VPS26A) encode Wnt signaling-related proteins,24,25 and one (DDX1) encodes an RNA trafficking protein.31 In addition to the known centrosomal functions in regulating neuronal progenitor proliferation,32 Wnt signaling has been shown to be essential for the transition between symmetrical and nonsymmetrical cell division in human neural stem cells,33 and RNA trafficking has been shown to be involved in the translational control of proteins that regulate the balance between maintenance and differentiation of radial glial progenitors, and thereby development of the embryonic cortex.31 Compromised function of these proteins may therefore lead to defects in neurogenesis and hence to primary microcephaly. However, we suggest considering all our candidate genes for NDD in general, because of the variable presentation of microcephaly in non-MCPH patients.34,35 This variability has recently been demonstrated for FBXO11-related NDD, in which fewer than 25% of the patients presented with microcephaly.35

Clinical variability is often observed in NDDs, even in those with established causative genes, and has in some instances been attributed to additional genetic factors.36 In our cohort, we identified additional genetic hits or a perinatal event likely contributing to the severity of ID or the presence of microcephaly in three patients. However, an individualized explanation for all variable NDD presentations will require a comprehensive understanding of an individual’s genetic as well as epigenetic status.

In conclusion, we showed that microcephaly is highly heterogeneous both phenotypically and genetically. By combining high-resolution CNV and ES analyses, we achieved a diagnostic yield of ~48% and, in addition, proposed five novel NDD/microcephaly candidate genes with supporting evidence. We also shed light on distinct as well as common characteristics of the two microcephaly subclasses, PM and SM, which may aid patient management and improve understanding of the underlying pathways involved in human brain development.