Introduction

Congenital microcephaly (CM) is defined by an occipital frontal circumference (OFC) that is >2 SD below the mean at birth, although some suggest 3 SD as the cutoff value.1,2,3 In the United States, it is estimated that this birth defect affects 2–12/10,000 live births.4 The small brain size usually carries poor neurodevelopmental prognosis. The etiology of CM is highly heterogeneous.2 Mendelian forms of CM are particularly interesting because they are amenable to human genetics approaches and can unravel the molecular underpinning of early brain development.

Primary microcephaly (PM) is a severe form of isolated CM that was defined on the basis of lack of syndromic features or major brain malformations. Many of the underlying genes were mapped using linkage analysis by exploiting their fully penetrant recessive inheritance pattern. However, recent advances in sequencing technology have greatly accelerated their discovery and there are currently 18 PM genes listed in OMIM. Early ventricular zone neural progenitors are stem-like cells that divide symmetrically to expand the pool of progenitor cells but also asymmetrically to allow neurons to differentiate and migrate to form the multilayered cortex in a highly orchestrated manner. Most of the PM genes identified to date encode proteins that appear to play a role in orienting the mitotic spindle or otherwise control cell division.5,6 Another aspect of PM pathogenesis that has also become apparent through the identification of novel PM genes is DNA damage repair, although the exact link between this fundamental and ubiquitous cellular process and PM remains poorly understood.2

The expanding list of biological processes that are perturbed by PM-related pathogenic variants clearly shows that multiple mechanisms can lead to isolated CM. Furthermore, the finding that several genes cause isolated CM though not classified as PM-related genes suggests a potential overlap that can be exploited to further our molecular insight into brain development. To our knowledge, however, there has not been any systematic phenotypic analysis of molecularly confirmed CM phenotypes to draw patterns that inform the clinical classification of genes that are critical for neurogenesis. In this study, we exploit the large referral base to our program to conduct such systematic clinical and molecular analysis, which sheds light on the genetic control of early brain development in humans.

Materials and methods

Human subjects

Individuals with OFC at birth of ≤−2 SD (i.e., at least 2 SD below the mean) or those reported by history to have microcephaly at birth were eligible for the study. The SDs were calculated based on standard World Health Organization (WHO) charts taking into consideration both gender and gestational age. Written informed consent was signed by the parents (or legal guardians) of all subjects prior to enrollment (KFSHRC RAC 2080 006 and 2121 053). Blood was collected in EDTA tubes from the index and available family members, and in sodium heparin tubes for the creation of lymphoblastoid cell lines (LCLs) when needed.
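The eligibility rule above can be sketched as a simple SD-score (z-score) calculation. This is a minimal illustration, not the study's actual procedure: it assumes the reference mean and SD for the matching sex and gestational age have already been looked up from a growth chart, and the function names and the numbers in the usage note are hypothetical placeholders rather than WHO chart values.

```python
def ofc_sd_score(ofc_cm: float, ref_mean_cm: float, ref_sd_cm: float) -> float:
    """Return how many SDs the measured OFC lies from the reference mean.

    ref_mean_cm and ref_sd_cm are assumed to come from a growth chart
    matched for sex and gestational age (lookup not shown here).
    """
    if ref_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (ofc_cm - ref_mean_cm) / ref_sd_cm


def meets_cm_cutoff(sd_score: float, cutoff: float = -2.0) -> bool:
    """True when the OFC is at least |cutoff| SDs below the mean."""
    return sd_score <= cutoff
```

With placeholder reference values of 34.0 cm (mean) and 1.0 cm (SD), an OFC of 31.0 cm gives an SD score of −3.0 and would meet the −2 SD cutoff.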

Additionally, we include two previously unpublished cases, each with a biallelic pathogenic variant in ANKLE2 identified by exome sequencing in the research setting. Informed consent was obtained for their research participation and inclusion in this study with approval from their respective institutions (Boston Children’s Hospital, Boston, MA and Institute of Human Genetics, University of California–San Francisco, San Francisco, CA).

Autozygome analysis and targeted and exome sequencing

The likely causal variants were identified using either targeted exome (“Mendeliome assay”) or exome sequencing as described previously.7 For the sporadic cases and those born to nonconsanguineous parents, all the rare (minor allele frequency [MAF] <0.001) heterozygous and homozygous variants were investigated for candidacy. For familial consanguineous cases, DNA samples from affected and available unaffected family members were genotyped using the Axiom single-nucleotide polymorphism (SNP) chip array following the manufacturer’s instructions. This was followed by genomewide autozygome analysis with AutoSNPa, using runs of homozygosity >2 Mb as surrogates of autozygosity, and variants therein were prioritized for candidacy. Rare variants on the X chromosome were also investigated in male cases.
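The prioritization logic for consanguineous families can be sketched as follows. This is an illustrative sketch only: the `Variant` record, the function names, and the requirement that prioritized variants be homozygous are assumptions made for the example, and the code does not reproduce AutoSNPa or the authors' pipeline. Only the two thresholds (MAF <0.001 and runs of homozygosity >2 Mb) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAF_CUTOFF = 0.001       # rare-variant threshold stated in the text
ROH_MIN_BP = 2_000_000   # runs of homozygosity must exceed 2 Mb

# A run of homozygosity as (chromosome, start, end); hypothetical layout.
Roh = Tuple[str, int, int]


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    maf: float
    zygosity: str  # "hom" or "het"


def is_rare(v: Variant) -> bool:
    return v.maf < MAF_CUTOFF


def in_autozygome(v: Variant, rohs: List[Roh]) -> bool:
    """True if the variant lies inside any ROH longer than 2 Mb."""
    return any(
        chrom == v.chrom and start <= v.pos <= end and (end - start) > ROH_MIN_BP
        for chrom, start, end in rohs
    )


def prioritize(variants: List[Variant], rohs: List[Roh]) -> List[Variant]:
    """Keep rare homozygous variants within the autozygome (assumed rule)."""
    return [
        v for v in variants
        if is_rare(v) and v.zygosity == "hom" and in_autozygome(v, rohs)
    ]
```

For sporadic or nonconsanguineous cases, the text indicates that the ROH step does not apply and that both heterozygous and homozygous rare variants are considered, which would correspond to using `is_rare` alone.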

Computational structural analysis of mutants

See Supplemental material.

Results

Marked genetic heterogeneity of CM

Our Mendelian Genomics Program (MGP) serves as a key referral center for a wide range of birth defects, including CM. In this study, CM was defined as an OFC at least 2 SD below the mean at birth, an OFC measurement below 35 cm within the first 3 months, or a record of microcephaly at birth. For the period 2008–2017, we enrolled 137 families with affected children who met this definition. A potentially causal variant was identified in 104, and these were considered eligible for inclusion in this study, which specifically examines the phenotype of molecularly characterized CM cases.

The variants identified in this cohort fall in four categories.

Category 1: variants in MCPH genes

According to OMIM, 18 genes are classified as microcephaly primary hereditary (MCPH) genes. Pathogenic variants were encountered in ten of these genes, including a common founder pathogenic variant in ASPM, and these accounted for 25/104 (24%) of our cohort (Fig. 1, Table S1, S2, and S3). Case 16DG0186 was found to harbor a previously reported homozygous intronic pathogenic variant in CDK5RAP2:NM_001011649.2:c.4005-15A>G confirmed to cause aberrant splicing by reverse transcription polymerase chain reaction (RT-PCR).8 Unlike the previously reported cases with this variant, our case had evidence of primordial dwarfism (PD) in addition to microcephaly. Interestingly, this case was also found to have a homozygous nonsense variant in CNTRL, a gene approximately 508 kb downstream of CDK5RAP2 and known to encode a centrosomal protein that has not yet been linked to human disease. Both genes are located within the same region of homozygosity (ROH) of ≈17 Mb. The truncating nature of the variant identified in CNTRL and the involvement of CNTRL in centrosome function suggest that this variant may contribute, together with the variant in CDK5RAP2, to the disease in our patient.

Fig. 1

Pie chart showing the grouping and distribution of the variants identified in this cohort into four categories: variants in MCPH genes, variants in genes with established disease phenotypes in humans, variants in genes reported previously as candidate genes, and variants in genes with no established disease phenotypes in humans. MCPH, microcephaly primary hereditary.

Category 2: variants in genes with established disease phenotypes in humans other than MCPH

This category accounts for 60% of our cohort (Fig. 1). The majority involved genes where CM was in the published phenotype and these include RNU4ATAC, PCNT, AARS, BRCA2, PLK4, XRCC4, DDX11, RTTN, ASNS, NDE1, PQBP1, TSEN15, DONSON, PHGDH, PSAT1, TUBA1A, OCLN, PNKP, KATNB1, EP300, IGF1, CRIPT, SLC25A19, CTSD, BLM, INO80, ERCC4, FOXG1, NSUN2, and VRK1. There is one report of a prenatally diagnosed microcephaly in the setting of a VRK1 pathogenic variant.9 Two families in our cohort were found to have a pathogenic variant in VRK1 with extremely progressive microcephaly of −4.3 SD at age 2 months in one family and −11.1 SD at age 11 years in the second one (Fig. 2). Although ERCC4 pathogenic variants typically cause postnatal microcephaly, our data corroborate those of Kashiyama et al., who reported the only patient with documented ERCC4-related prenatal onset microcephaly.10 Furthermore, microcephaly is a common feature of the NSUN2-related phenotype, but congenital microcephaly was reported only once (Martinez FJ, 2012), and we confirm here that congenital microcephaly can be part of the phenotype. Reassuringly, the contribution of genes linked only to postnatal and not congenital microcephaly in our cohort was minimal and was limited to RARS, SBF1, and ALDH6A1. To our knowledge, only five missense pathogenic variants have been reported in ALDH6A1 as disease-causing variants.11,12,13 Thus, the truncating nature of the variant in ALDH6A1 identified in our patient may have allowed us to observe the broader spectrum of ALDH6A1-related microcephaly. RARS-related microcephaly was only reported once in a 2-month-old patient who was compound heterozygous for a start-loss and a missense pathogenic variant (patient 4 in Wolf et al.),14 but we show here that RARS pathogenic variants can also cause congenital microcephaly. 
Similarly, we previously reported the only family with SBF1-related microcephaly caused by a missense variant but had no records of the birth OFC.15,16 The patient we describe here with a homozygous, presumably more severe pathogenic variant clearly has a congenital microcephaly phenotype.

Fig. 2

Graph plot showing the available OFC (upper panel) and height/length (lower panel) data points at birth and on the last clinical evaluation that we collected for the cohort. Red color: Cases with variants in microcephaly primary hereditary (MCPH) genes. Green: Cases with variants in genes with established disease phenotypes in humans with congenital microcephaly (CM). Orange: Cases with variants in genes reported previously as candidate genes. Blue: Cases with variants in genes with no established disease phenotypes in humans. OFC, occipital frontal circumference.

Category 3: variants in genes reported previously as candidate genes

These genes include those with only tentative links based on single families. One particularly interesting gene in this category is ANKLE2, which was reported in a single Mexican family comprising two affected children who were compound heterozygous for a missense and nonsense allele.17 In our cohort, we identified one patient with a homozygous likely deleterious variant (Table S1). Through an international collaboration, we were able to identify two further families who share a homozygous likely deleterious variant (Table S3). These families strongly support the designation of ANKLE2 as MCPH16. Another interesting finding is the identification of a likely deleterious homozygous variant in YARS, a gene only known to cause an autosomal dominant form of Charcot–Marie–Tooth disease. This variant in YARS was found in family 89 with two affected siblings (14DG0613 and 14DG0614) born to consanguineous parents (Table S1, Fig. 2). Nowaczyk et al. recently reported two siblings who presented with a multisystem disease and both were found to harbor biallelic likely deleterious variants within the YARS gene.18 Thus, our homozygous variant likely confirms YARS as a bona fide congenital microcephaly gene in the recessive form and adds to the list of genes with markedly distinct dominant and recessive phenotypes.19,20 The third candidate gene that our results appear to confirm is THG1L. The index (14DG0824) in family 91 was born with intrauterine growth restriction (IUGR) and diffuse cerebral and cerebellar atrophy (Table S3). Exome sequencing revealed a novel missense variant in THG1L that is predicted to be pathogenic (it likely destabilizes a short helix resulting in lack of anchorage to the core of the protein, see Supplemental Computational Biology Material). 
The phenotype is similar to that reported recently by Edvardson et al.,21 which adds to the growing list of human neurodevelopmental diseases linked to abnormal transfer RNA (tRNA) modification.22,23,24 We also highlight FRMD4A, as the fourth previously reported candidate (one presumably pathogenic variant was reported by Fine et al.25) that we confirm in this cohort based on case 15DG0781 (Tables S2 and S3).

Category 4: variants in genes with no established disease phenotypes in humans

These are genes we propose as novel candidates for congenital microcephaly in humans and include BPTF, MAP1B, CCNH, and PPFIBP1. The justification for their listing as candidates is provided in Supplemental Computational Biology Material.

Distinct patterns of OFC in CM patients

We collected extensive data points of OFC for our cohort and this allowed us to observe two distinct patterns (Fig. 2). Pattern A is characterized by progressive and severe microcephaly (more than 5 SD below the mean), whereas pattern B shows a much milder form of microcephaly that remains largely stable with age. As expected, all patients with MCPH pathogenic variants belonged to pattern A. However, we note that pathogenic variants in the following genes also followed a similar pattern: RNU4ATAC, PCNT, XRCC4, PLK4, RTTN, DDX11, BRCA2, AARS, TUBA1A, NDE1, KATNB1, IGF1, OCLN, EP300, PQBP1, ASNS, DONSON, TSEN15, PNKP, FOXG1, VRK1, and SBF1.
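One way to make the distinction between the two patterns concrete is sketched below. The text describes the patterns qualitatively, so the operational rule used here (pattern A requires that the last OFC SD score is both lower than the birth score and at or below −5 SD) is an assumption for illustration, as are the function name and the example values in the usage note.

```python
def ofc_pattern(birth_sd: float, last_sd: float, severe_cutoff: float = -5.0) -> str:
    """Assign an OFC trajectory to pattern "A" or "B".

    Assumed rule: pattern A = progressive (last score lower than birth score)
    and severe (last score at or below severe_cutoff); everything else is B.
    """
    progressive = last_sd < birth_sd
    severe = last_sd <= severe_cutoff
    return "A" if progressive and severe else "B"
```

For example, an illustrative trajectory from −3.0 SD at birth to −8.0 SD at last evaluation would be labeled pattern A, whereas one from −2.5 SD to −2.8 SD would be labeled pattern B.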

Phenotypic overlap between CM and microcephalic primordial dwarfism

We first reported the overlap between PM and microcephalic primordial dwarfism (MOPD) when we showed that CENPJ, a strictly PM gene, can also underlie MOPD.26 Similarly, our cohort revealed that CEP135, a gene only reported in the context of severe PM, can be mutated in MOPD (11DG0561, see Table S2 and S3).27,28 This genetic overlap prompted us to further explore the role of CM genes in somatic growth. Therefore, we plotted the height/length data points (Fig. 2) that we collected for our cohort and noticed an interesting pattern of short stature (more than 3 SD below the mean, not necessarily congenital) in 11/21 (52%). Further analysis of this trend revealed that it is strongest in PM genes as well as most of the genes that follow pattern A of OFC growth, e.g., ASNS.

Seeking a definition of primary microcephaly

We exploited the extensive phenotypic (including imaging) and molecular data available in our cohort to test the commonly used definition of primary microcephaly, i.e., congenital nonsyndromic microcephaly that progresses with age with largely normal brain magnetic resonance imaging (MRI). Because there is no consensus on the different elements of this definition, we opted to test different combinations of the following cutoff values:

1. Congenital microcephaly: the cutoff value of OFC at birth that we used was more than 2 SD below the mean, but even a more restrictive cutoff value of more than 2.5 SD below the mean would have been met by all our molecularly confirmed MCPH cases.

2. Progressive microcephaly: although we did not have long-term OFC data on all the cases, our cohort suggests that all MCPH cases have an OFC more than 5 SD below the mean after 6 months of age.

3. Nonsyndromic: defining syndromic association can be very problematic because we have shown a strong overlap with primordial dwarfism. In addition, microcephaly-associated facial dysmorphism is common, as well as epilepsy (16/25 [64%]). Therefore, we opted to define “syndromic association” as evidence of a major birth defect in a non–central nervous system organ other than short stature. Such definition will again be met by all of our molecularly confirmed MCPH cases.

4. Lack of major brain malformation: our analysis shows that this is perhaps the most difficult aspect of the definition because we found abnormal neuronal migration (lissencephaly/pachygyria and variations thereof; 11/22) and, to a lesser extent, abnormal postmigrational development (3/22) to be common, with 64% of the molecularly confirmed MCPH cases having at least one of these defects (Table S2 and Fig. 3). These accompanying major defects were particularly common in those with pathogenic variants in ASPM, WDR62, CEP152, MCPH1, CIT, CENPJ, and STIL. Because the indiscriminate use of this criterion will greatly reduce the sensitivity of the common primary microcephaly definition, we suggest that the presence of lissencephaly and pachygyria should be tolerated. Accordingly, we think that RTTN and NDE1 should be classified as primary microcephaly genes. On the other hand, we note the conspicuous absence of “atrophy” in MCPH cases. Indeed, replacing “lack of major brain defects” with “lack of atrophy” would yield a criterion met by virtually all of our MCPH cases.
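The four elements above can be combined into a single check, sketched here under stated assumptions. The `Case` record, its field names, and the choice to treat missing follow-up OFC data as non-disqualifying are hypothetical; the cutoffs (−2 SD at birth, −5 SD after 6 months), the exclusion of short stature from "syndromic association", and the substitution of "lack of atrophy" for "lack of major brain defects" follow the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    birth_ofc_sd: float              # OFC SD score at birth
    later_ofc_sd: Optional[float]    # OFC SD score after 6 months; None if unavailable
    non_cns_major_defect: bool       # major non-CNS birth defect other than short stature
    brain_atrophy: bool              # atrophy on brain imaging


def meets_revised_pm_definition(
    case: Case, birth_cutoff: float = -2.0, later_cutoff: float = -5.0
) -> bool:
    """Check the four criteria as revised in the text (illustrative only)."""
    if case.birth_ofc_sd > birth_cutoff:          # 1. congenital
        return False
    if case.later_ofc_sd is not None and case.later_ofc_sd > later_cutoff:
        return False                              # 2. progressive (if data exist)
    if case.non_cns_major_defect:                 # 3. nonsyndromic
        return False
    return not case.brain_atrophy                 # 4. lack of atrophy
```

Passing `birth_cutoff=-2.5` applies the more restrictive birth cutoff mentioned in the first criterion.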

Fig. 3

Representative magnetic resonance images (MRI) for the cases with pathogenic variant in microcephaly primary hereditary (MCPH) genes and other genes highlighted in this study. (a) Case 13DG0605 with pathogenic variant in ASPM showing pachygyria. (b) Case 15DG1001 with pathogenic variant in MFSD2A showing hydrocephaly. (c) Case 17DG0679 with a pathogenic variant in STIL showing the partial agenesis of the corpus callosum and pachygyria. (d, e) Case 17DG0680 with a pathogenic variant in CEP152 showing polymicrogyria, severe callosal hypogenesis with interhemispheric cyst at left aspect of the falx compressing the left cerebral hemisphere continuous with the third ventricle, and mild dilatation of the lateral ventricles. (f) Case 15DG0077 with a pathogenic variant in BRCA2 showing hypoplastic corpus callosum. (g) Case 13DG0152 with a pathogenic variant in DDX11 showing hypoplastic corpus callosum. (h) Case 12DG1528 with a pathogenic variant in SPDL1 showing virtually no brain in computed tomography (CT) scan. (i) Case with a pathogenic variant in ANKLE2 showing sloping of the forehead along with simplified gyration, partial agenesis of the corpus callosum, and hypoplastic cerebellum.

Discussion

Congenital microcephaly is a birth defect that offers a window into early human brain development, and deciphering its genetics has greatly added to our understanding of the molecular mechanisms that control this developmental process. Despite the strong interest in this line of research, prior studies have focused on individual novel disease gene identification. When approached as a collective entity, studies have typically only focused on primary microcephaly as a subgroup.29,30,31 Although our study with its unbiased inclusion of all forms of microcephaly supports a distinct clinical and molecular profile of primary microcephaly, we note that a strict definition that requires lack of syndromic features and of associated brain malformations has important limitations. First, we note that several genes not classified as primary microcephaly genes can result in a very similar profile, namely RTTN and NDE1. Second, we note the remarkable phenotypic and molecular overlap between primary microcephaly and microcephalic primordial dwarfism, which suggests that the definition should not exclude primordial dwarfism as an acceptable syndromic association. Third, major brain malformations are not uncommon in our molecularly confirmed MCPH cases. Thus, while we support the continued use of the term “primary microcephaly,” it should be emphasized that it has limited sensitivity and specificity.

The contemporary expanded use of genomic tests makes it less critical to narrow the differential diagnosis prior to ordering gene sequencing. Nonetheless, knowledge of which genes can result in congenital microcephaly can be very helpful in variant interpretation. For example, our demonstration that certain disease genes not previously linked to congenital microcephaly can result in this phenotype makes it easier to consider variants in these genes as candidate causal variants in this context. More importantly, the depletion we show in our cohort for genes linked to postnatal microcephaly strongly supports the notion that these two classes of microcephaly are etiologically distinct, which is also important in variant interpretation.

We have previously shown that a gene strictly linked to primary microcephaly (CENPJ) can also cause microcephalic primordial dwarfism, a phenomenon that was later observed for CEP152. Here, we expand the overlap between these two disease entities by showing that pathogenic variants in CEP135, which has only been linked to microcephaly 8, primary, can result in microcephalic primordial dwarfism. Together with the overlapping ontology of the genes for the two conditions and the identical pattern of postnatal head growth, we conclude that primary microcephaly and microcephalic primordial dwarfism are a phenotypic continuum although the genotype/phenotype correlation remains unclear at this point.

Given the recognized extreme complexity of brain development, it is perhaps not surprising that our analysis of genes associated with congenital microcephaly revealed a remarkable diversity of biological functions. Nonetheless, the rather unique phenotype associated with the subset of congenital microcephaly genes that cause MCPH appears to correlate with a rather restricted diversity. Interestingly, our results suggest that genes not listed as MCPH, but which we argue meet the definition (see above), e.g., RTTN and NDE1, encode centriolar proteins and have a remarkably similar gene ontology, which lends further support to their classification as MCPH genes.

The novel candidate genes that we propose in this study deserve a special mention. BPTF encodes bromodomain PHD finger transcription factor, which functions in protein modeling and resembles the largest subunit of NURF (nucleosome remodeling factor) in mammals. Knockout of the orthologous gene in zebrafish (bptf) using CRISPR-Cas9 genome editing causes a reduction in the head size of the mutant larvae compared to controls.32 Furthermore, knockout mice for Bptf display embryonic lethality with failure to establish a functional distal visceral endoderm.33 Recently, pathogenic variants in BPTF were found to cause an autosomal dominant neurodevelopmental disorder with postnatal microcephaly and dysmorphic features.32 It remains to be seen, through the identification of future patients, whether recessive BPTF variants are a bona fide cause of autosomal recessive CM, as we suggest in this study. The available data on the function of MAP1B make it another compelling CM candidate gene. MAP1B is a neuron-specific microtubule-associated protein that is expressed in different portions of the brain and has a role in brain development. Heterozygous knockout of MAP1B in mouse leads to abnormalities in the shape and size of the cerebellum while homozygotes are embryonic lethal.34 The mouse model therefore seems compatible with MAP1B-related recessive CM as we suggest here. 
The third gene that we are proposing in this study as a novel candidate gene for CM, CCNH, encodes a cdk-activating kinase involved in the regulation of the cell cycle by activating the phosphorylation of several cyclin-dependent kinases.35 There is no available model for deficiency of the mouse ortholog; however, a study in zebrafish involving a dominant-negative form of cyclin H showed a delay in the onset of zygotic transcription in the early embryo, resulting in apoptosis at 5 h postfertilization and later abnormal body shape, eyes, and brain.36 Finally, we note that PPFIBP1 (liprin-β-1) belongs to the liprin family of proteins, which form complex structures that serve as scaffolds for the recruitment of other proteins.37 Although liprin-α has an established role in synapse morphogenesis, such a role has not yet been demonstrated for liprin-β.38 However, the truncating nature of the pathogenic variants we identified in PPFIBP1, which predict loss of nearly 70% of the protein including the three main SAM domains (1, 2, and 3), and the segregation of these variants in three affected siblings with a similar phenotype support our proposal that this gene is a candidate gene for syndromic CM.

In conclusion, we present a large cohort of molecularly characterized cases of congenital microcephaly. Our results refine the phenotypic subgrouping of congenital microcephaly, expand its genetic heterogeneity, and provide the rationale for reconsidering some historical boundaries in the phenotypic and molecular classification of this group of disorders.