INTRODUCTION

Knowledge about the genetic architecture of intellectual disability (ID), developmental delay (DD), and autism spectrum disorder (ASD) has increased dramatically over the past decade with the wide application of exome and genome sequencing (ES/GS) methods. As these genetic tools become increasingly available in both clinical diagnostic and research settings, a growing number of children with neurodevelopmental disorders (NDV) are found to have genetic variants that either arise de novo or are inherited as autosomal dominant, X-linked, or, less commonly, autosomal recessive traits. Discovery of more than 1000 genes underlying ID and/or ASD to date has markedly informed the diagnosis for families with ID/ASD and has further led to the identification and characterization of multiple cellular pathways involved in human brain development, behavior, learning, and memory.1,2,3,4 Such gene discovery efforts are important as the developmental roles of many of these pathways would not otherwise have been predicted from in vitro and model organism studies. Herein, we describe the clinical, neuroimaging, and molecular features of 28 individuals with ID and/or ASD due to de novo or inherited variants in the zinc finger protein 292 gene (ZNF292; MIM 616213).

ZNF292 encodes a highly conserved zinc finger protein that acts as a transcription factor. ZNF292 is composed of eight exons, the last of which is the largest and encodes all 16 highly conserved zinc fingers of the predicted 2723-residue protein (canonical transcript in GenBank: NM_015021.2). Three of these zinc fingers (10–12) bind DNA at the growth hormone promoter, where ZNF292 cooperates with POU1F1, a member of the POU family of transcription factors known to activate transcription in somatotrophs.5 Accordingly, the ZNF292 protein was originally described as an enhancer of growth hormone (GH) expression in the pituitary gland of a rat model.5 Its role was further delineated as a tumor suppressor with critical roles in tumor development and progression.6 However, the role of ZNF292 in neurodevelopment is virtually unknown.

MATERIALS AND METHODS

Cohort ascertainment

We identified 28 families with de novo (N = 27) or inherited (N = 1) pathogenic variants in ZNF292 using a combination of trio-based ES (20 families) and multigene panels (8 families), in both clinical diagnostic and research settings. Families were identified across 20 institutions in six countries with data shared via nodes of the MatchMaker Exchange (MME) network, including MyGene2, GeneMatcher, PhenomeCentral, and by querying investigators with large cohorts of patients with ID and/or ASD.7 We identified 12 additional families (15 affected persons total) with variants in ZNF292 that were considered likely pathogenic in a diagnostic setting, or “suspected” pathogenic in a research setting, but for which our confidence in the pathogenicity of these variants was limited due to either incomplete parental testing or the identification of a missense ZNF292 variant of unclear significance. Therefore, we excluded these families to avoid confounding description of both the canonical phenotype and genotype–phenotype relationships (Supplemental Subjects and Methods). We collected and reviewed detailed clinical data including medical records, facial photographs, and magnetic resonance images (MRIs), when available, from affected individuals (summarized in Table 1 and S1). The institutional review board of the University of Washington approved this study. Patient consents were obtained from all individuals for whom identifiable data are presented, including permission to publish photographs.

Table 1 Summary of the clinical features of ZNF292 variant-positive individuals (N = 28)

Molecular methods

Twenty families were tested via exome sequencing (ES) in either a clinical or research setting, and eight families had targeted sequencing of multigene panels. Of persons tested via a targeted multigene panel, five underwent targeted capture of a panel for ID and three underwent testing via a single-molecule molecular inversion probe (smMIP)–based panel of more than 100 genes associated with ID/ASD.8 Targeted and exome sequencing methods are further provided in the Supplementary Methods.

RNA isolation, RT‐PCR analysis, and Sanger sequencing to analyze biallelic expression

We extracted total RNA from blood lymphocytes using the PAXgene Blood System (Becton Dickinson, Franklin Lakes, NJ, USA). First-strand complementary DNA (cDNA) synthesis was performed with the Superscript II Reverse Transcriptase Kit (Invitrogen, Carlsbad, CA, USA). If no exon-spanning primers (see Figs. S7, S8 for sequences) could be designed, we performed DNase digestion of the RNA with the RNase-Free DNase Set (Qiagen, Venlo, Netherlands) prior to cDNA synthesis. All procedures were performed according to the manufacturer’s instructions. The resulting reverse transcription polymerase chain reaction (RT‐PCR) products were bidirectionally Sanger sequenced on an ABI 3730 sequencer (Applied Biosystems, Waltham, MA, USA) using the same primers and standard methods. RT-PCR Sanger traces were compared with DNA Sanger traces for biallelic expression at heterozygous variant positions with the Sequencher 5.1 software package (Gene Codes Corporation, Ann Arbor, MI, USA). Primer sequences used for segregation and RT-PCR are shown in the legends of Figs. S1 and S2.

RESULTS

We identified 23 de novo predicted loss-of-function variants (pLoFs) (nonsense, frameshift, or splice) in 27 families and one transmitted (i.e., inherited from an affected mother) pLoF in one family (18–003) for a total of 24 putatively pathogenic variants in ZNF292 (Table S1, Fig. 1) in 28 families. Two variants were observed in multiple families: c.6160_6161del (p.Glu2054Lysfs*14) found in four unrelated individuals, and c.3066_3069del (p.Glu1022Aspfs*3) found in two unrelated individuals, one of whom was previously published in a series of 96 individuals with NDV by a group of our authors (B.P., C.T.T., Juliane Hoyer, A.R., C.Z.).9 Two individuals in our cohort were also recently reported in a large cohort of individuals with ASD/ID: 17–022 with c.2490_2494dup (p.Ser832Ilefs*28) and 17–023 with c.4417dup (p.Ser1473Phefs*5).8 All ZNF292 variants identified were absent in population controls (gnomAD release v2.1), with the exception of one variant that was present at a very low frequency: c.1360C>T (p.Arg454*) in 1 of 248,786 alleles (mean allele frequency 0.00000402). Combined Annotation Dependent Depletion (CADD) (v1.4)10 scores (Phred-scaled) for the six nonsense variants in our series ranged from 35 to 42 with a median of 38.11 Most variants (22/24) are located in exon 8, the last and largest exon of ZNF292, which encodes a large DNA binding domain of the protein (Fig. 1). The majority of families (20/28, 71%) had pLoFs that were either insertions or deletions. Accordingly, we sought to explore whether local sequence context contributed to regional instability of this gene. At least 7 of these 20 insertion/deletion events appear likely to have been influenced by sequence context, including five events within palindromic repeat sequences either flanking or directly adjacent to the breakpoints and two in which the variant occurred within a mononucleotide repeat sequence (Fig. S3).
This local sequence complexity of ZNF292 may partially explain the high frequency of somatic variants observed in ZNF292 in tumor tissues as well.12,13,14

Fig. 1

Genomic structure and distribution of variants in ZNF292. Most of the identified variants in ZNF292 are truncating (frameshift, nonsense) located within the largest and most terminal exon (8) of the gene that encodes a ZNF292 DNA binding domain. Several of these variants lie within zinc finger regions (depicted in gray) and coiled coil domains (depicted in pink) upstream of the nuclear localization signal (NLS, depicted in black). The complementary DNA (cDNA) panel shows the coding and noncoding regions of the gene (in blue and yellow, respectively). The bottom panel shows the predicted protein domains including the zinc finger (C2H2 type) regions (shown in gray), the coiled coil domain (pink), and the nuclear localization signal (black). ZNF292 variants in the main cohort are shown, color-coded by type with nonsense variants shown in yellow and frameshift variants in green. C-terminal coiled coil regions were calculated using multicoil2 (http://cb.csail.mit.edu/cb/multicoil2/cgi-bin/multicoil2.cgi),16 and NLS regions were mapped using cNLS mapper (http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi).15

Ten rare pLoFs in ZNF292 are present in the “controls only” subset of gnomAD (release v2.1), with six frameshifts and four nonsense variants that are predicted to affect the canonical transcript. However, manual review of many of these pLoFs suggests that they may be false positive calls, consistent with our observation of multiple palindromic sequences in ZNF292 that complicate read alignment. For example, manual review of the BAM files available in the gnomAD browser reveals that two frameshifts, c.2574_2575delTC and c.2576_2577insAG, are observed only once and adjacent to one another, suggesting they represent a single miscalled variant. Another variant that was annotated as nonsense, c.2690C>A, actually consists of two adjacent single-base substitutions and should have been annotated as a missense variant rather than a pLoF. A third variant, c.4592delC, was listed in the gene-overview gnomAD interface but was not actually called in the heterozygous or homozygous state in any individuals (Fig. S4). Overall, only half of these pLoFs (i.e., two frameshifts and three nonsense variants) in gnomAD controls appear to be of high quality. These findings are consistent with other reports, including recent guidance from the gnomAD consortium, on the need for manual curation and review of pLoFs called in gnomAD (unpublished data; Minikel EV, Karczewski KJ, Martin HC, et al. Evaluating potential drug targets through human loss-of-function genetic variation. bioRxiv. 2019:530881).
Furthermore, none of the high quality variants in gnomAD are located between AA1588 and 2649, which correspond to zinc fingers 10–16 plus a putative coiled coil region and final nuclear localization signal, contrasting with variants in our cohort, most of which are within these critical domains.15,16 ZNF292 has a probability of loss of function intolerance (pLI) score of 1.0 suggesting it is highly intolerant to loss-of-function (LoF) variants.17 Assessing the statistical significance of observing 28 families with pLoFs in ZNF292 (23/24 variants being de novo) is challenging due to ascertainment bias as families were collected via matchmaking, rather than sequencing of a single cohort. Accordingly, the true denominator (i.e., number of cases sequenced worldwide and either available via MME or in large ID/DD/ASD cohorts) is unknown.18 This is a longstanding challenge for all studies of rare Mendelian disorders in general in which cases are ascertained and studied via matchmaking. Nevertheless, to estimate the probability of ascertaining 27 families with de novo pLoFs in ZNF292, we tested for enrichment of de novo variants in this gene.19 Specifically, we approximated the number of families with ID/DD/ASD that have been sequenced worldwide and assumed that candidate genes for each family were either published or available for matchmaking via MME. The lower bound (of 100,000 individuals) is threefold larger than the number of families with ID/DD/ASD who have reportedly undergone ES by GeneDx (personal communication, K. Retterer, GeneDx, 4 February 2019) and is the sum of families who underwent clinical sequencing from large diagnostic laboratories in the United States and Europe as well as those sequenced via research studies of large ID/DD/ASD cohorts. 
For an upper bound (300,000 individuals), we assumed that the ~800,000 persons with rare conditions estimated to have been sequenced worldwide likely represent ~400,000 families (~2 exomes per family; unpublished data: Birney E, Vamathevan J, Goodhand P. Genomics in healthcare: GA4GH looks to 2022. bioRxiv. 2017), and the primary indication for ~70% of those families was ID/DD/ASD. This is likely an overestimate and therefore serves as a conservative upper bound. Under these assumptions, the identification of de novo pLoFs in 27 independent families yields a significant enrichment of between 8.4-fold (p = 1.88 × 10⁻¹⁶ if N = 300,000) and 25.3-fold (p = 1.93 × 10⁻²⁸ if N = 100,000) compared with an exome-wide significance cutoff of p < 2.7 × 10⁻⁶ under a Bonferroni adjustment for ~18,500 tests/genes. These data show that ZNF292 variants are likely a recurrent cause of ID and/or ASD. Notably, ZNF292 is not significantly depleted of NMD escape variants.20 To examine this further, we performed RT-PCR on total RNA from two individuals, 17–005 with the c.3066_3069del (p.Glu1022Aspfs*3) variant and 19–011 with the c.1360C>T (p.Arg454*) variant, which showed biallelic expression of the normal and termination codon-containing transcripts, indicating that these transcripts are not degraded by nonsense‐mediated messenger RNA (mRNA) decay (Figs. S1, S2).
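The enrichment calculation can be illustrated with a minimal sketch. It assumes a simple Poisson model for the number of families carrying a de novo pLoF, which may differ from the procedure of the cited method.19 The expected counts are not stated in the text; here they are back-calculated from the reported fold enrichments (27/8.4 and 27/25.3), so the function name `poisson_upper_tail` and the derived per-family rate are illustrative assumptions only.

```python
import math

def poisson_upper_tail(k, lam, extra_terms=200):
    """P(X >= k) for X ~ Poisson(lam).

    The tail is summed directly (in log space per term) rather than as
    1 - CDF, which would lose all precision for probabilities this small.
    """
    return sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k, k + extra_terms)
    )

OBSERVED = 27  # families with de novo pLoFs in ZNF292 (from the text)

# Fold enrichments reported in the text for each assumed cohort size.
REPORTED_FOLD = {300_000: 8.4, 100_000: 25.3}

results = {}
for n_families, fold in REPORTED_FOLD.items():
    # Assumption: expected count is back-calculated as observed / fold.
    expected = OBSERVED / fold
    rate_per_family = expected / n_families
    p_value = poisson_upper_tail(OBSERVED, expected)
    results[n_families] = p_value
    print(f"N={n_families}: expected={expected:.2f}, "
          f"rate/family={rate_per_family:.2e}, p={p_value:.2e}")

# Bonferroni-adjusted exome-wide cutoff for ~18,500 genes (from the text).
bonferroni_cutoff = 0.05 / 18_500
print(f"cutoff={bonferroni_cutoff:.2e}")
```

Under these assumptions both cohort sizes imply roughly the same expected rate per family, and the tail probabilities fall in the same orders of magnitude as the reported p values, far below the Bonferroni cutoff.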

All individuals in this cohort had ID with or without ASD and attention deficit–hyperactivity disorder (ADHD), with the exception of only one individual (17–023) who did not have evidence of ID but had ASD and speech delays at age 6 years. Of the individuals with ID, delays were mild in 11/27 (41%), moderate in 6/27 (22%), and severe in 3/27 (11%). A confirmed or suspected diagnosis of ASD was present in 17/27 (63%) individuals and of ADHD in 9/27 (33%) (Table 1, Fig. S5). Speech delays were prominent in this cohort, seen in 26/27 (96%) individuals overall. One proband (17–027) had severe expressive language delays at age 7 years, another (17–003) was minimally verbal at age 5 years, and two children (17–013, 17–016) were nonverbal at the ages of 4 years and 18 years, respectively. Two children (17–003, 17–015) also had regression of speech and language development at ages 6 years and 2 years, respectively. Another individual (17–007), a 24-year-old male, had progressive developmental issues including memory problems with a suspicion for developmental regression overall. Most affected children walked prior to age 2 years with the exception of one child who remained nonambulatory at age 4 years. Notably, none of the individuals in this series had isolated behavioral issues without ID or ASD (Fig. S5).

Growth abnormalities, including short stature, were diagnosed in 11/27 individuals (Fig. S6). Tone abnormalities were observed in 13/27 individuals, including hypotonia (N = 10) and hypertonia or mixed tone (N = 3). Dysmorphic facial features, most notably micrognathia and hypertelorism, were observed in 13 individuals (Fig. 2). Less common facial features included prominent incisors, protruding ears, and prominent nasal bridge. Ocular abnormalities including nystagmus, esotropia, and strabismus were found in nine individuals. Four individuals had mild microcephaly with a head occipitofrontal circumference (OFC) of 2–3 SD below the mean, and one child had an OFC of 4 SD below the mean at age 4 years. Overall, the observed dysmorphic features were nonspecific and not characteristic, likely leading to low clinical recognizability of individuals with pathogenic ZNF292 variants.

Fig. 2

Facial features of individuals with pathogenic ZNF292 variants. (a, b) Photos of 17–027 showing a thin upper lip, smooth philtral folds, upturned nasal tip, sparse but long eyebrows with synophrys. (c,d) Photos of 18–007 at age 3.5 years showing epicanthal folds, mildly upslanted palpebral fissures, prominent forehead, and bulbous nose. (d) Hand photographs of the child showing ichthyosis. (e, f) Photo of 17–013 as a child (e) and as a teenager (f) showing laterally prominent ears, thick lips with a tented upper lip, short philtrum, prominent eyebrows with very prominent brow ridge, and deep set eyes. (g, h) Frontal and lateral facial photograph of 17–005 at age 4 years 1 month showing mild micrognathia, short philtrum, and mildly downslanting palpebral fissures. All affected individuals have a prominent chin.

Notable brain abnormalities were detected in 3 of 12 probands who underwent brain MRI (Fig. 3, Tables S1, S2). One child (17–009) had complex cerebellar abnormalities with hypoplasia of the cerebellar vermis and hemispheres, with marked cerebellar asymmetry and possible clefting within the cerebellum. There was evidence of asymmetric hemosiderin deposition on imaging suggestive of a previous vascular injury. However, there was no documented history of an in utero vascular insult or injury, as the pregnancy was uneventful and delivery was at term without complications. The other two children also had evidence of vascular injury on brain imaging. Individual 18–003 was delivered at 25 weeks of gestation and had posthemorrhagic hydrocephalus and focal cystic encephalomalacia attributed to prematurity. Individual 17–008 had a lacunar insult in the subcortical white matter with white matter injury but was delivered at term without notable complications during pregnancy. Other brain MRI findings included ventriculomegaly, callosal abnormalities, and periventricular nodular heterotopia (each observed in one individual).

Fig. 3

Brain magnetic resonance images (MRIs) of individuals with pathogenic variants in ZNF292. (a, b) T1-weighted and T2-weighted brain MRIs of patient 17–003 showing mildly prominent ventricles (f). (c, d) T1-weighted sagittal and axial images of 17–008 showing paucity of the white matter due to an in utero vascular insult and a thin corpus callosum (arrowhead, c). (e–h) T1-weighted and constructive interference in steady state (CISS) images of patient 17–009 showing multiple abnormalities including hypoplasia of the cerebellar vermis and hemispheres, with marked asymmetry (arrow, g; asterisk, g), with possible clefting of the cerebellum (arrow, g), as well as a deep infold within the cortical surface (arrowhead, f). Patient also has evidence of possible hemosiderin deposition that is asymmetric, suggesting a previous vascular insult/injury.

Finally, we identified 12 additional families (15 individuals total) with ZNF292 variants in whom pathogenicity was suspected but was less certain because of one or more of the following reasons: (1) insufficient phenotype data from an individual with the candidate variant to determine their affected status, (2) inability to determine whether a candidate variant was inherited or de novo due to lack of parental genotype data (i.e., incomplete or absent parental testing information), and (3) the candidate variant was missense with no functional data to support pathogenicity.17 Nine of these families were identified by ES and six by multigene panel testing. The clinical and molecular data on this additional cohort are provided in Tables S3, S4, and Figs. S7, S8. Notably, these variants were also rare or absent in gnomAD controls (N = 114,704) and had high CADD scores. One family in this cohort harbored a variant that was also identified as a de novo variant in our primary cohort (p.Leu2221Serfs*10). However, this family was tested via a targeted multigene panel and there were insufficient data to determine whether the genotype segregated with affected individuals. Therefore, we conservatively assessed this family’s clinical affected status to be uncertain. Nevertheless, it is likely that this secondary cohort is enriched for additional pathogenic variants.

DISCUSSION

We discovered variants in ZNF292 that are likely pathogenic for a neurodevelopmental disorder variably accompanied by ASD and minor dysmorphic facial features that together delineate a novel condition with low clinical recognizability. Indeed, in none of the families with pathogenic ZNF292 variants was the diagnosis of a syndrome with high clinical recognizability even considered. Accordingly, we anticipate that the number of persons with neurodevelopmental delays caused by pathogenic variants in ZNF292 identified via multigene panels or exome or genome sequencing will continue to grow. If this prediction is correct, it seems reasonable to wonder why ZNF292 has not been previously reported as a priority candidate gene for NDV in large cohorts of probands or trios with ID/DD/ASD.

ZNF292 first appeared as a possible candidate gene for NDV in 2012 as it was included in a supplemental table of 77 genes in which de novo variants were found in a cohort of 100 trios with severe ID.2 Over the next seven years, seven additional probands with NDV with de novo variants in ZNF292 were reported across five different ID/DD/ASD cohorts adding up to a total of eight probands with de novo variants in ZNF292 in ~8800 families with ID/DD/ASD tested8,21,22,23,24 (Table S5). In only two of these cohorts was more than one proband identified with a de novo variant in ZNF292, and the largest of such studies (the Deciphering Developmental Disorders [DDD] Study) included only one individual with a pLoF; the others had missense variants. Accordingly, none of these studies had adequate statistical power to detect a significant enrichment of de novo missense variants or pLoFs in ZNF292. A recently reported analysis of 187 candidate genes including ZNF292 by one of our authors (H.G.) detected a significant association (p = 0.016) with ID/DD/ASD only after combining de novo variants identified across 2926 families with ID/DD/ASD from a previously reported association study.8 However, this result would not reach genome-wide significance and the lack of deep phenotyping limited conclusions about both the canonical phenotype and distribution of phenotypic effects in persons with pathogenic variants in ZNF292.

Our study nicely illustrates that substantially greater statistical power can be achieved by testing a very large sample size (i.e., putatively all families with ID/DD/ASD tested to date) via effective data sharing and summing rare variants in the same candidate gene across families tested in research and clinical labs with different (i.e., gene discovery versus diagnostic testing) albeit complementary motivations. This strategy is expected to be particularly productive for identifying moderate- to large-effect alleles for genetically heterogeneous conditions with low clinical recognizability, and to be much more efficient with the emergence of multiple platforms that facilitate global sharing of candidate genes over the past several years. Indeed, the process of matching to build confidence that a candidate gene is causal would likely happen much more quickly today than the seven years, beginning in 2012, required to demonstrate that variants in ZNF292 underlie a neurodevelopmental disorder. However, it should also be noted that most existing platforms for data sharing do not allow public data sharing, encourage sharing of all candidate genes identified in a family and their phenotypic data, or permit direct participation of families in matching.

ZNF292 is highly expressed in the developing human brain (and is among the most highly expressed ZNFs in the BrainSpan data set), especially the cerebellum, with the highest expression identified during the prenatal period (Fig. 4). However, the mechanism by which variants in ZNF292 disrupt human brain development and behavior is unclear. The two most likely possibilities by which pLoF might underlie disease are escape from nonsense-mediated decay (NMD), leading to expression of a truncated protein that has either a gain-of-function or dominant negative effect, or simple haploinsufficiency. Eighteen of the 20 unique pLoFs in our cohort are predicted to escape NMD,20 and in contrast to the remaining handful of high quality pLoF calls in gnomAD, ten variants in our series overlap a residue between 1588 and 2649, which form zinc fingers 10–16, a putative coiled coil region, and the final nuclear localization signal. This is a potentially significant concentration of variants overlapping those residues (Fisher's exact test p = 0.004).15,16 However, the seeming concentration of pLoFs in NMD escape regions of the gene is consistent with the distribution expected by random chance (Fisher's exact test p = 1) and not necessarily indicative of the pathogenic mechanism, as the last exon is very large (7152 bp, or ~88% of total coding transcript length).25 Although the prematurely truncated transcript is expressed, it is still possible that the pathogenic mechanism is that of haploinsufficiency, depending on what functions are retained by the truncated transcript. Notably, deletions of the 6q locus containing ZNF292 have been identified in individuals with a range of developmental issues including ID and ASD, further supporting the role of this gene in neurobehavioral phenotypes.26
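The argument that a concentration of variants in the last exon is expected by chance can be checked with a short sketch. This is not the Fisher's exact test reported above, whose contingency table is not given in the text; it is a simpler one-sided binomial check, swapped in for illustration. It assumes the coding length is the 2723 codons plus a stop codon, treats the 7152 bp of the last exon as coding sequence, and uses the 22 of 24 variants located in exon 8 from the Results. The function name `binomial_upper_tail` is hypothetical.

```python
import math

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed by exact summation."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )

PROTEIN_RESIDUES = 2723      # predicted protein length (from the text)
LAST_EXON_CODING_BP = 7152   # size of the last exon (from the text)

# Assumption: coding length = codons for all residues plus one stop codon.
coding_bp = (PROTEIN_RESIDUES + 1) * 3
last_exon_fraction = LAST_EXON_CODING_BP / coding_bp

VARIANTS_TOTAL = 24          # unique putatively pathogenic variants
VARIANTS_IN_EXON8 = 22       # variants located in exon 8

# Probability of seeing at least 22 of 24 variants in the last exon if
# variants were distributed uniformly along the coding sequence.
p_value = binomial_upper_tail(VARIANTS_IN_EXON8, VARIANTS_TOTAL, last_exon_fraction)

print(f"last exon fraction of coding length: {last_exon_fraction:.3f}")
print(f"one-sided binomial p: {p_value:.2f}")
```

Under these assumptions the last exon accounts for close to 88% of the coding length, and observing 22 of 24 variants there is unremarkable under a uniform model, which is in line with the conclusion drawn in the text.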

Fig. 4

Expression of ZNF292 in developing human and mouse brains. (a) ZNF292 expression in the developing human brain (normalized RPKM data) showing high expression during early prenatal development that diminishes in the postnatal brain (data obtained from BrainSpan; http://www.brainspan.org). AMY amygdala, CBC cerebellum, HIP hippocampus, MD medial dorsal nucleus of the thalamus, NCTX neocortex, STR striatum. (b) Zfp292 expression in the adult mouse brain showing the highest expression (indicated by higher intensity staining) in hippocampus and Purkinje cells of the cerebellum.

Finally, one affected parent in our cohort had mild ID that was diagnosed as an adult, suggesting that affected persons may go undiagnosed or be diagnosed later in life. This is consistent with the observation of five pLoFs in the gnomAD “control” group that appear to be valid. These observations suggest that some pathogenic ZNF292 genotypes are incompletely penetrant and/or they underlie mild ID/DD/ASD.

In summary, this study demonstrates that de novo and dominantly inherited variants in ZNF292 are associated with a spectrum of neurodevelopmental features including ID, ASD, and ADHD, among others. The clinical spectrum of individuals with ZNF292 variants is broad, with evidence of incomplete penetrance. This cohort shows that variants in ZNF292 are a recurrent cause of ID with or without ASD and other neurodevelopmental features.

URLs

ExAC database: http://exac.broadinstitute.org/

gnomAD: http://gnomad.broadinstitute.org

Gene: http://www.ncbi.nlm.nih.gov/gene/

Online Mendelian Inheritance in Man (OMIM): http://www.omim.org/

PDB: http://www.rcsb.org/pdb/home/home.do

Combined Annotation Dependent Depletion (CADD): http://cadd.gs.washington.edu/

database for nonsynonymous SNPs’ functional predictions (dbNSFP): https://sites.google.com/site/jpopgen/dbNSFP

REVEL: https://sites.google.com/site/revelgenomics/

dbSNP: http://www.ncbi.nlm.nih.gov/SNP/

MyGene2, National Human Genome Research Institute (NHGRI)/National Heart, Lung, and Blood Institute (NHLBI) University of Washington Center for Mendelian Genomics (UW-CMG), Seattle, WA: http://www.mygene2.org (accessed January 2019).