INTRODUCTION

The nucleosome remodeling and deacetylase (NuRD) complex is involved in the management of genomic integrity, stem cell differentiation, and neurodevelopment.1,2 This important multienzyme complex regulates transcription by linking two independent chromatin-regulating activities: histone deacetylase and adenosine triphosphate (ATP)-dependent nucleosome remodeling activity.1,3 Several neurodevelopmental disorders have been associated with variants in NuRD subunit proteins that include CHD3, CHD4, and GATAD2B.4,5,6,7,8 Previously identified clinical features shared by each of these dominant disorders include intellectual disability, motor delays, and distinct facies.4,5,6,7,8 De novo CHD4 variants were also associated with hearing loss, bony fusions, palatal abnormalities, hypogonadotrophic hypogonadism, and heart defects (MIM 617159; CHD4-related syndrome [CHD4RS] or Sifrim–Hitz–Weiss syndrome [SIHIWES]),5,6,7 while de novo CHD3 variants were associated with neonatal feeding issues, childhood apraxia of speech, joint laxity, undescended testes, and high forehead with frontal bossing (MIM 618205; CHD3-related syndrome [CHD3RS] or Snijders Blok–Campeau syndrome [SNIBCPS]).4 Both disorders were also associated with enlarged cerebrospinal fluid (CSF) spaces and macrocephaly.4,5,6 To date, previous reports of GATAD2B-associated neurodevelopmental disorder (GAND; MIM 615074) had identified a total of ten subjects who possessed deletion or truncating variants in GATAD2B and presented with intellectual disability, impaired language development, strabismus, and characteristic facies.8,9,10,11,12

In this report, we have characterized the phenotypic data of 50 subjects with multiple types of GATAD2B variants with the purpose of defining the common genetic and clinical features of GAND. Pathogenic variants included deletions (3), nonsense (17), truncating frameshift (16), splice-site (7), and missense (7) changes. Missense variants have not been previously associated with GAND and all of them were located within two highly conserved domains, conserved region-1 (CR1) and conserved region-2 (CR2) of the GATAD2B protein (also known as p66β) (Fig. 1, Table S1). Several of these missense variants were shown to disrupt interactions between the GATAD2B protein and its NuRD binding partners. All subjects had phenotypes that included intellectual disability, impaired language development, strabismus, and characteristic facies. Several additional phenotypic features were identified that included polyhydramnios, neonatal feeding difficulties, anisocoria, aortic valve defects, epilepsy, abnormal brain magnetic resonance images (MRIs), and macrocephaly. This expanded GAND phenotype closely overlaps other NuRD-associated neurodevelopmental disorders indicating that these associated proteins (CHD3, CHD4, and GATAD2B) may have converging molecular functions and result in clinically related syndromes.

Fig. 1: Genomic and protein schematic diagram with GATAD2B-associated neurodevelopmental disorder (GAND) variants.

All reported pathogenic GATAD2B variants from our study are represented as deletions, splice-site, or protein changes below the protein diagram. Genomic deletions are represented as bars below the figure. Previously reported variants and deletions are represented with gray letters and bars above the diagram. * monozygotic twins; # somatic mosaic family.

MATERIALS AND METHODS

Standard protocol approvals, registrations, and patient consents

Research protocols and the study were approved by the Cedars-Sinai Medical Center Institutional Review Board (IRB; protocol Pro00037131). All the participants or their families consented to participation in the study. Informed consent was obtained for the publication of all photographs.

Subjects GAND21, GAND24, GAND26, GAND27, GAND29, GAND30, and GAND32–35 were identified and referred to our study through the Deciphering Developmental Disorders (DDD) study (UK Research Ethics Committee approval: 10/H0305/83, granted by the Cambridge South REC).13 Authorization for disclosure of recognizable subjects in photographs and information was obtained from parents.

Clinical genetic evaluations

Genomic DNA was extracted from saliva or blood and genetic testing was performed and analyzed per previous protocols for clinically available trio-based exome sequencing14,15 (N = 44; the DDD study; the Undiagnosed Diseases Network; GeneDx, Gaithersburg, MD; Centogene, Rostock, Germany; Baylor College of Medicine, Houston, TX), intellectual disability panels (N = 4; GeneDx, Gaithersburg, MD; University of Chicago, Chicago, IL; Genome Diagnostics Nijmegen, Nijmegen, NL), and chromosomal single-nucleotide polymorphism (SNP) microarrays (N = 1; GeneDx, Gaithersburg, MD).

Genome sequencing and analysis

Subject GAND50 was diagnosed via genome sequencing analysis. Genomic DNA was extracted from the subject and her unaffected parents. Genome sequencing was performed at Human Longevity Inc (San Diego, CA) on the Illumina HiSeq/NovaSeq platform to an average depth of 35–40× per sample. Approximately 450 M 150-bp paired-end reads (132 Gb of sequences) were generated across the genome for each sample using Illumina HiSeq X system with median insert size of ~340 bp. Data analysis was performed on the DNAnexus platform using a University of California–Los Angeles (UCLA) custom built pipeline that incorporates BWA-mem v0.7.5 for alignment;16 Picard v2.0.1 (http://broadinstitute.github.io/picard/) for polymerase chain reaction (PCR) and optical duplicate marking; GATK v3.417,18,19 for depth of coverage analysis, indel realignment, base quality score recalibration, haplotype calling, joint genotyping, and variant recalibration; and PLINK v1.0720 for determining regions of excess homozygosity. GoldenHelix VarSeq™ v1.4.5 (Golden Helix, Inc., Bozeman, MT; www.goldenhelix.com) was used for variant annotation, filtering, and interpretation. Manta v1.4.0,20 ERDS v1.1,21 and CNVnator v0.3.322 were used for structural variant (SV) detection. Annotation of SV calls for genic content was performed using in-house scripts and validated by manual inspection and Sanger confirmation at a College of American Pathologists (CAP)/CLIA accredited laboratory (UCLA Orphan Disease Testing Center). No clinically significant de novo, homozygous, hemizygous, or compound heterozygous single-nucleotide variants (SNVs) or small indels were identified. However, a 1.5-kb de novo deletion, NC_000001.10: g.153788220_153789753del, was observed in the proband that removes a single exon in GATAD2B (exon 7 in transcript NM_020699.3), resulting in a predicted frameshift. 
No common SVs with similar breakpoints are observed in the Database of Genomic Variants23 or among 2504 individuals called for SV by the 1000 Genomes Project.24
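
The deletion size and sequencing yield quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the study pipeline: it assumes that both HGVS deletion coordinates are inclusive and that the haploid human genome is roughly 3.1 Gb (an illustrative figure that is not stated in the text). The resulting depth is a raw yield estimate, before duplicate marking and filtering, so it would be expected to sit somewhat above the reported average of 35–40×.

```python
# Coordinates of the de novo deletion as given in the text (NC_000001.10).
DEL_START = 153_788_220
DEL_END = 153_789_753

# Sequence yield per sample as given in the text (132 Gb).
SEQUENCED_BASES = 132e9

# ASSUMPTION: approximate haploid human genome size; not stated in the text.
ASSUMED_GENOME_SIZE_BP = 3.1e9


def deletion_length(start: int, end: int) -> int:
    """Length in bp of a deletion whose two endpoints are both deleted."""
    return end - start + 1


def raw_depth(total_bases: float, genome_size: float) -> float:
    """Raw mean depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


del_bp = deletion_length(DEL_START, DEL_END)
depth = raw_depth(SEQUENCED_BASES, ASSUMED_GENOME_SIZE_BP)

print(f"Deletion length: {del_bp} bp (~{del_bp / 1000:.1f} kb)")
print(f"Raw depth estimate: {depth:.1f}x")
```

Under these assumptions the deletion spans 1,534 bp, in line with the ~1.5 kb quoted in the text.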

Clinical data

Retrospective clinical, diagnostic, and neurodiagnostic information was collected, analyzed, and reported from medical records and family interviews. Physical exams were performed whenever possible. Please note that the average values of the reported parameters are based on available data with raw data supplied in parentheses wherever pertinent.

Plasmids

Full-length genes for mouse methyl-binding domain protein-3 (MBD3; UniProt ID:Q9Z2D8), and human methyl-binding domain 2-ɑ (MBD2-ɑ; UniProt ID:Q9UBB5), GATAD2B (UniProt ID:Q8WXI9), and partial gene sequences for the C-terminal domains (CTD) of chromodomain-helicase DNA-binding protein-3 (CHD3; UniProt ID:Q12873), chromodomain-helicase DNA-binding protein-4 (CHD4; UniProt ID:Q14839) and chromodomain-helicase DNA-binding protein-5 (CHD5; UniProt ID:Q8TDI0) were cloned into pcDNA3.1 expression vectors to generate HA-tagged GATAD2B, and FLAG-tagged MBD2, MBD3, and the CTD of CHD3 (residues 1246–1944), CHD4 (residues 1230–1912) and CHD5 (residues 1218–1954). Five missense variants (L180P, G406R, R414Q, C420R, and C420S) were introduced into the HA-GATAD2B plasmid using site-directed mutagenesis.

In vitro rabbit reticulocyte lysate protein expression and immunoprecipitation assays

Protein expression and immunoprecipitation experiments were performed as previously described.25,26 In vitro transcription–translation in rabbit reticulocyte lysates (IVT) and pulldown studies were used to generate interaction data for (1) FLAG-tagged MBD2 or MBD3 coexpressed with versions of HA-GATAD2B (wild-type [WT] or L180P) and (2) FLAG-tagged C1–C2 region26 of CHD3, CHD4, and CHD5 coexpressed with HA-GATAD2B (WT, G406R, R414Q, C420R, or C420S). In all experiments, FLAG-fusion proteins were immobilized on α-FLAG affinity beads and used as baits to pull down the coexpressed HA-GATAD2B proteins. Wild-type HA-GATAD2B was expressed individually as a negative control. Lysates to which no plasmids were added were also used as negative controls. In each case, 10% of inputs and 50% of elutions were loaded on a sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) gel, and polyvinylidene fluoride (PVDF) membranes were probed with α-HA-HRP (#2999S, Cell Signaling Technology, Danvers, MA) at a 1:20,000 dilution and α-FLAG-HRP (#A8592, Sigma Aldrich, USA) at a 1:80,000 dilution.

RESULTS

GATAD2B subjects and demographics

Fifty subjects were enrolled in our study. Most subjects were unrelated, although two sets of identical twins and one set of nontwin siblings were identified. Subjects were located in the United States, Canada, United Kingdom, Australia, Portugal, Spain, United Arab Emirates, and Brazil. Fifty-eight percent of subjects were female (29/50). Parents were primarily of Caucasian background (95.5%); a minority of parents were of other backgrounds (Asian [1.5%], Arab [2%], Hispanic [2%], and African descent [1%]). The average age at diagnosis was 6.8 years (SD 4.6). Average age at enrollment in our study was 7.0 years (SD 4.5; range: 20 months to 21 years).

GATAD2B pathogenic variants

Forty-three different GATAD2B pathogenic variants were identified in the 50 enrolled subjects; 41 of these were novel. Almost all subjects (96%) had de novo pathogenic variants, with two subjects inheriting variants via parental mosaicism. Five of the novel variants were found in multiple subjects (p.R116X [two unrelated subjects]; p.Q188Efs36X [two subjects, siblings via paternal mosaicism]; p.R414Q [two unrelated subjects]; p.C420R [two unrelated subjects]; p.E476X [two subjects, monozygotic twins]; p.R478X [three subjects, monozygotic twins and one unrelated child]). Two pathogenic variants in our cohort had been previously reported (c.1217-2A>G splice-site; p.Q190Afs34X)8,9 (Fig. 1 and Table S1).

Multiple types of variants were identified in our subjects that included nonsense (17; 34%), truncating frameshift (16; 32%), splice-site (7; 14%), deletions (3; 6%), and missense (7; 14%) changes. One deletion (~1.5 kb) involved exon 7 and another deletion (~3.5 kb) involved exons 8–11. A third deletion (~175 kb) extended from the noncoding exon 1 of GATAD2B upstream to include seven other genes of unknown function, terminating within the NUP210L gene. Interestingly, the missense variants were all located within the CR1 (p.L180P) or CR2 (p.G406R, p.R414Q, p.C420R, p.C420S) domains of the GATAD2B protein. These domains have important interactions with MBD and CHD proteins within the NuRD complex (Fig. 1).25,26,27,28
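
As a quick consistency check on the variant-type breakdown, the per-category counts can be tallied against the cohort size. This is an illustrative sketch only; the dictionary keys are hypothetical labels chosen here, and the counts are those quoted in the paragraph above.

```python
# Variant-type counts as quoted in the text; the key names are illustrative labels.
variant_counts = {
    "nonsense": 17,
    "truncating frameshift": 16,
    "splice-site": 7,
    "missense": 7,
    "deletion": 3,
}

# Total number of subjects accounted for by the five categories.
total = sum(variant_counts.values())

# Share of the cohort in each category, rounded to the nearest percent.
percentages = {name: round(100 * n / total) for name, n in variant_counts.items()}

for name, n in variant_counts.items():
    print(f"{name}: {n}/{total} ({percentages[name]}%)")
```

The categories sum to the 50 enrolled subjects, and the rounded shares match the percentages reported above.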

Facial features

In addition to macrocephaly, the majority of GAND subjects also had distinct facial features. Dysmorphisms were evaluated in 37 children whose families provided adequate photos. Findings included a high wide forehead/frontal bossing (100%), prominent supraorbital ridges (62.2%), posteriorly angulated ears (59%), ocular hypertelorism (78.4%), downslanting palpebral fissures (45.6%), epicanthal folds (29.7%), prominent or bulbous nasal tip (83.8%), wide nasal base (35.1%), elongated wide nose (35.1%), short philtrum (51.3%), small recessed jaw (24.3%), and a pointed chin (91.9%) (Fig. 2, Table S2). Face2Gene image analysis (FDNA Inc., USA) was used on photographs of 19 GAND subjects to produce a composite model of the facial features associated with GAND compared with age-, ethnicity-, and gender-matched controls. The facial features of the GAND cohort were significantly different from healthy controls (N = 19; area under the curve [AUC] value of 0.933 of receiver operating characteristic [ROC] curve, p value = 0.004).

Fig. 2: Photographs of affected individuals with associated variant types.

a–e Nonsense or truncating frameshift variants: (a) GAND19, (b) GAND18, (c) GAND33 and GAND34, (d) GAND53, (e) GAND20. f–k Missense variants: (f) GAND28, (g) GAND15, (h) GAND52, (i) GAND32, (j) GAND55, (k) GAND49. l–p Splice-site variants: (l) GAND17, (m) GAND27, (n) GAND42, (o) GAND21, (p) GAND6. q–w Nonsense or truncating frameshift variants: (q) GAND40, (r) GAND12, (s) GAND3, (t) GAND13, (u) GAND8, (v) GAND36, (w) GAND2. Composite of 19 photographs of (x) GAND subjects and (y) 19 healthy controls (FDNA Inc., USA).

Birth and development

Average parental age at birth was in the early 30s. The average gestational age at birth was 38.0 weeks (SD 1.8). Polyhydramnios (45%) was the most common complication during pregnancy. The average birthweight and birth length were within normal limits. Macrocephaly was seen in 60.4% of subjects at birth, with an average birth head circumference of 36.7 cm (SD 1.7). The average percentile rank of the birth head circumference was ~87 (SD 23.5). Macrocephaly at older timepoints was present in 91.8% of subjects, with average head circumference percentile score of 96.0 (SD 6.4) (Table 1 and S3).

Table 1 Phenotypic findings.

All developmental milestones were abnormal from early infancy, including feeding, language, motor ability, and intellectual ability. Infantile hypotonia was present in all subjects (100%), with most being described as “floppy” in infancy. Children sat up on average at 14.7 months (SD 5.2) and most children were ambulatory by an average age of 33.1 months (SD 12.6). Five subjects were nonambulatory (ages: 1.3, 2, 2.5, 5, and 6 years) (Table S4). Most ambulatory children had a persistent unsteady gait (91%). Infantile feeding issues and gastroesophageal reflux disease were reported in 82% of subjects. Two subjects required gastrostomy tube placement (4%). Reflux disease often resolved with age, but most children had persistent oromechanical issues and excessive drooling (Table 1, S3, S4, and S5).

All subjects had intellectual disability associated with expressive and receptive language issues. The average age of first spoken words was 3.9 years (SD 2.0), although none of the subjects attained an extensive vocabulary. Eleven subjects were nonverbal (ages: 1.3, 2, 3, 3, 3, 5, 5, 6, 6, 11, 14, and 21 years) (Tables S4, S5). All children over 7 years could follow a one-step command (Table S5), although 20% of subjects 0–3 years and 10.5% in the 4–7 years age group could not. A substantial number of subjects never attained complete toilet training (Table S5). Almost all subjects exhibited normal eye contact and social reciprocity (98% each).

Neurological, neuroimaging, and ophthalmological findings

Epilepsy was present in 24% of subjects. Interestingly, 58% of all subjects underwent electroencephalograms (EEGs) for suspected seizure events, with most EEGs being normal (79.3%). Primary generalized and focal features were noted in several subjects (13.7% and 6.9%, respectively). Seizures in most epileptic subjects were well controlled with antiseizure medication (75%; 8/12). Brain MRIs were performed on 88% of subjects, with abnormalities being present in 60% of all neuroimaging studies (Table 1, S6). Predominant findings were enlarged extra-axial spaces/ventriculomegaly (35.6%), white matter signal abnormalities (22.2%), thin corpus callosum (8.9%), and hypomyelination (13.3%) (Fig. 3). Ophthalmological findings included strabismus (88%), astigmatism (41.3%), hypermetropia (30.4%), myopia (21.7%), anisocoria (13.6%), and optic nerve hypoplasia (9.1%) (Table 1).

Fig. 3: Neuroimaging of GAND8 and GAND53.

a, b GAND8. a Axial fluid-attenuated inversion recovery (FLAIR) image showed enlarged subarachnoid spaces and nonobstructive ventriculomegaly of the lateral ventricles (cavum vergae was also present). b Sagittal T1 showed macrocephaly, enlarged extra-axial fluid, and a thin corpus callosum. c, d GAND53. c Sagittal T1 with normal corpus callosum and structures (prominence of the extra-axial cerebrospinal fluid [CSF] spaces previously seen on earlier imaging had resolved). d Axial FLAIR image showing multiple punctate and patchy foci of nonspecific signal hyperintensity scattered in the bilateral cerebral white matter, particularly involving the parietal lobes.

Cardiac findings

Many subjects underwent cardiology consultations and echocardiograms (34%). One subject had left pulmonary artery stenosis, while four other subjects had a bicuspid aortic valve (8%). Two of these subjects underwent balloon valvulotomy. One of these two children had neonatal critical aortic stenosis and subsequently developed aortic regurgitation that required a Ross cardiac procedure for correction.

Evaluation of cohorts with specific variant types

All of the above evaluations were subdivided into cohorts consisting of nonsense, truncating frameshift, deletions, splice-site, and missense variants. There were no significant differences between these variant groups with regard to the abovementioned parameters, with the exception of the frequency of epilepsy in subjects with missense variants (57%), compared with the total (24%) (p < 0.05; Fisher’s exact test) (Tables S2, S4, and S6).
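
For readers who wish to reproduce this kind of comparison, a two-sided Fisher's exact test on a 2 × 2 table can be sketched with the Python standard library alone. The counts below are assumptions inferred from the reported percentages (4 of 7 missense subjects with epilepsy, and 8 of 43 subjects with other variant types, giving 12 of 50 overall); the text does not state the exact table used, and comparing missense subjects against the remaining subjects rather than against the whole cohort is also an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of positive outcomes
    n = a + b + c + d     # total number of subjects
    denom = comb(n, row1)

    def pmf(k: int) -> float:
        # Probability that k of the positives fall into the first group.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_observed = pmf(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        pmf(k) for k in range(k_min, k_max + 1) if pmf(k) <= p_observed * (1 + 1e-9)
    )


# ASSUMED counts, inferred from the reported percentages (57% of 7; 19% of 43).
missense_epilepsy, missense_no_epilepsy = 4, 3
other_epilepsy, other_no_epilepsy = 8, 35

p_value = fisher_exact_two_sided(
    missense_epilepsy, missense_no_epilepsy, other_epilepsy, other_no_epilepsy
)
print(f"Two-sided Fisher's exact p = {p_value:.4f}")
```

With these assumed counts the p value falls just below 0.05, which is consistent with the threshold reported above, although the small missense group (N = 7) means the estimate is sensitive to a change of a single subject.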

Disruption of NuRD interactions by GATAD2B missense variants

GAND-associated missense variants were all located in conserved regions of GATAD2B: namely CR1 (L180P) and CR2 (G406R, R414Q, C420R, C420S). The CR1 motif is known to bind MBD2/3 NuRD subunits through the formation of a coiled coil.29 The CR2 region has been shown to interact with the C-terminal region of CHD4.26,30 To understand the effect of the GATAD2B missense variants in the context of NuRD assembly, we performed pairwise interaction experiments to assess the binding of GATAD2B to MBD or CHD proteins. Full-length HA-tagged GATAD2B (wild-type or missense variants) was coexpressed with either FLAG-tagged MBD2/3 or the FLAG-tagged C-terminal region of CHD proteins (the C1–C2 region that follows the helicase domain) and their interaction examined in pulldown experiments.26 The L180P variant abrogated the interaction of GATAD2B with both MBD2 and MBD3 (Fig. 4a). The L180 residue lies at the interface of the GATAD2-MBD coiled coil, and the leucine to proline substitution is likely to disrupt both the interaction interface and the intrinsic helical propensity of the CR1 region. Compared with wild-type GATAD2B, the CR2 region variants have variable effects on the interactions between GATAD2B and the CHD paralogues (Fig. 4b). The C420R and C420S variants substantially inhibited these interactions with the CHD paralogues compared with wild-type controls, whereas the R414Q variant caused a mild reduction in these interactions. The G406R variant showed no clear effect. The C420 residue is one of the zinc-binding ligands in the GATAD2B zinc-finger domain found in CR2 and substitution of this cysteine is likely to severely disrupt the structure of the CR2 domain. In contrast, G406 and R414 are upstream of the zinc-finger domain and are predicted to be in a region that is unstructured—consistent with their smaller (R414Q) or negligible (G406R) effect on the GATAD2B–CHD4 interaction. 
Compared with controls, the effect of these variants on interactions between GATAD2B and the CHD paralogues was consistent with the close sequence similarity between the CHD paralogues in the C2 region of these proteins (Fig. 4b).26

Fig. 4: GAND–associated missense variants L180P, R414Q, C420S, and C420R disrupt binding to nucleosome remodeling and deacetylase (NuRD) components CHD3/4/5 or MBD2/3.

a FLAG-tagged MBD2 or MBD3 were coexpressed with HA-GATAD2B (wild-type [WT] or L180P) by in vitro transcription–translation (IVT) in a rabbit reticulocyte lysate. b The FLAG-tagged C1–C2 domains of CHD3, CHD4, or CHD5 were coexpressed with HA-GATAD2B (WT, R414Q, C420R, C420S, and G406R) in IVTs. In all experiments, FLAG-fusion proteins were immobilized on αFLAG affinity beads and used as baits to pull down the coexpressed HA-GATAD2B. As a negative control, wild-type (WT) HA-GATAD2B was added to beads to which no FLAG-fusion protein had been immobilized. In each case, 10% of inputs and 50% of elutions were loaded on a sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) gel and proteins were detected by western blot, using αHA and/or αFLAG antibodies. # degradation product of FLAG-MBD2.

DISCUSSION

We report the genetic and clinical features of 50 individuals with GATAD2B-associated neurodevelopmental disorder (GAND). Prior to our work, a handful of GAND patients had been reported with distinct facies, infantile hypotonia, intellectual disability, limited language ability, and strabismus.8,9,10,11,12 Our large cohort has allowed us to identify an expanded range of new phenotypic and genotypic features associated with GAND. These new features included macrocephaly, frontal bossing, polyhydramnios, infantile feeding difficulties, cardiac defects, anisocoria, astigmatism, epilepsy (focal and/or primary generalized), and abnormal neuroimaging, as well as a variety of new variant types including missense variants.

The clinical phenotype of GAND subjects was relatively consistent across our cohort. Pregnancies were mostly without complications, with the exception of polyhydramnios (45%). Several children were noted to have macrocephaly on prenatal ultrasound and macrocephaly was frequently present at birth (69%), and became more common with age (91.8%). All subjects had global developmental delay and intellectual disability. All subjects had infantile hypotonia and delayed motor milestones. Most children learned to ambulate. A majority of subjects were noted to have infantile feeding difficulties and gastroesophageal reflux disease (GERD) (82%). These oromechanical issues were also associated with delayed and limited expressive language development in all subjects that was reminiscent of childhood apraxia of speech.4 Receptive language was not as severely affected, with most children being able to follow multistep commands at older ages. Epilepsy was present in a minority of subjects (24%), and could be of focal or primary generalized onset. Most children responded well to antiepileptic treatment. Neuroimaging was abnormal in the majority of subjects (60%), with common features including ventriculomegaly/enlarged CSF spaces, white matter signal abnormalities, hypomyelination, and thin corpora callosi. Almost all subjects made good eye contact and exhibited social reciprocity. Toilet training was not attained in the majority of subjects. Most subjects had ophthalmological issues that included strabismus (88%), which was sometimes associated with astigmatism (41.3%), anisocoria (13.6%), and/or hypoplastic optic nerves (9%). Lastly, a minority of subjects were born with bicuspid aortic valves (10%, compared with 2% in the general population),31 with two subjects requiring surgical intervention. Of note, the two pairs of monozygotic twins in our cohort (GAND33/GAND34; GAND40/GAND41) had similar overall phenotypes with some differences. 
GAND33 developed epilepsy in her teens and continues to require seizure medication, while GAND34 does not have epilepsy. GAND33 has also plateaued developmentally, while GAND34 continues to progress with school. The other set of monozygotic twins shared very similar phenotypes, although GAND40 has had an easier time with school performance and GAND41 has been more adept at toilet training. Of course, with only two sets of twins it is difficult to make any definitive comments on phenotypic trends of monozygotic siblings. As the NuRD complex plays an active epigenetic role in corticogenesis,32 the identification of more monozygotic twin sets will be valuable to determine variable aspects of the GAND phenotype, especially with regard to cortical function.

Almost all GAND variants were de novo, with a diverse range in variant types that included nonsense (34%), truncating frameshift (32%), splice-site variants (14%), and deletions (6%). Seven subjects had missense variants (14%), which had not been previously reported in GAND. All of the missense variants were located within the two conserved region domains (CR1 and CR2) of the GATAD2B protein. Aside from an increased incidence of epilepsy in subjects with missense variants compared with subjects with other variants (57% vs. 19%, respectively), there were no dramatic phenotypic differences between subjects with missense versus loss-of-function (deletion, splice-site, nonsense, and truncating frameshift) variants (Table S4). Of course, given these relatively low numbers of subjects (N = 7 vs. N = 43, respectively) spread across a range of ages, it is difficult to draw any firm conclusions about specific variant types. If, with larger numbers, subjects with missense variants continue to be similar to subjects with loss-of-function variants, this similarity may indicate that haploinsufficiency of the GATAD2B protein is the primary pathological mechanism in most GAND subjects.8

The GATAD2B protein functions as a subunit within the NuRD complex, which is an important regulator of gene expression during neurodevelopment. NuRD is a multiprotein holoenzyme possessing both histone deacetylase and chromatin remodeling activity. The complex can consist of an assortment of paralogous subunits, which in addition to GATAD2B (and its paralogue, GATAD2A) includes several scaffold proteins, i.e., (1) retinoblastoma binding proteins (RBBP4/RBBP7), (2) metastasis associated proteins (MTA1/MTA2/MTA3), and (3) methyl-domain binding proteins (MBD2/MBD3), along with enzymatically active subunits that include (4) histone deacetylases (HDAC1/HDAC2) and (5) chromodomain-helicase DNA-binding proteins (CHD3/CHD4/CHD5).1,2 CDK2AP1 is also present in specific cell types.28 The MBD, GATAD2, and CHD subunits exist as interacting monomers;28,32,33,34,35 therefore, in principle, many configurations can exist with unique combinations of these paralogues that may have specific regulatory functions.20 For example, recent work suggested that MBD2-NuRD converts open chromatin into compacted chromatin, while MBD3-NuRD was associated with more active promoter regions.33 Other reports indicate that the CHD proteins are developmentally regulated during corticogenesis, which influences the generation of neural progenitors and neurons, as well as their subsequent migration and laminar identity.32 Of note, the GATA zinc-finger proteins (GATAD2B and its paralogue, GATAD2A) may have distinct effects on NuRD activity. For example, GATAD2A plays a specific role in DNA repair,35 as well as having a role in repression of pluripotency factors during induced pluripotent stem cell (iPSC) reprogramming;36 however, when GATAD2A expression was reduced, GATAD2B was incapable of rescuing the lost GATAD2A activity. To date, no specific roles for GATAD2B have been identified, although it has been shown to be the predominant GATAD2 gene expressed in the brain.37

One hypothesis regarding the general function of GATAD2 proteins is that they provide a “bridge” that links MBD proteins (via CR1) and the CHD chromatin remodeling proteins (via CR2) in the NuRD complex.2,25,26 GATAD2B-CR1’s interaction with MBD proteins occurs through coiled-coil domains within each protein that links GATAD2B with the core proteins of NuRD through MBD.26 In turn, the GATAD2-CR2 domain extends this bridge by its direct interactions with the carboxy-terminal C2 region of CHD proteins, thereby linking CHD proteins to the NuRD core.30 The validity of this bridge model is corroborated by the fact that all of our cohort’s pathogenic missense variants were located within these two regions. Furthermore, our immunoprecipitation assays indicated that several of these missense variants disrupted GATAD2B interactions as predicted by the model. For example, our L180P variant (analogous to L159 of GATAD2A) is located within the coiled-coil domain of CR1 (CR1-CC) that is a key contact point for MBD binding (Fig. S1).29 Our immunoprecipitation assays showed this variant prevented GATAD2B from interacting with MBD2 or MBD3 (Fig. 4a). This result is in line with the prediction that the substituted proline is likely to destabilize the CR1-CC α-helix that forms a coiled coil with MBD. It had been shown previously that a GATAD2A CR1-CC missense change (K149R) abolished the interaction of GATAD2A with MBD proteins, as well as MBD-mediated transcriptional repression.27 Our other missense variants were clustered within the GATAD2B-CR2 domain. The C420S and C420R variants affect the first of four zinc-binding cysteines of the GATA zinc-finger domain and so would be predicted to disrupt the proper folding and the known interaction of this domain with the CHD carboxy-terminal C2 region.30 Our immunoprecipitation assays confirmed this to be the case. 
It is also important to note that several subjects with CHD3-related syndrome and CHD4-related syndrome have been reported with carboxy-terminal variants (CHD3: p.R1881L, p.F1935Efs108X; CHD4: p.R1870X) that lie within the CHD carboxy-terminal C2 region.4,7 These variants could also disrupt the GATAD2B–CHD bridge and lead to the exclusion of CHD activity from NuRD complexes, causing their associated neurodevelopmental disorders.4,7,8 The R414Q variant was present in two subjects and resulted in a milder disruption of in vitro GATAD2B–CHD interactions. It may be that, in the context of expression within a complete NuRD complex, this substitution is more disruptive or alters interactions with other proteins associated with NuRD activity. In contrast to the effects of these other missense variants, the G406R substitution did not alter the in vitro interactions between GATAD2B and CHD proteins. This variant involves a highly conserved GATAD2B-CR2 residue that is located in a region that has not been characterized structurally or functionally and that lies just N-terminal to the GATA zinc-finger (Fig. S1). Further work to determine the mechanism by which this variant affects GATAD2B function is required; however, the nucleotide substitution (c.1216G>C) associated with G406R involves the terminal nucleotide of exon 7 and so might interrupt proper splicing of intron 8. Whether these missense variants produce proteins capable of antagonizing CHD–MBD–NuRD interactions in a dominant negative manner or work via a dosage effect similar to haploinsufficiency with a similar resulting phenotype remains to be determined; nonetheless, their observed phenotypes are not dramatically altered from subjects with loss-of-function variants. Identification of more subjects with missense variants and additional work are required to answer these questions.

Currently, no other germline disorders have been associated with GATAD2A or most of the other NuRD proteins. Nevertheless, numerous mouse models have indicated the importance of each of these subunits in development.2 Besides GATAD2B, the only other NuRD complex genes that are associated with human disease are CHD3 and CHD4, which have a substantial phenotypic overlap with GAND (although an HDAC1 missense variant was reported in a large exomic screen associated with epilepsy without a detailed report of the phenotype).4,5,6,7,8,38 CHD4RS (or SIHIWES) has been recently reported in 32 subjects, while CHD3RS (or SNIBCPS) has been reported in 35 subjects.4,5,6,7 Our expanded GAND phenotype indicates these three neurodevelopmental disorders share features that include macrocephaly, developmental delay, intellectual disability, infantile hypotonia, and ventriculomegaly. Interestingly, each of these disorders shares some facial features associated with macrocephaly that include a high broad forehead, frontal bossing, ocular hypertelorism, and a wide nasal bridge (Fig. 2).4,5,6,7,8 Many GAND and CHD3RS subjects also had a prominent nose/nasal tip and pointed chins, while CHD4RS subjects had a small nose and ears with a square chin.4,5,6,7,8 Besides these facial characteristics, other variable features across these disorders include male genital abnormalities in both CHD3RS and CHD4RS, inguinal hernia in CHD3RS, and deafness in CHD4RS. Alternatively, GAND and CHD4RS are associated with congenital heart defects,5,6,7,31 whereas GAND and CHD3RS subjects presented with oromechanical dysfunction that was absent in CHD4RS and that included neonatal feeding issues (and perhaps polyhydramnios) and language deficits reminiscent of childhood apraxia of speech.4,39

Childhood apraxia of speech (CAS) is a rare neurodevelopmental disorder associated with oromotor incoordination that limits fluent speech.39 Pathogenic variants in several members of the forkhead transcription factor family (FOXP1, MIM 613670; FOXP2, MIM 602081) have been associated with oromotor incoordination and language issues classified as CAS.39,40,41 Similar to CHD3,4 GATAD2B is one of a few documented proteins that interact with FOXP1/FOXP2 (via its CR2 domain and the zinc-finger/leucine-zipper repressor domain of FOXP),4,42 with GAND, CHD3RS, and FOXP1 subjects sharing several phenotypic features that include high forehead, frontal bossing, intellectual disability, and delayed language milestones. These data indicate that these proteins may undergo important interactions during neurodevelopment and corticogenesis that have essential roles in language development. Further work on the specific interactions of CHD3, GATAD2B, and the FOXP proteins is warranted and may provide further clarification of the molecular events required for corticogenesis and language development.

In summary, our GAND cohort has provided a more established picture of the genotypic and phenotypic features of this disorder. At this time, this diagnosis has been associated with several developmental, cardiac, ophthalmological, and neurological issues. Therefore, we recommend any child who is genetically diagnosed with GAND undergo consultations with specialists in these fields. Additional evaluations should include speech, physical, and occupational therapy consultations. Notably, the identification of macrocephaly as a feature of GAND has aligned this phenotype with other NuRD-associated disorders that include CHD4RS and CHD3RS. One hypothesis for this could be that GATAD2B dosage effects cause associated deficiencies in CHD3 and/or CHD4 chromatin remodeling activity in GAND children, because the reduced GATAD2B levels that result from haploinsufficiency limit the recruitment of CHD-associated chromatin remodeling activity to NuRD complexes. Therefore, it may be prudent to evaluate GAND subjects for additional clinical features seen in CHD3RS and CHD4RS (e.g., bony lesions, hearing loss, hernias, or genital abnormalities), as these findings may also be present in GAND, albeit at lower frequency (e.g., one male subject had an undescended atrophic testis, which is more frequent in CHD3RS).4

GAND-associated missense variants were another important finding within our cohort. These variants were only present within GATAD2B’s CR1 or CR2 domains, which are known to interact with other NuRD proteins. Whether these variants act via a dominant negative mechanism by sequestering NuRD components or by blocking their interactions with other effector proteins (e.g., CHD4, CHD3, FOXP) and/or altering GATAD2B’s DNA binding will require further evaluations with affected samples. It is interesting to note that subjects possessing missense variants and those possessing loss-of-function variants had largely similar phenotypes, which may indicate that both variant types act through similar dosage-dependent mechanisms. More work will be required to unravel these interesting questions with regard to GATAD2B, the NuRD complex, and their phenotypic overlap with other neurodevelopmental disorders.