Biallelic variants in PCDHGC4 cause a novel neurodevelopmental syndrome with progressive microcephaly, seizures, and joint anomalies

Purpose: We aimed to define a novel autosomal recessive neurodevelopmental disorder, characterize its clinical features, and identify its underlying genetic cause.
Methods: We performed a detailed clinical characterization of 19 individuals from nine unrelated, consanguineous families with a neurodevelopmental disorder. We used genome/exome sequencing approaches, linkage and cosegregation analyses to identify disease-causing variants, and we performed three-dimensional molecular in silico analysis to predict causality of variants where applicable.
Results: In all affected individuals, who presented with a neurodevelopmental syndrome with progressive microcephaly, seizures, and intellectual disability, we identified biallelic disease-causing variants in Protocadherin-gamma-C4 (PCDHGC4). Five variants were predicted to induce premature protein truncation leading to a loss of PCDHGC4 function. The three detected missense variants were located in extracellular cadherin (EC) domains EC5 and EC6 of PCDHGC4, and in silico analysis of the affected residues showed that two of these substitutions were predicted to influence the Ca2+-binding affinity, which is essential for multimerization of the protein, whereas the third missense variant directly influenced the cis-dimerization interface of PCDHGC4.
Conclusion: We show that biallelic variants in PCDHGC4 cause a novel autosomal recessive neurodevelopmental disorder and link PCDHGC4, as a member of the clustered PCDH family, to a Mendelian disorder in humans.


INTRODUCTION
Protocadherins (PCDHs) comprise a large family of over 80 cell surface receptors that are mainly expressed during the development of the vertebrate nervous system and play a crucial role in the discrimination between self and nonself cell surface identities in the course of establishment and generation of neuronal circuits [1,2]. Based on their genomic organization, human PCDHs can be divided into two families which are either encoded by genes distributed across the genome (nonclustered PCDHs) or genes clustered in a 1-Mb region on human chromosome 5 [3]. Clustered PCDHs (cPCDH) are encoded by a total of 53 genes arranged in three subclusters (PCDHA, PCDHB, and PCDHG) within this region [4][5][6]. All cPCDHs have a similar structure. They are type I transmembrane proteins containing six extracellular cadherin (EC) domains, a transmembrane region, and, in case of α- and γ-PCDH, an intracellular domain (ICD) [1]. In the PCDHA and PCDHG subclusters, multiple "variable" exons, which encode the entire extracellular region, the transmembrane domain and a variable part of the intracellular region, are tandemly arranged upstream of three "constant" exons, which are shared within a subcluster and code for a common C-terminal intracellular domain [4,7]. cPCDHs are widely expressed in the developing and mature nervous system including the spinal cord, cerebellum, and hippocampus [8][9][10][11]. They have been shown to form homophilic cis- and trans-interactions inducing the formation of multimeric protein complexes [12][13][14]. Neurons have been suggested to create a unique "barcode" by the expression of different combinations of these proteins that results in the generation of neuron-specific sets of cis-dimers and allows self-nonself discrimination based on the formation of trans-homophilic interactions [2,15].
Recent functional studies have linked numerous cPCDHs to critical neuronal processes such as regulation of neuronal survival, axon outgrowth and targeting, dendrite arbor complexity, self-avoidance of sister axon and dendrite branches, and synaptogenesis [8,[16][17][18]. Whereas knockout mice of the α-Pcdh cluster are viable and fertile and show only abnormal axonal projections of serotonergic and olfactory sensory neurons [16,19], disruption of the γ-Pcdh locus leads to neonatal lethality [8,20,21]. Recent studies revealed that Pcdhgc3, Pcdhgc4, and Pcdhgc5 are crucial for the observed lethality [22,23].
Hitherto, rare variants in nonclustered PCDH have been identified in individuals with different neurodevelopmental disorders. Rare biallelic variants in PCDH12 (OMIM 605622) and PCDH15 (OMIM 605514) have been reported in patients with diencephalic-mesencephalic junction dysplasia syndrome 1 (DMJDS1; OMIM 251280), Usher syndrome type 1F (USH1F; OMIM 602083), and nonsyndromic hearing loss (DFNB23; OMIM 609533), respectively [24,25]. Furthermore, more than 100 disease-causing variants have been described in PCDH19 (OMIM 300460) in developmental and epileptic encephalopathy 9 (DEE9; OMIM 300088), making it one of the clinically relevant genes in epilepsy [26]. So far, no disease-causing variant in any of the cPCDHs has been identified as the cause of a Mendelian disorder in humans, despite their important role during neurodevelopment and in neural circuit assembly. In this study, we report the identification of biallelic disease-causing variants in Protocadherin-gamma-C4 (PCDHGC4) in 19 individuals from nine unrelated families. Affected individuals presented with progressive microcephaly, global developmental delay, intellectual disability, seizures, joint anomalies, and additional dysmorphic features. These findings establish biallelic PCDHGC4 variants as the genetic cause of a novel neurodevelopmental disorder in humans and elucidate the associated phenotype.

MATERIALS AND METHODS
Subjects
Individuals who participated in this study were clinically characterized in several clinics across the world (see Supplemental Information), and we used the GeneMatcher tool [27] to connect centers in which genetic analyses were performed. All individuals reported herein were born to consanguineous families of different geographic origin, and the respective families were not related to each other. Subjects or their legal representatives gave written informed consent for the molecular analyses and for publication of the results and clinical information, including photographs. All studies were performed in accordance with the Declaration of Helsinki protocols and were reviewed and approved by the local institutional ethics board. DNA from participating family members was extracted from peripheral blood lymphocytes by standard extraction procedures.
Genome/exome sequencing and linkage analysis
Genome and exome sequencing was performed on patient/parent trios (family 8), single (family 9), or multiple affected family members (families 1-7). Details on sequencing and variant screening as well as genome-wide linkage analysis (family 1) are provided as Supplementary Information.
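The prioritization strategy reported in the Results (homozygous, rare, exonic or splice-site variants under an assumed recessive model) can be illustrated as a simple filter. The following Python sketch is not the pipeline used in this study, whose details are given in the Supplementary Information. The record fields, the consequence labels, the toy input, and the allele-frequency cutoff of 0.001 are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical consequence labels; real annotation tools use their own vocabularies.
CODING_OR_SPLICE = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str          # one of the labels above, or e.g. "intronic"
    zygosity: str             # "hom" or "het"
    allele_frequency: float   # population frequency; 0.0 if never observed


def prioritize(variants, max_af=0.001):
    """Keep homozygous, rare, exonic or splice-site variants.

    max_af is an assumed cutoff, not a value taken from the study.
    """
    return [
        v for v in variants
        if v.zygosity == "hom"
        and v.allele_frequency <= max_af
        and v.consequence in CODING_OR_SPLICE
    ]


# Toy input, invented for illustration only.
calls = [
    Variant("GENE_A", "missense", "hom", 0.0),    # passes all three criteria
    Variant("GENE_B", "missense", "het", 0.0),    # rejected: heterozygous
    Variant("GENE_C", "nonsense", "hom", 0.05),   # rejected: too common
    Variant("GENE_D", "intronic", "hom", 0.0),    # rejected: not exonic/splice
]
candidates = prioritize(calls)
print([v.gene for v in candidates])
```

In a consanguineous family, candidates that survive such a filter would still need to be checked for cosegregation, as described in the following section.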

Variant verification and Sanger sequencing
Verification of identified nonsense and missense variants was performed using standard methods for polymerase chain reaction (PCR) amplification and Sanger sequencing. Primer sequences are available on request. The coding sequence of PCDHGC4 (NM_018928.2) was analyzed and variants were confirmed by a second PCR on an independent DNA sample and analyzed for cosegregation within the respective families.
Structural analysis of mouse Pcdhgb7 and in silico analysis of the mutational effect
The crystal structure of the Ca2+-bound form of mouse Pcdhgb7 was obtained from the Protein Data Bank (www.wwpdb.org; PDB ID 5v5x). Structural analysis, data visualization, and figure preparation were carried out with the program PyMOL 2.3 (www.pymol.org; Schrödinger, LLC) and WebLab viewerPro (Molecular Simulations Inc.).

RESULTS
Clinical presentation of individuals with a novel neurodevelopmental phenotype
In a national and international collaboration, we recruited 19 individuals from nine unrelated families with a clinical diagnosis of a neurodevelopmental disorder. Clinical findings on all affected individuals are summarized in Table 1, with pedigrees and clinical photographs shown in Fig. 1. Comprehensive clinical information on families 1-4, 7, and 8 is provided as Supplemental Information. For five individuals (families 5, 6, and 9), no extensive clinical descriptions are available.

Identification of biallelic truncating and missense PCDHGC4 variants
We performed linkage analysis (family 1) and/or genome/exome sequencing in probands and proband/parent trios. Based on parental consanguinity, autosomal recessive inheritance was considered likely, and we prioritized homozygous, rare exonic, and splice site variants (see Supplemental Information). We identified three different missense variants and five protein truncating variants in the Protocadherin-gamma family member PCDHGC4 (OMIM 606305; NM_018928.2) in all affected individuals (Fig. 3a, b, S1, Table 2). All variants fully cosegregated with the phenotype in the respective families and are absent or very rare in the general human population with minor allele frequencies (MAFs) ranging from 0 to 4 × 10⁻⁶, in line with an autosomal recessive pattern of inheritance ( , that were predicted to lead to an early stop and premature protein truncation, and were absent from the gnomAD database (Fig. 3b, Table 2). In family 9, we found the homozygous variant c.2443-1G>A at the acceptor splice site of intron 1, and by employing an exon-trapping approach we could show that this variant leads to a loss of the acceptor splice-site recognition, resulting in severe splicing defects such as whole-exon skipping or usage of a cryptic exonic acceptor splice site, both of which are predicted to induce a frameshift and premature protein truncation (Fig. 3b, S3). Within the family of γ-PCDHs, PCDHGC4 is the only member that is not only highly conserved across species, but also under strict mutational constraint [23]. Truncating variants in PCDHGC4 are rarely observed in healthy control individuals. For the canonical transcript of PCDHGC4 (ENST00000306593.1, NM_018928.2), only 12 alleles with nonsense variants, all in heterozygous state, were reported in the gnomAD database, in contrast to the 29.6 that were expected to be observed in the >240,000 alleles (probability of loss of function intolerance [pLI] = 0.98).
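The two allele counts quoted above can be related by simple arithmetic. The short Python sketch below computes the observed/expected ratio of truncating alleles from the figures given in the text (12 observed, 29.6 expected). It does not reproduce the pLI score itself, which gnomAD derives from a separate statistical model, and the function name is ours.

```python
def observed_expected_ratio(observed: int, expected: float) -> float:
    """Ratio of observed to expected allele counts (the expected count must be > 0)."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Counts taken from the text: 12 observed versus 29.6 expected alleles.
ratio = observed_expected_ratio(12, 29.6)
print(f"o/e = {ratio:.2f}")  # prints "o/e = 0.41"
```

A ratio well below 1 means that fewer truncating alleles were observed than expected, which is the pattern described here as mutational constraint.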
Further, biallelic copy-number variants (CNVs) encompassing PCDHGC4 have not been reported so far in the DECIPHER database, the Database of Genomic Variants (DGV), and the structural variant (SV) data set of gnomAD, with only two (DGV) and six (gnomAD) heterozygous alterations listed in these data sets that affect PCDHGC4. Interestingly, genetic disruption of the entire γ-Pcdh cluster as well as singular knockout of Pcdhgc4 in mice also cause a severe neurodevelopmental phenotype, both resulting in neurodegeneration in late embryonic stages and leading to early neonatal lethality [8,[20][21][22][23]. Furthermore, we identified three different homozygous missense variants, c.1449C>G (p.  Table 2). In silico prediction of the pathogenic effect of these missense variants by different prediction tools leads to the classification as damaging (SIFT), probably damaging (PolyPhen-2), and a Combined Annotation Dependent Depletion (CADD) score of 24.1 to 26.9, indicating deleteriousness of these variants (Table 2) [Val606Gly]) extracellular cadherin (EC) domain and are predicted to lead to the substitution of phylogenetically highly conserved amino acids in PCDHGC4 (Fig. 3c). EC domains are extracellular Ca2+-binding domains, which upon Ca2+ binding can mediate conformational changes influencing the rigidity of the EC domains of PCDHGC4, which enables cis- and trans-homophilic interactions [2]. Ca2+ binding is a crucial process for correct PCDH function. Upon binding of Ca2+, which is mediated by several calcium-binding motifs at the junctions of the EC repeats of PCDHs, the conformation and rigidity of these segments is controlled, allowing formation of cis- as well as trans-dimerizations [28,29]. Whereas EC1 to EC4 contribute to the formation of head-to-tail trans interactions between different cells, EC5 and EC6 are involved in cis-dimerization processes.
To gain further insights into the pathogenic effects of the missense variants, we performed an in silico analysis of the mutational effect on the protein structure using the crystal structure of mouse Pcdhgb7, a close homologue of PCDHGC4. All three missense variants were located in or directly adjacent to a Ca2+-binding motif. The p.(Asp483Glu) variant affects an aspartate that is part of the highly conserved DXD motif in the EC5 repeat of PCDHGC4 directly involved in calcium coordination (Fig. 3d, e)  the size of the residue, which is predicted to perturb the local structure and to shift the position of the coordinating carboxyl oxygens of this residue away from the optimal geometry of calcium-binding ligands. This should decrease the Ca2+ affinity of this motif. Interestingly, a similar substitution, p.Asp377Glu, has already been described for PCDH19, and it has been shown to impair PCDH19 function and cause early infantile epileptic encephalopathy [30]. The p.(Ala488Val) alteration is located in close proximity to the DXD motif within the fifth EC repeat (Fig. 3d, f). Structural analysis of this highly conserved residue shows that the +5 position (in relation to the DXD motif) is generally a small amino acid (glycine or alanine). Substitution of this residue with valine, as identified in our patients, introduces a larger, hydrophobic amino acid, which might interfere with the adjacent Ca2+-binding motif, thereby impairing the Ca2+ affinity of PCDHGC4 (Fig. 3f). Similarly, p.Val606 is also located in proximity to a DXD motif of PCDHGC4. In contrast to the other missense variants, the p.(Val606Gly) substitution is located in the third strand of a seven-stranded β-sheet of the sixth EC domain, embedded in a hydrophobic pocket (Fig. 3g).
Substitution of valine at position 606 with glycine is predicted to cause structural perturbation of this region, which potentially might impair these interactions and, as a result, directly affect the cis-dimerization capability of PCDHGC4. Cis-dimerization of γ-PCDH is not only important for trans homophilic interactions on the cell surface, but also essential for cell surface delivery of newly synthesized γ-PCDH itself, as demonstrated by experiments on induced mutational disruption of the cis-interface [31]. Currently, we can only speculate about the direct effect of these three identified missense variants on PCDHGC4 protein function, but given the fact that they are all located in the fifth or sixth EC repeat, it seems likely that they directly or indirectly influence these cis-dimerization processes, thereby interfering with cell surface transport of PCDHGC4-containing dimers [31].
In an additional family from Iran, we identified the homozygous missense variant c.2524G>C (p.[Gly842Ser]) in a patient presenting with facial dysmorphism, metopic craniosynostosis, ventriculomegaly, focal clonic seizures, and moderate global developmental delay (Fig. S4). This variant affects a residue within the ICD of PCDHGC4. The ICD of γ-PCDH plays an important role in the regulation of downstream signaling cascades, e.g., in the inhibition of FAK and PYK2 kinase activity, which is crucial for the promotion of dendrite arborization in cortical neurons [17]. Still, further studies are required to prove causality of this variant as well as to fully determine the specific function and involvement of this residue in intracellular, γ-PCDH-regulated signaling pathways.
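The residue numbers used in the protein-level descriptions follow from the coding positions by integer arithmetic, since each codon spans three nucleotides. The Python sketch below maps a coding-sequence position to its codon number and to the position within that codon, using c.2524 from the paragraph above and c.1449 from the missense variants reported earlier. The helper name is hypothetical, and the mapping assumes that position 1 is the first base of the start codon.

```python
def codon_of(cds_position: int) -> tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    if cds_position < 1:
        raise ValueError("coding positions start at 1")
    codon_number = (cds_position - 1) // 3 + 1
    position_in_codon = (cds_position - 1) % 3 + 1
    return codon_number, position_in_codon


for position in (1449, 2524):
    codon, offset = codon_of(position)
    print(f"c.{position}: codon {codon}, base {offset} of 3")
```

Under this assumption, c.2524 falls in codon 842, the residue named in the variant above, and c.1449 falls in codon 483, the residue number of the aspartate discussed in the structural analysis.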

DISCUSSION
In the present study, we provide strong genetic evidence that biallelic nonsense and missense variants in PCDHGC4 cause a distinct neurodevelopmental phenotype comprising progressive microcephaly, short stature, intellectual disability, seizures, and joint anomalies. In all 19 affected individuals from nine different families, we were able to identify homozygous disease-causing variants in PCDHGC4 that most likely lead to a loss of function of the encoded protein.
Interestingly, we observed seizures in 10 of 19 patients. Generally, development of focal seizures is considered to be caused by a disturbance of the excitation/inhibition balance in cortical neurons. Within these neuronal circuits, GABAergic cortical inhibitory interneurons (cINs) play an important role in restraining excitation levels in the brain under normal conditions, and alterations in the number of cINs have been associated with epilepsy [32]. During embryonic development, the number of cINs is regulated by programmed cell death. Initially, excess numbers of cINs are generated from a pool of cIN progenitor cells which migrate to the developing cortex. Upon arrival, ~40% of these cells are eliminated by endogenously triggered programmed cell death [33][34][35]. Interestingly, except Pcdhga9, all 21 γ-Pcdhs are expressed in cINs. Expression of four isoforms, Pcdhga1, Pcdhga2, Pcdhgc4, and Pcdhgc5, increases significantly between P8 and P15, corresponding to the period in which programmed cell death of cINs takes place [36]. Recent studies showed that Pcdhgc3, Pcdhgc4, and Pcdhgc5 are crucial components in the regulation of this programmed cell death. Loss of these isoforms enhances the number of cINs undergoing apoptosis, which results in a reduced cortical density of cINs [36]. A similar function of γ-PCDHs in controlling programmed cell death has also been described for neuronal cells of the spinal cord and the retina [8,9,22]. Still, further molecular and cellular studies are required to determine whether disease-causing variants in PCDHGC4 alone are sufficient to increase programmed cell death in neuronal cells and to give rise to the clinical presentation observed in our patients via this pathway. Interestingly, genetic disruption of the entire γ-Pcdh cluster as well as singular knockout of Pcdhgc4 in mice both result not only in neurodegeneration in late embryonic stages but also lead to early neonatal lethality [8,[20][21][22][23].
Currently, it is unclear why disruption of Pcdhgc4 in mice leads to neonatal lethality, whereas biallelic loss-of-function variants in humans, as observed in our patients, result in a milder neurodevelopmental disorder comprising progressive microcephaly, seizures, and intellectual disability, especially when considering that both humans and mice share the same set of 22 members within the γ-PCDH cluster. However, the difference between the observed phenotypes suggests that the human brain might compensate for the functional failure of PCDHGC4, resulting in a higher tolerance of loss-of-function variants in terms of lethality.
So far, to the best of our knowledge, no member of the clustered PCDH family has been shown to be involved in the pathogenesis of a congenital human disorder. In recent years, disease-causing variants in several nonclustered δ-PCDH family members have been described and closely linked to different neurodevelopmental diseases. This includes biallelic loss-of-function variants in PCDH12 and PCDH15, which were identified as cause of diencephalic-mesencephalic junction dysplasia syndrome type 1 (DMJDS1), Usher syndrome type 1F, and nonsyndromic hearing loss, respectively, as well as PCDH19, in which over 100 different missense and nonsense variants have been reported to underlie X-linked developmental and epileptic encephalopathy 9 (DEE9), highlighting the importance of cell-cell communication via PCDH19 at the early stages of brain development [24][25][26]37]. Although recent studies indicate that complete or partial epigenetic dysregulation of the clustered PCDH occurs in cells of patients with Williams-Beuren syndrome or Down syndrome, and hypermethylation of all three PCDH clusters is detectable in Wilms tumors, a direct link to a monogenic, congenital human disorder has not been established before [38][39][40]. Currently, it is unclear whether this is due to functional redundancy of the encoded PCDHs. Mice lacking the α- or β-Pcdh cluster are viable and fertile [16], whereas knockout of the whole γ-Pcdh cluster results in neonatal lethality [8,20,21]. Similar consequences were observed when only the γC3 to γC5 isoforms within this cluster were disrupted, indicating that one of these three isoforms has a critical function [22]. Very recent results based on the generation of single knockouts of γ-Pcdh members suggest that Pcdhgc4 is the crucial isoform within the gamma cluster, which is required for neuronal survival and responsible for neonatal lethality [23].
The unique role of PCDHGC4 is further supported by genetic data indicating that PCDHGC4 is the only member within the γ-PCDH cluster that is under strict mutational constraint [23]. We can only speculate about the molecular basis of the distinct role of PCDHGC4, especially as its overall structure is similar to other γ-PCDHs.
In conclusion, we show that biallelic truncating and missense variants in PCDHGC4 cause a specific human phenotype characterized by neurodevelopmental delay, progressive microcephaly with mild to severe intellectual disability, global developmental delay, joint anomalies, and seizures, providing evidence that disease-causing variants in a single member of the clustered PCDH family are involved in the pathogenesis of a congenital disorder in humans.

DATA AVAILABILITY
The data that support the findings of this study are available on request from the corresponding authors. The genetic data are not publicly available due to privacy or ethical restrictions.