INTRODUCTION

Protocadherins (PCDHs) comprise a large family of over 80 cell surface receptors that are mainly expressed during the development of the vertebrate nervous system and play a crucial role in the discrimination between self and nonself cell surface identities during the establishment and generation of neuronal circuits [1, 2]. Based on their genomic organization, human PCDHs can be divided into two families, encoded either by genes distributed across the genome (nonclustered PCDHs) or by genes clustered in a 1-Mb region on human chromosome 5 [3]. Clustered PCDHs (cPCDHs) are encoded by a total of 53 genes arranged in three subclusters (PCDHA, PCDHB, and PCDHG) within this region [4,5,6]. All cPCDHs have a similar structure. They are type I transmembrane proteins containing six extracellular cadherin (EC) domains, a transmembrane region, and, in the case of α- and γ-PCDHs, an intracellular domain (ICD) [1]. In the PCDHA and PCDHG subclusters, multiple “variable” exons, which encode the entire extracellular region, the transmembrane domain, and a variable part of the intracellular region, are tandemly arranged upstream of three “constant” exons, which are shared within a subcluster and code for a common C-terminal intracellular domain [4, 7]. cPCDHs are widely expressed in the developing and mature nervous system, including the spinal cord, cerebellum, and hippocampus [8,9,10,11]. They have been shown to form homophilic cis- and trans-interactions that induce the formation of multimeric protein complexes [12,13,14]. Neurons have been suggested to create a unique “barcode” by expressing different combinations of these proteins, which results in neuron-specific sets of cis-dimers and allows self–nonself discrimination based on the formation of trans-homophilic interactions [2, 15].
Recent functional studies have linked numerous cPCDHs to critical neuronal processes such as regulation of neuronal survival, axon outgrowth and targeting, dendrite arbor complexity, self-avoidance of sister axon and dendrite branches, and synaptogenesis [8, 16,17,18]. Whereas knockout mice of the α-Pcdh cluster are viable and fertile and show only abnormal axonal projections of serotonergic and olfactory sensory neurons [16, 19], disruption of the γ-Pcdh locus leads to neonatal lethality [8, 20, 21]. Recent studies revealed that Pcdhgc3, Pcdhgc4, and Pcdhgc5 are crucial for the observed lethality [22, 23].

To date, rare variants in nonclustered PCDHs have been identified in individuals with different neurodevelopmental disorders. Rare biallelic variants in PCDH12 (OMIM 605622) and PCDH15 (OMIM 605514) have been reported in patients with diencephalic–mesencephalic junction dysplasia syndrome 1 (DMJDS1; OMIM 251280), and with Usher syndrome type 1F (USH1F; OMIM 602083) and nonsyndromic hearing loss (DFNB23; OMIM 609533), respectively [24, 25]. Furthermore, more than 100 disease-causing variants have been described in PCDH19 (OMIM 300460) in developmental and epileptic encephalopathy 9 (DEE9; OMIM 300088), making it one of the clinically relevant genes in epilepsy [26]. So far, despite the important role of cPCDHs during neurodevelopment and in neural circuit assembly, no variant in any of the cPCDHs has been shown to cause a Mendelian disorder in humans. In this study, we report the identification of biallelic disease-causing variants in Protocadherin-gamma-C4 (PCDHGC4) in 19 individuals from nine unrelated families. Affected individuals presented with progressive microcephaly, global developmental delay, intellectual disability, seizures, joint anomalies, and additional dysmorphic features. These findings establish biallelic PCDHGC4 variants as a genetic cause of a novel neurodevelopmental disorder in humans and delineate the associated phenotype.

MATERIALS AND METHODS

Subjects

Individuals who participated in this study were clinically characterized in several clinics across the world (see Supplemental Information), and we used the GeneMatcher tool [27] to connect centers in which genetic analyses were performed. All individuals reported herein were born to consanguineous families of different geographic origin, and the respective families were not related to each other. Subjects or their legal representatives gave written informed consent for the molecular analyses and for publication of the results and clinical information, including photographs. All studies were performed in accordance with the Declaration of Helsinki protocols and were reviewed and approved by the local institutional ethics board. DNA from participating family members was extracted from peripheral blood lymphocytes by standard extraction procedures.

Genome/exome sequencing and linkage analysis

Genome and exome sequencing was performed on patient/parent trios (family 8), single (family 9), or multiple affected family members (families 1–7). Details on sequencing and variant screening as well as genome-wide linkage analysis (family 1) are provided as Supplementary Information.

Variant verification and Sanger sequencing

Verification of identified nonsense and missense variants was performed using standard methods for polymerase chain reaction (PCR) amplification and Sanger sequencing. Primer sequences are available on request. The coding sequence of PCDHGC4 (NM_018928.2) was analyzed and variants were confirmed by a second PCR on an independent DNA sample and analyzed for cosegregation within the respective families.

Prediction programs

In silico prediction of the mutational effect for all missense variants was performed using Combined Annotation Dependent Depletion (CADD; https://cadd.gs.washington.edu), MutationTaster (www.mutationtaster.org), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2), and SIFT (https://sift.bii.a-star.edu.sg). Variants with potential effects on splicing were characterized using ESEfinder and RESCUE ESE (see Supplemental Information).
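As an aid to reading the CADD values reported in the Results (24.1 to 26.9), the following minimal sketch converts a score into the fraction of possible variants ranked at least as deleterious. It assumes that the reported values are PHRED-scaled CADD scores, for which a score of 10 corresponds to the top 10% and a score of 20 to the top 1% of all possible substitutions. The function name is illustrative and is not part of the CADD software.

```python
def cadd_phred_to_top_fraction(phred: float) -> float:
    """Fraction of all possible variants ranked at or above a PHRED-scaled CADD score.

    Assumes the PHRED-like scaling used by CADD: score = -10 * log10(rank fraction).
    """
    return 10 ** (-phred / 10)


if __name__ == "__main__":
    # 20.0 is a commonly quoted reference point; 24.1 and 26.9 bound the range in Table 2.
    for score in (20.0, 24.1, 26.9):
        fraction = cadd_phred_to_top_fraction(score)
        print(f"CADD {score}: top {fraction:.2%} of possible variants")
```

Under this assumption, the missense variants described here fall within roughly the top 0.2 to 0.4% of possible substitutions.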

Structural analysis of mouse Pcdhgb7 and in silico analysis of the mutational effect

The crystal structure of the Ca2+-bound form of mouse Pcdhgb7 was obtained from the Protein Data Bank (www.wwpdb.org; PDB ID 5v5x). Structural analysis, data visualization, and figure preparation were carried out with PyMOL 2.3 (www.pymol.org; Schrödinger, LLC) and WebLab ViewerPro (Molecular Simulations Inc.).

RESULTS

Clinical presentation of individuals with a novel neurodevelopmental phenotype

In a national and international collaboration, we recruited 19 individuals from nine unrelated families with a clinical diagnosis of a neurodevelopmental disorder. Clinical findings on all affected individuals are summarized in Table 1, with pedigrees and clinical photographs shown in Fig. 1. Comprehensive clinical information on families 1–4, 7, and 8 is provided as Supplemental Information. For five individuals (families 5, 6, and 9), no extensive clinical descriptions are available.

Table 1 Summary of genetic data and clinical features of affected individuals.
Fig. 1: Pedigrees and clinical characteristics of individuals harboring biallelic disease-causing variants in PCDHGC4.

(a) Pedigrees of nine unrelated families with disease-causing variants in PCDHGC4. All affected siblings (solid symbols) in each family carry homozygous disease-causing variants in PCDHGC4, while unaffected parents (white symbols) are heterozygous for the identified PCDHGC4 variants. (b) Upper panel: facial features of subjects IV-3 and V-1 from family 1 (left), clinical characteristics of subjects II-1 and II-2 from family 2 showing kyphoscoliosis, clinodactyly, and hallux valgus (subject II-1), and kyphosis and hypoplasia of the toes (subject II-2). Lower panel (from left to right): facial features and hand anomalies observed in subjects II-1 (22 years) and II-3 (14 years) from family 4, clinical characteristics of subjects VI-1 and VI-2 from family 5 and subjects IV-2 and IV-3 from family 6, and facial features and feet anomalies observed in subject VI-1 from family 8.

Common features in our patient cohort were developmental delay (DD)/intellectual disability (ID) (18/19), microcephaly (12/19), seizures (10/19), hypotonia (10/19), and skeletal/joint anomalies (10/19). Occipital–frontal circumferences (OFCs) at birth ranged from 1.7 SD (individual IV-3, family 6) to −3 SD (individual VI-2, family 5), and we observed microcephaly at birth (OFC ≤ −2 SD) in 2/19 patients. However, at follow-up examinations, 12/19 individuals showed progressive mild to severe microcephaly with values from −2 SD to −5.5 SD. Neuroimaging was available for 12 individuals. Brain magnetic resonance imaging/computed tomography (MRI/CT) abnormalities (11/12 patients) were rather nonspecific (Fig. 2, Table 1). Microcephaly, thin cerebral cortex, mild ventriculomegaly, and cortical atrophy were the most common features. Seizure types ranged from single febrile seizures (family 3, subject II-3) and recurring events (family 2, subjects II-1 and II-2) to generalized tonic, clonic–focal to multifocal seizures (families 4 and 5; Table 1). Electroencephalogram (EEG) data were available for four subjects (family 2, subjects II-1 and II-2; family 5, subject VI-2; family 7, subject III-1) and showed no abnormalities. ID and DD were present in all our patients, and we observed motor and speech developmental delay as well as mild to severe cognitive impairment. Three individuals presented with kyphosis and/or scoliosis, hyperextensible joints were observed in three individuals, and contractures as well as arthrogryposis were present in four individuals (Fig. 1b, Table 1). Dysmorphic facial features were rather nonspecific and did not reveal a common, recognizable facial presentation within our patient cohort (Fig. 1b, Table 1).
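The feature counts and the microcephaly criterion above can be restated numerically. The sketch below is illustrative only: it uses the counts given in the text (out of 19 individuals) and the stated cutoff of OFC ≤ −2 SD, and the names `feature_counts`, `frequency`, and `is_microcephalic` are hypothetical helpers rather than part of the study's analysis.

```python
COHORT_SIZE = 19  # affected individuals reported in this study

# Counts as stated in the text above (affected individuals per feature).
feature_counts = {
    "DD/ID": 18,
    "microcephaly at follow-up": 12,
    "seizures": 10,
    "hypotonia": 10,
    "skeletal/joint anomalies": 10,
}


def frequency(count: int, total: int = COHORT_SIZE) -> float:
    """Proportion of the cohort showing a feature."""
    return count / total


def is_microcephalic(ofc_sd: float, threshold: float = -2.0) -> bool:
    """Apply the criterion used in the text: OFC at or below -2 SD."""
    return ofc_sd <= threshold


if __name__ == "__main__":
    for feature, count in feature_counts.items():
        print(f"{feature}: {count}/{COHORT_SIZE} ({frequency(count):.1%})")
    # Extremes of the birth OFC range reported above.
    print(is_microcephalic(1.7), is_microcephalic(-3.0))
```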

Fig. 2: Neuroradiologic features of affected individuals.

Sagittal (a) and axial (b) T2-weighted images of subject IV-3 from family 1 at the age of 10 years revealed no structural brain anomalies but showed microcephaly and thin cerebral cortex. (c) Sagittal T1 section after gadolinium injection of subject II-1 (family 4) at 10 years of age and (d) axial T2-weighted images at 16 years revealed no brain-specific abnormality except for a discrete prominence of the lateral ventricles. (e) Sagittal T1 section after gadolinium injection and (f) axial T2-weighted images of subject II-3 (family 4) at 7 years of age revealed prominence of the lateral ventricles, of the 3rd and, to a milder degree, of the 4th ventricle. (g) Coronal T2-weighted and (h) axial T2-FLAIR images of subject IV-2 (family 6) at 3 months of age showing normal signal intensity, age-appropriate myelination, and slightly enlarged cerebrospinal fluid (CSF) spaces. Coronal (i) and sagittal (j) computed tomography (CT) images of the same subject at the age of 3 years revealing subcortical hypodensity within the left temporal lobe and confirming prominent CSF spaces.

Identification of biallelic truncating and missense PCDHGC4 variants

We performed linkage analysis (family 1) and/or genome/exome sequencing in probands and proband/parent trios. Based on parental consanguinity, autosomal recessive inheritance was considered likely, and we prioritized homozygous, rare exonic, and splice site variants (see Supplemental Information). We identified three different missense variants and five protein-truncating variants in the Protocadherin-gamma family member PCDHGC4 (OMIM 606305; NM_018928.2) in all affected individuals (Fig. 3a, b, S1, Table 2). All variants fully cosegregated with the phenotype in the respective families and are absent or very rare in the general human population, with minor allele frequencies (MAFs) ranging from 0 to 4 × 10⁻⁶, in line with an autosomal recessive pattern of inheritance (Table 2). We identified four homozygous loss-of-function variants in PCDHGC4, c.118C>T (p.[Gln40*]), c.324del (p.[Phe108Leufs*14]), c.1243C>T (p.[Arg415*]), and c.1724dup (p.[Leu575Phefs*63]), that were predicted to lead to an early stop and premature protein truncation and were absent from the gnomAD database (Fig. 3b, Table 2). In family 9, we found the homozygous variant c.2443-1G>A at the acceptor splice site of intron 1. By employing an exon‐trapping approach, we could show that this variant leads to a loss of acceptor splice‐site recognition, resulting in severe splicing defects such as whole‐exon skipping or usage of a cryptic exonic acceptor splice site, both of which are predicted to induce a frameshift and premature protein truncation (Fig. 3b, S3). Within the family of γ-PCDHs, PCDHGC4 is the only member that is not only highly conserved across species but also under strict mutational constraint [23]. Truncating variants in PCDHGC4 are rarely observed in healthy control individuals.
For the canonical transcript of PCDHGC4 (ENST00000306593.1, NM_018928.2), only 12 alleles with nonsense variants, all in heterozygous state, were reported in the gnomAD database, in contrast to the 29.6 that were expected to be observed in the >240,000 alleles (probability of loss of function intolerance [pLI] = 0.98). Further, biallelic copy-number variants (CNVs) encompassing PCDHGC4 have not been reported so far in the DECIPHER database, the Database of Genomic Variants (DGV), or the structural variant (SV) data set of gnomAD, with only two (DGV) and six (gnomAD) heterozygous alterations affecting PCDHGC4 listed in these data sets. Interestingly, genetic disruption of the entire γ-Pcdh cluster as well as singular knockout of Pcdhgc4 in mice also cause a severe neurodevelopmental phenotype, both resulting in neurodegeneration in late embryonic stages and leading to early neonatal lethality [8, 20,21,22,23].
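The depletion of truncating alleles described above can be expressed as a simple observed/expected ratio. The following sketch uses only the two numbers quoted in the text (12 observed, 29.6 expected) and a hypothetical helper name. Note that the pLI value itself is provided by gnomAD from a statistical model and is not computed by this ratio.

```python
def observed_expected_ratio(observed: float, expected: float) -> float:
    """Ratio of observed to expected loss-of-function alleles (lower means stronger depletion)."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


if __name__ == "__main__":
    # Values quoted in the text for the canonical PCDHGC4 transcript.
    ratio = observed_expected_ratio(observed=12, expected=29.6)
    print(f"o/e = {ratio:.2f}")
```

With these inputs, roughly 40% of the expected number of truncating alleles is observed in the reference population.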

Fig. 3: Molecular characterization and in silico analysis of identified disease-causing variants in PCDHGC4.

(a) Schematic representation of the human γ-PCDH cluster. Variable exons of the γ-PCDH A and B subfamilies are shown in gray and black, respectively. Variable exons of the γ-PCDH C subfamily are shown in purple, γ-PCDH constant exons in blue. (b) Schematic representation of the genomic (upper panel) and protein structure (lower panel) of PCDHGC4, and localization of the identified disease-causing variants. Introns are shown by a black horizontal line, coding exons by purple and blue bars, and noncoding regions of exons by a small blue bar (upper panel). The scale bar refers solely to exons. Protein structure of PCDHGC4 with six extracellular cadherin (EC) repeats (purple), the transmembrane region (gray), and the intracellular domain (ICD, blue). (c) Amino acid sequence alignment of PCDHGC4 across different species including mouse Pcdhgb7 (lower line, all panels) for residues p.Asp483 and p.Ala488 (upper panel) and p.Val606 (lower panel) that are altered in the affected subjects. Protein sequences were retrieved from UniProtKB and the alignment was performed using Clustal Omega. Positions of the altered residues in the human protein are indicated (top numbers). (d) Three-dimensional structure of the EC3 to EC6 domains of Pcdhgb7. Structural information was obtained from the Protein Data Bank (PDB) and is available under the accession number 5v5x. Pcdhgb7 is shown in ribbon representation. β-strands are shown as arrows (blue), a short helical part in red, calcium ions in sphere representation (green), and aspartate at position 478 within the Ca2+-binding DXD motif in space filling representation (red). (e-g) Close-up cross-eyed stereo views of p.Asp478 in Pcdhgb7 (corresponding to p.Asp483 in PCDHGC4) (e), p.Gly483 (corresponding to p.Ala488 in PCDHGC4) (f), and p.Leu602 (corresponding to p.Val606 in PCDHGC4) (g).
Affected amino acid residues are labeled in red, calcium ions are shown in sphere representation (light green), oxygen ligands of the adjacent calcium ion in space filling representation (f, dark green), surrounding hydrophobic residues p.Pro558, p.Tyr604, and p.Val644 of p.Leu602 in space filling representation in yellow (g).

Table 2 In silico prediction and population allele frequencies of PCDHGC4 (NM_018928.2; ENST00000306593.1) variants identified in this study.

Furthermore, we identified three different homozygous missense variants, c.1449C>G (p.[Asp483Glu]), c.1463C>T (p.[Ala488Val]), and c.1817T>G (p.[Val606Gly]), in PCDHGC4 in affected individuals of four additional consanguineous families (Fig. 3, Table 2). In silico prediction of the pathogenic effect of these missense variants by different prediction tools classified them as damaging (SIFT) and probably damaging (PolyPhen-2) and yielded Combined Annotation Dependent Depletion (CADD) scores of 24.1 to 26.9, indicating deleteriousness of these variants (Table 2). Two missense variants, p.(Asp483Glu) and p.(Ala488Val), were classified as polymorphisms by a single in silico prediction tool, MutationTaster. In two families, families 3 and 6 from Iraq and Saudi Arabia, respectively, we identified the identical missense variant, c.1463C>T (p.[Ala488Val]), in PCDHGC4. In affected individuals of both families, this variant was within a shared haplotype of approximately 309 kb between chr5:140,750,044 and chr5:141,059,868, suggesting a founder nature of the variant. At the protein level, the three missense variants are located in the extracellular domain of PCDHGC4 within the fifth (p.[Asp483Glu] and p.[Ala488Val]) or sixth (p.[Val606Gly]) extracellular cadherin (EC) domain and are predicted to lead to the substitution of phylogenetically highly conserved amino acids in PCDHGC4 (Fig. 3c). EC domains are extracellular Ca2+-binding domains, which upon Ca2+ binding can undergo conformational changes that influence the rigidity of the EC domains of PCDHGC4 and thereby enable cis- and trans-homophilic interactions [2]. Ca2+ binding is a crucial process for correct PCDH function. Upon binding of Ca2+, which is mediated by several calcium-binding motifs at the junctions of the EC repeats of PCDHs, the conformation and rigidity of these segments are controlled, allowing formation of cis- as well as trans-dimers [28, 29].
Whereas EC1 to EC4 contribute to the formation of head-to-tail trans interactions between different cells, EC5 and EC6 are involved in cis-dimerization processes. To gain further insights into the pathogenic effects of the missense variants, we performed an in silico analysis of the mutational effect on the protein structure using the crystal structure of mouse Pcdhgb7, a close homologue of PCDHGC4. All three missense variants were located in or directly adjacent to a Ca2+-binding motif. The p.(Asp483Glu) variant affects an aspartate that is part of the highly conserved DXD motif in the EC5 repeat of PCDHGC4 and is directly involved in calcium coordination (Fig. 3d, e). Although this variant does not change the charge of the coordinating residue, it alters the size of the residue, which is predicted to perturb the local structure and to shift the position of the coordinating carboxyl oxygens of this residue away from the optimal geometry of calcium-binding ligands. This should decrease the Ca2+ affinity of this motif. Interestingly, a similar substitution, p.Asp377Glu, has already been described for PCDH19, and it has been shown to impair PCDH19 function and cause early infantile epileptic encephalopathy [30]. The p.(Ala488Val) alteration is located in close proximity to the DXD motif within the fifth EC repeat (Fig. 3d, f). Structural analysis of this highly conserved residue shows that the +5 position (in relation to the DXD motif) is generally occupied by a small amino acid (glycine or alanine). Substitution of this residue with valine, as identified in our patients, introduces a larger, hydrophobic amino acid, which might interfere with the adjacent Ca2+-binding motif, thereby impairing the Ca2+ affinity of PCDHGC4 (Fig. 3f). Similarly, p.Val606 is also located in proximity to a DXD motif of PCDHGC4.
In contrast to the other missense variants, the p.(Val606Gly) substitution is located in the third strand of a seven-stranded β-sheet of the sixth EC domain, embedded in a hydrophobic pocket (Fig. 3g). Substitution of valine at position 606 with glycine is predicted to cause structural perturbation of this region, which might impair the hydrophobic interactions within this pocket and, as a result, directly affect the cis-dimerization capability of PCDHGC4. Cis-dimerization of γ-PCDHs is not only important for trans-homophilic interactions on the cell surface, but also essential for cell surface delivery of newly synthesized γ-PCDHs, as demonstrated by experiments on induced mutational disruption of the cis-interface [31]. Currently, we can only speculate about the direct effect of these three identified missense variants on PCDHGC4 protein function, but given that they are all located in the fifth or sixth EC repeat, it seems likely that they directly or indirectly influence these cis-dimerization processes, thereby interfering with cell surface transport of PCDHGC4-containing dimers [31].
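The size of the shared haplotype reported for families 3 and 6 follows directly from the two boundary coordinates given above. A minimal sketch of that arithmetic, using a hypothetical helper name and assuming both coordinates refer to the same genome build:

```python
def span_bp(start: int, end: int) -> int:
    """Distance in base pairs between two coordinates on the same chromosome."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start


if __name__ == "__main__":
    # Boundary coordinates of the shared haplotype on chromosome 5, as given in the text.
    size = span_bp(140_750_044, 141_059_868)
    print(f"{size} bp = {size / 1000:.1f} kb")
```

The result of 309,824 bp agrees with the stated size of approximately 309 kb.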

In an additional family from Iran, we identified the homozygous missense variant c.2524G>C (p.[Gly842Ser]) in a patient presenting with facial dysmorphism, metopic craniosynostosis, ventriculomegaly, focal clonic seizures, and moderate global developmental delay (Fig. S4). This variant affects a residue within the ICD of PCDHGC4. The ICD of γ-PCDH plays an important role in the regulation of downstream signaling cascades, e.g., in the inhibition of FAK and PYK2 kinase activity, which is crucial for the promotion of dendrite arborization in cortical neurons [17]. Still, further studies are required to prove causality of this variant as well as to fully determine the specific function and involvement of this residue in intracellular, γ-PCDH-regulated signaling pathways.
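The relation between the coding (c.) positions and the affected protein residues of the exonic variants reported in this section can be checked with simple codon arithmetic. The sketch below assumes that c.1 is the first base of the start codon, as in standard HGVS numbering. The function name is illustrative, and the intronic splice variant c.2443-1G>A is omitted because it does not fall within a codon.

```python
def codon_of(c_pos: int) -> tuple:
    """Return (codon number, position within codon 1..3) for a 1-based coding position."""
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    index = c_pos - 1
    return index // 3 + 1, index % 3 + 1


# Coding position -> residue number, as stated in the variant descriptions above.
REPORTED = {
    118: 40,    # p.Gln40*
    324: 108,   # p.Phe108Leufs*14
    1243: 415,  # p.Arg415*
    1449: 483,  # p.Asp483Glu
    1463: 488,  # p.Ala488Val
    1724: 575,  # p.Leu575Phefs*63
    1817: 606,  # p.Val606Gly
    2524: 842,  # p.Gly842Ser
}

if __name__ == "__main__":
    for c_pos, residue in REPORTED.items():
        codon, offset = codon_of(c_pos)
        status = "consistent" if codon == residue else "MISMATCH"
        print(f"c.{c_pos}: codon {codon}, base {offset} of 3 ({status})")
```

Each reported coding position maps to the residue number given in its protein-level description.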

DISCUSSION

In the present study, we provide strong genetic evidence that biallelic nonsense and missense variants in PCDHGC4 cause a distinct neurodevelopmental phenotype comprising progressive microcephaly, short stature, intellectual disability, seizures, and joint anomalies. In all 19 affected individuals from nine different families, we were able to identify homozygous disease-causing variants in PCDHGC4 that most likely lead to a loss of function of the encoded protein.

Interestingly, we observed seizures in 10 of 19 patients. Generally, development of focal seizures is considered to be caused by a disturbance of the excitation/inhibition balance in cortical neurons. Within these neuronal circuits, GABAergic cortical inhibitory interneurons (cINs) play an important role in restraining excitation levels in the brain under normal conditions, and alterations in the number of cINs have been associated with epilepsy [32]. During development, the number of cINs is regulated by programmed cell death. Initially, excess numbers of cINs are generated from a pool of cIN progenitor cells, which migrate to the developing cortex. Upon arrival, ~40% of these cells are eliminated by endogenously triggered programmed cell death [33,34,35]. Notably, all γ-Pcdhs except Pcdhga9 (21 of the 22 isoforms) are expressed in cINs. Expression of four isoforms, Pcdhga1, Pcdhga2, Pcdhgc4, and Pcdhgc5, increases significantly between P8 and P15, corresponding to the period in which programmed cell death of cINs takes place [36]. Recent studies showed that Pcdhgc3, Pcdhgc4, and Pcdhgc5 are crucial components in the regulation of this programmed cell death. Loss of these isoforms increases the number of cINs undergoing apoptosis, which results in a reduced cortical density of cINs [36]. A similar function of γ-PCDHs in controlling programmed cell death has also been described for neuronal cells of the spinal cord and the retina [8, 9, 22]. Still, further molecular and cellular studies are required to determine whether disease-causing variants in PCDHGC4 alone are sufficient to increase programmed cell death in neuronal cells and to give rise to the clinical presentation observed in our patients via this pathway. In mice, genetic disruption of the entire γ-Pcdh cluster as well as singular knockout of Pcdhgc4 both result not only in neurodegeneration in late embryonic stages but also in early neonatal lethality [8, 20,21,22,23].
Currently, it is unclear why disruption of Pcdhgc4 in mice leads to neonatal lethality, whereas biallelic loss-of-function variants in humans, as observed in our patients, result in a milder neurodevelopmental disorder comprising progressive microcephaly, seizures, and intellectual disability, especially when considering that humans and mice share the same set of 22 members within the γ-PCDH cluster. However, the difference between the observed phenotypes suggests that the human brain might compensate for the functional failure of PCDHGC4, resulting in a higher tolerance of loss-of-function variants in terms of lethality.

So far, to the best of our knowledge, no member of the clustered PCDH family has been shown to be involved in the pathogenesis of a congenital human disorder. In recent years, disease-causing variants in several nonclustered δ-PCDH family members have been described and closely linked to different neurodevelopmental diseases. These include biallelic loss-of-function variants in PCDH12 and PCDH15, which were identified as the cause of diencephalic–mesencephalic junction dysplasia syndrome type 1 (DMJDS1), and of Usher syndrome type 1F and nonsyndromic hearing loss, respectively, as well as variants in PCDH19, in which over 100 different missense and nonsense variants have been reported to underlie X-linked developmental and epileptic encephalopathy 9 (DEE9), highlighting the importance of cell–cell communication via PCDH19 at the early stages of brain development [24,25,26, 37]. Although recent studies indicate that complete or partial epigenetic dysregulation of the clustered PCDHs occurs in cells of patients with Williams–Beuren syndrome or Down syndrome, and hypermethylation of all three PCDH clusters is detectable in Wilms tumors, a direct link to a monogenic, congenital human disorder has not been established before [38,39,40]. Currently, it is unclear whether this is due to functional redundancy of the encoded PCDHs. Mice lacking the α- or β-Pcdh cluster are viable and fertile [16], whereas knockout of the whole γ-Pcdh cluster results in neonatal lethality [8, 20, 21]. Similar consequences were observed when only the γC3 to γC5 isoforms within this cluster were disrupted, indicating that one of these three isoforms has a critical function [22]. Very recent results based on the generation of single knockouts of γ-Pcdh members suggest that Pcdhgc4 is the crucial isoform within the gamma cluster, which is required for neuronal survival and whose loss is responsible for neonatal lethality [23].
The unique role of PCDHGC4 is further supported by genetic data indicating that PCDHGC4 is the only member within the γ-PCDH cluster that is under strict mutational constraint [23]. We can only speculate about the molecular basis of the distinct role of PCDHGC4, especially as its overall structure is similar to other γ-PCDHs.

In conclusion, we show that biallelic truncating and missense variants in PCDHGC4 cause a specific human phenotype characterized by neurodevelopmental delay, progressive microcephaly with mild to severe intellectual disability, global developmental delay, joint anomalies, and seizures, providing evidence that disease-causing variants in a single member of the clustered PCDH family are involved in the pathogenesis of a congenital disorder in humans.