Introduction

Chromosomal microarray analysis has become a powerful diagnostic tool for testing patients with neurodevelopmental disorders. However, the pathogenicity of some CNVs is difficult to interpret because their causal relationship to the phenotype remains unclear. This is especially true when the CNV is inherited from a healthy parent. Several hypotheses have been proposed to explain the incomplete penetrance of CNVs: phenotypic variation extending into the normal range, imprinting, other epigenetic modifications, mosaicism in the unaffected parent, the action of a modifier gene on a key dosage-sensitive locus, modification of the abnormality during transmission, the presence of a “second hit” elsewhere in the genome, and, in cases of chromosomal deletions, unmasking of a recessive variant on the contralateral allele in the proband [1, 2]. Indeed, an abnormal phenotype may result from the combined effect of a mutant allele and a deletion at the same locus, following a recessive mode of inheritance. To our knowledge, the first case reported was the unmasking of a hemizygous P gene variant by a chromosome 15q deletion in a patient with albinism and Prader–Willi syndrome [3]. Similarly, 22q11.2 deletions associated with hemizygous variants in SNAP29 lead to cerebral dysgenesis, neuropathy, ichthyosis, and palmoplantar keratoderma (CEDNIK) syndrome, while variants in the SCARF2 gene cause Van den Ende–Gupta syndrome [4, 5]. Additional examples are the 16p11.2 and 16p13.11 regions, in which variants in the CLN3 and NDE1 genes, respectively, result in unusually severe phenotypes [6, 7]. It is worth noting that in these cases the recessive variant results in a null allele. Interestingly, it has also been shown that some conditions, such as congenital scoliosis and thrombocytopenia-absent radius (TAR) syndrome, can be caused by compound heterozygosity for a deletion and a hypomorphic allele [8, 9]. Finally, homozygous deletions have also been reported as a mechanism to explain some cases of phenotypic variability [10, 11].

Currently, next-generation sequencing technologies, in particular whole-exome sequencing (WES), provide a unique opportunity to search for additional single-nucleotide variants (SNVs) that may contribute to variable phenotypic expression of inherited CNVs. Our project aimed to evaluate the frequency of recessive variants unmasked by a deletion to explain the phenotypic variability between a parent and his/her child carrying the same chromosomal deletion.

Subjects and methods

We collected a cohort of 19 patients with neurodevelopmental disorders harboring a rare CNV inherited from a healthy parent. The genomic imbalance was detected by array comparative genomic hybridization (CGH) and confirmed by fluorescence in situ hybridization (FISH) analysis. FISH analysis was also carried out in the parents to determine the inheritance of the CNV.

For patients 14–19, because of the small number of genes mapping within the deleted chromosomal segment, mutation screening on the contralateral allele was performed by conventional Sanger sequencing. The remaining cases were investigated by WES analysis.

We first focused on hemizygous SNVs located within the deleted fragment and corresponding to non-synonymous variants, splice acceptor and donor site mutations, and coding insertions/deletions (indels). We searched for homozygous deletions by evaluating coverage depth with an in-house algorithm developed by the bioinformatics platform of our institute. We also used the IgView browser to visualize the deleted regions and look for segments not covered by any reads, which could correspond to homozygous deletions. Second, we excluded common genetic variants (minor allele frequency >1%) reported in public databases (dbSNP138 (http://www.ncbi.nih.gov/SNP), the 1000 Genomes Project (http://browser.1000genomes.org/index.html), the NHLBI ESP Exome Variant Server (http://evs.gs.washington.edu/EVS/), the ExAC browser (http://exac.broadinstitute.org)) or in in-house exome data containing information for >5000 samples. Finally, the in silico prediction tools SIFT (score ≤0.05) and PolyPhen-2 (score >0.15) were used to evaluate the potential impact of the variants on protein function.
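The filtering steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Variant` record, the consequence labels and the function names are hypothetical, and two points that the text does not specify are assumptions made here, namely that a variant is retained when at least one of the two prediction tools flags it, and that variants without a prediction score (for example splice-site variants or indels) are retained.

```python
from dataclasses import dataclass
from typing import Optional

# Variant classes retained in the first step (labels are illustrative).
KEPT_CONSEQUENCES = {
    "missense", "nonsense", "splice_acceptor", "splice_donor", "coding_indel",
}

MAF_CUTOFF = 0.01        # variants with MAF > 1% are treated as common
SIFT_CUTOFF = 0.05       # SIFT score <= 0.05 is flagged
POLYPHEN_CUTOFF = 0.15   # PolyPhen-2 score > 0.15 is flagged


@dataclass
class Variant:
    chrom: str
    pos: int                           # position on the chosen genome build
    consequence: str
    max_maf: float                     # highest MAF across all databases queried
    sift: Optional[float] = None       # None when no prediction is available
    polyphen2: Optional[float] = None


def predicted_damaging(v: Variant) -> bool:
    """Assumption: one flagging tool suffices; unscored variants are kept."""
    if v.sift is None and v.polyphen2 is None:
        return True
    sift_hit = v.sift is not None and v.sift <= SIFT_CUTOFF
    pp2_hit = v.polyphen2 is not None and v.polyphen2 > POLYPHEN_CUTOFF
    return sift_hit or pp2_hit


def is_candidate(v: Variant, del_chrom: str, del_start: int, del_end: int) -> bool:
    """Keep rare, potentially damaging variants lying inside the deletion."""
    inside = v.chrom == del_chrom and del_start <= v.pos <= del_end
    return (
        inside
        and v.consequence in KEPT_CONSEQUENCES
        and v.max_maf <= MAF_CUTOFF
        and predicted_damaging(v)
    )
```

Because the variant lies within a heterozygous deletion, any call that passes this filter is hemizygous in the proband, which is why zygosity is not tested separately in the sketch.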

Chromosomal deletions and SNVs were submitted to the DECIPHER (https://decipher.sanger.ac.uk; patient IDs shown in Table 1) and ClinVar (www.ncbi.nlm.nih.gov/clinvar; variation IDs: 431697–431698) databases, respectively.

Table 1 Clinical description and array-CGH findings of the patients included in this study

Detailed methods are presented as Supplementary Information.

Results and Discussion

The patients included in this study are presented in Table 1. The size of the deletions ranges from 165 kb to 4.6 Mb, with an average of 1.1 Mb. The number of genes located within the deleted segment ranges from 1 to 45, with an average of 11 genes. None of the inherited CNVs encompass an imprinted region, and parental mosaicism was excluded by FISH analysis on blood samples from the parent transmitting the CNV.

Our strategy allowed us to identify a candidate disease-causing hemizygous variant in 2/19 cases (Supplementary Table). No homozygous deletion was identified. For the negative patients, sequence analysis was extended to a 1-Mb interval flanking the deleted segment, to search for variants that may disrupt long-distance regulatory regions. No additional candidate variant was identified. Finally, we also investigated the WES data genome-wide for rare non-synonymous, indel or splicing variants. This strategy allowed us to identify variants in ID-associated genes in three patients.

In patient 14, Sanger sequencing analysis did not identify any variant in the NLGN1 gene on the remaining allele. The patient was further investigated by targeted sequencing of genes known to be involved in agenesis or dysgenesis of the corpus callosum. A previously described pathogenic variant in the SMARCA4 gene, responsible for Coffin–Siris syndrome, was identified. In patient 6, in addition to the NCOR1 variant, WES revealed a maternally inherited variant in the X-linked SOX3 gene (NM_005634.2:c.1287C>G), which has not been previously reported. Variants in this gene have been associated with growth hormone deficiency, variable degrees of additional pituitary hormone deficiencies, and intellectual disability. As the patient does not meet any of these clinical criteria, we do not consider this variant causal. Finally, we also identified a NM_005120.2:c.271C>T variant in the X-linked MED12 gene in patient 8. In this case, the contribution of this variant to the phenotype remains uncertain.

Characterization of candidate disease-causing variants

Patient 4 is a 4-year-old girl presenting with developmental delay, growth retardation and facial dysmorphism (i.e., synophrys, slightly upslanting palpebral fissures, a short nose with anteverted nostrils and a small mouth with a thin upper lip) and harboring a 9q deletion inherited from her healthy mother. Exome sequencing identified a hemizygous one-base pair deletion in the NUP214 gene (Nucleoporin 214 kDa): Chr9(GRCh37):g.134074412del (reference sequence NG_023371.1), NM_005085.3:c.5521+10del. This very rare variant, observed only once in the ExAC browser, was further confirmed by Sanger sequencing and shown to be paternally inherited (Fig. 1a–e). In DECIPHER, only one patient with multiple congenital anomalies carrying an overlapping deletion was reported (#285904). However, the deletion was more than twice as large and occurred de novo.

Fig. 1

Molecular cytogenetic and genomic findings for patient 4. a Overview of the genes included in the deleted segment, using UCSC Genome Browser (GRCh37 build). b Array-CGH profile of chromosome 9, showing the 1.9 Mb deletion in the 9q34.12q34.13 region. c FISH analysis on cultured lymphocytes of patient 4 with BAC clone RP11-738I14 (green) localized within the deleted segment at 9q34.13 and the contig 9qtel probe (control probe) (red). The white arrow shows the chromosome carrying the 9q34.12q34.13 deletion. d FISH analysis on cultured lymphocytes of the mother of patient 4, using the same FISH probes. The white arrow shows the 9q34.12q34.13 deletion. e Sanger sequencing traces showing the confirmation of the c.[5521+10del] variant (red box) in the NUP214 gene in patient 4. This single-nucleotide deletion is present in the heterozygous state in the father. f RT-qPCR on mRNA extracted from peripheral blood, showing a 90% decrease in expression level of NUP214 mRNA in patient 4 and a 50% decrease in both parents compared to controls. g Representation of the NM_005085.3 transcript of the NUP214 gene (from Ensembl). The red star shows the localization of the c.[5521+10del] variant, in intron 29

The consequence of the c.[5521+10del] variant on NUP214 expression was then assessed by quantitative reverse-transcription PCR (RT-PCR) of RNAs extracted from the patient’s leukocytes. RT-PCR was performed using primers flanking exon 28 and the exon 30/31 junction, and primers flanking exon 28 and intron 29, to test for potential exon skipping or intron retention, respectively (reference sequence for exon numbering NG_023371.1). Both tests showed no difference in the splicing of the transcript between patient and control cells, but an effect on splicing in other tissues, especially in the brain, cannot be completely ruled out. However, the expression level of the NUP214 transcript was significantly decreased, by about 90%, in our patient compared to the controls (Fig. 1f). The expression level of NUP214 was decreased by about 50% in the mother carrying the deletion and in the father carrying the single-nucleotide variant, compared to the controls. Furthermore, we performed WES on the parental DNA and carried out a trio analysis to search for de novo or compound heterozygous variants that could contribute to the phenotype. This analysis did not identify any other variant potentially affecting gene function.
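To make the relationship between quantitative PCR readouts and the reported expression decreases concrete, the sketch below computes relative expression with the comparative Ct (2^-ΔΔCt) approach. This is an assumption for illustration only: the quantification method actually used is described in the Supplementary Information, and the function name, the reference gene and the Ct values are hypothetical, not measurements from this study.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression of a target gene by the comparative Ct method.

    Each Ct of the target gene is first normalised to a reference gene
    measured in the same sample (delta Ct); the sample is then compared
    to the control (delta-delta Ct). Assumes ~100% amplification
    efficiency, i.e. a doubling of product per cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: a one-cycle delay relative to the control
# corresponds to half the control expression level.
half = relative_expression(26.0, 20.0, 25.0, 20.0)
print(half)  # 0.5
```

Under the same assumptions, a delay of about 3.3 cycles corresponds to roughly one tenth of the control level, which is the order of magnitude of a 90% decrease.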

The NUP214 gene (nucleoporin 214 kDa) encodes a 214 kDa nucleoporin that assembles with other proteins to form the nuclear pore complex. The NUP214 protein is localized on the cytoplasmic side of this complex, where it participates in cell cycle progression and in the nucleo-cytoplasmic transport of macromolecules [12]. This gene is already implicated in human disease. In some forms of acute myeloid leukemia and myelodysplastic syndromes, a t(6;9)(p23;q34) translocation gives rise to a fusion gene formed by the 3′ portion of the NUP214 gene and the 5′ portion of the DEK gene, located on chromosome 6 [13]. According to the ExAC (Exome Aggregation Consortium) database and the study by Lek et al. [14], this gene is predicted to be extremely loss-of-function intolerant (pLI = 0.99). Finally, murine models have shown that Nup214 knockout mice are not viable, demonstrating the essential role of this gene in the early stages of embryonic development [15]. While further functional studies are needed to unambiguously demonstrate the pathogenicity of this variant, we believe that our data support its role in the phenotype observed in patient 4.

Patient 6 is a 6-year-old boy presenting with moderate intellectual disability, joint hyperlaxity and thin skin, and harboring a 17p deletion inherited from his father. WES allowed us to identify a candidate variant in the NCOR1 gene (Nuclear receptor co-repressor 1): NM_006311.3:c.97C>T, Chr17(GRCh37):g.16097787G>A, p.(Arg33Cys). Sanger sequencing further confirmed the variant and demonstrated maternal inheritance (Fig. 2). To our knowledge, no similar-sized deletion has been reported in DECIPHER.

Fig. 2

Molecular cytogenetic and genomic findings for patient 6. a Overview of the genes included in the deleted segment, using UCSC Genome Browser (GRCh37 build). b Array-CGH profile of chromosome 17, showing the 1.3 Mb deletion in the 17p11.2p12 region. c FISH analysis on cultured lymphocytes of patient 6 with BAC clone RP11-692E18 (green) localized within the deleted segment at 17p12 and the contig 17qtel probe (control probe) (red). The white arrow shows the chromosome carrying the 17p11.2p12 deletion. d FISH analysis on cultured lymphocytes of the father of patient 6, using the same FISH probes. The white arrow shows the 17p11.2p12 deletion. e Sanger sequencing traces showing the confirmation of the c.[97C>T] variant (red box) in the NCOR1 gene in patient 6. This single-nucleotide variant is present in the heterozygous state in the mother. f Representation of the NM_006311.3 transcript of the NCOR1 gene (from Ensembl). The red star shows the localization of the c.[97C>T] variant

The p.(Arg33Cys) variant alters an absolutely conserved residue and was predicted to be damaging by several in silico tools (PolyPhen-2 score: 1; SIFT score: 0; disease causing according to MutationTaster). This amino acid lies in the domain of interaction with the ZBTB33 transcription factor [16]. According to the ExAC browser, this variant has been observed in 6 out of 79236 alleles, but never in a homozygous state. The consequence of the c.[97C>T] variant on NCOR1 expression was assessed by RT-PCR of RNAs extracted from leukocytes. This assay showed no difference in expression between the patient and controls (data not shown). Trio analysis with parental samples did not identify any other variant potentially affecting gene function.
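The allele counts quoted above can be turned into an expected number of homozygotes with a short calculation. The sketch below assumes Hardy-Weinberg proportions and that the allele number corresponds to half as many diploid individuals; both are simplifying assumptions added here, and the function name is hypothetical.

```python
def expected_homozygotes(allele_count: int, allele_number: int):
    """Return (allele frequency, homozygote frequency, expected homozygote count).

    Assumes Hardy-Weinberg proportions: homozygote frequency = q squared,
    and that allele_number alleles were sampled from allele_number / 2
    diploid individuals.
    """
    q = allele_count / allele_number
    homozygote_frequency = q * q
    individuals = allele_number / 2
    return q, homozygote_frequency, homozygote_frequency * individuals


# Counts reported in the text for the NCOR1 c.97C>T variant.
q, hom_freq, n_hom = expected_homozygotes(6, 79236)
print(f"allele frequency      ~ {q:.2e}")
print(f"homozygote frequency  ~ {hom_freq:.2e}")
print(f"expected homozygotes  ~ {n_hom:.2e}")
```

Under these assumptions the expected number of homozygous individuals in a dataset of this size is far below one, so the absence of homozygotes is what would be expected for a variant this rare and does not, on its own, argue for or against pathogenicity.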

Several lines of evidence support a contribution of this variant to the phenotypic variability between the patient and his father. NCOR1 is a ubiquitously expressed co-repressor, originally identified as the mediator of ligand-independent transcriptional repression by the thyroid hormone and retinoic acid receptors [17, 18]. NCOR1 mediates transcriptional repression by forming a co-repressor complex with the histone deacetylase HDAC3, transducin β-like 1 (TBL1, also known as TBL1X), TBL-related 1 (TBLR1, also known as TBL1XR1) and G-protein-pathway suppressor 2 (GPS2). This complex binds to promoters through interaction with transcription factors, leading to the deacetylation of local histones by HDAC3 and thus to repression of gene transcription [16, 19]. NCOR1 plays a major role in neural development, controlling lineage progression and differentiation programs in neural progenitors [20, 21, 22]. Consistent with the key role of this gene during development, mice in which the Ncor1 gene is inactivated die at an early embryonic stage [23]. Repression by the NCOR complex is also important for adult neurogenesis [23]. Finally, according to the ExAC database and the study by Lek et al. [14], NCOR1 is predicted to be extremely loss-of-function intolerant (pLI = 1). Based on these observations, we propose that the p.(Arg33Cys) variant may affect either the stability of the NCOR1 protein or its ability to bind to other components of the co-repressor complex. Hemizygosity for this variant might severely impair the function of the complex and affect neural development. Further functional studies are necessary to confirm this hypothesis and the pathogenicity of this SNV.

In conclusion, in our study, the unmasking of a recessive variant by a deletion may explain the phenotypic differences in 2/19 CNVs with variable expressivity. Therefore, analysis of the genes located on the allele contralateral to the deletion is an important point to consider in the etiological investigation of a patient with a neurodevelopmental disorder who harbors a deletion inherited from a healthy parent. Finally, our data also suggest that investigating patients with inherited CNVs might be a useful approach to identify new autosomal recessive ID genes. Different computational approaches have recently been developed to detect CNVs from next-generation sequencing data, supporting the clinical implementation of whole-genome sequencing as a primary test in the future.