Introduction

The majority of mutations in Mendelian disorders are detected by DNA sequencing. However, in recent years the significance and frequency of pathogenic intragenic deletions or duplications (copy number mutations) have become increasingly evident. Quantitative methods are now used for molecular diagnosis of a limited number of disorders.1 In a variety of genetic disorders, including Rett syndrome, Smith–Magenis syndrome, and Prader–Willi syndrome, routine testing includes copy number analysis by multiplex ligation-dependent probe amplification (MLPA) assay, quantitative PCR (qPCR), or fluorescence in situ hybridization. The same methods have also been used to identify larger chromosomal rearrangements, including those affecting subtelomeric regions.2,3 Array comparative genomic hybridization (CGH) has proven to be a powerful tool for copy number analysis but has been used mostly for cytogenetic analysis to detect large genomic deletions and duplications that extend hundreds of kilobases to megabases.4 Some recent reports have shown that high-resolution array CGH with probes densely distributed across individual genes can detect small deletions or duplications, but this approach has not been widely applied in a clinical setting.5,6 We constructed a custom high-density oligonucleotide array with probes targeted to the individual exons of 589 genes associated with known genetic disorders to identify whole and partial gene deletions or duplications. Our results from testing 3,018 patients with this array demonstrate that exon array CGH complements DNA sequencing and increases the mutation detection rate in the molecular diagnosis of autosomal and X-linked Mendelian disorders.

Materials and Methods

Array design

The 3,018 clinical cases tested by exon array CGH were analyzed on one of the two array versions. On the first array version, 1,584 cases were tested; 1,434 cases were tested on the second. The first version included two or more probes in most exons of 465 targeted genes and three probes in each intron, regardless of size. Coverage of some small exons included probes in the intronic sequence immediately flanking the exon, based on the premise that any deletion/duplication of that exon would probably extend into the intron. Genes with unprocessed pseudogenes were not included in the target list. The second version of the exon array included seven probes across each exon and the flanking 250-bp intronic sequence on either side. The array was designed to include probe coverage of 589 genes, including all 465 genes from the first array version. Intronic regions other than the 250-bp exonic flanks were not covered, and deletions or duplications in those regions would not be detected. Genes with processed pseudogenes (e.g., PTEN) were furnished with extra probe coverage within the 250-bp exon-flanking intronic sequences. Untranslated regions of the exons were also covered if part of those exons included coding sequences. The average gene size and the exon size in this collection of 589 genes were 104 kb and 741 bp, respectively. Each array version was validated with 60–80 DNA samples that previously showed either normal copy number or a known MLPA- or qPCR-confirmed deletion or duplication in one of a variety of genes (data not shown).

Following validation of the exon array, prospective studies were performed on blood samples from patients referred to our clinical laboratory for deletion/duplication testing for various Mendelian disorders. For autosomal recessive disorders, deletion/duplication testing was performed if DNA sequencing identified only a single mutation. For individuals with an autosomal dominant disorder or for female carriers of an X-linked disorder (affected or unaffected), deletion/duplication testing was offered either as a second test if gene sequencing was negative or in combination with the sequencing test when that disorder had a known high frequency of deletions.

Array hybridization and data analysis

DNA was extracted from blood samples on the QiaCube (Qiagen, Valencia, CA) automated system. Labeling was carried out with the Enzo CGH labeling kit for oligo arrays (Enzo Life Sciences, Plymouth Meeting, PA). Array CGH was performed with 0.5 μg of DNA according to the manufacturer’s protocol (Agilent Technologies, Santa Clara, CA). The data for each patient were examined only for the specific gene (or gene panel) requested. Data were analyzed using the ADM-1 algorithm in DNA Analytics/Genomic Workbench software. Findings were considered reportable when two or more adjacent probes showed log2 ratio deviations >0.25.
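The reporting criterion described above (a log2 ratio deviation beyond 0.25 in at least two adjacent probes) can be illustrated with a short Python sketch. This is not the ADM-1 algorithm, which is a statistical segmentation method; it only shows the threshold-and-adjacency filter. The function name `call_segments`, the input layout (one log2 ratio per probe, in genomic order), and the assumption that the deviation is measured symmetrically for gains and losses are all illustrative choices, not details from the laboratory's pipeline.

```python
from typing import List, Tuple

def call_segments(log2_ratios: List[float],
                  threshold: float = 0.25,
                  min_probes: int = 2) -> List[Tuple[int, int, str]]:
    """Return (first_index, last_index, kind) for each run of adjacent probes
    whose log2 ratio deviates beyond the threshold in the same direction."""
    calls = []
    run_start = None
    run_kind = None
    # A trailing 0.0 acts as a sentinel so that the final run is closed.
    for i, value in enumerate(list(log2_ratios) + [0.0]):
        if value > threshold:
            kind = "gain"
        elif value < -threshold:
            kind = "loss"
        else:
            kind = None
        if kind != run_kind:
            # The previous run ends at probe i - 1; keep it if long enough.
            if run_kind is not None and i - run_start >= min_probes:
                calls.append((run_start, i - 1, run_kind))
            run_start, run_kind = i, kind
    return calls

# Hypothetical probe values: three adjacent probes show a loss,
# whereas the single deviating probe at index 5 is not reportable.
example = [0.02, -0.9, -1.0, -0.95, 0.05, 0.4, 0.01]
print(call_segments(example))
```

In this sketch a single outlying probe never produces a call, which mirrors the requirement for two or more adjacent probes.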

Confirmation of array findings

Probe deviations including two or more adjacent probes were confirmed by qPCR with custom-designed primers, MLPA or, in one case, whole-genome array CGH. MLPA was performed according to the manufacturer’s protocols (MRC-Holland, Amsterdam, The Netherlands). The following SALSA MLPA kits were validated for clinical use: P067 PTCH1, P187 Holoprosencephaly HPE, P225-B1 PTEN, P101 STK11, P219-B1 PAX6, P313 CREBBP, P215 EXT, P180 Limb Malformations-2, and P015 MECP2. qPCR analysis was performed using a TaqMan assay based on the Human Universal Probe Library set from Roche Applied Science (Roche, Indianapolis, IN; https://www.roche-applied-science.com/sis/rtpcr/upl/index.jsp?id=UP030000). Custom primers were designed using the online Universal Probe Library assay design center (https://www.roche-applied-science.com). Seventy nanograms each of patient and control DNA were added to separate reaction mixtures containing a TaqMan probe, FastStart TaqMan mastermix (Roche), and locus-specific primers. The amplification was carried out at 60°C annealing temperature for 45 cycles on a Stratagene Mx3000P/3005P machine (Agilent Technologies). Three normal genomes were tested in triplicate along with the patient sample. In a valid positive assay, the Ct for the clinical sample showed a difference of at least one PCR cycle relative to the normal sample, the triplicates deviated by <0.30 Ct, and all negative controls showed normal copy number for the locus tested. The assay used an internal normalizing target (SOD1 gene in 21q22.11) to ensure that equal amounts of DNA were used in all tested samples. The difference in Ct values between the clinical sample and one normal control sample was expressed as a fold change in the copy number of the target gene.
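As a rough illustration of how a Ct difference translates into a copy number fold change, the Python sketch below applies the widely used 2^-ΔΔCt calculation with normalization to a reference locus. The text does not state the exact formula used, so this calculation, the assumption of about 100% amplification efficiency, the reading of the triplicate criterion as a maximum spread, and the names `fold_change` and `triplicate_ok` are all assumptions made for illustration.

```python
from typing import Sequence

def fold_change(ct_target_patient: float, ct_ref_patient: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative copy number of the target locus (patient versus control),
    normalized to a reference locus, assuming one Ct equals a twofold change."""
    delta_patient = ct_target_patient - ct_ref_patient
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_patient - delta_control)

def triplicate_ok(cts: Sequence[float], max_spread: float = 0.30) -> bool:
    """Assumed reading of the replicate criterion: the spread between the
    highest and lowest Ct of a triplicate must stay below max_spread."""
    return max(cts) - min(cts) < max_spread

# Hypothetical Ct values: the patient's target amplifies one cycle later than
# the control's after normalization, as expected for a heterozygous deletion.
print(fold_change(26.0, 24.0, 25.0, 24.0))   # one copy relative to two
print(triplicate_ok([25.0, 25.1, 25.2]))
```

Under these assumptions a heterozygous deletion gives a fold change near 0.5, which corresponds to the one-cycle difference required for a valid positive assay.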

Results

Clinical testing of 3,018 cases

Exon array CGH was used to examine 3,018 patients for deletions or duplications in 219 genes. More than one gene was analyzed in 307 individuals affected with a genetically heterogeneous disorder. Therefore, a total of 4,354 genes were analyzed in the 3,018 individuals. Exon array CGH identified 98 partial or whole-gene deletions and two duplications, corresponding to a detection rate of 3.3% in the individuals tested (Figure 1). qPCR, MLPA, or whole-genome array CGH confirmed copy number mutations detected by exon array CGH. Table 1 and Supplementary Table S5 online list all the copy number mutations identified in our cohort, and Figure 2 shows selected examples of partial gene deletions or duplications detected by exon array CGH. Copy number mutations were identified in 53 of the 219 genes tested. No deletions or duplications were found in the remaining 166 genes, although most of these were evaluated in 10 cases or fewer. Forty percent of the 3,018 cases were referred for PTEN testing. The overall detection rate for copy number mutations by exon array CGH for Mendelian disorders in our cohort excluding PTEN cases was 5.3% (95/1,787; Figure 1).
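The detection rates quoted above follow directly from the reported counts, as the small Python sketch below shows. The helper name `percent` is hypothetical; the counts are those given in the text.

```python
def percent(positives: int, tested: int) -> float:
    """Detection rate as a percentage, rounded to one decimal place."""
    return round(100.0 * positives / tested, 1)

total_positive = 98 + 2           # deletions plus duplications
total_tested = 3018
non_pten_tested = 1787            # cases remaining after excluding PTEN referrals

print(percent(total_positive, total_tested))   # overall rate
print(percent(95, non_pten_tested))            # rate excluding PTEN cases
print(total_tested - non_pten_tested)          # number of PTEN referrals
```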

Figure 1

Percentage of exon array CGH results classified according to mode of inheritance in 100 positive cases. Results are shown with and without the PTEN cases included. AD, autosomal dominant; all, all cases tested by exon array CGH; AR, autosomal recessive; CGH, comparative genomic hybridization; XL, X-linked.

Table 1 Copy number mutations identified by exon array CGH in 3,018 patients
Figure 2

Examples of exon array comparative genomic hybridization data from four separate genes. Arrows mark the probes that identified the deletion or duplication. (a) A heterozygous deletion of exon 5 in NSD1 in a patient with Sotos syndrome. (b) A partial gene duplication including exons 3–4 in MECP2 in a female patient with Rett syndrome. (c) A heterozygous deletion of exon 31 in VPS13B (COH1) in a patient with Cohen syndrome; this patient also had a sequence change on the other allele. (d) A partial gene duplication including exons 1–5 in PTEN in a patient with Cowden syndrome.

Autosomal dominant disorders

Patients tested for autosomal dominant disorders, most of which were caused by haploinsufficiency, constituted a group of 2,567 individuals. Almost half of these individuals (n = 1,231) were specifically referred for PTEN testing. The remaining 1,336 individuals diagnosed with other autosomal dominant disorders were tested for copy number mutations in one or more of a variety of genes. For 13 of those genes, array testing was performed concurrently with sequencing, while other genes were tested after a mutation was excluded by sequencing. Sixty-nine of the 1,336 individuals carried a gene deletion, yielding a positive rate of 5.2% for this group of dominant disorders ( Figure 1 ). The majority of individuals who tested positive had partial gene deletions involving one or more exons; whole-gene deletions accounted for 34% of all mutations observed.

The frequency of whole or partial deletions and duplications in patients sent for PTEN analysis with test indications of Bannayan–Riley–Ruvalcaba syndrome (BRRS), Cowden syndrome (CS), macrocephaly, and/or autism spectrum disorder was 0.5%, which is lower than the previously published rate seen in smaller cohorts of patients with normal results by sequencing.7,8 Whole-gene deletions in PTEN were observed in three patients with BRRS, CS, or Sotos syndrome–like overgrowth. One patient with developmental delay and multiple lipomas had a partial gene deletion including exons 3–9, and two patients with suspected CS showed mosaicism for a partial deletion of exons 6–9 and a partial duplication of PTEN involving exons 1–5, respectively ( Figure 2d ). No copy number mutations were observed in individuals referred for macrocephaly/microcephaly associated with autism spectrum disorders.

In agreement with published literature, a high rate of pathogenic gene deletions was identified in several autosomal dominant disorders in our data set (see Supplementary Table S1 online). For Peutz–Jeghers syndrome, 10 out of 31 patients (32%) with negative sequencing results carried a deletion detected by exon array CGH in the STK11 gene. This is comparable to previously published reports of 16–30% incidence of exonic deletions in STK11.9 Two whole-gene and eight partial STK11 deletions were observed, including four cases with an apparently recurrent deletion of exon 1. The positive rate of 9.3% for CREBBP intragenic deletions in 75 patients with Rubinstein–Taybi syndrome and of 7.3% for EXT1 or EXT2 deletions in 55 patients with multiple exostoses falls in the range reported by others.10 Finally, consistent with previous reports, 1 out of 13 patients (8%) showed a deletion in the MYCN gene in individuals with Feingold syndrome, and 1 out of 11 patients (9%) had a deletion in the SOX9 gene associated with campomelic dysplasia.11

We also identified seven deletions in the PTCH1 gene, including two mosaic mutations, in 128 patients diagnosed with Gorlin syndrome (basal cell nevus syndrome). The incidence of PTCH1 deletions was, therefore, over 5%, and most of these were private mutations. This represents the largest number of patients tested concurrently for sequencing and copy number mutations in PTCH1.11,12

For some disorders the positive rate in our data set was lower than that published elsewhere (see Supplementary Table S1 online). For example, only 3 out of 116 patients (2.6%) with von Hippel–Lindau syndrome had a deletion in the VHL gene, while the published deletion rate is up to 30%.13 This discrepancy is probably attributable to the fact that molecular testing is requested after a gene deletion has been excluded by fluorescence in situ hybridization. Likewise, only 3 out of 204 (1.5%) patients with Alagille syndrome tested by sequencing and exon array CGH had a gene deletion in our cohort, whereas fluorescence in situ hybridization analysis elsewhere has revealed gross deletions, including JAG1, in 5.7% of patients.14 Other disorders with lower than expected deletion rate in our data set were hereditary angioedema (SERPING1: 1/27 or 3.7%), branchiootorenal syndrome (EYA1: 2/41 or 4.9%), and Duane radial ray syndrome (SALL4: 1/58 or 1.7%; Table 1 ).15,16

We also identified unique deletions in several other dominant disorders after DNA sequencing failed to reveal a pathogenic mutation ( Table 1 ). Testing for developmental eye disorders revealed four exonic deletions involving the SOX2 and OTX2 genes in 26 (15.4%) individuals with anophthalmia or microphthalmia and two FOXC1 deletions in 12 (17%) patients with Axenfeld–Rieger syndrome. Likewise, 7 out of 36 patients (19%) with a clinical indication of aniridia had a deletion in the PAX6 gene and previous reports show a wide range for the incidence of PAX6 deletions.17,18 Notably, more than half of the PAX6 deletions that we found extended into the neighboring ELP4 gene but did not include the WT1 gene. Deletions were also detected in individuals with Birt–Hogg–Dube syndrome (FLCN: 1/15 or 6.7%), Holt–Oram syndrome (TBX5: 1/21 or 4.5%), long QT syndrome (KCNQ1: 2/71 or 2.8%), and multiple endocrine neoplasia (MEN1: 1/43 or 2.3%; Table 1 ). Pathogenic deletions in the disorders above are rare and deletion testing is not routinely performed. This applies to many other disorders as well. For instance, although only a small number (one to three) of individuals were referred for testing, we identified novel pathogenic intragenic deletions in IRF6 (Van der Woude syndrome), NSD1 (Sotos syndrome), SHANK3 (22q13.33 subtelomeric deletion syndrome), TWIST1 (Saethre–Chotzen syndrome), and TP63 (ectrodactyly ectodermal dysplasia and cleft lip/palate syndrome 3; Table 1 ).

Autosomal recessive disorders

We tested 138 individuals by exon array CGH analysis for autosomal recessive disorders and identified 14 deletions of one or more exons (see Supplementary Table S2 online). No duplications were identified. Fifty-four cases were referred with an indication of an inborn error of metabolism (IEM). Seven of these cases were positive for a deletion and represented a positive rate of 13.0%, compared with 8.3% for other indications (see Supplementary Table S3 online).

Most of the 138 cases had sequencing prior to or concurrently with deletion/duplication testing. Of 93 individuals with a single pathogenic mutation identified by sequencing, 10 were also found to have a whole or partial gene deletion, which corresponded to a positive rate of 10.8% (see Supplementary Table S3 online). In 45 IEM cases with one mutation detected by sequencing, we identified seven deletions (15.6%). Of the 48 cases referred for other indications, only three were positive (6.3%). Forty-five individuals who had no prior sequencing or had negative sequencing results underwent deletion/duplication testing, and four (8.9%) were homozygous for intragenic deletions.

X-linked disorders

Exon array CGH was performed in 313 individuals for deletions or duplications in X-linked genes. Eleven patients showed a pathogenic copy number mutation, representing a positive rate of 3.5% (see Supplementary Table S4 online and Figure 1 ). One patient had partial gene duplication and the remainder carried a whole or partial gene deletion. Eight of the 11 individuals with a positive exon array CGH result were females diagnosed with one of several conditions (hypohidrotic ectodermal dysplasia, focal dermal hypoplasia, juvenile retinoschisis, Rett syndrome, Coffin–Lowry syndrome, and ornithine transcarbamylase deficiency). The three remaining deletions were found in males with agammaglobulinemia, dilated cardiomyopathy, or Norrie disease.

Detection of mosaicism

Array CGH has limited sensitivity to detect mosaic genomic changes. Mosaicism for ≥25% mutant allele is reliably detectable using whole-genome cytogenetic arrays.4 Using the exon array, we identified five mosaic mutations in four genes: one each in PTEN, STK11, and PAX6 and two in PTCH1 ( Table 1 and see Supplementary Figure S1 online). In each case, the patients with the mosaic deletions were clinically affected with the disorders associated with those specific genes. MLPA was used to confirm the mosaic deletion in all five instances, although the level of mosaicism could not be accurately determined.
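To give a sense of why mosaic changes are hard to detect, the following Python sketch computes the theoretical log2 ratio for a mosaic single-copy change at a diploid locus. It is an idealized model, not a description of the laboratory's method: it assumes that `fraction` is the proportion of cells carrying the change and that the array responds linearly to copy number, which real hybridization data typically do not achieve. The function name `expected_log2_ratio` is hypothetical.

```python
import math

def expected_log2_ratio(fraction: float, kind: str = "loss") -> float:
    """Idealized log2 ratio for a single-copy loss or gain present in the
    given fraction of cells, relative to a normal two-copy reference."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if kind == "loss":
        mean_copies = 2.0 - fraction
    elif kind == "gain":
        mean_copies = 2.0 + fraction
    else:
        raise ValueError("kind must be 'loss' or 'gain'")
    return math.log2(mean_copies / 2.0)

# Theoretical signal for a heterozygous deletion at several mosaic levels.
for f in (1.0, 0.5, 0.25):
    print(f, round(expected_log2_ratio(f), 3))
```

In this model the expected signal shrinks quickly as the mosaic fraction falls, so low-level mosaic changes approach the noise of individual probes.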

Size of intragenic copy number changes

The minimum size of the detected exonic copy number mutations ranged from ~169 bp to 317 kb ( Figure 3a ). Forty-one percent of the deletions and duplications were <5 kb, and 42% included only one or two exons ( Figure 3b,c ). A deletion as small as 169 bp was detected by four 60-mer probes that had significant sequence overlap. The larger deletions and duplications included complete genes. For all positive cases, the minimum size of a deletion/duplication was determined by the sequence positions of the probes that showed abnormal copy number. The break points are located at an undetermined site within regions flanking neighboring probes that showed normal copy number. Because our targeted exon-centered array did not contain a whole-genome backbone, the size of deletions/duplications extending beyond the 5′- and 3′-ends of genes could not be determined.
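The size estimate described above can be sketched in Python as follows. The minimum size spans the abnormal probes, while the nearest flanking normal probes bound the largest possible event. The probe coordinates are hypothetical (half-open intervals on a single chromosome), the function name `size_bounds` is illustrative, and a single contiguous run of abnormal probes is assumed.

```python
from typing import List, Optional, Tuple

Probe = Tuple[int, int, bool]  # (start, end, shows_abnormal_copy_number)

def size_bounds(probes: List[Probe]) -> Tuple[int, Optional[int]]:
    """Return (minimum_size, maximum_size) in base pairs for one event.
    Probes must be sorted by start. maximum_size is None when the abnormal
    run reaches the first or last probe, so no flanking normal probe exists."""
    abnormal = [i for i, probe in enumerate(probes) if probe[2]]
    if not abnormal:
        raise ValueError("no abnormal probes")
    first, last = abnormal[0], abnormal[-1]
    # Minimum size: from the start of the first abnormal probe
    # to the end of the last abnormal probe.
    min_size = probes[last][1] - probes[first][0]
    if first == 0 or last == len(probes) - 1:
        return min_size, None
    # Maximum size: the gap between the flanking normal probes.
    max_size = probes[last + 1][0] - probes[first - 1][1]
    return min_size, max_size

# Four overlapping 60-mer probes (hypothetical positions) spanning 169 bp,
# flanked by normal probes on either side.
probes = [(500, 560, False), (1000, 1060, True), (1036, 1096, True),
          (1072, 1132, True), (1109, 1169, True), (1600, 1660, False)]
print(size_bounds(probes))
```

When the abnormal run extends to the outermost probe of a gene, the sketch returns no upper bound, which corresponds to the situation described for events extending beyond the 5′- and 3′-ends of genes on an array without a whole-genome backbone.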

Figure 3

Minimum size distribution of deletions and duplications identified by exon array comparative genomic hybridization. (a) Sizes of all deletions and duplications detected in 100 positive cases. The smallest deletion was 169 bp and the largest deletion was a whole-gene deletion of 317 kb. (b) Size of deletions and duplications by ranges. Forty-one percent of mutations were below 5 kb. (c) Deletions and duplications separated by number of exons affected. Note the high proportion of single- and dual-exon deletions.

Discussion

In this study, we evaluated 3,018 patients with a suspected Mendelian disorder for intragenic deletions/duplications by exon-level array CGH in one or more of 219 genes and found 100 mutations (3.3%). Excluding the large number of cases tested for a single gene (PTEN) with a low frequency of deletions, the rate of deletions/duplications in the remaining genes was 5.3%. Whole-gene deletions accounted for 34% of mutations. Most of the remaining partial copy number mutations were private. An important observation from our data is that 41% of the exonic copy number mutations were <5 kb and/or involved only one or two exons. There are no previous reports describing a large number of copy number mutations and showing the distribution based on size or the number of exons affected. This is the first report of exon-level array CGH testing of a very large clinical cohort. Even with our exon-targeted array design that does not interrogate promoters or introns, our data demonstrate that intragenic copy number mutations are more prevalent than perhaps previously suspected in Mendelian disorders as a whole, and should be routinely considered in the diagnostic workup of these disorders. Exon array CGH is a robust, high-resolution copy number assay that fits this purpose.

Ninety-eight percent of mutations we identified were deletions. A variety of other studies have also reported a higher proportion of deletions among copy number mutations within single genes.19,20 Several explanations may account for this observation. For example, carrying a complete extra copy of a gene is often not pathogenic.21 Another potential explanation is that duplications including the first or last exon of a gene may leave the final gene structure sufficiently preserved for normal transcription. For these reasons, individuals with deletions are more likely to manifest a phenotype and seek medical care. Finally, as with large intrachromosomal rearrangements, the mechanism by which intragenic copy number mutations occur may favor the formation of deletions over duplications. Because most of the mutations in our data appear to be nonrecurrent, the mechanism for their formation is probably not based on homologous recombination but rather on one of a variety of other mechanisms.22

We observed a high frequency of pathogenic intragenic copy number mutations in autosomal recessive disorders when a single heterozygous mutation was identified by sequencing (10.8%). An even higher rate was observed for IEMs, with a 15.6% detection rate in cases in which a single mutation was identified by sequencing. This high rate of deletions detected for IEMs is not surprising because molecular diagnosis is usually performed after positive biochemical results indicate a likely candidate gene. The identification of both disease alleles is important to confirm a diagnosis, for family counseling, and for carrier and prenatal genetic testing.

To our knowledge, this is the first report of an intragenic exon-level deletion in each of the DHCR7,23 HADHB, and MOCS2 genes. There have been previous reports of exon-level deletions in other genes reported here, including GAA, PAH, PCCA, and DCLRE1C. A large deletion of exon 18 has been identified in 5–8% of GAA alleles in patients with both infantile and adult-onset Pompe disease from diverse ethnic backgrounds.24 This mutation occurs at even higher frequency in the Dutch population and has been proposed to be a founder mutation in this population.25 Large deletions in the PAH gene, including the deletion of exon 5 reported here (PAH consortium database; http://www.pahdb.mcgill.ca), have been found in ~0.5–0.8% of phenylketonuria patients.26 In the PCCA gene, large deletions of one or more exons have been reported in association with propionic acidemia.27 In one study of 66 patients with propionic acidemia, exonic deletions were identified on 21.2% of alleles of patients with no mutation identified by sequencing on one or both alleles.28 For the DCLRE1C gene, in two studies the most frequent mutations were gross deletions of exons 1–3 or exons 1–4 due to homologous recombination between the functional gene copy and a pseudogene located 61.2 kb upstream. Such deletions in DCLRE1C were reported in 59% of patients with mutations in one study and in all six patients with mutations identified by sequencing in a second study of Saudi Arabian patients.29,30

Unlike in recessive disorders, intragenic copy number mutations in dominant disorders are more frequently recognized because a single mutation is sufficient to cause the phenotype. Translocations and intrachromosomal deletions have helped identify many causative genes in dominant disorders, e.g., PAX6 in WAGR syndrome.31 Deletions are common in some disorders, such as aniridia, von Hippel–Lindau syndrome, or Peutz–Jeghers syndrome. We found a large number of copy number mutations in dominant disorders in our study, consistent with previous reports. This underscores the utility of exon-level array CGH for a broad range of disorders caused by haploinsufficiency (Table 1), including hereditary cancer predisposition syndromes such as von Hippel–Lindau syndrome, Peutz–Jeghers syndrome, multiple hereditary exostoses, and Gorlin syndrome.

For some disorders, the frequency of copy number mutations in our data deviated from previously reported accounts. One explanation is that patients referred for testing do not always meet stringent clinical diagnostic criteria for the suspected disorder and may have another genetic disorder (phenotypic heterogeneity) or have mutations in another gene (genetic heterogeneity). In addition, fluorescence in situ hybridization may have been used to exclude large deletions in disorders in which such deletions are relatively frequent (e.g., disorders associated with VHL and PAX6/WT1 mutations). Finally, because the number of patients studied for some of these disorders in our study or elsewhere was small, the observed frequency of copy number mutations in those genes may not be a true representation of the actual deletion/duplication rate.

An example of a discrepancy between published mutation rates and our data is deletions/duplications involving PTEN. Mutations in this gene are associated with multiple disorders, including CS, BRRS, Proteus syndrome, Proteus-like syndrome, and macrocephaly and autism spectrum disorders.32,33 The published mutation rates range from 0.26% of all cases to 13% of sequencing-negative cases.34,35 In our large cohort of 1,231 individuals, exonic copy number mutations of PTEN were found in 0.5% of individuals tested. Individuals with a whole or partial gene deletion or a partial gene duplication had CS, BRRS, and general overgrowth phenotypes, and none had a diagnosis of macrocephaly or autism spectrum disorder. One previous study identified one whole-gene and two partial PTEN deletions by MLPA in a carefully selected cohort of 122 CS or BRRS patients who were negative by gene sequencing, reflecting an overall deletion rate of 2.5%.8 However, all patients with a PTEN deletion had a BRRS phenotype (3/27 or 11%). Another study examined 30 sequencing-negative patients with CS and found four deletion mutations involving exon 1 (13%).35 By contrast, a large multicenter prospective study of 3,042 individuals with relaxed CS clinical criteria revealed large exonic deletions of PTEN in only 8 (0.26%).34 These deletions accounted for <3% of all types of mutations identified. Results of this large study concur with our data (0.5% frequency) in an unselected cohort referred for PTEN testing. Our results support the current understanding that PTEN deletions are more common in BRRS and perhaps CS, but are not a general cause of autism spectrum disorders.36,37

We found several copy number mutations in X-linked genes that illustrate two advantages of using exon array CGH. First, it readily identified female carriers of a whole or partial gene deletion on the X chromosome. In contrast to males, in whom hemizygous deletions can be readily recognized by failure of PCR amplification, heterozygous gene deletions in females are not detectable by Sanger sequencing. Deletions are quite frequent in several X-linked disorders, including ectodermal dysplasia (~10%), OTC deficiency (~8–15%), and Rett syndrome (~10%).11 The second advantage of the exon array is that it can also detect pathogenic intragenic duplications, in both males and females, that might otherwise go undetected by sequencing. The X chromosome contains many disease genes in which intragenic duplications have been reported, including MECP2 (Rett syndrome), RPS6KA3 (Coffin–Lowry syndrome), and NR0B1 (46,XY gonadal dysgenesis).11 We tested 102 males for X-linked disorders and found no duplications, but 1 out of 211 females was positive for a partial gene duplication in MECP2.

Our findings illustrate the utility of a robust exon array CGH method for detecting intragenic copy number mutations in Mendelian disorders. The data presented here show that exon array CGH, performed concurrently with or after sequencing, increases the sensitivity of testing for Mendelian disorders. They suggest that exon array CGH should be routinely used for autosomal recessive disorders, particularly IEMs, in which only one mutation has been identified by sequencing, and for conditions in which loss-of-function mutations or abnormal gene dosage explain the associated phenotype. Although we have targeted only 589 genes of known clinical significance and examined only a single gene or a small panel of genes in each individual, exon-level array CGH could also be designed as a screening test. For example, exon-level coverage of a few thousand genes may be appropriate for simultaneous analysis on a whole-genome cytogenetic array CGH design because many developmental disorders arise from haploinsufficiency of the associated genes. It is also now feasible and relatively inexpensive to perform whole-exome copy number analysis by array CGH to complement exome sequencing until robust algorithms to calculate copy number from next-generation sequencing data become available.

Disclosure

The authors are current or past employees of GeneDx.