Main

Clinical molecular testing is becoming more comprehensive, yet more complicated, with the burgeoning number of known disease-causing genes. Most clinical laboratories develop assays for mutation detection in disease-associated genes based on the reported mutation spectrum (Fig. 1). The mutation spectrum associated with genetic disease ranges from point mutations and small deletions and insertions, which can be detected by direct DNA sequencing, to single- and multiexon deletions and duplications. The gene mutation spectrum and comprehensive genetic testing vary between genes; for instance, >65% of Duchenne and Becker muscular dystrophy cases are due to deletions or duplications of single or multiple exons of the dystrophin gene.1–3 In contrast, >90% of cystic fibrosis cases are due to single base-pair substitutions, microinsertions, or microdeletions in the CFTR gene4,5; 80% of the familial adenomatous polyposis cases are due to point mutations, whereas only 20% are due to APC gene deletions or duplications6; and around 75% of the Sotos syndrome cases are due to point mutations in the NSD1 gene, while 25% of cases are due to deletions and duplications.7

Fig. 1

Reported mutation spectrums for various genes. Mutation spectrum varies from point mutations and small deletions and duplications to intragenic single- and multiexon deletion and duplication mutations. Approximately 90%, 80%, and 75% (respectively) of the reported mutations in the cystic fibrosis (CFTR), familial adenomatous polyposis (APC), and nuclear receptor binding SET domain (NSD1) genes are point mutations, whereas 65% of the reported mutations in the dystrophin gene are deletion and duplication mutations.

The current available techniques for detecting single- and multiexon deletions and duplications in a single gene include multiplex polymerase chain reaction (PCR), quantitative PCR, Southern blotting,8 multiplex ligation-dependent probe amplification (MLPA),9 detection of virtually all mutations-SSCP (DOVAM-S),10,11 and single condition amplification/internal primer sequencing (SCAIP).12

Clinical assays based on multiplex PCR and quantitative real-time PCR have been used traditionally to detect intragenic deletions and duplications for a small number of genes. Importantly, multiplex PCR can detect deletions but cannot detect duplications accurately or examine multiple genes simultaneously. Quantitative real-time PCR methods are more accurate than multiplex PCR, but detection of intragenic duplications is still a challenge.13,14 In addition, PCR efficiency and PCR product saturation may adversely impact the sensitivity and accuracy of real-time fluorescent PCR, whereas spectral overlap limits the use of real-time fluorescent PCR in multiplex assays.14,15 Both multiplex PCR and quantitative real-time PCR, which use a single-probe hybridization that usually targets a common deleted region within a specific gene, can be hindered by single nucleotide polymorphisms, and the boundaries of deletions and duplications cannot be determined.

Southern blotting can detect relatively large deletions and duplications within a specific gene; however, Southern blotting is labor-intensive, time- and DNA-consuming, requires hazardous reagents, and is not designed for simultaneous examination of multiple genes.16 MLPA, on the other hand, is a high-throughput technique in which multiple segments can be screened simultaneously to detect intragenic deletions and duplications; however, MLPA assay development is laborious, and occasionally gives false-positive and false-negative results, especially with diploid genes. MLPA false-negative and false-positive findings are caused largely by the presence of single nucleotide polymorphisms in the targeted regions. Single-exon dosage changes (duplications in particular) detected by MLPA require further confirmation by a second method (such as multiplex PCR or sequencing), which adds to both the complexity and cost of MLPA assays.14

DOVAM-S is a high-throughput multiplexed scanning assay that uses single-strand conformational polymorphism to detect virtually all point mutations and small deletions and duplications.10,11 Sequencing of all variant bands is required to distinguish polymorphisms from disease-causing mutations, which does decrease the efficiency of this assay. In contrast, SCAIP was developed as a rapid, accurate, and economical direct sequencing technique for exons and flanking intronic sequences of large, multiexon genes.12 Despite the improved detection of deletions and point mutations, SCAIP does not detect intragenic duplications. SCAIP can be labor-intensive, as it requires PCR amplification of all exons individually as the first step.

Recently, comparative genomic hybridization (CGH) using oligonucleotide arrays has been implemented in cytogenetic and molecular diagnostic laboratories as a robust, rapid, sensitive, and relatively inexpensive assay for detecting known and new microdeletion or microduplication syndromes,17 and targeted gene deletions.18–20 We also recently developed a two-step high-resolution approach using CGH and resequencing microarrays to comprehensively detect mutations in the dystrophin gene.21

Here we describe the development, validation, and implementation of a targeted, high-density oligonucleotide CGH microarray in a clinical laboratory. Our targeted gene CGH array permits a high-resolution analysis on a common platform of 71 genes involved mainly in lysosomal storage diseases and metabolic disorders with equal sensitivity and accuracy (Table 1). The array was developed to detect deletions and duplications for all genes sequenced by the Emory Genetics Lab (EGL) to meet our goal of offering truly comprehensive molecular testing. The sensitivity and specificity of our targeted CGH array allow accurate detection of single- and multiexon deletions and duplications in autosomal and X-linked genes in both males and females. Our data demonstrate the advantage of using a high-resolution targeted gene CGH array to complement direct gene sequence analysis, particularly when direct sequencing fails to identify two mutations in autosomal recessive cases with an established clinical or biochemical diagnosis.

Table 1 A list of genes represented on our targeted comparative genomic hybridization (CGH) array

MATERIALS AND METHODS

CGH array design

The targeted gene high-resolution oligonucleotide CGH array was custom designed on a NimbleGen (Madison, Wisconsin) 385K platform to detect deletions and duplications in 71 genes associated with various genetic disorders. Long oligonucleotides (45–60 mer) were used to achieve isothermal Tm across the array, with repeat sequence masking implemented to ensure greater sensitivity and specificity. Our targeted gene CGH array has 389,587 unique sequence probes tiled on the array; the average spacing is 2 bp within coding regions and 25 bp within intronic regions, with repeat sequence masking. Use of intronic oligonucleotide probes allows us to detect dosage changes within the entire genomic region of the gene and determine the approximate breakpoints.

CGH protocol and analysis

DNA was extracted using the Qiagen Puregene DNA extraction kit (Valencia, CA) according to the manufacturer's recommendation. Patient samples were tested against sex-matched controls. Control DNA from males and females was obtained from Promega (Pittsburgh, PA). DNA samples (2 μg) were sonicated to obtain a fragment size between 500 and 2000 bases, as verified on a 1% agarose gel. We used Klenow enzyme (NEB, Ipswich, MA) and Cy3 or Cy5 9mer wobble primers (Trilink Technologies, Nashua, NH) to label patient and control DNA samples, respectively. Labeled samples were purified by isopropanol precipitation. Thirteen micrograms of labeled DNA from each patient and sex-matched control were combined and desiccated in a vacufuge (GMI, Ramsey, MN), then resuspended in appropriate hybridization buffer mixed with Cy3 and Cy5 control CPK6 50mer oligonucleotides. Arrays were hybridized between 16 and 20 hours at 42°C in a microarray user interface (MAUI) hybridization system (BioMicro Systems, Salt Lake City, UT), then washed according to the manufacturer's recommendation and immediately scanned on a GenePix 4000 scanner (Molecular Devices, Sunnyvale, CA). Next, data were extracted from images using manufacturer-provided software (NimbleScan, NimbleGen Systems). Normalized log2 ratio data were analyzed using two different analysis programs in NimbleScan (SegMNT or DNA copy).22 Both software programs graphically display results in a bar graph, where the y-axis indicates gain or loss of material (1 = gain, 0 = normal, −1 = loss) and the x-axis indicates the position of each feature on the chromosome.
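As a rough illustration of how the log2 ratio scale relates to copy number, the following Python sketch computes the theoretical log2 ratio for a given patient and control copy number and labels an observed segment mean. The function names and the ±0.3 calling threshold are illustrative assumptions; they are not values taken from NimbleScan or from this study.

```python
import math


def expected_log2_ratio(patient_copies: int, control_copies: int = 2) -> float:
    """Theoretical log2(patient/control) signal ratio for a region."""
    if patient_copies < 0 or control_copies <= 0:
        raise ValueError("copy numbers must be non-negative (control > 0)")
    if patient_copies == 0:
        # Homozygous deletion: no patient signal in theory; real arrays
        # report a large negative value rather than minus infinity.
        return float("-inf")
    return math.log2(patient_copies / control_copies)


def classify_segment(mean_log2: float, threshold: float = 0.3) -> str:
    """Label a segment by its mean log2 ratio (threshold is an assumption)."""
    if mean_log2 >= threshold:
        return "gain"
    if mean_log2 <= -threshold:
        return "loss"
    return "normal"


if __name__ == "__main__":
    # One, two, and three patient copies against a two-copy control.
    for copies in (1, 2, 3):
        ratio = expected_log2_ratio(copies)
        print(copies, round(ratio, 3), classify_segment(ratio))
```

Under these assumptions, a heterozygous deletion (one copy against two) gives a theoretical log2 ratio of −1, and a heterozygous duplication (three copies against two) gives about 0.58, which is why duplications produce a smaller shift than deletions.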

Assessment of array quality

To assess quality on each array, we used three control resequencing oligonucleotides, which correspond to synthetic sequence designed with no cross-hybridization potential with any known sequence. This sequence was designed to have three distinct sequencing domains (A, B, and C), each with unique characteristics. The “A” domain contains long runs of G nucleotides that can be difficult to synthesize. The “B” domain contains a large perfect hairpin sequence. The “C” domain contains a straightforward domain that should always resequence. Resequencing of both the forward and the reverse strands generates six scores for the Cy3 channel and six scores for the Cy5 channel: A-forward and A-reverse, B-forward and B-reverse, C-forward and C-reverse. Failure of domain “C” indicates a catastrophic failure. A score from 0 to 100% is obtained that correlates with the overall performance of a microarray experiment.

Data masking feature

Data files (.gff) generated from different averaging windows using NimbleScan software were parsed using a custom program (Nimkit) that was developed in-house. Nimkit enables the laboratory to select and analyze only the gene of interest. Nimkit generates a gene-specific report summarizing breakpoints detected in the gene of interest, the respective log2 ratios, and the exons present at each region. All other regions are masked and not analyzed by Nimkit, preventing genetic analysis of genes for which clinical testing was not requested.
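The masking idea can be sketched in a few lines of Python. This is not the Nimkit code, which is not described in detail here; the `Segment` and `GeneRegion` types, the function name, and the coordinates used in any example are hypothetical, and the sketch assumes segments have already been parsed into chromosome, start, end, and mean log2 ratio.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One segmented region from the array (hypothetical parsed form)."""
    chrom: str
    start: int
    end: int
    mean_log2: float


class GeneRegion(NamedTuple):
    """Genomic extent of the gene for which testing was requested."""
    name: str
    chrom: str
    start: int
    end: int


def mask_to_gene(segments: List[Segment], gene: GeneRegion) -> List[Segment]:
    """Return only segments overlapping the requested gene, clipped to it.

    Segments on other chromosomes or outside the gene are dropped, so no
    data for unrequested genes reaches the report.
    """
    kept = []
    for seg in segments:
        if seg.chrom != gene.chrom:
            continue
        if seg.end < gene.start or seg.start > gene.end:
            continue
        kept.append(
            seg._replace(
                start=max(seg.start, gene.start),
                end=min(seg.end, gene.end),
            )
        )
    return kept
```

Clipping the retained segments to the gene boundaries is a design choice of this sketch: it keeps the report from revealing how far a dosage change extends into neighboring genes that were not requested.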

Patients

DNA samples from clinically and biochemically diagnosed patients referred for molecular diagnostic testing were obtained from the EGL. Deletions and duplications in 13 previously characterized samples (V1–V13) were used for validation. In addition, deletion and duplication mutations were characterized in six autosomal cases (T1–T6) and one X-linked case (T7), where direct sequencing failed to identify a mutation. Targeted CGH array findings were confirmed by sequencing across the breakpoints or on the EmArray Cyto6000 oligonucleotide array.

RESULTS

A custom CGH array was designed to detect deletions and duplications in 71 genes involved mainly in lysosomal storage and metabolic diseases (Table 1). DNA samples from 13 patients (V1–V13; Table 2) were used to validate the targeted gene CGH array. Deletions or gross genomic changes in these samples were previously characterized by direct sequencing, allele-specific PCR, fluorescence in situ hybridization, or the EmArray Cyto6000 oligonucleotide array. Table 2 summarizes the disorder, the gene involved, and previously identified mutations for each of the samples tested. There was 100% concordance between our targeted gene CGH array analysis and the previously characterized deletion or duplication. Furthermore, we detected deletions/duplications in six autosomal cases (T1–T6) and one X-linked case (T7), where direct sequencing failed to identify a mutation (Table 2). Patients V3, V7, V10, T1, and T5 are discussed in further detail below.

Table 2 A list of patients, disorders, associate genes, and mutant alleles

Patient V3

A sample from this patient was referred to the EGL for molecular testing for maple syrup urine disease (MSUD) to confirm a clinical diagnosis. MSUD is an autosomal recessive disorder caused by deficient activity of the branched-chain alpha-keto acid dehydrogenase (BCKD) enzyme complex. The BCKD complex consists of two 2-oxoisovalerate dehydrogenase subunits, called alpha (BCKDHA, E1 alpha) and beta (BCKDHB, E1 beta), and a lipoamide acyltransferase (DBT, E2).23–25 Biochemical testing demonstrated a low level of leucine decarboxylation activity. Direct sequence analysis failed to identify any mutations in the BCKDHA, BCKDHB, or DBT genes in this patient.

We were unable to PCR amplify exon 2 of the DBT gene with either of two independent sets of PCR primers. As exon 2 could not be amplified, a homozygous deletion of DBT exon 2 was assumed. Targeted gene CGH array analysis of the DBT gene detected a homozygous deletion of exon 2 in this patient. Approximate genomic breakpoints at nucleotide positions g.100,484,550 in intron 1 and g.100,477,050 in intron 2 indicate a 7500-bp deletion, which includes exon 2 (Fig. 2, C and D). Figure 2D displays a scatter plot, where the y-axis corresponds to signal intensity. A loss of genomic material is represented as a shift down of each feature, depicted on the x-axis. These data show a deletion of the corresponding genomic position, including exon 2 of the DBT gene in patient V3 (Fig. 2D).
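The deletion size quoted above follows directly from the two approximate breakpoints. A minimal Python sketch of that arithmetic is shown here; the function and constant names are illustrative, and the coordinates are the approximate DBT positions reported for patient V3.

```python
def deletion_span(breakpoint_a: int, breakpoint_b: int) -> int:
    """Distance in bp between two approximate breakpoints (order-independent)."""
    return abs(breakpoint_a - breakpoint_b)


# Approximate DBT breakpoints reported for patient V3.
DBT_INTRON1_BREAKPOINT = 100_484_550
DBT_INTRON2_BREAKPOINT = 100_477_050

if __name__ == "__main__":
    print(deletion_span(DBT_INTRON1_BREAKPOINT, DBT_INTRON2_BREAKPOINT))
```

Because the breakpoints are approximate, the result is an estimate of the deletion size rather than an exact value; the exact size would come from sequencing across the junction.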

Fig. 2

Targeted gene CGH array for detecting single- and multiexon gene deletion and duplication mutations. A, Horizontal lines represent all 11 exons of the GALT gene. B, Bar graph displays loss of genomic material that corresponds to all exons of the GALT gene, except for exon 8. C, Horizontal lines represent all 10 exons of the DBT gene. D, Bar graph displays loss of genomic material that corresponds to exon 2 of the DBT gene. E, Horizontal lines represent all 17 exons of the GALC gene. F, Bar graph displays loss of genomic material that corresponds to exon 11 through exon 17 of the GALC gene. G, Horizontal lines represent all 13 exons of the PAH gene. H, Bar graph displays loss of genomic material that corresponds to exon 5 through exon 6 of the PAH gene. I, Horizontal lines represent all 10 exons of the BCKDHB gene. J, Bar graph displays a gain of genomic material that corresponds to exon 11 through exon 14 of the BCKDHB gene. Loss and gain of genomic material are depicted as a shift down and shift up of the features on the x-axis, respectively.

Patient V7

This patient was identified via newborn screening and diagnosed with Duarte variant galactosemia. Galactosemia is an autosomal recessive disorder caused by deficiency of the galactose-1-phosphate uridylyltransferase (GALT) enzyme that results from mutations within the GALT gene.26 Galactosemia is a clinically heterogeneous disease with presentations ranging from the Duarte variant and patients who are apparently normal or have mild symptoms to classical severe galactosemia.26,27 The Duarte allele is associated with the amino acid change asparagine to aspartic acid at position 314 (N314D).26 Elsas et al.,28 using G for the classic galactosemia allele, D for the Duarte allele, and N for the normal allele, proposed that the D/N, D/D, and D/G genotypes show 75%, 50%, and 25% of normal GALT enzyme activity, respectively. Biochemical testing on individual V7 demonstrated low levels of GALT enzyme activity in red blood cells (25%), consistent with the Duarte galactosemia variant (D/G). A sample from this patient was referred to the EGL for molecular testing. Upon direct sequencing of the GALT gene, an apparent homozygous nucleotide change c.2744A>G (p.N314D) in exon 10 was identified, consistent with a D/D phenotype. In addition, direct DNA sequencing on samples from the parents identified an N314D change in only one of the parents. Coffee et al.29 have previously characterized two consecutive deletions in the GALT gene encompassing 5.0 Kb, which results in the deletion of all exons of the GALT gene, with only 117 bp retained that correspond to portions of exon 8 and intron 8 of the GALT gene. To resolve the apparent discrepancy between the biochemical (D/G) and the apparent molecular (D/D) phenotype, allele-specific PCR analysis was performed. A 5.0-Kb deletion was identified that retained 117 bp, including exon 8, within the GALT gene in this individual and in one of the parents. 
Thus, the actual molecular phenotype for this patient is D/G, consistent with the biochemical phenotype. In the process of validating the targeted gene CGH array, we analyzed this sample and confirmed the deletion of all exons, with the exception of exon 8 (110-bp) of the GALT gene, which was retained (Fig. 2, A and B). Targeted CGH array analysis detected two consecutive deletions. The first deletion extends from approximate genomic position g.34,636,633, upstream of the start site of the GALT gene, to g.34,638,808 in intron 7. The second deletion extends from genomic position g.34,638,918 in intron 8 to g.34,641,648, beyond the end of the GALT gene. The deleted regions upstream and downstream of the GALT gene do not contain any known genes.

Patient V10

A sample from this individual was referred to the EGL for molecular carrier testing for Krabbe disease. The individual's daughter was diagnosed with Krabbe disease, an autosomal recessive neurodegenerative disorder caused by deficiency of the enzyme galactocerebrosidase (GALC). Mutations within the GALC gene cause Krabbe disease.30 A 30-Kb deletion starting in the large intron 10 and extending beyond the end of the GALC gene, encompassing exon 11 to exon 17, is the most common deletion observed in Krabbe patients.31 Allele-specific PCR analysis for the 30-Kb deletion within the GALC gene detected a homozygous 30-Kb deletion in this individual's daughter.32 Additionally, direct sequence analysis failed to identify any other mutations. Allele-specific PCR analysis for the 30-Kb deletion within the GALC gene also detected a heterozygous 30-Kb deletion in this individual. Subsequently, targeted gene CGH array analysis for the GALC gene on this individual detected a deletion of 32 Kb with approximate genomic breakpoints at nucleotide position g.87,492,950 in intron 10 and g.87,460,950 extending beyond the end of the GALC gene. Similarly, a heterozygous 30-Kb deletion within the GALC gene was detected by allele-specific PCR analysis in this individual's partner and son (V11 and V12, respectively); these results were confirmed by targeted gene CGH analysis for the GALC gene (data not shown).

Patient T1

A sample from this individual was referred to our laboratory for molecular testing for phenylketonuria (PKU) because of elevated plasma phenylalanine consistent with PKU/hyperphenylalaninemia. PKU is an autosomal recessive disorder, with 97–99% of affected individuals having a deficiency of the enzyme phenylalanine hydroxylase (PAH). Mutations within the PAH gene cause PKU,33 with the remaining 1–3% of PKU cases due to defects in the formation or recycling of tetrahydrobiopterin (BH4).34 Direct DNA sequencing of the PAH gene identified one missense mutation (p.R243Q) in exon 7 of the PAH gene, but a second mutation in the PAH gene was not identified. Targeted gene CGH array analysis of the PAH gene detected an 11.7-Kb deletion encompassing most of exon 5 and all of exon 6 of the PAH gene (Fig. 2, G and H), with approximate genomic nucleotide positions g.101,772,838 in exon 5 and g.101,784,528 in intron 6 of the PAH gene. Junction fragment PCR analysis across the breakpoints indicated a deletion of 11,652 bp in size, with the 5′ breakpoint in exon 5 and the 3′ breakpoint in intron 6. Direct sequencing of the junction fragment product confirmed our targeted gene CGH array data, where most of exon 5 (with the exception of the first 11 nucleotides) and all of exon 6 are deleted in patient T1 (Fig. 3).

Fig. 3

Direct sequencing across the PAH gene break points in patient T1. Top histogram shows exon 5 control sequence (arrowhead represents 5′ break point, and the deleted sequence is represented as gray letters below the histogram). Middle histogram represents the control sequence of intron 6 (arrowhead represents 3′ break point, and the deleted sequence is represented as gray letters below the histogram). Bottom panel represents patient T1 sequence of a PCR-amplified product across the break-point junction (arrowhead represents the break-point junction).

Patient T5

A sample from this patient was referred to the EGL for molecular testing for MSUD. Previous biochemical testing on this patient indicated elevation of the branched chain amino acids leucine, isoleucine, and valine, as well as alloisoleucine in plasma, consistent with a diagnosis of MSUD. Direct sequence analysis identified one copy of a c.752T>C (p.V251A) mutation in exon 7 of the BCKDHB gene in this patient. A second mutation was not identified. As with the previous patient, the availability of the targeted gene CGH array analysis for the BCKDHB gene as a clinical molecular test enabled the identification of a second mutation in this patient. Targeted CGH analysis for the BCKDHB gene identified a duplication mutation (100-Kb) encompassing exon 7 to exon 9, with approximate genomic breakpoints at nucleotide positions g.80,955,500 in intron 6 and g.81,055,500 in intron 9 (Fig. 2, I and J).

DISCUSSION

CGH arrays have been widely used to detect gross chromosomal changes, including large deletions and duplications across the human genome; however, the application of targeted gene CGH arrays for detecting intragenic deletions or duplications is in its infancy. We designed, validated, and implemented a targeted gene CGH array analysis as a clinical molecular testing tool to simultaneously screen for deletions and duplications of single- and multiexon mutations in a large set of autosomal and X-linked disease-associated genes. In particular, we were interested in genes involved in lysosomal storage and metabolic disorders, and other genes for which the EGL offers direct DNA sequence analysis.

The targeted gene CGH array has several advantages over other available molecular methods, such as multiplex PCR, quantitative PCR, Southern blotting, MLPA, DOVAM-S, and SCAIP sequencing. These traditional methods are time-consuming, labor-intensive, complex, and fail to detect all mutations accurately. Duplication mutations in particular are very difficult to detect in both males and females in autosomal cases and in females in X-linked cases. In addition, mutations detected by the currently available techniques, especially single-exon deletions, require further costly confirmation by a different technique, which can affect the turnaround time for results. Because these more common techniques depend only on one or two probes for each targeted region, false-positive findings are known to occur. Furthermore, genomic positions of the breakpoints cannot be determined by traditional methods. In contrast, the targeted gene CGH array is a rapid, comprehensive, relatively inexpensive, highly sensitive, and accurate method for detecting single- and multiexon deletions and duplications in a large set of genes simultaneously on a common platform.

The array was designed as an exon-centric array with extremely dense spacing in the exons (2 bp) and more widely spaced oligos (25 bp) in the introns. The 385K NimbleGen array format supported this design because of its large excess of oligos. This array also offers the flexibility to add several additional genes. Adding new genes may require reducing the density of the oligos in exons, but this will not reduce the sensitivity of the array. From our prior work on development of a targeted array for the dystrophin gene,21 we have determined that it is important to have a minimum of four oligos for an exon size of 100 bp to allow for optimal detection of deletions and duplications. As gene CGH array technology has only recently been introduced in the molecular setting and is not yet a “gold” standard, our laboratory is confirming all deletions/duplications by a secondary method. Deletions can be easily confirmed by designing primers across the breakpoints, whereas duplications are relatively difficult to confirm using an alternative technology. Methods such as breakpoint PCR (taking into consideration the possibility of tandem duplication or inversion), Southern blot, real-time PCR, or confirmation on another array can be considered. In our experience, the targeted CGH array was able to detect the single-exon retention of exon 8 (110-bp) in the middle of the large GALT gene deletion and multiexon duplications in the GALC and BCKDHB genes that had not been found before.

Targeted gene CGH array analysis has a higher sensitivity than the currently available methods for detecting deletions and duplications. Our targeted CGH array is based on the use of overlapping oligonucleotides representing the coding region of each gene, thereby allowing us to oversample and lower the likelihood of false-positive findings. Targeted CGH array analysis confirmed previously characterized deletion mutations in 13 cases with a 100% concordance (Table 2). Hence, targeted gene CGH array analysis can determine approximate genomic breakpoints for all deletion and duplication mutations.

The need for comprehensive mutation detection assays with a reasonable turnaround time for results is a pressing necessity. Deletions and duplications constitute a significant portion of the mutation spectrum for some genes. The availability of comprehensive mutation detection assays that can accurately identify intragenic deletions and duplications in a single gene in a reasonable time will yield more accurate mutation detection across the mutation spectrum for disease-associated genes and will greatly improve clinical diagnosis, prenatal diagnosis, and genetic counseling. For example, intragenic deletion and duplication mutations had not been documented in MSUD patients previously. Molecular characterization of MSUD alleles has a 95% mutation detection rate with a mutation spectrum of 95% point mutations; the remaining 5% were considered uncharacterized mutations,35,36 and large and single-exon deletions and duplications had not been seen before. Here we present for the first time two cases of multiexon deletion and duplication mutations in the BCKDHB gene, identified in patient T6 and T5, respectively. Furthermore, we also identified a multiexon duplication in the GALC gene (individual T4, Table 2), which had not been documented before. Such duplication mutations are difficult or impossible to detect by other methods.

Use of a targeted gene CGH array will improve the mutation detection rate for autosomal and X-linked genes, as intragenic single- and multiexon deletions and duplications can be identified easily. An improved mutation detection rate is critical for prenatal diagnosis and genetic counseling, as the identification of disease-causing mutations is important for better genetic counseling and a must for prenatal testing. Failure to identify disease-causing mutations would make prenatal testing for at-risk families impossible. In our experience, the targeted CGH array detected intragenic single- and multiexon deletions and duplications in eight patients for whom DNA sequence analysis failed to identify a disease-causing mutation. Although the number of samples we have tested per gene remains small, the collective mutation detection rate for the genes that have been tested on our CGH array thus far is 13.3% (eight individuals out of 60). Hence, the combination of direct DNA sequence analysis with targeted gene CGH array analysis significantly improves the mutation detection rate for genes listed in Table 1 to 98%, given that a rough estimate of the mutation detection rate for most genes is 80–85%. Despite the maturity of sequencing and CGH, an estimated 1–2% of mutations located deep within intronic or promoter regions cannot be identified. The comprehensive testing strategy using the CGH array significantly increases the possibility of finding a mutation in a patient. This is a cost-effective approach that provides options for additional family members to be tested, prenatal testing and enrolment in clinical trials especially for metabolic disorders where the diagnosis has been confirmed by biochemical testing. Molecular tests can be expensive and the availability of a comprehensive testing strategy will reduce the burden for the payers and patients. 
Therefore, the targeted gene CGH array is appropriate for maximal mutation detection in a specific gene associated with a clinical or biochemical diagnosis, in contrast to the genomic CGH arrays used in cytogenetic laboratories, which test multiple loci across the genome involved in a range of phenotypes.
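The detection-rate figures in this paragraph can be reproduced with simple arithmetic. The Python sketch below is one reading of that calculation: it assumes the CGH yield adds directly to the sequencing detection rate (that is, the two methods find disjoint sets of mutations), which is a simplification; the function names are illustrative.

```python
def cgh_yield(cases_with_deldup: int, cases_tested: int) -> float:
    """Fraction of tested cases in which the array found a deletion/duplication."""
    if cases_tested <= 0:
        raise ValueError("cases_tested must be positive")
    return cases_with_deldup / cases_tested


def combined_detection_rate(sequencing_rate: float, cgh_rate: float) -> float:
    """Additive estimate capped at 1.0 (assumes disjoint mutation classes)."""
    return min(1.0, sequencing_rate + cgh_rate)


if __name__ == "__main__":
    array_rate = cgh_yield(8, 60)  # eight individuals out of 60
    print(f"CGH yield: {array_rate:.1%}")
    # Rough sequencing detection rates of 80% and 85% quoted in the text.
    for seq_rate in (0.80, 0.85):
        total = combined_detection_rate(seq_rate, array_rate)
        print(f"sequencing {seq_rate:.0%} -> combined {total:.1%}")
```

With the upper sequencing estimate of 85%, the additive estimate comes to roughly 98%; with the lower estimate of 80%, it is closer to 93%.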

Here we propose a two-step approach based on the mutation spectrum of the gene of interest, wherein the CGH array complements the gene-sequencing assay. Our approach enables the detection of virtually all mutations in a large set of genes (Fig. 4), where we first sequence the coding region of the gene of interest to identify two mutations. If direct DNA sequencing fails to identify two mutations, we will reflex the sample to targeted gene CGH array analysis for the gene of interest. These steps are flexible and interchangeable depending on the mutation spectrum for each gene.

Fig. 4

A two-step approach to detect virtually all mutations by complementing gene sequencing analysis with targeted CGH array analysis based on the mutation spectrum of genes.
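The two-step logic can be summarized as a short decision sketch in Python. The function names, the returned labels, and the 50% cutoff for choosing the first test are assumptions made for illustration; the text states only that the order of the steps depends on the mutation spectrum of each gene.

```python
def first_test(deldup_fraction: float, cutoff: float = 0.5) -> str:
    """Choose the first assay from the gene's reported mutation spectrum.

    deldup_fraction is the fraction of reported mutations that are
    deletions or duplications; the cutoff is an illustrative assumption.
    """
    if not 0.0 <= deldup_fraction <= 1.0:
        raise ValueError("deldup_fraction must be between 0 and 1")
    return "cgh_array" if deldup_fraction > cutoff else "sequencing"


def needs_reflex(mutations_found: int, mutations_expected: int = 2) -> bool:
    """True if the first assay left the genotype incomplete.

    Two mutations are expected for an autosomal recessive disorder; a
    caller could pass mutations_expected=1 for an X-linked gene in a male.
    """
    return mutations_found < mutations_expected


if __name__ == "__main__":
    # Dystrophin: about 65% deletions/duplications; APC: about 20%.
    print(first_test(0.65), first_test(0.20))
    # Patient T1: sequencing found one PAH mutation, so the sample reflexes.
    print(needs_reflex(1))
```

For a gene such as dystrophin, where most reported mutations are deletions or duplications, this sketch would run the array first; for genes dominated by point mutations, sequencing comes first and the array is the reflex test.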

To avoid testing for genes that are not requested, and to avoid providing unwanted medical information, we have developed a data masking feature that only allows the extraction of data corresponding to the gene being tested. In other words, the data for all other genes are masked and not extracted. We developed a custom program (Nimkit) to select the gene of interest before the data analysis. Nimkit analyzes data only for the gene being tested and generates a report specifying any breakpoints detected, the respective log2 ratios, and the exons of the gene of interest. As Nimkit extracts data for the entire gene of interest, deletions and duplications of the probes in the introns and noncoding exons will also be detected. Unless the identified intronic or noncoding region deletion or duplication has been previously identified and shown to affect gene function by methods such as cDNA analysis, such events will be reported as variants of unknown clinical significance. Additional family studies to determine segregation of the variant with disease may help clarify the clinical significance of such variants, but functional studies will need to be carried out to determine the effect of such intronic and noncoding region deletions or duplications.

The CGH array format allows simultaneous development of an assay for detecting deletions and duplications in a number of genes. This format is flexible to adding, removing, or masking genes on the array. The CGH array format and flexibility therefore results in significant savings for a clinical laboratory in test development costs. The major expenditure of this assay is the cost of the CGH array. The reagents are relatively cheap and several arrays can be processed at the same time and data can be generated in two days. Thus, this assay requires significantly reduced labor and has fewer overheads in comparison to the more traditional methodologies.

In summary, targeted gene CGH array analysis will significantly improve the detection of single- and multiexon deletions and duplications in autosomal and X-linked disease-associated genes, especially when the targeted CGH array is designed to complement gene-sequencing analysis. Targeted gene CGH arrays will enable comprehensive clinical molecular testing and effective genetic counseling. Finally, targeted gene CGH array analysis can be easily adopted by clinical molecular testing laboratories as a rapid, cost-effective, and highly sensitive and accurate approach for the detection of single- and multiexon deletion and duplication mutations.