INTRODUCTION

Many genetic diseases have been shown to arise from dosage imbalance of one or more developmentally important genes caused by structural rearrangements of the genome as a result of microdeletions, microduplications, translocations, and inversions. Diseases arising from such structural rearrangements have been designated genomic disorders1 and are estimated to occur at a frequency of 0.7 to 1 per 1000 live births.2 A few of the better characterized genomic disorders include Prader-Willi (PWS [MIM 176270]) and Angelman syndromes (AS [MIM 105830]) on 15q11-q13,3,4 Williams-Beuren syndrome (WBS [MIM 194050]) on 7q11.23,5 Smith-Magenis syndrome (SMS [MIM 182290])/duplication 17p11.2 on 17p11.2,6 and several rearrangements associated with 22q11 including DiGeorge and velocardiofacial syndromes (DGS/VCFS [MIM 188400])7,8 and cat eye syndrome (CES [MIM 115470]).9

Although a number of well-defined genetic syndromes arise from a specific chromosomal deletion or duplication, most multiple congenital anomalies (MCA) do not have a defined genetic cause. Chromosomal abnormalities or subtelomeric rearrangements are detected by routine karyotype in a small subset of patients with MCA and/or mental retardation (MR).10 In the patient population with MR, up to 5% may have subtelomeric abnormalities. The incidence of subtelomeric rearrangements may be as high as 13% when the patients also have dysmorphic features and/or other structural malformations.11,12 Patients with MCA are particularly likely to have a chromosomal rearrangement because a deletion (or duplication) of several contiguous genes often perturbs more than one organ system. Many such chromosomal rearrangements are not detected because of limitations of conventional methods of analysis. Routine karyotype can detect only relatively large (>5 Mb) rearrangements. Subtelomeric analysis, although submicroscopic, is still limited to specific regions of the chromosomes.

It is likely that a significant proportion of children with MR/MCA have a submicroscopic chromosomal rearrangement that is not evident by karyotype or subtelomeric analysis. Techniques designed to detect such copy number differences are evolving at a rapid pace based on the availability of genomic resources and technological advances. Comparative genomic hybridization (CGH) has been used for the detection of copy number changes in solid tumors using metaphase chromosome spreads.13 Although CGH is robust for the identification of large-scale chromosomal imbalances, it does not reliably identify genomic changes involving less than 5 to 10 Mb. Recent developments in microarray technology have allowed a shift from chromosomal to microarray-based formats for CGH (array CGH).14–16

Array CGH is now widely used in the detection of chromosomal imbalances in solid tumors,17–24 MR,25–27 and other constitutional chromosomal aberrations.28–34 Array CGH has provided a high-resolution, high-throughput technique to identify smaller rearrangements in patients with MCA that would otherwise be overlooked by standard karyotyping and fluorescence in situ hybridization (FISH).35,36 These studies have used array platforms spotted with different types of DNA probes including bacterial artificial chromosomes (BACs), cDNAs, and repeat-free, PCR fragments.

BAC-based microarrays are particularly popular and have been used successfully to detect recurrent microdeletions in patients with MR.37–39 A highly targeted BAC microarray was used by Sharp and colleagues,37 whereas the other two groups used a more genome-wide approach using a tiling BAC array38 or a 1-Mb resolution whole-genome BAC array.39 BAC microarrays are now routinely used in clinical diagnostics and have a much higher detection rate for copy number abnormalities than standard cytogenetic analysis.40,41 BAC arrays used in clinical diagnostics are highly targeted for known regions of copy number abnormalities and contain 400 to 850 BAC clones.40,41 Genome-wide BAC arrays at 1-Mb resolution25,27,39 and whole-genome tiling arrays38 have better genomic coverage than targeted arrays. Yet, because of the large size of the BACs (150–200 Kb), these arrays do not necessarily improve the resolution, as they are unable to reliably detect aberrations smaller than the BAC insert. Thus, oligonucleotide-based arrays are slowly emerging as the platform of choice for genome-wide analysis of copy number alterations because of their high throughput and high resolution.42–53

OLIGONUCLEOTIDE-BASED MICROARRAYS FOR COPY NUMBER ANALYSIS

The use of oligonucleotide arrays for high-resolution copy number analysis was first described by Lucito and colleagues42 in a methodology they called representational oligonucleotide microarray analysis (ROMA). ROMA is a form of comparative genome hybridization that uses a two-color assay to cohybridize the test genome and a reference genome to an oligonucleotide microarray. The microarrays used in ROMA contained 70-mer oligonucleotide probes that were either printed on glass slides or synthesized directly on a silica surface using laser-directed photochemistry. Further, it used a technique called representation, in which the genome complexity is simplified using PCR strategies.54 The oligonucleotide probes were chosen to be within the portion of the genome that was amplified in this lower-complexity representation. This oligonucleotide-based microarray was then used to detect amplifications and deletions, both homozygous and hemizygous, in cancer genomes.42 ROMA has also been used successfully to detect copy number variation (CNV) in normal human genomes mediated by relatively large (100 kb to 1 Mb) deletions and duplications.42,43 The utility of ROMA-based arrays in the detection of copy number alterations in patient samples was demonstrated by their use in the detection of known chromosomal abnormalities.51 High-density, ROMA-based arrays were recently used to detect de novo copy number alterations in patients with autism spectrum disorders (ASD).55 ROMA-based microarray analysis was performed on 264 families, which included families containing one or more children diagnosed with ASD and control families with no diagnoses of autism. The results suggested that spontaneous copy number changes are more frequent in patients with ASD than in unaffected individuals.55 The CNVs detected in this study ranged in size from 99 kb to 12 Mb, and the number of genes that were either deleted or duplicated ranged from 1 to 69.55

High-density synthetic oligonucleotide arrays have been commercially available for several years from Affymetrix, Inc. (Santa Clara, CA) for applications such as high-throughput monitoring of gene expression and genotypic analysis for linkage studies.56 A few of these high-density arrays had been previously applied to detecting genomic alterations at the level of loss of heterozygosity in tumor samples.57,58 More recently, Affymetrix developed a high-density genotyping platform for the genome-wide analysis of 11,555 SNPs.59 The subsequent development of algorithms that were capable of detecting copy number gains and losses from this SNP array data led to the application of the Affymetrix GeneChip Mapping 10K SNP Array to the detection of copy number alterations, mainly in tumor cells.45–48,60 Rauch and colleagues46 were the first to report the use of the 10K SNP array for molecular karyotyping of patients with MCA/MR syndromes. They successfully detected 15 previously characterized chromosomal aberrations using the 10K array in patients with known syndromes such as DiGeorge/VCFS, Angelman, Smith-Magenis, and Williams-Beuren.46 The aberrations detected in patients ranged in size from 600 Kb to 13 Mb.46

The subsequent development of the GeneChip Mapping 100K SNP Array set by Affymetrix, which contains 116,204 SNPs, provided a tool for assessing copy number alterations at a much higher resolution.61 The 100K arrays were used successfully for the detection of previously well-characterized, clinically significant cytogenetic abnormalities.62 In this study, Slater and colleagues analyzed 23 individuals, 17 of whom had known cytogenetic abnormalities including unbalanced, structural, and whole chromosome abnormalities. The 100K arrays were shown to successfully detect pathogenic amplifications and deletions ranging in size from 1.3 to 145.9 Mb.62 Further, the 100K SNP arrays detected previously undetectable, submicroscopic microdeletions in patients with MCA of unknown etiology.52 In this study, Ming and colleagues demonstrated the utility of high-density arrays in the clinical diagnosis of patients who had previously been tested for chromosomal aberrations by standard clinical tests. Novel, de novo deletions were detected in 2 of 10 patients tested, including a 1.7-Mb interstitial deletion in 1p36 and an approximately 3-Mb deletion in 3p21.52

Subsequently, a larger study in which 100 children with idiopathic MR were analyzed with the 100K array set allowed the detection of submicroscopic, de novo copy number alterations as small as 178 Kb in 11 patients.53 They also successfully detected mosaicism in a patient with mosaic trisomy 9, which was later validated by cytogenetic analysis.53 An additional benefit of the SNP arrays is the availability of genotype information on thousands of SNPs, which allows for the detection of copy-neutral chromosomal aberrations such as uniparental disomy (UPD).53,62,63 UPD can lead to disease phenotypes as a result of uniparental imprinting or acquired homozygosity of a recessive mutation. A well-characterized example of this phenomenon is maternal UPD of chromosome 15, which leads to inheritance of uniparental imprinting, leading to PWS in a subset of nondeleted patients.64 Thus, SNP-based arrays allow the detection of copy-neutral events underlying genetic disorders, which manifest in the form of MCA/MR. Further, in cases of copy-neutral UPD and copy number aberrations, the analysis of parental genotypes would allow the identification of the parent of origin and potentially the mechanism of the aberration.
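
The parent-of-origin reasoning described above can be illustrated with a minimal Python sketch. It assumes biallelic SNP genotypes coded as the strings "AA", "AB", and "BB" for a child and both parents; the function names and the encoding are hypothetical and are not taken from any vendor's analysis software. A SNP is treated as informative only when the parents are homozygous for opposite alleles, because a child who is homozygous at such a site shows no contribution from one parent.

```python
from collections import Counter


def classify_snp(child: str, mother: str, father: str) -> str:
    """Classify one SNP in a trio; genotypes are 'AA', 'AB' or 'BB'."""
    homozygous = {"AA", "BB"}
    # Only sites where the parents are homozygous for opposite alleles
    # are informative about which parent contributed which allele.
    if mother not in homozygous or father not in homozygous or mother == father:
        return "uninformative"
    if child == "AB":
        return "biparental"      # one allele from each parent
    if child == mother:
        return "maternal_only"   # no paternal allele observed
    if child == father:
        return "paternal_only"   # no maternal allele observed
    return "uninformative"       # unexpected genotype code


def summarize_region(trios):
    """Count classifications over (child, mother, father) tuples in a region."""
    return Counter(classify_snp(c, m, f) for c, m, f in trios)


# Hypothetical genotypes for four SNPs in one region.
region = [("AA", "AA", "BB"), ("BB", "BB", "AA"), ("AB", "AB", "AA"), ("AA", "AA", "BB")]
counts = summarize_region(region)
print(counts["maternal_only"], counts["paternal_only"], counts["biparental"])
```

A run of maternal-only calls across a region with normal copy number would be consistent with maternal UPD, whereas the same pattern in a region showing copy number loss would point to a deletion on the paternal chromosome. Real data would also need handling of genotyping errors and no-calls, which this sketch omits.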

The Affymetrix arrays require genome complexity reduction for improved signal-to-noise ratio, which they achieve by a PCR-based strategy called whole-genome sampling, very similar to the representation technique used by ROMA.65 The use of spotted oligonucleotide arrays for CGH without genome complexity reduction was first demonstrated by Carvalho and colleagues,66 who used oligonucleotide arrays containing 18,861 oligonucleotides (60-mers) representing 18,664 genes for the analysis of copy number changes in tumor samples.66 Total genomic DNA from the test and reference genomes was labeled using random priming in separate reactions using Cy3 and Cy5, which were then cohybridized to the oligo array.20 Further refinement of probe-design criteria, assay conditions, and analysis methods by Barrett and colleagues44 led to the commercial availability of 60-mer oligonucleotide arrays for CGH by Agilent Technologies (Palo Alto, CA). One such commercially available oligo array from Agilent with 22,500 probes was used for further optimization of copy number analyses in mouse and human tumors.67 Higher-density arrays are now available in both preconfigured and custom formats from Agilent for array CGH applications. A customized array created on the backbone of Agilent's Human Genome CGH microarray kit 44B was used to evaluate genome-wide copy number alterations, with special emphasis on subtelomeric regions, in patients with developmental delay and MR.68 This approach accurately detected 15 previously well-characterized subtelomeric aberrations and microdeletion syndromes that ranged in size from 600 Kb to 154 Mb.68 The utility of this array in clinical diagnosis was demonstrated by the successful detection of two novel, previously undetectable aberrations in patients with MR. These included a 3-Mb deletion in 14q11 and a 3.5-Mb deletion in 17q24-q25.1.68

High-resolution, oligonucleotide-based array platforms for CGH are also available commercially from NimbleGen Systems Inc. (Madison, WI) and Illumina Inc. (San Diego, CA). NimbleGen offers preconfigured genome-wide and custom tiling arrays containing long oligonucleotides (45-85 mers) for direct CGH. The use of NimbleGen tiling arrays for CGH was first described by Selzer and colleagues69 for the analysis of cancer samples using both whole-genome and custom fine-tiling arrays. They used the whole genome arrays to first identify the copy number aberrations in neuroblastoma samples and then designed custom tiling arrays to fine map the breakpoints of these aberrations.69 A chromosome 22 tiling array from NimbleGen was used to detect microdeletions and microduplications associated with DiGeorge/VCF syndromes and other constitutional, chromosome 22 copy number aberrations.70 Custom fine-tiling arrays from NimbleGen have been used successfully for the discovery of new genomic disorders in children with MCA/MR37 and to fine map microdeletion breakpoints in patients with MR71 and cognitive and behavioral abnormalities.72

Whole-genome genotyping BeadChips from Illumina offer another alternative platform for the high-resolution analysis of genome-wide copy number alterations. The Illumina BeadChip arrays, similar to Affymetrix SNP arrays, were originally designed for whole-genome genotyping.73 The development of algorithms for the extraction of copy number data from the SNP arrays allowed the use of Illumina's Human-1 and HumanHap300 Infinium BeadChips, containing 109,000 and 317,000 SNP-based probes respectively, for CGH applications.74 The Illumina BeadChip arrays accurately detected previously well-characterized copy number alterations in tumor samples and patients with MCA.74 Thus, oligonucleotide-based arrays offer an alternative to BAC-based arrays for the analysis of copy number alterations in patients with MR/MCA. Further, the high resolution and high density of probes on oligonucleotide arrays allows more precise localization of microdeletion and microduplication breakpoints.62,71,72 The development of even higher-density oligonucleotide arrays in the foreseeable future should enhance our ability to detect increasingly smaller copy number changes containing single genes or even single exons within a gene.

OLIGONUCLEOTIDE-BASED ARRAY PLATFORMS: CHOICES AND CONSIDERATIONS

Available Array Platforms

Noncommercial, oligonucleotide-based arrays have been generated by spotting long, synthetic oligonucleotides on derivatized glass slides in academic settings and used successfully in research.66,75 Thus, it is possible to generate an in-house oligonucleotide array that can be optimized for uses in CGH and expression studies.76 The more attractive option for potential users of oligonucleotide arrays is to purchase them commercially, which leads to a considerable amount of savings in time and effort. An important advantage of commercially produced, oligonucleotide-based arrays is the quality control used by the manufacturer to ensure consistently high levels of performance and reproducibility. The choice of the array platform will depend on several considerations, including array design, cost of experiment, and more importantly whether the array platform is appropriate for a given study. The four major manufacturers of oligonucleotide-based array platforms for high-resolution, genome-wide analysis of copy number alterations are Affymetrix, Agilent, Illumina, and NimbleGen. The features of these platforms are briefly described below with respect to array design and experimentation with a discussion about the potential advantages and disadvantages of each platform (Table 1).

Table 1 Commercially Available Oligonucleotide Array Platforms

The Affymetrix GeneChip arrays contain 25-mer SNP-based oligonucleotide markers or probes that are directly synthesized on the array surface using a process called photolithography77–79 [http://www.affymetrix.com/]. The currently available high-density platforms are shown in Table 1. The newly introduced SNP 5.0 array is a single array containing approximately 500,000 probes from the 500K array set plus approximately 420,000 additional probes. These additional probes are not SNP-based and were designed to cover regions that are currently under-represented in the SNP-based arrays, for a more uniform distribution across the entire genome. Further, Affymetrix plans to launch an even higher-density platform, the SNP 6.0 with approximately 1.8 million probes, in the near future. The Affymetrix arrays require 250 ng/array (500 ng total) of genomic DNA and use a PCR-based strategy called whole-genome sampling for genome complexity reduction during labeling.65 Further, in these arrays, only the test genome is labeled and hybridized to the array. The signal intensity from the probes is then computationally compared with a control set for the evaluation of copy number changes in the test sample. The control set is available from Affymetrix in the form of array data obtained from the analysis of apparently healthy HapMap individuals. Although Affymetrix provides analysis software (CNAT 4.0) for the detection of copy number alterations and loss of heterozygosity (LOH), more robust analysis tools are now available from third-party providers, both commercial and academic. The more reliable, noncommercial software packages for the analysis of Affymetrix arrays include dChipSNP80 and CNAG.81 Affymetrix arrays have been successfully used in the detection of copy number alterations in patients with MR/MCA in several studies.52,53,62

The Agilent CGH microarrays contain 60-mer oligonucleotide probes printed onto derivatized glass slides using a noncontact industrial inkjet printing process44 [http://www.agilent.com]. The currently available high-density platforms from Agilent are shown in Table 1. Further, Agilent offers the ability to design custom arrays made from oligonucleotides that can be selected from Agilent's database of more than 8 million predesigned and validated probes for array CGH. The Agilent arrays require 500 ng of genomic DNA but can also use PCR-amplified genomic DNA, which can be as low as 10 ng. This type of amplification introduces additional variation and is not recommended for clinical applications.76 The Agilent arrays use a true comparative genome hybridization that requires a two-color assay in which the test and reference genomes are labeled with different fluorophores (usually Cy3 and Cy5) and then cohybridized to the same array. The signal intensity ratios of the test sample versus the reference sample are then calculated for each probe across the entire genome. The CGH Analytics 3.4 software provided by Agilent has a user interface for visualization and analysis of copy number aberration patterns from microarray profiles. Agilent array CGH platforms have also been used successfully to detect copy number alterations in patients with MR/MCA.68
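
The per-probe ratio calculation in a two-color assay can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the intensity values are invented, the function name is hypothetical, and the median-centering step is a simple normalization chosen for the example rather than the method used by CGH Analytics or any other vendor software.

```python
import math
from statistics import median


def log2_ratios(test, reference):
    """Per-probe log2(test/reference), median-centered so that the bulk of
    the genome (assumed here to be normal copy number) sits at 0."""
    raw = [math.log2(t / r) for t, r in zip(test, reference)]
    center = median(raw)
    return [x - center for x in raw]


# Hypothetical background-corrected intensities for six adjacent probes.
test_signal = [1000, 1100, 500, 480, 1050, 990]
ref_signal = [1000, 1000, 1000, 1000, 1000, 1000]

ratios = log2_ratios(test_signal, ref_signal)
print([round(x, 2) for x in ratios])
```

In this idealized example the third and fourth probes fall close to a log2 ratio of −1, the value expected for a clean one-copy loss. Ratios observed on real arrays are usually less extreme, and calls are normally based on several adjacent probes rather than a single one.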

The Illumina BeadChip arrays are based on their BeadArray technology, in which silica beads self-assemble in microwells on silica slides. Each bead is then covered with hundreds of thousands of copies of a specific 50-mer SNP-based, oligonucleotide probe73 [http://www.illumina.com/]. The currently available high-density platforms from Illumina are shown in Table 1. Illumina has recently introduced a new platform, the HumanCNV370-Duo, specifically for the analysis of CNVs in the genome. In addition to the more than 318,000 SNP-based probes from the HumanHap300-Duo, the HumanCNV370-Duo contains probes for approximately 11,000 CNVs that have been identified in the genome. The Illumina arrays require 750 ng of genomic DNA, which is isothermally amplified, fragmented, and hybridized to the array. The assay uses a base extension step after hybridization for allele-specific extension. The products are then fluorescently stained using a dual-color approach. The signal intensity from the probes is then computationally compared with a control set for the evaluation of copy number changes in the test sample. The control data are a built-in component of the BeadStudio software provided by Illumina, which has a user interface for visualization and analysis of genotypes as well as copy number aberrations. Illumina arrays have been used successfully to detect copy number alterations in patients with microdeletions associated with MR/MCA.74

The NimbleGen CGH microarrays contain isothermal, 45- to 85-mer oligonucleotide probes that are synthesized directly on a silica surface using light-directed photochemistry69 [http://www.nimblegen.com]. The currently available high-density platforms include the human whole-genome CGH microarray with 385,000 probes with a median interprobe distance of approximately 6 Kb. In addition to the preconfigured whole-genome array, NimbleGen also provides researchers with the ability to design custom fine-tiling arrays that allow user-specified array design of up to 385,000 probes per array. NimbleGen is planning to introduce a higher-resolution genome-wide array with 2.1 million features in the near future. The NimbleGen arrays require 1 to 3 μg genomic DNA from both the test and reference samples. The genomic DNA samples are randomly fragmented into lower molecular weight species, differentially labeled with fluorescent dyes, and cohybridized to the same array. Data are extracted using NimbleGen's NimbleScan software and viewed with NimbleGen's SignalMap data browser software. NimbleGen array CGH platforms have been used successfully to detect copy number alterations in patients with MR/MCA.37,71

Factors to Consider in Array Platform Selection

These available high-density oligonucleotide array platforms can potentially provide the resolution desirable for the detection of chromosomal copy number alterations. Yet, there are several factors that determine whether any array platform has the necessary spatial resolution. These factors include the number of probes, their chromosomal distribution, and sensitivity of detection.76 Most oligonucleotide array platforms require at least five adjacent probes to be deleted or duplicated for reliable detection. Thus, even though the average distance between probes would suggest a resolution of 5 to 6 Kb, the expected detection sensitivity is dependent on how many adjacent probes are present within a given fragment and the distance between them. Thus, the expected resolution of the currently available high-density arrays is somewhere in the range of 30 to 350 Kb (Table 1). It is worth noting that the observed resolution can vary greatly and is dependent on several factors, including probe density in a given region and the level of noise in the experiment. Thus, the functional sensitivity of these oligonucleotide arrays is usually much lower than what is suggested by the probe density and expected resolution.82 Further, the SNP-based microarrays (Affymetrix and Illumina) are potentially limited in the selection of probes as they require the presence of a validated SNP within the probe sequence. Thus, distribution of probes on these arrays is not as even as desired across the genome. The newer generation of Affymetrix (SNP 5.0 and 6.0) and Illumina (HumanCNV370-Duo and 1 Mb) arrays are addressing this issue by placing probes that are not SNP-based to fill the gaps in the SNP-based arrays. It remains to be seen how well this approach of placing SNP and non-SNP based probes on the same array will perform. Thus, the Agilent and NimbleGen platforms currently allow more freedom in the design of the array and probe distribution as they are not limited by the SNP content of the region of interest. This results in more evenly distributed probes across the region.
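
The arithmetic behind the expected resolution can be made explicit with a short Python sketch. It assumes, as a simplification, that resolution is the number of consecutive probes required for a call multiplied by the typical probe spacing; this reproduces the 30 Kb lower bound quoted above from five probes at roughly 6 Kb spacing. The function name and the 12 Kb and 70 Kb spacings are illustrative values, not platform specifications.

```python
def expected_resolution_kb(median_spacing_kb: float, min_probes: int = 5) -> float:
    """Rough size (in Kb) of the smallest alteration expected to be called:
    the number of consecutive probes required times the probe spacing."""
    return min_probes * median_spacing_kb


# Hypothetical probe spacings spanning a dense and a sparse platform.
for spacing in (6, 12, 70):
    print(spacing, "Kb spacing ->", expected_resolution_kb(spacing), "Kb")
```

Because probes are rarely evenly spaced, the same calculation applied to the local spacing in a region of interest gives a more realistic estimate than the genome-wide average, and experimental noise may raise the number of probes needed for a confident call.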

The lack of even distribution of probes notwithstanding, the SNP-based array platforms have one important advantage, in that they provide genotype information along with copy number data. As previously discussed, the SNP genotypes are helpful in detecting regions of LOH in the genome. The utility of LOH detection in cancer samples is already well documented,48 but LOH can also be relevant in constitutional disorders. In constitutional disorders, copy neutral LOH can result from UPD, which has been shown to cause genomic disorders characterized by multiple anomalies including developmental delay and MR.64 Thus, SNP-based array platforms can provide information about copy-neutral events such as UPD, which are undetectable by non–SNP-based arrays like Agilent and NimbleGen.

Further, the SNP genotype information can provide additional support for the copy number values computed from the ratio of the signal intensities in the test and reference samples (Fig. 1). In the case of a microdeletion, the loss of one copy is detected by a log2 ratio of −0.5 for the deleted probes (Fig. 1, A and B). In addition, all SNPs in this region are scored as homozygous (hemizygous) because of the loss of one allele. Besides resulting in high LOH scores, this is easily visualized on the analysis plots from both Affymetrix (Heterozygous SNP calls) and Illumina (B allele frequency plot) (Fig. 1) serving as an independent confirmation of the deletion. The B allele frequency plot provided by Illumina BeadStudio output provides the allelic ratio information (Fig. 1, B and C). In the case of a microduplication, the allelic composition for each probe can have one of four possible genotypes resulting from the duplication event including AAA, AAB, ABB, and BBB. The homozygous SNPs still cluster at 0.0 and 1.0, but the heterozygous SNPs that normally cluster between 0.4 and 0.6 (diploid AB) get split up into two clusters. The cluster at 0.33 represents the SNPs that have a genotype of AAB and the cluster at 0.67 represents the SNPs that have a genotype of ABB. This split in the heterozygous SNP cluster within the B allele frequency plot provides additional confirmation of a microduplication detected by a log2 ratio of 0.5 for the duplicated probes (Fig. 1C). Thus, the availability of the genotype information along with copy number should make the detection of authentic copy number alterations more robust using SNP-based platforms like Affymetrix and Illumina. Dye-swap experiments have been used with the Agilent and NimbleGen arrays to serve as a confirmation of detected copy number alterations. The high cost of the arrays and experimentation makes this option prohibitive in most cases.
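
The expected positions of the B allele frequency clusters follow directly from counting B alleles in each possible genotype, as the following Python sketch shows. The function names are hypothetical, and the log2 values it prints are the theoretical ones for a pure, noise-free sample; they are larger in magnitude than the mean ratios of approximately −0.5 and 0.5 cited above, since ratios observed on real arrays tend to be compressed toward zero.

```python
import math
from itertools import combinations_with_replacement


def expected_baf_clusters(copy_number: int) -> dict:
    """B allele frequency for every possible genotype at a given copy number."""
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    clusters = {}
    for genotype in combinations_with_replacement("AB", copy_number):
        name = "".join(genotype)
        clusters[name] = round(name.count("B") / copy_number, 2)
    return clusters


def ideal_log2_ratio(copy_number: int, normal: int = 2) -> float:
    """Theoretical log2 ratio of test versus a diploid reference."""
    return math.log2(copy_number / normal)


# Deletion (1 copy), normal (2 copies), duplication (3 copies).
for cn in (1, 2, 3):
    print(cn, expected_baf_clusters(cn), round(ideal_log2_ratio(cn), 2))
```

For three copies the sketch yields the AAA, AAB, ABB, and BBB clusters at 0.0, 0.33, 0.67, and 1.0 described in the text, and for one copy only the two homozygous-appearing clusters at 0.0 and 1.0 remain.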

Fig 1

Graphical output for copy number using SNP-based arrays. For probes with normal copy number, the signal intensity ratio of the subject versus controls is expected to be 1, and the log2 R ratio should be approximately 0.0 (log2 1 = 0). Loss of copy number resulting from deletion in the subject would result in a negative log2 ratio (mean log2 ratio ∼−0.5). Gain of copy number resulting from duplication in the subject would result in a positive log2 ratio (mean log2 ratio ∼0.5). A, Copy number output for chromosome 15 using the Affymetrix 50K Xba Mapping GeneChip as computed by copy number analysis software CNAG.81 Red dots represent raw log2 R ratio values for each SNP. Blue curves represent copy number inferences based on local mean analysis for 10 consecutive SNPs. Heterozygous SNP calls are shown as green bars below the ideogram. The deletion detected in this patient based on log2 R ratio is shown as a red bar. The deleted region has no heterozygous SNP calls. B and C, Copy number output using the Illumina HumanHap550K BeadChip Array as computed by BeadStudio software provided by Illumina. The two plots shown are for B allele frequency and log2 R ratio. B, The deletion on chromosome 15 detected in this patient based on log2 R ratio is shown by a red bar. The B allele frequency plot for this patient has no heterozygous (AB) SNP calls. C, The duplication on chromosome 22 detected in this patient based on log2 R ratio is shown by a green bar. The B allele frequency plot for this patient shows the split in the heterozygous SNP cluster representing the AAB and ABB genotypes.

Most array platforms have background noise in the data, and this is especially true for oligonucleotide arrays. It has been suggested that the level of noise is inversely proportional to the length of the oligonucleotide probes.76 Thus, the Affymetrix arrays with 25 mers are expected to have the highest noise level. The Agilent (60 mers) and NimbleGen (45-85 mers) arrays would therefore be expected to have less noise. Computational tools and statistical algorithms that allow extensive normalization of the data to control for noise have been developed and greatly minimize background.44,74,80,81,83 Another factor that may contribute to noisy data from the Affymetrix and Illumina arrays is the fact that the signal intensities are compared with a control set that was previously analyzed under different conditions. Thus, the Agilent and NimbleGen arrays have an advantage over the other platforms as the test and reference genomes are processed similarly and compared with each other on the same slide.

For copy number analysis of tumors, an important consideration in the selection of an oligonucleotide array platform is its suitability for the type of sample available. Thus, arrays that require genome complexity reduction are not suitable for formalin-fixed paraffin-embedded samples. For patients with MR/MCA, the source of DNA is usually a blood sample that yields high-quality genomic DNA suitable for all four platforms previously discussed. Future improvements in experimental and computational analysis may enable the use of low-yield samples such as buccal swabs or saliva for copy number detection using oligonucleotide arrays. The cost of the array, experimentation, and additional equipment needed are also factors for consideration in the selection of array platforms. The Affymetrix arrays are the most competitively priced, with a total cost of approximately $250 per sample for the available 500K set and SNP 5.0 (a few reagents have an additional cost). The cost of Agilent, NimbleGen, and Illumina arrays ranges between $550 and $800 (based on list price just for the arrays; assay kits and reagents may have an additional cost) depending on the density of the platform. Both Affymetrix and Illumina arrays require expensive, array-specific fluidics stations and scanners. Even Agilent and NimbleGen arrays require some investment in expensive equipment including hybridization chambers and scanners.

A perceived limitation of high-resolution, genome-wide oligonucleotide arrays is that, in addition to the pathogenic copy number alteration, they will also detect copy number variation of regions that may not be involved in the patient's disease phenotype. Several recent studies conducted on normal, healthy individuals have revealed extensive CNV within the human population.43,84–90 Initial studies have suggested that there are at least 20 CNVs in each of our genomes.90 This number is likely to increase as denser arrays are used for genome-wide scanning of copy number alterations. Thus, the presence of these CNVs could potentially hinder the accurate detection of pathogenic copy number aberrations. Although this may be true for novel copy number alterations found in patients with MR/MCA, this is clearly not an issue in the detection of known genomic disorders like DGS/VCFS, PWS/AS, or WBS, all of which are associated with large, well-defined copy number alterations.3–5,7,8 Parental testing is currently used to determine whether a novel copy number alteration detected in patients with MR/MCA is de novo before it is considered potentially pathogenic.37–39,52,53

It is worth noting that not all inherited copy number alterations are benign. Thus, the parent of a severely affected patient may carry the same copy number alteration but be phenotypically normal or have a milder phenotype, as has been shown for the 22q11.2 microduplication syndrome.91,92 Conversely, not all apparently de novo copy number alterations have to be pathogenic. Based on my observations (unpublished data), many copy number alterations localize to gene-poor regions and may represent rare variants that have no relevance to the patient's phenotype. Thus, the determination of clinical relevance of any observed copy number alteration would require a thorough analysis of the altered region especially for genic content. Thus, a region that is duplicated or deleted in a patient, is de novo, and contains multiple well-characterized genes is highly likely to be pathogenic. Single gene deletions or duplications, even if de novo, will be harder to interpret unless they involve a gene that has already been associated with a known syndrome. Many of the known CNVs found in control individuals involve single genes.

Thus, a database of CNVs found in healthy control individuals could provide a useful resource for the detection of pathogenic microdeletions and microduplications. A database of CNVs detected among multiple studies of control individuals has been created by the Toronto Center for Applied Genomics (http://projects.tcag.ca/variation/), which is periodically updated from published results. One has to be careful when using such resources, as the detection of a copy number aberration, once or a few times, in such a database does not necessarily make it a nonpathogenic CNV. These databases are highly dependent on the publications that report those particular CNVs and are not necessarily monitored for pathogenic status or authenticity. Ideally, the CNV database needs to be created from a large population of control individuals who have been clinically evaluated to rule out severe phenotypes found in patients with MR/MCA. This CNV database will then allow us to distinguish nonpathogenic CNVs from pathogenic copy number alterations detected in patients with MR/MCA using high-resolution oligonucleotide microarrays.
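
Comparing a patient's copy number alteration against a list of control CNVs amounts to an interval-overlap check, sketched below in Python. The coordinates are invented, the function names are hypothetical, and the 50% reciprocal-overlap threshold is an assumption chosen for illustration rather than a rule drawn from the studies cited here.

```python
def reciprocal_overlap(a, b) -> float:
    """Return the smaller of the two overlap fractions between intervals
    a and b, each given as (chromosome, start, end) with end > start."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def matching_control_cnvs(patient_cnv, control_cnvs, threshold=0.5):
    """List control CNVs whose reciprocal overlap with the patient CNV
    meets the (assumed) threshold."""
    return [c for c in control_cnvs if reciprocal_overlap(patient_cnv, c) >= threshold]


# Hypothetical control CNVs and one hypothetical patient alteration.
controls = [
    ("chr1", 1_000_000, 1_200_000),
    ("chr1", 5_000_000, 5_050_000),
    ("chr2", 1_000_000, 1_200_000),
]
patient = ("chr1", 1_050_000, 1_250_000)
print(matching_control_cnvs(patient, controls))
```

A match found this way is only a starting point for interpretation. As cautioned above, the presence of an overlapping entry in a control database does not by itself establish that the alteration is nonpathogenic.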

SUMMARY

It is clear that oligonucleotide-based microarrays are increasingly becoming the platform of choice for the detection of genome-wide copy number alterations. Further, most available high-density platforms have very good coverage across the regions that are known to be associated with clinically well-characterized genomic disorders.37–39,52,53,68,70,71 Thus, it is highly likely that genome-wide oligonucleotide arrays will be applied in clinical diagnostics in the foreseeable future. Commercial array manufacturers have already begun the process of testing and applying their high-density arrays in clinical diagnostics. Thus, it is critical that researchers, diagnostic laboratory directors, and clinicians are aware of the advantages and the limitations of these high-resolution platforms to better interpret the results and determine the clinical relevance of the copy number alterations detected in patients with MR/MCA.