Introduction

Chromosome analysis is an important part of the diagnostic approach to identifying congenital malformations, developmental delays, birth defects and mental retardation. G-banding analysis has been the gold standard for cytogenetic diagnosis ever since the development of chromosome banding techniques in the late 1960s (Caspersson et al. 1968). Although highly reliable, the major limitation remains the requirement for cell culture, resulting in a delay of as many as 14 days in obtaining test results (Shaffer and Lupski 2000). In an attempt to overcome some of the limitations of G-banding analysis, a number of innovative molecular technologies have been developed that have dramatically improved the ability with which structural and numerical chromosome abnormalities can be identified. The development of microarrays that identify selected loci from the entire genome (Snijders et al. 2001; Vissers et al. 2003) has initiated significant interest in adapting the array technology for the diagnosis of chromosomal abnormalities. Microarray comparative genomic hybridization (array CGH) is based on the same principle as conventional CGH, but array CGH differs in that genomic clones from selected regions of the genome replace the “control” metaphase cells as the target DNA (Solinas-Toldo et al. 1997; Geschwind et al. 1998; Pinkel et al. 1998; Albertson et al. 2000; Bruder et al. 2001; Snijders et al. 2001; Yu et al. 2003). Since genomic clones are used as the target DNA, the resolution of the technique is theoretically expanded to the size of an individual clone, depending on its size and spacing. Therefore, rearrangements and deletions, which are not visible by either routine G-banding analysis or conventional CGH methods, can be detected (Ishkanian et al. 2004).

Although it is becoming accepted that array CGH will have a place in clinical genetic testing, it is far from clear how this will be best applied. The coverage and resolution of array CGH are dependent on the design and density of the array used. Although ostensibly appealing, an array covering the entire genome at a very high resolution would have potential disadvantages in clinical use; large arrays are more expensive to fabricate, to control for quality and to analyze (Rickman et al. 2006). Recent investigations showing significant levels of copy number polymorphism in normal populations (Iafrate et al. 2004; Sebat et al. 2004) reinforce the desire to test only a limited number of clones which would preclude complications in interpretation (Cheung et al. 2005; Rickman et al. 2006; Sahoo et al. 2006; Shaffer et al. 2006). In prenatal diagnosis, it is very important that the results be obtained promptly in order to alleviate the patient’s anxiety. In addition, when ambiguous results in conventional G-banding analysis are encountered [e.g. small supernumerary marker chromosomes (sSMC) and the unbalanced product of a translocation], it is very important to be able to explore the origin of the abnormal chromosomal fragment (Liehr et al. 2004; Liehr et al. 2006).

Therefore, we have developed an array CGH system (consisting of an array CGH chip and its dedicated analysis software) for constitutional genetic diagnosis. Our array CGH chip consists of 1440 non-overlapping bacterial artificial chromosome (BAC) clones (MACArray Karyo 1400 BAC-chip; Macrogen, Seoul, Korea), which were selected from among 96,768 BAC clones constructed by the Korean Genome Project and carefully mapped and validated by end-sequencing and fluorescence in situ hybridization (FISH). Our analysis software (MacViewer) has a wide range of features with a user-friendly Windows graphical user interface (GUI) and easy-to-use software tools. Instead of navigating traditional pull-down menus, the user clicks the auto execution button once, and all analysis steps, including spot gridding, segmentation and quantification, are processed automatically and ultimately reported as a log2Test/Reference (T/R) signal ratio graph.

In the study reported here, we evaluated our array CGH analysis system as a molecular diagnosis system in which we used genomic DNA extracted from pre- and postnatal clinical subject samples, including amniotic fluid (AF), chorionic villi (CV), cord blood (CB), peripheral blood (PB) and products of conception (POC).

Materials and methods

Sample collection

To validate our array CGH system, eight standard cell strains, containing cytogenetically mapped Patau [AG12070 (XX), GM00526 (XY)], Edwards [GM00143 (XX), GM01359 (XY)], Down [GM03606 (XX), AG05397 (XY)], Turner (GM10179) and Klinefelter (GM03102) syndrome loci, were purchased along with normal cell strains [GM08400 (XX), GM08402 (XY)] from the Coriell Institute for Medical Research (Camden, NJ). Genomic DNA to be used for array CGH analysis was extracted from these cells.

The archived DNA samples used for this study, which carried known karyotype abnormalities as determined by G-banding analysis, consisted of 42 clinical samples [AF (n = 15), CV (n = 24), CB (n = 1) and PB (n = 2)] that had been previously collected with informed consent at the Hamchoon Women’s Clinic. Additionally, the 222 clinical samples used for array CGH and G-banding analysis in this study consisted of AF, CV, CB, PB and POC. During the study period (3 months and 10 days), 384 patients visited the Hamchoon Women’s Clinic of Korea, of whom 222 agreed to undergo additional DNA chip analysis. One hundred and eighteen AF samples were collected because of an increased risk for chromosomal abnormalities and a high risk for neural tube defect on second trimester maternal serum screening (n = 92), advanced maternal age (n = 23) or abnormal ultrasonography (USG) findings (n = 3). Seventy-nine CV samples were collected because of advanced maternal age (n = 52) or an increased risk for chromosomal abnormalities on first trimester screening, including increased nuchal translucency (n = 27). Seventeen CB samples were collected from the umbilical cords of newborns, and PB samples were obtained from adults. Appropriate ethical approval was obtained during the recruitment of patients. The reference DNA used in this study was extracted from a placenta that was identified as a normal male (46, XY) by karyotyping.

Cytogenetic analysis

Cells from diverse clinical samples were cultured, arrested in metaphase and harvested, and the chromosomes were then G-banded using standard techniques. Image acquisition of metaphase cells and the subsequent karyotyping were performed. The karyotypes were characterized according to the guidelines of the International System for Human Cytogenetic Nomenclature (ISCN 1995).

Construction of the BAC library and the BAC-mediated CGH microarray

The array (MACArray Karyo 1400 BAC-chip) used in this study consists of 1440 human BACs that are spaced approximately 2.3 Mb apart on average across the entire genome (Cho et al. 2005; Park et al. 2006). The 1440 BAC clones, which include clones covering 356 cancer-related genes, were taken from the proprietary BAC library of Macrogen. The source of human DNA for making the BAC library was sperm derived from one Korean man. The locations of the approximately 1440 clones are shown on our website (http://www.macrogen.co.kr/eng/biochip/genelist_overview.html). Briefly, the pECBAC1 vector (Frijters et al. 1997) was digested with HindIII, and size-selected pooled male DNA was used to generate a BAC library. These vectors were then transformed into and grown in the Escherichia coli DH10B strain. All clones were sequenced from both ends using an ABI PRISM 3700 DNA Analyzer (Applied Biosystems, Foster City, CA), and their sequences were analyzed by BLAST and mapped according to their positions as described in the University of California, Santa Cruz (UCSC) human genome database (http://www.genome.ucsc.edu). The locus specificity of the chosen clones was confirmed by examining each clone individually using standard FISH procedures as described previously (Pinkel et al. 1986) and removing clones that bound to multiple loci. BAC DNA was prepared from these clones by the conventional alkaline lysis method. The DNA was then sonicated to generate fragments of approximately 3 kb before being mixed with 50% DMSO spotting buffer. The arrays were manufactured with an OmniGrid arrayer (GeneMachine, San Carlos, CA) using a 24-pin format. Each BAC clone was represented on an array as triplicate spots, and each array was pre-scanned using a GenePix4000B scanner (Axon Instruments, Union City, CA) to check for proper spot morphology.

DNA isolation, DNA labeling, array hybridization and analysis

Test subject DNA was extracted from 2.6 mg of tissue (CV, POC), 4 ml of AF and 300 μl of blood (CB, PB) using the PureGene kit (Gentra Systems, Minneapolis, MN) and then dissolved in 100 μl of DNA Hydration Solution; DNA from AF was dissolved in 25 μl of DNA Hydration Solution. The concentration of test DNA was approximated by comparing the band intensities of test and reference DNA with that of a λHindIII ladder (SM0101; Fermentas, Vilnius, Lithuania). After extracting the DNA, we labeled 50–500 ng of the test and reference DNA with Cyanine 3- and Cyanine 5-dCTP (Perkin Elmer), respectively, by a random priming method using Exo-Klenow Fragment (Invitrogen, Carlsbad, CA) for 16 h. Labeled test and reference DNA was purified, precipitated with ethanol plus Human Cot-1 DNA and dissolved in hybridization buffer according to the manufacturer’s recommendations. For hybridization, labeled DNA was denatured at 70°C for 15 min, applied to an array CGH slide that had been pre-hybridized with denatured salmon sperm DNA and then incubated at 37°C for 42–48 h. Following hybridization, the slides were washed for 15 min at 46°C in 50% formamide and 2× SSC, washed for 15 min at 48°C in 2× SSC and 0.1% sodium dodecyl sulphate (SDS), washed for 15 min at room temperature in PN buffer (0.1 M sodium phosphate buffer, 0.1% NP40, pH 8.0) and finally washed for 5 min at room temperature in 2× SSC. The slides were dehydrated by briefly soaking them in an ethanol series (70, 85 and 100%) followed by centrifugation (1500 rpm, 5 min, room temperature). Images of the hybridized slides were acquired using a GenePix4000B dual-laser scanner (Axon Instruments, Union City, CA) by simultaneously scanning each array at wavelengths of 635 and 532 nm. The spots were analyzed with our newly developed software (MacViewer).
The mean ratio of fluorescence intensities derived from the hybridized subject and reference control DNA at each test spot on the microarray was calculated and normalized by the mean ratios measured from reference spots on the same slide. Because each clone is printed in triplicate on the microarray, the mean ratio of the three normalized spots for each clone was obtained, automatically converted to a log2 scale and plotted in graph form.
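
The per-clone calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the MacViewer implementation: the function name, its arguments and the intensity values are hypothetical, and the normalization is assumed to be a simple division by the mean T/R ratio of the reference spots on the same slide.

```python
import math
from statistics import mean


def clone_log2_ratio(test, reference, reference_spot_ratio):
    """Log2 of the mean normalized T/R ratio for one clone.

    test, reference: fluorescence intensities of the replicate spots
    (three per clone on this array) in the test and reference channels.
    reference_spot_ratio: mean T/R ratio of the reference spots on the
    same slide, used here as the normalization factor (assumed > 0).
    """
    if len(test) != len(reference) or not test:
        raise ValueError("need matching, non-empty replicate intensities")
    # Normalize each replicate spot ratio, then average across replicates.
    normalized = [(t / r) / reference_spot_ratio for t, r in zip(test, reference)]
    return math.log2(mean(normalized))


# Hypothetical intensities for one clone printed in triplicate.
balanced = clone_log2_ratio([1000, 1100, 900], [1000, 1100, 900], 1.0)
gained = clone_log2_ratio([1500, 1650, 1350], [1000, 1100, 900], 1.0)
print(round(balanced, 3), round(gained, 3))
```

In this sketch a clone with equal test and reference signal yields a value of zero, and a clone present in three copies in the test against two in the reference yields a positive value.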

Fluorescent in situ hybridization

Cells from diverse clinical samples were cultured, arrested in metaphase and harvested. The FISH analyses were performed as described elsewhere (Hayashi et al. 2005) using BACs located around the region of interest as probes. The hybridization signals were captured using a CytoVision system.

Results

The initial assessment of the performance of our system was carried out using hybridizations of genomic DNA samples from a series of standard cell lines purchased from Coriell Cell Repositories. We analyzed the results from 34 independent tests designed to detect trisomies of chromosomes 13, 18 and 21 and sex chromosome aneuploidies (Turner and Klinefelter syndromes). The means (calculated as log2T/R signal ratio values) and standard deviations (SD) for all of the assays are shown in Fig. 1. The log2 values behave in a linear fashion because the relation of the intensities of both dyes inversely influences the value of the intensity ratio at a spot. To determine the range of values that could be confidently diagnosed for every assay, we calculated the 99% confidence interval (CI) for the distribution of control and affected individuals. We selected a cut-off value using 34 times the SD across all the assays that produced no false positive or false negative results. For the autosomes and X chromosomes, an average log2T/R signal ratio value >0.200 was classified as a chromosome number gain, whereas an average log2T/R signal ratio value ≤−0.200 was classified as a chromosome number loss. For the Y chromosome, an average log2T/R signal ratio value >0.200 was classified as a chromosome number gain, and an average log2T/R signal ratio value ≤−0.400 was classified as a chromosome number loss.
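
The cut-off rule can be expressed as a short sketch. The function and constant names are hypothetical, and the loss cut-offs are assumed to be negative (−0.200 for autosomes and the X chromosome, −0.400 for the Y chromosome), in line with the loss threshold of <−0.200 given in the legend of Fig. 2.

```python
GAIN_CUTOFF = 0.200        # all chromosomes
LOSS_CUTOFF = -0.200       # autosomes and X chromosome (sign assumed)
LOSS_CUTOFF_Y = -0.400     # Y chromosome (sign assumed)


def classify_chromosome(chromosome, mean_log2_ratio):
    """Classify the average log2 T/R ratio of one chromosome.

    chromosome: label such as "13", "X" or "Y".
    Returns "gain", "loss" or "normal".
    """
    loss_cutoff = LOSS_CUTOFF_Y if chromosome == "Y" else LOSS_CUTOFF
    if mean_log2_ratio > GAIN_CUTOFF:
        return "gain"
    if mean_log2_ratio <= loss_cutoff:
        return "loss"
    return "normal"


# Hypothetical chromosome averages for one sample.
for chrom, value in [("21", 0.35), ("X", -0.03), ("Y", -0.30), ("18", -0.30)]:
    print(chrom, classify_chromosome(chrom, value))
```

With these assumed cut-offs, the same average of −0.30 is called a loss on an autosome but remains within the normal range for the Y chromosome.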

Fig. 1

Typical results of control and affected individuals for all assays. The Y-axis represents the mean log2T/R signal ratio value with standard deviations for all of the assays. For autosomes and X chromosomes, an average log2T/R signal ratio value >0.200 was classified as a chromosome number gain, whereas an average log2T/R signal ratio value ≤−0.200 was classified as a chromosome number loss. For the Y chromosome, an average log2T/R signal ratio value >0.200 was classified as a chromosome number gain, and an average log2T/R signal ratio value ≤−0.400 was classified as a chromosome number loss

To validate the criteria and guidelines for clinical diagnosis with our array CGH system, we tested 42 archived genomic DNA samples with known karyotype abnormalities identified by G-banding analysis. Among these 42 archived DNA samples, there were 24 cases of trisomy 21, nine cases of trisomy 18, four cases of 47, XXY, three cases of 45, X and two cases of trisomy 13. We found that the results of the array CGH were completely in accord with the cytogenetic banding results (Table 1). Additionally, to determine the applicability of this procedure to constitutional diagnosis using pre- and postnatal clinical samples, we performed cytogenetic analysis and array CGH simultaneously. To this end, 222 diverse clinical samples comprising AF (n = 118), CV (n = 79), CB (n = 17), PB (n = 2) and POC (n = 6) were analyzed. The results of the two methods were compared in a blinded fashion to ascertain whether array CGH could verify all of the abnormalities found by G-banding analysis and to determine whether additional undetected changes could be identified. As shown in Table 2, in 221 of the 222 clinical samples the karyotype results of array CGH were in exact concordance with those defined by banding analysis (concordance rate = 99.5%). Of these 222 clinical samples, 11 showed abnormal results in the cytogenetic banding analysis. Among these 11 cases, both techniques obtained abnormal results in eight cases: six cases of autosomal trisomies (trisomy 9, 15, 16, 18, 20 and 21) and two cases with sex chromosome abnormalities (XXY, XYY). In two cases of sSMC, the origin of which cannot be determined by conventional cytogenetics, the chromosomal origin was identified by array CGH. One case that was identified as 47, XXX by cytogenetic banding analysis was reported as 46, XX by array CGH. Notably, a considerable portion of the abnormalities was identified in POC.
Figure 2 provides a comparison of the data for typical autosomal trisomies and sex chromosome abnormalities by array CGH and conventional cytogenetic analysis.
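
The sample counts and the concordance rate reported above can be checked with a few lines of arithmetic. The variable and function names below are illustrative only; the numbers are those given in the text.

```python
# Sample counts as reported for the 222 prospectively analyzed samples.
samples = {"AF": 118, "CV": 79, "CB": 17, "PB": 2, "POC": 6}
total = sum(samples.values())


def concordance_rate(concordant, n):
    """Percentage of samples in which both methods gave the same karyotype."""
    return 100.0 * concordant / n


prospective = concordance_rate(221, total)   # one discordant 47, XXX case
archived = concordance_rate(42, 42)          # archived samples, all concordant
print(total, f"{prospective:.1f}%", f"{archived:.1f}%")
```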

Table 1 Results of array comparative genomic hybridization (CGH) analysis on 42 archived genomic DNA samples with known chromosomal abnormalities
Table 2 Comparison between array CGH analysis and cytogenetic banding analysis on 222 clinical samples
Fig. 2

Detection of autosomal trisomies (a) and sex chromosome abnormalities (b) by array comparative genomic hybridization (CGH) and karyotyping. A subject sample was shared for array CGH and karyotyping. In array CGH, a normal male (46, XY) was used as reference. A dot represents a bacterial artificial chromosome (BAC) clone, the X-axis represents chromosome number (1–22, X, Y) and the Y-axis represents the log2T/R signal ratio value. Green dots represent a copy number gain (log2T/R signal ratio value >0.200) and red dots represent a copy number loss (log2T/R signal ratio value <−0.200). The table below the graph represents the average log2T/R signal ratio value for each chromosome

As shown in Fig. 3, array CGH identified the origin of chromosomal material that could not be assigned by G-banding. G-banding analysis of the PB and AF samples showed the karyotypes to be 46, XX, add (8) (p23.3) and 46, XY, 21ps+, respectively (Fig. 3a,b, left panel). Array CGH characterized the derivative chromosomes as 46, XX, der (8) (8qter → 8p23.3::13q32.1 → 13qter) and 46, XY, der (21)(15qter → 15q22.31::21p11 → 21qter), respectively (Fig. 3a,b, right panel). To confirm the origin of the derivative chromosome 21, we carried out FISH analysis with probes for the chromosome 21 subcentromeric region (green) and the duplicated 15q22.31 region (red), which revealed signals for both regions on the derivative chromosome 21 (Fig. 3c). These findings suggest that array CGH may be a useful method for identifying unknown additional and rearranged chromosomes.

Fig. 3

Detection of the unknown origin of supernumerary marker chromosome material, derived from chromosome 13q32.1 in the peripheral blood (PB) sample (a) and from chromosome 15q22.31 (indicated by dotted circle and arrow) in the amniotic fluid (AF) sample (b). The left and right panels show the G-banding and array CGH analyses, respectively. In the right panel, the red line represents the patient:reference (normal male) fluorescence intensity ratios for each clone; the blue line represents the fluorescence intensity ratios obtained from a second hybridization in which the dyes were reversed (reference:patient). c The additional region detected by array CGH was verified by fluorescence in situ hybridization (FISH) analysis. The FISH probes corresponding to BAC clone 21q11.2 (no. 88) in the centromere of chromosome 21 [labeled with fluorescein isothiocyanate (FITC) and shown in green] and BAC clone 15q22.31 (no. 1074) in the duplicated region from chromosome 15 (labeled with rhodamine and shown in red) were hybridized to metaphase cells. Red signals could be detected on both chromosome 15 and der(21) (green box), and green signals could be detected on both chromosome 21 and der(21)

In summary, array CGH detected the abnormalities identified by cytogenetic analysis in 42 of the 42 archived samples and 221 of the 222 clinical samples.

Discussion

The MACArray Karyo 1400 BAC-chip is a genomic array with 1440 targets, spotted in triplicate, which includes 41 subtelomeric regions, 13 clones for DiGeorge syndrome, ten clones for Williams syndrome, eight clones for Cri-du-Chat syndrome, seven clones for Prader-Willi syndrome, five clones for Miller-Dieker syndrome, five clones for Wolf-Hirschhorn syndrome, four clones for Smith-Magenis syndrome and other loci of interest, allowing for rapid fine mapping of regions of gained or lost DNA sequence. However, our array is composed of 1440 non-overlapping BAC clones that provide an average resolution of 2.3 Mb. Thus, this platform has some limitations in facilitating the discovery and description of minor genetic aberrations, such as microdeletions and microduplications. In addition, our platform has 35 BAC clones containing copy number variation (CNV) regions (6482 CNVs; http://projects.tcag.ca/variation/), but the sizes of these 35 CNV events range from 1 to 74 kb, with an average size of 11 kb. Consequently, we did not detect them in this study. The maximum resolution that could be obtained using BAC-based clone arrays would be provided by a genomic tiling path array of 32,000 targets, which has already been achieved (Ishkanian et al. 2004). Even higher resolutions can be obtained using arrays with overlapping BAC clones (Ishkanian et al. 2004), smaller insert fragments (Bruder et al. 2001), PCR products (Mantripragada et al. 2004) or oligonucleotides (Lucito et al. 2003). However, the clinical application of array CGH at higher resolutions will likely have to wait (Vermeesch et al. 2005). At the higher resolution levels obtained by array CGH, polymorphic loci are likewise detected and, because of the higher resolution, the number of observed variants increases correspondingly. Two recent studies have reported the prevalence of large-scale copy number variations throughout the human genome (Iafrate et al. 2004; Sebat et al. 2004).
Also, large arrays are more expensive to make and to control for quality. Distinguishing benign genomic variants from disease-causing gains or losses can be challenging even when clones are selected for low rates of polymorphism (Cheung et al. 2005). Therefore, the ideal array would contain the minimum number of clones that will deliver the required diagnosis.

In this study, we have demonstrated the usefulness and clinical applicability of our array CGH system (consisting of an array CGH chip plus its dedicated analysis software) for the detection of chromosomal abnormalities in a total of 264 archived and newly obtained clinical samples, including prenatal (AF, CV) and postnatal (CB, PB, POC) samples. Moreover, the average time to obtain the results was <72 h by array CGH in contrast to 10–14 days by conventional banding analysis. In the case of prenatal diagnosis, it is very important to obtain the result promptly in order to alleviate the pregnant woman’s anxiety. In addition, the sample quantity needed to obtain a result by array CGH was smaller (AF: 4 ml; CV and POC: 2.6 mg of tissue; CB and PB: 300 μl) than that required in conventional cytogenetic banding studies. In particular, we recovered 50–100 ng of genomic DNA from only 4 ml of AF, which was amplified to 5–7 μg after random priming labeling using the Exo-Klenow fragment. Within 20 of removing after 20–30 ml of AF from the fetal sac, the 4-ml aliquot was used for array CGH, with the remainder being used for G-banding analysis. Miura et al. (2006) reported that cell-free fetal DNA in the supernatant of AF (10 ml) can be used for array CGH as a sample for prenatal diagnosis. Other researchers have reported the feasibility of performing array CGH with DNA isolated from as little as 1 ml of uncultured AF (Rickman et al. 2006). Through direct use of AF without cell culture, array CGH can be adopted as a rapid pre-screening method. Moreover, array CGH pre-screening results are an important complement to G-banding analysis when karyotyping results cannot be reported because of cell culture failure. Chromosomal analysis of POC has been an important component of our increased understanding of the causes of fetal demise and multiple miscarriages. However, diagnosis in POC samples is often hindered by a relatively high (10–40%) rate of tissue culture failure (Lomax et al. 2000). There is no need for tissue culture in an array CGH platform, which is particularly advantageous for POC samples, as confirmed by recent studies (Shaffer and Bejjani 2004; Benkhalifa et al. 2005).

In all of the 264 clinical samples but one, the results of array CGH were in exact concordance with those of the cytogenetic analysis. The Korea Food and Drug Administration (KFDA) has recently approved the marketing of our array CGH system for the diagnosis of hereditary diseases due to chromosome aberrations. We encountered one case of sex chromosomal polysomy, 47, XXX, which could not be identified by array CGH. In the case of 47, XXX, the range of the standard deviation by array CGH is wide and, therefore, likely to result in error. Consequently, it is necessary to change the reference DNA to sex-matched DNA or 47, XXY DNA instead of 46, XY DNA. A recent report (Ballif et al. 2006) described the inherent limitations of array CGH in detecting sex chromosome anomalies, especially polysomies of the X chromosome, and recommended the use of 47, XXY DNA as the reference instead of 46, XY DNA. We therefore performed array CGH using 47, XXY DNA as the reference instead of 46, XY DNA; the XX and XXX samples showed log2T/R signal ratio ranges of −0.063 to 0.009 (−0.027 ± 0.036) and 0.263 to 0.395 (0.329 ± 0.066), respectively (data not shown). In these experiments, trisomy X was clearly distinguishable from a normal diploid female (46, XX).
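
The effect of the reference choice can be illustrated with the idealized log2 ratios expected from X chromosome copy number alone. This is a simplified sketch that assumes the signal ratio is exactly proportional to copy number; measured BAC array ratios are typically smaller than these theoretical values, as the observed XXX mean of 0.329 shows. The function name is hypothetical.

```python
import math


def expected_log2_ratio(test_x_copies, reference_x_copies):
    """Idealized log2 T/R ratio for the X chromosome from copy number alone."""
    return math.log2(test_x_copies / reference_x_copies)


# Reference 46, XY carries one X; reference 47, XXY carries two.
for ref_name, ref_x in [("46, XY", 1), ("47, XXY", 2)]:
    xx = expected_log2_ratio(2, ref_x)
    xxx = expected_log2_ratio(3, ref_x)
    print(f"reference {ref_name}: XX = {xx:.3f}, XXX = {xxx:.3f}")
```

Under this assumption the theoretical separation between XX and XXX is the same with either reference. What changes is the baseline: against a 46, XY reference both female karyotypes appear as X chromosome gains that must be told apart from each other, whereas against a 47, XXY reference a normal female sits at zero and trisomy X appears as a gain above it.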

Additionally, in this study, the origin of chromosomal material of unknown origin, such as an “additional marker” chromosome, was identified by array CGH (Fig. 3). Small supernumerary marker chromosomes are structurally abnormal chromosomes equal in size to or smaller than chromosome 20; they cannot be identified or characterized unambiguously by conventional cytogenetic banding techniques (Liehr et al. 2004). In general, the risk for an abnormal phenotype is about 7% when de novo sSMCs deriving from chromosomes 13, 14, 21 and 22 are ascertained prenatally (Lin et al. 2006). Patients with small derivatives of chromosome 15 tend to have a normal phenotype, and sSMCs derived from chromosomes 13, 21 and 14 also appear to have a low risk of abnormalities (Liehr et al. 2004, 2006). The characterization of sSMCs is of utmost importance for genetic counseling, especially in prenatal diagnosis (Douet-Guilbert et al. 2007). Thus, because different phenotypes are attributable to the chromosomal origin of the sSMC and because the origin cannot be determined by routine cytogenetics alone (Jardim et al. 2007), molecular cytogenetic methods, such as array CGH, are necessary to identify these additional chromosomal markers.

In summary, molecular analysis by our array CGH system is a promising technique that allows rapid screening of samples for genome-wide chromosome changes and identification of the origin of chromosomal fragments of unknown origin, and it may augment standard karyotyping techniques for pre- and postnatal genetic diagnosis. However, array CGH is unable to detect polyploidy and balanced chromosomal rearrangements. Therefore, array CGH is a useful rapid diagnostic method, especially when combined with conventional cytogenetic analysis.