Main

Tumors are commonly characterized by recurrent DNA copy number alterations (CNAs), which often result from unbalanced translocations, amplifications, and deletions. Analysis of these and other types of structural chromosomal alterations continues to play a crucial role in the diagnosis and prognosis of cancer patients. Historically, these clonal karyotypic aberrations have been detected through conventional cytogenetic approaches such as karyotype analysis on Giemsa-banded metaphase chromosomes and metaphase-based comparative genomic hybridization (CGH). These methods offer poorer resolution (10–100 fold) than recently developed molecular techniques and depend on actively dividing cells and the subjective interpretation of metaphase chromosomes by highly trained technologists. Conversely, conventional cytogenetics facilitates the identification of chromosome abnormalities on an individual cell level in fresh (viable) tissue, which is important when one considers the level of heterogeneity present in most tumors, including the presence of nontumor tissue in the pathology sample. Without tumor enrichment methods, molecular analyses could be compromised when the entire sample is analyzed, thereby diluting and underestimating the genetic properties of the tumor.

Over the past several years, array CGH (aCGH) has emerged as a reliable and reproducible high-resolution molecular approach for detecting genomic imbalances in cancer, with the advantage of using fixed and frozen (nonviable) pathology tissue.1–5 Genomic imbalances in tumor cells can be precisely delineated and breakpoints determined with a high degree of accuracy. Recurrent aberrations can rapidly be identified across broad patient populations, leading to diagnostic assays or the identification of potential targets for therapeutic intervention.6–10 The ability to recognize and detect the progression of genetic events occurring during tumorigenesis is critical to developing strategies for therapeutic intervention. Therefore, identification of the targets of genetic instability that lead to invasive cancer might be crucial in (a) understanding the basis of neoplastic progression; (b) identifying potential candidate genes for more accurate diagnostic assays for risk assessment, early detection, or outcome prediction; and (c) predicting drug responsiveness or identifying molecular targets for alternative and, perhaps, less toxic therapies.

New technology, however, inevitably results in methodologic challenges that must be overcome or minimized for aCGH analysis to perform optimally and to provide an accurate reflection of the magnitude and position of genomic imbalances in a tumor sample. In this review, we focus on our recent efforts to resolve some of these pressing issues: quality of DNA and whether archival tumor tissues can yield high-quality DNA for aCGH analysis; methods to amplify DNA when very few target cells are available and whether such amplification techniques contribute additional technical noise to the aCGH profile; microdissection methods to enrich the tumor fraction before aCGH analysis; and, finally, the performance of two widely used aCGH platforms: Roswell Park Cancer Institute (RPCI) bacterial artificial chromosome (BAC) and Agilent Technologies (Santa Clara, CA) oligonucleotide arrays.

MATERIALS AND METHODS

Human cell line sample and preparation

The human Hodgkin's lymphoma cell line L428 (CD30+) for use in the laser capture microdissection (LCM) and downstream aCGH experiments described herein was purchased from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (German Collection of Microorganisms and Cell Cultures, no. ACC 197). Conventional GTG (G-bands obtained by trypsin and subsequent staining with Giemsa)-banded cytogenetic studies using standard techniques and 24-color spectral karyotyping (SKY) (Applied Spectral Imaging, Carlsbad, CA), performed according to manufacturer's instructions, were used to confirm the highly complex near-tetraploid karyotype before aCGH.

Exponentially growing cells were collected by centrifugation, washed, and fixed in formalin for 20 hours before paraffin embedding. Serial sections from the formalin-fixed, paraffin-embedded (FFPE) blocks were fixed onto PALM membrane slides (Zeiss, Bernried, Germany). Slides were pretreated as follows for immunostaining: 1 hour at 65°C in a dry oven, 1 minute in xylene at room temperature, 5 minutes in 100% ethyl alcohol (2), 5 minutes in 3% H2O2, rinsed in distilled H2O. Antigen retrieval was performed at 98°C for 30 minutes using the Biocare Medical Decloaking Chamber and Diva Decloaker Universal Heat Retrieval Buffer (Biocare Medical, Concord, CA). In parallel, exponentially growing cells were collected by centrifugation, washed, and embedded in Tissue Tek Optimum Cutting Temperature (OCT) compound (Sakura, Torrance, CA) before snap freezing in −47°C 2-methylbutane. Serial sections from the frozen blocks were fixed onto PALM slides, which were pretreated as above, without xylene. Antigen retrieval was not performed on frozen samples. Slides were stained with monoclonal mouse antihuman CD30 (Dako, Carpinteria, CA) using the DakoCytomation Autostainer (Dako, Inc.) per manufacturer's instructions. The identical approach has recently been used successfully in an ongoing study with archived FFPE patient tissue samples.

Groups of L428 cells, ranging from a single cell to 100 cells, were catapulted and isolated into 10 μL Tris EDTA (TE) buffer by targeted ultraviolet A pulse on the PALM Microbeam workstation (Carl Zeiss MicroImaging, GmbH, Bernried, Germany). From the frozen sections, only groups of 100 cells were isolated by LCM. Isolated cells or representative dilutions of DNA extracted from exponentially growing cells were amplified by random fragmentation whole genome amplification (WGA) using the GenomePlex single cell whole genome amplification kit (WGA4, Sigma-Aldrich, St. Louis, MO), according to product instructions. Random fragmentation WGA was selected over random priming (or multiple displacement) WGA for this project and the lung adenocarcinoma project because random fragmentation WGA is amenable to amplification of the lower quality DNA that is often associated with archival tissues. Furthermore, the Sigma kit facilitated amplification from very low quantities of starting material. The quality and quantity of DNA were estimated by UV absorbance and electrophoresis. Fragment size after amplification was estimated electrophoretically and was within the manufacturer's range for product size (Sigma-Aldrich). The quantity and size of amplified product were highly reproducible (data not shown). Unamplified bulk DNA was treated throughout as the reference for comparison.

Patient tumor samples and preparation

Seventeen head and neck squamous cell carcinoma (HNSCC) samples and seven FFPE ovarian tumors were obtained from the RPCI Translational Tissue Resource and Paraffin Archive. The HNSCC samples were chosen from a cohort of 59 frozen HNSCC tumors that provided successful aCGH results. Two corresponding FFPE blocks were included for each HNSCC sample, except for three, for which only one block was available. All samples were obtained under protocols approved by the RPCI Institutional Review Board and reviewed by a pathologist (D.H. and P.M.-F.) to ensure that the tumors contained >50% tumor cells. DNA from the fresh frozen tissue and the FFPE tumors was prepared using the DNeasy tissue (Qiagen, Valencia, CA) and Puregene DNA purification (Gentra, Minneapolis, MN) kits, respectively, as described.11 Tissue sections (4 μm) selected from two different paraffin blocks containing the same HNSCC tumor were stained with hematoxylin and eosin (H&E) to demarcate areas for tumor macrodissection. Additional HNSCC tissue sections were chosen and prepared similarly to examine concordance with their matched frozen tissue. The quality of 19 FFPE-derived DNA samples was assessed using the BioScore Screening and Amplification kit per manufacturer's instructions (Enzo Life Sciences, Farmingdale, NY). The BioScore assay requires 100 ng of FFPE-derived DNA. For the Hodgkin's lymphoma and lung adenocarcinoma analyses described in this study, the amount of starting material was too low to use the BioScore assay; therefore, alternate quantification, qualification, and WGA methods were used.

Fifty lung adenocarcinoma samples were obtained from the City of Hope Department of Pathology. All samples were obtained under protocols approved by the City of Hope Institutional Review Board. Tissue sections (4 μm) were H&E stained and reviewed by a pathologist (L.M.W.) to confirm the diagnosis.

Two manual dissection techniques of the lung adenocarcinoma tumors were compared in this study. DNA was extracted (1) directly from frozen OCT tumor tissue, using an H&E slide for guidance and scraping the tumor cells from the area identified by the pathologist or (2) directly from H&E-stained slides where the tumor-containing region had been identified and located on the slide by a pathologist. Coverslips were removed with xylene or water (depending on mounting medium) and, if xylene was used, slides were rehydrated in a 95%, 70%, and 50% ethanol series. For extraction from OCT tissue or H&E slides, cells were manually scraped using a scalpel either from the tissue (OCT) or the slide (H&E). DNA extraction proceeded as per manufacturer's protocol (Qiagen EZ1 DNA Extraction Kit; Qiagen). To have sufficient DNA for aCGH (1 μg), random fragmentation WGA was performed using the WGA2 kit (Sigma-Aldrich) with 10 or 50 ng of the extracted DNA, per manufacturer's instructions. Amplified amounts (1–10 μg) and fragment size were in the manufacturer's range (Sigma-Aldrich) and were reproducible (data not shown).

aCGH analysis

For BAC aCGH studies, the RPCI 19k BAC array was used. DNA printing solutions were prepared from sequence-connected RPCI-11 BACs by ligation-mediated polymerase chain reaction, as described previously.11–13 The minimal tiling RPCI BAC array contains 19,000 BAC clones that were chosen by virtue of their Sequence-Tagged Site (STS) content, paired BAC end-sequence, and association with heritable disorders and cancer. The backbone of the array consists of 4600 BAC clones that were directly mapped to specific, single chromosomal positions by fluorescent in situ hybridization (FISH).12 Reference and test sample genomic DNA (1 μg each) were individually fluorescently labeled using the BioArray CGH Labeling System (Enzo Life Sciences) as described.14 The hybridized BAC-based aCGH slides were scanned using a GenePix 4200AL Scanner (Molecular Devices, Union City, CA) to generate high-resolution (5 μm) images for both Cy3 (test) and Cy5 (control) channels.

For oligonucleotide-based aCGH studies, the Agilent 244k CGH array was used. Reference and test sample genomic DNA (1 μg each) were individually fluorescently labeled using the Agilent Genomic DNA Labeling kit, direct method, as described by the manufacturer (Agilent Technologies, Santa Clara, CA). After hybridization, slides were washed and scanned in an Agilent microarray scanner to generate high-resolution (5 μm) images for both Cy3 (test) and Cy5 (control) channels.

All 59 HNSCC samples were analyzed by BAC aCGH with DNA derived from frozen tissue, and the details of this study will be reported elsewhere. We used subsets from this group of 59 HNSCC samples and the seven ovarian samples for the following BAC aCGH comparisons: replicate analysis of six tumors from frozen tissue banks and analysis of 10 tumors with matching FFPE samples. Using both HNSCC and ovarian tumor samples, comparisons of the signal-to-noise ratio on the BAC aCGH platform were performed for the following groups: 59 frozen samples, 24 samples from FFPE-derived DNA, and 13 samples from WGA-derived DNA.

Microarray image and data analysis

BAC array image analysis was performed using ImaGene (version 6.1.0) software (BioDiscovery, Inc., El Segundo, CA) as described.14 Image analysis on the Agilent 244k arrays was performed using the Feature Extraction version 9.1 (Agilent Technologies; CGH-v4_91) protocol. The results were imported into CGH Analytics version 3.4.27 (Agilent Technologies). The log2 Tumor/Control (hereafter log2) profiles for all arrays were segmented using the DNAcopy software15 as described in the statistical analysis section.

FISH validation of CNAs

To validate CNAs by locus-specific FISH, H&E slides were soaked in xylene until the coverslips fell away, rehydrated in an ethanol series, and air-dried. Slides were fixed in cold Carnoy's fixative for 1 hour and air-dried for 1 hour. For L428 validation (supplemental Fig. 2), fresh slides were prepared from harvested exponentially growing cells and allowed to age overnight at room temperature. The fresh slides were immersed in 2× SSC at 37°C for 10 minutes, and the H&E slides were pretreated with pepsin (0.5 mg/mL in 10 mM HCl) at 37°C for 5 minutes. After pretreatment, standard FISH procedures were followed as described15 with a minor modification, namely, the probe and slide were codenatured on an 80°C hot plate for 5 minutes. Fifty interphase cells were analyzed by two independent researchers (100 cells total) using a dual-band pass filter on a Nikon HFX-DX microscope. For L428, 100 interphase cells were analyzed by two independent researchers (200 cells total).

Statistical analysis

All BAC array data were preprocessed using methods as described.16 Regions with common copy number means were identified by segmenting the genome using the DNAcopy software. The median absolute deviations (MADs) were calculated for the BACs on each segment, and the median of the MAD scores was taken across all segments. Each BAC was assigned a “fitted” log2 value equal to the median of the segment to which the BAC was assigned. All BACs whose log2 value differed from the fitted log2 ratio by more than 4 times the median MAD score were identified as outliers. The fitted log2 ratios for outlier BACs were set to the original log2 ratio values. Missing values were replaced by the average fitted log2 values of the nearest nonmissing flanking BACs. Noise values were also calculated for each platform via the mean squared error from the fitted circular binary segmentation (CBS) segment values. Signal-to-noise values were computed using probes located on the X chromosome, by taking the median of the X chromosome fitted log2 values divided by the MAD for that sample. The significance of differences in signal-to-noise values between the BAC aCGH platform and the Agilent aCGH platform was determined via two-sample paired t tests for matched samples that were analyzed on both platforms. Correlation of the signal was calculated via Pearson correlation and Spearman correlation statistics. (See online supplemental material for details regarding the calculations.)
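The outlier rule and the X chromosome signal-to-noise calculation described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the segment assignments are taken as given (in the study they come from DNAcopy), the function names are hypothetical, and the "MAD for that sample" is interpreted here as the median of the per-segment MADs.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of the values about their median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def fit_and_flag(log2, segment_ids, k=4.0):
    """Return (fitted values, outlier flags, sample noise).

    log2        : per-BAC log2 ratios
    segment_ids : segment label for each BAC (assumed to come from a
                  segmentation step such as DNAcopy; not computed here)
    k           : outlier cutoff in units of the median MAD (4 in the text)
    """
    by_segment = {}
    for value, seg in zip(log2, segment_ids):
        by_segment.setdefault(seg, []).append(value)

    # Fitted value of a BAC = median of the segment it belongs to.
    seg_median = {seg: median(vals) for seg, vals in by_segment.items()}
    # Sample noise = median of the per-segment MADs (an interpretation).
    noise = median([mad(vals) for vals in by_segment.values()])

    fitted = [seg_median[seg] for seg in segment_ids]
    outlier = [abs(v - f) > k * noise for v, f in zip(log2, fitted)]
    # Outlier BACs keep their original log2 ratio as the fitted value.
    fitted = [v if o else f for v, f, o in zip(log2, fitted, outlier)]
    return fitted, outlier, noise


def signal_to_noise(x_fitted, noise):
    """Median fitted log2 on chromosome X divided by the sample noise."""
    return median(x_fitted) / noise


# Toy data: five BACs on one segment (one spurious value), three on another.
log2 = [0.0, 0.1, -0.1, 0.05, 2.0, 0.5, 0.6, 0.4]
segments = [0, 0, 0, 0, 0, 1, 1, 1]
fitted, outlier, noise = fit_and_flag(log2, segments)
```

In this toy example only the fifth BAC is flagged, because it lies far more than 4 times the median MAD from its segment median; its fitted value is therefore reset to its original log2 ratio.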

For the LCM samples, the samples representing different numbers of isolated cells and distinct treatments (FFPE and WGA) were compared using descriptive statistics including weighted and unweighted Pearson correlations and concordance metrics to describe the proportion of aCGH segments that agreed between pairwise sample comparisons. Comparisons between paired undissected and dissected samples in the lung adenocarcinoma samples were made between segments that were ±2 SD from the mean of all segments for either treatment group. The BACs that comprised these segments were then compared using a two-sided nonparametric Mann-Whitney test. For all tests, P < 0.05 was considered significant.
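A minimal sketch of two of the descriptive steps above: selecting segments that lie outside the mean ± 2 SD of all segments, and computing a simple concordance proportion between two paired profiles. The function names, the use of the sample standard deviation, and the ±0.2 log2 threshold for calling gain or loss are assumptions made for illustration; the source does not specify them.

```python
from statistics import mean, stdev


def extreme_segments(segment_values, n_sd=2.0):
    """Indices of segments lying outside mean +/- n_sd * SD of all segments."""
    mu = mean(segment_values)
    sd = stdev(segment_values)  # sample SD; an assumption
    return [i for i, v in enumerate(segment_values) if abs(v - mu) > n_sd * sd]


def call(value, threshold=0.2):
    """Classify a segment value as gain, loss, or neutral (hypothetical cutoff)."""
    if value > threshold:
        return "gain"
    if value < -threshold:
        return "loss"
    return "neutral"


def concordance(values_a, values_b, threshold=0.2):
    """Proportion of paired segments given the same call in both samples."""
    agree = sum(
        call(a, threshold) == call(b, threshold)
        for a, b in zip(values_a, values_b)
    )
    return agree / len(values_a)


# Toy data: nine flat segments and one clear gain.
segment_means = [0.0] * 9 + [1.0]
selected = extreme_segments(segment_means)

# Paired segment values from two treatments of the same material.
share = concordance([0.5, -0.5, 0.0, 0.3], [0.4, -0.1, 0.05, 0.25])
```

The BAC-level values within the selected segments would then be passed to a two-sided Mann-Whitney test, which is not reimplemented here.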

RESULTS

Microdissection to isolate rare cells or to separate cellular populations

To test the hypothesis that amplified DNA from a limited number of archival cells will closely approximate the aCGH pattern of untreated tumor cells, FFPE cells derived from the L428 Hodgkin's lymphoma cell line were individually isolated and pooled into aliquots ranging from 1 to 100 cells for WGA and aCGH. Isolated cells were amplified using a single-cell kit based on random fragmentation WGA (WGA4; Sigma-Aldrich). This method of DNA amplification consistently and reproducibly yielded microgram quantities of high-quality 0.5 to 1.0 kb DNA (range, 1.5–12 μg, mean ± SD = 6.7 ± 2.9 μg), even from a single cell or single cell equivalents (6 pg DNA). When CNAs were assessed by BAC aCGH, the technical noise in the data decreased and the number of distinct segments increased as cell numbers increased from 1 to 100 (Fig. 1) with the 100-cell sample very closely approximating the untreated (unamplified) reference DNA. Importantly, regions of change were evident in the single-cell samples that persisted and increased in resolution with increasing cell numbers as shown in Figure 1 for the 7p and 9qter chromosome regions.

Fig 1

Comparison of representative array comparative genomic hybridization (aCGH) profiles from 1 cell, 10 cells, and 100 cells isolated by laser capture microdissection (LCM) from the human Hodgkin's lymphoma cell line L428. Chromosomal location (Mb) and log2 ratio are plotted along the x- and y-axes, respectively. Replicates of individually isolated and amplified samples are shown, and chromosomes 7 and 9 (left and right columns, respectively) are displayed for instructional purposes; the remainder of the genome showed similar aCGH signals (and noise) and relationships between the treatment groups. Segments indicating loss (e.g., chromosome 7p21.1-p14.2) and gain (e.g., chromosome 9q32.2-q32.3) are evident even in the single-cell samples, which become more apparent and statistically significant as the number of cells increases to the gold standard untreated reference. (See text for detailed statistics.)

Positional log2 ratio values for each segment were compared to determine correlation between the replicates and treatments. The data were highly reproducible between replicates, with values for single-cell, 10-cell, and 100-cell replicates ranging from r = 0.83 to 0.85, r = 0.87, and r = 0.967, respectively. Overall, the best correlation with the gold standard untreated reference sample was that of the 100-cell groups (r = 0.94–0.947), but the correlation r values for the 75-cell, 10-cell, and the single-cell groups were still unexpectedly high (r = 0.91, r = 0.75–0.80, and r = 0.34–0.51, respectively). Of interest, the single-cell and 10-cell groups revealed consistent CNAs (Fig. 1). When replicates were grouped and compared with the untreated reference, the correlations increased as expected (data not shown). In every case, the P values of the correlations were <0.0001, indicating that all pairwise correlations were significantly greater than 0.

A comparison between treatments suggested that none of the conditions of the current study affected downstream data generated by aCGH (Fig. 2). Comparison of representative aCGH data from untreated reference DNA to diluted, amplified DNA demonstrated that WGA was not introducing undesirable allele bias or otherwise altering aCGH results. Comparison of representative aCGH from FFPE, LCM-isolated, and amplified DNA (right) to the amplified reference (center) demonstrated that neither FFPE nor LCM was affecting aCGH results. Both conclusions are strongly supported by a comparison of the two treatment extremes in the current study, i.e., the untreated reference and the most treated (FFPE, LCM, WGA) 100-cell sample (Fig. 2) and the highly significant correlation between these two data sets (r = 0.95, P < 0.0001). The region 2p14 → 2p22.1 was consistently elevated in L428 (arrows), which coincided with copy number gains assessed by 24-color SKY (data not shown).

Fig 2

Determination of chromosome 2 copy number changes under different experimental conditions: comparison between formalin-fixed, paraffin-embedded (FFPE) or frozen, LCM and WGA treatment, WGA alone, and untreated samples. An example with chromosome 2 is provided, left to right: untreated reference DNA extracted from bulk cell culture, DNA (0.6 ng total) diluted from untreated bulk culture reference and amplified using WGA, 100 cells laser capture microdissection (LCM) isolated from FFPE samples and amplified using WGA, and 100 cells LCM isolated from frozen samples and amplified using WGA. Chromosomal location (Mb) and log2 ratio are plotted along the x- and y-axes, respectively. Across all chromosomes and within individual chromosomes (chromosome 2 shown in this example), the data were highly correlative when treatments were compared (see Results), demonstrating that the archiving procedures, LCM, or WGA techniques did not introduce experimental artifact.

For the lung adenocarcinoma study, DNA was extracted initially from frozen OCT tumor tissue using an H&E slide for guidance as described above. A major concern with this method was the depth of the frozen tissue block, which could have increased the potential for inclusion of adjacent nontumor cells in downstream analyses (e.g., WGA and BAC aCGH). A minor concern was the extent of tumor necrosis. Due to these concerns, a slide-based method was adopted (described above) based on 4-μm sections mounted on microscope slides, which would minimize accidental inclusion of adjacent nontumor cells or necrotic tissue (Fig. 3, A). The patterns of CNAs were similar between the microdissected and the undissected samples (Fig. 3, B, upper and lower, respectively); however, the magnitude of change (segment value) was increased in the microdissected sample. In addition to the overall means being different, the overall variation was higher in the microdissected sample. This variation (noise) could have been due to the heterogeneity of the tumor population and the absence of any muting effect from nontumor DNA. The microdissected sample in the upper panel was also amplified by WGA (see Materials and Methods for details), which may have contributed somewhat to the noise; however, statistically, genomic amplification did not alter the aCGH profiles, which showed high correlation to unamplified DNA (r = 0.67–0.79), a finding consistent with our LCM data (Figs. 1 and 2). Microdissection resulted in a substantial enrichment in tumor cells for downstream applications, as evidenced by the significant differences in aCGH segment values between undissected and microdissected samples (P < 0.05) (Fig. 3, C).

Fig 3

Importance of isolating lung adenocarcinoma tumor cells from frozen sections before array comparative genomic hybridization (aCGH). (A) Hematoxylin and eosin (H&E)–stained slides were examined by a pathologist (L.M.W.), who carefully determined the tumor cell area (circled in photo). (B) Copy number alterations (CNAs) are enriched after microdissection. Top panel, representative aCGH profiles for chromosomes 1, 5, 12, and 20 after tumor cells were dissected from nontumor tissue as described in the text. Bottom panel, representative aCGH profiles from undissected tissue, where only subtle CNAs are observed. The subtle changes observed in the bottom panel are more pronounced in the top panel, illustrating how adjacent nontumor DNA mutes the tumor aCGH signature. (C) aCGH segments in the microdissected samples that were outside of the mean ± 2 SD of all segments were compared with the corresponding regions from the undissected samples to substantiate the observations made in B. An example from a single such pairwise comparison is given. The mean of all the BAC log2 ratio values for a given segment (±SD) from the undissected sample is shown next to the same information for the corresponding microdissected sample. Both gains and losses are enriched after microdissection, and all comparisons are statistically significant at P < 0.05 (two-sided Mann-Whitney U test). These changes, most especially the deletions, would not have been easily detected if the sample had not been microdissected.

Validation of CNAs

One of the strengths of BAC aCGH methodology is the ease of CNA validation, which can be performed rapidly and economically using BACs in the area of the CNA as templates for FISH. In the frozen lung adenocarcinoma samples, BAC aCGH uncovered several CNAs that were common across samples. We focused our FISH-based confirmation on 5p, which showed a gain in 10 of 38 lung adenocarcinoma tumors analyzed to date. This chromosome arm was recently reported to be amplified in small cell lung carcinoma and contains the TRIO gene, which is associated with the activation of Jun kinase and may have a role in cell growth and migration.17 To confirm this gain, BAC RP11-81P9 was selected within the elevated segment (and close to the TRIO gene coding sequence) and was labeled for FISH as described in Materials and Methods. This region was consistently elevated across samples, and the estimated population fraction by aCGH, with a gain of 0.4 to 0.45, allowed for over- and underestimates by FISH (Fig. 4). The fraction of the population with copy number gains by FISH (35 ± 4%) was similar to the estimated fraction based on aCGH values for this specific BAC itself (log2 = 0.29, estimated fraction = 44%). The same validation approach demonstrated concordance between aCGH estimated copy number, FISH copy number, and SKY data in the L428 cell line (supplemental Fig. 2).
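The relationship between a log2 ratio and the fraction of cells carrying a gain can be illustrated with a simple mixture model. The sketch below assumes a diploid reference and that every affected cell carries exactly one extra copy; this model and the function name are assumptions for illustration, since the source does not state how the estimated fraction was derived. Under these assumptions, the mean copy number is 2 + f, so the ratio is 1 + f/2 and f = 2(2^log2 − 1).

```python
def fraction_with_gain(log2_ratio, normal_copies=2, gained_copies=3):
    """Estimate the fraction of cells carrying a gain from a log2 ratio.

    Assumes a mixture of cells with `normal_copies` and cells with
    `gained_copies` at the locus, measured against a reference with
    `normal_copies` (a hypothetical, simplified model).
    """
    ratio = 2.0 ** log2_ratio
    return normal_copies * (ratio - 1.0) / (gained_copies - normal_copies)


# log2 ratio reported in the text for BAC RP11-81P9.
estimated = fraction_with_gain(0.29)
```

For log2 = 0.29 this simple model gives a fraction of roughly 0.44 to 0.45, which is close to the estimated fraction quoted in the text. Cells with four copies, as seen in some nuclei by FISH, would lower the fraction needed to produce the same log2 ratio.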

Fig 4

Confirmation of array comparative genomic hybridization (aCGH) copy number alterations (CNAs) by fluorescence in situ hybridization (FISH). (Top) A recurrent CNA from the lung adenocarcinoma study was observed on band 5p15.2, close to the cellular locus for the TRIO gene. The BAC, RP11-81P9, circled in red and adjacent to the TRIO gene coding region, was used to confirm CNAs by FISH (image, Roswell Park Cancer Institute aCGH viewer). (Bottom) FISH analysis for the TRIO gene (RP11-81P9; red signal) and a reference probe on the opposing chromosome arm (RP11-51D11; 5q35; green signal). The fraction of the population showing gains of the TRIO gene sequence by FISH was similar to the fraction estimated by aCGH log2 values. Left to right: interphase nuclei from the same tumor showing two, three, and four copies of the TRIO gene, respectively.

Comparison of BAC and oligonucleotide aCGH technologies

We designed a performance comparison of high-resolution BAC RPCI 19k minimal tiling arrays and Agilent oligonucleotide CGH platforms using DNA isolated from a series of HNSCC frozen tissue samples with multiple matched FFPE blocks and from ovarian cancer FFPE samples (adenocarcinoma and neuroendocrine tumors). For a subset of the HNSCC cases, WGA (Bioscore) was used to further determine the effect of DNA quality on aCGH studies. This analysis allows us to quantify the effect of DNA source on these aCGH platforms by correlating across arrays as well as DNA source. Pearson's correlation coefficients of FFPE DNA assessed by Bioscore to the CGH array results on matching frozen samples were calculated (supplemental Table 1). Overall, the Bioscore assay was successful in identifying the FFPE samples that would yield high-quality, interpretable CGH results.

In addition to the lower quality of DNA from FFPE tumor blocks, tumor heterogeneity can also result in a lowered correlation. Although FFPE and frozen tumor samples are derived from the same original tumors, they may differ in the degree of cellularity (i.e., percentage of normal cells within a sample), tumor necrosis, and heterogeneity in tumor cell populations. BAC aCGH reveals CNAs in one HNSCC FFPE tumor block that are absent from the matched frozen sample and an FFPE block from a different region of the same tumor (Fig. 5). Amplification of a large region on chromosome 8q encompassing the MYC oncogene was identified; however, this region of the tumor did not show amplification on the X chromosome in this FFPE block, which was observed in the frozen and the alternate FFPE tissue block. Thus, although the sample of the tumor that was selected for the frozen tumor bank and the sample that was embedded as one of the two FFPE tumor blocks have essentially identical aCGH profiles (high correlation coefficients), the sample that was selected for the second FFPE block represents intratumor heterogeneity with amplified MYC sequences remaining on band 8q24.1, the normal cellular locus for MYC compared with the other two sections and correlates to a lesser degree. This finding underscores the vast genomic instability of tumors and the need to examine either more than one area of a tumor mass or, as shown by the microdissection studies, the importance of examining the pathologically defined region of the tumor that accurately reflects the true CNAs and the biology of the tumor.

Fig 5

Tumor heterogeneity between sections examined from replicate formalin-fixed, paraffin-embedded (FFPE) blocks. Chromosomes 2, 8, 11, and X are shown as examples. Copy number alterations (CNAs) on chromosomes 11 and X show similar profiles in a frozen sample (A), array comparative genomic hybridization (aCGH) profile of an alternate FFPE sample (D) shows CNAs distinct from A, B, and C, suggesting that the same tumor may give different aCGH profiles depending on the region of the tumor that is sampled.

We compared source DNA within and across BAC and Agilent aCGH platforms. Chromosome-specific Circular Binary Segmentation (CBS) plots for a frozen HNSCC tissue sample and an FFPE sample are shown in Figure 6. The Agilent platform segmentation most closely matches the BAC platform on the frozen tumor samples. Genome-wide comparisons for the same tumor samples are shown in supplemental Figures 3 and 4. We then compared DNA source across a series of samples for each platform. Signal was estimated by the magnitude of change on the X chromosome because chromosome X was always altered by virtue of the sex-mismatched controls. Signal-to-noise calculations were computed as described in the statistical analysis. The Agilent aCGH platform yielded many more outlying segments of smaller length than the BAC platform for the same sample (supplemental Table 2). The same outlier cutoff rule (4 times the median MAD score) was implemented for both the BAC and Agilent platforms. The larger number of outliers for the Agilent arrays suggests either that the platform is identifying small regions of aberration missed by the BAC platform or that the noise process for the Agilent array is more heavy-tailed and susceptible to large spurious outlying values. On average, there are more segments as determined by CBS on the Agilent platform than on the BAC platform, and the segments are smaller in length (supplemental Table 2). It should be noted that the CBS algorithm was applied with a setting of α = 0.025 to data for both platforms. The α value is proportional to the probability of spuriously identifying a segment break. Therefore, the higher density of the Agilent arrays is expected to provide an increased number of spurious segmentations. For the matched frozen and matched FFPE samples, the signal-to-noise ratio was significantly higher for the BAC aCGH platform than for the Agilent aCGH platform (P < 0.001 in each case, paired t test).

Fig 6

Performance of bacterial artificial chromosome (BAC) and Agilent aCGH from frozen (A) and FFPE (B) sections. BAC array comparative genomic hybridization (aCGH) is shown across the top row in both A and B, and Agilent aCGH is shown across the bottom row in both A and B. The log2 ratios are plotted in blue and the CBS segmentation values are in red.

The signal-to-noise results were calculated for each source DNA type for the BAC aCGH platform (supplemental Table 2, see online supplemental material for calculation details). The signal-to-noise values decrease when moving from frozen tissue samples to FFPE or WGA-derived DNA samples. This is consistent with the increase in noise, either MAD or mean squared error, when moving in the same direction (supplemental Table 3).

We also examined how the correlation changes when moving from frozen to FFPE to WGA samples for each platform. Pearson and Spearman correlations were calculated for both preprocessed log2 ratios and segment values obtained from CBS. Using the Pearson correlation as the similarity measure, FFPE samples and WGA samples are the most similar, whereas, as expected, frozen samples and WGA samples are the most dissimilar (supplemental Table 4, see online supplemental material for statistical methods). Similar results were observed using the Spearman correlation. Note that in both cases, the agreement between DNA obtained from frozen tissue banks and paraffin archives increases if CBS fitted segment values are used as opposed to the preprocessed log2 ratios.
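For readers who want to reproduce this kind of comparison, the two similarity measures can be sketched in plain Python as follows. This is a generic illustration of Pearson and Spearman correlation applied to paired values (for example, log2 ratios or fitted segment values for the same probes from two DNA sources); the function names and toy data are hypothetical and are not taken from the study's supplemental methods.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy paired profiles for the same probes from two DNA sources.
frozen = [0.0, 0.1, 0.5, 0.6, -0.4]
ffpe = [0.05, 0.0, 0.4, 0.7, -0.3]
r_pearson = pearson(frozen, ffpe)
r_spearman = spearman(frozen, ffpe)
```

Applying the same functions first to the preprocessed log2 ratios and then to the fitted segment values gives the two sets of correlations contrasted in the text.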

DISCUSSION

Genome-wide detection of CNAs in neoplasia has the potential to characterize individual tumors by their “molecular fingerprint,” ultimately leading to the discovery of improved diagnostics and therapeutics. This study reports a series of experiments, together with their methodologic details, designed to address several critical hurdles impeding the integration of aCGH into clinical medicine.

Tumor cell heterogeneity can seriously reduce the efficacy of molecular genotyping techniques such as aCGH. For instance, isolation of individual tumor cells is crucial for an accurate genomic signature (aCGH profile) of the malignant Hodgkin Reed-Sternberg cell in Hodgkin's lymphoma because the Hodgkin Reed-Sternberg cells comprise a very small fraction of the tumor-populating cells.18 The isolation of individual cells may also be important in the analysis of rare malignant cells exfoliated into cerebrospinal fluid from central nervous system lymphomas or metastases,19 or biopsy specimens where the putative malignant cell is infrequent in the specimen due to infiltrating adjacent normal cells or tissue.20 Both microdissection techniques described in this report, LCM and manual microdissection, provide the opportunity to rapidly isolate and examine normal, adjacent (margin), and tumor tissue in parallel from archival tissues and may be useful in future interrogations of rare cancer stem cells and the stem cell niche.

Manual microdissection is technically straightforward and inexpensive and is ideal for situations where a clear demarcation between tumor and nontumor populations is evident, as observed in the lung tumor samples in this study. Our comparison of microdissected to undissected non-small cell lung cancer samples demonstrates that the removal of adjacent nontumor cells can significantly improve the detection of CNAs (Fig. 3). Furthermore, aCGH profiles are distinct after sampling from two different areas of the same tumor (FFPE and alternate FFPE; Fig. 5), supporting the view that it is essential to accurately isolate the tumor area or cells before aCGH testing.

LCM is a powerful methodology to gently and specifically isolate individual tumor cells. We found that as few as 100 LCM-isolated Hodgkin Reed-Sternberg tumor cells from frozen or FFPE samples were sufficient to reproduce unmanipulated (control) tumor aCGH profiles (Figs. 1 and 2). However, LCM requires expensive instrumentation and may be best reserved for situations where the tumor cells exist as a minor component of the tissue, for example, needle biopsy specimens or Hodgkin's lymphoma.18

For both microdissection methods described, DNA amplification was necessary before labeling for aCGH. Because of the need for high-quality DNA, a WGA platform (GenomePlex WGA, Sigma-Aldrich) that randomly fragments DNA into 0.5-kb segments before amplification was selected based on the assumption that the impact of fragmentation associated with formalin fixation should be reduced compared with other WGA platforms, for instance, multiple displacement WGA, which requires a larger template and produces fragments around 10 kb.21 The amplification performed reproducibly well from both archival sources and produced aCGH profiles very similar to unamplified controls (Figs. 1 and 2). Furthermore, comparisons among untreated samples, samples that were diluted and amplified only, and frozen and FFPE samples that were amplified showed no evidence of allele bias using the random fragmentation WGA (Fig. 2). Allele bias has been reported using multiple displacement amplification from FFPE archival DNA.22 Recently, Fiegler et al.23 demonstrated the utility of GenomePlex WGA in amplifying single cells for aCGH. Our WGA and aCGH results from single archival cells are very similar to those from the fresh cells used by Fiegler et al. In their report, however, although distinct regions of CNA were observed in the single-cell samples, resolution was decreased to at least 3 Mb because individual elements were combined in sets for detection of CNA, and the smallest change uncovered was 8.3 Mb. We also observed distinct regions of change in the FFPE single-cell samples (Fig. 1), but the correlation to the control was not as high as with greater cell numbers. Simulated single-cell replicates indicated that it would require 30 to 40 single-cell aCGH samples to very closely correlate with the bulk culture (data not shown), suggesting that our resolution with single-cell aCGH is on the order of 5 to 7 Mb as well (average BAC size, 0.178 Mb). It will likely prove technically challenging to increase aCGH resolution to <3 Mb and detect statistically significant changes in individual cell samples (Fiegler et al.; this study), but this is an area of active investigation.
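The 5 to 7 Mb estimate is consistent with multiplying 30 to 40 pooled elements by the average BAC size of 0.178 Mb. That reading is an assumption; the text does not spell out the calculation. A short Python sketch of the arithmetic, with a hypothetical helper name, is:

```python
AVERAGE_BAC_SIZE_MB = 0.178  # average BAC size stated in the text


def pooled_span_mb(n_elements, element_size_mb=AVERAGE_BAC_SIZE_MB):
    """Genomic span covered when n_elements array elements are pooled.

    Assumes effective resolution scales as count times average size.
    """
    return n_elements * element_size_mb


low, high = pooled_span_mb(30), pooled_span_mb(40)
print(f"{low:.1f} to {high:.1f} Mb")  # prints "5.3 to 7.1 Mb"
```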

BAC-based aCGH has been the gold standard for aCGH analysis of cancer samples for the past several years. It has been proven to have the highest signal-to-noise ratio and the lowest coefficient of variation in a recent study comparing BAC, Agilent, Affymetrix, and Illumina CGH technologies.24 Functional resolution for these platforms has recently been shown to be essentially equivalent to high-density or tiling BAC arrays for detecting single-copy alterations.25 aCGH studies of archival FFPE-derived DNA are limited to platforms not requiring complexity reduction such as BAC or Agilent aCGH.25 Agilent Technologies and BAC aCGH both use total genomic DNA, unlike Affymetrix and Illumina, which generate complexity reductions of the test sample before aCGH analysis.25 aCGH platforms that tolerate or are amenable to DNA isolated from archival sources will have great utility in the clinical environment. As part of this study, we have shown the ability of the Bioscore assay to assess quality of FFPE DNA before aCGH studies, thus enabling better utilization of archival source DNA. Furthermore, by comparing matched samples and measuring their signal-to-noise value (see statistical analysis), we have shown that BAC aCGH provides significantly higher signal-to-noise values compared with Agilent oligonucleotide arrays when based on chromosome X values. We also provide evidence demonstrating the decrease in signal-to-noise values in transitioning from frozen tissue source DNA to FFPE tissue source DNA to WGA DNA samples. This decrease in signal-to-noise value is accompanied by, on average, a larger number of CBS segments fit for FFPE-derived DNA than for frozen DNA. Ultimately, identifying divergent subpopulations that exist within a tumor through microdissection or focused sampling will provide a more comprehensive and accurate analysis that may only be possible through archival DNA sources.

The challenges of standardization and reproducibility of aCGH as a potential diagnostic tool are slowly being resolved. In this study, we have shown the potential for both BAC- and oligonucleotide CGH–based studies on archival samples. Eliminating the prevailing concerns of poor-quality DNA and developing the means to identify heterogeneity within a sample allow us to move forward with the interrogation of large clinical tumor banks. This will greatly facilitate identification and validation of molecular cytogenetic biomarkers that indicate the biological behavior (aggressiveness), invasive potential, and most applicable treatment strategy of genetically characterized tumor subgroups or patient-specific tumors.