Genomic and gene expression profiling of minute alterations of chromosome arm 1p in small-cell lung carcinoma cells

Genetic alterations occurring on human chromosome arm 1p are common in many types of cancer including lung, breast, neuroblastoma, pheochromocytoma, and colorectal. The identification of tumour suppressors and oncogenes on this arm has been limited by the low resolution of current technologies for fine mapping. In order to identify genetic alterations on 1p in small-cell lung carcinoma, we developed a new resource for fine mapping segmental DNA copy number alterations. We have constructed an array of 642 ordered and fingerprint-verified bacterial artificial chromosome clones spanning the 120 megabase (Mb) 1p arm from 1p11.2 to p36.33. The 1p arm of 15 small-cell lung cancer cell lines was analysed at sub-Mb resolution using this arm-specific array. Among the genetic alterations identified, two regions of recurrent amplification emerged. They were detected in at least 45% of the samples: a 580 kb region at 1p34.2–p34.3 and a 270 kb region at 1p11.2. We further defined the potential importance of these genomic amplifications by analysing the RNA expression of the genes in these regions with Affymetrix oligonucleotide arrays and semiquantitative reverse transcriptase–polymerase chain reaction. Our data revealed overexpression of the genes HEYL, HPCAL4, BMP8, IPT, and RLF, coinciding with genomic amplification.

More than 42 000 new cases of small-cell lung cancer (SCLC) are diagnosed annually in the United States, representing approximately 20% of all new lung cancer cases. Median survival time is 12–36 months and the majority of patients eventually die from the disease (Minna et al, 2003). Advancing our understanding of the molecular characteristics of SCLC may lead to the ability to better diagnose and treat the disease. One approach to this is the detection of regions of genetic alteration in the tumour genome in order to identify the genes that are causal to the disease. Identification of novel genetic changes correlated to specific human malignancies has typically been a tedious and laborious process. Recently, high-throughput and high-resolution detection of such alterations has been made possible through the use of bacterial artificial chromosome (BAC) array comparative genomic hybridisation (CGH) (Pinkel et al, 1998). This technique has been applied to map alterations across the genome at 1 megabase (Mb) resolution (Snijders et al, 2001; Greshock et al, 2004), at higher resolution for specific chromosomal regions and chromosome arms in several tumour types (Buckley et al, 2002; Garnis et al, 2003, 2004a), and, most recently, across the whole human genome (Ishkanian et al, 2004).
Approximately 120 Mb of DNA span human chromosome arm 1p. Multiple regions on this chromosome arm have been implicated in SCLC and non-small-cell lung cancer (NSCLC) (Girard et al, 2000; Nomoto et al, 2000; Chizhikov et al, 2001; Ashman et al, 2002). Here, we describe the construction of a unique resource for analysing 1p. In all, 642 fingerprint-verified, physically mapped, overlapping BAC clones serve as target DNA for high-resolution BAC array CGH. Using this resource, we identified novel regions of DNA alteration in SCLC cells. We further defined the role of these alterations by employing Affymetrix oligonucleotide arrays and semiquantitative reverse transcriptase–polymerase chain reaction (RT–PCR) to analyse gene expression.

BAC array construction and CGH
To design a minimal tiling path, BAC clones from the RPCI-11 library (Osoegawa et al, 2001) were selected from the FPC (Fingerprinted Contigs) database (International Human Genome Mapping Consortium, 2001). Wherever possible, sequenced and overlapping clones were chosen. Clones were obtained from glycerol stocks of the RPCI-11 library. DNA isolation and restriction enzyme digestion were performed as described previously (Marra et al, 1997). Each clone's identity was verified by comparing its HindIII DNA fingerprint to that found in the FPC database. The BAC DNA amplification, array printing, hybridisation, and analysis were performed as described previously (Garnis et al, 2004a). A log2 signal ratio of zero represents the most common copy number between the sample and normal reference DNA. The array was subjected to normal/normal hybridisation in order to eliminate clones whose log2 ratio deviated from 0 by greater than +0.2 or less than −0.2. Clones with standard deviations among the triplicate spots greater than 0.075 were excluded from further analysis. A gain of copy number was defined as a log2 ratio of greater than 0.2, and a loss as a log2 ratio of less than −0.2. The average log2 ratio of two or more adjacent clones was required to be beyond the threshold in order for a region of alteration to be defined.
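The calling rules above can be illustrated with a minimal sketch. It is not the analysis software used in the study: the clone names and values are hypothetical, the functions are illustrative, and it assumes that an excluded (noisy) clone is simply skipped without breaking a run of adjacent clones. It also simplifies the "average of two or more adjacent clones" rule by requiring each clone in the run to pass the threshold on its own.

```python
from statistics import mean, stdev

GAIN_THRESHOLD = 0.2       # log2 ratio above this is a gain (from the text)
LOSS_THRESHOLD = -0.2      # log2 ratio below this is a loss (from the text)
MAX_REPLICATE_SD = 0.075   # triplicate spots noisier than this are excluded


def clone_ratio(replicates):
    """Mean log2 ratio of a clone, or None if its replicate spots are too noisy."""
    if stdev(replicates) > MAX_REPLICATE_SD:
        return None
    return mean(replicates)


def classify(ratio):
    """Label a single clone as 'gain', 'loss', or 'neutral'."""
    if ratio > GAIN_THRESHOLD:
        return "gain"
    if ratio < LOSS_THRESHOLD:
        return "loss"
    return "neutral"


def call_regions(ordered_clones):
    """ordered_clones: list of (name, [replicate log2 ratios]) in map order.

    Returns (state, [clone names]) for every run of two or more adjacent
    clones that share the same gain or loss state.
    """
    regions = []
    run_state, run_names = None, []
    for name, replicates in ordered_clones:
        ratio = clone_ratio(replicates)
        if ratio is None:
            continue  # assumption: an excluded clone does not break a run
        state = classify(ratio)
        if state != run_state:
            if run_state in ("gain", "loss") and len(run_names) >= 2:
                regions.append((run_state, run_names))
            run_state, run_names = state, []
        run_names.append(name)
    if run_state in ("gain", "loss") and len(run_names) >= 2:
        regions.append((run_state, run_names))
    return regions


# Hypothetical data, for illustration only.
example = [
    ("cloneA", [0.01, 0.03, -0.02]),
    ("cloneB", [0.45, 0.50, 0.48]),
    ("cloneC", [0.60, 0.58, 0.62]),
    ("cloneD", [0.30, 0.70, 0.10]),     # noisy triplicate: excluded
    ("cloneE", [0.05, 0.02, 0.04]),
    ("cloneF", [-0.40, -0.42, -0.38]),  # isolated single-clone loss: not called
    ("cloneG", [0.00, 0.01, -0.01]),
]
print(call_regions(example))
```

In this toy input only cloneB and cloneC form a called region; the single-clone loss at cloneF is ignored because it has no altered neighbour.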

Cell lines
The lung cancer cell lines used were established at the National Cancer Institute, with the exception of HCC33, which was generated at the Hamon Center for Therapeutic Oncology Research (Gazdar et al, 1998). NHBEC (normal human bronchial epithelial cells) and SAEC (small airway epithelial cells) were obtained from Clonetics (San Diego, CA, USA). HBEC2-KT, HBEC3-KT, HBEC4-KT, and HBEC5-KT are normal human bronchial epithelial cell lines immortalised with CDK4 and hTERT (Ramirez et al, 2004). Culturing conditions and DNA isolation were described previously (Girard et al, 2000). RNA isolation was carried out using the RNeasy Midi kit (Qiagen, Valencia, CA, USA), together with in-column DNase I treatment.

Gene expression profiling using Affymetrix arrays
RNA fluorescent-labelling reaction and hybridisation were performed using the Affymetrix GeneChips HG-U133A and HG-U133B according to the manufacturer's instructions (http://www.affymetrix.com). The arrays consist of 22 283 (HG-U133A) and 22 645 (HG-U133B) probe sets, which together amount to 24 698 unique genes based on Unigene build 163. Total RNA (5 µg) was used for each reaction. Microarray analysis was performed using Affymetrix MicroArray Suite 5.0 and an in-house Visual Basic software MATRIX 1.24 (Girard et al, manuscript in preparation). Briefly, array data were median normalised and samples (or averages of samples) were compared against each other by calculating log ratios for each gene. Specifically, a lower signal threshold was set at 100 to reduce the amount of noise, and the log ratio of one signal vs another was calculated by dividing the signals and computing the log base 2. These log ratios were calculated only when both signals were Present (i.e. the signal-associated detection P-values were less than 0.05) or when the higher signal was Present and the lower signal was Absent (P-value more than 0.05).
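The log-ratio rule can be sketched as follows. This is an illustration only, not the MATRIX software: the function names and the normalisation target are hypothetical, and it assumes that the lower signal threshold is applied by raising any signal below 100 to 100 before the ratio is taken.

```python
from math import log2
from statistics import median

SIGNAL_FLOOR = 100.0   # lower signal threshold stated in the text
PRESENT_P = 0.05       # detection P-value below this counts as Present


def median_normalise(signals, target=1000.0):
    """Scale one array so that its median signal equals `target`.

    The target value is an assumption; the text only says "median normalised".
    """
    scale = target / median(signals)
    return [s * scale for s in signals]


def log_ratio(signal_a, p_a, signal_b, p_b):
    """Return log2(a / b), or None when the detection rule forbids the comparison.

    A ratio is reported only if the higher of the two signals is Present;
    this covers both allowed cases (both Present, or higher Present and
    lower Absent).
    """
    a = max(signal_a, SIGNAL_FLOOR)   # assumed clamping of low signals
    b = max(signal_b, SIGNAL_FLOOR)
    present_a = p_a < PRESENT_P
    present_b = p_b < PRESENT_P
    if a > b:
        higher_present = present_a
    elif b > a:
        higher_present = present_b
    else:
        higher_present = present_a or present_b
    if not higher_present:
        return None
    return log2(a / b)


# Hypothetical signals and detection P-values.
print(log_ratio(800, 0.01, 200, 0.20))   # higher Present, lower Absent
print(log_ratio(50, 0.30, 400, 0.01))    # low signal raised to the floor
print(log_ratio(800, 0.20, 200, 0.01))   # higher signal Absent: no ratio
```

Clamping to the floor keeps very low, noisy signals from producing large spurious ratios, which is the stated purpose of the threshold.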

Statistical analysis of Affymetrix expression data
Only those Affymetrix probe sets that demonstrated a detection score of Present or Marginal (P<0.06) in at least 50% of the samples were analysed further. Statistically significant expression changes were identified using a Mann–Whitney U-test for those genes that passed the detection criteria. For determination of expression differences between the SCLC and normal cell lines, a two-tailed test was utilised and the direction of change noted. In determining a correlation between overexpression and genomic amplification, a one-tailed test was used. Genes were considered to exhibit aberrant expression if the P-value was below a threshold of 0.05.
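For readers unfamiliar with the test, the following is a minimal sketch of a Mann–Whitney U-test with one-tailed and two-tailed options. The text does not state whether exact or approximate P-values were used; this sketch assumes an exact permutation P-value, which is practical only for small groups such as those in this study. The function names and the example values are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_exact(x, y, alternative="two-sided"):
    """Exact permutation Mann-Whitney U-test.

    Returns (U statistic for x, P-value). `alternative` is "two-sided",
    "greater" (x tends to be larger than y), or "less".
    """
    n_x, n_y = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    offset = n_x * (n_x + 1) / 2
    u_observed = sum(ranks[:n_x]) - offset
    u_mean = n_x * n_y / 2
    extreme = total = 0
    # Enumerate every way of assigning n_x of the pooled ranks to group x.
    for chosen in combinations(range(n_x + n_y), n_x):
        u = sum(ranks[i] for i in chosen) - offset
        total += 1
        if alternative == "greater":
            extreme += u >= u_observed
        elif alternative == "less":
            extreme += u <= u_observed
        else:
            extreme += abs(u - u_mean) >= abs(u_observed - u_mean)
    return u_observed, extreme / total


# Hypothetical expression values: amplified vs non-amplified cell lines.
amplified = [5.0, 6.0, 7.0]
not_amplified = [1.0, 2.0, 3.0, 4.0]
print(mann_whitney_exact(amplified, not_amplified, alternative="greater"))
print(mann_whitney_exact(amplified, not_amplified, alternative="two-sided"))
```

The one-tailed call mirrors the amplification comparison (overexpression expected in amplified lines), and the two-tailed call mirrors the SCLC vs normal comparison, with 0.05 as the significance threshold in both cases.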

Statistical analysis of RT–PCR expression data
A ratio was obtained for each sample by dividing the intensity of the gel band for the gene of interest by the corresponding β-actin band. These ratios were then compared, using the Mann–Whitney U-test as described above, between samples exhibiting genomic amplification and samples exhibiting retention in the two regions of interest. If the P-value for the comparison was less than 0.05, and the data were consistent with the Affymetrix analysis, the gene was considered to have different expression between the two groups.

RESULTS AND DISCUSSION
The minimal tiling path for the 1p array consists of 642 BAC clones, of which 373 (58%) are sequenced clones, covering from 1p11.2 to 1p36.33. FPC clone order was compared to the golden path assembly on the July 2003 version of the UCSC Human Genome Browser (http://genome.ucsc.edu) (Kent et al, 2002) using the sequence accession number, BAC end sequence, or existing FISH maps. In total, 438 (68%) clones are confirmed with at least one of these methods and the remaining 204 BACs are ordered by their FPC location. The complete clone list has been made publicly available at http://www.bccrc.ca/cg/ArrayCGH_Group.html.
We chose to include only cell lines for the initial investigation with this array. Tumour samples from SCLC are infrequently available for research purposes and the DNA obtained from tumour samples is usually of lower yield and lower quality than that obtained from cell lines. In all, 15 cell lines generated from SCLC, described in Table 1, were hybridised to the 1p array. Copy number alterations were detected in each of the 15 cell lines, ranging in size from 0.2 to 28 Mb. The locations and sizes of the alterations are summarised in Figure 1. Two cell lines (NCI-H1672, NCI-H2171) showed deletions only, eight (NCI-H82, NCI-H187, NCI-H289, NCI-H378, NCI-H889, NCI-H1184, NCI-H2195, NCI-H2227) showed amplifications only, and five (NCI-H526, NCI-H1607, NCI-H1963, NCI-H2141, HCC33) showed both amplifications and deletions. Three examples of array CGH profiles are shown in Figure 2, illustrating the detection of sub-Mb size copy number changes, as well as multiple distinct alterations along the 1p arm in NCI-H1672, NCI-H2141, and HCC33. Figure 3A shows the complex profile of cell line NCI-H526. We observe two distinct deletions at 1p36.12–p36.13 and at 1p36.21–p36.22. The area of retention between these deletions is approximately 2.5 Mb in size. Lower-resolution techniques might have defined this region as one large deletion. Additionally, array CGH analysis was able to detect two sub-Mb alterations: a 0.5 Mb amplification at 1p35.2 and a 0.44 Mb amplification at 1p11.2–p12. Alterations of these sizes would likely have been overlooked when using a CGH array with a resolution coarser than 1 Mb. In order to determine if the changes in array signal intensities reflect true copy number alterations, we selected three loci for FISH experiments. The loci and their associated clones' locations are noted in Figure 3A. As seen in metaphase nuclei of NCI-H526, 548I13 shows three copies, whereas 476E5 shows two copies (Figure 3B). In a second experiment, 548I13 again shows three copies, whereas 385C11 shows six copies (Figure 3C). The normalised ratio of BAC 548I13 is at zero, even though it has three copies, as NCI-H526 has greater than a diploid number of chromosomes. These data demonstrate the ability of array CGH to distinguish loci whose copy numbers differ by factors of 0.67 (two vs three copies) and two (six vs three copies).
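The relationship between the FISH copy counts and the normalised array ratios can be made explicit with a small worked example. It assumes an idealised array in which a log2 ratio of zero corresponds to the most common copy number in the sample, taken here to be three for NCI-H526 because the three-copy locus 548I13 sits at zero; measured ratios on real arrays are typically compressed relative to these ideal values. The function name is hypothetical.

```python
from math import log2


def expected_log2_ratio(copies, baseline_copies):
    """Idealised log2 ratio of a locus relative to the sample's baseline copy number."""
    return log2(copies / baseline_copies)


BASELINE = 3  # assumed most common copy number in NCI-H526 (548I13 sits at zero)

for clone, copies in [("548I13", 3), ("476E5", 2), ("385C11", 6)]:
    print(clone, copies, round(expected_log2_ratio(copies, BASELINE), 3))
```

Under these assumptions the two-copy locus falls below the −0.2 loss threshold and the six-copy locus rises above the +0.2 gain threshold, while the three-copy locus stays at zero.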
Two regions of sub-Mb amplification were common among 45–50% of the cell lines. Region 1 is a 580 kb amplification at 1p34.2–p34.3, and Region 2 is a 270 kb amplification at 1p11.2, as displayed in Figure 1.
The Region 1 amplification occurs in seven of the 15 cell lines and contains BACs 314P18, 428O4, 204L3, and 115D7. Notably, in four of the seven cell lines (NCI-H378, NCI-H889, NCI-H1963, HCC33), the amplified region shows at least one BAC with a log2 ratio greater than 2.5, corresponding to a greater than five-fold increase in signal from normal, and indicating high-level amplification. The region contains 12 genes, listed in Table 2. Conspicuously, MYCL1, a gene first described in SCLC (Nau et al, 1985), is present in this region. Of the 15 cell lines described here, all but HCC33 were previously analysed for MYC family amplifications using Southern blot analysis. NCI-H378 and NCI-H889 were described as having MYCL1 amplification, and in the current study, these amplifications were also detected. We describe NCI-H526, NCI-H1184, NCI-H2141, and NCI-H1963 as having amplifications in the MYCL1 region; however, the Southern analysis did not find MYCL1 gene amplification in these cell lines. A possible explanation is that these cell lines do contain the amplification but did not meet the four-fold signal increase cutoff necessary for the determination of an amplification in the Southern blot analysis. All other cell lines in which MYCL1 amplification was not seen by Johnson et al also showed no amplification in the present study.
Previous conventional CGH studies of SCLC have revealed amplification in the MYCL1 region. In SCLC cell lines, Levin et al (1994) detected amplification of chromosome bands 1p22–32 in nine of 18 samples. (MYCL1 was initially mapped to 1p32 (Nau et al, 1985), then reassigned to 1p34.3 (Speleman et al, 1996).) Ried et al (1994) studied primary SCLC tumours and observed amplification of 1p32 in two of 13 cases. While these observations do implicate the MYCL1 locus, the regions identified are approximately 40 and 10 Mb in size, respectively, and contain hundreds of genes, making the identification of the specific genes affected by the amplification virtually impossible. Using high-resolution array CGH, we have defined the precise boundaries of the amplification in each cell line, with NCI-H2141 and HCC33 having amplified regions of less than 600 kb in size. We also determined that the minimal region of alteration at 1p34.2–p34.3 is 580 kb in size and contains 11 genes in addition to MYCL1. This demonstrates the resolving power of array CGH to identify very small alterations and to delineate minimal regions of alteration. Previous studies that focused only on the presence and not the extent of MYCL1 amplification may have overlooked the potential involvement of neighbouring genes.
The Region 2 amplification is a 270 kb amplification occurring in eight cell lines and contains BACs 385C11, 498H23, and 114O18. BAC 114O18 is the last BAC mapped to the centromeric end of the 1p arm. Two genes, ADAM30 and Notch2, reside in this region.
ADAM30 is a member of the family of membrane proteins that contain a disintegrin and metalloprotease domain and its normal expression has only been detected in the testis (Cerretti et al, 1999). While ADAM30 itself has not been implicated in cancer, a number of other ADAM genes have, such as ADAM9 in breast cancer (O'Shea et al, 2003) and ADAM12 in bone cancer (Tian et al, 2002). Notch2 and the three other genes in its family, Notch1, Notch3, and Notch4, code for evolutionarily conserved Type 1 transmembrane receptor proteins (Artavanis-Tsakonas et al, 1999). While these proteins have been widely studied, investigations into their possible roles in cancer have yielded a number of studies with differing conclusions. Notch1 was first described as being involved in a balanced translocation with TCRβ and a constitutively active Notch1 was oncogenic in T cells (Ellisen et al, 1991). Notch3 overexpression has been shown in NSCLC (Dang et al, 2000); however, it has also been demonstrated that expression of Notch1 and Notch2 induced cell cycle arrest in SCLC cell lines (Sriuranpong et al, 2001).
We searched for alterations of Regions 1 and 2 in NSCLC. In the analysis of nine NSCLC cell lines, we detected amplification of Region 1 in one sample and Region 2 in four samples (data not shown). In comparison to SCLC, the amplifications in NSCLC included larger regions and, in the case of Region 1, were at a lower level. For example, NCI-H520, a lung squamous cell carcinoma, has a 7 Mb amplification containing Region 1, and a 3.2 Mb amplification containing Region 2. The lower frequency of Region 1 amplification is in agreement with a previous observation that MYC family gene amplification occurs at a frequency of less than 5% in NSCLC (Bruce Johnson, personal communication), as opposed to up to 36% in SCLC. The amplification of Region 2 was also detected in NSCLC tumours, where overexpression of Notch2 was observed (Garnis et al, 2005).
Table 2 (fragment): Hippocalcin like 4, 222091_at, AL136591, Yes, P = 0.01098 (up); Bone morphogenetic protein 8, 207865_s_at, NM_001720, Yes
Gene expression analysis for each of these cell lines was obtained using Affymetrix GeneChips HG-U133A and HG-U133B. Expression levels of the genes contained in the Region 1 and Region 2 amplifications were compared between the SCLC cell lines and normal control cell lines SAEC, NHBEC, and four immortalised HBEC lines. Additionally, as there is no normal tissue that is known to be a suitable control for gene expression in SCLC, we also compared gene expression among cell lines with and without amplifications in the regions of interest. It has been previously demonstrated that the overexpression of some of the genes in a tumour can be attributed to their genomic amplification. Up to 44% of amplified genes (depending on their level of amplification) were found to be overexpressed in breast cancer cell lines (Hyman et al, 2002). We further compared the expression levels of these two sets of SCLC cell lines using semiquantitative RT–PCR analysis for the genes where a significant difference in expression was observed in the Affymetrix data. The results for all analyses of gene expression are detailed in Table 2 and a comparison of relative expression levels is displayed in Figure 4.
In Region 2, ADAM30 expression was not detected on the Affymetrix array. Notch2 was underexpressed in SCLC compared to normal cell lines. This observation is consistent with the observations of Sriuranpong et al (2001), where overexpression of Notch2 in SCLC induced cell cycle arrest. No difference in Notch2 expression was observed between SCLC cells with and without the genomic amplification of Region 2 in either Affymetrix or RT–PCR analyses (see Figure 4B and Table 2). It appears that the amplification of this region may not have a functional role in increasing the expression of the genes contained in it.
A number of expression differences in both comparisons were observed for the genes in Region 1. In comparing SCLC and normal cells, four genes demonstrated significant differences: HEYL, HPCAL4, BMP8, and CAP1. CAP1 showed underexpression, while the other three were overexpressed. The comparison of SCLC cells with and without amplification of Region 1 revealed three genes that were overexpressed in the amplified cells: BMP8, IPT, and RLF. One gene, PPIE, was observed overexpressed in one of its three Affymetrix probes, but no difference was observed in its other two probes or the RT–PCR analysis. This may indicate a problem with that particular probe on the Affymetrix array. All other genes for which RT–PCR analysis was conducted agreed with the Affymetrix analysis (see Table 2). BMP8 was the only gene overexpressed in both comparisons. Owing to the possibility of the comparison of expression between SCLC and normal cells being skewed by extremely high expression in the amplified samples, we compared BMP8 expression between the normal cell lines and the nonamplified SCLC cells. BMP8 remained overexpressed in SCLC (data not shown).
HEYL is part of a subfamily of bHLH (basic helix–loop–helix) transcription factors. These proteins control cell fate decisions such as segmentation, neurogenesis, and myogenesis. The mouse homologue of HEYL has been shown to be a target of Notch1 signalling during development. As mentioned above, the Notch pathway has been found to have differing roles in cancer, depending on the tumour type.
HPCAL4 expression has only been detected in the brain. It is part of a neuron-specific calcium-binding protein family; however, the specific function of HPCAL4 is not known (Kobayashi et al, 1998). HPCAL4 expression was not present in the normal lung cell lines (based on the detection P-value), but expression was seen in the SCLC cells. This may be a result of the neuroendocrine nature of SCLC.
CAP1 is associated with actin and cofilin and allows the rapid turnover of actin filaments. This is an important function in cell motility. In yeast, CAP is a component of the Ras pathway; however, this role has not been identified in humans (Moriyama and Yahara, 2002). Differential expression of CAP1 has not been previously detected in cancer; however, a potential role for the actin and cofilin complexes may be increasing cell motility during metastasis. A role such as this would imply overexpression of the genes involved, whereas we have observed underexpression of CAP1.
BMP8 is a bone morphogenetic protein, part of the TGF-β superfamily. These proteins play a role in aspects of mammalian development such as mesoderm determination, neural patterning, organogenesis, and skeletal patterning (DiLeone et al, 1997). There have been a number of studies of BMPs in cancer. BMPs were identified as potential tumour suppressor genes in myeloma (Hjertner et al, 2001) and prostate cancer (Brubaker et al, 2004). They were also shown to be overexpressed in oral squamous cell carcinoma (Jin et al, 2001) and BMP7 was overexpressed in breast cancer cell lines (Hyman et al, 2002).
RLF has been shown to be fused with MYCL1 in SCLC. Kim et al (1998) detected genomic amplification of RLF in four of 11 MYCL1-amplified SCLC cell lines they studied. Chimeric RLF-MYCL1 transcripts were detected in NCI-H889, NCI-H1836, and NCI-H1994 (Kim et al, 1998). [...] general role in transcriptional regulation (Makela et al, 1995). The RLF-MYCL1 fusion protein has also been detected in SCLC primary tumours. Its role is thought to be the deregulation of expression of MYCL1 (Makela et al, 1992). Interestingly, while RLF was originally identified due to fusion and overexpression with MYCL1, it was not the cell lines with the highest amplification in which we observed the highest expression. While NCI-H378 and NCI-H889 show the highest amplification of the region, and were the only two cell lines with previously known MYCL1 amplification, NCI-H1184 and NCI-H1963 both express RLF at higher levels than NCI-H378 and NCI-H889. The observation of increased expression of RLF, particularly in cell lines that had not been previously studied, lends additional weight to its implication in SCLC.
The overexpression of IPT was the most significantly associated with amplification (see Figure 4 and Table 2). IPT catalyses the transfer of an isopentenyl group from dimethylallyl pyrophosphate to a tRNA in the biosynthesis of the tRNA cytokinin isopentenyladenosine (Golovko et al, 2000). Isopentenyladenine is an end product of the mevalonate pathway. Mevalonate is the precursor of isoprenoid groups that are incorporated into other end products in addition to isopentenyladenine, such as sterols and ubiquinone (Goldstein and Brown, 1990). HMG-CoA reductase catalyses the conversion of HMG-CoA to mevalonate. Elevated activity of this protein has been observed in a number of tumour types. The statins are HMG-CoA reductase inhibitors that reduce levels of mevalonate and its end products. They have been used successfully in the treatment of hypercholesterolaemia, and have recently been demonstrated to have antiproliferative and proapoptotic effects in tumours both in vitro and in vivo (Wong et al, 2002). As the mevalonate pathway has been shown to play an important role in the maintenance of the malignant phenotype, IPT, which contributes to the production of an end product, may have a role in promoting malignancy as well. Its specific function and contribution remain to be clarified.
We have presented a chromosome arm-specific tiling resolution BAC array consisting of 642 BAC clones spanning 120 Mb of chromosome arm 1p and profiled 15 SCLC cell lines using this array. We have reliably detected the previously known MYCL1 amplification, as well as defined a 580 kb amplification at 1p34.2–p34.3, and a novel 270 kb amplification at 1p11.2. This demonstrates the ability of high-resolution array CGH to identify small alterations that may have escaped detection by other means, and to detect efficiently the chromosomal location, size, and relative level of a genetic alteration. Further, we have analysed the expression of the genes contained in the amplicons, in order to identify those with a potential role in SCLC. Notch2 and CAP1 were underexpressed, while HEYL, HPCAL4, and BMP8 were overexpressed in SCLC in comparison to normal cell lines. IPT, BMP8, and RLF were overexpressed in SCLC cells in which they were genomically amplified in comparison to SCLC cells without amplification. These data are in agreement with previous studies that broadly implicate the BMP family in cancer, and specifically implicate RLF in SCLC. Furthermore, we have observed increased expression of HEYL, HPCAL4, and IPT, for which additional investigation will be necessary to define their roles in cancer.
Genomic amplification is significant to cancer in that it can directly result in increased expression of amplified genes (Hyman et al, 2002). High-level amplifications have usually been seen associated with a particular oncogene such as EGFR, MYC, or ERBB2 (Schwab, 1999). Here, we have described an amplicon containing the MYCL1 oncogene, where additional genes are affected by the genomic amplification. This demonstrates the importance of integrating comprehensive genomic and gene expression analyses where all genes in a region can be studied. Further, we have demonstrated the benefit of a comparison among samples of the same tumour type as well as against normal tissues in order to reveal expression changes that may be functionally important.