Main

Carcinogenesis is a multistep process involving many genetic and epigenetic events. The former can take the forms of amplification of oncogenes and deletion or mutation of tumor suppressor genes. Mutation of genes involved in maintaining genomic integrity results in catastrophic events, leading to rapid accumulation of chromosomal aberrations.1 The occurrence of genomic instability is expected to affect global gene expression profiles, and provide a ground for selection of genetic traits that confer a growth advantage important for tumor progression. As a result, most human solid tumors are aneuploid, accumulating numerous gains and losses of chromosomal regions. In some cases, recurrent high levels of DNA copy number amplification can be observed in tumor samples. Among these amplicons many have well-characterized oncogenes, for example, Myc at 8q24, ErbB2 at 17q12 and cyclin D at 11q13.

Amplification at chromosome 19q12 has been observed in multiple tumor types2, 3, 4, 5 (also see www.helsinki.fi/cmg/cgh_data). Cyclin E1 (CCNE1), an E-type cyclin, is located at 19q12 and has traditionally been considered the target of the 19q12 amplicon. CCNE1 associates with CDK2 and CDK3 and directs the cell from the late G1 phase into the S phase. It also controls the initiation of DNA replication and centrosome duplication.6, 7 Unchecked cell growth and proliferation are hallmarks of human cancers, and the CCNE1/CDK2 kinase system has been shown to be misregulated in multiple tumor types.2, 8 For example, CCNE1 is overexpressed in breast, ovarian, pancreatic, bladder, endometrial and colon cancers.3, 4, 5, 9, 10, 11 In many cases, the overexpression of CCNE1 is associated with amplification at the 19q12 locus.2, 3, 4, 5

Gastric cancer is the second most common cause of cancer death worldwide.12 Environmental and genetic factors are both important in gastric carcinogenesis. Gastritis, most commonly caused by Helicobacter pylori (H. pylori) infection, is strongly associated with the development of gastric cancer.13 Comparative genomic hybridization (CGH) has been applied to study DNA copy number variation among gastric cancers.14, 15, 16, 17, 18 As in many other cancer types, CCNE1 is found to be amplified at 19q12 and overexpressed in a subset of gastric cancer samples.19, 20, 21 The biological and clinical significance of CCNE1 amplification and overexpression remains controversial: while some studies reported a correlation of CCNE1 expression with poor patient survival,20, 22 others found no such correlation.23

Recently, we reported the systematic characterization of gene expression in a large-scale analysis of gastric adenocarcinomas vs non-neoplastic gastric mucosa using cDNA microarrays.24, 25 Hierarchical clustering of global expression profiles of human gastric cancers revealed a tightly coregulated cluster of five genes, all located at chromosome 19q12, including CCNE1. In this paper, we report an in-depth analysis of this 19q12 gene cluster and demonstrate that high expression of the 19q12 gene cluster is associated with 19q12 locus amplification. Using array-based CGH (array CGH) and real-time PCR, we further refine the boundary of the 19q12 amplicon to approximately 200 kb along chromosome 19q. We found that high expression of the 19q12 gene cluster statistically correlates with the cell proliferation gene signature. In addition, we identified a set of 577 genes whose expression levels positively correlate with the 19q12 gene cluster. The study therefore indicates that amplification at 19q12 is associated with cell proliferation in vivo.

Experimental methods

Patient Samples, RNA Preparation, cDNA Microarray and Hierarchical Clustering

Samples of tumor and normal gastric mucosa were collected from gastrectomy specimens from the Department of Surgery, Queen Mary Hospital, The University of Hong Kong. Tissues were frozen in liquid nitrogen within half an hour after they were resected. Non-neoplastic mucosa from stomach was dissected free of muscle, and histologically confirmed to be tumor free by frozen section. Total RNA was extracted using Trizol (Invitrogen, Carlsbad, CA, USA). A detailed description of the DNA microarray procedure has been previously reported.24, 25 Raw data are available at both the Stanford Microarray Database (http://genome-www5.stanford.edu/MicroArray/SMD/) and the ArrayExpress online data repository (http://www.ebi.ac.uk/arrayexpress, accession number: E-SMDB-2). The 6688 cDNA clones, representing approximately 5200 unique genes for hierarchical clustering, were generated as described.24, 25

Genomic DNA Extraction

Genomic DNA was extracted from gastric tumor samples using the QIAamp kit (Qiagen, Valencia, CA, USA). In all cases, DNA was isolated from blocks of the same tumors adjacent to those from which RNA had been extracted for microarray analysis.

Array CGH and Analysis

The arrays used in the study were prepared and hybridized as described previously.26 In brief, human 1.14 arrays were obtained from the UCSF Cancer Center Array Core (http://cc.ucsf.edu/microarray/). The arrays consisted of 2463 bacterial artificial chromosome (BAC) clones that covered the human genome at 1.5 Mb resolution. For hybridization, 1 μg of tumor DNA and 1 μg of gender-matched reference DNA (isolated from normal donor lymphocytes) were labeled by random priming using Cy3-dCTP and Cy5-dCTP, respectively, and hybridized to the arrays. Three single-color intensity images (DAPI, Cy3 and Cy5) were collected from each array using a charge-coupled device camera.

UCSF SPOT software27 was used to automatically segment the spots based on DAPI images, perform local background correction and calculate various measurement parameters, including log2 ratios of the total integrated Cy3 and Cy5 intensities for each spot. A second custom program, SPROC (http://jainlab.ucsf.edu/Downloads.html), was used to associate clone identities and a mapping information file with each spot, so that the data could be plotted relative to the position of the BACs. Chromosomal aberrations were classified as a gain when the normalized log2 Cy3/Cy5 ratio was higher than 0.225, and as a loss when the ratio was lower than −0.225. Steep copy number changes, with the graph showing a peak rather than a plateau and a minimal normalized log2 Cy3/Cy5 ratio of 0.9 or higher, were classified as amplifications. Likewise, log2 Cy3/Cy5 ratios of −0.8 or lower were classified as homozygous deletions.
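The classification thresholds above can be summarized in a few lines of code. The following is only a minimal sketch, not the SPOT or SPROC implementation: the function name is hypothetical, and the `is_peak` flag is an assumed stand-in for the visual judgment of whether the profile shows a peak rather than a plateau.

```python
def classify_log2_ratio(log2_ratio, is_peak=False):
    """Classify one BAC clone by its normalized log2 Cy3/Cy5 ratio.

    Thresholds are those stated in the text. `is_peak` is a hypothetical
    flag standing in for the "peak rather than plateau" criterion.
    """
    if log2_ratio >= 0.9 and is_peak:
        return "amplification"
    if log2_ratio <= -0.8:
        return "homozygous deletion"
    if log2_ratio > 0.225:
        return "gain"
    if log2_ratio < -0.225:
        return "loss"
    return "no change"


# Hypothetical (ratio, is_peak) pairs, for illustration only.
examples = [(1.4, True), (1.4, False), (0.3, False),
            (0.0, False), (-0.5, False), (-1.0, False)]
calls = [classify_log2_ratio(ratio, peak) for ratio, peak in examples]
```

Note that in this sketch a high ratio on a plateau (no peak) is reported as a gain rather than an amplification, which is one reading of the criteria given above.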

Quantitative Real-Time PCR to Determine DNA Copy Number Changes

SYBR Green-based real-time PCR was performed to assay the DNA copy number variation among nine loci at 19q12. The housekeeping gene GAPDH was used as the reference control. The detailed method of quantitative real-time PCR is the same as previously described.28 Assays were carried out using the software supplied with the ABI 7900 (Applied Biosystems, Foster City, CA, USA). Control samples were selected based on array CGH results: gastric cancers with 19q12 amplification demonstrated by array CGH (for example HKG91T and HKG92T) were chosen as positive controls, whereas gastric cancers that showed no 19q12 amplification by array CGH (HKG81T), as well as DNA from normal blood samples, were used as negative controls. All samples were assayed in triplicate and the values were averaged. DNA copy number was calculated as $2^{\Delta C_t}$. Copy numbers of two blood samples from healthy donors were averaged and normalized to 2, and this value was subsequently used to normalize the copy numbers of gastric cancer samples. The primers were designed using the Primer Express program and the sequences are available in Supplementary Table 1. The efficiency of the primers was tested to be between 90 and 100%.
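The copy number calculation can be illustrated with a short worked example. This is a sketch under stated assumptions rather than the exact procedure used: the sign convention (ΔCt taken as the mean reference Ct minus the mean target Ct), the assumption of roughly 100% primer efficiency, the function names and all Ct values are hypothetical.

```python
from statistics import mean


def relative_quantity(ct_target, ct_reference):
    """Return 2^dCt from triplicate Ct values.

    Assumption: dCt = mean Ct(reference, e.g. GAPDH) - mean Ct(target),
    and primer efficiency is close to 100% (one doubling per cycle).
    """
    delta_ct = mean(ct_reference) - mean(ct_target)
    return 2.0 ** delta_ct


def copy_number(sample_quantity, normal_quantities):
    """Scale a sample so that the average of the normal samples equals 2 copies."""
    return 2.0 * sample_quantity / mean(normal_quantities)


# Hypothetical triplicate Ct values, for illustration only.
reference_ct = [24.0, 24.0, 24.0]
normal_a = relative_quantity([25.0, 25.5, 24.5], reference_ct)  # dCt = -1
normal_b = relative_quantity([25.0, 25.0, 25.0], reference_ct)  # dCt = -1
tumor = relative_quantity([23.0, 23.0, 23.0], reference_ct)     # dCt = +1

tumor_copies = copy_number(tumor, [normal_a, normal_b])
normal_copies = copy_number(normal_a, [normal_a, normal_b])
```

In this toy example the tumor locus reaches its threshold two cycles earlier than the normal samples relative to the reference, which corresponds to a four-fold increase over the diploid baseline.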

Statistical Analysis

To identify gene expression signatures associated with the 19q12 gene cluster, expression values (normalized log2 (red/green) ratio) of five clones in the cluster were averaged (19q12 cluster expression vector). The Pearson's correlation coefficient was then calculated for every clone in the whole cluster (comprised of 6688 clones) against the 19q12 cluster expression vector. To smooth the correlation curve, the moving average of 21 clones was calculated and plotted against the main cluster (along the vertical axis) as described.29 The significance threshold for the moving average correlations was set by permuting the data 10 000 times. For each permutation, we permuted the 19q12 cluster expression vector, constructed the corresponding smoothed correlation curve and picked the maximum of the curve. We used the α-percentile of the 10 000 maximum values to determine the threshold at the α significance level for the moving average correlations. We also applied the significance analysis of microarrays (SAM) method30 and used the 19q12 cluster expression vector as a continuous variable to identify genes that were statistically significantly correlated with the 19q12 cluster expression vector. Gene ontology categories were analyzed by GO-TermFinder.31
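The permutation procedure for the smoothed correlation curve can be sketched as follows. This is an illustrative reimplementation on simulated data, not the code used in the study: the function names are hypothetical, the threshold is assumed to be the upper-tail percentile of the permutation maxima, and the toy data set (30 clones, 12 samples, window of 5, 200 permutations) is far smaller than the 6688 clones, window of 21 and 10 000 permutations described above.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def moving_average(values, window):
    """Moving average over full windows only."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def smoothed_correlation(clones, vector, window):
    """Correlate every clone with the cluster vector, then smooth in clone order."""
    return moving_average([pearson(clone, vector) for clone in clones], window)


def permutation_threshold(clones, vector, window, n_perm, alpha, rng):
    """Upper-tail threshold from the maxima of permuted smoothed curves.

    Assumption: the threshold is the value exceeded by roughly a fraction
    `alpha` of the permutation maxima.
    """
    maxima = []
    for _ in range(n_perm):
        shuffled = vector[:]
        rng.shuffle(shuffled)
        maxima.append(max(smoothed_correlation(clones, shuffled, window)))
    maxima.sort()
    index = min(len(maxima) - 1, int((1 - alpha) * len(maxima)))
    return maxima[index]


# Simulated data: clones 10-14 track the cluster vector, the rest are noise.
rng = random.Random(0)
vector = [rng.gauss(0, 1) for _ in range(12)]
clones = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(30)]
for i in range(10, 15):
    clones[i] = [2 * v + 1 for v in vector]

observed = smoothed_correlation(clones, vector, window=5)
observed_peak = max(observed)
threshold = permutation_threshold(clones, vector, 5, 200, 0.025, rng)
```

Taking the maximum of each permuted curve, rather than comparing point by point, controls for the fact that the whole curve is scanned for peaks.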

Results

19q12 Gene Cluster Revealed by Global Expression Profiling

We previously reported the global gene expression pattern of human gastric tumors, including 90 tumors, 14 lymph nodes and 22 nontumor gastric mucosa.24, 25 A hierarchical clustering algorithm was used to group 126 gastric tissue samples according to their gene expression patterns as delineated by 6688 cDNA clones (Figure 1a). This analysis revealed prominent clusters of coexpressed genes, which included: a set of clusters that appeared to reflect intrinsic characteristics of proliferating tumor cells (ie, a proliferation and beta-catenin cluster); normal gastric mucosal cells (ie, normal cluster); differentiation (ie, intestinal metaplasia cluster), stromal cells (ie, ECM cluster), and that of infiltrating lymphocytes (ie, leukocyte cluster). Interestingly, we also noticed several gene clusters that were comprised of genes located at close range on chromosomes (ie, Erbb2 cluster).

Figure 1

Expression patterns of human gastric tissues. (a) Hierarchical clustering of 6688 cDNA clones (corresponding to 5200 unique genes) in 126 gastric tissue samples, including gastric tumors (n=90), metastatic lymph nodes (n=14) and nontumor gastric mucosa (n=22). The figure is presented in table format, where each row represents a gene and each column represents a sample. The relative abundance of each transcript centered across all the samples is depicted according to the color scale shown at the lower right corner. Gray indicates missing or excluded data. Specific gene clusters identified in the gastric tissue expression profiles are annotated to the right. See previous publications24, 25 for a detailed analysis of gene expression patterns in human gastric cancers. (b) Expanded view of the 19q12 gene cluster. (c) Correlation of expression levels between the 19q12 gene cluster and the global gene expression signature in human gastric cancer samples. Genes are organized along the X-axis in the same order as in the clustering (a). The Pearson's correlation coefficient was calculated for every clone in the whole cluster (comprised of 6688 clones) against the 19q12 cluster expression vector. The correlation values (the blue line) are plotted as a moving average of 21 genes (along the vertical axis). The red lines represent thresholds for the 2.5% one-sided tail probability under 10 000 permutations. Arrows indicate regions of statistically significant correlation. Part of the figure is the same as figures published previously25 and is used with permission from Molecular Biology of the Cell.

One such cluster contained five genes (POP4, C19orf12, RMP, CCNE1 and UQCRFS1), all located at chromosome 19q12 position 34.3–35.2 Mb (Figure 1b and Table 1). We refer to this cluster as the 19q12 gene cluster. To further illustrate the correlation between these five genes along chromosome 19, we retrieved the genes located on chromosome 19 from the main cluster shown in Figure 1a, and organized them by chromosomal position. We constructed a correlation map of chromosome 19 using 20 genes upstream of the five 19q12 cluster genes and 20 genes downstream (Figure 2). This map displays pairwise correlations between the expression patterns of the genes and allows visual identification of groups of adjacent genes with similar gene expression patterns. Figure 2 depicts the group of five correlated 19q12 genes as a red block centered on the diagonal of the correlation matrix. The data further illustrate that these five genes located at 19q12 show highly correlated gene expression patterns.

Table 1 Five genes in the CCNE1 cluster
Figure 2

Correlation matrix of 45 genes on chromosome 19 (the five genes of the 19q12 gene cluster, 20 genes upstream and 20 genes downstream). All the pairwise correlations between the gene expression profiles of these chromosome 19 genes across the 126 gastric samples are displayed as a pseudocolor map in which the color and intensity of each element depict the direction and degree of correlation, respectively. Genes are ordered along both axes based on their genomic position along chromosome 19.

The average expression levels of the 19q12 gene cluster in 90 gastric tumors and 22 nontumor gastric mucosa samples are shown in Figure 3. We found that the 19q12 gene cluster is expressed at significantly higher levels in tumor samples than in nontumor gastric mucosa (P=3.7 × 10−5). Among the 90 gastric tumor samples, 10 showed expression of the 19q12 gene cluster at very high levels (we used log2 (average array expression level)>0.8 as the cutoff for high expression). The data suggested that there might be amplification at 19q12 in these 10 gastric tumor samples.

Figure 3

Expression of 19q12 gene cluster in gastric cancer (n=90) and normal (n=22) samples revealed by cDNA microarrays. P-value was calculated using Student's t-test.

19q12 Amplicon by Array CGH

The similar overexpression of multiple genes located at 19q12 in a subset of gastric cancer samples suggests that the 19q12 amplicon may span a rather large region containing multiple genes. To investigate whether high expression of the 19q12 gene cluster is due to amplification at 19q12, we applied array CGH to a subset of gastric cancer samples (all of which had been used in the expression array studies). In all cases, gastric cancer samples without high expression of the 19q12 gene cluster did not show 19q amplification. A representative image of DNA copy number on chromosome 19 of a gastric tumor without high expression of the 19q12 gene cluster is shown in Figure 4a. On the other hand, all seven gastric cancers with high expression of the 19q12 gene cluster showed either amplification (n=6) or a one copy gain (n=1) at the 19q12 region. A representative image of the 19q12 amplicon assayed by array CGH is shown in Figure 4b. We compared the BAC clones at 19q12 in these gastric cancer samples and identified a set of five BAC clones which were most frequently amplified (Table 2). In particular, BAC clone CTB25O22 showed amplification or gain in all seven samples. The five most frequently amplified BAC clones map from 33.3 to 35.0 Mb, which is consistent with the position of the five clones in the 19q12 gene cluster (Table 1).

Figure 4

Representative array CGH images of chromosome 19 for gastric cancer samples without (a) or with (b) 19q12 amplification.

Table 2 Five BAC clones located at 19q12 were most frequently amplified in gastric cancer samples

19q12 Amplification in Human Gastric Cancers Assayed by Real-Time PCR

While array CGH analysis showed DNA copy number variations among tumor samples on a genome-wide basis, it provided only limited resolution when applied to mapping the boundary of the amplicon. This is mainly because the BAC clones printed on the array are spaced on average 1.5 Mb apart. To further map the 19q12 amplicon, we identified genes located at 19q12 (Figure 5a). We performed real-time PCR to assay the genomic DNA copy number variation of nine genes at 19q12 in 10 gastric cancers with high expression of the 19q12 gene cluster, as well as two control gastric cancer samples without high expression of the 19q12 gene cluster. A composite figure showing all of the 10 gastric cancers with high 19q expression and the two with low expression is available as Supplementary Figure 1. Figure 5b shows representative images from gastric cancer samples with or without high expression of the 19q12 gene cluster. Real-time PCR mapped the most frequently amplified region (defined as genes that showed copy number >3 in more than 80% of the samples) to be from POP4 to CCNE1 (roughly 34.79–35.00 Mb) on chromosome 19q12, with the C19orf12 locus being amplified in all 10 samples tested.
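The criterion for the most frequently amplified region (copy number >3 in more than 80% of the samples) can be expressed as a simple filter. The sketch below uses hypothetical locus names and made-up copy numbers purely for illustration; it does not reproduce the measured values.

```python
def frequently_amplified(copy_numbers, min_copies=3.0, min_fraction=0.8):
    """Return loci whose copy number exceeds `min_copies` in more than
    `min_fraction` of the samples.

    `copy_numbers` maps a locus name to a list of per-sample copy numbers.
    """
    selected = []
    for locus, values in copy_numbers.items():
        fraction = sum(value > min_copies for value in values) / len(values)
        if fraction > min_fraction:
            selected.append(locus)
    return selected


# Hypothetical copy numbers for 10 samples at four loci (not real data).
toy_copy_numbers = {
    "locus_A": [2.0, 2.0, 2.1, 3.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # 1 of 10
    "locus_B": [5.0, 6.0, 4.0, 8.0, 3.5, 7.0, 4.5, 2.0, 9.0, 5.5],  # 9 of 10
    "locus_C": [5.0, 6.0, 4.0, 8.0, 3.5, 7.0, 4.5, 3.2, 9.0, 5.5],  # 10 of 10
    "locus_D": [5.0, 6.0, 4.0, 8.0, 3.5, 7.0, 4.5, 2.0, 2.0, 5.5],  # 8 of 10
}
amplified_loci = frequently_amplified(toy_copy_numbers)
```

Because the criterion is "more than 80%", a locus amplified in exactly 8 of 10 samples is excluded in this sketch.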

Figure 5

Mapping of the 19q12 amplicon using real-time quantitative PCR. (a) Genes located at 19q12. #Genes in 19q12 cluster; and *genes assayed by real-time PCR. (b) DNA copy number assayed by real-time PCR in three gastric cancer samples. HKG35T: a gastric cancer sample with 19q12 amplification; HKG77T: a gastric cancer sample with one copy gain at 19q12, note the lack of DNA copy number change at the CCNE1 locus; HKG63T: a gastric cancer sample with normal DNA copy number at 19q12. (c) Gene expression level parallels genomic DNA amplification at 19q12 in the HKG92T gastric tumor sample.

We compared the DNA copy number assayed by real-time PCR with the gene expression levels assayed by microarrays, and found that in most cases the expression levels paralleled the genomic DNA copy numbers (Supplementary Figure 2). Figure 5c illustrates such a correlation in gastric tumor sample HKG92T.

Interestingly, we found that HKG77T, which clearly showed one copy gain at 19q12, has normal copy number at the CCNE1 locus (Figure 5b). We also analyzed the DNA copy number of CCNE1 in HKG77T at a different position using a different primer set. We consistently found no evidence of DNA copy number gain or amplification of CCNE1 in HKG77T (data not shown). The data therefore suggest that DNA copy number gain at 19q12 is not necessarily associated with amplification at the CCNE1 locus.

Gene Expression Patterns Associated with the 19q12 Gene Cluster

In an attempt to identify features in the global gene expression cluster that may be associated with the expression profile of the 19q12 gene cluster, we calculated the correlation of each gene expression vector with the average expression vector of the 19q12 gene cluster among the 126 gastric samples. The resulting correlation curve, plotted as moving averages (window size=21 genes), is displayed in Figure 1c. We observed two positively correlated peaks, one representing the 19q12 gene cluster itself, and the other representing the cell proliferation cluster (arrows in Figure 1c). Similar results were obtained when we used a moving average window size of 11 or when we used only the data from the 90 gastric cancer tissues. The cell proliferation gene cluster is comprised of genes whose functions are required for cell cycle progression and whose expression levels correlate with cellular proliferation rates. To determine whether the observed correlation is statistically significant, we permuted the data set and recalculated the correlation. The red lines in Figure 1c represent the thresholds derived by the permutation test, as described in the section ‘Statistical Analysis’, for the 2.5% one-sided tail probability. Both observed peaks exceed this threshold and are therefore statistically significantly correlated with the 19q12 gene cluster.

To further identify gene expression signatures that may be regulated by the 19q12 gene cluster, we applied SAM to the data set using the 19q12 gene cluster expression value as a continuous variable vector. A delta value of 2.02 was selected, and the program identified 577 positively correlated genes with a median false significant number of 0.46 (<0.1%). We named this gene list ‘19q12 gene cluster correlated genes.’ The top 20 genes are shown in Table 3 and the complete gene list is available as Supplementary Table 2. The top five genes are the genes of the 19q12 gene cluster itself. Careful examination of the rest of the genes revealed that many of them have important functions during cell cycle progression and cell proliferation, for example, topoisomerase II alpha (TOP2A), CDC2, centromere protein A (CENPA), translin (TSN), tubulin beta 2 (TUBB2) and BUB1.

Table 3 Top 20 clones identified by SAM to be statistically significantly correlated with the expression of the 19q12 gene cluster

Our analysis of the Gene Ontology annotations for the 19q12 gene cluster correlated genes in gastric cancer indicated nonrandom enrichment in a variety of biological process categories, including cell cycle, nuclear division, cytokinesis and cell proliferation (Table 4). Furthermore, the expression levels of many of these 19q12 gene cluster correlated genes are regulated during the cell cycle. We downloaded the 1134 cell cycle-regulated genes previously published32 and found that in the main cluster in Figure 1a, 440 clones (6.58%) of the total 6688 clones are cell cycle regulated. On the other hand, 127 clones (22.0%) of the total 577 19q12 gene cluster-correlated genes are cell cycle regulated. This difference is highly statistically significant using the hypergeometric test (P<0.001). The data therefore indicate that expression of the 19q12 gene cluster is associated with the cell proliferation gene expression signature in vivo in gastric tumors.
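The enrichment of cell cycle-regulated clones can be checked with an upper-tail hypergeometric probability computed from the counts given above. The following is a minimal sketch with a hypothetical function name; it assumes that the 577 correlated clones can be treated as a sample drawn without replacement from the 6688 clones, and it is not necessarily the exact calculation performed in the study.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` items are sampled without replacement
    from `population` items, of which `successes` carry the label of interest."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    # Exact integer arithmetic, converted to a float only at the end.
    return numerator / comb(population, draws)


# Counts taken from the text: 6688 clones in total, 440 cell cycle regulated,
# 577 correlated with the 19q12 cluster, 127 of those cell cycle regulated.
p_value = hypergeom_upper_tail(6688, 440, 577, 127)

# About 38 cell cycle-regulated clones would be expected by chance.
expected_overlap = 577 * 440 / 6688
```

With roughly 38 clones expected by chance and 127 observed, the resulting probability falls well below the reported bound of 0.001.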

Table 4 GO terms that are nonrandomly enriched (corrected P-value <0.05) with the products of the 577 genes whose expression levels are correlated with expression of the 19q12 gene cluster in gastric cancer samples

Discussion

We have shown in this study that by combining microarray expression data, array-based CGH and quantitative real-time PCR, we can identify small regions of chromosomal amplification and the genes that are upregulated in gastric cancers as a result of the amplification. Furthermore, we can identify on a genomic scale the genes whose expression correlates with the amplification; these support an association between amplification of 19q12 and cell proliferation in vivo.

Genomic analyses using microarrays have provided us with global programs of gene expression patterns of human cancers.33, 34 Analysis of gene expression programs along chromosomes can help reveal unidirectional changes in the expression of a large number of adjacently located genes, which is generally termed regional expression bias.35, 36, 37 As expected, a majority of the detectable regional gene expression biases in tumor samples also coincide with chromosomal aberrations.38, 39, 40 It is therefore feasible to infer cytogenetic abnormalities by examination of high-density gene expression data. However, these analysis methods focus on the whole chromosome, and they readily detect DNA copy number gains or losses along the chromosomes that occur frequently in tumor cells. Amplification or deletion events which occur at lower frequency (eg, <10 or 15%) and within a narrow region (<1 Mb) will be difficult to detect with these analysis methods. In this paper, we have shown that by simply using hierarchical clustering, we were able to identify several clusters of coregulated genes which may represent amplification events in tumor cells. Besides the 19q12 gene cluster which has been detailed in this paper, there is also a cluster at 17q12, which includes ErbB2, GRB7 and MLN64; a cluster at 20q13, which includes ZNF217, ATP5E and GNAS; a cluster at 7q22, which includes ATP5J2, CPSF4 and PDAP1; and a cluster at Xq28, which includes ARD1, IRAK1 and UBL4 (data not shown). Therefore, mapping the gene clusters along the chromosomes may help to identify narrow and less frequent gene amplification or deletion events in tumor samples.

In this paper, we focused our analysis on the 19q12 gene cluster; it will be important to expand the study to other commonly amplified regions in gastric cancers. In addition, correlation of expression array and array CGH data will clearly help to identify genes whose expression levels are altered because of DNA copy number gain or loss, and possibly help to identify novel oncogenes and tumor suppressor genes in gastric cancers. The analysis of the relationship between gene expression and DNA copy number variation is currently in progress and will be presented separately (Leung et al, unpublished data).

While current studies of the 19q12 amplicon have focused on CCNE1, we found that amplification at 19q12 is not necessarily associated with DNA copy number gain at the CCNE1 locus. The study therefore suggests that other genes within this amplicon may also have potential functions in tumor development. For example, RMP has been found to be expressed in small cell lung carcinoma cells and Reed–Sternberg cells of Hodgkin's disease.41 It binds to RNA polymerase II subunit 5 and functions as a transcriptional corepressor.42, 43 UQCRFS1 encodes a component of complex III of the mitochondrial respiratory chain.44 This complex passes electrons from ubiquinol to cytochrome c during the synthesis of ATP. The amplification or overexpression of UQCRFS1 may therefore play an important role in dividing and proliferating cancer cells. The amplification of UQCRFS1 has also been reported in breast and ovarian cancers.45, 46 Clearly, it is important to further evaluate the functions of these genes in gastric cancer development and determine the extent to which their gene expression is regulated by DNA amplification. In addition, these genes within the 19q12 amplicon may also serve as treatment targets. For example, an antibody against POP4, C19orf12, UQCRFS1 or RMP may be useful to treat gastric cancer patients who have 19q12 amplification.

In summary, our study shows that expression array analysis combined with array CGH and real-time PCR provides a new and powerful tool to identify clusters of genes which may be regulated by genomic DNA aberrations. In addition, our study indicates that amplification at 19q12 is associated with cell proliferation in vivo.