Main

Gene expression profiling has the potential to advance our understanding and treatment of cancer and other diseases. Using this approach, expression profiles of normal and cancer cell populations can be compared, allowing the identification of specific genes or groups of genes that are dysregulated.1 Expression profiling studies of human prostate cancer have proven valuable and demonstrate the promise of uncovering important molecular mechanisms of disease.2, 3, 4, 5, 6 A major challenge that impedes the genetic analysis of whole prostate cancer tissue has been the technical inability to collect pure populations of specific cells directly from complex heterogeneous prostate tissue. Therefore, the ability to specifically assess pure populations of prostate cancer cells isolated away from non-neoplastic cell types is a significant technical achievement.

The serial analysis of gene expression (SAGE) method is one of several new technologies capable of assessing global gene expression. One advantage of SAGE is that it allows qualitative and quantitative analysis of thousands of transcripts simultaneously.7, 8 In this approach, short sequence tags (10 bp) are isolated from mRNA at a defined position, ligated to form long concatemers, cloned and sequenced. The frequency of each tag in the cloned concatemer directly reflects transcript abundance within the messenger RNA (mRNA) population studied. Depending on the number of tags sequenced, changes in expression levels of rare transcripts can be detected. SAGE is a particularly advantageous technique for analyzing a cell population that is only a small subfraction of a particular tissue because it allows the unbiased detection of rare transcripts likely to be under-represented in pre-existing databases.1, 9
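The tag-counting idea described above can be illustrated with a minimal Python sketch. It assumes that tags are anchored at NlaIII sites (CATG), that each fragment between two anchor sites is a ditag, and that the second tag of a ditag is read as the reverse complement of its last 10 bp. The function names are hypothetical, and the sketch ignores sequencing errors and maximum ditag length checks that dedicated SAGE software applies.

```python
from collections import Counter

ANCHOR = "CATG"  # NlaIII recognition site, assumed as the anchoring enzyme site
TAG_LEN = 10     # tag length stated in the text


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tags_from_concatemer(concatemer):
    """Split a concatemer at anchor sites and read two tags from each ditag."""
    tags = []
    for ditag in concatemer.split(ANCHOR):
        if len(ditag) < 2 * TAG_LEN:
            continue  # fragment too short to hold two full tags
        tags.append(ditag[:TAG_LEN])            # left tag, read forward
        tags.append(revcomp(ditag[-TAG_LEN:]))  # right tag, read from the other strand
    return tags


def tag_frequencies(concatemers):
    """Count how often each tag occurs across all sequenced concatemers."""
    counts = Counter()
    for concatemer in concatemers:
        counts.update(tags_from_concatemer(concatemer))
    return counts
```

In this simplified picture, the count returned for a tag is the digital estimate of the abundance of the corresponding transcript.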

There are significant technical challenges associated with using SAGE to profile specific cell populations within heterogeneous human tissue.10, 11, 12 The limitation of relatively large amounts of input mRNA needed for typical SAGE applications must be overcome if the technique is to be adapted to routinely acquired human tissue samples. Laser capture microdissection (LCM) is a technique that allows the isolation of individual cell populations for genetic analysis, thereby circumventing the impact of tissue heterogeneity.13, 14, 15 We anticipated that combining an LCM technique with microSAGE would abrogate the limitations associated with a traditional whole tissue approach and produce cell-specific expression profiles from limited human tissue samples.

This study is the first to report the results of combining LCM and microSAGE technologies (LCM–microSAGE) to analyze global gene expression in routinely acquired prostate cancer tissue. The combined LCM–microSAGE technique enabled the generation of large-scale gene expression libraries of highly enriched cancer cells and normal cells from a tissue sample.

Materials and methods

LCM of Human Prostate Tissue

Tissue samples from a single radical prostatectomy specimen (Gleason score 7(3+4), T2N0M0) were used for this study. Samples of normal and cancer tissue were procured, immediately snap-frozen and stored at −70°C. Hematoxylin–eosin-stained frozen sections of these specimens were examined by two pathologists (JHC-V, PT).

Using a PixCell LCM with an infrared diode laser (Arcturus Engineering, Santa Clara, CA, USA), 5 μm thick serial sections of retrieved fresh frozen tissues were cut and immediately stained using a HistoGene™ LCM Frozen Section Staining Kit (Arcturus, Mountain View, CA, USA) and LCM was performed immediately thereafter. Cancer and normal epithelial cells were microdissected from different fresh frozen tissue blocks. The mRNAs of cancer cells from 12 serial frozen sections and normal cells from 10 serial frozen sections were isolated using 97 068 and 104 300 laser pulses, respectively, from a laser beam that was 30 μm in diameter.

MicroSAGE Analysis

Complete extraction and purification of total RNA from captured cells was performed by following the protocol of the PicoPure RNA Isolation Kit (Arcturus, Catalogue No. #KIT0202), and RNA integrity and amount were checked by spectrophotometry (DU 7400 Spectrophotometer, Beckman). Cells were captured by approximately 10 000 pulses to produce total cellular RNA in a 10 μl volume. Then, 10 samples were collected for a total of 100 μl (20–30 ng mRNA) for microSAGE analysis. Using the I-SAGE kit (Invitrogen, Carlsbad, CA, USA), microSAGE analysis with a modified ditag amplification procedure was performed as described previously.10, 11

The first series of ditag PCR amplifications was performed using 5 μl 10 × reaction buffer, 3 μl DMSO, 7.5 μl 10 mM dNTPs, 2 μl each of ditag primer 1 and primer 2:

primer 1:

5′GGATTTGCTGGTGCAGTACA3′ (175 ng/μl)

primer 2:

5′CTGCTCGAATTCAAGCTTCT3′ (175 ng/μl)

29 μl dH2O, and 0.5 μl Taq polymerase. PCR cycling conditions consisted of heating the reaction mixture to 95°C for 2 min followed by 28 cycles of denaturing at 95°C for 30 s, annealing at 55°C for 1 min and synthesis at 70°C for 1 min (5 min for final cycle). A 104 bp PCR product was generated consisting of linkers (44+42 bp)+ditag (18 bp). A negative control consisted of identical conditions with ligase omitted from the reaction mix. The products from 50 reactions were pooled. The 104 bp DNA fragments were purified using SDS-PAGE and the optimal dilution for ditags (1:10, 1:20 or 1:50) was determined. A preparative PCR (192 reactions) was then performed as described above, except that the amount of primers was reduced to 87.5 ng and the number of cycles was reduced to 12.
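The reaction set-up above lends itself to a small worked calculation. The following Python sketch scales the listed per-reaction volumes to a batch of reactions and checks the stated amplicon length. The listed reagents sum to 49 μl; the sketch does not assume a template volume, although the Figure 2 legend mentions 1 μl of ligated ditag. The function names and the optional pipetting overage are illustrative assumptions, not part of the published protocol.

```python
# Per-reaction volumes in microlitres, taken from the text (template not included).
PER_REACTION_UL = {
    "10x reaction buffer": 5.0,
    "DMSO": 3.0,
    "10 mM dNTPs": 7.5,
    "ditag primer 1": 2.0,
    "ditag primer 2": 2.0,
    "dH2O": 29.0,
    "Taq polymerase": 0.5,
}


def master_mix(n_reactions, overage=0.0):
    """Scale each reagent volume to n_reactions, with optional fractional overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: volume * factor for name, volume in PER_REACTION_UL.items()}


def amplicon_length(linker_a=44, linker_b=42, ditag=18):
    """Expected PCR product length in bp: two linkers plus the ditag."""
    return linker_a + linker_b + ditag


if __name__ == "__main__":
    # 50 pooled first-round reactions, as described in the text.
    for name, volume in master_mix(50).items():
        print(f"{name}: {volume:.1f} ul")
    print("expected product:", amplicon_length(), "bp")
```

With the default arguments, the linker and ditag lengths given in the text add up to the 104 bp product.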

Three fractions of concatemer (0.5–0.8, 0.8–1.0 and 1.0–2.0 kb) were ligated into a pZERO/SphI vector (Zero Background Cloning Kit, Invitrogen, Carlsbad, CA, USA). Negative controls consisted of a reaction without ligase and another without ligase or concatemer. Electroporation was used for bacterial transformation (50 μl TOP 10 One Shot Electrocompetent cells, 2 μl ligated vector). An amount of 20 μl of transformed bacteria was plated on low-salt LB/Zeocin/IPTG plates and clones were screened via PCR using M13 forward and reverse primers. PCR reactions containing concatemers comprised of at least 15 ditags were purified using a PCR purification column (PSI clone PCR 96 Kit, Princeton Separations, Adelphia, NJ, USA). Direct sequencing (ABI 3700, Applied Biosystems, Branchburg, NJ, USA) of PCR products was performed using a BigDye Primer Kit (Applied Biosystems) according to the manufacturer's instructions.

Sequence Analysis and Tag Extraction Strategies

The sequence and occurrence of each of the transcript tags were determined using the SAGE 2000 version 4.1 software (http://www.sagenet.org). SAGE 2000 produced a list of tags with their corresponding numerical values, providing a digital representation of global gene expression. The tags were then matched to the SAGE reliable map (ftp://ftp.ncbi.nih.gov/pub/sage/map/Hs/NlaIII). Monte Carlo simulations with adjustment for multiple comparisons were used to determine statistical significance and to calibrate the appropriate threshold values.16 A maximum value of 0.05 was chosen for P-chance. This yielded a false-positive rate that was no higher than 0.02 for the least significant P-chance value below the cutoff. The validity of this P-chance value was confirmed by using a second method that specifically calculates the probability of observing y copies of a particular tag in the cancer library given the observation of x copies of the same tag in the normal library.17 The P-chance was calculated by summing over all y values that were more extreme than the currently observed values.
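As an illustration of the second method, the sketch below assumes that the conditional probability takes the Audic–Claverie form, p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1+N2/N1)^(x+y+1)], where N1 and N2 are the total tag counts of the normal and cancer libraries. That choice of formula, the function names, and the decision to include the observed value in the tail sum are all assumptions made for this example; the text does not spell them out.

```python
import math


def ac_prob(y, x, n1, n2):
    """P(y copies in library 2 | x copies in library 1), Audic-Claverie form (assumed)."""
    r = n2 / n1
    log_p = (
        y * math.log(r)
        + math.lgamma(x + y + 1)
        - math.lgamma(x + 1)
        - math.lgamma(y + 1)
        - (x + y + 1) * math.log(1.0 + r)
    )
    return math.exp(log_p)


def p_chance(x, y, n1, n2):
    """Smaller of the two tail sums over y values at least as extreme as observed."""
    lower = sum(ac_prob(k, x, n1, n2) for k in range(y + 1))  # P(Y <= y)
    upper = 1.0 - (lower - ac_prob(y, x, n1, n2))             # P(Y >= y)
    return min(lower, upper, 1.0)


if __name__ == "__main__":
    # Hypothetical tag counts; library sizes are the totals reported in the Results.
    print(p_chance(x=20, y=4, n1=10463, n2=10111))
```

A tag would be called differentially expressed when the returned value falls below the chosen cutoff of 0.05.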

Validation of LCM-MicroSAGE Data

To reveal the individual variation of gene expression, we assessed cancer and matched normal tissue samples using quantitative RT-PCR. Total RNA was isolated from 5 μm-thick fresh frozen tissue sections using a PicoPure™ RNA Isolation Kit (Arcturus). The concentration and integrity of the RNA were assessed using a DU 7400 Spectrophotometer (Beckman). For this analysis, five well-characterized genes were selected: ATP synthase H+ transporting mitochondrial F0 complex subunit g (ATP5L),18 fatty acid synthase (FASN),19 kallikrein 3 or PSA (KLK3),20 calpain small subunit 1 (CAPNS1)21 and prostate acid phosphatase (ACPP).22 Total RNA from nondissected whole tissue sections (300 ng/μl) or LCM samples (30 ng/μl) was used for the RT-PCR reaction. RT-PCR was performed on the ABI PRISM 7900HT Sequence Detection System using One-Step RT-PCR Master Mix Reagents (Applied Biosystems, Branchburg, NJ, USA) and Assays-on-Demand Gene Expression probes (Applied Biosystems). Serial dilutions of template isolated from the LNCaP prostate cancer cell line were used to generate standard curves. Statistical software produced the standard curve by measuring the crossing point of each standard and plotting it against the logarithm of its concentration. The concentrations of unknown samples were then calculated by fitting their crossing points to the standard curve. The gene expression results were normalized to the expression of β-actin: the relative expression of each selected gene was calculated by dividing its expression value by that of β-actin.
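The standard-curve quantification described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an ordinary least-squares line of crossing point against log10 concentration; the function names and all numbers in the usage example are hypothetical and do not come from the study.

```python
import math


def fit_standard_curve(concentrations, crossing_points):
    """Least-squares fit of crossing point = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(crossing_points) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, crossing_points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_cp(cp, slope, intercept):
    """Read an unknown sample's concentration off the fitted standard curve."""
    return 10 ** ((cp - intercept) / slope)


def relative_expression(gene_cp, actin_cp, gene_curve, actin_curve):
    """Expression of a gene divided by beta-actin expression in the same sample."""
    gene = concentration_from_cp(gene_cp, *gene_curve)
    actin = concentration_from_cp(actin_cp, *actin_curve)
    return gene / actin


if __name__ == "__main__":
    # Hypothetical dilution series and crossing points, for illustration only.
    curve = fit_standard_curve([1, 10, 100, 1000], [30.0, 27.0, 24.0, 21.0])
    print(relative_expression(gene_cp=24.0, actin_cp=27.0,
                              gene_curve=curve, actin_curve=curve))
```

Each gene and β-actin would normally have its own standard curve; the usage example reuses one curve only to keep the sketch short.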

Results

Generation of MicroSAGE Libraries using LCM

Laser capture microdissection was used to selectively isolate pure populations of prostate cancer cells and non-neoplastic epithelial cells from frozen sections of prostatectomy specimens (Figure 1). LCM-microSAGE libraries of cancer and non-neoplastic cells were generated using approximately 100 000 laser pulses from a PixCell IIe instrument (Arcturus) for each library. This yielded an estimated 3 × 10⁵ cells using a 30 μm diameter laser beam and approximately 20–30 ng of mRNA, which was used to generate each microSAGE library. The mRNA served as starting template for subsequent ditag generation as described in detail in Materials and methods (Figure 2).

Figure 1

Using a PixCell LCM with an infrared diode, 5 μm-thick serial sections of retrieved fresh frozen tissues were cut and stained with a quick-acting hematoxylin–eosin (H&E) stain. Sections were air dried, and LCM was performed immediately afterward. Cancer cells (a, H&E, × 40) and normal cells (b, H&E, × 200) were microdissected from different fresh frozen tissue blocks. Most of the captured cells were cancer cells, except for a few myoepithelial cells (Cap). Inset: a high power view (H&E, × 400) of a cap showing dissected individual cancer glands on the film. Before, before dissection; after, after dissection; Cap, cells picked up on the film.

Figure 2

Polyacrylamide gels (8% in a, b, c, 12% in d) and agarose gels (1.5% in e and f) showing the serial steps of LCM-microSAGE in a library from normal cells. (a) Shown are 28 cycles of PCR of various dilutions (1, 1:10; 2, 1:20; 3, 1:50) of 1 μl of the ligated ditag derived from LCM using normal cells to determine the optimal ditag concentration. The 104-bp band corresponding to the amplified ditags was sharply visible compared with the other faint background bands. (b) After establishing the optimal dilution, the ditag product was concentrated in 50 reactions. A box is used to delineate the collected ditag bands. The ditag bands were excised and DNA was extracted. (c) After large-scale rePCR (192 reactions) with 12 cycles, the PCR products were concentrated and run on a prepared gel from which the 100 bp ditag bands were excised. A box is used to delineate the ditag bands that were collected. (d) After digestion with NlaIII to cleave off the linkers, the small ditag of 22–26 bp (encircled by a black box) was excised and purified. (e) The isolated ditags were ligated to concatemers that were separated by size on a 1.5% agarose gel. The regions of the gel containing concatemers ranging from 0.5–0.8 kb (lowest bar), 0.8–1.0 kb and 1.0–2.0 kb (uppermost bar) were excised. (f) After the purified concatemers were cloned in pZero vector, 20 μl of transformed bacteria were plated on low-salt LB/Zeocin/IPTG plates and clones were screened by PCR using M13 forward/reverse primers. PCR reactions that contained concatemers comprised of at least 15 ditags (>616 bp (226 bp vector + 26 bp per ditag × 15 ditags)) were purified. A bar indicates 600 bp. NC, negative control performed with H2O for 35 cycles in (a) and (b); PC, positive control RNA in I-SAGE kit; M, 10 bp ladder; ▪, ditag; □, linker.

Expression Profiles in Normal and Matched Cancer Cell Libraries

Two libraries were generated from normal and matched cancer cells. Concatemerized ditags were sequenced from 923 and 1612 selected colonies from normal and cancer cell libraries, respectively. A total of 10 463 and 10 111 unique tags were identified from each library, respectively (Table 1). Of the tags that matched with known genes in the SAGE map, 5272 matched with a single gene and 3798 matched with more than one gene (Table 1). Genes known to be highly expressed in both cancer and normal prostate cells, including β-microseminoprotein (MSMB), sperm-associated antigen 7 (SPAG7), translationally controlled tumor protein (TPT1), laminin receptor 1 (LAMR1) and transforming growth factor alpha (TGFA), were expressed at high levels in the LCM-microSAGE libraries (Table 2). In all, 44% of the matching transcripts corresponded with mRNA sequence entries that have been characterized, whereas 46% matched either uncharacterized expressed sequence tags (EST) or complementary DNA (cDNA) entries (Table 1).

Table 1 Overall summary of two LCM-microSAGE libraries
Table 2 Genes highly expressed in both cancer and normal libraries

Most transcripts were expressed at similar levels in the cancer and non-neoplastic epithelial cell libraries (Figure 3). In cancer cells, 385 tags (4.2% of 9069 analyzed total unique tags) were increased more than fourfold and 389 tags (4.3%) were decreased more than fourfold. Of these transcripts, 20 exhibited significantly different (P<0.05) expression between matched normal and cancer cells (Table 3). These differentially expressed transcripts consisted of five that were upregulated in prostate cancer cells and 15 that were downregulated.
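The fourfold criterion used above can be made concrete with a short Python sketch. It assumes that tag counts are first converted to frequencies by dividing by each library's total tag count, and that a tag absent from one library is given a small placeholder count so the ratio stays finite. The placeholder value and the function names are assumptions for illustration; the text does not state how zero counts were handled.

```python
def fold_change(cancer_count, normal_count, cancer_total, normal_total, floor=0.5):
    """Cancer-to-normal ratio of tag frequencies; zero counts are raised to `floor`."""
    cancer_freq = max(cancer_count, floor) / cancer_total
    normal_freq = max(normal_count, floor) / normal_total
    return cancer_freq / normal_freq


def classify(ratio, threshold=4.0):
    """Label a tag by the more-than-fourfold criterion used in the text."""
    if ratio > threshold:
        return "increased"
    if ratio < 1.0 / threshold:
        return "decreased"
    return "similar"


if __name__ == "__main__":
    # Hypothetical counts for one tag; library totals are those reported in the Results.
    ratio = fold_change(cancer_count=12, normal_count=2,
                        cancer_total=10111, normal_total=10463)
    print(classify(ratio))
```

A fold change alone does not establish significance, which is why only 20 of the tags passing this filter also met the P<0.05 criterion.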

Figure 3

Semilogarithmic plot distribution of tags in prostate cancer and normal epithelial cell LCM-microSAGE libraries.

Table 3 Genes showing differential expression in prostate cancer vs normal cells

Validation of LCM-MicroSAGE

To confirm the LCM-microSAGE results, fresh tissue sections of the cancer and normal tissue samples that were used for LCM-microSAGE analysis were also assessed by QRT-PCR for five selected genes (Figure 4). These genes were selected because they have been well studied in prostate cancer. ATP5L (tag ratio of normal vs cancer=0.4),18 FASN (0.33)19 and CAPNS1 (0.75)21 were expressed at higher levels in the cancer cell LCM-microSAGE library. QRT-PCR confirmed that these genes were expressed at 2–3-fold higher levels in cancer relative to normal cells. KLK3 (PSA) (1.0)20 and ACPP (1.0)22 were present at equivalent levels in the normal and cancer cell LCM-microSAGE libraries. However, QRT-PCR showed these genes to be expressed at approximately 10-fold higher levels in cancer tissue. The expression of ATP5L and FASN was also validated in LCM-isolated paired normal, cancer and cancer stromal cells (Figure 4b). FASN was expressed to a greater degree in cancer cells than in normal and cancer stromal cells.

Figure 4

Results of quantitative RT-PCR of five selected human genes. (a) ATP5L, FASN, CAPNS1, KLK3 and ACPP gene expression in non-dissected whole normal and paired cancer tissues from the same patient. (b) ATP5L and FASN gene expression was analyzed with total RNA derived from LCM paired normal, cancer and cancer-stromal cells from the same patient. Values are normalized against β-actin.

Discussion

Several methodologies have recently been developed to generate genomewide expression profiles that are characteristic of specific cell types or states of differentiation. In general, these techniques involve either a hybridization step, usually to cDNAs or oligonucleotides fixed to a solid matrix,23, 24, 25 or SAGE.1, 7, 8, 12 As with any methodology, each of these techniques possesses advantages and corresponding disadvantages. The advantages of SAGE include the potential to quantify a large number of transcripts simultaneously in the absence of any a priori sequence information.1, 7 Additionally, previous studies have shown that there is not complete overlap in transcripts identified as differentially expressed using SAGE and array-based technologies.

However, the application of SAGE is technically more demanding and limited by the requirement for a relatively large amount of input mRNA.7, 26 These disadvantages have necessarily limited the application of SAGE expression profiling to routinely acquired human tissue samples. Further, these tissue samples are comprised of heterogeneous cell populations, including the population of interest, thereby conveying the potential to complicate the interpretation of the expression profiling data.

Therefore, it was of interest to address these challenges in an attempt to adapt the SAGE technique to small human tissue samples that are obtained for diagnostic evaluation. LCM is now a standard technique to isolate cells of interest rapidly from tissue sections with estimates of purity of >95%.13 Recently, LCM performed on routinely stained frozen tissue sections has been used for mRNA isolation amenable to RT-PCR and generation of cDNA array expression libraries.3, 27, 28, 29 The use of LCM-derived mRNA for SAGE profiling has not previously been described. Owing to the limited quantity of template obtained from LCM-isolated cell populations, an amplification procedure was used, similar to that of others,10, 11 in order to generate sufficient amounts for the generation of SAGE libraries. This PCR step should be relatively free of bias because all ditags are of approximately equal length. However, the possibility that some ditag species are still preferentially amplified cannot be excluded. The advantage of using two rounds of PCR is that it enabled SAGE analysis to be performed from the limited amount of template available from LCM-isolated cell populations. Furthermore, even though preferential amplification may occur, the SAGE software discards duplicate ditags generated by preferential amplification. The software, therefore, corrects this bias and provides more robust results. The high percentage of duplicate ditags excluded from subsequent analysis reduces the average number of analyzed tags obtained per clone compared to standard SAGE analyses.1, 16
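The duplicate-ditag filter mentioned above can be sketched as follows. This is a simplified Python illustration, not the SAGE 2000 implementation: it assumes that a ditag and its reverse complement represent the same molecule (because a cloned concatemer can be sequenced in either orientation) and keeps only the first occurrence of each. All function names are hypothetical.

```python
def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def canonical(ditag):
    """Orientation-independent key: the smaller of a ditag and its reverse complement."""
    return min(ditag, revcomp(ditag))


def drop_duplicate_ditags(ditags):
    """Keep the first occurrence of each ditag, ignoring orientation."""
    seen = set()
    unique = []
    for ditag in ditags:
        key = canonical(ditag)
        if key not in seen:
            seen.add(key)
            unique.append(ditag)
    return unique


def duplicate_fraction(ditags):
    """Fraction of ditags that would be discarded as duplicates."""
    if not ditags:
        return 0.0
    return 1.0 - len(drop_duplicate_ditags(ditags)) / len(ditags)
```

The rationale is that two independent transcripts are unlikely to be ligated into the same tag pair repeatedly, so a ditag seen many times more plausibly reflects amplification than abundance.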

The frequency of these potential artifacts can increase exponentially with the number of PCR cycles performed. In order to minimize this effect, we performed multiple parallel PCR reactions of fewer cycles, rather than increasing the number of PCR cycles using fewer replicates.11 Using this procedure, we could obtain an expression profile from <30 ng of LCM-derived mRNA. A high percentage (>20%) of duplicate ditags was obtained using the LCM-microSAGE method. This may be a consequence of relatively low complexity within the mRNA population due to the LCM protocol (a low amount and quality of starting material). Alternatively, it might be anticipated that variable preservation of mRNA in banked tissue samples would reduce complexity and result in an increase in the number of duplicate ditags.
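The exponential dependence on cycle number can be illustrated with an idealized model in which each template species is multiplied by (1 + efficiency) in every cycle. Under that assumption, the over-representation of one ditag relative to another grows as a power of the cycle number. The model, the function name and the efficiency values below are assumptions for illustration; no per-ditag efficiencies were measured in this study.

```python
def amplification_bias(eff_a, eff_b, cycles):
    """Relative enrichment of species A over B after `cycles` rounds of PCR.

    Each species is assumed to grow by a constant factor (1 + efficiency) per
    cycle, where efficiency lies between 0 (no copying) and 1 (perfect doubling).
    """
    return ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


if __name__ == "__main__":
    # Hypothetical efficiencies of 0.95 and 0.90 for two ditag species.
    for cycles in (12, 28):
        print(cycles, "cycles:", amplification_bias(0.95, 0.90, cycles))
```

In this picture, adding parallel reactions increases yield without increasing the exponent, which is the reasoning behind running many replicates of fewer cycles.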

Since a SAGE library generates candidate genes, independent methods, such as RT-PCR, Northern blot analysis,10, 11 in situ hybridization12, 30 or immunohistochemistry,5 are needed for validation. However, validation was considered a lower priority goal for this study, since it was thought that relatively few statistically significant differences in gene expression would be determined based on the total number of tags sequenced for the study. Even with a total of 20 574 tags sequenced, the majority of epithelial and tumor-associated genes may not be expressed at sufficient levels to permit a stringent statistical assessment of differential expression.

QRT-PCR was used to confirm the expression of several well-characterized genes in matched cancer and normal prostate tissue. ATP5L, FASN and CAPNS1 showed a good correlation with transcript (tag) levels in both LCM-microSAGE libraries. In contrast, KLK3 and ACPP exhibited lower transcript levels in both the cancer and normal cell libraries, but higher mRNA expression in whole cancer tissue. It is noteworthy that both the KLK3 (1464 bp) and ACPP (2239 bp) transcripts are relatively large. These transcripts may be preferentially degraded during the LCM and/or microSAGE procedures. In this regard, the shorter β-microseminoprotein transcript (571 bp) was abundantly expressed in both LCM-microSAGE libraries. Ongoing studies will provide more detailed information regarding the variables that impact the sensitivity and reproducibility of the LCM-microSAGE technique.

In conclusion, we demonstrate the feasibility of combining LCM with microSAGE. This approach allows generation of cell-specific expression profiles and the assessment of the expression levels of known and unknown transcripts in specific cell types obtained from human tissue samples.