Introduction

Lung cancer is the leading cause of cancer death in the world; it accounts for 37% and 26% of the total deaths in men and women, respectively1. The overall 5-year survival rate for lung cancer patients is less than 15%. Non-small cell lung carcinoma (NSCLC) accounts for approximately 75% of all lung cancers and represents a heterogeneous group of cancers consisting mainly of squamous cell, adeno- and large cell carcinoma. Despite advances in cancer treatment in the past two decades, the prognosis of patients with lung cancer has been improved only minimally. Less than 50% of the patients that have successfully undergone potentially curative resections survive for 5 years after operation. The TNM staging system is an important but insufficient prognostic parameter; additional prognostic factors still affect the clinical outcomes, independent of stage.

Gene expression profiling by microarray and RT-PCR has been used to identify possible prognostic factors to predict patient outcome2, 3, 4. Real-time PCR permits quick, robust and quantitative measurement of gene expression, which makes it possible to construct gene expression-based survival models.

Comprehensive analysis of gene expression using RNA from fresh or frozen tumor specimens is becoming increasingly important to better understand cancer pathogenesis, disease progression and prognosis5, 6. However, frozen tumor specimens are not readily available. In contrast, formalin-fixed, paraffin-embedded tissues are widely present and usually have matching clinical data with which to carry out clinical studies.

Recently, several methods to extract RNA from paraffin-embedded tissues have been reported7, 8. We have developed an improved method of RNA extraction from paraffin-embedded lymphoid tissues and compared it with two other RNA extraction methods9. Our method yielded higher amounts of RNA with longer RNA fragments. Furthermore, real-time RT-PCR analysis showed that our method for RNA extraction resulted in significantly better reproducibility and concordance between the paired frozen and paraffin-embedded samples. However, the extent of RNA degradation and modification during the fixation process varies among tumor tissues. In the present study, we modified the RNA extraction method and demonstrated its efficiency and reproducibility in obtaining RNA from lung cancer tissues. This improved RNA extraction method can likely be used to study gene expression profiling of other paraffin-embedded solid tumor samples.

Materials and methods

Tissue specimens

Formalin-fixed, paraffin-embedded and frozen lung cancer specimens as well as the corresponding normal lung tissues were obtained from 8 patients with adenocarcinoma or squamous cell carcinoma. Fresh surgical specimens were fixed in 10% neutral-buffered formalin at room temperature for 4–6 h before being alcohol dehydrated and embedded in paraffin. The paraffin-embedded specimens had been stored for 1 year (4 patients), 3 years (2 patients) or 5 years (2 patients). The study was approved by the Tianjin Medical University Ethical Review Board. All patients consented to the study.

RNA isolation from frozen specimens

Frozen tissues (50−100 mg) were ground into powder in liquid nitrogen, and then suspended in 1 mL TRIZOL Reagent (Invitrogen, USA). Total RNA was extracted using TRIZOL reagent according to the manufacturer's protocol. Briefly, the aqueous phase was used for RNA precipitation with an equal volume of isopropanol. The RNA pellet was washed once with 1 mL 75% ethanol, then air-dried and re-dissolved in an appropriate volume of RNase-free water. RNA was quantified using a spectrophotometer (Beckman, USA), and its quality was checked by agarose gel electrophoresis.
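The spectrophotometric quantification mentioned above can be illustrated with a short Python sketch. It assumes the conventional conversion of about 40 μg/mL of RNA per A260 unit at a 1 cm path length, which is not stated in the text; the function names and the absorbance readings are hypothetical.

```python
# Assumed conversion factor (not given in the text): 1 A260 unit of RNA
# corresponds to roughly 40 ug/mL at a 1 cm path length.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate the RNA concentration of the undiluted sample from A260."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio used as a rough purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def total_yield_ug(a260, dilution_factor, sample_volume_ul):
    """Total RNA (ug) in a sample of the given volume (uL)."""
    conc = rna_concentration_ug_per_ml(a260, dilution_factor)
    return conc * sample_volume_ul / 1000.0


# Hypothetical reading: A260 = 0.5 and A280 = 0.25 on a 1:50 dilution
# of a 20 uL RNA sample.
concentration = rna_concentration_ug_per_ml(0.5, dilution_factor=50)
ratio = purity_ratio(0.5, 0.25)
yield_ug = total_yield_ug(0.5, 50, 20)
```

With these made-up readings the sketch gives 1000 μg/mL, a ratio of 2.0 and 20 μg in total.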

Extraction of total RNA from formalin-fixed, paraffin-embedded specimens

Total RNA was extracted from three 10 μm-thick formalin-fixed, paraffin-embedded sections (corresponding to about 30 mg of tissue). Sections were deparaffinized by two repeated incubations in 1.5 mL xylene at 37 °C for 20 min, followed by two repeated incubations in 1.5 mL 100% ethanol at 37 °C for at least 30 min. Ethanol was aspirated and the pellet was allowed to air-dry for 5 min at room temperature. Then the pellet was resuspended in 600 μL of RNA lysis buffer containing 10 mmol/L Tris/HCl (pH 8.0), 0.1 mmol/L EDTA (pH 8.0), 2% SDS (pH 7.3) supplemented with 50 μL of 60 mg/mL proteinase K (Promega, USA) and incubated at 60 °C for 16–20 h with occasional agitation, until the tissue was completely digested. Next, we purified RNA using two sequential extractions with an equal volume of 70% phenol (pH 4.3):30% chloroform at room temperature, unlike our previously reported method using only one extraction on lymphoid tissue samples9. Then, the RNA was precipitated with an equal volume of isopropanol in the presence of 1/10 volume of 3 mol/L sodium acetate (pH 5.2) and 0.5 μL of a 20 mg/mL solution of carrier glycogen (Invitrogen, USA) at -20 °C for at least 1 h. The RNA pellet was washed once in 75% ethanol, dried and redissolved in 20 μL of RNase-free water. All solutions were prepared using DEPC-treated water. RNA was quantified spectrophotometrically, and its quality was assessed by 1.5% agarose gel electrophoresis and staining with ethidium bromide.
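As a rough check of the reagent amounts in this protocol, the following Python sketch works through the arithmetic. It assumes that the 50 μL of proteinase K stock is added on top of the 600 μL of lysis buffer (650 μL in total), which the text does not state explicitly; the function names are hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, base_volume_ul):
    """Concentration after adding a stock solution to a base volume.

    The result has the same unit as stock_conc.
    """
    total_volume = stock_volume_ul + base_volume_ul
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume


def mass_ug(conc_mg_per_ml, volume_ul):
    """Mass in ug; 1 mg/mL is numerically equal to 1 ug/uL."""
    return conc_mg_per_ml * volume_ul


# Proteinase K: 50 uL of a 60 mg/mL stock added to 600 uL of lysis buffer
# (assumed total of 650 uL).
proteinase_k_mg_per_ml = final_concentration(60.0, 50.0, 600.0)

# Carrier glycogen: 0.5 uL of a 20 mg/mL solution.
glycogen_ug = mass_ug(20.0, 0.5)
```

Under that assumption the digestion contains roughly 4.6 mg/mL proteinase K, and each precipitation receives 10 μg of carrier glycogen.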

To rule out the possibility of DNA contamination, the dissolved RNA was incubated with 10 μg/mL RNase-free DNase at 37 °C for 30 min, and then precipitated as described above.

Reverse transcription

Two micrograms of total RNA from paraffin-embedded or frozen tissues was reverse transcribed using the M-MLV reverse transcriptase (Promega, USA) according to the manufacturer's protocol, with minor modifications. RNA template and random primers were incubated at 70 °C for 10 min to melt the secondary structure within the template, and cooled on ice for more than 2 min. Then the complete reaction mixture was incubated at 30 °C for 10 min, 42 °C for 60 min and 70 °C for 15 min.

Determination of the length of specific transcripts by PCR

To compare the maximal length of RNA transcripts extracted from paraffin-embedded and frozen tissues, we amplified 13 fragments of the β-actin gene, ranging in size from 99 to 705 bp. The primers for β-actin were as previously described9. To prevent potential amplification of contaminating DNA, most primer pairs were designed to span different exons. PCR was performed in a total volume of 25 μL containing 1 μL of reverse-transcribed cDNA. After an initial incubation at 94 °C for 5 min, the reaction mixtures were subjected to 35 cycles of amplification using the following protocol: 94 °C for 45 s, 55 °C for 45 s and 72 °C for 45 s, followed by a final extension step at 72 °C for 7 min. PCR products were analyzed by 1.2% agarose gel electrophoresis and stained with GoldView nucleic acid dye.

Quantitative real-time RT-PCR

Real-time RT-PCR was performed using an ABI PRISM 7500 Sequence Detection System instrument and software (Applied Biosystems, USA). The relative expression levels of four housekeeping genes and five target genes were measured using a SYBR Green I dye-based method. The sequences of the primers are presented in Table 1.

Table 1 Sequences of real-time RT-PCR primers.

PCR reactions were prepared in a final volume of 25 μL, with a final concentration of 1×Power SYBR Green PCR Master Mix (Applied Biosystems, USA) and cDNA derived from 25 ng of input RNA, as determined by spectrophotometric measurement using the OD260 value. Thermal cycling comprised an initial UNG incubation at 50 °C for 2 min, AmpliTaq Gold DNA Polymerase activation at 95 °C for 10 min, and 40 cycles of denaturation at 95 °C for 15 s and annealing and extension at 60 °C for 1 min. Each measurement was performed in triplicate and the threshold cycle (CT), the fractional cycle number at which the amount of amplified target reached a fixed threshold, was determined as previously reported9, 10.
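The cycling conditions above can be written down as a small data structure, for example to check the programmed run time. This is only an illustrative Python sketch; the class and variable names are hypothetical and do not correspond to any instrument software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed temperature step."""
    name: str
    temp_c: float
    seconds: int


# Single hold steps before cycling, as described in the text.
HOLDS = [
    Step("UNG incubation", 50, 120),
    Step("polymerase activation", 95, 600),
]

# Steps repeated in every cycle.
CYCLE = [
    Step("denaturation", 95, 15),
    Step("annealing and extension", 60, 60),
]

N_CYCLES = 40


def programmed_seconds(holds, cycle, n_cycles):
    """Sum of programmed hold times, ignoring temperature ramping."""
    hold_time = sum(step.seconds for step in holds)
    cycle_time = sum(step.seconds for step in cycle)
    return hold_time + n_cycles * cycle_time


total_seconds = programmed_seconds(HOLDS, CYCLE, N_CYCLES)
total_minutes = total_seconds / 60
```

The programmed time comes to 62 min (720 s of holds plus 40 cycles of 75 s); actual run time is longer because ramping between temperatures is not included.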

To compare the expression of endogenous housekeeping genes between the paired paraffin-embedded and frozen specimens, the difference in the average CT values between these two types of specimens was calculated as follows: $\text{mean } C_T \text{ difference} = \text{average } C_T(\text{paraffin-embedded tissue RNA}) - \text{average } C_T(\text{frozen tissue RNA})$. To compare the RNA expression of target genes among different specimens, normalization based on GUSB gene expression was performed, and the averages of the normalized CT values (ΔCT) were calculated as previously reported10, 11. Relative mRNA expression of a target gene within a specimen was calculated as $2^{-\Delta C_T}$, where $\Delta C_T = C_T(\text{target gene}) - C_T(\text{GUSB})$7.
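The normalization described above can be sketched in Python as follows. This is a minimal illustration and not the authors' analysis code; the function names and the CT values are hypothetical.

```python
from statistics import mean


def ct_difference(ct_paraffin, ct_frozen):
    """Average CT (paraffin-embedded) minus average CT (frozen)."""
    return mean(ct_paraffin) - mean(ct_frozen)


def delta_ct(ct_target, ct_reference):
    """Delta CT: average CT of the target gene minus that of the reference gene."""
    return mean(ct_target) - mean(ct_reference)


def relative_expression(dct):
    """Relative expression of the target gene, 2 to the power of minus delta CT."""
    return 2.0 ** (-dct)


# Hypothetical triplicate CT values for one specimen (not data from the study).
target_ct = [27.0, 27.5, 26.5]   # a target gene
gusb_ct = [24.0, 24.25, 23.75]   # GUSB, the reference gene

dct = delta_ct(target_ct, gusb_ct)
expression = relative_expression(dct)

# Hypothetical GUSB triplicates from a paired paraffin-embedded specimen.
shift = ct_difference([26.0, 26.5, 25.5], gusb_ct)
```

With these invented values, ΔCT is 3.0, so the target gene is expressed at 0.125 times the GUSB level, and the paraffin-embedded sample shows a CT two cycles higher than the frozen sample.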

Statistical analysis

The variance in the expression of each housekeeping gene was calculated using Microsoft Excel. The correlation of gene expression between paired specimens was assessed using the Pearson linear correlation coefficient.
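For readers who want to reproduce this kind of comparison outside a spreadsheet, the Pearson coefficient can be computed as in the Python sketch below. The function name and the paired ΔCT values are hypothetical; this is not the authors' workflow.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two paired sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired delta CT values (frozen vs paraffin-embedded).
frozen_dct = [1.0, 2.5, 3.0, 4.5, 6.0]
paraffin_dct = [1.5, 2.0, 3.5, 4.0, 6.5]

r = pearson_r(frozen_dct, paraffin_dct)
```

A value of r close to 1 indicates that the ΔCT values of the paired specimens rise and fall together.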

Results

RNA extraction

For comparative purposes, we isolated total RNA from formalin-fixed, paraffin-embedded lung cancer tissues and frozen lung cancer tissues. To obtain a sufficient yield of amplifiable RNA, we performed the phenol-chloroform extraction twice for lung tissues, instead of only once as was done previously for lymphoid tissues9. Starting from three 10-μm-thick formalin-fixed, paraffin-embedded tissue sections, we obtained an average of 44.9 μg (33.1−55.6 μg) of RNA, with OD260/280 ratios ranging from 2.0 to 2.1, indicating good quality purified RNA; these ratios are higher than those obtained with our previous method in lymphoid tissues, in which RNA OD260/280 ratios were between 1.6 and 1.8. As expected, RNA extracted from the frozen tissues showed distinct 28S and 18S ribosomal RNA (rRNA) bands and OD260/280 ratios ranging from 1.9 to 2.1, whereas most of the RNA extracted from formalin-fixed, paraffin-embedded specimens with our method migrated between the 28S and 18S bands (about 1000 to 2000 bases) and appeared as a smear (Figure 1). To rule out the possibility of DNA contamination and to improve RNA quality, we added RNase-free DNase to the dissolved RNA and incubated samples at 37 °C for 30 min, then precipitated RNA as stated in the Materials and methods. Compared with untreated RNA, DNase-treated RNA did not perform significantly better in subsequent experiments, and less RNA was recovered overall (data not shown).

Figure 1

Agarose gel electrophoresis of total RNA extracted from frozen lung cancer tissues (A) and the corresponding paired formalin-fixed, paraffin-embedded lung cancer tissues (B). Lanes 1−4: one-year old samples; Lanes 5−6: 3-year old samples; Lanes 7−8: 5-year old samples.

RT-PCR amplification of different β-actin fragments

To assess the ability of the extracted RNAs to generate longer amplicons by RT-PCR amplification, 13 β-actin amplicons, ranging in size from 99 to 705 bp, were amplified from the RNA extracted from the paraffin-embedded and frozen specimens, as well as from the specimens before and after genomic DNA removal (Figure 2). Compared with frozen lung cancer tissues, the RNA extracted from the formalin-fixed, paraffin-embedded lung cancer tissues yielded β-actin amplicons with similar quality, although at a lower quantity. Using DNase-treated RNA as template did not improve the amplification efficiency and did not result in amplification of longer fragments (data not shown). Amplification of RT products in which reverse transcriptase was omitted did not yield PCR amplicons, thus ruling out the possibility of inadvertent amplification of contaminating DNA (data not shown).

Figure 2

Effects of the method of total RNA extraction from formalin-fixed, paraffin-embedded lung cancer tissues on amplifiable RNA fragment length. Thirteen primer pairs were tested that amplify from 99 to 705 bp of β-actin. The expected product size is given for amplicons amplified from the frozen (A) and formalin-fixed, paraffin-embedded (B) lung cancer specimens.

Real-time RT-PCR gene expression analysis

To compare the consistency of quantitative RT-PCR between matched formalin-fixed, paraffin-embedded and frozen lung cancer tissues, and to identify the appropriate endogenous control genes for RNA input normalization in lung cancer tissues, we studied the expression of four commonly used housekeeping genes (GUSB, PGK-1, GAPDH, and 18S), which were selected from among those well known in the literature as constitutively expressed across different conditions. The primers amplified relatively small amplicons (less than 150 bp). As expected, the RNA of the endogenous housekeeping genes extracted from paraffin-embedded specimens was generally lower in quantity (higher CT) than that extracted from matched frozen specimens (Figure 3). However, different housekeeping genes exhibited different degrees of variation in expression among specimens. Of the 4 endogenous control genes, expression of GUSB was the least variable both in frozen and in paraffin-embedded specimens, while the expression of 18S was the most variable (Figure 3).
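The selection of the least variable control gene can be illustrated with a short Python sketch. It assumes the sample variance (the n − 1 form) as the measure of variability, which the text does not specify; the CT values are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
from statistics import variance


def rank_by_variance(ct_by_gene):
    """Return (gene, sample variance of CT) pairs, least variable first."""
    pairs = ((gene, variance(values)) for gene, values in ct_by_gene.items())
    return sorted(pairs, key=lambda pair: pair[1])


# Invented mean CT values of four genes in four specimens.
example_ct = {
    "GUSB": [24.0, 24.5, 24.0, 24.5],
    "PGK-1": [22.0, 23.0, 22.0, 23.5],
    "GAPDH": [18.0, 19.5, 18.5, 20.0],
    "18S": [10.0, 13.0, 11.0, 14.0],
}

ranking = rank_by_variance(example_ct)
least_variable_gene = ranking[0][0]
most_variable_gene = ranking[-1][0]
```

The same ranking would be run separately on the frozen and the paraffin-embedded CT values, since a suitable control gene should be stable in both.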

Figure 3

RNA expression of housekeeping genes in paired frozen and formalin-fixed, paraffin-embedded lung cancer tissues. (A) Mean RNA expression (represented by mean CT) of 4 housekeeping genes (GUSB, PGK-1, GAPDH, and 18S) in 8 frozen and paired formalin-fixed, paraffin-embedded lung cancer tissues. Each bar represents gene expression in an individual specimen. Numbers above the bars represent the variance of expression of the particular gene among the specimens. (B) RNA expression of PGK-1, GUSB, LCK, STAT1, MMD, ERBB3, and DUSP6 in formalin-fixed, paraffin-embedded lung cancer tissues. The expression of each gene is normalized to its expression in the matched frozen specimens, as represented by the CT difference between the paraffin-embedded and frozen specimens. Each symbol represents a distinct paired tissue specimen.

We next examined the expression of several non-housekeeping genes in the matched paraffin-embedded and frozen lung cancer tissues. We selected the following genes: LCK, MMD, STAT1, ERBB3, and DUSP6, which have been linked to relapse-free and overall survival among patients with NSCLC6. The expression of these target genes was normalized to GUSB because of its minimal variation in expression. We obtained similar normalized expression profiles of these target genes in the formalin-fixed, paraffin-embedded specimens and frozen specimens, as shown in Figure 4A. ΔCT values of the paraffin-embedded tissues and their correlation with those of frozen tissues are shown in Table 2. Although the absolute quantities of specific RNA transcripts amplified from similar amounts of starting RNA were smaller in paraffin-embedded samples than in the paired frozen specimens, once the data were normalized to endogenous housekeeping gene controls, the resulting relative amounts were very similar. The adjusted Pearson correlation (r) between the formalin-fixed, paraffin-embedded and frozen lung cancer specimens for all tested genes was r=0.885 (Figure 4B).

Figure 4

Comparison of RT-PCR expression profiles of 5 genes (LCK, MMD, STAT1, ERBB3, and DUSP6) from paired frozen and formalin-fixed, paraffin-embedded lung cancer tissues. (A) Total RNA was extracted from frozen and paraffin-embedded lung cancer tissues, and mRNA levels were determined by real-time SYBR Green RT-PCR as described in Materials and methods. Each bar represents the normalized expression relative to GUSB and the mean of three measurements: grey, formalin-fixed, paraffin-embedded tissues; black, frozen tissues. (B) Spearman correlation for the expression of the 5 genes in 8 paired frozen and formalin-fixed, paraffin-embedded lung cancer tissues.

Table 2 Comparison of average ΔCt values between formalin-fixed, paraffin-embedded specimens and frozen specimens by target gene.

Measurement of the expression of these five genes using RNA extracted from two 3-year-old and two 5-year-old lung cancer specimens demonstrated a similar correlation between the paired formalin-fixed, paraffin-embedded and frozen specimens, as was observed for the 1-year-old specimens. The storage age of the formalin-fixed, paraffin-embedded lung cancer tissue did not have a marked effect on the expression of the housekeeping genes (such as PGK-1 and GUSB) and the selected five genes (LCK, MMD, STAT1, ERBB3, and DUSP6), as shown in Figures 3B and 4A (data for other housekeeping genes are not shown). A comparison of 3- or 5-year-old specimens with 1-year-old specimens showed similar CT differences and ΔCT values between the matched formalin-fixed, paraffin-embedded and frozen specimens.

Discussion

DNA arrays and real-time RT-PCR are important tools in the diagnosis and treatment of human cancers12, 13. However, the requirement for fresh or snap-frozen tissues has limited their clinical application. By contrast, specimens collected and processed for pathological diagnosis are readily available, many with matching clinical data14. Recently, progress has been made in extracting RNA from formalin-fixed, paraffin-embedded lymphoid tissues and breast cancer tissues15. However, these techniques are subject to tissue-specific, fixation-associated RNA degradation and modification. Optimization of RNA extraction for each tissue is therefore required. The aim of this study was to investigate whether the method of RNA extraction from formalin-fixed, paraffin-embedded lymphoid tissues that we developed can be used on lung cancer specimens. We also tested whether the RNA extracted with our improved method can be used in quantitative real-time RT-PCR, and explored the most appropriate endogenous control genes to use for the normalization of RNA quality and quantity.

The traditional methods of RNA extraction from formalin-fixed, paraffin-embedded tissues often yield RNA of insufficient quality10, extensively degraded to fragments that are, on average, 200 nucleotides in length16. Previous attempts to amplify fragments longer than 200 bp were usually unsuccessful17. To date, the most successful method for total RNA extraction from formalin-fixed, paraffin-embedded tissues utilizes digestion with proteinase K before the acid-phenol:chloroform extraction and carrier precipitation18. We modified this method by using a higher concentration of proteinase K and a longer digestion time, optimized to 16 h, to obtain higher quality RNA from lymphoid tissues. In this study, we applied the RNA extraction method used in lymphoid tissues to lung cancer tissues. We were able to obtain high-yield and high-quality RNA that could be amplified to yield long cDNA fragments (>600 bp). Furthermore, the delta threshold cycle (ΔCT) values of the formalin-fixed, paraffin-embedded tissues in our experiments correlated strongly with those of frozen lung cancer tissues (r=0.855, P<0.01) for all the test genes. Our results indicate that formalin-fixed, paraffin-embedded lung cancer tissues may replace frozen tissues in gene expression analysis using real-time RT-PCR when our modified RNA extraction method is utilized.

There are no housekeeping genes whose expression is constant in all tissues during normal or malignant growth19, 20. To properly control for the variation in expression of RNAs, endogenous control genes need to be selected for each cell type and tumor type in each experimental design21. Here, we chose four housekeeping genes with different abundances that have been widely cited in the literature and that exhibit relatively low variation in expression in different lymphoid tissues9, 21. Our experiments indicated that the GUSB gene exhibited the lowest variation of expression in formalin-fixed, paraffin-embedded and frozen lung cancer specimens, and is therefore a suitable endogenous gene to control for RNA quality and quantity.

In this study, we obtained similar normalized expression profiles of five target genes (LCK, MMD, STAT1, ERBB3, and DUSP6) in formalin-fixed, paraffin-embedded specimens and frozen specimens. Our observations suggest that real-time RT-PCR measurements of normalized gene expression in paraffin-embedded lung cancer specimens using our method may closely reflect the gene expression in paired frozen specimens and could obviate the need for frozen specimens. Further studies on the applicability of these methods for prediction, for instance, of lung cancer survival using models with five or more genes, are in progress6. Recently, we used this method to analyze BAG-1 expression in human lung cancer specimens from West China Hospital that were formalin-fixed and paraffin-embedded between 1999 and 2003 22.

In summary, we have modified the method of RNA extraction from formalin-fixed, paraffin-embedded lymphoid tissues and applied it to lung cancer tissues. Our method will enable researchers to use RT-PCR and real-time quantitative RT-PCR to study the pathogenesis, prognosis, and treatment of lung cancer using formalin-fixed, paraffin-embedded archival tissues.

Author contribution

Jun CHEN, Qing-hua ZHOU designed research; Fan ZHANG, Zhuo-min WANG and Hong-yu LIU performed research; Yun BAI, Sen WEI, and Ying LI contributed new analytical tools and reagents; Min WANG analyzed data; Jun CHEN and Zhuo-min WANG wrote the paper.