Main

To compare gene expression levels between two biological samples, normal cells vs tumor cells for example, it is critical to have a normalization control. Historically, this has been accomplished using genes such as glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and ACTB, which are presumed to have stable expression levels in all cell types. Additional approaches for normalization include using cell number, cell content (DNA, RNA, or protein amounts), or more recently a multigene readout on expression microarrays. In each case, the normalization procedure is designed to allow investigators to obtain an accurate measurement of the relative levels of a gene of interest across different samples.

Review of the literature shows that a common set of ‘housekeeping genes’ is used as endogenous controls for quantitative reverse transcriptase (qRT)-PCR. Overall, the majority of these mRNAs are effective normalization tools, although several groups report that variability in their expression levels can exist in different sample and cell types and needs to be considered when designing a study.1, 2, 3, 4, 5, 6, 7, 8 For example, Vandesompele et al9 reported that universally stable control genes do not exist, thus necessitating the development of an accurate normalization strategy for each experimental system. Aerts et al,10 in a study of 26 tumor cell lines including prostate cancer DU-145 and PC-3, showed that GAPDH had the highest variability in gene expression of commonly used controls, whereas 18S rRNA, GUS, and ACTB had the best expression stability.10 ACTB also showed stable expression in skeletal muscle and diabetic glomeruli; however, this was not observed in colonic and adipose tissue, reinforcing the notion that many commonly used ‘housekeeping’ genes are not stably expressed in certain cell types and environments, and that endogenous controls need to be validated in each tissue and experimental condition.2, 11, 12, 13

Beyond differences in housekeeping gene expression related to tissue type, investigators must also consider variability in gene expression related to disease state and procurement method. Tumor cells show alterations in many transcripts, particularly those related to increased growth and metabolism, a fact that must be considered when housekeeping genes are selected for studies of cancer or other proliferative processes.2, 14, 15, 16, 17, 18 Additionally, many previous studies examining housekeeping gene stability utilized cells grown in culture as opposed to ex vivo tissue specimens. Our group and others have found that cells in culture show some expression features, such as consistently high levels of ACTB, which appear to be related to in vitro growth conditions and are not necessarily observed in tissue samples. Thus, ACTB may be an excellent housekeeping gene for cells grown in vitro, but less accurate when studying tissues. Microdissected (MD) cells add another level of complexity to quantitative gene expression analysis as the mRNA is derived from a single cell type that may have a unique expression profile, including the housekeeping genes, relative to the bulk tissue sample from which they were derived.

The objective of the present study was to evaluate methods for normalizing gene expression measurements in MD tissue samples. Cell count, total RNA, individual mRNAs, and small sets of mRNAs were each assessed for their ability to serve as internal controls.

Materials and methods

Tissue Specimens

Five frozen specimens of newly diagnosed and untreated human prostate carcinoma (PCa) were obtained from patients who underwent prostatectomy at Catholic University in Santiago, Chile (cases 1–5) and were anonymized before transfer to the National Cancer Institute (NCI). Tumors were evaluated by two pathologists (JWG and RFC) and assigned Gleason scores of 6 (3,3; n=1), 7 (4,3; n=2), and 9 (4,5; n=2). Tissue specimens used in this study were selected using the NCI Pathogenetics Unit Prostate Tissue Database (data not shown). Tissue quality has previously been shown to be excellent for isolation of RNA.19

Microdissection and RNA Isolation

Five cases of matched frozen nontumor prostate (benign hyperplasia) and either moderate-grade prostate cancer (Gleason pattern 3,3), high-grade prostate cancer (Gleason 4,5), or the Gleason 4 portion of a Gleason pattern 4,3 tumor were MD in triplicate according to standard methods.20, 21 Staining of the frozen tissue samples before laser capture microdissection (LCM) was performed by dipping the tissue sections into solutions as follows: 70% ethanol (EtOH) for 20 s, H2O for 10 s, hematoxylin (Sigma-Aldrich, St Louis, MO, USA) for 10 s, H2O for 10 s, 70% EtOH for 10 s, eosin (Sigma-Aldrich) for 4 s, 95% EtOH for 10 s two times, 100% EtOH for 10 s two times, and xylenes for 20 s three times.

Okuducu et al22 reported that methyl green causes the least fluorescent interference for qRT-PCR. We have traditionally used hematoxylin and eosin (H&E) as our stain of choice because it allows for the optimum visualization of cells of interest. Therefore, before LCM, staining of additional triplicate sections of one case was performed according to the above H&E-staining method, with the deletion of eosin and either the use of hematoxylin alone or the substitution of methyl green (MTR Scientific, Ijamsville, MD, USA) for hematoxylin. This analysis was conducted to assess if there is a statistically detectable fluorescent interference of the different tissue stains.

Approximately 10 000 dissected cells were procured for each case, which equated to ∼3000 shots per tissue type (Figure 1). Samples were immediately placed into lysis buffer and stored at −80°C until RNA extraction. Total RNA isolation of each individual sample was conducted with PicoPure RNA extraction kit (Arcturus Engineering Inc., Mountain View, CA, USA) according to the manufacturer's protocol. Because DNA is often left in the sample using glass filter RNA extraction, the samples were subjected to DNAse treatment for 15 min.19 Following RNA isolation, 1 μl samples were aliquoted for RNA quantitation and qualitation analysis and were used immediately. All remaining RNA samples were stored at −80°C.

Figure 1

Representative histology from matched prostate normal (a) and PCa (b) sample areas captured by laser microdissection. From these areas, 10 000 cells were MD and used as the normalized samples as described in Materials and Methods. H&E stained and original magnification, × 100.

RNA Quantitation and Qualitation

Quantitation of individual sample total RNA, in triplicate, was conducted using Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA, USA) and NanoDrop (NanoDrop Technologies, Wilmington, DE, USA) equipment according to the manufacturer's protocol for <10 ng total RNA quantities per microliter of the sample. Quality of individual samples was assessed with the Bioanalyzer (Agilent Technologies Inc.) and an RNA integrity number was provided by 2100 Expert software (Agilent Technologies Inc.). Qualitation of case no. 5 total RNA was also visualized via electrophoresis on a 1.0% agarose gel.

Reverse Transcription

Reverse transcription (RT) was conducted on all individual RNA samples using TaqMan RT Reagents (Applied Biosystems Inc. (ABI), Foster City, CA, USA; catalogue no. N808-0234), with random hexamers as the RT primers, according to the manufacturer's protocol. Template was used at 4 μl volumes per reaction. All RT reactions were performed on the MJ Research PTC-200 thermocycler. The cDNA was used immediately for all PCR. Remaining cDNA was stored at −80°C.

Housekeeping Genes

Ten housekeeping genes that are constitutively expressed in various human tissues and used in numerous previous studies were chosen as endogenous controls for gene expression analysis (Table 1). Selecting genes that belong to different functional classes significantly reduced the chance that they are coregulated.9 Except for PGK1 and GAPDH, which are both involved in glycolysis, it is assumed that the regulation of expression of these housekeeping genes is not directly related, and that they have independent functions in cellular maintenance.23

Table 1 Genes evaluated as endogenous controls for MD prostate N- and T-gene expression analysis

Quantitative Real-Time PCR (qPCR) Gene Expression Measurement

Ten commercially available optimized endogenous control primer/probe sets (Table 1) and 2 × TaqMan Universal Master Mix (ABI catalogue no. 4304437) were used in singleplex qPCR according to the manufacturer's protocol. The assays for ACTB, CYPA, HPRT, GAPDH, TFRC, B2M, GUS, PGK1, and RLP are cDNA specific, whereas the primers and probes for 18S rRNA may detect genomic DNA. Negligible genomic DNA contamination (0.1%) was confirmed by the observation of a cycle threshold (CT) difference of 10 between RT-negative and RT-positive samples.24, 25 A Stratagene Mx3000P™ real-time PCR machine was used for all qPCR and detections. cDNA template was used at 4 μl volumes per reaction. All qPCR assays were performed in triplicate after RT. Cycling conditions consisted of one cycle of 50°C for 2 min followed by 95°C for 10 min, and then 50 cycles of 95°C for 15 s followed by 60°C for 1 min. Controls consisting of total human prostate RNA (∼12 ng/μl; Ambion, Austin, TX, USA) were positive in all runs, and controls consisting of sterile molecular grade water were negative in all runs. Relative quantitation analysis of gene expression data was conducted according to the 2^−ΔΔCT method.26, 27
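The arithmetic behind the contamination estimate and the relative quantitation step can be sketched as follows. This is a minimal illustration that assumes perfect amplification efficiency (template doubling each cycle), which the text does not state; the function names and the CT values in the usage lines are hypothetical and are not taken from the study.

```python
def fraction_from_delta_ct(delta_ct: float) -> float:
    """Relative template amount implied by a CT difference.

    Assumes the product doubles every cycle (100% efficiency).
    """
    return 2.0 ** (-delta_ct)


def relative_expression(ct_target_test: float, ct_ref_test: float,
                        ct_target_cal: float, ct_ref_cal: float) -> float:
    """Fold change of a target gene in a test sample vs a calibrator sample.

    Implements the 2^-ddCT calculation: each target CT is first normalized
    to its reference (housekeeping) CT, then the two samples are compared.
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_cal = ct_target_cal - ct_ref_cal
    return 2.0 ** (-(delta_ct_test - delta_ct_cal))


# An RT-negative well appearing 10 cycles after the RT-positive well
# implies roughly 0.1% genomic DNA signal.
gdna_fraction = fraction_from_delta_ct(10.0)
print(f"genomic DNA fraction: {gdna_fraction:.4%}")

# Hypothetical CT values: tumor (test) vs normal (calibrator).
fold = relative_expression(24.0, 20.0, 26.0, 20.0)
print(f"fold change: {fold:.1f}")
```

With a CT difference of 10 the implied fraction is 2^−10, or about 0.1%, which matches the figure quoted in the text.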

Statistical Analysis

Analyses were carried out on the original CT scale. The ‘gold standard’ normalization factor was the average of all 10 endogenous control genes (Table 1). The variation for particular genes and the average values across all 10 genes was examined using a variance components model.28 This approach allowed decomposition of the sources of variation into components attributable to individual sample, microdissection replicate, and qPCR technical replicate. Specifically, we fit the model:

$$\mathrm{CT}_{ijk} = \beta + \tau_i + \delta_{ij} + \varepsilon_{ijk}$$

where i denotes the ith sample, j denotes the jth MD sample, and k denotes the kth qPCR technical replicate. In the analysis, there are five individual samples, with three microdissections for each, and then three qPCR technical replicates for each microdissection. The parameter $\beta$ reflects the average value, while the random effects $\tau_i$, $\delta_{ij}$, and $\varepsilon_{ijk}$ measure variation due to individual sample, microdissections within individual sample, and qPCR technical replication, respectively. Variances of $\tau_i$, $\delta_{ij}$, and $\varepsilon_{ijk}$, denoted as $\sigma_\tau^2$, $\sigma_\delta^2$, and $\sigma_\varepsilon^2$, characterizing the different sources of variation for these three components, are estimated from the model. Ninety-five percent tolerance intervals demonstrating the natural variation in single measurements were constructed based on estimates obtained from the variance components model. These tolerance intervals can be interpreted as bounds on likely ranges for the average CT values. As an illustration, for a single MD sample with a single qPCR technical replicate across tumor samples, the 95% tolerance interval was computed as $\hat{\beta} \pm 1.96\sqrt{\hat{\sigma}_\tau^2 + \hat{\sigma}_\delta^2 + \hat{\sigma}_\varepsilon^2}$. The 95% tolerance interval for a single MD sample with a single qPCR technical replicate within a tumor sample was computed as $\pm 1.96\sqrt{\hat{\sigma}_\delta^2 + \hat{\sigma}_\varepsilon^2}$.

The difference between using a single endogenous control gene or a pair of endogenous control genes as a normalization factor, relative to using all 10 endogenous control genes (‘gold standard’) was also assessed. For evaluating each of the 10 individual endogenous control genes or averages of any two control genes (ie, 45 possible combinations of two endogenous control genes), we fit the variance components models:

$$(\text{gene CT value})_{ijk} - (\text{average CT value})_{ijk} = \beta + \tau_i + \delta_{ij} + \varepsilon_{ijk}$$

where the gene CT value was either the CT value for each of the 10 genes individually, or an average of the CT values for each of the 45 possible pairs of the 10 housekeeping genes. The average CT value is the average value over all 10 genes. The parameter $\beta$ measures the average difference, while the random effects $\tau_i$, $\delta_{ij}$, and $\varepsilon_{ijk}$ measure variation in the difference due to individual sample, microdissections within individual sample, and qPCR technical replication, respectively. Variances of $\tau_i$, $\delta_{ij}$, and $\varepsilon_{ijk}$, denoted as $\sigma_\tau^2$, $\sigma_\delta^2$, and $\sigma_\varepsilon^2$, characterizing the different sources of variation in the difference for these three components, are estimated from the model. Estimates of the model parameters were used to compute 95% tolerance intervals for normalization factors based on only one or two genes vs the average of all 10 genes. These intervals were constructed for a single measurement (ie, one particular qPCR technical replicate) as well as for the more practical situation of the average of three qPCR technical replicates. These tolerance intervals can be interpreted as bounds on likely values of the difference in CT value of using only a single or two genes as compared with the average CT value across all 10 control genes. Genes were ranked according to the length of the interval. These intervals were constructed separately for normal and tumor tissue. Further, we also considered the common situation in which both normal and tumor tissue are collected from the same individual (ie, paired normal and tumor tissue).
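The ranking procedure described above can be illustrated with a simplified sketch. It enumerates the 10 single genes and the 45 gene pairs, computes each candidate's difference from the 10-gene average, and ranks candidates by the spread of that difference. Note the assumptions: the spread is taken as 1.96 times the plain standard deviation of the differences, a stand-in for the variance components model used in the study, and the CT values are synthetic numbers generated only to make the example runnable.

```python
import random
from itertools import combinations
from statistics import mean, stdev

GENES = ["ACTB", "CYPA", "HPRT", "GAPDH", "TFRC",
         "B2M", "GUS", "PGK1", "RLP", "18S"]


def candidate_factors(genes):
    """All single genes plus all unordered pairs of genes."""
    singles = [(g,) for g in genes]
    pairs = list(combinations(genes, 2))
    return singles + pairs


def rank_by_spread(ct_by_sample, genes):
    """Rank candidate normalization factors by interval half-width.

    ct_by_sample is a list of dicts mapping gene name to CT value.
    Returns (genes, mean difference, half-width) tuples, narrowest first.
    """
    results = []
    for combo in candidate_factors(genes):
        diffs = []
        for ct in ct_by_sample:
            gold = mean(ct[g] for g in genes)       # 10-gene average
            diffs.append(mean(ct[g] for g in combo) - gold)
        half_width = 1.96 * stdev(diffs)            # simplified interval
        results.append((combo, mean(diffs), half_width))
    return sorted(results, key=lambda r: r[2])


# Synthetic CT data (hypothetical): a shared per-sample shift plus
# gene-specific noise of increasing size.
rng = random.Random(0)
base = {g: 20.0 + i for i, g in enumerate(GENES)}
noise_sd = {g: 0.2 + 0.1 * i for i, g in enumerate(GENES)}
samples = []
for _ in range(15):
    shift = rng.gauss(0.0, 1.5)
    samples.append({g: base[g] + shift + rng.gauss(0.0, noise_sd[g])
                    for g in GENES})

ranking = rank_by_spread(samples, GENES)
for combo, avg_diff, half_width in ranking[:3]:
    print("+".join(combo), f"mean diff {avg_diff:.2f}", f"+/- {half_width:.2f} CT")
```

A faithful reanalysis would replace the `stdev` step with variance components estimated from a mixed model, so that sample, dissection, and qPCR replicate effects are separated as in the text.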

We examined whether there was a significant effect of staining on CT values using a linear mixed model.29 A comparison across stains was made with an F-test.

Untransformed and transformed data means and s.d. are presented as means (s.d.). All P values correspond to two-sided tests with P≤0.05 considered as statistically significant.

Results

Three strategies for gene expression normalization were evaluated using cells MD from radical prostatectomy specimens: cell count during LCM, total RNA measurement, and the use of endogenous housekeeping genes. However, before this analysis the basic parameters of the gene expression measurement system were assessed. First, it was determined whether stains used for visualizing tissues interfere with subsequent qRT-PCR. Three stains were tested: (1) H&E, (2) hematoxylin alone, and (3) methyl green. An F-test revealed no statistically significant CT value difference among the three stains, with P values of 0.21, 0.08, and 0.18 for ACTB, HPRT, and GUS, respectively (Table 2). Therefore, H&E was used as the tissue stain for the remainder of the study. Second, the basic characteristics of qRT-PCR applied to MD samples were examined. Complementary DNA (cDNA) was produced from each of the 10 samples (one normal epithelium and one tumor dissection from each of five cases) and used to measure the levels of 10 housekeeping genes. A dilution series analysis of positive control RNA (human prostate total RNA) demonstrated that a minimum of 0.02 ng/μl total RNA was necessary for the assay. For each gene, CT values for the positive control RNA and all dissected samples were consistent, with a within-run s.d. of <0.3 CT, establishing that CT values of all 10 endogenous control genes can be reproducibly determined in dissected cells, and that PCR inhibitors were not present in them. Varying levels of expression were demonstrated for each of the 10 evaluated endogenous control housekeeping genes in the normal and tumor samples (Table 3).

Table 2 Mean difference, relative to H&E, for hematoxylin and methyl green staining for ACTB, HPRT, and GUS
Table 3 Average gene expression level (s.d.) of all 10 endogenous controls measured by CT values

LCM cell count was first evaluated as a normalization method by determining if consistent, reproducible amounts of RNA could be reliably procured from tissue samples. The following experimental strategy was employed. Three 5 μm thick serial histological sections were prepared from each of the 10 tissue blocks, 10 000 cells were dissected from identical areas of each of the three slides, 10 housekeeping genes were measured in all of the samples, and the intra-case variability in recovered RNA amounts was determined. In other words, we evaluated if the three serial histological sections within each case could produce identical amounts of RNA based on matched LCM cell count. Since the three recut sections contained cells that were immediately adjacent to each other within the tissue block, we made the assumption that the housekeeping gene levels within each case were constant. Thus, the only variables were the dissection procurement process and the RNA purification procedure. The data shown in Figure 2, and Tables 4 and 5, indicate that cell count during dissection was not a reliable means of normalizing expression data. Sizable variation was observed within the three intra-case dissections, from ±1.12 to 1.74 CT values in normal tissues and ±1.07 to 1.84 CT values in tumor tissues. These data imply that a comparison of 10 000 dissected cells from sample A against 10 000 dissected cells from sample B can produce an artifactual difference of up to 1.84 CT values using cell count as the normalization strategy. Since changes in gene expression of 1.5–2 CT values (∼3- to 4-fold) are considered biologically important,30 the use of dissection cell count as a means to compare samples is limited to those studies that do not require precise measurements. However, we did find that cell count was useful in producing ‘ballpark’ RNA input levels within the range that can be successfully analyzed by qRT-PCR, and thus serves as a useful first step when analyzing dissected samples.
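To make the scale of these CT differences concrete, the short sketch below converts CT differences into fold changes. It assumes template doubling per cycle (100% PCR efficiency), an idealization not stated in the text; the function name is hypothetical.

```python
def fold_change(delta_ct: float) -> float:
    """Fold difference in template implied by a CT difference,
    assuming the product doubles every cycle."""
    return 2.0 ** delta_ct


# CT differences quoted in the text: the biologically important range
# (1.5 to 2 CT) and the largest intra-case dissection variation (1.84 CT).
for delta in (1.5, 1.84, 2.0):
    print(f"{delta:.2f} CT -> {fold_change(delta):.2f}-fold")
# approximately 2.83-fold, 3.58-fold, and 4.00-fold
```

Under this assumption, the largest observed dissection artifact (1.84 CT, about 3.6-fold) falls inside the range of differences regarded as biologically important, which is why cell count alone cannot resolve such changes.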

Figure 2

Variation in CT values for normal and tumor tissue across LCM replicates. CT value variation for ACTB and GAPDH is representative of all 10 housekeeping genes. Each observation is the average of three qPCR technical replicates.

Table 4 Ninety-five percent tolerance intervals showing LCM variation within a tumor or normal sample
Table 5 Average qRT-PCR replicates housekeeping gene expression CT values for MD replicates per case and tissue type

The use of total RNA measurement as an internal comparator for qRT-PCR was next evaluated. LCM was performed on three serial sections from each of the 10 tissue blocks as described above, total RNA was recovered and measured using the NanoDrop method, all 10 housekeeping genes were analyzed by qRT-PCR in the 30 samples, and the intra-case reproducibility of total RNA measurement was calculated. Since we had established that LCM cell count was not a reliable means to normalize samples, the total RNA data were compared to housekeeping gene levels to determine if the intra-case RNA values were reproducible. Again, we made the assumption that housekeeping gene levels were constant in the three serial recut slides prepared from each tissue block. The MD samples yielded approximately 16 μl of total RNA at an average concentration of 8–10 ng/μl, which is similar to our previous MD sample collection for microarray studies (data not shown). The total RNA quantity per sample was at the low end of the detection limits for quantitation by NanoDrop (cases 1–4: normal samples mean=8.4 ng/μl; median=5.4 ng/μl; range=2.0–26.8 ng/μl. Tumor samples: mean=10.8 ng/μl; median=6.4 ng/μl; range=1.6–41.5 ng/μl). The data shown in Figure 3 indicate that total RNA quantity measurements are not reproducible across LCM samples and cannot serve as an effective normalization strategy for MD samples. The large variance generated by the RNA analysis technique alone precludes accurate measurement of small and moderate gene expression differences among samples. Similar to using cell count, however, we did find that total RNA is a useful parameter for calibrating ballpark RNA input levels within the range that can be successfully analyzed by qRT-PCR.

Figure 3

Variation in total RNA quantitation measurements (NanoDrop) relative to average CT values across the 10 housekeeping genes. Each observation is the average of three qPCR technical replicates for each LCM replicate.

We then evaluated whether housekeeping genes could be used as an internal control for dissected samples. In this approach, the amount of target transcript per cell or per RNA content is not determined as with LCM cell count or RNA measurement; rather, the amount of target transcript relative to the overall transcriptome of a sample is calculated, much like an expression array experiment where input amounts of RNA are normalized between samples based on multiple housekeeping genes or overall mRNA readout. However, unlike expression arrays, it is impractical to use a large set of housekeeping genes for normalization in qRT-PCR samples, due to the time and resources involved; and more importantly, the limited amount of starting material. Thus, it was determined if one or two housekeeping genes could serve as a surrogate measurement for the transcriptome of dissected cells.

To perform this assessment, we started with the assumption that the average of the 10 housekeeping genes selected for the study was indeed reflective of the overall transcriptome of the dissected samples. The gene set was intentionally chosen to include diverse biological functions to increase this likelihood, even in tumors where many metabolic activities are increased. The strategy employed was to measure the 10 housekeeping genes in three separate dissected samples from each case, and statistically evaluate how accurately each individual gene, or small sets of genes, tracked with the average of the 10.

However, to complete this analysis the types of tissue samples that are typically used in qRT-PCR studies first needed to be considered. The two general scenarios are paired samples (diseased and normal samples from the same patient) or unpaired samples (diseased and normal samples from different patients). When the 10 housekeeping genes were initially analyzed, a strong positive correlation was observed between the paired normal and tumor samples from the five cases, indicating that the two study designs need to be separately considered (Figure 4).

Figure 4

Correlation of paired tumor (T) and normal (N) samples (ie, T and N from the same patient). Each observation is a single qPCR technical replicate for each LCM replicate.

For unpaired data, 95% tolerance intervals corresponding to a ΔCT for a single tumor sample (one tumor sample, one LCM sample, and the average of three qPCR replicates) were constructed using the results from the variance components analyses on tumor samples, and the genes were ranked in terms of the size of the interval (Table 6). The data indicate that as few as two housekeeping genes can effectively serve as a surrogate of the overall transcriptome. For example, using the average of CYPA and TFRC, the interval is within ±0.54 CTs of the average CT value across all 10 housekeeping genes 95% of the time; whereas, using ACTB alone, the interval is within only ±2.33 CTs of this average 95% of the time.
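As a rough illustration of what these interval widths mean in practice, the sketch below normalizes a target gene against the mean of two reference genes and converts a tolerance half-width in CT units into a fold-change window. It assumes doubling per cycle; the function names and the CT values passed to `delta_ct` are hypothetical, while the half-widths 0.54 and 2.33 are the values quoted above.

```python
from statistics import mean


def delta_ct(target_ct: float, reference_cts: list) -> float:
    """Target CT normalized to the mean CT of the reference genes."""
    return target_ct - mean(reference_cts)


def fold_window(half_width_ct: float) -> float:
    """Multiplicative uncertainty implied by a CT half-width,
    assuming the product doubles every cycle."""
    return 2.0 ** half_width_ct


# Hypothetical CT values for a target gene and two reference genes.
normalized = delta_ct(27.0, [22.4, 25.0])
print(f"delta CT vs two-gene mean: {normalized:.2f}")

# Half-widths reported for CYPA+TFRC and for ACTB alone.
print(f"CYPA+TFRC: within about {fold_window(0.54):.2f}-fold")   # about 1.45
print(f"ACTB alone: within about {fold_window(2.33):.2f}-fold")  # about 5.03
```

Read this way, the two-gene factor tracks the 10-gene average to within roughly 1.5-fold, whereas ACTB alone can deviate by about 5-fold.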

Table 6 Ninety-five percent tolerance intervals (TI) on the difference between a single gene or the average of two genes and the average of all 10 housekeeping genes for unpaired samples

For paired data, we formed 95% tolerance intervals around the ΔΔCTs (one LCM replicate and three qPCR replicates; Table 7). In contrast to unpaired data, only one gene is required for normalization, with PGK1 and RLP tracking the average of the 10 housekeeping genes best (±0.83 CTs and ±0.93 CTs of the average CT value across all 10 housekeeping genes 95% of the time, respectively).

Table 7 Ninety-five percent Tolerance Intervals (TI) on the difference between a single gene or the average of two genes and the average of all 10 housekeeping genes for paired samples

For both unpaired and paired data, the effects of using three LCM replicates vs one dissection, and the effects of using three qPCR replicates as opposed to just one qPCR for each sample on the variation in the average CT value across the 10 housekeeping genes were evaluated. The three potential sources of variation were decomposed into components attributable to (1) sample-to-sample variability, (2) LCM replicate intra-case variability, and (3) qPCR replicate variability for each dissection. Using tumor tissue (unpaired case), the three components were estimated as 2.5, 0.56, and 0.01, respectively. These data indicate that the majority of variance is from true biological differences among the samples, and that the qPCR measurements within each dissection are highly stable. However, the intra-case variation among LCM samples is not trivial, suggesting there are advantages in performing multiple dissections from each case in a study. Results were similar for the paired situation (data not shown).
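The benefit of replication can be sketched numerically from the three components. The example assumes that the reported values (2.5, 0.56, and 0.01) are variances on the CT scale, that the random effects are independent and normally distributed, and that averaging n replicates divides the corresponding variance by n; the function name and its parameters are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def tolerance_half_width(var_sample: float, var_lcm: float, var_qpcr: float,
                         n_lcm: int = 1, n_qpcr: int = 1,
                         across_samples: bool = True) -> float:
    """Half-width of a 95% tolerance interval for an averaged CT value.

    n_lcm dissections are averaged, each measured with n_qpcr replicates.
    The sample-to-sample variance is included only when comparing
    across samples.
    """
    variance = var_lcm / n_lcm + var_qpcr / (n_lcm * n_qpcr)
    if across_samples:
        variance += var_sample
    return Z_95 * math.sqrt(variance)


# Components reported for tumor tissue (assumed here to be variances).
VAR_SAMPLE, VAR_LCM, VAR_QPCR = 2.5, 0.56, 0.01

one_lcm = tolerance_half_width(VAR_SAMPLE, VAR_LCM, VAR_QPCR,
                               n_lcm=1, n_qpcr=3, across_samples=False)
three_lcm = tolerance_half_width(VAR_SAMPLE, VAR_LCM, VAR_QPCR,
                                 n_lcm=3, n_qpcr=3, across_samples=False)
print(f"within sample, 1 dissection:  +/- {one_lcm:.2f} CT")
print(f"within sample, 3 dissections: +/- {three_lcm:.2f} CT")
```

Because the qPCR component is so small, adding qPCR replicates changes the interval very little, while averaging three dissections shrinks the within-sample interval noticeably. The sample-to-sample component is unaffected by either form of replication.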

Analyzing the gene expression data using three separate normalization techniques revealed an interesting finding in one patient specimen in the study. Both the normal and tumor samples from case no. 5 appeared to contain approximately 100 times more RNA than the other specimens (Figure 5). This unexpected and anomalous result was supported by four separate experimental methods, and was observed consistently in multiple independent dissections. First, the initial NanoDrop readings from the 10 000 dissected normal and from the 10 000 dissected tumor cells showed significantly elevated RNA levels compared to the other four cases, in the range of 26 000 ng as opposed to 260 ng. Second, the Bioanalyzer chip that is routinely used to assess RNA quality in the samples showed an unusual pattern, an apparent large smear of RNA extending from the top to the bottom of the lane, with barely discernible 28S and 18S rRNA bands. This pattern is not consistent with degraded RNA, where a smear of small size fragments at the bottom of the lane is typically observed, along with diminished or absent 28S and 18S rRNA bands. Third, the same samples were analyzed on a standard agarose gel and showed the same pattern as on the Agilent Bioanalyzer. Finally, the qRT-PCR failed for each dissection of case no. 5, even though the protocol was identical to that used for the other eight dissected samples that reliably amplified all 10 housekeeping gene transcripts. However, after diluting the normal and tumor samples from case no. 5 100-fold, each gene was consistently amplified, and expression data similar to the other four cases in the study were obtained.

Figure 5

Total RNA concentrations per case sample (ie, average of three MD replicates) and tissue type (normal (N) or tumor (T)). Case no. 5 was dissected two times and total RNA from the second set of dissections was diluted 1:50 and 1:100. Average total RNA concentrations (ng/μl) per case tissue type: 1N=4.84, 1T=7.71; 2N=5.26, 2T=7.43; 3N=6.33, 3T=16.25; 4N=12.66, 4T=7.00; 5aN=1305.50, 5aT=1301.63; 5bN=1345.37, 5bT=1302.50; 5bN 1:50=31.27, 5bT 1:50=27.81; 5bN 1:100=13.71, 5bT 1:100=14.37. a=first set of three MD replicates. b=second set of three MD replicates. *=qRT-PCR did not work (no CT value) due to 100 times higher template concentration causing template inhibition of qRT-PCR.

Discussion

The present study examined the use of three separate strategies for normalizing qRT-PCR measurements of MD cell samples. The data show that neither LCM cell count nor total RNA is precise enough to serve as an internal standard for most studies, although as a practical matter both of these methods are useful in calibrating ‘ballpark’ mRNA amounts. However, endogenous housekeeping genes can serve as a surrogate marker of the cellular transcriptome and provide a precise internal control for expression measurements. Of interest is the finding that as few as one or two endogenous housekeeping genes can be used for MD material. With high probability, a combination of two housekeeping genes was within approximately 0.5 CT values of the average across all 10 genes. For paired normal and tumor tissue studies, a single housekeeping gene was within one CT value. Since MD samples typically produce limited amounts of mRNA, the requirement for only one or two genes for normalization will facilitate measurement of a larger number of genes of interest in future studies.

There are several implications of these data for expression analysis of dissected samples. If cell count is utilized as an internal standard, then one can calculate the number of transcripts on a per cell basis and this measurement is independent of the overall transcriptome. Biologically, this information may be uniquely important if the intent of a study is to determine how the absolute amount of mRNA is related to the normal or pathological process being investigated. In other words, if cell count is employed as the normalization standard, an increase in a gene of interest in one sample relative to another indicates that the number of transcripts per cell in that sample is elevated.

Alternatively, comparing gene expression levels using the cellular transcriptome (either RNA measurement or endogenous housekeeping genes as a surrogate) as the normalization standard provides different information. Since the denominator (RNA amount, transcriptome levels) in the ratio of ‘gene of interest to internal control’ is not necessarily stable across dissected samples, the absolute change in the amount of mRNA per cell cannot be calculated. Rather, the relative level of the gene to the amount of RNA is determined and compared among samples. This analysis can produce a different result than when using cell count. For example, if dissected sample A contains 50% more mRNA of a gene of interest than sample B, but also contains 50% more mRNA overall (including 50% more of the endogenous housekeeping genes), then one would conclude that the gene of interest showed no change between the samples when in fact it was elevated by 50% in absolute amount. On the basis of the results of the present study, however, it appears these issues are of less concern when using MD samples as only endogenous housekeeping genes are an effective normalization tool, and cell count does not provide enough precision to be useful for most studies.
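The 50% example in this paragraph can be worked through numerically. The sketch below assumes equal cell numbers in both samples and an idealized assay in which CT falls by one cycle for every doubling of template; the copy numbers, the intercept of 40 cycles, and the function names are hypothetical.

```python
import math


def ct_from_copies(copies: float, intercept: float = 40.0) -> float:
    """Idealized CT for a given template amount (one cycle per doubling)."""
    return intercept - math.log2(copies)


# Sample B is the baseline; sample A has 50% more of everything.
gene_b, housekeeping_b = 100.0, 1000.0
gene_a, housekeeping_a = 150.0, 1500.0

# Normalizing to a housekeeping gene (2^-ddCT calculation).
delta_ct_a = ct_from_copies(gene_a) - ct_from_copies(housekeeping_a)
delta_ct_b = ct_from_copies(gene_b) - ct_from_copies(housekeeping_b)
fold_by_housekeeping = 2.0 ** (-(delta_ct_a - delta_ct_b))

# Normalizing to cell count (same number of cells in each sample).
fold_by_cell_count = 2.0 ** (-(ct_from_copies(gene_a) - ct_from_copies(gene_b)))

print(f"relative to housekeeping gene: {fold_by_housekeeping:.2f}-fold")
print(f"relative to cell count:        {fold_by_cell_count:.2f}-fold")
```

The housekeeping-normalized ratio comes out at 1, hiding the change, while the per-cell ratio recovers the 1.5-fold increase in absolute amount.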

By employing tissue microdissection and multiple independent methods to normalize gene expression levels, we discovered a putatively unique RNA state in one case in the study. Both the normal and tumor cells from case no. 5 appeared to show approximately 100-fold more mRNA than the other eight dissected samples. This finding was verified by four experimental approaches and in multiple repeat dissections. Since all five specimens in the study were procured at the same institution in an identical manner, were received together at the NIH in the same shipment, and were processed side-by-side in our laboratory and received the same active DNase treatment, there is no obvious rationale for this difference. Histologically, the tumor from this unusual case showed an overall Gleason score of 5+4=9/10, and displayed a variety of aggressive patterns of growth, including solid sheets of cells, cords, focal neuroendocrine appearance, and areas of comedonecrosis, a variant that is classified as Gleason 5.31 In addition, highly aggressive basaloid carcinoma type areas were identified.

Case no. 5 highlights the unique information that is provided by each internal normalization method. Relying only on endogenous housekeeping genes, it is possible that this unusual finding would have been missed. Typically, we would have calibrated the input RNA using NanoDrop, and subsequent comparison of genes of interest to endogenous control genes would then have appeared within the normal range. In other words, the change in apparent absolute mRNA content would not have been readily observed. However, since LCM cell count was used to calibrate general input amounts at the outset of the study, it was immediately clear that case no. 5 was unusual and appeared to contain significantly elevated global mRNA levels on a per cell basis.

The cause of the anomalous mRNA results in case no. 5 is not yet known, nor is the relationship to the high-grade cancer observed in this specimen, if any. Analysis of the patient's history, family history, and clinical condition is underway and may reveal an important biological phenomenon. Alternatively, this finding may represent a previously unknown technical artifact associated with gene expression studies of MD patient samples, DNase-resistant mRNA for example, or another artifact related to tissue processing. If so, this problem will need to be uncovered and understood to eliminate this error in future expression studies of dissected samples. Our current interpretation of the data from case no. 5 is that the result is artifactual; however, to date attempts to uncover a technical, non-biological basis for this finding have not been successful.

In summary, three methods for normalizing gene expression data were evaluated in MD tissue samples: cell count during MD, total RNA, and endogenous housekeeping genes. The results indicate that housekeeping genes are the most useful internal standard and that they enable precise gene expression measurements of dissected cell samples.