Main

Over the past 5–10 years there have been a number of events that have caused concern related to quality assurance in laboratories performing diagnostic testing in oncology and the accuracy of these tests, which ultimately determine patient therapy.1 Our field has recognized the danger in the lack of standardization of processes between different laboratories, the variability inherent in the analytical phase and the lack of control over the preanalytical phase of tissue processing. Biospecimen science, the study of variables affecting biospecimen analysis, has identified a range of factors affecting the quality of harvested tissue.2, 3, 4, 5 These factors, grouped as preanalytical variables, represent a critical problem for biomarker measurement on tissue specimens that can lead to errors that affect patient care.

This issue is most prominent in breast cancer, where it has been shown that preanalytical variables, especially delayed time to formalin fixation (cold ischemic time), have affected companion diagnostic testing.6, 7, 8, 9 Overall, it has been estimated that delay to formalin fixation and subsequent protein degradation within the tissue are responsible for a 10–20% false-negative rate for ER in breast cancer.1, 9 A similar effect has been shown for HER2, with loss of protein and RNA as a function of increasing time to fixation of the tissue. These issues have prompted the American Society of Clinical Oncology (ASCO) and the College of American Pathologists (CAP) to publish guidelines for the evaluation of ER, PgR, and HER2 in breast cancer, limiting cold ischemic time to 60 min.10, 11

The issue of preanalytic variables is not limited to the companion diagnostics of breast cancer. Data have been published about the loss of antigenicity of phosphorylated proteins12, 13, 14, 15 and increased levels of hypoxia-induced factors, as well as markers of post-translational modification, as a consequence of delayed time to formalin fixation of harvested tissue.16 Also, RNA degradation and time-dependent modulation of gene expression have been observed as a cellular reaction to stress and suboptimal preanalytical processing conditions.15 Finally, the effects of preanalytical variables on harvested tissue are important not only for companion diagnostic testing and development of new drugs but also for retrospective studies using archived tissue collections, where documentation of tissue handling and processing is often minimal.

Therefore, it was our goal to develop and construct an intrinsic control for formalin-fixed, paraffin-embedded (FFPE) tissue that could be used to assess suitability of a given specimen of unknown quality for immunological assessments. Here we describe proof of concept for the development of a tissue quality index (TQI).

MATERIALS AND METHODS

Study Cohorts

This prospectively designed study included three different cohorts of breast tissue, all of which have exact information on preanalytical specimen handling and processing, with a focus on cold ischemic time. All tissue was used after approval from the Rochester Institutional Review Board, from the Yale Human Investigation Committee protocol no. 8219 or no. 25173, or from the ethics institutional review board for ‘Biobanking and use of human tissue for experimental studies’ of the Pathology Services of the Azienda Ospedaliera Città della Salute e della Scienza di Torino. Written informed consent was obtained from all patients for their tissue to be used in research.

Time to Fixation Breast Cancer Series

Formalin-fixed, paraffin-embedded (FFPE) tissues of 93 breast cancer patients, who underwent surgery at the University of Rochester, School of Medicine (Rochester, NY, USA), were collected. Time from surgical resection to immersion of the specimen in formalin was recorded and ranged from 25 to 415 min. A tissue microarray (TMA) was constructed, consisting of these 93 breast cancer specimens, cell lines, and controls, all represented in twofold redundancy (two histospots per patient specimen).

Normal Breast Tissue Series

Normal breast tissues (NBTs) from 11 breast reduction mammoplasties were collected. Tissue from each patient was divided into different parts and exposed to predefined and controlled preanalytical variables, with time points until immersion of the tissue into formalin at 20 min and 2, 24 and 48 h after surgical removal of the tissue. A TMA was constructed representing these specimens with variable conditions in fourfold redundancy.

Italian Breast Cancer Series

Tissues from 100 breast cancer cases were collected and time until formalin fixation of these tissues was recorded, ranging from 1 to 72 h. This cohort differs from the time to fixation breast cancer (TFBC) series in that tissues were vacuum sealed and stored at 4 °C (UVSC) immediately after surgical removal until gross dissection and immersion of the tissue into formalin.17 TMAs were constructed consisting of the 100 breast cancer specimens represented in twofold redundancy.

Cell lines and cell culture

The cell lines T47D, BT474, SKBR3, MB231, MB468, CHO, A431, HT29, A59-195, and A82-68-B were purchased from ATCC (Manassas, VA, USA), cultured in our lab, and used to create control cell line cell blocks for TMA standardization. Culture conditions and construction of cell pellets for TMAs have been described previously.18

TMA construction

The TMAs for the three breast tissue series were constructed as described previously.18 Representative areas from the FFPE breast tissue were placed in a recipient block using 0.6 mm cores for the TFBC and the NBT sets and using 1 mm cores for the Italian Breast Cancer (IBC) series.

Antibodies, immunofluorescent staining, and quantitative analysis using the AQUA method

These methods are all described in our previous work.9

Statistical analysis

TQI design: The goal of this procedure was to identify pairs of markers such that the difference between their AQUA scores is indicative of the time to fixation of the corresponding tissue. To simplify the design, we divided the times to fixation into two groups, according to whether the time was longer or shorter than 60 min, which was roughly the median time to fixation in our data set. We divided our data set into three groups, two of which were used for training and one of which was used as the test set. We used the training set to construct and screen prediction rules for the time to fixation. We then selected the six prediction rules with the best training performance and computed their test performance. Using the training set, for each pair of markers $j$ and $k$ we computed a screening score based on the differences of AQUA scores for the two markers across all patients as follows:

$$S_{jk} = \sum_{i=1}^{N} \tanh\left[(x_{ij} - x_{ik})\,\log_2(t_i/60)\right]$$

where $N$ is the number of samples (patients) in the training set, $x_{ij}$ and $x_{ik}$ are the AQUA scores of markers $j$ and $k$ for sample (patient) $i$, and $t_i$ is the time to fixation in minutes for sample (patient) $i$. The difference is weighted by the term $\log_2(t_i/60)$, which was used to favor pairs of markers whose difference had a different sign depending on whether the time to fixation was longer or shorter than 60 min. The formulation as $\log_2(t_i/60)$ was chosen to reduce contributions to the score from samples with time to fixation around 60 min, and thus reduced the effects of time measurement precision on defining whether a time to fixation was longer or shorter than 60 min. The logarithm was introduced to reflect the notion of exponential decay in marker half-lives, as well as the desired change of sign in the difference of AQUA scores in samples with time to fixation larger than the reference time of 60 min. The hyperbolic tangent represents a smooth threshold to limit spurious outliers. For each pair of markers, $j$ and $k$, we derived a prediction rule $R_{jk}$. The rule $R_{jk}$ predicts that the time to fixation is larger than 60 min when the AQUA score of marker $j$ is larger than the AQUA score of marker $k$. For each prediction rule $R_{jk}$, we computed its performance using the area under the curve (AUC) of a receiver operating characteristic (ROC) curve on the training set. The AUC measured the ability of the difference between the AQUA scores of two chosen markers to classify correctly whether a time to fixation was longer or shorter than 60 min. Finally, we determined the test performance of the six rules with the best training performance by computing their AUC on the test data. The chosen design addresses the inherent variability of biological samples by comparing two markers at a time. Our procedure favored pairs of markers such that the AQUA score of one increases with time to fixation, while the other decreases. For further validation of the TQI performance, linear regressions were computed for log2-transformed time to fixation and the TQI components. 95% CIs were calculated using the bootstrapping approach (n=5000). Statistical analyses were performed using R (http://www.r-project.org/).
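The screening step described above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R). The function names, the toy AQUA scores, and the use of a plain sum without normalization are assumptions made for illustration only. The AUC is computed in its rank-based (Mann-Whitney) form for the difference of the two markers.

```python
import math

def screening_score(x_j, x_k, times_min, ref_min=60.0):
    """Sum over samples of tanh[(x_ij - x_ik) * log2(t_i / ref_min)].

    Assumption: plain sum, no 1/N normalization.
    """
    return sum(
        math.tanh((a - b) * math.log2(t / ref_min))
        for a, b, t in zip(x_j, x_k, times_min)
    )

def pair_auc(x_j, x_k, times_min, ref_min=60.0):
    """AUC of the difference x_j - x_k for classifying t > ref_min.

    Rank-based form: fraction of (late, early) sample pairs in which the
    late sample has the larger difference; ties count as one half.
    """
    diffs = [a - b for a, b in zip(x_j, x_k)]
    late = [d for d, t in zip(diffs, times_min) if t > ref_min]
    early = [d for d, t in zip(diffs, times_min) if t <= ref_min]
    if not late or not early:
        raise ValueError("need samples on both sides of the reference time")
    wins = sum(
        1.0 if l > e else 0.5 if l == e else 0.0
        for l in late
        for e in early
    )
    return wins / (len(late) * len(early))

# Hypothetical toy data, not taken from the study.
times = [30, 45, 90, 240]            # minutes to fixation
marker_j = [1.0, 1.2, 2.0, 3.0]      # rises with delay
marker_k = [2.0, 2.0, 1.5, 1.0]      # falls with delay

print(screening_score(marker_j, marker_k, times))
print(pair_auc(marker_j, marker_k, times))
```

A sample fixed at exactly 60 min contributes nothing to the score, because log2(60/60) is zero. This is the intended damping of samples near the reference time.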

RESULTS

Quantification of Protein Expression According to Delayed Time to Formalin Fixation

To construct the TQI, the antibodies described in Table 1 were all tested on the TFBC series and their expression levels and possible changes as a function of preanalytical variables were measured. These results have been described previously.16 Briefly, housekeeping genes do not lose antigenicity with increasing time to formalin fixation, while proteins of hypoxia and some post-translational modifications significantly increase in expression. Other biomarkers, such as phosphotyrosine or phospho-Erk1/2 are more labile and show significant degradation with increasing time to fixation.

Table 1 Antibodies tested for the TQI

The TQI as an Intrinsic Control to Assess Delay to Formalin Fixation

The TQI was constructed using a subset of the TFBC series and its performance measured on a validation subset of the same array. Construction and performance of the TQI are described in Figure 1. The best candidate TQIs consisted of five pairwise combinations of proteins. The AQUA scores of all proteins were log2 transformed and subtracted from each other, with phosphotyrosine:phospho-HSP27 (pHSP27) performing best, the AUC value being 0.75, followed by cytokeratin:pHSP27 and ERK1/2:pHSP27 with training set AUC values between 0.6 and 0.7 (Figure 1a) and somewhat lower values on the testing set. The predictive value of the TQI regarding delayed time to fixation was then assessed on the complete TFBC series (Figures 1b and c). A negative TQI value means that the log2-normalized AQUA scores of pHSP27 are higher than the log2-normalized AQUA scores of phosphotyrosine, cytokeratin, or ERK1/2. This suggests that the tissue may have suffered loss of antigenicity caused by delay in formalin fixation or by other less well-defined preanalytic variables. The number of specimens with negative TQI values appears to be associated with prolonged cold ischemic time on the TFBC series.
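As a worked illustration of the sign convention, the following Python sketch computes a TQI value as the difference of log2-transformed AQUA scores, with pHSP27 subtracted from the partner marker. The function names and the example scores are hypothetical and are not taken from the study data.

```python
import math

def tqi(marker_aqua, phsp27_aqua):
    """TQI = log2(marker AQUA score) - log2(pHSP27 AQUA score)."""
    if marker_aqua <= 0 or phsp27_aqua <= 0:
        raise ValueError("AQUA scores must be positive for a log2 transform")
    return math.log2(marker_aqua) - math.log2(phsp27_aqua)

def is_possibly_compromised(marker_aqua, phsp27_aqua):
    """A negative TQI means pHSP27 exceeds the partner marker."""
    return tqi(marker_aqua, phsp27_aqua) < 0

# Hypothetical AQUA scores for cytokeratin and pHSP27 in two specimens.
print(tqi(800.0, 200.0))                       # cytokeratin above pHSP27
print(is_possibly_compromised(200.0, 800.0))   # pHSP27 above cytokeratin
```

Because both scores are log2 transformed, a TQI of −1 corresponds to a pHSP27 score twice that of the partner marker.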

Figure 1

(a) The performance of six marker combinations on the testing and validation subgroup of the time to fixation breast cancer series as measured by receiver-operator characteristic (ROC) curves and area under the curve (AUC) values. The tissue quality index (TQI) was then calculated on the complete time to fixation breast cancer series. (b) TQI values of cytokeratin:pHSP27 and (c) ERK1/2:pHSP27 in relationship with increasing cold ischemic time.

Validation of the TQI on Two Independent Breast Tissue Validation Cohorts

We then validated the TQI and its performance on two independent cohorts of breast tissue, the NBT series and the IBC series, both of which consist of breast tissue with recorded cold ischemic time. Owing to the lack of lot-to-lot reproducibility of all tested phosphotyrosine antibody cocktails, and/or high levels of heterogeneity of the proteins targeted by this antibody, we were not able to reproduce our original results for phosphotyrosine and therefore omitted this epitope from the TQI. The other biomarkers, pHSP27, cytokeratin, and ERK1/2 showed robust reproducibility and were used in the validation of the TQI.

Before assessing the validation cohorts, we tested the TQI on a unique build (different TMA master block) of the TFBC series, which had initially served as the train/test cohort. We were able to show reproducibility of our original result, suggesting that this TQI is independent of biomarker heterogeneity seen for cytokeratin, ERK1/2, and pHSP27. The performance of the TQI was also validated and measured on this new build of the TFBC series. Linear regressions between increasing time to fixation and the differences of AQUA scores of the TQI markers were computed. Larger differences of the TQI markers correspond with shorter time to fixation measures for a time window of 30 to 120 min (Figure 2a).
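The regression of TQI values on log2-transformed time to fixation, with a bootstrap confidence interval, could be sketched as follows. This is an illustrative Python version under stated assumptions: a percentile bootstrap on the slope with case resampling, a fixed random seed, and invented toy data. The study itself used R, and the exact bootstrap variant is not specified in the text.

```python
import math
import random

def ols(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def bootstrap_slope_ci(times_min, tqi_values, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the slope of TQI against log2(time)."""
    rng = random.Random(seed)
    x = [math.log2(t) for t in times_min]
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):                       # degenerate resample
            continue
        ys = [tqi_values[i] for i in idx]
        slopes.append(ols(xs, ys)[0])
    if not slopes:
        raise ValueError("no usable bootstrap resamples")
    slopes.sort()
    lower = slopes[int((alpha / 2) * len(slopes))]
    upper = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lower, upper

# Hypothetical toy data, not taken from the study.
times = [30, 45, 60, 90, 120, 240, 415]            # minutes to fixation
tqi_vals = [1.2, 0.9, 0.6, 0.1, -0.2, -0.9, -1.5]

slope, intercept = ols([math.log2(t) for t in times], tqi_vals)
print(slope, intercept)
print(bootstrap_slope_ci(times, tqi_vals))
```

A negative slope with an interval that excludes zero would correspond to the pattern reported here, in which larger TQI differences go with shorter times to fixation.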

Figure 2

Validation and performance of the tissue quality index (TQI) on the time to fixation breast cancer (TFBC), the normal breast tissue (NBT), and Italian breast cancer (IBC) series. (a) Linear regressions between increasing time to fixation (log2 transformed) and the differences of AQUA scores of the TQI markers as performance measurement on an independent build of the TFBC. Higher values correspond with shorter time to fixation. The dotted line shows the 95% confidence interval (CI) of the regression line. (b) TQI pairs on each time-point of the NBT series. (c) Chi-squared analysis of the different TQI components. Although each marker combination by itself shows a significant correlation to increasing cold ischemic time, the combination of the separate TQI components facilitates the identification of a larger number of samples, which may have compromised tissue quality. (d) The TQI performance on the IBC series, where special fixation conditions appear to result in significantly less epitope degradation.

This proof of principle was followed by validation on the NBT series and the IBC series. Negative TQI values are correlated with increasing delay to formalin fixation on the NBT series, indicating potential for detection of preanalytic epitope degradation (Figure 2b). Chi-squared analysis of each TQI and the combination of both shows a statistically significant association with time to fixation in this series (Figure 2c). The performance of the TQI on build 2 of the TFBC series and the NBT series was also assessed by calculation of the AUC of an ROC curve and showed values of 0.6–0.7 for these marker combinations, recapitulating our original result (not shown). The IBC series consists of patient samples that were vacuum sealed and stored at 4 °C (UVSC) to prevent degradation. Although this is not the current standard of care, we used this cohort because this method represents an alternative approach to diminishing the effects of cold ischemic time. The quantification of pHSP27, cytokeratin, and ERK1/2 and calculation of the TQI revealed mainly positive TQI scores (147 out of 169 readable samples have a positive TQI; the series consists of 100 patients, represented in twofold redundancy on the TMAs), suggesting better preservation of the tissue through vacuum sealing and storage at 4 °C (Figure 2d).
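The chi-squared test of association between TQI sign and time to fixation can be sketched for a 2×2 table as follows. The counts in the example table are invented for illustration and are not the study data. The choice of a Pearson statistic without continuity correction is also an assumption, since the text does not state which variant was used. The only figure taken from the text is the 147 of 169 positive readings in the IBC series.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-squared with 1 dof: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts: rows = short vs long delay, columns = TQI >= 0 vs TQI < 0.
stat, p = chi2_2x2(20, 5, 5, 20)
print(stat, p)

# Share of positive TQI readings reported for the IBC series (147 of 169).
print(round(147 / 169, 3))
```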

TQI Value and Quantitative ER Expression in Breast Tissue

Recent publications have shown that there is no loss of ER expression for tissue fixed within 1 h of cold ischemic time.16 However, loss of ER antigenicity has been reported for longer cold ischemic times.6, 9 Here we determine whether the TQI can indicate loss of ER reactivity. Specifically, we hypothesize that cases with a negative TQI should have lower ER scores. ER was measured by QIF with the clone SP1 on all three breast tissue series. We showed that negative TQI values are significantly correlated with lower ER AQUA scores on the NBT series and the IBC series (Figures 3a and b), with P-values of 0.03. The correlation of TQI values and ER AQUA scores on build 2 of the TFBC series trends toward significance (P=0.067), but the majority of samples in this series were formalin fixed within 2 h. The expression levels of the proteins included in this TQI (pHSP27, cytokeratin, and ERK1/2) do not show any correlation with ER expression as tested by Pearson’s correlation coefficient of each protein separately (data not shown). Thus, the association of the TQI and ER AQUA scores appears to be a function of tissue quality and not a coincidental correlation of the TQI proteins and ER expression.

Figure 3

Measurement of the tissue quality index (TQI) performance as a function of estrogen receptor (ER) expression levels quantified by AQUA. Negative TQI values are significantly associated with lower ER AQUA scores on the normal breast tissue (NBT) series and the Italian breast cancer (IBC) series (a and b), while the correlation between TQI values and ER expression does not reach significance on build 2 of the time to fixation breast cancer (TFBC) series.

DISCUSSION

A clinical mishap in Canada and a series of papers focusing on preanalytic variables have illustrated the need for a mechanism for tissue quality assessment.2, 5, 19 In response to these reports and new ER, PR, and HER2 guideline recommendations from the ASCO/CAP panel, efforts have been made to control preanalytical variables, especially cold ischemic time, to standardize companion diagnostic testing.10, 11 However, these standards cannot always be met and clinicians and researchers are often confronted with the challenge of accurate molecular characterization of tissue samples of unknown quality. Here we describe for the first time the construction of an internally calibrated tool consisting of three epitopes and their relative changes to assess the effects of cold ischemic time and the suitability of a given tissue for further immunological assessments.

Although substantial efforts in our field have improved standardization and documentation of tissue harvesting and companion diagnostic testing, controlling and minimizing preanalytical variables on tissue is still problematic, and it is not hard to imagine a situation in which a TQI is needed. If, for example, a breast cancer specimen of unknown quality tests negative for ER, one could envision assessment of the TQI to discount false-negative results caused by protein degradation. The TQI could also be applied to all specimens received for a clinical trial where tissue is collected from a wide range of sites around the world with less standardized laboratory settings. In this context, the TQI could show that the tissue had the capacity to inform the companion diagnostic test, or it could suggest elimination of that data point in the subsequent biomarker analysis owing to poor tissue quality. Finally, the TQI could be used in retrospective studies based on archived tissue collections where information on postsurgical tissue processing is not available. In each case, testing the quality of the tissue could provide better accuracy, reproducibility, and applicability for research based on in situ biomarker evaluation.

Although the antibodies used for construction of the TQI were validated and the results of the TQI and its relationship to prolonged cold ischemic time and ER expression levels were reproducible on different breast tissue series, our work should be considered as pilot data and a proof of concept, rather than a definitive TQI test. This first study is subject to a number of limitations. Perhaps the most significant is the low sensitivity and specificity for prediction. The performance of the two marker combinations, cytokeratin:pHSP27 and ERK1/2:pHSP27, as measured by AUC value, ranges from 0.6 to 0.7, showing that the assay is accurate only between two-thirds and three-fourths of the time. Even though the TQI value is significantly correlated with increasing cold ischemic time, a performance of 0.6–0.7 AUC value suggests that several specimens in these breast tissue series are misclassified with respect to their potential loss of antigenicity.

A second major limitation of this TQI test is that it was constructed using only breast cancer tissues. Although it was validated on two independent tissue sets, the IBC series was subject to different preanalytical variables (UVSC conditions),17 resulting in better preservation of biomarker expression as compared with the other two breast tissue series. In the future, we envision applicability of TQIs in many other tissue types.

Finally, any TQI will always be limited by the variability in the epitope degradation rate between different epitopes. Here, we focused on ER, where we observed a significant association of lower ER expression levels with negative TQI values in two out of the three breast cohorts used for this study. However, the TQI might be better or worse if other epitopes were assessed. It is possible that to be highly accurate in assessment of tissue quality, a unique index may be required for certain classes of proteins or even for individual proteins. For example, phosphoepitopes on tyrosine appear to be highly labile and might require a different TQI than more stable structural proteins like tubulin or actin.

The variable rate of degradation of epitopes raises the question of the time window for the TQI. It has been previously shown that the bulk of the degradation of ER does not occur within the first few hours of delay to formalin fixation but rather at a later time window.6, 8, 9 This observation may explain why we did not observe a significant loss of ER AQUA scores for patients with a negative TQI on the TFBC series where most of the cases were fixed within 2 h. In comparison, in the longer NBT and IBC sets, where delay to fixation was stretched to 48 h or more, the TQI performed better. In the future one can envision different TQIs constructed for different time windows, depending on the specific projected application for the TQI.

In summary, for the first time, we report the construction of a TQI that serves as an intrinsic control of tissue quality. Although this work is preliminary and further optimization of this TQI is necessary to improve its performance and applicability to multiple tissue types, it represents a proof of concept for the potential for quantitative quality assessment of tissue specimens. We are hopeful that this approach to quantitative measurement of three or four biomarkers and their relative changes/relationships to each other can provide a tool to monitor effects of preanalytical variables on tissue specimens. In the future, we believe we will see more sensitive and specific tools for assessment of tissue quality.