Main

Assessment of hormone receptors to predict clinical outcome in breast cancer has been an accepted standard for nearly two decades. The role of estrogen receptor (ER) as a prognostic and predictive factor has been well established. The term prognostic factor is used to define any measurement available at the time of diagnosis or surgery that is associated with clinical outcome in the absence of systemic adjuvant therapy. On the other hand, the term predictive factor defines any measurement associated with response or lack of response to a particular therapy. The primary reason to assess ER in breast cancer is as a predictive factor for response to endocrine therapy. However, not all patients with ER-positive breast cancers derive benefit from this form of therapy. Therefore, additional markers for response to endocrine therapy have been sought. Since progesterone receptor (PR) expression is induced by ER, it has been studied as a surrogate marker for ER activity and has been used as an additional predictive factor for hormonal therapy in breast cancer. The results of overview analyses of randomized clinical trials in early breast cancer have shown that PR may add to the power of ER for predicting response to endocrine therapy.1 PR also predicts response to endocrine therapy in metastatic breast cancer.2 Like ER, PR has been measured in breast cancer patients as a ‘standard’ biomarker for a long time, initially by ligand-binding assay and for over a decade by immunohistochemistry. Panels of experts for both the College of American Pathologists and the American Society of Clinical Oncology have recommended that both ER and PR must be measured in all primary breast cancers to select patients for therapeutic and adjuvant hormonal treatment.3, 4 All of the studies reviewed by these panels to support these recommendations assessed PR by biochemical ligand-binding assay, which had been the gold standard to measure this biomarker in routine clinical practice.

However, most laboratories in the United States and throughout the world have switched to assessing ER and PR by immunohistochemistry on archival tissue since the mid 1990s, and no longer perform ligand-binding assay due to the potential advantages of immunohistochemistry (eg relatively low cost, time efficiency, and the ability to measure proteins directly on tumor cells).5 Unfortunately, most laboratories are using immunohistochemistry assays that do not meet the guidelines for technical and clinical validation, which are strongly recommended for routine clinical practice.6, 7, 8 Technical validation means that the assay used to measure a factor is sensitive, specific, reproducible, and interpreted in a relatively uniform manner between laboratories. Clinical validation means that the factor has been shown in studies, ideally randomized clinical trials, to identify subsets of patients with significantly different clinical outcome or response to treatment. A useful factor is one that is validated and used by physicians in daily practice to make treatment decisions. Panels of experts at the College of American Pathologists, the American Society of Clinical Oncology, and the National Institutes of Health had, in their last consensus statements, several reservations about assessing ER and PR by immunohistochemistry and noted a lack of standardized and validated assays.3, 4, 9

Although a few validated immunohistochemistry methodologies have been published for ER,10, 11, 12, 13, 14 none exist for PR. The few published studies that have assessed PR by immunohistochemistry consisted of patients of mixed clinical stages and variable treatment, making it impossible to separate the prognostic effects from the predictive value of PR. In addition, these studies used many different antibodies of varying sensitivity and specificity, a variety of other reagents, different scoring systems, and often arbitrary definitions of PR cut-offs to define positivity, making it extremely difficult to compare results.

The purpose of the present study was to develop and validate an immunohistochemical assay for PR in breast cancer, as we did previously for ER.13 We previously validated an immunohistochemical assay for PR on frozen sections using antibody KD 68.15 However, frozen section immunohistochemistry is not relevant today and the KD 68 antibody is no longer commercially available, prompting us to repeat the study on formalin-fixed, archival tissue sections with an available antibody. In this study, we developed an assay for PR by immunohistochemistry in archival breast cancer samples using antibody 1294, compared it to PR assessed by ligand-binding assay in the same samples, and determined its prognostic and predictive abilities in a clinical trial.

Materials and methods

Patient Characteristics

This study was conducted with approval of local Institutional Review Boards and accomplished in two phases following published guidelines for validation of prognostic and predictive factors.6, 7, 8 The first set of samples was designated as the ‘test set’. This set of samples served to define the clinical utility of immunohistochemical assay, and establish the cutoff to define PR-positive and PR-negative in the subset of breast cancer patients who received endocrine therapy only. The ‘test set’ consisted of 1235 cases of primary breast cancers in which PR measurements by ligand-binding assay were available, thus allowing comparison of the two methods to assess PR. The patient characteristics of this cohort, which is a subset of our large tumor bank used in our previous studies of ER13 and PR on frozen sections,15 have been described in detail in earlier studies. Their characteristics are summarized in Table 1.

Table 1 Characteristics of patients and breast cancers in test and validation sets

The second set, designated the ‘validation set’, was used to confirm the clinical utility of the immunohistochemical PR assay. It consisted of 423 premenopausal Vietnamese and Chinese women with stage II and III operable breast cancer treated with mastectomy (Table 1). These patients were a subset of 709 patients who participated in a randomized clinical trial of bilateral oophorectomy plus adjuvant tamoxifen for 5 years vs no adjuvant treatment, and whose clinical characteristics and location of participating centers were detailed in our previous report.14 None of these patients received any adjuvant cytotoxic chemotherapy. Archival paraffin blocks were available from 468 (66%) of 709 patients. Prior to performing the immunohistochemistry assay, these blocks were evaluated for overall quality, blinded to therapy or outcome. After this quality control, 423 cases were selected for this study. The remaining 45 cases were classified as ‘unsatisfactory’, which was usually due to inadequate number of tumor cells and/or very poor fixation. Patient and tumor characteristics of these patients were compared to the entire trial population and no statistically significant differences were found (data not shown), suggesting that this selection process did not introduce significant biases. A total of 213 patients were randomized to the treatment arm and 210 to the control arm, and their characteristics are summarized in Table 1.

Biochemical Assay for PR

PR levels were measured by the standard dextran-coated charcoal method as previously described16 and validated using level ≥5 fmol/mg protein as PR-positive.17

Immunohistochemical Assay for PR

As the initial step, we compared several PR antibodies: KD 68 (Abbott Diagnostics, IL, USA), 1A6 (DakoCytomation, CA, USA), Ab-8 (Neomarkers, CA, USA), 636 (DakoCytomation, CA, USA), and 1294 (DakoCytomation, CA, USA). We determined how well they performed on paraffin sections, using a battery of antigen retrieval conditions and optimizing sensitivity by adjusting the concentration of the primary antibody in order to detect a wide range of PR expression (from negative to highly positive). We then compared these antibodies in a set of 233 breast cancer samples in which PR status was already known by ligand-binding assay. Antibody 1294 showed the best range for detecting PR expression and had the highest concordance (82%) with PR by ligand-binding assay, and was therefore selected for this study (data not shown). Recently, a comprehensive study of several PR antibodies by Press et al18 has also shown the superiority of the 1294 antibody.

Immunohistochemical Staining in the Test and Validation Sets

For the test set, tissue sections were prepared from pulverized frozen tumor specimens left over from the ligand-binding assay as previously described,19 with minor modifications. Owing to the ultracold temperature used during pulverization, the tissue was fractured into histologically intact fragments ranging from approximately 0.1 to 1.0 mm in size. Individual samples consisted of 100 mg pellets of this particulate tissue, which was fixed for 8 h in 10% neutral buffered formalin and routinely processed to paraffin blocks. These uniformly prepared tissue samples have been used to validate many prognostic and predictive factors in breast cancer including ER13 and others.20, 21 For the validation set, samples were fixed in 10% neutral buffered formalin for 6–24 h in most cases, processed, and embedded in paraffin at different centers in Vietnam and China.

The immunohistochemistry assay was performed on 4 μm sections cut from the blocks and float-mounted on plus-coated glass slides (Fisher Scientific, TX, USA). The essential steps of immunohistochemistry assay included antigen retrieval in 0.9 M Tris-HCl buffer (pH 9.0) in a pressure cooker for 10 min; blocking endogenous peroxidase with 3% hydrogen peroxide; blocking nonspecific protein binding with an avidin–biotin blocking kit (Vector, CA, USA); incubating with primary mouse monoclonal antibody 1294 (DakoCytomation, CA, USA) at a dilution of 1:1600 for 1 h at room temperature; linking with biotinylated rabbit anti-mouse secondary antibody (DakoCytomation, CA, USA) at a dilution of 1:200 for 30 min; enzyme labeling with freshly prepared horseradish peroxidase-labeled streptavidin (DakoCytomation, CA, USA) at a dilution of 1:200 for 30 min; developing chromogen DAB+ solution (DakoCytomation, CA, USA) for 15 min; enhancing the chromogen with 0.2% osmium tetroxide; and lightly counterstaining the sections with 0.05% methyl green (Fisher Scientific, TX, USA). Human endocervix was used as a positive control because of its easy availability and relatively stable reactivity. The negative control consisted of nonimmune mouse IgG substituted for primary antibody. Controls were run with each batch of slides, at an average of 40 slides per batch. The method produced a distinct nuclear signal in PR-positive tumor cells (Figure 1). Complete protocols and reagent list can be found on our laboratory website (http://www.breastcenter.tmc.edu/cores/path).

Figure 1

Progesterone receptor immunostaining on paraffin-embedded invasive breast cancer, using antibody 1294. Inset: Nuclear staining in endocervical gland and stroma used as positive control (× 400).

Interpretation of Immunostained Sections (Scoring System)

Immunostained slides were evaluated by light microscopy and the immunohistochemistry signal was scored using the so-called ‘Allred Score’, as in our previous study assessing ER by immunohistochemistry13 as well as several studies of other biomarkers.12, 20, 22, 23, 24, 25 Briefly, a proportion score was assigned representing the estimated proportion of positive staining tumor cells (0=none; 1=<1/100; 2=1/100 to <1/10; 3=1/10 to <1/3; 4=1/3–2/3; 5=>2/3). Average estimated intensity of staining in positive cells was assigned an intensity score (0=none; 1=weak; 2=intermediate; 3=strong). Proportion score and intensity score were added to obtain a total score that ranged from 0 to 8. This system is easy to learn and highly reproducible.13 Slides were scored by one pathologist (SKM) who did not have knowledge of ligand-binding assay results or patient outcome.
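The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the laboratory's software: the function names are hypothetical, the proportion of positive cells is assumed to be supplied as a fraction between 0 and 1, and the handling of inconsistent inputs (for example, a positive proportion with zero intensity) is our own assumption.

```python
def proportion_score(fraction_positive: float) -> int:
    """Map the estimated fraction of positive tumor cells (0.0-1.0) to a proportion score of 0-5."""
    if not 0.0 <= fraction_positive <= 1.0:
        raise ValueError("fraction_positive must be between 0 and 1")
    if fraction_positive == 0:
        return 0          # none
    if fraction_positive < 1 / 100:
        return 1          # <1/100
    if fraction_positive < 1 / 10:
        return 2          # 1/100 to <1/10
    if fraction_positive < 1 / 3:
        return 3          # 1/10 to <1/3
    if fraction_positive <= 2 / 3:
        return 4          # 1/3 to 2/3
    return 5              # >2/3


def allred_total_score(fraction_positive: float, intensity: int) -> int:
    """Total score = proportion score + intensity score (0=none, 1=weak, 2=intermediate, 3=strong)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    ps = proportion_score(fraction_positive)
    # Assumption: "no positive cells" and "no staining intensity" must occur together,
    # so the total score can only be 0 or 2-8 (never 1).
    if (ps == 0) != (intensity == 0):
        raise ValueError("proportion and intensity must both be zero or both be positive")
    return ps + intensity


if __name__ == "__main__":
    # A tumor with about 5% weakly stained cells: proportion score 2 + intensity score 1.
    print(allred_total_score(0.05, 1))
```

Because a proportion score of at least 1 always comes with an intensity score of at least 1, a total score of 1 cannot occur, which is why the possible totals are 0 and 2 through 8.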

Statistical Methods

Distributions of categorical variables were compared using standard χ2 tests. Time to event or censor date was calculated from the date of diagnosis for the test set and date of study entry for the validation set. The definition of a disease-free survival event differs slightly between the test set and the validation set. For the test set, an event was defined as the first occurrence of recurrence or metastasis. Deaths without recurrence were not counted as events in the test set, as described previously.13 On the other hand, in the validation set an event was defined as the first occurrence of recurrence, or death if it occurred before recurrence.14 For overall survival, an event was defined as death from any cause. Patients with no event were censored at the last follow-up date.

An optimal cut point for PR positivity based on total immunohistochemistry score was obtained by using the minimum P-value approach.26 Since there are seven possible cut points from the immunohistochemistry total score (range: 0, 2–8), a Bonferroni procedure was employed which multiplies the P-values by seven in order to arrive at an adjusted P-value.27 The corresponding hazard ratios were also adjusted following the methods described before.27 Univariate and multivariate analyses for disease-free and overall survival used the log-rank test and the Cox proportional hazards model, and estimates of hazard ratios and P-values were adjusted for multiple significance testing as described previously.13 There has been strong evidence of a nonproportional effect of ER as indicated by the loss of its predictive value over long periods of patient follow-up.28 For this study, tests of proportionality of PR on disease-free and overall survival over the follow-up period were performed using hypothesis tests of PR as a time-dependent variable in the Cox model, as well as examining plots of the beta coefficients of PR from the Cox model vs follow-up time.29 For this study population, evidence of nonproportionality of PR was not detected, and Cox models were therefore developed without adjustment for PR as a time-dependent variable. The validation set was analyzed in a similar manner, without incorporating P-value or hazard estimate adjustments. Although the validation set was not powered to detect an interaction effect, an interaction term between treatment group and PR positivity status was included in the Cox model in order to evaluate prognostic and/or predictive effects of PR in this randomized trial. All statistical tests were two-sided at the 5% level of significance, and were performed using SAS Version 8.0.
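The minimum P-value approach with a Bonferroni correction can be illustrated with the following Python sketch. It is not the SAS code used in this study: the two-group log-rank test is a hand-written textbook version, the function names are hypothetical, the demonstration data are synthetic, and the adjustment of hazard ratios is not covered.

```python
import math
from typing import Sequence, Tuple


def logrank_p(times: Sequence[float], events: Sequence[int], groups: Sequence[int]) -> float:
    """Two-group log-rank test; returns the P-value from a chi-square with 1 df.

    events: 1 = event observed, 0 = censored; groups: 1 = marker-positive, 0 = negative.
    """
    observed = expected = variance = 0.0
    for t in sorted({tm for tm, ev in zip(times, events) if ev}):
        n = sum(1 for tm in times if tm >= t)                      # at risk, both groups
        n1 = sum(g for tm, g in zip(times, groups) if tm >= t)     # at risk, positive group
        d = sum(1 for tm, ev in zip(times, events) if ev and tm == t)
        d1 = sum(g for tm, ev, g in zip(times, events, groups) if ev and tm == t)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df


def min_p_cutpoint(scores: Sequence[int], times: Sequence[float], events: Sequence[int],
                   thresholds: Sequence[int] = (0, 2, 3, 4, 5, 6, 7)) -> Tuple[int, float, float]:
    """Try every cut 'score > threshold'; return (best threshold, raw P, Bonferroni-adjusted P)."""
    best = None
    for c in thresholds:
        groups = [1 if s > c else 0 for s in scores]
        if 0 < sum(groups) < len(groups):          # both groups must be non-empty
            p = logrank_p(times, events, groups)
            if best is None or p < best[1]:
                best = (c, p)
    if best is None:
        raise ValueError("no threshold splits the data into two groups")
    cut, p = best
    return cut, p, min(1.0, p * len(thresholds))   # multiply by the number of candidate cuts


if __name__ == "__main__":
    # Synthetic data (not study data): scores 0 and 2 relapse early, scores of 3 or more late.
    scores = [2, 2, 0, 0, 3, 3, 4, 5, 6, 7, 8, 8]
    times = [1, 2, 3, 4, 20, 21, 10, 11, 12, 13, 14, 15]
    events = [1] * 12
    print(min_p_cutpoint(scores, times, events))
```

The seven default thresholds correspond to the seven ways of splitting the eight possible total scores (0, 2–8) into a lower and an upper group, which is where the multiplier of seven comes from.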

Results

Distribution of Immunohistochemistry Scores and Concordance with Ligand-Binding Assay in the Test Set

In the test set, 525 tumors (43%) showed no staining for PR (ie total score=0). Only a few cases showed staining at extreme ends of positive staining (ie 1.3% had a total score=2 and 2.6% had a total score=8). The rest of the immunohistochemistry scores were approximately uniformly distributed, ranging from 9 to 12% (Figure 2). We then did a head-to-head comparison of the immunohistochemistry and ligand-binding assay for concordance of results in 1219 cases in which PR ligand-binding assay was available. There was 86% concordance between the two methods: 583 cases (48%) were positive and 470 cases (38%) were negative by both methods. Among the 14% discordant cases, 8% were positive by immunohistochemistry only and 6% by ligand-binding assay only (Table 2). There was thus good agreement between the two methods, yielding a kappa statistic of 0.72.
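For readers who wish to see how the concordance and kappa figures relate to a 2×2 table, a minimal Python sketch follows. The concordant counts (583 and 470 of 1219) are taken from the text; the split of the 166 discordant cases into 97 immunohistochemistry-only and 69 ligand-binding-only positives is an assumption chosen to match the rounded 8% and 6%, since the exact counts appear only in Table 2. The function name is hypothetical.

```python
from typing import Tuple


def agreement_2x2(both_pos: int, only_a_pos: int, only_b_pos: int, both_neg: int) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two binary assays A and B."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = both_pos + only_a_pos            # positives by assay A
    b_pos = both_pos + only_b_pos            # positives by assay B
    # Agreement expected by chance from the marginal positive/negative rates.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    # 583 and 470 are from the text; 97 and 69 are ASSUMED discordant counts (see lead-in).
    concordance, kappa = agreement_2x2(both_pos=583, only_a_pos=97, only_b_pos=69, both_neg=470)
    print(f"concordance={concordance:.3f}, kappa={kappa:.3f}")
```

Under this assumed split the observed agreement is about 86% and kappa comes out at roughly 0.73, close to the reported 0.72; the small difference is expected given that the discordant counts were reconstructed from rounded percentages.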

Figure 2

Distribution of PR immunohistochemical score in the samples.

Table 2 Comparison of PR status between ligand-binding assay and immunohistochemistry in the test set

Determination of Cutoff Value to Define PR-Positive

The cutoff to segregate patients into clinically PR-positive vs clinically PR-negative was determined by univariate cutpoint analysis using both disease-free and overall survival as the end points in the test set. The analysis was performed in two groups of patients: those receiving no adjuvant therapy and those who received endocrine treatment only (Figure 3). Preliminary analyses of the association of PR immunohistochemistry using proportion score and intensity score on disease-free and overall survival in a Cox regression model indicated that neither proportion score nor intensity score alone was strongly associated with patient outcome. Therefore, cutpoint analysis was only performed for PR immunohistochemistry total score. The value of the total score that yielded the smallest log-rank P-value was chosen as the optimal cut point to define PR positivity. PR was a weak prognostic factor (overall survival only) for patients who received no adjuvant therapy. The best cutoffs and corresponding P-values for disease-free and overall survival were total score=4 (P=0.053, adjusted P=0.371) and total score=5 (P=0.003, adjusted P=0.021), respectively. Among patients who received adjuvant endocrine therapy, however, PR was a strong predictor of both disease-free and overall survival. The best cutoff for both end points was total score>2, with P-values 0.0003 (adjusted P=0.0021) and 0.0002 (adjusted P=0.0014), respectively (Figure 3). Therefore, in all subsequent analyses, we classified tumors as PR-positive if the total score by immunohistochemistry was greater than 2 and PR-negative if the total score was 0 or 2. Of interest, this was the same cutoff value established in our previous study of ER by immunohistochemistry.13
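As a quick arithmetic check, the adjusted P-values quoted above are each seven times the corresponding raw P-value, and the resulting classification rule is a simple threshold on the total score. The short Python sketch below reproduces both; the names are hypothetical and the dictionary only restates the values given in the text.

```python
N_CUTPOINTS = 7  # seven candidate cut points for total scores 0, 2-8

# (raw P, adjusted P) as reported in the text, keyed by patient subset and end point.
REPORTED = {
    ("no adjuvant therapy", "disease-free survival"): (0.053, 0.371),
    ("no adjuvant therapy", "overall survival"): (0.003, 0.021),
    ("endocrine therapy only", "disease-free survival"): (0.0003, 0.0021),
    ("endocrine therapy only", "overall survival"): (0.0002, 0.0014),
}


def bonferroni(p: float, m: int = N_CUTPOINTS) -> float:
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p * m)


def is_pr_positive(total_score: int) -> bool:
    """PR-positive if the total immunohistochemistry score is greater than 2."""
    if total_score not in (0, 2, 3, 4, 5, 6, 7, 8):
        raise ValueError("total score must be 0 or 2-8")
    return total_score > 2


if __name__ == "__main__":
    for (subset, end_point), (raw, adjusted) in REPORTED.items():
        print(subset, end_point, raw, round(bonferroni(raw), 4), adjusted)
```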

Figure 3

Determination of cutoff value to define PR-positive. These are Kaplan–Meier curves for disease-free survival (n=348) corresponding to each possible total score by immunohistochemistry. Note the separation between patients with total score of 0 and 2 vs ≥3. Tumors with a total score of >2 were defined as PR-positive.

Association of PR by Immunohistochemistry with Other Clinical and Pathological Variables

In the test set, PR positivity was significantly associated with node-negative disease (P=0.008), special histological tumor types (P=0.0042), smaller tumor size, postmenopausal status (age >50 years), diploid tumors, low S-phase, ER by ligand-binding assay and immunohistochemistry, PR by ligand-binding assay, as well as endocrine therapy and chemotherapy (all P-values <0.0001). Associations between PR as assessed by immunohistochemistry and other clinical and pathological variables in the validation set are displayed in Table 3.

Table 3 Association between PR and other clinical and pathological variables in the validation set

Comparison of Prognostic and Predictive Abilities of PR by Immunohistochemistry and Ligand-Binding Assay in the Test Set

Multivariate Cox regression models including nodal status, tumor size, and age were used to compare the abilities of PR by immunohistochemistry and ligand-binding assay to predict clinical outcome. The results are summarized in Table 4, showing that PR measured either by immunohistochemistry or ligand-binding assay was not a significant prognostic factor in untreated patients. However, PR was a strong predictive factor as seen in the subset of patients receiving endocrine therapy only (n=362) and those receiving both endocrine and cytotoxic chemotherapy (n=107). PR positivity by immunohistochemistry was an independent predictor of improved disease-free (HR=0.546, P=0.0034) and overall survival (HR=0.595, P=0.0040) in endocrine-treated patients. Since the cutpoint for PR positivity by immunohistochemistry was determined from the same patient subset, adjusted HRs and P-values are also presented in Table 4. Similar analyses of PR assessed by ligand-binding assay in the same patients showed a marginally significant ability to predict disease-free survival (HR=0.673, P=0.0534) and a significant ability to predict overall survival (HR=0.641, P=0.0124). The predictive ability of PR by immunohistochemistry was stronger than that of PR by ligand-binding assay, both in the group of patients who received endocrine therapy only and in those who received both endocrine therapy and chemotherapy (Table 4). In a multivariate analysis including both PR by immunohistochemistry and PR by ligand-binding assay, along with tumor size, nodal status, and age, PR by immunohistochemistry remained significantly associated with disease-free (HR=0.539, P=0.0029) and overall survival (HR=0.616, P=0.0078), while PR by ligand-binding assay did not remain significant in the model.

Table 4 Multivariate Cox regression models to compare prognostic and predictive ability of PR results by ligand-binding assay and immunohistochemistry in the test set

Prognostic and Predictive Ability of PR in the Validation Set

The study design in the validation set provided us the opportunity to confirm the reproducibility, usefulness, and applicability of this assay to an independent set of patients. Using Kaplan–Meier curves, in the control group, patients with PR-positive tumors showed 9 and 16% improvements in disease-free and overall survival at 5 years, with P-values of 0.058 and 0.004, respectively. In the treatment arm, patients with PR-positive tumors showed 9 and 13% improvement in disease-free and overall survival at 5 years due to endocrine therapy with P-values of 0.015 and 0.005, respectively (Figure 4).

Figure 4

Clinical outcome and PR in the validation set: Kaplan–Meier survival curves for disease-free and overall survival. In the control arm (n=210), there was 9 and 16% survival advantage at 5 years for disease-free and overall survival, respectively, as compared to 9 and 13% benefit in the treatment arm (n=213).

In Cox multivariate models of the entire patient population, containing nodal status, treatment, tumor size, histological grade, and Her-2/neu status, PR by immunohistochemistry remained an independent predictor of clinical outcome with P-values of 0.038 (hazard ratio 0.67) and 0.007 (hazard ratio 0.55) for disease-free and overall survival, respectively (Table 5). In the model that also included ER as a variable, PR did not remain significant, perhaps due to either imbalance in ER distribution in the two arms or ER and PR being highly correlated and thus not contributing independently (data not shown). A test of interaction between PR status and treatment was included in the overall model to determine if PR provides additional prognostic information with hormonal treatment. The estimate for the interaction term translated to an interaction hazard ratio of 1.13.30 For 80% power and a 5% significance level, the available sample size in the validation set could only detect an interaction hazard ratio of at least 2.8, as opposed to the observed 1.13 in our multivariate model,30 suggesting that this analysis was underpowered to show an independent predictive effect of PR.

Table 5 Multivariate Cox regression analysis in the validation set

Discussion

ER and PR status as measured by biochemical ligand-binding assay are the only prognostic and predictive biomarkers recommended for routine clinical use in breast cancer by the American Society of Clinical Oncology and the College of American Pathologists.3, 4 Both are relatively weak prognostic factors, but strong predictive factors for response to adjuvant and therapeutic hormonal therapy. The primary reason for measuring these biomarkers today is the latter. These recommendations, including the overview analysis of randomized clinical trials in early breast cancer,1 are based on standardized ligand-binding assay methodology. However, ligand-binding assay has been replaced by immunohistochemistry on archival tissue over the past several years. Several recent studies have shown that assessing ER by immunohistochemistry is as good as or better than the ligand-binding assay,10, 11, 12, 13 but similar validation studies for PR are lacking.

There are several reasons that comprehensive studies of PR by immunohistochemistry have lagged behind those of ER. First, many clinicians depend on ER status alone to select patients for hormonal therapy. PR alone was found to be a weaker prognostic and predictive factor as compared to ER in studies using ligand-binding assay.1 In addition, until relatively recently, there were only a limited number of good antibodies available for PR that worked on archival tissue. Despite these limitations, there have been several studies assessing PR by immunohistochemistry in various settings in breast cancer. For example, three studies15, 31, 32 assessed patients receiving adjuvant hormonal therapy alone, and all three showed a significant relationship between PR positivity and improved outcome. However, two were based on frozen section immunohistochemistry15, 31 and the other32 used the KD 68 antibody, rendering them irrelevant today because laboratories exclusively use fixed archival tissue and because the KD 68 antibody is no longer commercially available. A total of 12 studies looked at patients receiving combined hormonal and cytotoxic chemotherapy.33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44 Only seven33, 34, 35, 39, 40, 42, 44 showed a significant relationship between PR status and clinical benefit. Of these positive studies, four34, 35, 39, 40 used frozen section immunohistochemistry, and the other three, which were done on archival tissue, used either KD 6842 or noncommercial antibodies.33, 44 Similarly, four studies12, 31, 35, 45 evaluated patients with advanced/metastatic disease, and three12, 31, 35 showed a significant clinical benefit of hormonal therapy in patients with PR-positive tumors. All three studies employed the KD 68 antibody. Looking at these studies as a group, over 60% were based on frozen section immunohistochemistry, and about the same proportion on the KD 68 antibody. Collectively, these studies were based on five different antibodies, and used six different methods of scoring with arbitrary cutoffs to define PR positivity, ranging from 1 to 50%. Clearly, this body of work falls far short of the guidelines for validation of prognostic and predictive factors.6, 7, 8

Our goal was to develop an accurate and reliable assay to assess PR by immunohistochemistry in archival tissue based on commercially available reagents at reasonable cost, and to validate the clinical usefulness of this assay compared to the gold standard ligand-binding assay. In our test set, we have demonstrated that PR by immunohistochemistry provides significantly better clinical information as compared to ligand-binding assay. In addition, an important finding of this study is the low optimum cutoff to define PR-positive (total score >2, corresponding to >1% weakly positive tumor cells). Of interest, in our previous validation of ER by immunohistochemistry in breast cancer, total score >2 was also determined to be the optimum cutpoint.13 In the present study, PR total score=3 (the lowest positive score, corresponding to 1–10% weakly positive tumor cells) comprised 15% of the endocrine-treated patients (Figure 3). Most laboratories arbitrarily use 10% as a cutoff to define PR-positive. Thus, our results suggest that using such a high arbitrary cutoff can lead to misclassification of as many as 15% of breast cancer patients as PR-negative. Finally, we have demonstrated the clinical usefulness of this assay in an independent set of patients in a randomized clinical trial. This validation set did not have enough statistical power for analyses of treatment interaction, or for evaluation of PR in relation to ER (ie value of PR in ER-negative tumors). This latter aspect of PR and its utility in ER-negative cancers has been analyzed in detail in our two other large databases by Bardou et al,46 but PR in those samples was assessed by ligand-binding assay.

In summary, we have developed an immunohistochemical assay for PR in archival tissue and validated it sufficiently to justify routine clinical use in the management of breast cancer patients. However, as compared to ER, PR adds only a limited amount of additional predictive information for response to hormonal therapy. We demonstrated that patients with even very low levels of PR (1–10% weakly positive cells) have a better clinical outcome. This study suggests that PR by immunohistochemistry can provide useful clinical information that is better than that provided by ligand-binding assay. Laboratories performing assessment of PR in routine clinical practice should either comprehensively validate their assay or adopt a validated assay, in order to report reliable and clinically meaningful results for clinicians.