Main

Human epidermal growth factor receptor 2 (HER2) is one of four known members of the epidermal growth factor receptor family. About 15% of invasive human breast cancers show HER2 amplification and overexpression.1 These tumors behave more aggressively, with reduced overall survival (OS) and a higher rate of recurrence.2 In 1998, the therapeutic antibody trastuzumab was approved by the Food and Drug Administration for HER2-positive, metastatic breast cancer.3 In the following years, additional anti-HER2 agents, including lapatinib, pertuzumab, and trastuzumab emtansine (T-DM1), were developed.4 As anti-HER2 therapy is only useful in HER2-overexpressing breast cancer, accurate HER2 determination is essential for the therapeutic management of breast cancer patients. The current American Society of Clinical Oncology/College of American Pathologists (ASCO-CAP) clinical practice guidelines recommend a validated immunohistochemistry assay, followed by in situ hybridization (ISH) if applicable.5

Because of the relevance of HER2 for therapeutic strategies, the correct determination of this marker in the pathology laboratory has been a focus of clinical discussion for many years. In this context, it is typically stated that ‘up to 20% of current HER2 testing may be inaccurate.’6 This statement is made in the ASCO-CAP guidelines, which were published in 2007, and it is based, among others, on a study by Paik et al,7 which was performed on 104 tumors. Considering the significant advances in the standardization of pathology assays over the years, this result might not be relevant for current clinical reality, and an update of concordance data is needed for a realistic view of pathology standards.

In addition to these technical issues, there is still the clinical hypothesis that anti-HER2 therapies might also be effective in tumors with low HER2 expression. This hypothesis is based on data from Paik et al, who showed increased response rates to anti-HER2 therapy in tumors with negative central HER2 staining that had received anti-HER2 treatment based on positive local testing.8 On the basis of this observation, in 2000 the National Surgical Adjuvant Breast and Bowel Project (NSABP) started the NSABP trial B-31, with the aim of evaluating the correlation between HER2 copy number and benefit from a combination of four cycles of doxorubicin and cyclophosphamide followed by four cycles of paclitaxel plus trastuzumab (ACTH) vs standard chemotherapy in the adjuvant setting.9 In a recent study, Loi et al10 describe that patients with HER2-positive, estrogen receptor (ER)-positive (determined by immunohistochemistry) breast cancer with a low FISH ratio (≥2 to <5) or with higher ESR1 levels seem to benefit less from adjuvant trastuzumab after chemotherapy.

The neoadjuvant clinical studies of the German Breast Group have evaluated different anti-HER2 therapies over several years. From 2005 to 2010, local HER2 testing was performed, and patients in the GeparQuattro and GeparQuinto studies received anti-HER2 therapy based on local HER2 results. From 2011 on, central HER2 testing was established, but local results were recorded in the clinical database as well. On the basis of these data, we are now able to evaluate the development of HER2 testing on a nationwide level in a clinical study database of 8289 tumors, of which 1581 were reported to be HER2-positive based on local and/or central pathology. A total of 1071 of these tumors were treated with anti-HER2 therapy based on local testing, which allows the evaluation of response to anti-HER2 therapy in tumors that were found to be HER2-negative in retrospective central testing.

As a primary aim of this study, we evaluated the discordance for HER2 between central and local pathology in five sequential large neoadjuvant clinical trials in a total of 1581 tumor samples over a period of 12 years. The local HER2 status of the 1581 tumor samples had been determined in several pathology institutions throughout Germany as part of the routine diagnostic workup before the clinical decision to include the patient in a neoadjuvant trial. Therefore, the local pathologists were not aware that the patient would be included in a clinical trial, which allows us to perform an unbiased evaluation of diagnostic standards in clinical routine. We evaluated subgroups of cases that show a higher discordance rate, focusing in particular on differences between hormone receptor (HR)-positive and -negative tumors.

As a secondary aim, we evaluated the clinical end points of pathological complete response (pCR) as well as disease-free survival (DFS) in those patients that had received an anti-HER2 therapy based on local testing.

Materials and methods

Patients

All Gepar trials were neoadjuvant prospective, multicenter, randomized studies in patients with primary invasive breast cancer. An overview of the trials is given in Figures 1 and 2.

Figure 1

CONSORT statement with an overview of the Gepar trials and the patients included in this analysis.

Figure 2

Discordance between the centrally and locally assessed HER2 status of all cases (a), as well as depending on the hormone receptor status (b), respectively, in GeparTrio (G3), GeparQuattro (G4), GeparQuinto (G5), GeparSixto (G6), and GeparSepto (G7).

A total of 2357 women with primary breast cancer (cT2-cT4, cN0-3, and cM0) were enrolled between July 2001 and December 2005 in the phase III GeparTrio study (NCT00544765)11, 12 and the GeparTrio pilot13 trials. At this time, trastuzumab was not part of adjuvant or neoadjuvant therapy, and patients were not treated with any neoadjuvant anti-HER2 therapy.

Between August 2005 and November 2006, a total of 1509 patients with primary breast cancer (cT1c-cT4, cN0-3, and cM0) were enrolled in the phase III GeparQuattro study (NCT00288002). Of these, 450 patients had locally HER2-positive and 1058 patients had centrally confirmed HER2-negative tumors.14, 15

In the phase III GeparQuinto trial (NCT00567554; cT1-4, cN0-1, and cM0), 621 patients were enrolled in the HER2-positive arm between November 2007 and July 2010.16 In addition, another 1948 patients were enrolled between November 2007 and June 2010 in the HER2-negative arm, of whom 1925 started therapy.17, 18, 19

In all, 728 patients were screened between August 2011 and December 2012 for eligibility for the phase II GeparSixto trial (NCT01426880; cT1-4, cN0-3, and cM0). Of these, 588 women started chemotherapy; 286 patients had a locally HER2-positive tumor, but only 273 patients (46.4%) had a centrally confirmed HER2-positive tumor.20

Between July 2012 and December 2013, a total of 1229 patients were randomly assigned in the phase III GeparSepto trial (NCT01583426). Of these, 429 patients had a locally HER2-positive tumor, but only 402 patients (32.7%) had a centrally confirmed HER2-positive tumor. The main results of the trials have been published previously.11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21

In the GeparTrio, GeparQuattro, and GeparQuinto cohorts, the central HER2 assessment was performed retrospectively, and all patients in the clinical database with a locally HER2-positive tumor and an available central HER2 status were included in this analysis. In GeparSixto and GeparSepto, the central HER2 analysis was performed prospectively before randomization. GeparSixto includes only triple-negative and HER2-positive tumors. Thus, inclusion depends on the HER2 status, which would introduce a bias in our analyses; therefore, GeparSixto was not included in this comparison.

In addition, GeparSepto includes only patients with elevated risk based on the factors cT, cN, sentinel node status, HR status, Ki67, and HER2 status. Thus, for GeparSepto, too, the inclusion criterion depends on the HER2 status. In contrast to GeparSixto, however, a potentially discordant HER2 status affects the inclusion of only a few patients, because most patients are included anyway on the basis of one or more of the other factors. Therefore, the bias introduced by the GeparSepto patients is small, if any (Figure 1).

The baseline characteristics of all included patients are shown in Table 1. This study is reported according to the REMARK criteria.22

Table 1 Baseline patient characteristics of all included patients

Histopathological Examination

A positive HER2 status was defined as HER2 3+ by immunohistochemistry (DAKO score) or HER2 2+ with HER2 gene amplification (ISH). HER2 gene amplification was defined as an ISH ratio >2.2 for all included studies except GeparSepto. In GeparSepto, the cutoff was changed to >2.0 based on the updated ASCO-CAP guidelines. In GeparTrio, the central assessment of HER2 expression by immunohistochemistry was performed on a tissue microarray, followed by silver-enhanced ISH (SISH) if necessary.23 In GeparQuattro, the central assessment of HER2 expression by immunohistochemistry was performed on large sections as well as on tissue microarrays for selected cases.24 In GeparQuinto, both the immunohistochemistry analysis and the SISH for HER2 were likewise performed on large sections.25
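The positivity rule described above can be written as a small decision function. The following is a minimal sketch under stated assumptions, not the software used in the trials: the function name `is_her2_positive`, its parameters, and the trial label strings are illustrative; only the thresholds (IHC 3+, or IHC 2+ with an ISH ratio >2.2, or >2.0 in GeparSepto) are taken from the text.

```python
from typing import Optional


def is_her2_positive(ihc_score: int,
                     ish_ratio: Optional[float] = None,
                     trial: str = "GeparQuinto") -> bool:
    """Classify HER2 status from an IHC score (0-3) and an optional ISH ratio.

    Hypothetical helper: names and trial labels are assumptions for illustration.
    """
    # ISH amplification cutoff: >2.0 in GeparSepto, >2.2 in all other trials.
    cutoff = 2.0 if trial == "GeparSepto" else 2.2

    if ihc_score == 3:
        # IHC 3+ is positive without further testing.
        return True
    if ihc_score == 2 and ish_ratio is not None:
        # IHC 2+ is positive only if the gene is amplified by ISH.
        return ish_ratio > cutoff
    # IHC 0/1+, or IHC 2+ without an ISH result, is not counted as positive.
    return False


# The same borderline case is classified differently under the two cutoffs.
print(is_her2_positive(2, ish_ratio=2.1, trial="GeparSepto"))
print(is_her2_positive(2, ish_ratio=2.1, trial="GeparQuinto"))
```

An IHC 2+ tumor with an ISH ratio of 2.1 illustrates why the cutoff change matters: it counts as amplified under the GeparSepto threshold but not under the earlier one.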

The GeparSixto and GeparSepto trials were the first studies in which the locally determined HR and HER2 status had to be confirmed on a whole slide by central pathology (Institute of Pathology, Charité – Universitätsmedizin Berlin) before randomization.

Immunohistochemical Staining

Central immunohistochemical staining of HER2 was performed using different antibodies that were established at the Institute of Pathology, Charité Hospital, at the time at which each study was conducted (GeparTrio: HercepTest antibody, Dako, Glostrup, Denmark; GeparQuattro: rabbit polyclonal anti-HER2 antibody, clone A0485, DakoCytomation, Hamburg, Germany; GeparQuinto: monoclonal anti-HER2 antibody, clone 4B5, Ventana Medical Systems, Tucson, AZ, USA; and GeparSixto and GeparSepto: monoclonal anti-HER2 antibody, clone 4B5, Ventana Medical Systems). For HER2 SISH, the Inform-SISH system, Ventana (GeparTrio and GeparQuattro) or the Ultra View SISH Detection kit, Ventana Medical Systems (GeparQuinto, GeparSixto, and GeparSepto) was used. The stainings were performed using the Discovery XT autostainer (Ventana). Data on local HER2 analysis were extracted from the pathology reports.

PCR

A total of 285 formalin-fixed paraffin-embedded samples from the GeparTrio study and 243 formalin-fixed paraffin-embedded samples from the GeparQuattro study have been evaluated for mRNA expression of HER2 and ESR1; the results have already been published for both cohorts.23, 24, 26 In this study, we have combined the analyses to show the different ranges of HER2 mRNA in ESR1-positive and ESR1-negative tumors. RNA was extracted using the VERSANT Tissue Preparation System (Siemens Healthcare Diagnostics, Tarrytown, NY, USA) and qRT-PCR was performed as previously described.26, 27, 28, 29

Statistical Evaluation

Statistical evaluation was performed using SPSS Statistics version 22 (IBM Corporation, Armonk, NY, USA) as well as R version 3.2.2 (R Foundation for Statistical Computing, Vienna, Austria). The graphics concerning the RNA data were generated with GraphPad Prism 5.04 (GraphPad Software, La Jolla, CA, USA). The discordance rate was calculated as the number of cases that were centrally HER2-negative and locally HER2-positive, divided by the number of all centrally analyzed HER2 samples and multiplied by 100. Univariable models (cross-tabulations and two-sided χ2-tests) were fit to evaluate pCR rates in the two groups of discordant and concordant cases.
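The discordance rate defined above is a simple proportion. As a minimal sketch (the function name `discordance_rate` is an illustrative assumption; the study itself used SPSS and R), it can be computed as follows, here using the 161 centrally HER2-negative cases among the 677 locally HER2-positive, anti-HER2-treated patients reported in the Results:

```python
def discordance_rate(n_discordant: int, n_centrally_tested: int) -> float:
    """Percentage of centrally tested cases that are locally HER2-positive
    but centrally HER2-negative (hypothetical helper name)."""
    if n_centrally_tested <= 0:
        raise ValueError("number of centrally tested cases must be positive")
    return 100.0 * n_discordant / n_centrally_tested


# Counts taken from the pCR analysis of GeparQuattro and GeparQuinto.
rate = discordance_rate(161, 677)
print(f"{rate:.1f}%")  # 23.8%
```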

The Kaplan–Meier survival function, the log-rank test, and Cox regression analyses were used for the analyses of DFS and OS. The prediction of pCR was analyzed in the combined subgroup of all discordant cases of GeparQuattro and GeparQuinto. In GeparTrio, no anti-HER2 therapy was available. For GeparSepto, the complete survival data were not available at the time of analyses. All tests were two-sided. The significance level was set at P=0.05.
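For readers unfamiliar with the Kaplan–Meier estimator, the product-limit calculation can be sketched in a few lines. This is an illustrative implementation only: the study used SPSS and R, the function name `kaplan_meier` is an assumption, and the follow-up times and event indicators below are hypothetical toy values, not study data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Hypothetical toy cohort: months of follow-up and event indicators.
toy_times = [6, 12, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 2))
```

Censored patients contribute to the number at risk until their last follow-up but never lower the curve themselves, which is why the estimator can use incomplete follow-up such as that of a trial still under observation.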

Results

Discordance of the HER2 Status in the Whole Cohort

A total of 1581 patients with HER2-positive cancers were included in the analysis. Over the 12-year time period from GeparTrio to GeparSepto, the discordance rate between local and central HER2 testing decreased from 52.4% in GeparTrio to 8.4% in GeparSepto (GeparQuattro: 25.4%; GeparQuinto: 22.7%; and GeparSixto: 7.0%; Figure 2a).

Discordance of the HER2 Status Depending on the HR Status

In the study cohort excluding GeparSixto, 828 patients (63.9%) had HR-positive tumors and 467 patients (36.1%) had HR-negative tumors. Discordance rates were higher in HR-positive than in HR-negative tumors (26.6% vs 16.3%, P<0.0001): GeparTrio: 58.8% vs 37.9% (P=0.0113); GeparQuattro: 30.8% vs 18.9% (P=0.0341); GeparQuinto: 29.2% vs 13.9% (P=0.0003); and GeparSepto: 9.2% vs 6.1% (P=0.4302; Figure 2b).

Prediction of pCR

A total of 677 patients had received an anti-HER2 treatment based on positive local testing in the GeparQuattro or GeparQuinto trial. Of these 677 patients, 161 (23.8%) were found to be HER2-negative on retrospective central testing. This allowed the evaluation of response to combined chemo-/anti-HER2 therapy in retrospectively HER2-negative tumors. Patients with a centrally negative HER2 status achieved a pCR significantly less often than patients with a concordant HER2 status (13.7% vs 32.2%; P<0.001). These results are particularly relevant in HR-positive tumors. In the HR-positive subgroup (n=268), only 7.9% of patients with discordant, but 26.9% of patients with concordant HER2 status achieved a pCR (P<0.001). In the HR-negative subgroup (n=295), a similar but nonsignificant trend was observed: 27.7% of patients with discordant HER2 status achieved a pCR, compared to 37.9% of patients with concordant HER2 status (P=0.191).

In total, 465 patients received trastuzumab as anti-HER2 therapy. Of these, 14 patients with discordant HER2 status (13.0%) achieved a pCR, compared to 122 patients (34.2%) with concordant HER2 status (P<0.001). The remaining 212 patients received lapatinib, but only in one arm of the GeparQuinto trial.
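The comparison of pCR rates in the trastuzumab-treated patients can be reproduced approximately with a two-sided χ2-test on a 2×2 table. The sketch below is not the authors' analysis script. The group sizes of 108 discordant and 357 concordant patients are not stated in the text; they are assumptions reconstructed from the reported counts and percentages (14/108 ≈ 13.0%, 122/357 ≈ 34.2%, 108 + 357 = 465). The test is computed without continuity correction, which may differ from the settings used in SPSS or R.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]].

    Returns (chi2 statistic, two-sided P value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Reconstructed table (group sizes are assumptions, see lead-in):
#               pCR   no pCR
# discordant     14     94     (n = 108)
# concordant    122    235     (n = 357)
chi2, p_value = chi2_2x2(14, 94, 122, 235)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2g}")
```

With these reconstructed counts the statistic is about 18, and the P value lies well below 0.001, in line with the reported result.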

Survival Analyses

The median follow-up time of the 677 patients from GeparQuinto and GeparQuattro included in the survival analyses was 60.6 months. In the complete cohort, there were no significant differences in DFS (log-rank P=0.396) or OS (log-rank P=0.980) between the patients with concordant and discordant HER2 status.

However, differences were observed if the analysis was stratified based on HR status. Patients with HR-negative tumors and a discordant HER2 status had a significantly reduced OS (66.7 vs 74.9 months; P=0.019) compared to patients with HR-negative tumors and concordant HER2 status (Figure 3a).

Figure 3

Overall survival depending on the hormone receptor status ((a) hormone receptor negative and (b) hormone receptor positive) comparing the HER2 concordant and HER2 discordant group.

In HR-positive tumors, a trend in the opposite direction was observed, but the difference in OS was not significant (P=0.125; Figure 3b). Stratified by HR status, the analyses for DFS revealed no significant differences (HR-negative: log-rank P=0.181; HR-positive: log-rank P=0.108).

The other analyses revealed no novel significant results.

Evaluation of HER2 Status by Quantitative RT-PCR

To evaluate ER and HER2 expression quantitatively, data on mRNA expression of both receptors in tumor samples from the GeparTrio and GeparQuattro studies were analyzed in combination. In total, data from 528 patients were available, 285 from the GeparTrio trial and 243 from GeparQuattro. In ER-negative tumors, HER2 showed a bimodal mRNA expression, with ER-negative/HER2-negative cases and ER-negative/HER2-positive cases forming two distinct clusters (Figure 4, left). Within the ER-positive cases, HER2 mRNA expression was continuous, and no clustering of the HER2-positive or -negative cases could be seen (Figure 4, right).

Figure 4

Estrogen receptor alpha and HER2 mRNA expression levels in 528 tumors from GeparTrio and GeparQuattro.

Discussion

In a comprehensive analysis of five neoadjuvant breast cancer studies, our study demonstrates a decrease in the discordance rate between central and local HER2 status over a period of 12 years, reflecting a significant improvement of the diagnostic standards in clinical histopathology.

To our knowledge, this is the first study that describes the development of the discordance rate over a long time period. The causes of discordant HER2 results are multifaceted and can be divided into pre-analytic, analytic, and post-analytic factors.30 Perez et al specify the time to fixation as well as the duration of fixation as among the most relevant factors regarding the pre-analytic procedures. Regarding the analytic factors, the technical validation of assays, the antigen retrieval, and the superiority of automated staining methods seem to have a relevant influence. Finally, the interpretation of the results, image analysis, reporting, and continuous quality assurance are considered the relevant post-analytic factors.30

Several studies have evaluated HER2 discordance in different countries (eg, refs 31, 32, 33). Consequently, several approaches to improve this situation were initiated. Since the year 2000, interlaboratory tests have been performed in Germany to maintain quality in breast cancer diagnosis. Since 2007, participation in these tests has been required for the certification of breast cancer centers. For this reason, the number of participants has increased in recent years.

Earlier studies2 have suggested that up to 30% of all breast carcinomas might be HER2-positive, suggesting that the earlier assays had a higher rate of false-positive results, which is in line with the data shown for the GeparTrio study. It should be noted that GeparTrio was conducted at a time when HER2 testing was not part of the regular workup of breast cancer, because anti-HER2 therapies were not yet included in adjuvant or neoadjuvant therapies. Therefore, the large differences likely reflect the start of the pathologists' learning curve with a new test. In addition, GeparTrio was one of the first studies in which HER2 testing was performed on core biopsies, which are now known to show more intense staining due to the different fixation time and the smaller specimen size. With more experience in HER2 testing, the rate of HER2-positive tumors is now estimated to be 15–16%. In Germany, the web-based HER2 monitor34 has been a valuable tool to compare the rate of HER2-positive tumors in different institutions.

Most interestingly, the discordance rate in our study is higher in HR-positive patients than in the HR-negative subgroup. This result could be explained by the RNA results, which show a bimodal distribution of HER2 mRNA in HR-negative tumors and a continuous distribution in HR-positive tumors. Within a continuous distribution, it is much more difficult to determine a cutpoint. Pathologists should be aware that HER2 determination might be more difficult in ‘triple-positive’ tumors and that the rate of false-positives might be higher in these tumor types.

Because of the neoadjuvant setting and the long follow-up time, we were able to evaluate the clinical impact of discordant HER2 evaluation using chemotherapy response and survival as end points. The differences that were observed in the GeparQuattro and GeparQuinto studies were linked to therapy response, which was significantly lower in patients with centrally HER2-negative tumors that had received anti-HER2 treatment based on local positivity. This suggests that anti-HER2 agents are not active in the HER2-negative tumor population in GeparQuattro and GeparQuinto. This finding is in contrast to the observation by Paik et al,8 who reported increased response rates in tumors that were centrally HER2-negative, but locally HER2-positive in the NSABP B-31 trial. It should be noted that GeparQuattro and GeparQuinto were neither randomized nor powered to address this question, and it will be very interesting to await the results of the NSABP B-47 study, which is testing anti-HER2 treatment in tumors with weak HER2 expression.

In the survival analysis, differences between HER2 concordant and discordant tumors were observed. These differences become obvious only when the analysis is stratified for HR status, because they have opposite directions in HR-positive and HR-negative tumors. In the HR-negative subgroup, patients with a discordant HER2 result have a significantly reduced OS in our study. This could be explained by the fact that these are triple-negative tumors based on the central pathology result, which are known to have a worse prognosis than adequately treated HER2-positive tumors. Interestingly, in the HR-positive group, there is a nonsignificant trend in the opposite direction, with an improved prognosis for the patients with discordant HER2 testing. This can also be explained by the fact that these are luminal tumors based on central pathology, which generally have a better prognosis than HER2-positive tumors.

In conclusion, our results show that the concordance of HER2 testing has now reached more than 90%, which is a considerable improvement over time compared to the rate that was stated in the first ASCO-CAP guideline.6 This improvement indicates that the different quality initiatives in clinical histopathology are able to improve diagnostic performance in clinical practice.