PREDICT underestimates survival of patients with HER2-positive early-stage breast cancer

The prognostic performance of PREDICT in patients with HER2-positive early breast cancer (EBC) treated in the modern era with effective chemotherapy and anti-HER2 targeted therapies is unclear. Therefore, we investigated its prognostic performance using data extracted from ALTTO, a phase III trial evaluating adjuvant lapatinib ± trastuzumab vs. trastuzumab alone in patients with HER2-positive EBC. Our analysis included 2794 patients. After a median follow-up of 6.0 years (IQR, 5.8–6.7), 182 deaths were observed. Overall, PREDICT underestimated 5-year overall survival (OS) by 6.7% (95% CI, 5.8–7.6): observed 5-year OS was 94.7% vs. predicted 88.0%. The underestimation was consistent across all subgroups, including those defined by the type of anti-HER2 therapy. The highest absolute differences were observed for patients with hormone receptor-negative disease, nodal involvement, and large tumor size (13.0%, 15.8%, and 15.3%, respectively). The area under the ROC curve (AUC) was 73.7% (95% CI 69.7–77.8) in the overall population, ranging between 61.7% and 77.7% across the analyzed subgroups. In conclusion, our analysis showed that PREDICT substantially underestimated OS in HER2-positive EBC. Hence, it should be used with caution to provide prognostic estimates for HER2-positive EBC patients treated in the modern era with effective chemotherapy and anti-HER2 targeted therapies.


INTRODUCTION
The addition of trastuzumab to adjuvant chemotherapy has dramatically improved the outcomes of patients with HER2-positive early breast cancer, reducing the risk of mortality by more than 30% 1 . Despite the undoubted benefit of adjuvant therapy, several clinical questions remain open. Approximately 25% of patients still experience recurrence up to 10 years from diagnosis, and further research efforts are needed to better refine patient selection for adopting escalation or de-escalation treatment strategies 2,3 .
PREDICT (www.predict.nhs.uk) is a publicly available online tool that helps to predict the individual prognosis of patients with early breast cancer and to show the impact of adjuvant treatments administered after breast cancer surgery. It uses traditional clinical-pathological factors and aims to support clinical decision making in the adjuvant setting. The original version of PREDICT (v.1.0) was derived from cancer registry information on 5,694 women treated in East Anglia from 1999 to 2003, and was subsequently validated in several datasets of patients with breast cancer 4,5 . In 2011, the model was updated to include HER2 status. Estimates for the prognostic effect of HER2 status were based on an analysis of 10,179 cases collected by the Breast Cancer Association Consortium (BCAC), none of which had been diagnosed after 2004, to ensure that patients had not received trastuzumab 6 . A subsequent validation was done in 2012 in a cohort from British Columbia, Canada 7 . This study demonstrated that the inclusion of HER2 status allowed the model to perform better than the previous PREDICT version and Adjuvant! Online in estimating overall and breast cancer-specific survival 7 .
Although the use of PREDICT is recommended to aid decision making in the adjuvant setting 8 , its prognostic role in HER2-positive early breast cancer patients treated with modern chemotherapy and anti-HER2 therapies remains unclear. We aimed to investigate the prognostic performance of PREDICT in patients with HER2-positive early breast cancer who received trastuzumab-based therapy started concurrently with chemotherapy within the ALTTO trial. The ALTTO trial is the largest adjuvant study ever conducted in the field of HER2-positive early breast cancer and, with at least 5 years of follow-up data from all patients 9 , represented a unique opportunity to investigate the reliability and prognostic performance of PREDICT in women with HER2-positive disease.

RESULTS
Out of 8381 patients included in the ALTTO trial, 2836 were treated with chemotherapy and concurrent trastuzumab-based therapy and were potentially eligible for the present analysis. In 42 patients, the PREDICT algorithm was not evaluable (due to patient age <25 years [n = 7], missing tumor size [n = 13], or missing lymph node status [n = 22]). Therefore, 2794 patients were included in the present analysis (Fig. 1).
This finding was consistent across all subgroups, with a difference ranging from 2.7% (in the hormone receptor-positive subgroup) to 15.8% (in patients with ≥4 positive lymph nodes) (Table 2). The underestimation of survival by PREDICT was consistent and similar in all analyzed subgroups, including among patients treated with lapatinib and trastuzumab (Predicted-observed 5-year OS: −6.98), trastuzumab alone (Predicted-observed 5-year OS: −6.28), or trastuzumab followed by lapatinib (Predicted-observed 5-year OS: −6.82).
This finding of suboptimal discriminatory accuracy was consistent across all subgroups, ranging from 61.7% (in patients with ≥4 positive lymph nodes) to 77.7% (in patients receiving trastuzumab alone as anti-HER2 therapy) (Table 3). The lowest discriminatory accuracy was observed for patients with nodal involvement (≥4 and 1-3 positive lymph nodes) (Supplementary Figs 1 and 2), and for patients receiving non-anthracycline-based chemotherapy (61.7%, 64.8%, and 65.2%, respectively). The highest discriminatory accuracy was observed for patients with negative lymph nodes (Supplementary Fig. 3) and for patients receiving trastuzumab alone as anti-HER2 therapy (77.3% and 77.7%, respectively).

DISCUSSION
To the best of our knowledge, PREDICT is the only publicly available, free, online tool developed to predict individual prognosis in the specific population of patients with HER2-positive early breast cancer based on traditional and easily retrieved clinical-pathological factors including HER2. In our ALTTO analysis, PREDICT substantially underestimated patients' OS; this finding was consistent across all patient subgroups, with the highest absolute differences for patients with hormone receptor-negative disease, nodal involvement, and large tumor size. In terms of discrimination, the accuracy of PREDICT was low overall, with the lowest discriminatory accuracy observed in patients with nodal involvement (≥4 and 1-3 positive lymph nodes), and in patients receiving non-anthracycline-based chemotherapy.
The low performance of this tool raises several questions about the reliability of PREDICT for prognostic estimation in HER2-positive early breast cancer patients. As a potential explanation for the underestimation of patients' OS, we can question whether the population used to validate this prognostic tool accurately mirrors the real-world population of patients with HER2-positive disease treated in the modern era with effective chemotherapy and anti-HER2 targeted therapies. The prognostic effect of HER2 status was evaluated and incorporated in the PREDICT tool for the first time in October 2011, based on data from the Breast Cancer Association Consortium (BCAC) 6 consisting of 10,179 cases not exposed to anti-HER2 treatment (Supplementary Table 1). The subsequently developed model (called PREDICT Plus) was then validated in the original British Columbia dataset, a cohort including 203 HER2-positive breast cancer patients 7 . In this latter cohort, PREDICT Plus demonstrated an improved ability to estimate breast cancer-specific and overall survival in HER2-positive patients, compared to other prognostication tools such as the previous version of PREDICT and Adjuvant! Online 7 . In the HER2-positive cohort of the British Columbia dataset, the observed 10-year OS was 44.3%, and none of the included patients had received trastuzumab 7 . A further step forward was the inclusion in PREDICT of the estimates of benefit from adjuvant trastuzumab, with its proportional reduction of 31% in the mortality rate up to five years. These estimates were based on the results of four clinical trials: FinHER 10 , HERA 11 , B31/N9831 12,13 , and BCIRG006 14 (Supplementary Table 2).
Patients with HER2-positive early breast cancer are experiencing a consistent shift towards better survival across the years, mainly due to the increasingly effective local and systemic therapies available in this setting. This change might not be reflected by a prognostic tool developed and validated 10 years ago. In particular, newer drugs like pertuzumab and T-DM1 have become available for many patients developing disease progression after treatment in the ALTTO trial. These two drugs improve OS in metastatic patients and may contribute to the "better-than-predicted" OS 15,16 . Moreover, the current standard of care for early breast cancer is even superior to the treatment received by many patients in the ALTTO study, and includes neoadjuvant therapy with pertuzumab, adjustment of adjuvant therapy based on pathological response to neoadjuvant therapy (i.e., T-DM1 for patients who do not achieve pathological complete response), and consideration of extended adjuvant anti-HER2 therapy with neratinib and endocrine therapy for patients with hormone receptor-positive disease. As such, the discordance between OS estimated by PREDICT and the current real-world OS is expected to be even higher. Therefore, our results suggest that the current version of PREDICT should be used with caution for prognostication in HER2-positive early breast cancer patients treated in the modern era with effective chemotherapy and anti-HER2 targeted therapies.
It should also be considered that at least part of the discordance between the observed 5-year OS and that predicted by PREDICT could be due to the differences existing between a highly selected population enrolled in a clinical trial and the real-world patient population, which might have a slightly different prognosis 17,18 . Clinical trials have strong internal validity, but their external validity could be weaker, particularly in the case of narrow inclusion criteria. For this reason, findings from clinical trials might overestimate outcomes as compared to real-world practice. Due to differences in the distribution of age, comorbidity status, and overall health, differences between predicted and observed OS in a clinical trial sample as compared to real-world data are expected. Consistently with our findings, an independent validation of PREDICT on data from real-world patients led by Gray and colleagues showed a general pattern of   15.3% and 14.5%, respectively), although not focusing specifically on HER2-positive disease 19 . Additionally, prognostication estimates of PREDICT are provided as OS rates. Although OS is an important endpoint, being free from any ambiguity in its definition, it could be influenced by several variables (competing risks) not strictly related to breast cancer and not considered in PREDICT, such as comorbidities and performance status 20 . Non-cancer deaths may not entirely reflect tumor biology, aggressiveness, and responsiveness to therapy 20 . On the other hand, the more aggressive the disease, the higher the relevance of OS. Indeed, HER2-positive breast cancer tends to develop more early recurrences compared to hormone receptor-positive/HER2-negative disease, thus having an undoubtedly more relevant impact on OS 21 .
In our analysis, the highest absolute differences between observed and predicted OS were observed for patients with hormone receptor-negative disease, larger tumor size, and a high number of involved nodes (≥4 positive lymph nodes), namely those patients traditionally considered at higher risk of relapse. Further investigations are urgently needed to better predict the prognosis of these patients. Of note, despite the traditional stigma of poor prognosis for patients with high-risk HER2-positive breast cancer, recent clinical trials have shown good outcomes also for this high-risk subset of patients 22 .
The prediction of prognosis in patients with early breast cancer is an issue of paramount importance, not only in hormone receptor-positive/HER2-negative disease, where prognostication may determine whether adjuvant chemotherapy should be administered, but also in HER2-positive disease. Indeed, although in HER2-positive breast cancer almost all patients warrant chemotherapy as per standard of care, a reliable prognostic estimation has several implications, from the planning of premenopausal patients' reproductive life (e.g. affecting the choice of whether to pursue a pregnancy later on 23 ), to a therapeutic perspective (adoption of escalation or de-escalation treatment strategies, including type of chemotherapy to be administered together with anti-HER2 treatment and use of extended adjuvant endocrine therapy in hormone receptor-positive disease 24 ).
Several molecular assays are now available for hormone receptor-positive/HER2-negative breast cancer 25 , and, recently, some molecular assays have also been developed for HER2-positive disease 26 .
It is likely that these assays will refine prognostication beyond what can be provided by clinical prognostic models like PREDICT 27,28 , and their increasing use, as a consequence, will reduce reliance on tools like PREDICT. Nevertheless, one strength of PREDICT is the fact that it is "free" and easy to use in everyday clinical practice, and its integration with molecular assays could provide a more complete prognostic evaluation of each individual patient. Recently, Prat et al. developed a new prognostic score, HER2DX, based on the combination of clinical-pathological and molecular characteristics of the tumor (nodal and tumor stage, the number of stromal tumor-infiltrating lymphocytes, PAM50 subtypes, and expression of 13 genes relating to proliferation and underlying subtype-related biology) 26,29 . This was the first attempt to build a combined prognostic score based on clinicopathological and genomic variables in early-stage HER2-positive breast cancer, using tumor samples from the phase 3 Short-HER trial 30 . However, the HER2DX prognostic model is still too immature to be used as a biomarker, and future clinical validations are warranted in order to establish its use in different scenarios, especially in the neoadjuvant setting.
Our study has some limitations that should be acknowledged. First, this is an unplanned exploratory analysis. Second, some information (including prognostic factors like the proliferation index Ki67 and the method of breast cancer detection) was not available in the ALTTO database and could not be included in the model. Third, PREDICT does not allow for estimates of dual-targeted anti-HER2 therapy efficacy and, in particular, does not provide estimates for lapatinib use. However, our subgroup analysis confirmed that PREDICT still underperforms for patients treated with trastuzumab alone. Additionally, the PREDICT tool does not consider the presence of comorbidities and/or the patient's performance status, thus further limiting the possibility to compare predicted vs. observed outcomes using a clinical trial sample. Finally, only the point estimates by PREDICT, without their range, were included in the present analysis.
On the other hand, our study has several strengths. Our results derive from a large cohort (n = 2794) of patients enrolled in the largest, randomized adjuvant trial ever conducted in the field of HER2-positive breast cancer. We included only patients receiving adjuvant trastuzumab-based therapy started concurrently with modern chemotherapy. Trial sample size allowed the exploration of relevant patient subgroups. All data used for the analyses were prospectively collected during the trial conduction, as detailed in the study protocol.
In conclusion, in patients with HER2-positive early breast cancer enrolled in the ALTTO trial and treated with modern chemotherapy and trastuzumab-based therapies, the PREDICT score substantially underestimated OS. The suboptimal performance of this prognostic tool was observed irrespective of type of anti-HER2 treatment, type of chemotherapy regimen, age of the patients at the time of diagnosis, central hormone receptor status, pathological nodal status, and pathological tumor size. Our results suggest that the current version of PREDICT should be used with caution to give prognostic estimation in HER2-positive early breast cancer patients treated in the modern era with effective chemotherapy and anti-HER2 targeted therapies. The further improvement of therapeutic strategies expected in the near future will likely increase the survival of patients with HER2-positive early breast cancer, thus requiring the current version of PREDICT to be updated to provide reliable prognostic estimation in these patients.

Study design and patients
Details of the ALTTO trial design have been published previously 31-33 .
Eligible patients were randomized to one of four anti-HER2 treatment arms: trastuzumab alone, lapatinib alone, sequential treatment with trastuzumab for 12 weeks followed by a 6-week washout period and then a further 34 weeks of lapatinib, and dual anti-HER2 blockade with trastuzumab plus lapatinib. The CONSORT diagram of the ALTTO study is reported in the ALTTO primary analysis paper.
Anti-HER2 treatment could be administered as per physician's choice following chemotherapy completion (design 1), or concomitantly, either with a taxane after anthracycline-based chemotherapy (design 2) or with 6 cycles of docetaxel and carboplatin in an anthracycline-free regimen (design 2B). In all treatment arms, adjuvant anti-HER2 therapy was administered for 1 year.
In 2011, after the first interim analysis, the lapatinib arm was closed and patients were offered adjuvant commercial trastuzumab 31 .
In the present analysis, in order to reflect current clinical practice in this setting, only patients who received concurrent chemotherapy (design 2 and design 2B) and who received trastuzumab-based anti-HER2 therapy (i.e. trastuzumab alone arm, trastuzumab followed by lapatinib arm and trastuzumab plus lapatinib arm) were included. All patients originally assigned to the lapatinib alone arm, and those who received anti-HER2 therapies at the completion of all chemotherapy (sequential treatment, design 1) were excluded.

Ethics section
All patients provided written informed consent prior to enrollment in ALTTO. The project proposal of the present exploratory analysis was submitted to and approved by the ALTTO Steering Committee.

Study objectives
The primary objective of the current analysis was to investigate the prognostic performance of PREDICT in breast cancer patients with early-stage HER2-positive disease treated with modern chemotherapy and concurrent trastuzumab-based anti-HER2 therapy.
Secondary objectives were to investigate the prognostic performance of PREDICT according to the type of anti-HER2 treatment received (trastuzumab alone, trastuzumab followed by lapatinib, and trastuzumab plus lapatinib), type of chemotherapy regimen received (anthracycline-based chemotherapy regimens vs. non-anthracycline-based chemotherapy regimens), age of patients at the time of diagnosis (age ≤40 years vs. age 41-64 years vs. age ≥65 years), central hormone receptor status (hormone receptor-positive vs. negative), pathological nodal status (node-negative vs. node-positive disease [

Data extraction
PREDICT estimates for each patient were calculated by one investigator blinded to patient outcomes. Patient and tumor characteristics, as well as administered adjuvant anticancer treatments, were entered in the PREDICT v.2.2 program to calculate the predicted 5-year OS for each patient. Detection modality and Ki67 status were considered "unknown" for all patients (as these variables were not collected as part of the ALTTO trial).
The most updated ALTTO database was used for this analysis 9 , which corresponds to at least 5-year follow-up for every single patient.

Statistical analysis
The present analysis should be considered as exploratory, since it was not preplanned in the study protocol and the power of the statistical analyses performed was not pre-specified.
The prognostic performance of PREDICT was evaluated by assessing the following endpoints: i) calibration, defined as the agreement between the predicted and observed survival rates, and ii) discriminatory accuracy, defined as the ability to distinguish individuals who will survive 5 years from those who will not (i.e. the ability to discern patients with good outcomes from those with poor outcomes at the individual patient level).
The observation time for each patient was defined as the time between the date of diagnosis and an event. OS event was defined as death from any cause.
The median predicted 5-year OS was calculated from individual predicted outcomes by PREDICT v. 2.2.
For assessing calibration, the median predicted 5-year survival probabilities (by PREDICT) were compared with the observed 5-year survival rates (as obtained by Kaplan-Meier curves). We had to use the median 5-year prediction instead of the mean 5-year prediction, due to the skewness in the distribution, i.e. mean 5-year prediction was 83.6% while median 5-year prediction was 88.0%, and thus the mean predicted 5-year survival probability underestimated the center of the distribution. Therefore, we used the median as a robust estimator of the center of the distribution. Using the standard error as obtained by the Kaplan-Meier curve, we calculated 95% CI for the difference in predicted vs. observed 5-year survival. Calibration plots for PREDICT were constructed by visualizing mean predicted vs. observed survival outcomes by deciles of predicted outcomes.
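As a rough illustration of the calibration step described above, the sketch below estimates observed survival at a fixed horizon with a hand-written Kaplan-Meier estimator and compares it with the median predicted probability. It is written in Python rather than the SAS and R code used for the analysis, and the function names (`km_survival_at`, `calibration_difference`) are hypothetical. The use of Greenwood's formula for the standard error and of a normal-approximation 95% CI are assumptions, since the text does not specify how the standard error was derived; the median prediction is treated as a fixed value.

```python
import math
from statistics import median


def km_survival_at(times, events, horizon):
    """Kaplan-Meier survival estimate and Greenwood standard error at `horizon`.

    times   : follow-up time per patient (same unit as `horizon`)
    events  : 1 if the patient died at that time, 0 if censored
    """
    # Distinct death times up to the horizon, in increasing order.
    death_times = sorted(
        {t for t, e in zip(times, events) if e == 1 and t <= horizon}
    )
    surv = 1.0
    greenwood_sum = 0.0
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        # Greenwood term; skipped when everyone at risk dies (survival is 0).
        if at_risk > deaths:
            greenwood_sum += deaths / (at_risk * (at_risk - deaths))
    se = surv * math.sqrt(greenwood_sum)
    return surv, se


def calibration_difference(predicted, times, events, horizon=5.0, z=1.96):
    """Observed minus median predicted survival, with an approximate 95% CI.

    Assumption: only the Kaplan-Meier standard error contributes to the CI.
    """
    observed, se = km_survival_at(times, events, horizon)
    diff = observed - median(predicted)
    return diff, (diff - z * se, diff + z * se)


# Toy data (illustrative only, not ALTTO data).
toy_times = [1, 2, 3, 4, 6, 6, 6, 6, 6, 6]
toy_events = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_predicted = [0.70] * 10
toy_diff, toy_ci = calibration_difference(toy_predicted, toy_times, toy_events)
```

A positive difference means that observed survival exceeds the predicted value, i.e. the tool underestimates survival. With the values reported in this analysis (observed 94.7%, median predicted 88.0%), the difference is 6.7 percentage points.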
For assessing discriminatory accuracy, the area under the receiver-operator characteristic curve (AUC under the ROC) and corresponding 95% CI for 5-year predicted OS were calculated. The AUC translates into the probability that the predicted outcome of a randomly selected patient who indeed had that outcome is higher than that of a randomly selected patient who did not; the higher the AUC, the better the tool is at identifying patients with a better survival.
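The probabilistic reading of the AUC given above can be sketched directly as a pairwise comparison. The Python function below is a minimal illustration, not the code used for the analysis: the name `auc_concordance` is hypothetical, ties are counted as one half (a common convention that the text does not specify), and each patient's vital status at 5 years is assumed to be known, so censoring is ignored.

```python
def auc_concordance(predicted_survival, survived):
    """AUC as the probability that a randomly chosen 5-year survivor has a
    higher predicted survival than a randomly chosen non-survivor.

    predicted_survival : predicted 5-year OS per patient (any monotone scale)
    survived           : truthy if the patient was alive at 5 years
    """
    survivors = [p for p, s in zip(predicted_survival, survived) if s]
    deaths = [p for p, s in zip(predicted_survival, survived) if not s]
    if not survivors or not deaths:
        raise ValueError("both outcome groups must be non-empty")

    concordant = 0.0
    for p in survivors:
        for q in deaths:
            if p > q:
                concordant += 1.0   # pair ranked correctly
            elif p == q:
                concordant += 0.5   # tie counts as half (assumed convention)
    return concordant / (len(survivors) * len(deaths))


# Toy data (illustrative only, not ALTTO data).
toy_pred = [0.95, 0.90, 0.85, 0.60, 0.80]
toy_alive = [1, 1, 0, 0, 1]
toy_auc = auc_concordance(toy_pred, toy_alive)
```

In the toy data, five of the six survivor/non-survivor pairs are ranked correctly, giving an AUC of 5/6. A value of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation.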
Subgroup analyses were performed to investigate the prognostic performance of PREDICT according to the type of anti-HER2 treatment and chemotherapy received, age at the time of diagnosis, central hormone receptor status, pathological nodal status, and pathological tumor size.
Statistical analysis was performed by L.A. using SAS 9.4 statistical software (SAS Institute, Cary, NC) and R.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
Data can be made available upon reasonable request. Data and results from the Data Centre at Institut Jules Bordet in Brussels (Belgium) can be made available upon approval of a research proposal.