Introduction

Breast cancer (BC) is the most common type of malignancy in women and the second most common cancer overall. According to the GLOBOCAN database, the age-standardized incidence rate in Italy is 92.8 cases per 100,000 women per year1.

Classification of BC is primarily based on the expression of key signaling molecules, including the receptors for the female sex hormones estrogen and progesterone (ER, PR; collectively hormone receptors, HR), as well as the human epidermal growth factor receptor 2 (HER2).

For the large population of patients with HR+ (ER+ and/or PR+), HER2− tumor status (~70% of nonmetastatic primary BCs)2, adjuvant hormonal therapy (HT) is recommended and is used as the standard treatment for most patients. When adjuvant chemotherapy (CT) is given in addition to HT, the side effects and risk burden require adequate measures to best identify the patients who will most likely derive a clinical benefit.

Traditionally, a set of clinical and pathological features have been used to evaluate the prognosis of the patient and guide decisions on adjuvant therapies. Besides the HR status, prognostic parameters include patient age, lymph node involvement, tumor size, histological type, and grade, as well as the Ki67 proliferation marker3. However, these conventional clinical and histopathological markers have insufficient specificity and sensitivity to precisely predict which patients are likely to experience a significant benefit of CT on cancer recurrence that outweighs the substantial side effects.

Conventional parameters apparently have limited association with tumor biology, leaving a broad margin of predictive uncertainty4. Accordingly, only a minority of HR+, HER2− patients appear to benefit from CT: in a large meta-analysis performed by the Early Breast Cancer Trialists’ Collaborative Group, the reduction of the 10-year recurrence rate by adjuvant CT was <10%5. Recently, the prospective ECOG-ACRIN Trial Assigning Individualized Options for Treatment (TAILORx) established in over 10,000 patients that the vast majority of HR+, HER2−, node-negative (N0) primary BC patients (about 80%) do not derive benefit from CT in terms of recurrence risk6.

The 21-gene Oncotype DX Breast Recurrence Score multigene assay was developed to aid physicians in making personalized CT treatment decisions in HR+, HER2− early-stage BC patients. Clinical validation and utility of the assay have been demonstrated in multiple studies with >96,000 N0 and node-positive (N+) BC patients worldwide6,7,8,9,10,11,12.

Two studies have validated the prediction of a CT benefit by the Oncotype DX® test for N0 patients with level 1B and 1A evidence. NSABP B-2013 was a prospective analysis of archived, preserved samples and it demonstrated that patients with Recurrence Score (RS) results of 26–100 derive a substantial benefit from CT, whereas patients with an RS 0–10 had excellent clinical outcomes at 9 years with endocrine therapy alone14. The TAILORx study prospectively assessed in a large, randomized population the merits of CT in patients with RS 11–25 and demonstrated that, overall, they did not derive a significant benefit from CT. Taken together, these results established that the Oncotype DX assay can guide CT treatment decisions such that N0 patients with RS 0–25 can safely forgo CT, whereas patients with RS 26–100 do derive substantial benefit from CT used in addition to hormonal treatment. In postmenopausal women with N+, ER+ BC, the prospective-retrospective analysis of the SWOG-8814 trial established that N1 patients with RS 0–17 could be safely spared CT, whereas patients with RS 31–100 achieved a strong clinical benefit with CT15.

For N0 and N+ disease, aggregate data from prospective registries with 5–10 years of observation confirm that the RS result consistently identifies patients with good clinical outcomes when treated with HT alone10,16,17.

The Oncotype DX assay has been incorporated into clinical and pathological guidelines of major international medical societies including the European Society of Medical Oncology, St. Gallen Consensus Conference, American Society of Clinical Oncology, and American Joint Committee on Cancer. The National Comprehensive Cancer Network Guidelines state that the Oncotype DX assay is the only test with proven validity to predict CT benefit18.

In decision impact studies in Europe conducted before publication of the TAILORx study and, hence, before the predictive cutoffs were established with precision, outcomes have shown the utility of the RS result in clinical practice. In a meta-analysis of more than 500 N0, HR+, HER2− primary BC patients from four studies, the overall rate of recommendation change was 32% post- vs. pre-testing, whereas the CT recommendation rate decreased from 55% to 34%19.

Here we describe the results of the multicenter, prospective, observational study, PONDx, which was performed in Italy from February 2016 to December 2017, and which investigated the real-life use of the Oncotype DX Breast Recurrence Score test by physicians treating early BC patients in routine care in clinical BC reference centers. The study primarily evaluated the impact of the Oncotype DX assay on physicians’ treatment decisions. A further objective was the characterization of the patient population in which the test is used in real-life settings at clinical BC reference centers in Italy.

Results

Data from 1738 BC patients who underwent Oncotype DX testing were available from 27 reference centers located in 6 regions of Italy (Lombardia, Lazio, Emilia Romagna, Campania, Abruzzo, and Marche). In the present analysis, 14 patients were excluded due to incomplete data, leaving 1724 in the analysis cohort.

The tumors diagnosed in the participating patients were mostly invasive ductal HR-positive carcinomas, with histological grade 2 and 3, and tumor size ranging from 1 to 5 cm, with Ki67 expression mostly in the range from 10% to >30%. The majority of patients were >50 years old; 36% were premenopausal and 55% postmenopausal (Table 1).

TABLE 1 Patient and tumor characteristics in the overall population (N = 1738).

The distribution of Recurrence Score results is shown in Table 2 (left columns). Judged by the conventional cut points, 57% of the overall population were in the RS 0–17, 34% in the RS 18–30 and 9% in the RS 31–100 group. Using the TAILORx-based cut points, 83% of the population was in the RS 0–25 group and 17% in the RS 26–100 group.

TABLE 2 Oncotype DX Breast Recurrence Score® results categorized according to RS groups used prior to TAILORx (left) and RS groups based on TAILORx cut points (shaded columns, right).

Some of the key classical pathological parameters were discordant with the RS results. Using the TAILORx-based RS cut points (Table 2, right columns), this discordance remains pronounced, with a substantial proportion of grade 2 (89%) and grade 3 (64%) patients having RS 0–25, indicating no CT benefit. Similarly, some patients with the lowest (<10%) or highest (>30%) level of expression of the proliferation marker Ki67 were found to have RS 26–100 and RS 0–25, respectively. The latter may in part be related to the known lack of reproducibility of Ki67 assays20,21,22, yet could be expected considering Ki67 has not been demonstrated to correlate with CT response23. No discernible correlations were found between RS result and tumor size, age, or menopausal status.

In the population analyzed for treatment recommendation (n = 1683), the physicians recommended CT + HT in 824 patients (49%) prior to the availability of the RS result, whereas 859 (51%) were assigned to HT alone. Patients with recommendations for therapies other than HT or CT + HT were excluded from this analysis.

After the RS results became available, the physicians changed their decision in 512 patients (30%). Consequently, the number of patients with a CT + HT recommendation dropped from 824 to 524, corresponding to a net reduction of CT recommendations of 36% (Table 3 and Fig. 1).
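The arithmetic behind these percentages can be checked with a short sketch. It uses only the counts quoted above; the function name `net_reduction` and the variable names are illustrative and do not come from the study.

```python
def net_reduction(pre: int, post: int) -> float:
    """Relative (net) reduction, expressed as a fraction of the pre-test count."""
    if pre <= 0:
        raise ValueError("pre must be a positive count")
    return (pre - post) / pre


# Counts as reported in the text (variable names are illustrative).
total = 1683       # patients analyzed for treatment recommendation
pre_ct_ht = 824    # CT + HT recommendations before the RS result
post_ct_ht = 524   # CT + HT recommendations after the RS result
changed = 512      # patients whose recommendation changed in either direction

print(f"Pre-RS CT + HT share: {pre_ct_ht / total:.0%}")                      # 49%
print(f"Recommendation change rate: {changed / total:.0%}")                  # 30%
print(f"Net reduction of CT + HT: {net_reduction(pre_ct_ht, post_ct_ht):.0%}")  # 36%
```

Note that the change rate is computed against all analyzed patients, whereas the net reduction is computed against the pre-test number of CT + HT recommendations, which is why the two percentages differ.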

TABLE 3 Recommendations by the treating physician regarding adjuvant anti-tumor therapy before (PRE-RS) and after availability (POST-RS) of the Recurrence Score result. For these analyses, patients with recommendations other than CT + HT or HT were excluded.
Fig. 1: Changes in treatment recommendations before and after availability of the Recurrence Score result.

Rates of hormone therapy alone (HT) or chemo-endocrine therapy (CT + HT) recommendations before testing (Pre RS) and changes in recommendations based on the test results (Post RS).

Looking at selected groups defined by tumor characteristics, the following picture emerges among the patients with clinicopathological high-risk tumors (Table 3): in patients with grade 3 malignancies (n = 475), physicians changed their treatment recommendation in 37% of cases, leading to a 27% net reduction of CT + HT. Patients with Ki67 expression >20% (n = 962) had their recommendation amended in 36%, with a 37% net reduction in CT + HT recommendations (from 608 to 386 patients).

For patients with N0 nodal status (n = 1160) CT + HT recommendations declined by 27% (from 512 to 374 patients). In contrast, the group with N1 disease (n = 414) saw the frequency of CT + HT recommendations reduced by a net percentage of 55% after the RS results became available (from 258 to 110 patients). Similarly, pronounced effects were observed for the subset of patients with Nmic (n = 109) and with lobular BC (n = 206). In these specific groups, the net reduction of CT + HT was 39% and 53%, respectively.

Regarding patients aged >50 vs. ≤50 years, both age groups showed a strong net reduction of CT + HT recommendations (by 31% and 42%, respectively). In the group with initial recommendations of HT-only (N = 859), a minor fraction of patients had their recommended treatment changed to CT + HT (n = 102; 12%).

A simulation of expected treatment recommendations after availability of the RS result was performed based on TAILORx RS cut points6 and the estimated interpretation of the RS results according to the findings of the TAILORx study (see “Methods” and Table 4). According to these premises, 75% (n = 1263) of patients would receive HT-only and 25% (n = 420) adjuvant CT + HT regimens in this setting, corresponding to a relative reduction of 49% for the overall population (Fig. 2) and 47% for N0 patients (Table 5). In both age groups, >50 and ≤50 years, there was a net reduction of CT + HT recommendations (by 41% and 50%, respectively). Patients ≤50 years with N0 disease and RS 16–20 or RS 21–25 represent 9% and 5% of the overall N0 population, respectively.

TABLE 4 Algorithm used to estimate distribution of post-RS treatment recommendations in N0 patients based on TAILORx results.
Fig. 2: Patients with post-RS recommendations for chemo-endocrine therapy (CT + HT): actual number based on previous RS cut points and expected percentage assuming decision-making according to TAILORx results (N = 1683).

Pre RS: treatment recommendations before availability of the Recurrence Score result. Post RS: treatment recommendations accounting for the Recurrence Score result.

TABLE 5 Expected recommendations assuming decision-making according to TAILORx results.

Discussion

In this observational study, we documented the use of the Oncotype DX assay and its impact on physicians’ therapeutic recommendations in a sizeable patient population from 27 BC reference centers in 6 regions of Italy. Our data add to the growing body of evidence from RS result decision impact studies with N0 and N+ tumors performed in Europe19,24,25,26, North America27,28, or Australia29.

Assessment of the recurrence risk by the Oncotype DX Breast Recurrence Score assay influenced physicians’ choice of adjuvant regimens, resulting in an overall reduction of 36% in CT recommendations vs. decisions based on prognostic-only clinicopathological risk parameters. Recalculation of the expected post-RS recommendations using estimates based on TAILORx cut points and results showed a larger reduction (49%) in the proportion of patients recommended CT in the total population. This observation was consistent in both age groups (≤50 and >50 years).

The significant overall reduction in CT recommendations guided by RS results in this study is consistent with that of other studies published worldwide, which report reductions of up to 47% in similar patient populations with pre-TAILORx cut points. This indicates a consistent decision impact of the RS results19,24,25,26,27,28,29. The relative reduction in CT recommendations is substantially influenced by clinical practice and baseline CT usage, which have been reported to be highly variable across countries, across centers, and within single centers4.

The findings of our study support previous evidence4 of physicians’ uncertainty in recommending adjuvant treatment on the basis of prognostic-only factors that have not been shown to predict CT benefit. Tumor grade and size, in particular, are not stringently related to tumor biology. Although they estimate patient prognosis, i.e., risk of recurrence, they do not predict response to CT as a specific treatment option23.

In addition, the high variability and lack of standardization of Ki67 and histological tumor-grade assessments leave the treating physicians with a margin of uncertainty22. The Oncotype DX test results can provide confidence for CT treatment decisions with a strong body of evidence on the prediction of CT benefit1. Of note, no patient subgroup was identified in this study that could forgo Oncotype DX RS testing without losing potentially useful predictive information on CT benefit or lack thereof.

Analysis of the widely used Ki67 proliferation marker revealed that 66% of the patients with high Ki67 expression (>30%) had an RS result 0–25 and would not be expected to derive benefit from CT, indicating that Ki67 should probably not be used as a dominant indicator for treatment decisions. Similar considerations apply for patients with grade 3 tumors, 64% of whom had an RS result in the 0–25 range. This is consistent with observations in the TAILORx study where 73% of patients with a high clinical risk based on tumor-grade and size assessment had an RS result 0–25 and hence might have been overtreated if the RS result had not been used for the treatment decision6.

Conversely, a minor fraction of patients had their initial HT-only recommendations changed to actually receive additional CT, suggesting that they were rescued from potential undertreatment based on conventional criteria.

The overall change in CT recommendations by Oncotype DX RS results in this study, based on pre-TAILORx cut points, is in the range of that in other published studies with similar patient populations worldwide, indicating a consistent interpretation of the RS results19,24,25,26,27,28,29. The strength of the present study lies in its size: this is the largest decision impact study reported to date. Furthermore, this real-life study confirms the results of a randomized controlled trial. Its main limitation is that it was performed before the TAILORx data became available; those data demonstrated that an even higher proportion of patients could be spared CT on the basis of the RS results. Consequently, we might underestimate the net CT-sparing effect of the Oncotype DX test. Although the results of the TAILORx study were practice-changing for N0 patients, the ongoing RxPONDER trial will provide additional information on the clinical usefulness of the Oncotype DX assay in women with HR-positive BC and positive axillary nodes. In this study, we observed a substantial reduction of CT recommendations (55%) for patients with limited nodal involvement (N1). This is in line with the consistent evidence from the SWOG-881415 and Plan B8 studies, as well as prospective registries9,11, supporting CT sparing for patients with the lowest RS results.

Chemotherapeutic regimens used in the adjuvant setting in women with early BC are associated with a significant risk of acute and long-term adverse effects—the latter including fatigue, cardiotoxicity, cognitive impairment30, peripheral neuropathy, and cases of secondary malignancies including leukemia. Quality of life and working ability may be reduced at least temporarily by these treatments as well. Therefore, tools refining the population that derives appropriate benefit to justify the adverse effects of adjuvant CT serve an important medical need in patients with HR+, HER2− primary BC. The Oncotype DX assay is such an instrument and may contribute significantly to a reduction in the use of CT for patients who are unlikely to derive benefit. Conversely, the assay allows identification of a group of 15–20% of HR+, HER2− early BC patients who derive a substantial benefit from CT. In these patients, CT in addition to HT consistently results in lower distant recurrence rates than HT alone5,13,14.

Overall, 83% of patients in our study had RS 0–25, guiding towards a significant de-escalation of CT. This proportion is consistent with the TAILORx study and with patient registries such as the Surveillance, Epidemiology and End Results registry in the United States31 and the Clalit registry in Israel10, which report 84% and 80% of patients with RS 0–25, respectively. This majority of patients (about 80%) with RS 0–25 consistently showed excellent clinical outcome with HT alone and hence can safely be spared CT.

Results from the TAILORx study are considered practice-changing, because for the first time it was shown in a large prospective randomized trial that a sizeable group of patients could be identified with a unique multigene assay to derive minimal or no benefit from CT. For patients initially recommended to receive chemo-HT in our study, use of the Oncotype DX test with TAILORx cut points led to a reduction by nearly 50% in CT recommendations.

In addition to the clinical benefit to the patients who are spared adverse effects, reduction of CT use has relevant implications for the healthcare system through reduction of direct expenses (cytotoxic drugs and their application) and indirect costs (managing side effects). Benefits to healthcare-associated and societal costs importantly include diminished duration of absence from work, which has been shown to be significantly prolonged by a median of 7 months for patients who receive CT. In fact, the use of CT was one of the factors with the highest risk ratio of delayed time to work after primary BC32, responsible for more than a quarter of the total costs of CT33. Other genomic assays are available for early BC patients; however, it is noteworthy that although Oncotype DX brings value in guiding CT decisions based on direct evidence of prediction of CT benefit, other genomic assays are prognostic-only. Decision impact studies with the MammaPrint® prognostic assay34,35, EndoPredict®36, or Prosigna®37,38 consistently reported a limited impact on net CT use, related to a balance between a significant reduction from CT-HT to HT alone and a significant increase from HT alone to CT-HT.

The observations on the use and impact of the Oncotype DX Breast Recurrence Score test on the participating reference centers of PONDx in Italy support the notion that the test provides clinically useful predictive information, complementing standard clinical and pathological risk parameters for patients with HR+, HER2− N0/N1 primary BC. The physicians used the results to modify their original treatment recommendations, which resulted in a reduction of patients recommended for CT by more than a third.

Estimating the effects based on the RS categories and outcomes established in the recently completed large prospective TAILORx trial confirmed the primary results of PONDx, indicating the potential for an even more pronounced reduction of CT recommendations and thereby potentially sparing a significant proportion of patients from acute and long-term toxicities of these treatments.

Methods

Patient eligibility

Eligible patients fulfilled the validated criteria for use of the Oncotype DX assay: patients aged ≥18 years with a recent diagnosis of early, single-invasive ER+ HER2− BC and available information on lymph node involvement categorized as N0, Nmic (micrometastatic node involvement), or N1 (one to three positive nodes).

Baseline patient documentation included age and sex, menopausal status, conventional clinical and pathological tumor characteristics including histologic type (lobular/ductal), tumor size and grade, nodal status, receptor status (ER, PR, HER2), Ki67 expression, and RS results as soon as available.

Ethics

Patients provided written informed consent before participation in the study. The protocol was approved by the Ethics Committees of all participating institutions: Università La Sapienza di Roma, Policlinico Universitario Agostino Gemelli, IRCCS Regina Elena National Cancer Institute, Fondazione IRCCS Istituto Nazionale dei Tumori, Policlinico Sant’Orsola Malpighi, Ospedale San Raffaele, Ospedale Gaetano Bernabeo, Policlinico Umberto I, Azienda Ospedaliera Universitaria Federico II, ASST Spedali Civili, Ospedale Valduce, ASST di Cremona, ASST Papa Giovanni XXIII, Istituto Europeo di Oncologia, Istituto Clinico Humanitas, IRCCS Multimedica Sesto San Giovanni, Azienda Ospedaliera San Camillo Forlanini, Azienda Ospedaliera San Giovanni - Addolorata, Ospedale SS. Trinità, Ospedale San Giovanni Calibita Fatebenefratelli, Azienda Ospedaliero Universitaria Ospedali Riuniti di Ancona, Ospedale di Belcolle, Ospedale di Circolo e Fondazione Macchi, Ospedale Nuovo Regina Margherita, Policlinico Universitario Campus Biomedico, Istituto Nazionale Tumori Fondazione G. Pascale, and ASST BG Ovest Ospedale Treviglio.

Treatment decisions

Prior to Oncotype DX testing, CT was recommended to patients with worse prognosis based on clinical, pathological, and biological features as per the local clinical practice. Individual treatment modalities (HT, CT-HT) recommended by the treating physician were documented before (pre-RS) and after (post-RS) availability of the test results. Patients followed recommendations that emerged post-RS result.

Descriptive analyses were performed for the overall population and subpopulations of patients with clinical high-risk tumors defined by grade 3 disease and/or >20% Ki67 positivity, patients with N0 vs. N1 nodal status, and those with cancers of lobular histology. For these populations, changes in treatment recommendations regarding HT and/or CT were determined by comparing the percentage of patients receiving a recommendation of HT or HT + CT before vs. after the test results became available to the treating physician. The primary analysis used the RS cut points commonly used before TAILORx was published, to define three RS groups: 0–17, 18–30, and 31–100.
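As an illustration of the primary-analysis grouping, the pre-TAILORx cut points can be expressed as a small helper. This is a minimal sketch: the function name `rs_group_pre_tailorx` and the returned labels are hypothetical and are not part of the study's analysis code.

```python
def rs_group_pre_tailorx(rs: int) -> str:
    """Assign a Recurrence Score result to one of the three pre-TAILORx groups."""
    if not 0 <= rs <= 100:
        raise ValueError("Recurrence Score results range from 0 to 100")
    if rs <= 17:
        return "RS 0-17"
    if rs <= 30:
        return "RS 18-30"
    return "RS 31-100"


# Boundary values fall into the group whose range includes them.
for score in (0, 17, 18, 30, 31, 100):
    print(score, rs_group_pre_tailorx(score))
```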

An additional exploratory analysis regarding the influence of the RS result on treatment recommendations was performed using the RS cut points for N0 patients defined by the TAILORx trial and their expected interpretation in clinical practice. In the TAILORx study, exploratory analyses suggested that all N0 patients above the age of 50 years with RS 0–25 have no CT benefit. For younger patients (≤50 years), an RS of 0–15 indicated no CT benefit, whereas some CT benefit was derived for RS 16–20 (1.6%) and RS 21–25 (6.5%). Patients with N1 disease were assigned to RS groups according to previous cut points as described above. The algorithm described in Table 4 was recommended by an expert panel based on the findings from the exploratory analyses of TAILORx, which suggest a potentially clinically meaningful benefit from CT for a small number of patients. It was applied to patients with nodal status N0 in the analysis of presumed treatment recommendations based on TAILORx findings.
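The age- and RS-dependent interpretation described in this paragraph can be sketched as follows for N0 patients. This is not a reproduction of Table 4: it only encodes the benefit categories stated in the text, and it deliberately does not map the "some CT benefit" category to a treatment recommendation, because that apportioning is defined by the Table 4 algorithm. The function name and category labels are hypothetical.

```python
def tailorx_ct_benefit_n0(rs: int, age: int) -> str:
    """Expected CT benefit category for an N0 patient, per the text's summary of TAILORx."""
    if not 0 <= rs <= 100:
        raise ValueError("Recurrence Score results range from 0 to 100")
    if rs >= 26:
        return "CT benefit"          # RS 26-100, any age
    if age > 50:
        return "no CT benefit"       # RS 0-25, older than 50 years
    if rs <= 15:
        return "no CT benefit"       # RS 0-15, 50 years or younger
    return "some CT benefit"         # RS 16-25, 50 years or younger


print(tailorx_ct_benefit_n0(rs=22, age=62))  # no CT benefit
print(tailorx_ct_benefit_n0(rs=22, age=45))  # some CT benefit
print(tailorx_ct_benefit_n0(rs=30, age=45))  # CT benefit
```

Patients with N1 disease are outside the scope of this sketch, since they were grouped by the earlier cut points (0–17, 18–30, 31–100).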

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.