Incorporation of biologic factors for the staging of de novo stage IV breast cancer

This study aimed to investigate the prognostic value of biological factors, including histological grade, estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor-2 (HER2) status, in de novo stage IV breast cancer. Data for eligible patients diagnosed between 2010 and 2014 were collected from the Surveillance, Epidemiology, and End Results database. Receiver operating characteristic curve analysis, Kaplan–Meier analysis, and Cox proportional hazards analysis were used. We included 8725 patients with a 3-year breast cancer-specific survival (BCSS) of 52.6%. Higher histologic grade, HER2-negative, ER-negative, and PR-negative disease were significantly associated with lower BCSS in the multivariate prognostic analysis. A risk score staging system separated patients into four risk groups. The risk score was assigned according to a point system: 1 point for grade 3, 1 point if hormone receptor-negative, and 1 point if HER2-negative. The 3-year BCSS was 76.3%, 64.5%, 48.5%, and 23.7% in patients with 0, 1, 2, and 3 points, respectively, with a median BCSS of 72, 52, 35, and 16 months, respectively (P < 0.001). The multivariate prognostic analysis showed that the risk score staging system was an independent prognostic factor associated with BCSS. Patients with a higher risk score had a lower BCSS. Sensitivity analyses replicated similar findings after stratification according to tumor stage, nodal stage, the sites of distant metastasis, and the number of distant metastases. In conclusion, our risk score staging system shows promise for the prognostic stratification of de novo stage IV breast cancer.


INTRODUCTION
De novo stage IV breast cancer is a rare disease that is considered incurable and accounts for ~5% of newly diagnosed breast cancer cases 1. Earlier, the majority of patients with this type of cancer did not survive for more than 5 years after diagnosis, with a 5-year overall survival (OS) of ~20% 2. However, with advances in chemotherapy, targeted therapy, and endocrine therapy, the 5-year OS has now increased to 40% in the modern era of multidisciplinary management 3,4. The 5-year OS could reach 50% in hormone receptor (HoR)-positive (+) tumors, but the 3-year breast cancer-specific survival (BCSS) and OS for de novo stage IV triple-negative breast cancer are still lower than 20% 5,6. Further, the median OS for human epidermal growth factor receptor-2-positive (HER2+) tumors in this population has been reported to reach 60 months after trastuzumab-based therapy 7, and the prognosis of de novo stage IV disease was found to be better than that of recurrent disease 8-10.
Gene expression studies have suggested that the histological grade is more closely related to the molecular composition of breast cancer than the primary tumor size and lymph node status 11,12. Tumor grade is an important biologic factor that has been incorporated into the most recent breast cancer staging system, the 8th edition of the American Joint Committee on Cancer (AJCC) staging system 13. The 8th AJCC breast cancer staging system has changed significantly from the 7th AJCC anatomical staging system. Biologic factors in breast cancer, including histological grade, HER2, estrogen receptor (ER), and progesterone receptor (PR) status, have now been added to the traditional anatomic primary tumor (T), regional lymph nodes (N), and distant metastasis (M) staging system to create new stages 13. Several studies have verified that the new staging system is more accurate in predicting prognosis than the 7th AJCC staging system 14-17. However, the new staging system only includes patients with non-metastatic disease; those with de novo stage IV disease were excluded 13. Previous studies, including ours, have shown that the HoR+/HER2+ subtype was associated with significantly better BCSS than the HoR+/HER2− and HoR−/HER2+ subtypes in de novo stage IV disease, while those with HoR−/HER2− disease had the worst survival 2,5. Therefore, tumor biologic factors are significant predictors of both response to therapy and prognosis in non-metastatic as well as metastatic disease.
Patients with de novo stage IV disease are an important population enrolled in clinical trials, and survival within this population varies significantly. Therefore, it is critical to investigate whether the biologic factors used in the 8th AJCC stages could also apply to de novo stage IV disease. In light of this, we explored the prognostic value of biological factors for this disease using a population-based cohort from the Surveillance, Epidemiology, and End Results (SEER) program.
Survival and prognosis
Within a median follow-up of 29 months (range, 0-83 months), 5326 deaths were observed, of which 4653 were related to breast cancer. The 3-year BCSS was 52.6%, and the median BCSS was 39 months.
Multivariate analysis showed that higher histologic grade, HER2-negative, single HoR-positive (ER-positive or PR-positive), and double HoR-negative (ER-negative and PR-negative) status were significantly associated with lower BCSS (Table 2). Moreover, age, race/ethnicity, histology, surgery, chemotherapy, bone metastasis, lung metastasis, liver metastasis, and brain metastasis were also identified as independent prognostic factors correlated with BCSS. However, BCSS was comparable among patients with stage T1 and T2 disease, and was also comparable among patients with stage N0, N1, N2, and N3 disease.
The area under the ROC curve (AUC) of the SEER risk score staging system (AUC = 0.628, 95% CI 0.618-0.638) was significantly higher than that of the MDACC risk score staging system (AUC = 0.611, 95% CI 0.601-0.622) (P < 0.0001) (Fig. 3). The results indicated that the SEER risk score staging system had a better predictive performance for BCSS compared to the MDACC risk score staging system.

ER estrogen receptor, G1 well-differentiated, G2 moderately differentiated, G3 poorly/undifferentiated, HER2 human epidermal growth factor receptor-2, N nodal, PR progesterone receptor, SD standard deviation, T tumor. a Indicates four metastatic sites, including bone, brain, liver, and lung.

Prognostic value of the risk score staging system
We used multivariate prognostic analysis to assess the prognostic effect of the SEER risk score staging system based on BCSS (Table 4). After adjustment for age, race/ethnicity, histology, T stage, N stage, surgery, chemotherapy, radiotherapy, and the sites of distant metastasis, the SEER risk score staging system was found to be an independent prognostic factor associated with BCSS. Patients with a higher risk score had a lower BCSS. With risk score 0 as the reference, BCSS was significantly lower in patients with risk score 1 (hazard ratio [HR] = 1.473, 95% confidence interval [CI] = 1.195-1.816, P < 0.001), risk score 2 (HR = 2.437, 95% CI = 1.979-3.001, P < 0.001), and risk score 3 (HR = 5.092, 95% CI = 4.121-6.291, P < 0.001). Patients with risk score 3 also had significantly lower BCSS than those with risk score 1 (HR = 3.456, 95% CI = 3.182-3.754, P < 0.001) and risk score 2 (HR = 2.647, 95% CI = 2.459-2.848, P < 0.001).
Sensitivity analyses replicated similar findings after stratification according to the T stage (Fig. 4a-d), N stage (Fig. 5a-d), the sites of distant metastasis (Fig. 6a-e), and the number of distant metastases (Fig. 7a-c) (Table 5).
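The point system stated in the abstract (1 point each for grade 3, hormone receptor-negative status, and HER2-negative status) can be illustrated with a short Python sketch. This is not the authors' code. The function name and the lookup table are hypothetical, and because this text does not spell out how ER and PR status combine into "hormone receptor-negative", that flag is taken as an input rather than derived.

```python
# Illustrative sketch only; names are hypothetical, not from the study.

# 3-year BCSS (%) and median BCSS (months) by risk score, as reported
# in the abstract of this paper.
REPORTED_OUTCOMES = {
    0: (76.3, 72),
    1: (64.5, 52),
    2: (48.5, 35),
    3: (23.7, 16),
}


def risk_score(grade: int, hor_negative: bool, her2_negative: bool) -> int:
    """Return the 0-3 risk score.

    grade         -- histological grade (1, 2, or 3)
    hor_negative  -- True if the tumor counts as hormone receptor-negative;
                     the ER/PR rule behind this flag is an assumption left
                     to the caller
    her2_negative -- True if the tumor is HER2-negative
    """
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    return int(grade == 3) + int(hor_negative) + int(her2_negative)


if __name__ == "__main__":
    score = risk_score(grade=3, hor_negative=False, her2_negative=True)
    bcss_3y, median_months = REPORTED_OUTCOMES[score]
    print(score, bcss_3y, median_months)
```

Keeping the score as a plain sum makes each of the four groups map directly onto the reported survival figures.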

DISCUSSION
A primary limitation of the 8th AJCC staging system is that it applies only to patients with non-metastatic breast cancer. It is critical to investigate whether the biologic factors used in the 8th AJCC stages could also be applied to de novo stage IV breast cancer. In the present study, we used a population-based cohort from the SEER program to investigate the prognostic effect of biologic factors in de novo stage IV breast cancer. The current study indicated that a risk score staging system based on histological grade, HER2 status, ER status, and PR status might provide better risk stratification for this population.
Fig. 3 Receiver operating characteristics analyses for prediction of breast cancer-specific survival with the two risk score staging systems. The SEER risk score staging system had a better predictive performance for breast cancer-specific survival compared to the MD Anderson Cancer Center risk score staging system.

The findings in our study may have potential clinical implications in the current era of personalized therapy for de novo stage IV breast cancer. First, the risk score staging system provides a concise summary of de novo stage IV breast cancer, which allows for efficient communication among clinicians and researchers. In addition, it provides a framework for relaying prognostic stratification based on the sum of the tumor and biologic factors. According to this prognostic framework, the risk score staging system can be applied to determine the optimal treatment approach for individual patients. Moreover, it can more thoroughly and accurately assess the impact of novel or changing treatment approaches for this population. Finally, the risk score staging system can be used to define subgroups for inclusion in clinical trials. The present analysis reveals the heterogeneity in the prognosis of de novo stage IV breast cancer and therefore addresses a significant limitation of the latest AJCC staging system, which does not include de novo stage IV disease. Although several studies have incorporated biological factors into the substages of this population, only the histologic grade, ER, and HER2 were included in the scoring system for stratification, and the PR status was excluded 18,19. Another limitation of the previous studies was that the survival curves for risk score 0 and risk score 1 overlapped 18,19. In this study, a large cohort was used, and the BCSS curves could be clearly distinguished.
Additionally, the risk score staging system developed in our study using the data from the SEER program (including grade, HER2, ER, and PR status) had a better predictive performance for BCSS than the MDACC risk score staging system (including grade, HER2, and ER status) 18. Therefore, in order to better predict the prognosis and guide treatment decisions, substages based on the risk score staging system should be introduced in the advanced setting, as has been done for patients with non-metastatic disease. Additionally, risk stratification based on the risk score staging system will undoubtedly play a critical role in patient care and research for this population.
Triple-negative breast cancer had the worst outcomes in de novo stage IV disease 2,5. In our study, we found that the median BCSS was less than 20 months in triple-negative breast cancer patients regardless of the histologic grade. However, it should be noted that in HER2-negative tumors, single HoR-positive tumors (ER+/PR− or ER−/PR+ subtypes) had lower BCSS than double HoR-positive tumors 20. Several studies also confirmed that single HoR-positive tumors showed worse prognosis than double HoR-positive tumors in the HER2-negative group 21-24. The lack of a significant prognostic effect of single HoR-positive status in HER2-positive tumors may be related to trastuzumab treatment. In the 8th AJCC staging system, prognostic stage groups were determined in breast cancer patients who mostly underwent appropriate multidisciplinary treatment, including chemotherapy, anti-HER2 therapy, and endocrine therapy 13. In our study, all patients with HoR+/HER2+, HoR−/HER2+, and HoR−/HER2− subtypes received chemotherapy, and approximately half of the HoR+/HER2− patients received chemotherapy. However, we did not have data regarding anti-HER2 therapy and endocrine therapy in this study. In our study, the effect of biological factors on the survival trends in de novo stage IV breast cancer was similar to the results from non-metastatic breast cancer 14-17. Therefore, we could assume that the majority of patients in our SEER-based study also received appropriate multidisciplinary treatment according to the status of biologic factors.
According to the 8th AJCC pathological staging system, T2N0M0 G2/HER2−/ER+/PR+ patients are classified as stage IA, whereas T2N0M0 G2/HER2−/ER+/PR−, G2/HER2−/ER−/PR+, and G2/HER2−/ER−/PR− patients are classified as stage IIB 13. Thus, consistent with our findings, the survival of HER2-negative, single HoR-positive tumors was comparable to that of double HoR-negative tumors according to the new AJCC pathological staging system. The aggressive behavior of single HoR-positive tumors indicated that they had distinct clinical and biological features. Therefore, in this study, we integrated single HoR-positive and double HoR-negative tumors into an aggressive subgroup. A recent study showed that the HER2−/ER+/PR− subtype exhibited more ZNF703 and RPS6KB1 amplification events than HER2−/ER+/PR+ tumors 25, which could promote cell proliferation, expand the stem cell population, and confer resistance to chemotherapy, tamoxifen, and radiotherapy 25-30. Therefore, further exploration of treatment strategies for single HoR-positive tumors is needed in the future to improve patient survival.

Fig. 5 Comparison of breast cancer-specific survival by risk score for N0-3 patients using the SEER risk score staging system. a N0; b N1; c N2; d N3.
The 8th AJCC staging system incorporates the T stage, N stage, histologic grade, ER, PR, and HER2 status in the determination of the novel stages 13, but we did not include the T stage and N stage in this study because their effect on BCSS in patients with de novo stage IV breast cancer is controversial. Additionally, Li et al. reported that there was no difference in survival between node-negative and node-positive patients 10. Moreover, the current AJCC staging is mainly divided into clinical staging (all patients for clinical classification and staging) and pathological staging (for patients in whom surgery is the initial treatment), but the role of surgery in de novo stage IV disease remains controversial 3,4,31-33. Therefore, the significance of integrating T and N stages into the risk score staging system for this population needs to be further explored in the future.
An important caveat is that the patients enrolled in the determination of the 8th AJCC stages were treated with multimodal therapy according to the status of biologic factors. However, standard testing of biologic markers for evidence-based treatment might not be accessible to the majority of patients around the globe, especially those in low- and middle-income countries 34. Thus, the applicability of the risk score staging system to patients worldwide may be compromised.
The role of local management in patients with de novo stage IV breast cancer remains controversial. In our study, we found that local surgery was associated with better BCSS for this population. However, conflicting results were reported in the American Society of Clinical Oncology 2020 data. A retrospective study using data from the National Cancer Database showed that primary tumor resection was associated with better overall survival in breast cancer patients with de novo stage IV disease 35. In contrast, the randomized E2108 trial indicated that adding locoregional treatment to optimal systemic therapy did not improve progression-free survival or overall survival compared with optimal systemic therapy alone 36. According to our findings, further study is warranted to investigate the role of local management in de novo stage IV breast cancer after stratification by the risk score staging system.
Several limitations of the present analysis should be emphasized. First, the SEER database lacks sufficient details of the chemotherapy regimen, endocrine therapy, and anti-HER2 therapy. Second, comorbidity and performance status are also not recorded in the SEER database. Third, our study used BCSS in order to neutralize any confounding effects resulting from non-breast cancer-related death. In addition, the SEER program lacks a central pathology review for the biologic factors considered in the risk score, which could potentially lead to misclassification in the risk score staging system. Finally, the median follow-up period was short (29 months) in our study, which may have concealed some minor long-term effects among different stage categories.
In summary, the risk score staging system proposed in this study could be useful for more detailed stratification of de novo stage IV breast cancer and reflect the outcome of individualized treatment. Further studies involving larger sample sizes and more extended observation periods should be conducted to confirm the prognostic effect and validity of this staging system.

METHODS
Patients
Data for female breast cancer diagnosed between 2010 and 2014 were extracted from the population-based SEER database 37. Patients diagnosed with de novo stage IV breast cancer were included. De novo stage IV breast cancer was defined as distant metastases known at the time of diagnosis or found during the initial staging workup prior to the first course of treatment. We excluded patients with no pathological diagnosis, T0 stage, or no data on T stage, N stage, tumor grade, HER2, ER, or PR status. Patients with unknown metastatic status at the bone, brain, liver, or lung were also excluded. Moreover, patients with HoR+/HER2+, HoR−/HER2+, and HoR−/HER2− subtypes who did not receive chemotherapy were excluded from this study. Our study was exempt from approval by the Institutional Review Board of the First Affiliated Hospital of Xiamen University because the SEER program provides de-identified patient information.

Fig. 6 Comparison of breast cancer-specific survival by risk score in different metastatic sites using the SEER risk score staging system. a bone; b brain; c liver; d lung; e other.
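As a rough illustration of the exclusion criteria described in this subsection, the following Python sketch expresses them as a single filter. It is not the authors' code. The `Patient` record and its field names are hypothetical and do not correspond to actual SEER variable names, and treating a tumor as HoR-positive when ER or PR is positive is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """Hypothetical record; field names are illustrative, not SEER variables."""
    pathologically_confirmed: bool
    t_stage: Optional[str]          # e.g. "T2"; None if not recorded
    n_stage: Optional[str]          # e.g. "N1"; None if not recorded
    grade: Optional[int]            # 1, 2, 3; None if not recorded
    er_positive: Optional[bool]     # None if not recorded
    pr_positive: Optional[bool]
    her2_positive: Optional[bool]
    metastatic_sites_known: bool    # bone, brain, liver, and lung status all recorded
    chemotherapy: bool


def is_eligible(p: Patient) -> bool:
    """Apply the exclusion criteria listed in the Patients subsection."""
    if not p.pathologically_confirmed:
        return False
    if p.t_stage is None or p.t_stage == "T0":
        return False
    if p.n_stage is None or p.grade is None:
        return False
    if p.er_positive is None or p.pr_positive is None or p.her2_positive is None:
        return False
    if not p.metastatic_sites_known:
        return False
    # Assumption: HoR-positive means ER-positive and/or PR-positive.
    hor_positive = p.er_positive or p.pr_positive
    # Every subtype except HoR+/HER2- must have received chemotherapy.
    is_hor_pos_her2_neg = hor_positive and not p.her2_positive
    if not is_hor_pos_her2_neg and not p.chemotherapy:
        return False
    return True
```

Applying a filter of this kind to each record (for example, `[p for p in patients if is_eligible(p)]`) would yield the analysis cohort under the stated assumptions.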

Variables
The following variables of interest were extracted: age at diagnosis, race/ethnicity, histology, T stage, N stage, histological grade, ER status, PR status, HER2 status, radiotherapy, surgical procedures, and chemotherapy. In addition, the patterns of distant metastasis, including bone, brain, liver, lung, and other sites of metastasis, were included. TNM stage was determined based on the AJCC 7th staging system.

Statistical analysis
The primary outcome in the present study was BCSS, which was defined as the time from the initial diagnosis to death from breast cancer. The median BCSS and BCSS rates were estimated using the Kaplan-Meier method, and BCSS was compared among subgroups by the log-rank test. The ROC curve was used to evaluate the AUC in order to compare the performance of different risk score staging systems in predicting BCSS. The independent prognostic factors associated with BCSS were determined with the multivariate Cox proportional hazards model. Sensitivity analyses stratified by the T stage, N stage, the sites of distant metastasis, and the number of distant metastases were performed. All data were analyzed using IBM SPSS version 22.0 (IBM Corp., Armonk, NY) and MedCalc 13.0 software (MedCalc Software BVBA, Ostend, Belgium). A P value < 0.05 was considered to indicate statistical significance, and all tests were two-sided.
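For readers unfamiliar with the Kaplan-Meier method named above, a minimal pure-Python sketch of the product-limit estimator and the median survival it implies is given below. This is an illustration, not the SPSS procedure used in the study. The function names are hypothetical, and treating deaths from other causes as censored observations is an assumption commonly made for cause-specific survival.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient (e.g. months from diagnosis)
    events -- 1 if the patient died of breast cancer at that time,
              0 if censored (alive at last follow-up, or, by assumption,
              died of another cause)
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # patients leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First time at which estimated survival is 50% or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    toy_times = [6, 12, 12, 20, 30, 36]
    toy_events = [1, 1, 0, 1, 0, 1]
    km = kaplan_meier(toy_times, toy_events)
    print(km, median_survival(km))
```

A 3-year BCSS rate corresponds to reading the estimated survival at 36 months, and the median BCSS to the time at which the curve first reaches 50%.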

Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.

Fig. 7 Comparison of breast cancer-specific survival by risk score in different number of metastatic sites using the SEER risk score staging system. a one site; b two sites; c three-four sites.
Z.-Y. He et al.