Clinical outcomes and a nomogram for de novo metastatic breast cancer with lung metastasis: a population-based study

To better understand the clinical characteristics of newly diagnosed lung metastatic breast cancer (LMBC) and quantify its prognosis, we retrieved data on patients with LMBC from the Surveillance, Epidemiology, and End Results database. Eligible patients were randomly assigned to training and validation cohorts (ratio 7:3) to establish a nomogram using the Cox proportional hazards regression model. In total, 4310 patients with LMBC were enrolled, including 52.4% (2259/4310) HR+/HER2−, 17.6% (757/4310) HR+/HER2+, 10.8% (467/4310) HR−/HER2+, and 19.2% (827/4310) HR−/HER2− subtype patients. Inclinations of lung and brain involvement in HR−/HER2+ and HR−/HER2− subgroups, liver involvement in the HER2 overexpressing subgroup, and bone involvement in the HR-positive subgroup were detected in the LMBC population. Regarding prognosis, HR+/HER2+ subtype patients presented the most favorable profile (mOS 35.0 months, 95% CI 30.1–39.9), while HR−/HER2− patients exhibited the worst (mOS 11.0 months, 95% CI, 10.0–11.9). A nomogram was developed in the training cohort and validated internally (C-index 0.70) and externally (C-index 0.71), suggestive of decent performance. This study assessed the clinical outcomes associated with molecular subtypes, metastatic patterns, and surgical intervention and provided a robust nomogram for the estimation of survival probabilities, which are promising for the management of LMBC in clinical practice.

Data were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Patients who were newly diagnosed with LMBC and had no missing clinicopathological and survival data were assessed for eligibility. Patients were excluded if (1) tumor grade; molecular subtype; the status of estrogen receptor (ER), progesterone receptor (PR), or human epidermal growth factor receptor 2 (HER2); or the status of visceral metastases was unknown, or (2) tumor size or node involvement was not evaluated. Data analyses were performed in December 2020. The following information was extracted for the selected cohort: age at diagnosis, sex, race, laterality, histologic type, grade, molecular subtype, immunohistochemical status (ER, PR, and HER2), tumor size, node involvement, visceral metastases, and receipt of surgery, radiotherapy, and chemotherapy. This study was conducted in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology guidelines 10 and the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis statement 11.
Outcome. LMBC was defined as de novo metastatic breast cancer presenting with lung metastasis with positive histological confirmation. Clinicopathological features and prognosis were compared among the molecular subtypes, which were classified into four categories: hormone receptor (HR)-positive/HER2-negative (HR+/HER2−), HR-positive/HER2-positive (HR+/HER2+), HR-negative/HER2-positive (HR−/HER2+, HER2), and HR-negative/HER2-negative (HR−/HER2−, TN). Overall survival (OS) was defined as the interval between the initial diagnosis of breast cancer and death from any cause. According to SEER terminology, visceral metastases involve the liver and brain. The American Joint Committee on Cancer 7th edition guidelines were adopted to define the tumor-node-metastasis stage of breast cancer.
Statistical analysis. Comparative analysis of baseline characteristics was performed using Pearson's chi-square test and Fisher's exact probability test for qualitative data and the t-test or Wilcoxon rank test for quantitative data with a normal or non-normal distribution, respectively. Survival outcomes were compared using the Kaplan-Meier method with log-rank tests. Patients were randomly assigned to the training and validation cohorts in a 7:3 ratio to establish and externally validate the model. Prognostic factors were identified by consecutive univariate and multivariate Cox proportional hazards regression analyses, and the identified factors were used to develop a nomogram for estimating the 2- and 5-year survival probabilities. The discriminative and calibrating capabilities of this nomogram were evaluated both internally and externally using the concordance index (C-index) and calibration curves with bias-corrected validation under 1000 bootstrap resamples. A C-index of 0.5 indicated agreement by chance, and a C-index of 1 indicated perfect discrimination. All statistical tests were two sided, with P < 0.05 considered statistically significant, and were performed using IBM SPSS Statistics (version 26.0; IBM Corp., Armonk, NY) and R software (version 3.6.4, www.r-project.org/).
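The cohort split and the C-index described above can be illustrated with a minimal Python sketch. This is not the authors' SPSS or R code: the function names, the random seed, and the toy follow-up data are assumptions made purely for illustration, and the C-index shown is the plain Harrell estimator without the bootstrap bias correction used in the study.

```python
import random

def split_7_3(ids, seed=0):
    """Randomly split ids into training (70%) and validation (30%) lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:n_train], shuffled[n_train:]

def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the higher-risk patient dies first.

    A pair (i, j) is usable when the patient with the shorter follow-up
    had an observed death (event == 1). Tied risk scores count as 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable if usable else float("nan")

# Splitting 4310 patient indices reproduces the cohort sizes reported in the paper.
train_ids, valid_ids = split_7_3(range(4310), seed=42)  # seed is arbitrary
print(len(train_ids), len(valid_ids))

# Hypothetical toy data: months of follow-up, death indicator, model risk score.
toy_times = [5, 10, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [2.0, 1.5, 1.0, 0.7, 0.2]
print(harrell_c_index(toy_times, toy_events, toy_risk))
```

In the toy data the risk scores are perfectly ordered against survival time, so every usable pair is concordant; a model with no discriminative ability would instead score close to 0.5.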

Results
Among the 7746 initially identified patients with LMBC, 4310 were finally eligible (Supplementary Fig. 1). The population demographics and baseline clinicopathological characteristics are presented in Supplementary Table 1.
Clinical outcomes associated with metastatic patterns. The metastatic patterns of patients with LMBC and the corresponding survival were analyzed. Overall, lung-only metastatic disease was the most frequent pattern (1555/4310, 36.1%), followed by lung and bone metastatic disease (1332/4310, 30.9%), with no statistically significant difference in median OS between the two groups (P = 0.053; Supplementary Tables 2, 3). With respect to the number of metastatic sites, prognosis worsened steadily as the number of involved organs increased (Supplementary Fig. 2A). For patients with malignancy involving two sites, survival was successively inferior for lung metastasis combined with bone, liver, and brain metastases (P < 0.0001; Supplementary Fig. 2B). However, no statistically significant difference was noted in the prognosis of patients with malignancy involving three sites (Supplementary Fig. 2C). In addition, patients with LMBC and brain metastasis exhibited the worst survival, and the additional involvement of the bone tended to exert little effect on the prognosis of patients with lung-only (P = 0.053); lung and liver (P = 0.621); and lung, liver, and brain metastasis (P = 0.648; Supplementary Table 3, Supplementary Fig. 2D).
Table 1. Population demographics and baseline characteristics of included patients associated with molecular subtypes.
www.nature.com/scientificreports/
Clinical outcomes associated with treatment. The prognostic benefit of surgery was assessed in patients with de novo LMBC. Regarding molecular subtypes, OS was consistently improved across HR+/HER2+, HR+/HER2−, HR−/HER2+, and HR−/HER2− subtype disease (Supplementary Fig. 3A-D), which was consistent with the prognostic outcomes of patients with lung-only and paired-organ metastases with bone, liver, and brain involvement (Supplementary Fig. 4A-D). For the entire LMBC population, OS was significantly improved by surgical intervention (P < 0.0001), and the comparative prognosis stratified by clinical characteristics is presented in Supplementary Table 4.
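The survival comparisons in these sections rest on Kaplan-Meier estimates. As a minimal sketch of how a median OS is read off a Kaplan-Meier curve, the Python code below uses entirely hypothetical follow-up data; the function names are illustrative, and the log-rank test used for the group comparisons is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times: follow-up in months; events: 1 = death, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up

# Hypothetical toy cohort (months, death indicator); not SEER data.
toy_times = [3, 6, 6, 9, 12, 15, 20, 24]
toy_events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
print(median_survival(curve))
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate differs from a simple proportion of deaths.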
In addition, treatment patterns were compared in terms of survival benefit. Comparable effectiveness was detected between surgery plus chemotherapy (40.9 months, 95% CI 38.0-43.9) and surgery plus radiotherapy (42.0 months, 95% CI 35.2-48.8). In addition, no additional benefit was observed with surgery plus chemotherapy plus radiotherapy. The surgery-based combination regimens were advantageous compared to the other treatment options, including surgery alone, chemotherapy alone, or chemotherapy plus radiotherapy.
Development and validation of the nomogram. Eligible patients were randomly allocated to the training and validation cohorts, which included 3017 and 1293 individuals, respectively. In the training cohort, the following prognostic factors were identified: age at diagnosis (P < 0.0001), race (P < 0.0001), histologic type (P = 0.001), tumor grade (P < 0.0001), molecular subtype (P < 0.0001), AJCC T stage (P = 0.006), bone metastasis (P < 0.0001), liver metastasis (P < 0.0001), brain metastasis (P < 0.0001), performance of surgery (P < 0.0001), and chemotherapy (P < 0.0001). These factors were collectively adopted to develop the prognostic model (Table 2). The nomogram showed that tumor grade, molecular subtype, and age at diagnosis had a relatively large effect on the predicted prognosis. The points for each variable were read from the point scale and summed; a straight line was then drawn down from the total-points scale to estimate the 2-year and 5-year survival rates.
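The point-summing procedure described above can be sketched in Python under the usual Cox-model reading of a nomogram. Every number below is hypothetical: the coefficients, the reduced set of four variables, and the baseline survival values are invented for illustration and are not the fitted values from Table 2 or Fig. 2.

```python
import math

# HYPOTHETICAL log hazard ratios relative to each reference level (0.0).
COEFS = {
    "grade": {"I-II": 0.0, "III-IV": 0.35},
    "subtype": {"HR+/HER2+": 0.0, "HR+/HER2-": 0.30,
                "HR-/HER2+": 0.25, "HR-/HER2-": 1.10},
    "brain_met": {"no": 0.0, "yes": 0.60},
    "surgery": {"yes": 0.0, "no": 0.45},
}
# HYPOTHETICAL baseline survival S0(t) for a patient scoring zero points.
BASELINE_SURVIVAL = {2: 0.75, 5: 0.45}

def points_per_unit(coefs):
    """Scale so the variable with the widest effect range spans 0-100 points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    return 100.0 / widest

def total_points(patient, coefs):
    """Sum the points of each variable, as when reading the nomogram axes."""
    scale = points_per_unit(coefs)
    return sum((coefs[var][patient[var]] - min(coefs[var].values())) * scale
               for var in coefs)

def survival_probability(patient, coefs, baseline, years):
    """Map total points back to a linear predictor, then S(t) = S0(t) ** exp(lp)."""
    lp = total_points(patient, coefs) / points_per_unit(coefs)
    return baseline[years] ** math.exp(lp)

# Hypothetical patient: high grade, HR-/HER2- disease, no brain metastasis, surgery.
patient = {"grade": "III-IV", "subtype": "HR-/HER2-",
           "brain_met": "no", "surgery": "yes"}
print(round(total_points(patient, COEFS), 1))
for years in (2, 5):
    print(years, round(survival_probability(patient, COEFS, BASELINE_SURVIVAL, years), 3))
```

Scaling the widest-ranging variable to 100 points is a common nomogram convention; under it, the length of each variable's axis directly reflects how strongly that variable moves the predicted survival.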
The nomogram constructed for the estimation of 2- and 5-year survival in patients with LMBC is shown in Fig. 2. The overall C-index was 0.70 (95% CI 0.69-0.83) in the training cohort and 0.71 (95% CI 0.68-0.72) in the validation cohort, and the time-dependent C-index curves of the two cohorts showed values consistently > 0.50, indicative of favorable discriminative power (Fig. 3A). Calibration plots of the two cohorts demonstrated decent agreement between the actual and predicted 2- and 5-year survival probabilities, which suggested a satisfactory calibration capability (Fig. 3B,C). In summary, the newly established nomogram showed good performance for survival estimation in patients with LMBC.

Discussion
To our knowledge, this is the first study to comprehensively discuss the clinical features and prognostic outcomes associated with molecular subtypes, metastatic patterns, and surgical intervention and to develop a robust prediction model for the estimation of individual prognosis of de novo metastatic breast cancer with lung involvement.
To illustrate the distinctive presentations associated with molecular subtypes, we first performed comparative analyses among the LMBC population with HR+/HER2+, HR+/HER2−, HR−/HER2+, and HR−/HER2− subtype disease. The percentage of TN and HER2 subtype disease was relatively higher in patients with LMBC than in the entire breast cancer population (approximately 10% vs. 4%) 2 , suggesting an inclination of lung metastasis related to molecular subtype in patients with LMBC. An increased tendency of lung involvement in TN and HER2 subtype breast cancer was noted in previous studies, with a recorded incidence of 20.8-35.0% and 22.9-45.0%, respectively 12-14 . In addition, we demonstrated that bone involvement tended to occur in luminal-like disease, while liver metastasis tended to occur in HER2-overexpressing disease, which is consistent with the findings of previous studies that focused on de novo metastatic breast cancer 12,15,16 . The current evidence suggests that this kind of presentation can be independent of disease characteristics 17 , and our study demonstrated that the organ-specific metastasis remained stable in patients with initial lung metastasis. This type of subtype-associated predisposition could potentially constitute the intrinsic profiles of breast malignancies and provide clinical implications for organ selectivity in the management of cancer metastasis. We also assessed the heterogeneous prognosis among the different molecular subtypes of LMBC, and our results suggested that survival was most favorable in the HR+/HER2+ subtype, whereas patients with TN disease exhibited a worse prognosis than those with the other subtypes. It is well acknowledged that TN breast cancer presents the most unfavorable disease features, with a median OS of 10-13 months in de novo metastatic breast cancer 18,19 , which was in line with the survival outcomes reported in the present study.
In contrast, patients with HR+/HER2+ LMBC had relatively favorable prognostic profiles, which could be the result of the multiple treatment options for this subtype, including anti-HER2-targeted therapy and endocrine therapy. However, we could not further discuss the therapeutic influences on prognosis due to insufficient information on treatment in the SEER database.
This is the first study to show that distinctive survival outcomes are associated with metastatic profiles. We classified the metastatic patterns and further investigated the effects of the involved sites on the prognosis of LMBC. The prognosis gradually worsened as the total number of involved sites increased, and for patients with LMBC with paired metastatic sites, a successively inferior tendency was detected in lung involvement combined with bone, liver, and brain involvement. However, no statistical significance was revealed in patients with LMBC and three concurrent metastases. To further clarify the prognosis of patients with LMBC with diverse metastatic patterns, we performed a comparative analysis in the entire population. The corresponding results showed that patients with LMBC and brain metastasis had the worst survival, and the additional involvement of the bone did not worsen the overall prognosis. Although the metastatic patterns and prognostic correlations have been discussed in previous studies 12,20,21 , they tended to focus on the entire group of patients with de novo metastatic breast cancer instead of patients with LMBC. Therefore, the findings might not apply to patients with newly diagnosed lung involvement. In the current study, we conducted analyses in this specific cohort and reported novel findings of prognostic profiles associated with involved patterns, which can provide promising evidence for the clinical management of patients with LMBC.
Given the controversial role of surgical intervention in de novo metastatic breast cancer 22-25 , we comprehensively discussed the potential effects of surgical performance on the prognosis of LMBC. Surgical performance could prolong the OS of patients with LMBC independent of the molecular subtypes. For patients with LMBC with lung-only and paired metastases, this kind of survival benefit remained consistent.
Collectively, resection of primary disease can improve the overall prognosis of patients with LMBC, and this benefit tended to vary with metastatic patterns, which was consistent with previous findings 26 . There is a promising rationale for this practice, and increasing evidence has emerged for surgical performance in de novo stage IV breast cancer 27 . However, owing to the limited sample size, we could not further elaborate on the correlations between surgical performance and involved patterns in specific breast cancer subtypes. Owing to limited data in the database, we also could not address the specifics of surgery, including surgical procedures, the optimal time point for surgery, and predictive biomarkers for identifying the population that benefits from surgical intervention. In addition, because the overall prognosis is influenced by a range of factors associated with cancer treatment and disease characteristics across therapeutic phases, physicians should interpret these findings with caution. However, considering the limited evidence for the prognostic value of surgical intervention for patients with LMBC, the current study could provide emerging evidence, and further studies should be conducted to investigate the associations between primary disease resection and surgical performance in the specified cohorts from the LMBC population.
To further quantify the estimation of individual prognosis, we developed a prediction model for the 2- and 5-year survival probabilities of patients with de novo LMBC, which was further validated internally and externally in the selected cohorts. The results of model validation suggested that this novel nomogram provided a robust prediction of survival in the LMBC population. Considering that this reliable nomogram was the first fulfillment of prognostic estimation for LMBC, the present study provides strong evidence for practitioners to introduce individual-based therapeutics for survival benefits in clinical practice.
There are limitations to our findings. First, metastatic sites were not fully recorded in this database: metastatic sites arising after sequential therapies, as well as soft tissue and distant lymph node involvement at the initial diagnosis, were not captured, which could affect the reported proportions of metastatic patterns. However, the organs commonly involved in breast cancer include the lung, bone, liver, and brain 28 , which were included in our analyses, and the study results can be applied to all patients with LMBC. In addition, treatment information was not sufficiently available. This includes, for instance, endocrine therapy as a first-line intervention for ER+/HER2− breast cancer, targeted therapy for HER2+ breast cancer, chemotherapeutic protocols, radiation performance, and surgical removal of metastatic lesions. These gaps could result in misestimation of the associations between current treatment options and survival benefits and leave the influence of some new treatments, such as immunotherapy, PARP inhibitors, and PI3K-AKT inhibitors, unaccounted for. This should be further improved in future population-based studies. Moreover, information on progression-free survival was not included in the SEER database, leading to a lack of a major survival profile. Finally, several disease characteristics vital to clinical outcomes are absent in this database, such as the Ki-67 index and lymphovascular invasion; therefore, we could not consider all disease characteristics to further calibrate this prediction model.
In conclusion, this study revealed great heterogeneity in the clinical outcomes of LMBC associated with molecular subtypes, metastatic patterns, and surgical performance. Prognostic factors were identified, and we established a robust nomogram for the estimation of individual 2- and 5-year survival in patients with LMBC. Prospective studies with more cohorts for extensive validation are warranted in the future.

Data availability
The SEER database is available from: www.seer.cancer.gov.