Exploration of a modified stage for pN0 colon cancer patients

This study explored a modified stage (mStage) for pN0 colon cancer patients. A total of 39,637 pN0 colon cancer patients were collected from the SEER database (2010–2015) as the development cohort, and 455 pN0 colon cancer patients from the Second Affiliated Hospital of Harbin Medical University (2011–2015) served as the validation cohort. The optimal stratification of the number of lymph nodes examined (LNE) for cancer-specific survival (CSS) was obtained with X-tile software in the development cohort. A novel N stage was built based on the LNE (N0a: LNE ≥ 26, N0b: LNE = 11–25 and N0c: LNE ≤ 10) and combined with the conventional T stage to form the mStage. The mStage includes mStageA (T1N0a, T1N0b, T1N0c and T2N0a), mStageB (T2N0b, T2N0c and T3N0a), mStageC (T3N0b), mStageD (T3N0c, T4aN0a and T4bN0a), mStageE (T4aN0b and T4bN0b) and mStageF (T4aN0c and T4bN0c). The Cox regression model showed that mStage was an independent prognostic factor. The AUC showed that the predictive accuracy of mStage for 5-year CSS was better than that of the conventional T stage in the development (0.700 vs. 0.678, P < 0.001) and validation cohorts (0.649 vs. 0.603, P = 0.018). The C-index also showed that mStage had a superior model fit. In addition, calibration curves for 3-year and 5-year CSS revealed good consistency between observed and predicted survival rates. For pN0 colon cancer patients, mStage might be superior to the conventional T stage in predicting prognosis.

Colon cancer (CC) is one of the most common cancers worldwide and a major cause of cancer-related mortality 1,2. For resectable CC, surgery combined with systematic lymph node dissection is considered the primary treatment 3. Although many prognostic markers have been identified to date, tumor stage is the most widely used prognostic factor 4. The American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) classification, which is based on the depth of tumor invasion of the intestinal wall and the number of positive lymph nodes, is the most important factor in determining prognosis and subsequent therapeutic methods.
In recent years, the number of lymph nodes examined (LNE) for pN0 CC patients has attracted substantial attention due to its unique prognostic value 5. Studies have shown that the greater the number of LNE, the better the disease-free survival (DFS) and overall survival (OS), especially in pN0 patients 6-8. LNE is an independent risk factor for survival in patients with CC. Moreover, the LNE is an important indicator to ensure accurate staging of lymph nodes because it helps to assess the extent of lymph node involvement 9,10. The National Comprehensive Cancer Network (NCCN) guidelines recommend that at least 12 lymph nodes need to be dissected intraoperatively for CC patients to effectively assess postoperative pathological staging 11. In recent clinical practice, about 30-50% of CC patients still have inadequate lymph node dissection 12,13.
However, the prognostic stratification of node-negative CC patients has been determined only by T stage, regardless of the nodal information. In other words, the conventional staging system might be inappropriate for pN0 patients, and the number of LNE could be taken into consideration to better stratify patients with different prognoses. Therefore, this study used data from the SEER database to determine the optimal stratification of LNE for pN0 CC patients and subsequently construct a modified stage (mStage) for this special population based on the conventional T stage and a novel N stage (nN stage). In addition, our departmental data were used to further validate the capability of the mStage.
www.nature.com/scientificreports/
and adjuvant therapy; (2) patients without complete follow-up data; (3) patients with incomplete basic information.
In addition, 455 CC cases from the Second Affiliated Hospital of Harbin Medical University between January 2011 and December 2015 were also enrolled in this research as a validation cohort. The last follow-up was in October 2021. Inclusion and exclusion criteria for the validation cohort were the same as those for the development cohort (SEER).
Statistical analysis. All statistical analyses were performed with the statistical software package SPSS 22.0 (IBM Corp, Armonk, NY, USA) and R software (version 3.6.1, https://www.r-project.org/). The clinical characteristics of patients were summarized by number and percentage. To obtain the new N stage, the most appropriate cut-off values of LNE for CSS were obtained with X-tile software (version 3.6.1, https://medicine.yale.edu/lab/rimm/research/software/). Cox proportional hazards regression was applied to investigate the relationship between mStage and CSS. The concordance index (C-index) and receiver operating characteristic (ROC) curves were used to determine the efficiency of mStage. Kaplan-Meier curves were generated and compared using log-rank tests. A two-sided P < 0.05 was considered statistically significant.
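As an illustration of the C-index mentioned above, the following is a minimal sketch of Harrell's concordance index for right-censored data in plain Python. It is not the code used in this study (the analyses were run in SPSS and R); the function name `harrell_c_index`, the handling of tied follow-up times (skipped) and the toy data are assumptions made for illustration only.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       -- follow-up time of each patient
    events      -- 1 if the event (e.g. cancer-specific death) was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk (e.g. an ordinal stage)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped for simplicity
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # the earlier patient was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher predicted risk failed first
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: stage coded 1 (lowest risk) to 6 (highest risk).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_stage = [6, 5, 3, 1]
print(harrell_c_index(toy_times, toy_events, toy_stage))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is why a higher C-index is read as better discrimination.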

Results
Patient characteristics. According to the screening criteria, 39,637 patients from the SEER database (development cohort) and 455 patients from the Chinese population (validation cohort) were included in this study. In the development cohort, female patients (51.0%) and patients older than 65 years (65.0%) accounted for the larger proportions, whereas male patients (60.5%) and patients younger than 65 years (54.5%) accounted for the larger proportions in the validation cohort. In both cohorts, most tumors were located in the right colon (64.2% and 50.5%), were adenocarcinomas (92.5% and 77.4%) and were grade I/II (87.8% and 89.75%). The mean number of LNE in the development and validation cohorts was 18.98 ± 9.52 and 16.94 ± 7.77, respectively. The detailed data are summarized in Table 1.
Superiority of the modified TNM staging system. The Cox proportional hazards regression model showed that mStage remained an independent prognostic factor for CSS after adjustment for confounding factors (Table 4). In addition, mStage was also found to be an independent prognostic factor for OS and for CSS after excluding patients who died from other causes (Suppl. Tables S1, 2). Figures 4a,b and 5a,b show survival curves stratified by conventional TNM stage and mStage; prognostic stratification with the mStage is much clearer than with the conventional TNM stage in both the development and validation cohorts.
In addition, the 3-year AUCs of the mStage and the TNM stage also indicated the better discrimination ability of the mStage in the development and validation cohorts (Fig. 6c,d).
Moreover, the calibration curves for 3-year and 5-year CSS also showed satisfactory predictive accuracy in the development and validation cohorts (Fig. 7a-d).
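The survival curves referred to in this section are Kaplan-Meier estimates. As a reminder of how such a curve is built, here is a minimal sketch of the product-limit estimator in plain Python. It is an illustrative sketch rather than the study's code; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate of the survival function.

    times  -- follow-up time of each patient
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # multiply by the conditional probability of surviving past t
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical toy data (time in months).
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 0, 1])
print([(t, round(s, 3)) for t, s in toy_curve])  # [(2, 0.8), (3, 0.6), (8, 0.0)]
```

Computing one such curve per stage group and comparing the groups with a log-rank test gives the kind of stratified comparison shown in the survival figures.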

Discussion
CC has a high incidence among gastrointestinal cancers and poses a major public health challenge due to its high mortality rate 1. The AJCC TNM staging system is the system most widely applied in clinical practice to evaluate the survival status, treatment and prognosis of patients. Within it, the N stage is divided mainly according to whether there is lymph node metastasis and the number of positive lymph nodes: N0 (no metastatic lymph nodes), N1 (N1a: 1 metastatic lymph node; N1b: 2-3 metastatic lymph nodes; N1c: cancer nodule formation) and N2 (N2a: 4-6 metastatic lymph nodes; N2b: ≥ 7 metastatic lymph nodes). There is thus no further stratification within the N0 stage. Hence, pN0 patients are stratified only according to T stage, which remains a controversial issue.
At present, the number of LNE has been shown to be an independent prognostic factor in multiple cancer types, especially in CC. A higher LNE has been associated with improved survival of pN0 CC patients, but the mechanism underlying this relationship is unclear 6,9,14. Several hypotheses have been proposed. One possible reason is that a greater number of LNE is associated with a greater chance of a positive node being examined and thus a more accurate tumor stage 15,16. Assessing more LNE helps reduce the likelihood of misclassifying stage III disease as stage I or II and improve prognosis, particularly for pN0 CC patients 17-19. In addition, an increase in the number of LNE may be an indicator of better treatment, including complete tumor resection and adequate pathological evaluation. Another explanation is that an increase in the number of negative lymph nodes indicates a stronger immune response. Once the immune system detects the presence of tumor cells, local lymph nodes increase, making more of them easier to examine in postoperative pathology.
Studies analyzing the tumor microenvironment have found that LNE is correlated with local neutrophil and lymphocyte infiltration 5. All the above studies demonstrated the relationship between LNE and prognosis through data analysis, but did not specify the optimal stratification of LNE in pN0 CC patients. In this study, the optimal stratification of LNE for CSS was achieved with the X-tile software (N0a: LNE ≥ 26, N0b: LNE = 11-25 and N0c: LNE ≤ 10), and the Kaplan-Meier survival analysis showed significant differences in prognosis among the three LNE groups (P < 0.001), which supports the validity of this stratification.
The AJCC 8th TNM classification system recommends a minimum of 12 lymph nodes to effectively assess patient survival benefits. The number of LNE can be used effectively as a marker of surgical and pathological adequacy. However, LNE is often influenced by tumor location, tumor size and patient age, and especially by the skill of the surgeon and the diligence of the pathologist 12,20-22. When the number of LNE is insufficient and the conventional TNM system is used for staging, patients may be misclassified, especially those determined to be N0. Including the number of LNE in the modified staging system could, to some extent, stratify patients better than the conventional method.
In addition, there is a great deal of debate about the recommended minimum of 12 LNE. Ning et al. found that the optimal cut-off value of LNE should be 18 in pN0 CC patients 23. Therefore, the cut-off value for the number of LNE is still controversial, and a new and convincing staging system is urgently needed for clinical use.
In this study, the optimal stratification of LNE was achieved with the X-tile software (nN stage: N0a: LNE ≥ 26, N0b: LNE = 11-25 and N0c: LNE ≤ 10), and there were statistically significant differences among the three groups. Subsequently, a modified TNM stage was constructed based on the conventional T stage and the nN stage. To make the new system more rational in distinguishing patients with different outcomes, all patients were grouped into six modified stages (mStage) according to the HRs and survival curves. The KM CSS curves show that the mStage groups patients with similar prognosis better than the conventional stage. In addition, the AUC and C-index of mStage were significantly higher than those of the conventional TNM staging system in both the development and validation cohorts, indicating that the mStage has potential advantages over the conventional stage in predicting survival.
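The staging rule described above can be written as a small lookup. The sketch below encodes the LNE cut-offs and the six mStage groups exactly as reported in the abstract; the function names `nn_stage` and `m_stage` and the string codes are hypothetical and chosen only for illustration, and the sketch is not clinical software.

```python
def nn_stage(lne):
    """Novel N stage for pN0 patients from the number of lymph nodes examined."""
    if lne < 0:
        raise ValueError("LNE must be non-negative")
    if lne >= 26:
        return "N0a"
    if lne >= 11:
        return "N0b"
    return "N0c"  # LNE <= 10


# (T stage, nN stage) -> mStage, following the grouping reported in the abstract.
MSTAGE_TABLE = {
    ("T1", "N0a"): "A", ("T1", "N0b"): "A", ("T1", "N0c"): "A", ("T2", "N0a"): "A",
    ("T2", "N0b"): "B", ("T2", "N0c"): "B", ("T3", "N0a"): "B",
    ("T3", "N0b"): "C",
    ("T3", "N0c"): "D", ("T4a", "N0a"): "D", ("T4b", "N0a"): "D",
    ("T4a", "N0b"): "E", ("T4b", "N0b"): "E",
    ("T4a", "N0c"): "F", ("T4b", "N0c"): "F",
}


def m_stage(t_stage, lne):
    """Modified stage (A-F) from the conventional T stage and the LNE count."""
    key = (t_stage, nn_stage(lne))
    if key not in MSTAGE_TABLE:
        raise ValueError(f"unsupported T stage: {t_stage!r}")
    return MSTAGE_TABLE[key]


print(m_stage("T3", 12))   # C
print(m_stage("T3", 8))    # D
print(m_stage("T4b", 30))  # D
```

The last two calls illustrate the point of the system: a T3 tumor with few examined nodes falls into the same group as a T4 tumor with many examined nodes.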
There are several innovations in our research. First, the selection of the LNE cut-off values took into account patients with insufficient LNE, making the nN stage system more widely applicable. Second, we further analyzed the prognostic interaction between the nN stage and the conventional T stage and constructed a modified staging system for pN0 CC patients, which showed superior predictive power compared with the conventional TNM staging system. Finally, we used an external validation cohort to make our results more convincing.
This study has several limitations. First, this stratification of LNE is proposed here for the first time and there is no consensus on it yet, which may limit the application and promotion of the mStage system. Second, this study is a retrospective analysis, and the results need to be verified by prospective clinical studies. Third, the sample size of the validation cohort is relatively small, and a larger sample is required to verify the accuracy of the modified staging system in the future.
In conclusion, the mStage system could predict the prognosis of pN0 CC patients and showed superior predictive power compared with the conventional TNM staging system.
Ethical approval. This study received ethical approval from the Second Affiliated Hospital of Harbin Medical University. The study used de-identified data and adhered to the World Medical Association's Declaration of Helsinki for Ethical Human Research. SEER is a publicly available database with anonymized data; no ethical review was required.
Informed consent. Informed consent was obtained from the 455 colorectal cancer patients and their families.

Data availability
The study data of the development cohort are available from the SEER database (user ID: 14,262-Nov2019, https://seer.cancer.gov/). The study data of the validation cohort used and/or analyzed during the current study are available from the Second Affiliated Hospital of Harbin Medical University, China.