Prognostic impact of examined lymph-node count for patients with esophageal cancer: development and validation prediction model

Esophageal cancer (EC) is a malignant tumor with high mortality. We aimed to find the optimal examined lymph node (ELN) count threshold and develop a model to predict survival of patients after radical esophagectomy. Two cohorts were analyzed: the training cohort which included 734 EC patients from the Chinese registry and the external testing cohort which included 3208 EC patients from the Surveillance, Epidemiology, and End Results (SEER) registry. Cox proportional hazards regression analysis was used to determine the prognostic value of ELNs. The cut-off point of the ELNs count was determined using R-statistical software. The prediction model was developed using random survival forest (RSF) algorithm. Higher ELNs count was significantly associated with better survival in both cohorts (training cohort: HR = 0.98, CI = 0.97–0.99, P < 0.01; testing cohort: HR = 0.98, CI = 0.98–0.99, P < 0.01) and the cut-off point was 18 (training cohort: P < 0.01; testing cohort: P < 0.01). We developed the RSF model with high prediction accuracy (AUC: training cohort: 87.5; testing cohort: 79.3) and low Brier Score (training cohort: 0.122; testing cohort: 0.152). The ELNs count beyond 18 is associated with better overall survival. The RSF model has preferable clinical capability in terms of individual prognosis assessment in patients after radical esophagectomy.


Surgery and pathology. Surgical procedures included primary tumor resection and LN dissection. The McKeown or Sweet esophagectomy with radical lymphadenectomy was selected based on preoperative conditions and patient status. Surgery was performed by experienced surgeons who could carry out complete lymph node dissection based on comprehensive preoperative examination using a unified model. All resected specimens were carefully examined by two senior pathologists following a uniform process. The number of LNs was counted under a low-power field microscope. All processes were strictly and carefully executed to ensure the accuracy of the lymph node count. The total number of lymph nodes was calculated as the total number of LNs resected in the cervical, thoracic, and abdominal regions. The pathological N stage was defined according to the eighth edition of the AJCC TNM classification system 17,18.
Follow-up. Patients were scheduled for follow-up every 3 months in the first 2 years after esophagectomy and every 6 months in the following years. The endpoint was death (disease-related or nonspecific) or loss to follow-up. Disease-free survival (DFS) was defined as the time from surgery to first disease manifestation or death from any cause. Overall survival (OS) was defined as the time from surgery until death from any cause or last follow-up. The data of patients alive at the end of the study were censored for the purpose of analysis.
Model development. The prediction model was developed in two stages: variable selection and model construction. The methods implemented at each stage and the prediction models are described below. Lasso regression analysis was used for variable selection. A backward step-down selection process based on the lowest Akaike information criterion (AIC) value was then applied in the Lasso-Cox regression model so that all variables retained in the model were significant 19. Next, the relationship between the selected variables and the outcome of interest was investigated using the Lasso-Cox regression model and the RSF model. Random survival forest is an ensemble method that uses bootstrap sampling to randomly select samples, grows a binary survival tree on each bootstrap sample, and combines the trees into a random survival forest 20. The tree nodes are split according to the maximum survival difference between child nodes. For each bootstrap sample, approximately 37% of the samples in the training cohort were, on average, not drawn; these samples are called out-of-bag (OOB) samples. The error rate was calculated on the OOB samples; the lower the error rate, the better the model performance. For the RSF model, the parameters ntree and node size were determined according to the lowest error rate using rsample package (error rate = 31.7%, ntree = 500, node size = 10). Other parameters were left at their default values.
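The figure of approximately 37% for the out-of-bag share follows from sampling with replacement: a given patient is missed by all n draws with probability (1 - 1/n)^n, which approaches 1/e ≈ 0.368 as n grows. The following is a minimal sketch in Python of that calculation and of a simulation check; it is illustrative only and is not the authors' R code. The function names are hypothetical, and n = 743 is the training-cohort size reported in the Results.

```python
import math
import random


def expected_oob_fraction(n: int) -> float:
    """Probability that one sample is never drawn in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n


def simulated_oob_fraction(n: int, n_trees: int, seed: int = 0) -> float:
    """Average share of samples left out of each bootstrap sample (one per tree)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        # Indices drawn at least once in this bootstrap sample.
        drawn = {rng.randrange(n) for _ in range(n)}
        total += (n - len(drawn)) / n
    return total / n_trees


if __name__ == "__main__":
    n = 743  # illustrative: training-cohort size reported in the Results
    print("closed form:", expected_oob_fraction(n))
    print("limit 1/e:  ", math.exp(-1))
    print("simulated:  ", simulated_oob_fraction(n, n_trees=500))
```

Because each tree has its own out-of-bag patients, the OOB error rate acts as an internal validation estimate that needs no separate hold-out split.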
Assessment of model performance. The performance of the models was assessed based on the time-dependent area under the receiver operating characteristic curve (t-AUC). Model discrimination performance was determined using the Harrell concordance index (C-index). The C-index ranges from 0.5 (no better than chance) to 1.0 (perfect discrimination) 21. The overall performance of the prediction model was quantified as the Brier score, reflecting the average squared deviation between the predicted probabilities for a set of events and their outcomes (0: perfect prediction and 1: completely false prediction). The prediction error curve can be used to graphically display the Brier score over time 22. The models were subjected to external testing with the SEER cohort.
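To make the two metrics concrete, the sketch below implements Harrell's C-index and a Brier score at a single time horizon in plain Python. It is a simplified illustration rather than the authors' implementation: the function names are hypothetical, ties in survival time are ignored, and the Brier score here simply drops patients censored before the horizon instead of applying the inverse-probability-of-censoring weights used by standard survival software.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    risk_scores: Sequence[float]) -> float:
    """Share of comparable pairs in which the higher-risk patient dies first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a comparable pair must be anchored on an observed death
        for j in range(n):
            if times[i] < times[j]:  # patient j outlived patient i
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score_at(horizon: float, times: Sequence[float],
                   events: Sequence[int], surv_probs: Sequence[float]) -> float:
    """Mean squared gap between predicted survival at `horizon` and the outcome.

    Simplification (assumption): patients censored before the horizon are
    excluded rather than reweighted.
    """
    squared_errors = []
    for t, e, p in zip(times, events, surv_probs):
        if t > horizon:
            alive = 1.0          # still under observation past the horizon
        elif e:
            alive = 0.0          # death observed at or before the horizon
        else:
            continue             # censored early: status at horizon unknown
        squared_errors.append((alive - p) ** 2)
    if not squared_errors:
        raise ValueError("no patients with known status at the horizon")
    return sum(squared_errors) / len(squared_errors)
```

A model that predicts a survival probability of 0.5 for everyone scores 0.25 on this Brier measure, which is a useful reference point when reading the reported values.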
Statistical analyses. Continuous variables were presented as medians with interquartile ranges (IQR), while categorical variables were presented as percentages. Survival curves were plotted using the Kaplan-Meier method, and the log-rank test was used to assess differences in survival between the groups. The cut-off value of the ELN count in the training cohort was identified using R statistical software and the survival package and was validated by analyzing the testing cohort. P values < 0.05 were considered statistically significant, and all statistical tests were two-sided. Lasso regression analysis was performed using the glmnet package in R. RSF and stepwise selection models were implemented using the randomForestSRC and MASS packages, respectively. All statistical analyses were performed using R version 4.2.0 (https://www.r-project.org/).
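As an illustration of the Kaplan-Meier method named above, the following minimal Python sketch computes the product-limit survival estimate from follow-up times and event indicators. It is not the authors' code (the analyses were run in R); the function name and the toy data in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct death time.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve
```

For example, `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])` steps down at times 1, 3, and 4; the patient censored at time 2 lowers the number at risk without producing a step. Computing one such curve per group (for instance, patients above and below an ELN cut-off) gives the curves that the log-rank test compares.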

Ethical approval.
The study was conducted in accordance with the principles of the Declaration of Helsinki. The study protocol was approved by the ethics committee of the Affiliated Cancer Hospital of Zhengzhou University (NO. 2022-KY-0049-001), and the requirement for individual consent was waived for this retrospective analysis.

Results
Patient characteristics and distribution of ELNs number. We enrolled 743 and 3,208 patients with EC in the training and testing cohorts, respectively. The demographic characteristics and pathological findings for each cohort are presented in

Independent prognostic factors in the training cohort. After univariate analysis via Cox regression analysis, data on the variables of sex, tumor site, histological type, grade, pathologic T category, pathologic N category, and the ELNs count were entered into multivariable logistic regression analyses. However, histological type and grade were not found to be significant. Multivariate analyses demonstrated that hazard ratios were significantly higher for the factors of male sex, tumor site, advanced depth of invasion, increased number of metastasized lymph nodes, and decreased number of examined lymph nodes (Table 2).

Impact of examined lymph-node number on survival and optimal count. Table 2 reveals that an increasing ELN count was an independent factor favoring cancer survival (training cohort: HR = 0.98, CI = 0.97-0.99, P < 0.01; testing cohort: HR = 0.98, CI = 0.98-0.99, P < 0.01). Table 3 shows that the number of ELNs was an independent prognostic factor of DFS in the training cohort (HR = 0.99, CI = 0.97-1, P = 0.02). We determined that the optimal resected ELNs count was 18 in the training cohort using R statistical software and the survival package. As shown in Fig. 1A, B, patients with a resected ELN count > 18 had a better prognosis in both cohorts, whereas no significant difference was observed in the DFS curves between the two groups in the training cohort (Fig. 1C).
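A hazard ratio of 0.98 is easier to interpret once it is scaled to a clinically meaningful difference in node count. The Python sketch below does this under two assumptions that the text does not state explicitly: that the ELN count was entered as a continuous covariate, so the reported HR applies per additional examined node, and that the effect is log-linear, as in a standard Cox model. The function name is hypothetical.

```python
import math


def hazard_ratio_for_increment(hr_per_unit: float, delta: float) -> float:
    """Scale a per-unit hazard ratio to a change of `delta` units.

    Assumes a log-linear covariate effect: HR(delta) = exp(delta * ln(HR)).
    """
    if hr_per_unit <= 0:
        raise ValueError("a hazard ratio must be positive")
    return math.exp(delta * math.log(hr_per_unit))


if __name__ == "__main__":
    # Point estimate and confidence limits reported for the training cohort.
    for label, hr in [("HR", 0.98), ("lower CI", 0.97), ("upper CI", 0.99)]:
        print(label, "per 10 additional nodes:",
              round(hazard_ratio_for_increment(hr, 10), 3))
```

Under these assumptions, a per-node HR of 0.98 corresponds to roughly an 18% lower hazard for every 10 additional examined nodes, which shows how a ratio close to 1 per node can still be clinically relevant.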
Subgroup analyses. In T1, T2, and T3 + T4 cases, we noted that patients with more than 18 examined nodes had greater overall survival rates (Supplementary Fig. 2A-F) in both cohorts. Since few patients had T4 tumors, we merged T3 and T4 into one T3 + T4 group. The same finding was only observed in N0-1 stages (Supplementary Fig. 3A, C vs. B, D) in both cohorts. In the histologic type subgroup analysis, the ELN count (ELNs > 18 vs. ELNs ≤ 18) was an independent prognostic factor of OS in squamous cell carcinoma, but not in adenocarcinoma, in the training cohort (Supplementary Fig. 4A, B), whereas in the testing cohort, the result was observed in both squamous cell carcinoma and adenocarcinoma (Supplementary Fig. 4C, D). Owing to the limited data collected from the database, we performed subgroup analysis on preoperative and postoperative treatment only on the data of the training cohort. The survival benefit of dissecting more than 18 lymph nodes was only found in patients who did not receive preoperative or postoperative adjuvant therapy (Supplementary Fig. 5B, D vs. A, C).

Model development. The design of the model is illustrated in Supplementary Fig. 6. The process of screening variables using Lasso regression analysis (with nonzero coefficients) is shown in Supplementary Fig. 7A, B. Five statistically significant variables were retained in the Lasso-Cox and RSF models: sex, tumor site, pathological T category, pathological N category, and ELN count. In addition, we developed models containing the TNM stage to compare the RSF model with the AJCC stage.
Model performance. Figure 2A, B illustrates the discrimination of the model assessed using the C-index. The C-index of the RSF model was highest among the four models in both the training and testing cohorts. Figure 3A. In addition, we plotted time against AUC curves for each model (Fig. 4A, B). We found that the AUC changed over time. Figure 5A

Discussion
In this study, we demonstrated that the number of retrieved lymph nodes was significantly associated with a favorable prognosis in patients with EC from both cohorts. This conclusion corroborates previous findings 23. Furthermore, we found the same correlation between ELNs and DFS. An optimal number of 18 resected lymph nodes was associated with improved OS but not DFS in patients who underwent esophagectomy. We tested the value of 18 using the SEER database and found a substantial difference in survival between patients above and below the cut-off value after esophagectomy. Because long-term benefit is considered more important than early endpoints 24, we proposed 18 as the optimal number of resected lymph nodes. We further examined the relationship between ELNs and survival in different types of tumors. A higher number of ELNs had a positive effect on OS in T1, T2, and T3 + T4 stage tumors. Patients with a greater number of ELNs had improved OS in N0-1, but not in N2-3. We found that with deeper tumor invasion or more positive lymph nodes, a higher ELN count was not an independent factor favoring OS. In other words, the survival benefit of a higher ELN count is limited. In terms of histologic type, a large number of ELNs was associated with better survival in both adenocarcinoma and squamous cell carcinoma in the testing cohort; however, in the training cohort the relationship was found only in squamous cell carcinoma and not in adenocarcinoma. In addition, we found that in patients who underwent preoperative or postoperative adjuvant therapy, the relationship was no longer observed. This may be owing to the effect of adjuvant therapy on patients' survival. Further studies are required to explore the complex relationship between adjuvant therapy, resected lymph nodes, and survival.
As previously mentioned, the ELN count has been shown to be a superior indicator of survival in EC patients. To better predict postoperative survival of patients with esophageal cancer, we attempted to identify more indicators. However, prognostic factors in patients with EC are known to be complicated. To date, a rough assessment is usually made based on the influencing factors confirmed by previous studies, such as TNM stage and tumor grade, but not through individual analysis and judgment 25. In view of this, we identified factors that affect the survival and prognosis of patients with EC and developed a prediction model.
Through Lasso regression analysis, we identified sex, tumor site, T and N category, and number of retrieved nodes as independent prognostic factors. These findings were consistent with those of previous studies on survival risk factors for esophageal cancer 7,8. Then, a Lasso-Cox model was constructed for predicting survival. The hazard ratios were significantly higher for the factors of male sex, tumor site, advanced depth of invasion, increased number of metastasized lymph nodes, and decreased number of examined lymph nodes.
The Cox proportional hazards regression algorithm is commonly used to design models, as we did; however, its applicability is subject to several restrictions, such as model inaccuracy caused by bias in the method used to select independent variables 26. Compared with traditional survival analysis methods, the random survival forest model is not constrained by the proportional hazards assumption, the log-linear assumption, or other such conditions. The machine learning-based risk prediction model yielded more favorable discrimination and significantly better accuracy than did the traditional model in this study. The RSF model outperformed the Lasso-Cox model with a higher C-index. In addition, to evaluate whether the RSF model could improve prognostic prediction compared with the TNM stage, we developed models containing the TNM stage, and the results indicate that the RSF model showed better discrimination and accuracy. Moreover, we calculated the AUC for time-specific ROC curves at continuous time points, and the dynamic AUC line was plotted to depict temporal changes in accuracy. The RSF model also had the lowest Brier score and prediction error curve. The results showed that the RSF model had a higher accuracy than did the other prediction models.
The variables of the RSF model were evaluated by variable importance (VIMP). The VIMP showed that the N stage was the most important factor for prognosis (Supplementary Fig. 8). The N stage, which reflects the number of metastatic nodes identified, depends significantly on the number of nodes retrieved. As shown in Table 2, the ELN count was an important prognostic factor for patients with esophageal cancer, and more than 18 nodes reduced the risk scores significantly. A possible reason for this finding is that retrieving more lymph nodes makes it more likely that potentially metastasized lymph nodes will be resected. Moreover, the number of retrieved nodes may reflect the adequacy of surgical, pathological, and institutional care, all of which tend to affect treatment outcomes 27,28.
Our study exhibited several strengths, including a large sample size, independent validation in an external cohort of patients, and the use of a machine-learning-based statistical tool for the prediction model. In this study, we collected real data from our center as a training set and data from a foreign database as an external validation set. These factors added to the credibility of our finding that the RSF model showed considerable accuracy and efficacy compared with the Cox model. Our results could be used prospectively in promising clinical applications, such as patient counseling, convenient prognosis assessment, and individualized follow-up strategy formulation, promoting the combination of prognostic tools and clinical management for operable EC patients.
However, we also encountered certain limitations. First, this single-center, retrospective cohort exhibited selection bias, undermining the generalizability of the best model recommended in this study. Second, several potential factors (tumor size, inflammatory biomarkers, and genetic data) were not included in the survival analysis because the data collected were inadequate. Third, as a nonparametric model, the random survival forest cannot express its algorithm and predictive process in a conventional formula, which affects the generalizability and applicability of the research conclusions to a certain extent. Fourth, two possible biases could have resulted in a miscount of the LN number: underestimation as a result of the difficulty in separating each LN in the dissected tissues, and overestimation because of fragmentation of nodal tissues during the removal of LNs. Either might limit the application of a cut point. Further combined multicenter analyses should be considered, as well as prospective clinical verification of the precise value and a more acceptable cutoff number of lymph nodes.
In the present study, we found that a higher number of ELNs was associated with better prognosis, with an optimal ELN count of 18. In addition, we found the RSF model had the highest prediction accuracy among the four prediction models we developed. Thus, the RSF model is recommended for predicting the prognosis in patients with esophageal cancer after surgery.