A nomogram improves AJCC stages for colorectal cancers by introducing CEA, modified lymph node ratio and negative lymph node count

Lymph node stages (pN stages) are primary contributors to the survival heterogeneity of the 7th AJCC staging system for colorectal cancer (CRC), indicating room for modification. To implement the modifications, we selected eligible CRC patients from the Surveillance Epidemiology and End Results (SEER) database as participants in a training cohort (n = 6675) and a test cohort (n = 6760), and verified that tumor deposits could be counted as metastatic lymph nodes to derive the modified lymph node count (mLNC), lymph node ratio (mLNR) and positive lymph node count (mPLNC). After multivariate Cox regression analyses of the training cohort with forward stepwise elimination of the mLNC and mPLNC, a nomogram was constructed to predict overall survival (OS) by incorporating preoperative carcinoembryonic antigen, pT stages, negative lymph node count, mLNR and metastasis. Internal validations of the nomogram showed concordance indexes (c-index) of 0.750 (95% CI, 0.736–0.764) and 0.749 before and after corrections for overfitting, respectively. Serial performance evaluations indicated that the nomogram outperformed the AJCC stages (c-index = 0.725), with increased accuracy, net benefits and risk assessment ability but comparable complexity and clinical validity. All the results were reproducible in the test cohort. In summary, the proposed nomogram may serve as an alternative to the AJCC stages. However, validations with longer follow-up periods are required.

node-related parameters such as negative lymph node count 8,9 (NLNC) and lymph node ratio 6,9,10 (LNR) have been demonstrated to be associated with the survival of patients with CRC, yet neither the NLNC nor the LNR is incorporated into the AJCC staging system. Fourth, the pN stages categorize the positive lymph node count (PLNC) and are unable to incorporate continuous variables, which leads to additional loss of information and predictive accuracy. Lastly, the introduction of biomarkers such as the preoperative expression of carcinoembryonic antigen (CEA) may offer extra precision in the prediction of CRC disease status, which might help to ease increasing concerns regarding the purely anatomical basis of the pN stages 11.
In the present study, we anticipated that the expression of CEA, the presence of TDs and node-related parameters including the LNC, NLNC, PLNC and LNR could explain and address to a certain degree the survival heterogeneity caused by the pN stages. Modifications of the pN stages by the addition of CEA expression, TDs and node-related factors to a multivariate nomogram might improve its predictive accuracy for CRC. To test this hypothesis, we retrospectively reviewed relevant clinical-pathological variables and the vital status of CRC patients from the Surveillance Epidemiology and End Results (SEER) database. The first aim of this study was to verify the basis for modifying the pN stages by the above-mentioned parameters and to show whether TDs might be incorporated as metastatic lymph nodes. The second aim was to determine and validate the optimal multivariate model that was used to establish the predictive nomogram after the modifications. This study may help us understand the survival heterogeneity complicated by the pN stages and may offer patients with CRC an improved prognostic tool without increased complexity.

Methods
Patients and eligibility criteria. The SEER program (http://seer.cancer.gov) is maintained by the National Cancer Institute and is a national database of cancer statistics in the United States 12. The data are freely available to the public for cancer research upon submission of a signed data-use agreement (http://seer.cancer.gov/data/sample-dua.html) to the SEER administration 12. The experimental protocols used in our study were exempt from review by the ethics committee of the Shanghai East Hospital since the data were anonymously extracted and analyzed. Informed consent from participants was also waived because of the complete anonymity of the patients. The study was conducted according to the TRIPOD statement 13 and adhered to the Declaration of Helsinki for medical research involving human subjects 14.
In the present study, CRC patients from the SEER database who were diagnosed in 2010 and 2011 were considered for inclusion in the training cohort and the test cohort, respectively. Patients were excluded if they met any of the following criteria: (1) a diagnosis other than adenocarcinoma, (2) a diagnosis not confirmed by surgical pathology, (3) a history of malignancy, (4) multiple primary tumors, (5) preoperative or intraoperative radiation therapy, (6) unknown or borderline CEA status, (7) pTis lesions or inconsistent/insufficient information to specify the tumor-node-metastasis (TNM) stages, (8) an unknown number of TDs or an unknown LNC or PLNC, (9) follow-up with incomplete dates or of less than one month and (10) inactive follow-up or unknown outcome.
Variables and endpoint. The variables that were evaluated were as follows: sex, age, race, tumor location, grade, perineural invasion, CEA expression, TDs, LNC, PLNC, NLNC, LNR, the 7th AJCC/TNM stages, postoperative radiation, survival (in months) and vital status. Among them, the NLNC and the LNR were derived from both the LNC and the PLNC. The endpoint we used was overall survival (OS), which was determined by the vital status.
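The derivation of the NLNC and the LNR from the LNC and the PLNC can be illustrated with a minimal sketch. The function name and the handling of patients with no examined nodes (the LNR is returned as `None`) are assumptions made for illustration and are not taken from the study.

```python
from typing import Optional, Tuple


def derive_node_parameters(lnc: int, plnc: int) -> Tuple[int, Optional[float]]:
    """Return (NLNC, LNR) for one patient.

    lnc  -- number of lymph nodes examined (LNC)
    plnc -- number of positive lymph nodes (PLNC)
    """
    if lnc < 0 or plnc < 0 or plnc > lnc:
        raise ValueError("require 0 <= PLNC <= LNC")
    nlnc = lnc - plnc  # negative nodes = examined nodes minus positive nodes
    # Assumption: the ratio is undefined when no nodes were examined.
    lnr = plnc / lnc if lnc > 0 else None
    return nlnc, lnr


if __name__ == "__main__":
    print(derive_node_parameters(20, 5))  # 15 negative nodes, ratio 0.25
```

A patient with 20 examined nodes, 5 of them positive, therefore has an NLNC of 15 and an LNR of 0.25.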
Statistical analyses. Categorical variables were presented as frequencies, whereas continuous variables were presented as medians and ranges because of their skewed distributions. Cumulative survival rates among patients with different pN stages, with and without stratification, were plotted by the Kaplan-Meier (K-M) method and compared by the log-rank test. To modify the pN stages, each TD was counted as one metastatic lymph node and the node parameters were recalculated accordingly to yield the modified LNC (mLNC), PLNC (mPLNC), LNR (mLNR) and AJCC (mAJCC) stages. Based on the training cohort, the mLNC, mPLNC, mLNR, NLNC, CEA expression, pT stages and M stages were then incorporated into a multivariate Cox regression analysis with a forward stepwise elimination of relatively unimportant variables. The advantages of the final multivariate model were assessed by comparison with the AJCC and mAJCC stages using goodness of fit (log-likelihood), the Akaike information criterion (AIC) and the concordance index (c-index). Next, the nomogram was constructed based on the final model of the training cohort. The performance of the nomogram was internally evaluated by the c-index, 200-resample bootstrap validation, calibration and the area under the time-dependent receiver operating characteristic (ROC) curve (AUC) at different time points. External validation was achieved by applying the nomogram to the test cohort using similar statistics. Decision curve analysis 15 (DCA) was also performed to compare the threshold probabilities and the net benefits associated with the nomogram and the AJCC stages. Lastly, to demonstrate the ability of the nomogram to make risk assessments, each patient in the training cohort was given a total score based on the nomogram. Risk classifications at the overall stage level were illustrated with K-M curves after the patients were divided into different prognostic groups according to percentile scores.
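To make the c-index used in these comparisons concrete, the following is a minimal pure-Python sketch of a Harrell-style concordance index for right-censored data. It is an illustration only: the analyses in this study were run in SPSS and R, and the function name, the quadratic pairwise loop and the half credit for tied scores are assumptions of this sketch.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk_scores: Sequence[float]) -> float:
    """Fraction of comparable patient pairs ranked correctly by the model.

    times       -- follow-up time of each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier death had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data (hypothetical): four patients, the third one censored.
    print(harrell_c_index([5, 10, 15, 20], [1, 1, 0, 1], [0.9, 0.6, 0.7, 0.2]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the nomogram and the AJCC stages are compared.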
Risk stratifications for individual AJCC stages as well as for patients who received postoperative radiation were performed using similar methods. All the analyses were performed with SPSS 18.0 (SPSS Inc., Chicago, IL, USA) and R 3.2.3. A two-sided P value < 0.05 was considered statistically significant.

Results
Evaluation of the pN stages. The results of the K-M curve analyses for the training cohort (Fig. 2) showed that the TDs, LNC, NLNC, LNR and expression of CEA were significantly associated with OS (all P log-rank < 0.001). All of these could be used to stratify the pN stages (all P log-rank for trend < 0.001), while pairwise comparisons revealed some discrepancies among these parameters. For instance, no apparent survival difference was identified between pN1c stage patients and pN1a stage patients (P log-rank = 0.318) or between pN1c stage patients and pN1b stage patients (P log-rank = 0.343) (Fig. 2A). This was also the case for patients who were TD (−) LN (+) (namely, TD-negative and node-positive cases) and patients who were TD (+) LN (−) (Fig. 2D, P log-rank = 0.164). These results suggested that metastasis in TDs and metastasis in lymph nodes might have a comparable impact on OS. Furthermore, the survival of node-positive patients was significantly different depending on the TD status (Fig. 2D, P log-rank < 0.001), which indicated that the effect of TDs could not be ignored when lymph node metastases were present. Moreover, in patients with lymph node metastases (LNR > 0, n = 2796), the OS of pN2 stage patients with a decreased LNR (≤ median) was comparable to that of pN1 stage patients with an LNR either above or below the median (0.15) (Fig. 2J, P log-rank = 0.132 and 0.453). In addition, the expression of CEA could reverse the prognostic ordering of the pN stages (Fig. 2L), as the survival of CEA (−) pN1 patients was better than that of CEA (+) pN0 patients (P log-rank < 0.001); a similar relationship was found between CEA (−) pN2 patients and CEA (+) pN1 patients (P log-rank = 0.021). The results for the LNR and CEA expression implied that advanced pN stages were not necessarily associated with a shortened OS. Modifications of the heterogeneous pN stages might therefore bring improved precision to survival estimations.
Modifications of the N factor. Considering that TDs had a prognostic effect similar to that of positive lymph nodes, the two were combined as the mPLNC by the method described above. The results of the multivariate Cox analyses for the training cohort are shown in Table 3.
Predictive nomogram. The nomogram was constructed based on the final multivariate model for the training cohort (Fig. 3A).
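The modification of the N factor, in which each TD is counted as one metastatic lymph node, can be sketched as follows. This is a hedged illustration: the text states only that the node parameters were recalculated after counting TDs as metastatic nodes, so adding the TD count to both the LNC and the PLNC is an assumption of this sketch, as are the names `ModifiedNodes` and `modify_node_parameters`.

```python
from typing import NamedTuple, Optional


class ModifiedNodes(NamedTuple):
    mlnc: int               # modified lymph node count
    mplnc: int              # modified positive lymph node count
    mlnr: Optional[float]   # modified lymph node ratio
    nlnc: int               # negative lymph node count (unchanged by TDs)


def modify_node_parameters(lnc: int, plnc: int, td: int) -> ModifiedNodes:
    """Recalculate node parameters with each tumor deposit counted as a positive node."""
    if min(lnc, plnc, td) < 0 or plnc > lnc:
        raise ValueError("require non-negative counts and PLNC <= LNC")
    # Assumption: a TD adds one node to the examined and to the positive count.
    mlnc = lnc + td
    mplnc = plnc + td
    # Assumption: the ratio is left undefined when there is nothing to count.
    mlnr = mplnc / mlnc if mlnc > 0 else None
    return ModifiedNodes(mlnc, mplnc, mlnr, lnc - plnc)


if __name__ == "__main__":
    # A node-negative patient with two TDs becomes "node-positive" after modification.
    print(modify_node_parameters(lnc=12, plnc=0, td=2))
```

Under this assumption, a patient with 12 negative nodes and 2 TDs has an mPLNC of 2 and an mLNR of 2/14, whereas the unmodified LNR would be 0.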

Internal and external validations. The c-indexes of the nomogram in the training and test cohorts were 0.750 (95% CI, 0.736-0.764) and 0.770 (95% CI, 0.754-0.786), respectively. Similarly, the bias-corrected c-indexes for the training and test cohorts were 0.749 and 0.769, respectively, indicating no meaningful change after correction. Calibration plots displayed good agreement between the observed and the nomogram-predicted OS at different time points in both the training (Fig. 3B to E) and test cohorts (see Supplementary Fig. S1). The time-dependent ... The decision curves (Fig. 3F to I) showed that the nomogram consistently outperformed the AJCC stages, as the nomogram was associated with improved net benefits (higher lines of prediction by the nomogram). However, the range of threshold probabilities within which a predictive model was clinically valid was comparable between the nomogram and the AJCC stages. The results of the DCA remained stable in the test cohort (see Supplementary Fig. S2).
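The net benefit plotted in a decision curve can be illustrated with a simplified sketch. It treats the outcome as a binary event at a fixed time point and ignores censoring, which is an assumption made for brevity; decision curve analysis for survival data normally estimates the event rates with K-M methods. The function names are hypothetical.

```python
from typing import Sequence


def net_benefit(outcomes: Sequence[int],
                predicted_risks: Sequence[float],
                threshold: float) -> float:
    """Net benefit of acting on every patient whose predicted risk >= threshold.

    outcomes        -- 1 if the event occurred by the time point, else 0
    predicted_risks -- model-predicted probability of the event
    threshold       -- threshold probability p_t, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for y, p in zip(outcomes, predicted_risks)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(outcomes, predicted_risks)
                    if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_assume_all(outcomes: Sequence[int], threshold: float) -> float:
    """Reference line that assumes the event occurs in every patient."""
    return net_benefit(outcomes, [1.0] * len(outcomes), threshold)


if __name__ == "__main__":
    y = [1, 1, 0, 0]             # hypothetical outcomes
    p = [0.9, 0.4, 0.6, 0.1]     # hypothetical predicted risks
    print(net_benefit(y, p, 0.3), net_benefit_assume_all(y, 0.3))
```

The "assume none" reference line is a net benefit of zero at every threshold. A model is useful over the range of thresholds where its curve lies above both reference lines, and a higher curve means a larger net benefit.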

Risk classifications and stratifications.
After the patients were scored and ranked according to percentiles, risk classifications and stratifications were implemented to illustrate the ability of the nomogram to make risk assessments in the training cohort. In general, the nine AJCC stages were unable to accurately predict the OS of patients with CRC, particularly for those with stage II and stage III disease (Fig. 4A, IIIA vs. I, P log-rank = 0.766; IIIA vs. IIA, P log-rank = 0.080; IIIB vs. IIB, P log-rank = 0.776). Conversely, the nomogram was able to classify patients with stage I-IV disease into nine significantly different prognostic groups (Fig. 4B, all P log-rank < 0.016 for pairwise comparisons). Based on the percentile scores of the particular stages, the nomogram could also stratify patients with stage I (Fig. 4C, all P log-rank < 0.002 for pairwise comparisons), stage II-III (Fig. 4D, all P log-rank < 0.047 for pairwise comparisons) and stage IV (Fig. 4E, all P log-rank < 0.001 for pairwise comparisons) disease into a number of significantly different risk subgroups. Additionally, responses to postoperative radiation therapy (first course therapy) in patients who received postoperative radiation (n = 325) might also be predicted by the nomogram (Fig. 4F, P log-rank < 0.001).
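The percentile-based grouping of nomogram scores can be sketched as below. The text does not state which percentile cut points were used, so the equal-sized groups, the function name and the 0-based group labels are assumptions for illustration only.

```python
from typing import List, Sequence


def assign_risk_groups(total_scores: Sequence[float], n_groups: int) -> List[int]:
    """Split patients into n_groups of (nearly) equal size by rank of total score.

    Returns one group label per patient, from 0 (lowest scores) to
    n_groups - 1 (highest scores). Assumption: tied scores are separated
    by their order in the input, so ties may straddle a group boundary.
    """
    n = len(total_scores)
    if n_groups < 1 or n_groups > n:
        raise ValueError("need 1 <= n_groups <= number of patients")
    # Patient indices ordered from the lowest to the highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    groups = [0] * n
    for rank, patient in enumerate(order):
        groups[patient] = rank * n_groups // n  # percentile rank mapped to a group
    return groups


if __name__ == "__main__":
    scores = [10, 50, 30, 90, 70, 20]  # hypothetical nomogram total points
    print(assign_risk_groups(scores, 3))
```

Each resulting group can then be plotted as its own K-M curve and compared pairwise with the log-rank test, as was done for the nine prognostic groups.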

Discussion
In the present study, we evaluated the survival heterogeneity that results from use of the pN stages and proposed a new prognostic nomogram that was able to avoid the limitations associated with the AJCC staging system. The nomogram achieved stable improvements in predictive accuracy, net benefits and reproducibility through the incorporation of the expression of CEA, pT stages, NLNC, mLNR and metastasis without a significant increase in degrees of freedom (df = 9). As supported by a number of previous studies 5,16-19, there are several reasons why TDs should be considered metastatic lymph nodes irrespective of lymph node status. Most importantly, our study showed that TDs and metastatic lymph nodes had a comparable impact on the survival of patients with CRC and that TDs also imposed risks on node-positive CRC. Consistent with our study, an investigation 16 of patients with node-positive CRC reported an increased recurrence rate (49.2% vs. 14.4%, P < 0.001) and decreased OS (P < 0.001) after surgery in those with TDs compared with those without TDs. Other important supportive reasons include the finding that metastases in the TDs and lymph nodes shared similar recurrence patterns 17 and that pathologists experience substantial difficulty in completely differentiating these two entities 18,19. Indeed, our study showed that the mAJCC stages, which were simplified by the combination of TDs and metastatic lymph nodes, achieved a higher log-likelihood and a lower AIC in comparison with conventional AJCC staging. Some studies have also reported that this combination enhanced the diagnostic objectivity 19 and predictive accuracy 19-21 of the pN stages.
Table 3. Multivariate Cox regression analysis in the training cohort. HR, hazard ratio; 95% CI, 95% confidence interval; ref, referent; CEA, carcinoembryonic antigen; NLNC, negative lymph node count; mLNR, modified lymph node ratio; mLNC, modified lymph node count; mPLNC, modified positive lymph node count.
Due to the aforementioned limitations, the higher pN stages did not seem to necessarily be associated with shortened survival. The results of the K-M curve analyses revealed that use of the pN stages led to both underestimates (i.e., pN2, LNR ≤ median and pN1, CEA (−)) and overestimates in OS (i.e., pN0, CEA (+) and pN1, CEA (+)) of patients with CRC. We observed that 76.5% (224/293) of the stage IIIA patients and 46.0% (684/1490) of the stage IIIB patients in the training cohort constituted 22.2% and 67.7%, respectively, of the pN1 CEA (−) patients (n = 1010) who were identified to be at risk for underestimation by the pN stages. We also observed that 17.6% (275/1563) of the stage I patients and 34.1% (609/1787) of the stage IIA patients accounted for 24.3% and 53.8%, respectively, of the pN0 CEA (+) patients (n = 1132) whose survival was likely to be overestimated. This explains precisely why some of the stage II patients exhibited a worse survival than stage III patients. In addition, the results indicate that pN1 CEA (−) and pN0 CEA (+) patients may be treated as high-risk stage II patients for whom adjuvant therapies are appropriate, but this requires further validation. The nomogram successfully avoided the above-mentioned limitations of the pN stages by the inclusion of other node parameters. It was not accidental that the NLNC and mLNR, rather than the mLNC and mPLNC, were prioritized by the multivariate Cox analyses. The mLNR contained additional information about the NLNC, which improved the predictive accuracy of the mPLNC. A recent systematic review confirmed that the prognostic value of the LNR was superior to that of the PLNC 22. Moreover, the nomogram allowed the mLNR to be continuously represented. This further avoided the problem of threshold variability in the LNR, which has made studies difficult to compare and hindered the application of the LNR 22. In contrast to the mLNR, the NLNC and LNC were applicable to patients with either early or advanced CRC. Consistent with many other studies 3,8,9,23,24, our analyses revealed a positive association among the NLNC, LNC and OS. The mechanisms of this association are increasingly linked to confounders that simultaneously correlate with the LNC and survival of patients with CRC 6,7. An emerging role of the adaptive immune response to tumors is also highlighted to characterize the LNC as a patient-specific marker rather than as a quality indicator 25. Despite the association, the NLNC showed an advantage over the LNC as a more significant predictor in our study.
One reason may be that the favorable effect of the NLNC on OS is more relevant and stable than that of the LNC: the LNC in node-positive patients comprises both the NLNC and the PLNC, and its effect may be neutralized because the two components exert opposite effects on prognosis. This is in accord with a recent population-based study, in which the 12-node benchmark proved to be an independent prognostic factor in patients with stage I-III disease (n = 13,941, HR = 0.67) but not in patients with stage III-IV disease (n = 6810, P = 0.136) 24. Another possible reason is that the influence of the LNC on patient survival is more easily diminished by improvements in the quality of external pathology with increasing awareness of the 12-node minimum requirement 26. In contrast, the NLNC may be more intrinsically related to enhanced regional lymphocytic reactions that result in an increased NLNC and prolonged survival 27. Therefore, the NLNC is a better predictor of survival than the LNC.
Figure 3 (caption, partial). In the plots of decision curve analysis, the "assume none" lines represented the assumption that no event occurred, while the "assume all" lines represented the assumption that events occurred in all the patients. CEA, carcinoembryonic antigen; pT, pT stages; NLNC, negative lymph node count; mLNR, modified lymph node ratio; OS, overall survival; AJCC, the American Joint Committee on Cancer.
Together with previous findings, our study provided one of the first nomograms that incorporates CRC patients with and without metastasis using population-based data. This nomogram is also the first CRC prognostic nomogram that contains a modification of the algorithm for the presence of TDs and the pN stages through the incorporation of both the NLNC and the mLNR. Compared with the published nomograms 28 and the AJCC stages for CRC, our nomogram exhibited improved accuracy without a significant increase in model complexity. Nonetheless, our study does have some limitations that deserve attention. Since the analyses were performed retrospectively, selection biases might be underestimated. The duration of the follow-up periods in both cohorts is relatively short because the SEER program did not collect data on TDs until the year 2010. Although we have demonstrated that the performance of the nomogram is reliable and reproducible, this nomogram may still require validation by independent studies with a longer follow-up period. It should also be noted that the SEER research database lacks chemotherapy information, although such data were not required for the development of the nomogram. Additionally, the inclusion of new biomarkers such as cell-free DNA 29 and circulating tumor cells 30 may improve the performance of the nomogram. Lastly, tumor location (i.e., right-sided vs. left-sided location) is associated with site-specific genetic alterations 31 that may biologically determine tumor recurrence and outcome 32. Thus it may be a simple, reproducible and robust predictor for future modifications of nomograms 33 and AJCC stages.
In summary, our study demonstrates substantial survival heterogeneity among the pN stages, which decreases the performance of the AJCC staging system. The quantification of TDs as metastatic lymph nodes is an effective and practical modification that improves predictive accuracy. Based on that modification, the nomogram that incorporates CEA expression, pT stages, the NLNC, the mLNR and metastasis has been internally and externally validated as a useful tool for risk assessments. This nomogram also outperformed the conventional AJCC staging system in both the training and test cohorts with increased predictive accuracy and net benefits but with comparable complexity and clinical validity. Thus, this nomogram holds promise for future application in clinical practice. However, this nomogram still requires independent validations with longer durations of follow-up.