Development and validation of a novel staging system integrating the number and location of lymph nodes for gastric adenocarcinoma

Background Evidence suggests that the anatomic extent of metastatic lymph nodes (MLNs) affects prognosis, as proposed by alternative staging systems. The aim of this study was to establish a new staging system based on the number of perigastric (PMLN) and extra-perigastric (EMLN) MLNs. Methods Data from a Chinese cohort of 1090 patients who had undergone curative gastrectomy with D2 or D2 plus lymphadenectomy for gastric cancer were retrospectively analysed. A Japanese validation cohort (n = 826) was included. Based on the Cox proportional hazards model, the regression coefficients of PMLN and EMLN were used to calculate modified MLN (MMLN). Prognostic performance of the staging systems was evaluated. Results PMLN and EMLN were independent prognostic factors in multivariate analysis (coefficients: 0.044, 0.115; all P < 0.001). MMLN was calculated as follows: MMLN = PMLN + 2.6 × EMLN. The MMLN staging system showed superior prognostic performance (C-index: 0.751 in the Chinese cohort; 0.748 in the Japanese cohort) compared with the five published LN staging systems when MMLN numbers were grouped as follows: MMLN0 (0), MMLN1 (1–4), MMLN2 (5–8), MMLN3 (9–20), and MMLN4 (>20). Discussion The MMLN staging system is suitable for assessing overall survival among patients undergoing curative gastrectomy with D2 or D2 plus lymphadenectomy.


BACKGROUND
Comprehensive and appropriate therapeutics, including endoscopy, surgery, radiotherapy, chemotherapy, and immunotherapy, have improved outcomes of gastric cancer patients. [1][2][3][4][5] Surgery remains vital in the treatment of resectable, non-metastatic gastric cancer. 6 Tumour invasion depth and lymph node (LN) status, which are used in almost all gastric cancer staging systems, are essential independent prognostic factors for overall survival (OS) following a microscopically margin-negative (R0) resection. [7][8][9][10] LN classification in the eighth edition of the tumour-node-metastasis (TNM) staging system of the American Joint Committee on Cancer/International Union Against Cancer (AJCC/UICC) and the 15th edition of the Japanese Gastric Cancer Association (JGCA) staging system, the world's most commonly used staging systems, is based on the number of metastatic LNs (MLNs). 10,11 However, before the fifth edition of the TNM and the 14th edition of the JGCA staging systems, LN classification was based on the anatomical location of MLNs. 12,13 The numeric LN staging system, with multiple updates to the cut-off values, showed high accuracy in survival prediction; however, suggested limitations include the lack of information on the anatomical extent of MLNs and on the total number of LNs (TLNs) retrieved during surgery. [14][15][16][17] Son et al. 14 proposed including the anatomic extent of MLNs in a staging system to predict gastric cancer prognosis more accurately. Choi et al. 18 developed an alternative LN staging system based on anatomical location, which reclassified the LN stations into lesser-curvature (LC), greater-curvature (GC), and extra-perigastric (EP) groups. Chen et al. 19 developed yet another LN staging system based on both the number and anatomic location of metastatic LNs, which was considered a more efficient prognostic indicator than the JGCA and TNM staging systems.
Other authors have proposed the LN ratio (LNR), the ratio of metastatic LNs to the total number of retrieved LNs, and the log odds of metastatic LNs (LODDS), defined as the log of the ratio between the probability of being a positive LN and the probability of being a negative LN. These might be better LN staging systems, as they take into account the number of LNs retrieved during surgery. [20][21][22][23] The National Comprehensive Cancer Network (NCCN) guidelines recommend D2 gastrectomy, which includes systematic lymphadenectomy of N1 and N2 group LNs, as the standard treatment for advanced gastric cancer. 24 In many high-volume centres in East Asian countries, D2 gastrectomy is performed routinely and has been shown to improve survival. [25][26][27] Considering the impact of the anatomic location of MLNs on survival, given D2 surgery outcomes, we hypothesised that the number of MLNs in the perigastric (LN stations No. 1, 2, 3, 4, 5, 6) and EP areas (LN stations No. 7, 8, 9, 10, 11, 12, etc.) might influence prognosis differently. In other words, the effect of each extra-perigastric MLN (EMLN) on prognosis might be different from that of each perigastric MLN (PMLN). Therefore, this study aimed to define the different prognostic effects of PMLNs and EMLNs and to develop and validate a new staging system based on the number of PMLNs and EMLNs. Comparisons were made with the eighth edition AJCC LN, Choi's, 18 Chen's, 19 LNR, and LODDS staging systems to confirm the prognostic value of the new staging system.
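The LNR and LODDS definitions given above can be illustrated with a minimal sketch. The function names are hypothetical, and the 0.5 continuity correction in `lodds` is an assumption (a common convention that avoids division by zero when all or none of the retrieved nodes are positive); the text above does not state which correction, if any, was used.

```python
import math


def lnr(mln: int, tln: int) -> float:
    """LN ratio: metastatic LNs divided by total retrieved LNs."""
    if tln <= 0 or not 0 <= mln <= tln:
        raise ValueError("require tln > 0 and 0 <= mln <= tln")
    return mln / tln


def lodds(mln: int, tln: int, correction: float = 0.5) -> float:
    """Log odds of metastatic LNs.

    Odds = positive nodes / negative nodes. The `correction` term is an
    assumed continuity correction, not a value taken from the paper.
    """
    if tln <= 0 or not 0 <= mln <= tln:
        raise ValueError("require tln > 0 and 0 <= mln <= tln")
    negative = tln - mln
    return math.log((mln + correction) / (negative + correction))
```

For example, a patient with 5 metastatic nodes among 20 retrieved has an LNR of 0.25, whereas LODDS is zero exactly when positive and negative nodes are equal in number.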

Patient selection
Patients who had undergone curative-intent resection for gastric cancer at the Department of Gastrointestinal Tumor Center at Peking University Cancer Hospital (PUCH) between January 1, 2007 and December 31, 2015 were selected. Inclusion criteria were a histopathological diagnosis of gastric adenocarcinoma with no combined malignant neoplasm or distant metastasis and treatment with R0 gastrectomy and D2 or D2 plus lymphadenectomy. Patients with remnant gastric cancer, preoperative chemotherapy, incomplete clinicopathological or follow-up information, and less than 1-month postoperative survival were excluded. According to the JGCA guidelines, 28 D2 lymphadenectomy cannot be performed during proximal gastrectomy; hence, patients treated with proximal gastrectomy were excluded. A total of 1090 eligible patients were included (Fig. 1).
The validation cohort included 826 patients treated between 2000 and 2007 at the Cancer Institute Ariake Hospital (CIAH), Tokyo, Japan, who met the aforementioned inclusion and exclusion criteria (Fig. 1). The data for the CIAH cohort contained no unique personal identifiers and were extracted from a publicly accessible database. 29

Clinicopathological data
The clinicopathological dataset included patient demographics (age, sex), pathological variables (location, size, histological type, differentiation, invasion depth, LN status, Lauren type, vascular invasion), follow-up duration, and survival status at last follow-up (April 2019).
LN status included the anatomic location of each MLN and the number of TLNs. All patients in the PUCH cohort were treated by the same team of experienced surgeons. With reference to the Japanese gastric cancer treatment guidelines, all the surgeons reached a consensus on the surgical procedures and the extent of lymphadenectomy to ensure consistent surgical outcomes. After radical gastrectomy with D2 or D2-plus lymphadenectomy per the Japanese gastric cancer treatment guidelines, the surgeons dissected the LNs from the gastrectomy specimen. All LNs were placed in containers marked with numbers corresponding to the numerical system for LN identification described by the Japanese Research Society for Gastric Cancer (JRSGC). 30 Two or more trained pathologists examined the specimens by palpation, retrieved as many LNs as possible regardless of size, and reported the number of TLNs and MLNs in each station. We then counted all MLNs and TLNs and divided the MLNs into PMLNs and EMLNs.
To facilitate comparisons with other LN staging systems, we recorded the following data. PMLNs were divided into LC and GC groups, as proposed by Choi

Statistical analysis
The clinicopathological characteristics of the study cohorts were presented as mean (standard deviation) for continuous variables and as counts and proportions for categorical variables. Univariate and multivariate Cox proportional hazards regression models were used to analyse the relationship between the clinicopathological variables and OS. Variables that were statistically significant (P < 0.05) in the univariate analysis were included in the multivariate analysis (stepwise backward elimination). In a multivariate Cox model, the regression coefficient reflects the influence of a variable on prognosis; the regression coefficients of PMLNs and EMLNs were therefore extracted from the multivariate Cox model of the PUCH cohort. The ratio of the regression coefficients of EMLNs and PMLNs reflects their different prognostic effects, and we used this ratio to adjust the number of EMLNs. The number of modified metastatic LNs (MMLNs) was then calculated using the following formula:
MMLN = PMLN + (β EMLN / β PMLN) × EMLN
where β PMLN and β EMLN were the regression coefficients of PMLNs and EMLNs in the multivariate Cox model. MMLN was considered to be a continuous variable. To group the MMLN, the number of MMLNs was first rounded to an integer; four cut-off points were then set to divide the MMLNs into five groups, consistent with the number of LN categories in the eighth edition TNM system. We enumerated the different combinations of cut-off points, fitted a Cox regression model for each, and calculated Harrell's concordance index (C-index), which is often used to evaluate the discriminative ability of a model. A higher C-index indicates better discriminative ability. We selected the cut-off points with the maximum C-index to construct our MMLN staging system.
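The calculation and grouping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the default weight of 2.6 and the cut-offs (0, 1–4, 5–8, 9–20, >20) are those reported in the abstract, the function names are hypothetical, and the half-up rounding rule is an assumption because the text says only that MMLN was "rounded to an integer".

```python
import math

# Weight reported in the abstract (MMLN = PMLN + 2.6 x EMLN).
REPORTED_WEIGHT = 2.6


def mmln(pmln: int, emln: int, weight: float = REPORTED_WEIGHT) -> float:
    """Modified metastatic LN count: PMLN plus weighted EMLN."""
    if pmln < 0 or emln < 0:
        raise ValueError("node counts must be non-negative")
    return pmln + weight * emln


def mmln_stage(value: float) -> str:
    """Map a continuous MMLN value to the five reported categories."""
    n = math.floor(value + 0.5)  # assumed rounding rule (half-up)
    if n == 0:
        return "MMLN0"
    if n <= 4:
        return "MMLN1"
    if n <= 8:
        return "MMLN2"
    if n <= 20:
        return "MMLN3"
    return "MMLN4"
```

Under these assumptions, a patient with 3 PMLNs and 2 EMLNs has an MMLN of 8.2, which rounds to 8 and falls in MMLN2, although the same patient has only 5 metastatic nodes in total.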
The Kaplan−Meier method with the log-rank test was used to explore differences in survival between the strata established by the eighth edition AJCC LN, Choi's, Chen's, LNR, LODDS, and our MMLN staging system. Comparisons of the predictive value of each LN staging system were performed using the C-index, the likelihood-ratio test, and Akaike's Information Criterion (AIC). A model with a low AIC, a high likelihood-ratio χ 2 score, and a high C-index had a better predictive value. All analyses were performed using the R software (version 3.6.0; https://www.r-project.org/). Statistical significance was set at a two-sided P value < 0.05.
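To make the comparison criteria concrete, the sketch below shows a simplified Harrell's C-index and the usual AIC formula. It is illustrative only: the analyses in this study were run in R, the function names are hypothetical, and the handling of tied survival times here (tied pairs are simply not counted as comparable) is a simplifying assumption that statistical packages may treat differently.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Fraction of comparable patient pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's follow-up time. Higher risk for the
    earlier event is concordant; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 log L (lower is better).

    For a Cox model, log L would be the partial log-likelihood.
    """
    return 2 * n_params - 2 * log_likelihood
```

A C-index of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is why a staging system with a higher C-index and a lower AIC is preferred.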

Clinicopathological characteristics
The clinicopathological characteristics of the PUCH (n = 1090) and CIAH (n = 826) cohorts, with median follow-up durations of 49.2 (range 1-136) and 30.0 (range 1-95) months, respectively, are presented in Table 1. The sex ratio was similar between the two cohorts (P = 0.751), but other variables showed significant differences. Patients

Prognostic factors in multivariate analysis and development of the MMLN staging system
Univariate and multivariate analyses were performed on the PUCH cohort (Table 2). In the univariate analysis, both PMLNs and EMLNs significantly affected OS (all P < 0.05). In the multivariate analysis, the extent of gastrectomy, pT stage, TLNs, PMLNs, and EMLNs were independent prognostic factors. We extracted the regression coefficients of PMLNs (β PMLN = 0.044) and EMLNs (β EMLN = 0.115) in Table 2 Supplementary Table S1. The MMLN staging system was further examined, and univariate and multivariate analyses confirmed that the MMLN staging system was an independent prognostic factor in the validation cohort (P < 0.001, Supplementary Table S2).
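The reported coefficients can be read as per-node hazard ratios, since a Cox coefficient β corresponds to a hazard ratio of exp(β) per unit increase. The short sketch below works through that arithmetic using only the two coefficients quoted above; the function name is hypothetical and the interpretation assumes the other covariates in the model are held fixed.

```python
import math

BETA_PMLN = 0.044  # coefficient reported for PMLNs
BETA_EMLN = 0.115  # coefficient reported for EMLNs


def per_node_hazard_ratio(beta: float) -> float:
    """Hazard ratio associated with one additional metastatic node."""
    return math.exp(beta)


hr_pmln = per_node_hazard_ratio(BETA_PMLN)
hr_emln = per_node_hazard_ratio(BETA_EMLN)

# Ratio of the coefficients: how many PMLNs one EMLN is "worth"
# on the log-hazard scale.
weight = BETA_EMLN / BETA_PMLN

print(f"HR per PMLN: {hr_pmln:.3f}")
print(f"HR per EMLN: {hr_emln:.3f}")
print(f"EMLN weight: {weight:.2f}")
```

The coefficient ratio is about 2.61, which matches the weight of 2.6 used in the MMLN formula given in the abstract.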

Survival analysis based on the MMLN classification within the other five LN classifications
The Kaplan−Meier survival curves for the PUCH cohort stratified into the eighth edition AJCC LN system-based subgroups are plotted in Fig. 2a-c. As pN0 is equivalent to MMLN0 and only two patients of MMLN2 were included in the pN1 stage, the survival curves of pN0 and pN1 stages are not shown. Within each pN stage, significantly different survival rates are shown for each of the MMLN categories (log-rank test: pN2, P = 0.008; pN3a, P = 0.003; pN3b, P = 0.003). Results of the survival analysis for the MMLN classification-based strata within the AJCC LN, LNR, LODDS, Choi's, and Chen's staging systems are shown in Table 3. The MMLN staging system was able to distinguish groups associated with different OS within most of the subgroups distinguished by each of these previously proposed staging systems. In contrast, the previously proposed LN-based staging systems did not distinguish differences in OS within the groups based on the MMLN staging system. The corresponding survival curves are shown in Supplementary Fig. S1.

Comparisons of the prognostic performance of all LN-based staging systems
Results from Cox regression modelling, C-index values, AIC values, and likelihood-ratio χ 2 scores were compared in the PUCH and CIAH cohorts. In the PUCH cohort, the MMLN staging system showed the best prognostic performance (C-statistic: 0.751; AIC: 1026.1.2; likelihood-ratio χ 2 score: 288.8, Table 4). Neither the LNR nor the LODDS classification, both based on MLNs and TLNs, was better than the eighth edition AJCC LN staging system. Of the two staging systems based on the status and anatomic location of MLNs, Chen's was not better than any other LN staging system, whereas Choi's was second only to the MMLN staging system. The MMLN staging system had the best discriminative ability (C-statistic = 0.748) and homogeneity (likelihood-ratio χ 2 score = 68.5) in the CIAH validation cohort, but the LNR staging system performed marginally better in terms of the AIC value (LNR vs. MMLN: AIC, 785.5 vs. 785.7). Survival curves for all six LN staging systems in the CIAH cohort are plotted in Fig. 2d−i. Although the MMLN staging system showed an overall difference in OS (P < 0.001), the MMLN2, MMLN3, and MMLN4 classifications failed to prognostically discriminate patients (log-rank test: MMLN2 vs. MMLN3, P = 0.361; MMLN3 vs. MMLN4, P = 0.320). Similarly, the last two or three subgroups within each of the other five staging systems also showed similar survival curves (all P > 0.05).

DISCUSSION
The prognostic values of PMLNs and EMLNs in patients treated with curative resection with at least a D2 lymphadenectomy for gastric cancer were studied. In multivariate analysis, both PMLNs and EMLNs showed prognostic value independent of the extent of gastrectomy, TLNs, and pT stage. We defined one EMLN as equivalent to approximately 2.6 PMLNs in terms of the degree of influence on prognosis. A modified numeric-based LN staging system, namely the MMLN staging system, was established by combining the prognostic weights of PMLNs and EMLNs. The new system provided good discriminative ability and homogeneity for data collected from two high-volume hospitals in East Asian countries.
LN status in gastric cancer is one of the most robust predictive variables of OS after gastrectomy, and research over the last decade has focused on defining an optimal LN-based staging system. 23,[32][33][34][35][36] Even for the widely used AJCC TNM staging system, recent editions have mainly optimised the LN staging cut-off values and the number of subgroups. [37][38][39][40] To our knowledge, the present study is the first to evaluate and demonstrate the differences in prognostic power between PMLNs and EMLNs in terms of their respective numbers. The MMLN staging system is based on accurate information on the number of MLNs in each LN group, unlike the AJCC LN staging system, which uses the total number of MLNs only. We used multiple indices, including Harrell's C-index, the likelihood-ratio χ 2 score, and the AIC, to evaluate and compare the prognostic value of each LN staging system. Although all LN staging systems showed good predictive accuracy, the prognostic performance of the MMLN staging system was the best. The superiority of the MMLN staging system was demonstrated in survival analysis across the subgroups of the other five LN classifications (Fig. 2a−c and Table 3). The MMLN classification showed good discriminative ability within most subgroups of the other five LN staging systems and showed sufficient homogeneity within each MMLN-derived group across the other five LN staging systems. Furthermore, the results in the validation cohort strengthen our findings. Both China and Japan have a high incidence of gastric cancer, but they have completely different epidemiological characteristics. 41 Our study cohorts also showed significant differences in clinicopathological characteristics, but the MMLN staging system still showed good predictive power in the CIAH cohort, indicating that our staging system has good generalisability.
Therefore, in high-volume centres that routinely perform high-quality D2 lymphadenectomy, MMLN classification could be recommended for inclusion in the prognostic evaluation system. A D2 lymphadenectomy and assessment of the number of MLNs in each group are prerequisites for applying the MMLN staging system. Although the MMLN staging system requires knowledge of the cut-off values and the calculation formula, we believe that it is still simple and easy to use in clinical work.
Although the anatomic location of MLNs is no longer included in the TNM and JGCA staging systems, we considered it essential for improving the predictive accuracy of any LN staging system. Chen's system is an LN staging system that combines the number and anatomic location of MLNs, but it uses a cut-off value of overall MLNs to establish subgroups for the N1, N2, and N3 groups. 19 However, in the present study, this system showed poor prognostic performance, suggesting that using such a cut-off value is not accurate enough.
Recently proposed LN staging systems, such as LODDS and LNR, emphasise the importance of TLNs. 15,22,36,46 However, in our study, when the MMLN staging system was included in multivariate analysis, TLN was excluded, suggesting that it is not an independent prognostic factor. In addition, high-quality D2 lymphadenectomy implies adequate LN dissection, whereas the number of TLNs cannot be used to evaluate the extent of LN dissection. Therefore, neither the LODDS nor the LNR staging system performed better than our MMLN staging system.
This study has several limitations. First, the extent of D2 lymphadenectomy changed from a more extensive to a less extensive form between 2007 and 2015. To minimise the within-study heterogeneity likely to result from this change, we also included patients with D2 plus lymphadenectomy, according to the latest edition of the Japanese gastric cancer treatment guidelines. Second, the MMLN staging system is based on high-quality D2 lymphadenectomy along with the status of PMLNs and EMLNs. As such, it might not be suitable for patients recommended D1 lymphadenectomy or with fewer than 16 LNs retrieved. Further research is necessary to expand the applicability of the MMLN staging system to patients treated with D1-plus lymphadenectomy. Third, although the MMLN staging system performed well in both the PUCH and CIAH cohorts, it failed to discriminate some subgroups of patients within the CIAH cohort. This finding can be explained by the high proportion of patients with earlier-stage gastric cancer in the CIAH cohort. Although the distinct clinicopathological characteristics of the Japanese cohort strengthened the generalisability of the MMLN staging system, the limited number of late-stage cases (AJCC pN3a + pN3b less than 10%) in the Japanese cohort also weakened the advantages of the MMLN staging system in these stages. Using a larger cohort might help validate the prognostic performance of the MMLN staging system.
Finally, this study included only Asian cohorts; future studies are required to validate the applicability to Western cohorts.

CONCLUSION
To summarise, this study demonstrates the differences in prognostic power between PMLNs and EMLNs in terms of their respective numbers and establishes the MMLN staging system on the basis of the number of PMLNs and EMLNs. Comparisons with the eighth edition AJCC LN, LNR, LODDS, Choi's, and Chen's staging systems suggest better prognostic performance of the MMLN staging system. We recommend the MMLN staging system in patients treated with curative gastrectomy with at least D2 lymphadenectomy for prognostication.

AUTHOR CONTRIBUTIONS
All the authors contributed to writing of the report. Z.L. and J.J. designed the study. X.W., F.S. and X.G. enrolled patients and collected the data. J.J., X.W., Z.L., X.G. and F.S. performed quality control of data and algorithms. Z.L., X.G., Y.Z. and X.Y. analysed all the data. X.W. and X.G. interpreted the results. All the authors read and approved the final manuscript.

ADDITIONAL INFORMATION
Ethics approval and consent to participate This study was performed according to the Helsinki Declaration of 1964 and later versions, and was approved by the institutional review board (IRB) of Peking University Cancer Hospital (IRB number: 2019KT05). Patients in the PUCH cohort provided written informed consent for the use of patient specimens and clinical data for research purposes. The data for the CIAH cohort contained no unique personal identifiers and were extracted from a publicly accessible database. Hence, patient informed consent was not required.
Data availability Data of the PUCH cohort are not shared, owing to privacy or ethical restrictions. Data of the CIAH cohort are openly available through the Internet.
Competing interests The authors declare no competing interests.
Funding information This study is supported by the Peking University Clinical Scientists Program (BMU2019LCKXJ011), which is supported by "the Fundamental Research Funds for the Central Universities", to J.J.
Abbreviations: AIC Akaike's Information Criterion, MMLN modified metastatic lymph node, AJCC American Joint Committee on Cancer, LNR lymph node ratio, LODDS log odds of metastatic lymph nodes.
Development and validation of a novel staging system integrating the. . . Z Li et al.