Introduction

Endometrial cancer (EC), the fifth most frequent cancer in women, accounts for an estimated 382,000 new cases and 90,000 deaths worldwide annually1. In Japan, 17,089 patients were newly diagnosed with EC and 2597 patients with EC died in 20182. In the United States, the mortality rate due to EC, which initially decreased, has been rising since 2010, and nearly 12,550 deaths were estimated for 20223. EC is the most commonly observed gynecological malignancy4 and is classified into two types: type 1, comprising grade 1 or grade 2 endometrioid carcinoma, which is regarded as low-grade EC (LGEC) and accounts for approximately 80% of cases; and type 2, which has a high histologic grade and aggressive clinical behavior and accounts for the remaining 20%5,6. High-grade endometrial carcinomas (HGECs) are a heterogeneous group of tumors that include grade 3 endometrioid, serous, and clear cell carcinomas. Although the overall prognosis of patients with EC is generally good, with an 80% overall survival (OS) at 5 years, 15–20% of patients with a low-risk profile still experience recurrence7. Furthermore, the outcomes for patients with EC with systemic recurrence are poor, with a median survival rarely exceeding 12 months8.

In EC, the assessment of lymph node metastasis (LNM) is an important factor in determining treatment strategies and predicting prognosis. LNM is closely related to poor prognosis and is an important factor in EC staging and in determining the need for adjuvant therapy9,10,11. Two randomized trials have shown no therapeutic benefit of systematic lymphadenectomy in patients at low risk for recurrence12,13. It has also been reported that preoperative stratification by imaging and histological assessments permits a reduction in lymphadenectomy to approximately 50%14. However, lymphadenectomy should be performed in patients with non-endometrioid histology or deeply infiltrating high-grade disease, both of which are known to behave more aggressively, even though lymphadenectomy carries a 10–20% risk of lower-extremity lymphedema and a 10–25% risk of lymphocele development15,16. Although lymphadenectomy should be performed based on a balance of risks and benefits, no international consensus has yet been reached on its eligibility criteria17.

The assessment of LNM risk often uses histology, serum cancer antigen 125 (CA125) level, myometrial invasion (MI), and/or lymphadenopathy by imaging as a part of the preoperative workup18,19. Although there have been several reports of preoperative LNM risk assessment in patients with EC, the performance of current preoperative risk assessment is moderate (sensitivity 67–92%)18,20,21,22,23,24,25,26. Todo et al. reported that preoperative clinicopathological factors such as volume index as assessed by magnetic resonance imaging (MRI), CA125, histological subtype, and grade according to biopsy are associated with LNM of EC22; further, they constructed a scoring system for prediction of LNM. The diagnostic performance of this scoring system was 92% for sensitivity and 53% for specificity23. Similarly, Kang et al. reported that high CA125 and three MRI parameters (deep MI, enlarged lymph nodes [LNs], and extrauterine extension) were significantly associated with LNM among patients at low risk for recurrence. They defined low risk as a predicted probability of LNM of less than 4% and developed criteria (Korean Gynecologic Oncology Group [KGOG] criteria) for identifying patients at low risk for LNM24; the criteria were validated in a prospective multicenter observational study. The diagnostic performance of the KGOG criteria was 85% for sensitivity, 56% for specificity, and 70% for the area under the receiver operating characteristic (ROC) curve19. Only a few studies have verified that the prediction of LNM is associated with postoperative prognosis. To develop a highly accurate and precise prediction model using preoperative factors, it is necessary to perform many validation studies and develop new methods.

To provide appropriate treatment to patients with suspected LNM, we assessed whether a prediction model constructed by machine learning algorithms using preoperative clinicopathological factors can be a useful tool for selecting patients who need lymphadenectomy. We also investigated how preoperative pathological factors affect LNM prediction and how the predicted LNM status relates to clinical outcomes.

Materials and methods

Two cohort studies consisting of NCCH and SUH

We conducted two retrospective cohort studies. Creasman et al. reported that a minimum of 10 LNs were removed during the surgery for EC. Thus, we adopted 10 LNs as our minimum cutoff27. Details on case selection are provided in the Supplementary Methods. All patients who had 10 or more LNs removed in addition to the mainstay surgery (hysterectomy, bilateral oophorectomy) between 2007 and 2018 at the National Cancer Center Central Hospital (NCCH) were included in this study to extract a strictly node-negative group that did not include clinically undetectable LNM. A total of 125 EC patients were enrolled in the NCCH cohort. Patients receiving neoadjuvant chemotherapy were excluded. This study was approved by the Institutional Review Board of the National Cancer Center Research Institute (2017–331).

Next, 129 EC patients who had undergone initial surgery (hysterectomy, bilateral salpingo-oophorectomy, and resection of 10 or more LNs) after being diagnosed between 2006 and 2017 at the Showa University Hospital (SUH) were enrolled in the SUH cohort. After obtaining approval from the institutional ethical and research review boards of SUH (approval number: 2544), this study was conducted following the ethical guidelines of the Declaration of Helsinki.

Table 1 summarizes the patient characteristics and preoperative clinicopathological factors for the 254 cases comprising the NCCH and SUH cohorts. Pathological characteristics of the patients' resected samples are summarized in Supplementary Table S1 for the NCCH cohort and Supplementary Table S2 for the SUH cohort. The general guidelines for the treatment of EC at each institution are described in the Supplementary Methods.

Table 1 Clinicopathological characteristics of 254 endometrial cancer patients consisting of the National Cancer Center Hospital and the Showa University Hospital cohorts.

Preoperative clinicopathological factors

Preoperative endometrial biopsy specimens from the NCCH and SUH cohorts were evaluated by at least two pathologists at each institution. In this study, grades 1 and 2 endometrioid endometrial carcinomas were defined as LGEC, and other histological or unknown grade endometrial carcinomas were defined as HGEC. In both cohorts, MRI was routinely used for the preoperative work-up of the patients with EC. Each patient's MRI data were examined by two radiologists at each institution. MI was classified as less than 50% invasion or 50% or more invasion based on the axial and sagittal images. LNs with a short axis longer than 1 cm were considered enlarged. Tumor diameter (TD) was defined as the maximum diameter on the sagittal T2-weighted images. An ROC curve for LNM was obtained from the TD measurements and used to calculate the cut-off value (TD: 47 mm). Serum CA125 levels were determined by chemiluminescent immunoassay using the preferred assay method of each institution. To determine the relationship between the measured serum CA125 levels and pathologic factors, a population should be divided into premenopausal and postmenopausal groups because serum CA125 levels are affected by ovarian hormones and aging28,29,30,31. In the current study, the patients were divided into two groups according to their menopausal status. An ROC curve for LNM was obtained from the CA125 values in each group, and the cut-off values were determined (52.3 U/mL [premenopausal] and 48 U/mL [postmenopausal]).
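The text does not state which criterion was used to read the cut-off from the ROC curve. One common choice is the threshold that maximizes Youden's J (sensitivity + specificity − 1). The following is a minimal sketch under that assumption; the function name and the measurements are hypothetical and are not study data.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A case is called positive when value >= cutoff.
    labels: 1 = LNM present, 0 = LNM absent.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical tumor diameters (mm) and LNM status, for illustration only.
td_mm = [18, 22, 30, 35, 41, 45, 50, 58, 64, 70]
lnm = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(td_mm, lnm)
```

The same routine could be applied separately to the premenopausal and postmenopausal CA125 values to obtain one cut-off per group.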

Data splitting

The NCCH cohort (n = 125) was randomly divided into the NCCH training set comprising 75 patients and the NCCH test set with 50 patients; no significant differences in clinicopathological factors between the two sets were found (Supplementary Table S3). This resulted in the allocation of 33 patients with LNM and 42 patients without LNM to the NCCH training set, and 18 patients with LNM and 32 patients without LNM to the NCCH test set.
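A random 75/50 split of this kind can be sketched as follows. The seed, the function name, and the integer patient identifiers are assumptions made for illustration. The text does not say whether the split was stratified by LNM status, so this sketch performs a plain shuffle.

```python
import random


def random_split(patient_ids, n_train, seed=0):
    """Shuffle a copy of the IDs and cut it into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers 0..124 standing in for the NCCH cohort (n = 125).
ncch_ids = range(125)
train_ids, test_ids = random_split(ncch_ids, n_train=75, seed=2023)
print(len(train_ids), len(test_ids))  # 75 50
```

Fixing the seed makes the partition reproducible, which matters when the balance of clinicopathological factors between the two sets is checked afterwards.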

Construction and validation of the prediction models for LNM

In this study, the models were constructed using three methods: a logistic regression (LR) classifier as the baseline, and two supervised machine learning classifiers, support vector machine (SVM) and random forest (RF). SVM classifies data by determining a discriminative boundary from the data distribution, while RF classifies data by aggregating a multitude of decision trees. All the classifiers were implemented using the R packages randomForest, kernlab, and glm2 (method “ksvm” for SVM and “randomForest” for RF). Machine learning classifiers were trained using repeated five-fold cross-validation on the training dataset. Each prediction model was constructed using the NCCH training data and its predictive performance was validated using the NCCH test data and the SUH cohort.
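The resampling scheme behind repeated five-fold cross-validation can be illustrated with a short, library-free sketch. It is not the R code used in the study. The number of repeats, the seed, and the function name are assumptions, since the text specifies only that the folds were repeated.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, val_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [order[i::k] for i in range(k)]
        for f, val_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, val_idx


# 75 samples, matching the size of the NCCH training set.
splits = list(repeated_kfold(n_samples=75, k=5, repeats=3, seed=1))
```

Each repeat reshuffles the training set, so every patient serves as validation data exactly once per repeat, and hyperparameters can be chosen by averaging performance over all folds and repeats.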

Statistical analysis

Statistical analysis was performed using R software ver. 4.1.0 (R Foundation, Vienna, Austria) and JMP version 15.0.0 software (SAS Institute Inc., Cary, NC, USA). Variables that achieved statistical significance in the univariate analysis were subsequently included in the multivariate analysis. The level of statistical significance was set at P < 0.05. In logistic regression, we adopted the rule of 10 events per variable for the number of variables included in the multivariate analysis. Therefore, for the multivariate analysis, variables with the highest odds ratios in the univariate analysis were selected. Cumulative survival was estimated using the Kaplan–Meier method, and the difference in survival between the two groups was analyzed using the log-rank test. The effects of variables on OS or relapse-free survival (RFS) were determined via univariate and multivariate analyses using the Cox proportional hazards model with R and JMP software.
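As a worked example of the events-per-variable rule, the sketch below computes the largest number of predictors the rule would allow. It assumes the model is fitted to the full NCCH cohort and that the limiting count is the rarer outcome class; the function name is hypothetical.

```python
def max_predictors(n_events, n_non_events, events_per_variable=10):
    """Largest number of predictors allowed by the events-per-variable rule.

    The limiting count is the rarer outcome class.
    """
    return min(n_events, n_non_events) // events_per_variable


# NCCH cohort counts from the data-splitting description:
# 33 + 18 = 51 patients with LNM, 42 + 32 = 74 without.
print(max_predictors(51, 74))  # 5
```

Under these assumptions, at most five variables could enter the multivariate logistic regression for the NCCH cohort.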

Ethical approval

The study protocol was approved by the Institutional Review Board of the National Cancer Center Research Institute and of Showa University (Approval Numbers 2017–331 and 2544, respectively), and the study was conducted following the ethical guidelines of the Helsinki Declaration. Written informed consent was obtained from all the patients using an opt-out form. Patients who refused to provide consent were excluded from the study.

Informed consent

In the NCCH cohort, the general requirements for informed consent for the use of their samples in the research were obtained at their first visit to the NCCH. Information obtained in our study using samples collected after obtaining general informed consent from participants has been summarized on the NCCH website. Patients were free to revoke their presumed consent at any time point. We only used samples from patients who did not revoke their consent. Similarly, we informed patients treated before 2000 that the information summary of our study is published on the official NCCH website. Patients who refused to provide consent for the use of their residual samples were excluded from this study. The clinical data used in this study were collected from the patients’ medical records. Written informed consent was obtained from all patients in the SUH cohort.

Results

Association of preoperative clinicopathological factors with the risk for LNM

The clinical characteristics and pathological data of 254 patients are summarized in Table 1. Deep MI, enlarged LNs, large TD (as determined by MRI), and high serum CA125 levels were significantly more frequent in patients with LNM than in those without (P < 0.01). In both cohorts, there was no difference in the distribution of biopsy histological subtypes and grades between patients with and without LNM (Table 2). Multivariate analysis revealed that deep MI, enlarged LNs, and high serum CA125 levels were associated with the risk of LNM in the NCCH cohort. In the SUH cohort as well, univariate analysis showed that deep MI, enlarged LNs, large TD, and high serum CA125 levels were more frequent in patients with LNM than in those without (P < 0.05), and there was no difference in the frequency of biopsy histological types between patients with and without LNM (Table 2B). Multivariate analysis of the SUH cohort revealed that high serum CA125 levels were associated with the risk of LNM. In the combined analysis of 254 patients from the NCCH and SUH cohorts, deep MI, large TD, enlarged LNs, and high serum CA125 levels were independently associated with LNM in the multivariate analysis (Table 2C).

Table 2 Preoperative clinicopathological factors and risk of lymph node metastasis in patients with endometrial cancer.

Construction of predictive models for LNM detection using preoperative clinical factors

We investigated whether a predictive model for LNM could be constructed using the results of routine preoperative examinations, including MI, TD, LN enlargement, biopsy histology, and serum CA125 levels. Predictive models were constructed on the NCCH training set (n = 75) using three methods: (A) LR and supervised machine learning with (B) SVM and (C) RF. Models were validated on the NCCH test set (n = 50) and the SUH cohort (n = 129). The area under the ROC curve (AUC) was calculated to evaluate the predictive power of each model. Nearly all methods showed a high predictive performance, with an AUC above 0.80, and performance was similar across the two validation sets (Fig. 1).
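The AUC used here has a direct probabilistic reading: it is the chance that a randomly chosen patient with LNM receives a higher predicted score than a randomly chosen patient without LNM. The sketch below computes it from that definition; the function name and the scores are hypothetical and are not study data.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted LNM probabilities and true status, for illustration only.
pred = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
true = [1, 1, 0, 1, 0, 1, 0, 0]
print(auc(pred, true))  # 0.8125
```

Because the value depends only on the ranking of scores, it can be compared across LR, SVM, and RF even though the three methods output scores on different scales.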

Figure 1

Performance of the preoperative predictive model for lymph node metastasis in endometrial cancer (training set: National Cancer Center Hospital (NCCH) training set (n = 75); test sets: NCCH test set (n = 50) and Showa University Hospital (SUH) set (n = 129)). (A) Receiver operating characteristic (ROC) curves using logistic regression. (B) ROC curves using support vector machine. (C) ROC curves using random forest. The solid and dashed lines show the NCCH and the SUH test sets, respectively. AUC The area under the ROC curve, NCCH National Cancer Center Hospital, SUH Showa University Hospital.

Summary of the previously reported predictive performance

To evaluate the LNM predictive performance achieved in the present study, we compared it with that of previously reported LNM prediction algorithms using preoperative clinical factors. Both the RF and LR models using clinical factors had slightly lower sensitivity and higher specificity than previously reported models, and their positive likelihood ratios were higher than those previously reported (Table 3).

Table 3 Comparison of the preoperative prediction models for lymph node metastasis in patients with endometrial cancer.

Association of the preoperative clinical factors with the risk for LNM between LGEC and HGEC determined by biopsy specimens

In patients with LGEC in the NCCH cohort, multivariate analysis revealed that high serum CA125 levels and enlarged LNs were significantly associated with a risk for LNM (odds ratio [OR] = 7.72, P < 0.01, and OR = 9.26, P < 0.01, respectively; Table 4A). Multivariate analysis of the SUH cohort showed that high serum CA125 levels were significantly associated with LNM (OR = 13.2, P < 0.01; Table 4B). In the combined analysis of 254 patients from the SUH and NCCH cohorts, high serum CA125 levels and enlarged LNs were associated with the risk of LNM in the LGEC group (OR = 10.1, P < 0.01, and OR = 6.20, P < 0.01, respectively; Table 4C). Conversely, in patients with HGEC in the NCCH cohort, multivariate analysis revealed that deep MI and high serum CA125 levels were significantly associated with the risk of LNM (OR = 9.52, P < 0.01, and OR = 17.5, P = 0.016, respectively). However, none of the factors were statistically associated with the risk of LNM in the HGEC group of the SUH cohort (Table 4A and B). In the combined analysis of 254 patients from the SUH and NCCH cohorts, deep MI on MRI and high serum CA125 levels were associated with the risk of LNM in the HGEC group (OR = 7.86, P < 0.01, and OR = 7.32, P = 0.019, respectively; Table 4C).

Table 4 Association between lymph node metastasis and preoperative clinical risk factors according to the biopsy histological grade.

Since the strength of factors contributing to LNM differs between LGEC and HGEC, we decided to separate LGEC and HGEC patients and create LNM prediction models for each. In the LGEC group, the three methods showed a relatively high predictive ability, with AUCs of approximately 0.75 ((A) LR: AUC 0.75, (B) SVM: AUC 0.79, (C) RF: AUC 0.76; Fig. 2). The HGEC group showed a higher predictive performance with LR and RF, but not with SVM: (A) LR: AUC 0.84, (B) SVM: AUC 0.77, and (C) RF: AUC 0.86.

Figure 2

Predictive performance of lymph node metastasis by biopsy histological type (low-grade endometrial cancer/high-grade endometrial cancer; training set: National Cancer Center Hospital (NCCH) training set (n = 75); test sets: NCCH test set (n = 50) and Showa University Hospital (SUH) set (n = 129)). (A) Receiver operating characteristic (ROC) curves using logistic regression. (B) ROC curves using support vector machine. (C) ROC curves using random forest. Solid and dashed lines show the HGEC and the LGEC test sets, respectively. AUC The area under the ROC curve, HGEC High-grade endometrial cancer, LGEC Low-grade endometrial cancer.

Correlation between the predictive classification of LNM and the clinical outcomes

We examined the association between the LNM status predicted by the LR method and the clinical outcomes of the 125 patients in the NCCH cohort and the 129 patients in the SUH cohort. Patients with positive LNM prediction had worse RFS and OS than patients with negative LNM prediction (Supplementary Fig. S1). Even after adjusting for the presence of adjuvant therapy (chemotherapy or radiation therapy), RFS and OS in the positive LNM prediction group were significantly worse than those in the negative LNM prediction group (Supplementary Table S4). In the group without pathological LNM, the RFS of patients with positive LNM prediction was worse than that of patients with negative LNM prediction (Fig. 3). On the other hand, in the group with pathological LNM, there was no significant difference in RFS between the positive and negative LNM prediction groups (Supplementary Fig. S2).
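The survival comparisons above rest on Kaplan–Meier estimates. As a reminder of how such a curve is built, the following minimal sketch implements the product-limit estimator. The function name and the follow-up times are hypothetical and are not study data; the study itself used R and JMP.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one event.

    events: 1 = relapse or death observed, 0 = censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up in months, for illustration only.
months = [6, 12, 12, 20, 30, 36]
event = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(months, event)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate drops only at the three observed event times in this example.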

Figure 3

Kaplan–Meier survival curves according to the node-predicted status in patients without pathological lymph node metastasis. Top row, NCCH cohort 125 patients; bottom row, SUH cohort 129 patients. (A) RFS of positive lymph node metastasis prediction (red line) and negative lymph node metastasis prediction (blue line). (B) OS of positive lymph node metastasis prediction (red line) and negative lymph node metastasis prediction (blue line). LNM lymph node metastasis, NCCH National Cancer Center Hospital, OS Overall survival, RFS Relapse-free survival, SUH Showa University Hospital.

Discussion

In this study, the LNM prediction model using preoperative clinicopathological predictors showed a high predictive power, similar to that reported previously19,22,23,24,25,26. We also showed that the risk factors associated with LNM differ between patients with LGEC and HGEC and that the strength of the association also differed. The prediction performance in the HGEC group was higher than that in the LGEC group. In the group without pathological LNM, patients with positive LNM prediction using our model had worse clinical outcomes than patients with negative LNM prediction, even though 10 or more LNs were removed and pathology was negative for LNM. These results suggest that this LNM prediction model can identify patients at high risk of recurrence regardless of pathological LNM status, and these patients may require postoperative therapy even in the absence of pathologic LNM.

The predictive model for LNM constructed using clinical factors showed a high AUC of over 0.8, similar to that reported previously19,22,23,24,25,26. However, compared to previous reports, the present study resulted in a predictive model with lower sensitivity and higher specificity. The positive likelihood ratio, calculated as sensitivity/(1 − specificity), was higher than or similar to that reported previously. Previous reports have predicted LNM primarily in populations that might have had low LNM risk, which may account for the difference in predictive ability between previous models and the current one. Using the present prediction model, the prediction performance for LNM was better in the HGEC group than in the LGEC group as determined by endometrial biopsy. The group with positive LNM prediction had a poorer prognosis than the group with negative LNM prediction. Due to its high specificity, this model could accurately identify patients with a poor prognosis who may require lymphadenectomy.
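To make the likelihood-ratio formula concrete, the sketch below applies it to the sensitivity and specificity values quoted in the Introduction for the two earlier scoring systems. The function and variable names are hypothetical; the figures for the present model are in Table 3 and are not reproduced here.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


# Sensitivity and specificity quoted in the Introduction for two earlier models.
todo_lr = positive_likelihood_ratio(0.92, 0.53)
kgog_lr = positive_likelihood_ratio(0.85, 0.56)
print(round(todo_lr, 2), round(kgog_lr, 2))  # 1.96 1.93
```

Both earlier models have a positive likelihood ratio close to 2, so a model that trades some sensitivity for higher specificity can reach a higher ratio because the denominator shrinks.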

We also showed that LNM prediction using clinical factors had a higher diagnostic performance in the HGEC group than in the LGEC group; deep MI on MRI correlated with LNM in the HGEC group, whereas enlarged LNs on MRI correlated with LNM in the LGEC group. In a meta-analysis of the diagnostic precision of clinical biomarkers for the preoperative prediction of LNM in EC, both enlarged LNs detected by MRI and high serum CA125 levels were reported to be more diagnostic of LNM in the HGEC group than in the LGEC group20, which is consistent with the higher performance we observed in the HGEC group.

Clinical factors that are considered to be risk factors for LNM in EC have been reported as poor prognostic factors, and the prediction model for LNM could also possibly identify a population with a poor prognosis. In this study, we revealed, for the first time, that the group with positive LNM prediction, which was based on deep MI, large TD, enlarged LNs, and high serum CA125 levels, had a worse prognosis, even among patients without postoperative pathological LNM. Many guidelines, including the National Comprehensive Cancer Network, the European Society for Medical Oncology, and the Japanese Society of Gynecologic Oncology guidelines, indicate that postoperative pathological stage, histology, and lymphovascular space invasion are parameters for risk assessment in patients with EC10,32,33. However, it would be clinically useful to predict prognosis with factors that can be evaluated preoperatively. In the future, the risk of LNM could be calculated based on preoperative pathology information, which could have clinical applications.

Despite its findings, our study had several limitations. This was a two-center, retrospective study with a limited number of patients. The general treatment guidelines for EC patients differed substantially between the two hospitals. This study design might not have included a group at low risk of metastasis that did not have their LNs removed. We need to further validate our prediction model with additional independent sample sets because there could be a significant association between LNM risk and HGECs due to differences in histological type distributions by race and in surgical methods or treatments administered to high- and low-risk groups for LNM. The previously reported LNM predictive models compared in this study are mostly based on Asian populations and have similar predictive performances. The model used in this study may be useful in identifying patients with a poor prognosis, particularly among Asian EC patients.

Conclusion

We demonstrated that routinely assessed preoperative factors can predict LNM and poor prognosis with high performance, independent of the machine learning algorithm used to construct the model. The predictive performance for LNM in the HGEC group was as high as AUC 0.84 (as against AUC 0.75 in the LGEC group). The clinical factors associated with LNM differed between the groups: deep MI and high serum CA125 in the HGEC group versus enlarged LNs and high serum CA125 in the LGEC group. The predictive model constructed in this study can also identify patients with a poor prognosis who have aggressive characteristics based on preoperative pathological factors alone, which may support appropriate treatment selection and surveillance.