Machine learning predicts lymph node metastasis of poorly differentiated-type intramucosal gastric cancer

Zhou, Cheng-Mao; Wang, Ying; Ye, Hao-Tian; Yan, Shuping; Ji, Muhuo; Liu, Panmiao; Yang, Jian-Jun

doi:10.1038/s41598-020-80582-w

Article
Open access
Published: 14 January 2021

Cheng-Mao Zhou¹, Ying Wang¹, Hao-Tian Ye¹, Shuping Yan², Muhuo Ji¹, Panmiao Liu¹ & Jian-Jun Yang¹

Scientific Reports volume 11, Article number: 1300 (2021)

Abstract

This study aimed to construct a machine learning model of lymph node metastasis (LNM) in patients with poorly differentiated-type intramucosal gastric cancer. A total of 1169 patients who had undergone surgery for gastric cancer were divided into a training group and a test group at a ratio of 7:3. The LNM model was built with machine learning in Python. The GBDT algorithm identified number of resected nodes, lymphovascular invasion and tumor size as the three factors carrying the most weight for LNM. In the training group, GBDT had the highest accuracy of the 7 algorithm models (0.955); the AUC values of the 7 algorithms, from high to low, were XGB (0.881), RF (0.802), GBDT (0.798), LR (0.778), XGB + LR (0.739), RF + LR (0.691) and GBDT + LR (0.626). In the test group, XGB had the highest accuracy of the 7 algorithm models (0.952); the AUC values, from high to low, were GBDT (0.788), RF (0.765), XGB (0.762), LR (0.750), RF + LR (0.678), GBDT + LR (0.650) and XGB + LR (0.619). A single machine learning algorithm can predict LNM in poorly differentiated-type intramucosal gastric cancer, but the fusion algorithms did not improve the performance of machine learning in predicting LNM.

Introduction

Gastric cancer is the world's fourth most common neoplastic disease and the second leading cause of tumor-related death¹. With the development of endoscopic techniques, improved diagnostics and the global spread of gastric cancer screening, the detection rate of early gastric cancer (EGC) increases every year, especially in Japan and Korea²,³. EGC can be treated with endoscopic resection, D1 or D2 radical surgical resection, or other adjuvant treatments, depending on tumor stage⁴. The indications and effects of these treatments vary. The definition of EGC considers only the depth of tumor infiltration; it does not consider lymph node metastasis, an important factor in choosing an EGC treatment regimen. It is therefore necessary to stage EGC patients accurately before surgery in order to select a reasonable treatment option. Studies have shown that the presence of lymph node metastasis (LNM), the number of metastatic lymph nodes, and the region of metastasis all have important effects on EGC treatment and prognosis⁵. For over 80% of patients with EGC, D1 or D2 radical surgery involves unnecessary lymph node dissection, which increases surgical trauma and impairs patient recovery. In recent years, endoscopic submucosal dissection and endoscopic mucosal resection have advanced EGC treatment, offering less trauma and quicker postoperative recovery, so that patients can avoid the heavy trauma and long recovery time of laparotomy or laparoscopic surgery. For these approaches, however, it is important to judge lymph node metastasis accurately before surgery⁶.

In recent years, many studies have reported on machine learning in medicine. For example, machine learning algorithms developed and validated on large preoperative datasets can predict length of hospital stay and patient-specific hospital costs after primary total hip arthroplasty⁷. Machine learning can also predict hospital-acquired pneumonia in patients with schizophrenia⁸ and 5-year survival in patients with chondrosarcoma⁹.

However, few studies have investigated the prediction of LNM in poorly differentiated early gastric cancer¹⁰,¹¹,¹². This study assesses clinicopathological factors for predicting LNM in intramucosal PDC. It also develops and validates a machine learning risk model for predicting LNM, to provide a basis for the treatment of poorly differentiated-type intramucosal gastric cancer.

Methods

Study population

No human participants were directly involved in this study, which is a secondary analysis of data from a public database. Data are available from the BioStudies (public) database (https://www.ebi.ac.uk/biostudies/studies?query=S-EPMC4881979), accession number S-EPMC4881979. We retrospectively analyzed data from patients diagnosed with PDC who had undergone radical gastrectomy and lymph node dissection. Patients included in the study were confirmed as having pure poorly differentiated-type T1 gastric cancer (tumor invasion confined to the mucosa or submucosa). The tumors were classified histologically according to the World Health Organization's Classification of Tumors¹³.

Analysis of clinical results

The following clinicopathological factors were included in the study: presence of lymphovascular invasion (LVI), gender, tumor depth, age, presence of ulcer, tumor size, tumor location, gross appearance, number of resected nodes and presence of LNM. Tumors were staged according to the seventh edition of the American Joint Committee on Cancer (AJCC) staging system¹⁴.

Machine learning

Logistic regression (LR) is a widely used classification algorithm that predicts the probability of an outcome. Despite the word "regression" in its name, it is a classification method; more precisely, it is a binary (dichotomous) classifier.
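As a minimal sketch of this idea, logistic regression passes a weighted sum of the predictors through the sigmoid function to obtain a probability, then thresholds that probability to obtain a binary class. The coefficients, the two feature values and the 0.5 threshold below are hypothetical illustrations, not values fitted in this study.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class (for example, LNM present)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_class(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical coefficients for two illustrative predictors (assumptions,
# not estimates from the paper).
WEIGHTS = [1.2, 0.4]
BIAS = -3.0

p = predict_proba([1, 2.5], WEIGHTS, BIAS)
label = predict_class([1, 2.5], WEIGHTS, BIAS)
print(f"probability={p:.3f}, class={label}")
```

The output is a probability first and a class second, which is why the same model can be scored both by AUC (on probabilities) and by accuracy (on thresholded labels).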

Random forest (RF) is a supervised learning algorithm trained with the "bagging" method. Bagging fits multiple models, here decision trees, on bootstrap samples of the training data and combines their predictions, which can be more effective than any single model.
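The bagging idea can be sketched in a few lines. This is a simplified illustration under stated assumptions: it bags one-feature threshold rules ("stumps") on a made-up toy dataset, whereas a real random forest grows full decision trees and also samples features at each split. The helper names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-feature rule 'predict 1 when x >= threshold'.
    rows is a list of (x, label) pairs; the threshold with the fewest
    training errors wins."""
    best_t, best_err = None, None
    for t in sorted({x for x, _ in rows}):
        err = sum((1 if x >= t else 0) != y for x, y in rows)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(rows, n_models, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Combine the stumps by majority vote."""
    votes = Counter(1 if x >= t else 0 for t in models)
    return votes.most_common(1)[0][0]

# Toy data (an assumption for illustration): larger x means label 1.
rng = random.Random(0)
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (5.0, 1)]
stumps = bagging_fit(data, n_models=25, rng=rng)
print(bagging_predict(stumps, 0.5), bagging_predict(stumps, 5.0))
```

An odd number of models avoids tied votes in the binary case.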

XGB (XGBoost) builds multiple regression trees from the features in sequence. Each tree learns the residuals left by the trees before it, and the sum of the trees' outputs is the predicted value for a sample.

GBDT (gradient boosting decision tree) is an ensemble learning method in which each new tree is fitted to the gradient of the loss left by the preceding trees. The combination of multiple trees then yields a learner with strong generalizability.
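The residual-fitting loop shared by XGB and GBDT can be sketched as follows. This is a toy illustration under stated assumptions, not the configuration used in the study: it uses squared loss (whose negative gradient is simply the residual), one-split regression stumps, a single feature and made-up data. Production libraries add regularization, deeper trees and, for classification, a log-loss objective.

```python
def fit_regression_stump(xs, targets):
    """Best single split on one feature.
    Returns (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for t in sorted(set(xs))[1:]:  # skip the smallest value so both sides are non-empty
        left = [r for x, r in zip(xs, targets) if x < t]
        right = [r for x, r in zip(xs, targets) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x < t else rm

def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Each stump is fitted to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared loss
        stump = fit_regression_stump(xs, residuals)
        trees.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, trees

def boosted_predict(model, x):
    """Prediction = base value + shrunken sum of all stump outputs."""
    base, learning_rate, trees = model
    return base + learning_rate * sum(stump_predict(s, x) for s in trees)

# Toy data (an assumption for illustration).
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
print(round(boosted_predict(model, 2), 4), round(boosted_predict(model, 5), 4))
```

The learning rate shrinks each tree's contribution, so many small corrections accumulate instead of one tree dominating.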

Statistical analysis

Statistical analysis was conducted in R, version 3.4.3 (https://cran.r-project.org/bin/windows/base/old/3.4.3/), and machine learning modeling was performed in Python, version 3.6.5 (https://www.python.org/downloads/release/python-365/). Pearson correlation coefficients were calculated, and the following machine learning algorithms were used: XGB, RF, GBDT, LR, XGB + LR, RF + LR and GBDT + LR. Seventy percent of the data were assigned to the training group for model development, and the remaining 30% formed the test group for validation. Missing values of dichotomous variables were imputed with the mode, and multiple imputation was used for continuous variables. MSE refers to mean squared error.
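The data preparation steps described here can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the shuffling seed, the helper names and the sample values are hypothetical, the split is a plain shuffled cut rather than whatever procedure the authors used, and multiple imputation for continuous variables is not shown.

```python
import math
import random
from statistics import mode

def impute_mode(values):
    """Replace None in a dichotomous column with the most frequent observed value."""
    fill = mode([v for v in values if v is not None])
    return [fill if v is None else v for v in values]

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and cut it at train_fraction (seed is an assumption)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# 1169 placeholder patient indices, matching the cohort size in this study.
patients = list(range(1169))
train, test = train_test_split(patients)
print(len(train), len(test))

# Hypothetical dichotomous column with one missing entry.
print(impute_mode([1, None, 0, 1]))
```

With 1169 rows, this particular cut yields 818 training and 351 test rows; the exact group sizes used in the study may differ.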

Ethics approval and consent to participate

This was a secondary data analysis study using data from the BioStudies public database.

Results

A total of 1169 patients were enrolled, and lymph node metastases occurred in 61 (5.2%) of them. Age did not differ significantly between the lymph node metastasis and non-metastasis groups in either the training or the test group (P = 0.281 and P = 0.115, respectively) (see Table 1).

Table 1 Patient basic characteristic information.
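Because only 61 of 1169 patients had LNM, the classes are strongly imbalanced, and a short piece of arithmetic helps when reading accuracy figures for such a cohort. The sketch below computes, for the whole cohort, the accuracy of a trivial rule that predicts "no LNM" for every patient. The class balance inside the training and test groups separately is not restated here, so these whole-cohort numbers are only an approximate reference point.

```python
total_patients = 1169   # cohort size reported in the Results
lnm_cases = 61          # patients with lymph node metastasis

prevalence = lnm_cases / total_patients

# A rule that always predicts "no LNM" is right for every non-LNM patient.
baseline_accuracy = 1 - prevalence

print(f"prevalence: {prevalence:.3f}")
print(f"always-negative accuracy: {baseline_accuracy:.3f}")
```

On an imbalanced outcome like this one, accuracy alone says little about how well positives are identified, which is one reason AUC is reported alongside it.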

Correlation analysis showed that lymphovascular invasion, tumor invasion depth, and tumor size were positively correlated with LNM (Fig. 1). In addition, the GBDT algorithm identified number of resected nodes, lymphovascular invasion and tumor size as the three factors carrying the most weight for LNM (see Fig. 2).

Performance of the LNM model for PDC gastric cancer patients in the training group: among the 7 algorithm models, GBDT had the highest accuracy (0.955). The AUC values of the 7 algorithms, from high to low, were XGB (0.881), RF (0.802), GBDT (0.798), LR (0.778), XGB + LR (0.739), RF + LR (0.691) and GBDT + LR (0.626). GBDT had the lowest MSE (0.045) and LR the highest (0.054) (see Table 2 and Fig. 3).

Table 2 Forecast results for training and test group.

Performance of the LNM model for PDC gastric cancer patients in the test group: among the 7 algorithm models, XGB had the highest accuracy (0.952). The AUC values of the 7 algorithms, from high to low, were GBDT (0.788), RF (0.765), XGB (0.762), LR (0.750), RF + LR (0.678), GBDT + LR (0.650) and XGB + LR (0.619). XGB had the lowest MSE (0.048) (see Table 2 and Fig. 4).
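For readers who want to see how the three reported metrics relate, the following sketch computes accuracy, AUC and MSE from predicted probabilities. The labels and probabilities are made up for illustration, and the 0.5 threshold is an assumption. The reported accuracy and MSE values sum to 1 (0.955 + 0.045 for GBDT in training, 0.952 + 0.048 for XGB in testing), which is what happens when MSE is computed on thresholded 0/1 predictions; that reading is an inference from the numbers, not something the text states.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of patients whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error between labels and predictions."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and predicted probabilities (assumptions for illustration).
y_true = [0, 0, 0, 0, 1, 0, 1, 0]
y_prob = [0.05, 0.10, 0.20, 0.40, 0.35, 0.15, 0.80, 0.02]
y_hard = [1 if p >= 0.5 else 0 for p in y_prob]

print("accuracy:", accuracy(y_true, y_prob))
print("AUC:", round(auc(y_true, y_prob), 4))
print("MSE on probabilities:", round(mse(y_true, y_prob), 4))
print("MSE on hard labels:", mse(y_true, y_hard))
```

AUC depends only on how the model ranks patients, so it can disagree with accuracy, as it does between GBDT and XGB in the results above.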

Discussion

At present, research has focused on minimally invasive surgery that can maintain postoperative patient survival rates. The goal is to minimize surgical injury with safe and effective operating procedures, so that patients can enjoy a higher quality of life¹⁵,¹⁶. The incidence of lymph node metastasis has been reported to be between 2.2 and 4.2% for intramucosal (T1a) primary gastric adenocarcinoma, and between 9.4 and 16.1% for early (T1) primary gastric adenocarcinoma¹⁰,¹². Our findings suggest that 5.2% of patients with poorly differentiated-type intramucosal gastric cancer develop lymph node metastases, which is consistent with previous findings. Furthermore, the GBDT machine learning algorithm ranked number of resected nodes, lymphovascular invasion and tumor size as the three factors carrying the most weight for lymph node metastasis. At the same time, a single machine learning algorithm can predict LNM in poorly differentiated-type intramucosal gastric cancer, but the fusion algorithms did not improve the performance of machine learning in predicting LNM.

Many clinicopathological factors related to LNM in early gastric cancer have been studied¹⁷,¹⁸. A large-sample study in the United States showed that tumor stage, pathological type, and tumor size are independent predictors of LNM in early gastric cancer¹⁹. Chen et al. concluded that tumor diameter ≥ 3 cm, poorly differentiated histological type, mixed adenocarcinoma or signet ring cell carcinoma, tumor infiltration into the submucosa, and vascular invasion are independent risk factors for LNM²⁰. Our results corroborate this view.

The Japan Clinical Oncology Group noted that the LNM rate was low for differentiated intramucosal cancers > 2 cm in diameter without ulcers and for those ≤ 3 cm in diameter with ulcers. This could serve as an absolute indication for ESD²¹. Pokala et al. concluded that early intramucosal gastric cancer with tumor diameter < 4 cm has a low risk of LNM and can be locally resected²². This is consistent with the results of our study.

Submucosal cancers have a higher rate of LNM than intramucosal cancers. This may be because the submucosa of the gastric wall is rich in capillaries, which are susceptible to cancer cell invasion²³,²⁴. Studies have shown a high rate of LNM in undifferentiated early gastric cancer²⁵. As the tumor grows, the invasion deepens and the LNM rate increases. The LNM rate has been shown to be associated with lymphatic tumor thrombus²⁶. Female patients with early gastric cancer are more likely to develop lymph node metastases than males, presumably because of endogenous estrogen levels²⁷. Another study has shown that low differentiation, infiltration into the submucosa, large tumors, and venous or lymphatic invasion are independent risk factors for LNM²⁸. Our results also corroborate these findings.

At present, the main problem with machine learning methods in medical practice is the lack of application scenarios and related clinical data. Many published machine learning articles use only simple machine learning algorithms. In this study, we also used fusion algorithms; however, their results on the test set were not ideal. This suggests that when machine learning is applied in clinical medicine, attention should be paid to the application scenario and to the collection of relevant data.

This study has several limitations. First, it relied only on routine hematoxylin and eosin staining, which makes accurate diagnosis of lymph node micrometastases difficult; such micrometastases may be a key factor in gastric cancer recurrence. Second, this study included only data on tumor characteristics; no data on patients' tumor genes were collected, which may have contributed to the suboptimal predictive results. Third, different regions, ethnicities and treatment schemes may lead to different incidences of lymphatic metastasis, and the rate of lymph node metastasis in intramucosal gastric adenocarcinoma is low both in this study and in previous studies. However, we believe these factors do not affect the machine learning prediction results of this study. More multi-center, prospective research is needed in the future.

Conclusion

A single machine learning algorithm can predict LNM in poorly differentiated-type intramucosal gastric cancer, but the fusion algorithms did not improve the performance of machine learning in predicting LNM. This may provide guidance for personalized treatment of such patients.

Data availability

Data are available from the BioStudies (public) database (https://www.ebi.ac.uk/biostudies/studies?query=S-EPMC4881979), accession number S-EPMC4881979.

References

1. Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2018. CA Cancer J. Clin. 68(1), 7–30 (2018).
2. Pasechnikov, V. et al. Gastric cancer: prevention, screening and early diagnosis. World J. Gastroenterol. 20, 13842–13862 (2014).
3. Yu, H. Y. et al. Magnifying narrow-band imaging endoscopy is superior in diagnosis of early gastric cancer. World J. Gastroenterol. 21, 9156–9162 (2015).
4. Espinel, J. et al. Treatment modalities for early gastric cancer. World J. Gastrointest. Endosc. 7, 1062–1069 (2015).
5. Zhao, B. W. et al. Lymph node metastasis, a unique independent prognostic factor in early gastric cancer. PLoS ONE 10, e0129531 (2015).
6. Guo, T. J. et al. Feasible endoscopic therapy for early gastric cancer. World J. Gastroenterol. 21, 13325–13331 (2015).
7. Ramkumar, P. et al. Development and validation of a machine learning algorithm after primary total hip arthroplasty: applications to length of stay and payment models. J. Arthroplasty 34, 632–637 (2019).
8. Kuo, K. et al. Predicting hospital-acquired pneumonia among schizophrenic patients: a machine learning approach. BMC Med. Inf. Decis. Mak. 19, 42 (2019).
9. Thio, Q. et al. Can machine-learning techniques be used for 5-year survival prediction of patients with chondrosarcoma? Clin. Orthop. Relat. Res. 476, 2040–2048 (2018).
10. Lee, J. H. et al. Predictive factors for lymph node metastasis in patients with poorly differentiated early gastric cancer. Br. J. Surg. 99, 1688–1692 (2012).
11. Kim, H. et al. Early gastric cancer of signet ring cell carcinoma is more amenable to endoscopic treatment than is early gastric cancer of poorly differentiated tubular adenocarcinoma in select tumor conditions. Surg. Endosc. 25, 3087–3093 (2011).
12. Kunisaki, C. et al. Risk factors for lymph node metastasis in histologically poorly differentiated type early gastric cancer. Endoscopy 41, 498–503 (2009).
13. World Health Organization Classification of Tumors (Lyon: IARC Press, 2000).
14. Kleihues, P. & Sobin, L. H. World Health Organization Classification of Tumors. Cancer 88, 2887 (2000).
15. Lee, J. et al. Clinical practice guidelines for gastric cancer in Korea: an evidence-based approach. J. Gastric Cancer 14, 87–104 (2014).
16. Tanabe, S. et al. Gastric cancer treated by endoscopic submucosal dissection or endoscopic mucosal resection in Japan from 2004 through 2006: JGCA nationwide registry conducted in 2013. Gastric Cancer 20, 834–842 (2017).
17. Pyo, J. et al. Early gastric cancer with a mixed-type Lauren classification is more aggressive and exhibits greater lymph node metastasis. J. Gastroenterol. 52, 594–601 (2017).
18. Hatta, W. et al. A scoring system to stratify curability after endoscopic submucosal dissection for early gastric cancer: "eCura system". Am. J. Gastroenterol. 112, 874–881 (2017).
19. Pokala, S. et al. Lymph node metastasis in early gastric adenocarcinoma in the United States of America. Endoscopy 50, 479–486 (2018).
20. Chen, L. et al. Risk factors of lymph node metastasis in 1620 early gastric carcinoma radical resections in Jiangsu Province in China: a multicenter clinicopathological study. J. Dig. Dis. 18, 556–565 (2017).
21. Hasuike, N. et al. A non-randomized confirmatory trial of an expanded indication for endoscopic submucosal dissection for intestinal-type gastric cancer (cT1a): the Japan Clinical Oncology Group study (JCOG0607). Gastric Cancer 21, 114–123 (2019).
22. Pokala, S. et al. Lymph node metastasis in early gastric adenocarcinoma in the United States of America. Endoscopy 50, 479–486 (2018).
23. Catalano, F. et al. The modern treatment of early gastric cancer: our experience in an Italian cohort. Surg. Endosc. 23, 1581–1586 (2009).
24. Ye, B. et al. Predictive factors for lymph node metastasis and endoscopic treatment strategies for undifferentiated early gastric cancer. J. Gastroenterol. Hepatol. 23, 46–50 (2008).
25. Hirasawa, T. et al. Incidence of lymph node metastasis and the feasibility of endoscopic resection for undifferentiated-type early gastric cancer. Gastric Cancer 12(3), 148–152 (2009).
26. Kim, D. et al. Factors related to lymph node metastasis and surgical strategy used to treat early gastric carcinoma. World J. Gastroenterol. 10, 737–740 (2004).
27. Abe, N. et al. Risk factors predictive of lymph node metastasis in depressed early gastric cancer. Am. J. Surg. 183, 168–172 (2002).
28. Woo, J. et al. Application of minimally invasive treatment for early gastric cancer. J. Surg. Oncol. 85(4), 181–185 (2004).
29. Pyo, J. et al. A risk prediction model based on lymph-node metastasis in poorly differentiated–type intramucosal gastric cancer. PLoS ONE 11(5), e0156207 (2016).

Acknowledgements

We are grateful to the BioStudies (public) database for including and providing Professor Lee's original data²⁹.

Author information

Authors and Affiliations

Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Cheng-Mao Zhou, Ying Wang, Hao-Tian Ye, Muhuo Ji, Panmiao Liu & Jian-Jun Yang
Department of Pathology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Shuping Yan

Contributions

Y.W., H.-T.Y., P.M.L., C.-M.Z. and J.-J.Y. wrote the main manuscript text. S.P.Y. and M.H.J. prepared Figs. 1, 2, 3 and 4. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Cheng-Mao Zhou or Jian-Jun Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhou, CM., Wang, Y., Ye, HT. et al. Machine learning predicts lymph node metastasis of poorly differentiated-type intramucosal gastric cancer. Sci Rep 11, 1300 (2021). https://doi.org/10.1038/s41598-020-80582-w

Received: 17 February 2020
Accepted: 23 December 2020
Published: 14 January 2021
DOI: https://doi.org/10.1038/s41598-020-80582-w
