A deep-learning algorithm enables the real-time video-based recognition of polyps during colonoscopy, with sensitivities and specificities surpassing 90%.
Colorectal cancer (CRC) is a major cause of cancer-related death. To decrease the incidence of CRC, it is recommended that neoplastic lesions (such as adenomas) be resected during colonoscopy1,2. A large cohort study conducted in the United States with 88,902 participants showed a roughly 70% reduction in death from CRC after colonoscopy screening3. However, the quality of the colonoscopy matters: adenoma detection rates (ADRs) during colonoscopy are inversely associated with the incidence of interval CRC (defined as cancer diagnosed between screening and post-screening surveillance examinations) and with CRC-related mortality4,5. Yet many adenomas are missed during routine colonoscopy6, and ADR varies significantly between endoscopists4, which is a major barrier to the standardization of high-quality colonoscopy. Computer-aided detection (CADe) systems powered by artificial intelligence could address this problem7. By indicating the presence and location of polyps in real time during colonoscopy, CADe can function as a second observer, drawing the endoscopist’s attention to polyps that are displayed on the monitor but might otherwise be overlooked.
A major strength of CADe systems is that they are not susceptible to inherent inter- and intra-observer variability, and can therefore improve ADRs regardless of the endoscopist’s skill. Several studies have shown that CADe systems provide roughly 70–90% sensitivity and 60–90% specificity for the detection of polyps8,9,10,11. However, because these studies each included fewer than 100 patients, the generalizability of CADe systems remains uncertain. Now, Xiaogang Liu and colleagues report in Nature Biomedical Engineering the development and validation of a CADe model that employs a deep-learning algorithm (specifically, a convolutional neural network in which each layer connects only to the next) for the real-time detection of polyps12 (Fig. 1). The researchers trained the CADe machine-learning model on 5,545 endoscopic images (3,634 with polyps and 1,911 without) from 1,290 patients, and validated it by using prospectively collected endoscopic images and videos.
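The feedforward convolutional idea mentioned above can be illustrated with a toy forward pass, in which each layer feeds only the subsequent one. This is a minimal NumPy sketch, not the authors’ published architecture; the frame and kernel values are random placeholders.

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
frame = rng.random((16, 16))                       # stand-in for one endoscopic frame
k1 = rng.standard_normal((3, 3))                   # first conv kernel
k2 = rng.standard_normal((3, 3))                   # second conv kernel
feat = relu(conv2d(relu(conv2d(frame, k1)), k2))   # two stacked conv layers
score = 1.0 / (1.0 + np.exp(-feat.mean()))         # sigmoid "polyp" score in (0, 1)
print(feat.shape, round(score, 3))
```

In a real detection system, the network would of course output a localization map rather than a single score, and its weights would be learned from the annotated training images.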
Before the validation study was initiated, a panel of nine experienced endoscopists assessed the test datasets from the Sichuan Provincial People’s Hospital (where the images for training the algorithm had been obtained) and annotated the presence and location of any polyps. Liu and co-authors used four validation datasets: 27,113 static images from 1,138 patients acquired at the Sichuan hospital, for which the algorithm achieved a per-image sensitivity of 94% and a per-image specificity of 96%; a public database of 612 polyp-containing images from 29 colonoscopies, for which the algorithm reached a per-image sensitivity of 88%; video recordings of 138 polyps from 111 patients acquired at the same hospital, for which the algorithm achieved a per-image sensitivity of 92% and a per-polyp sensitivity of 100%, with 89% accuracy in tracking polyp appearance; and video recordings, also acquired at the same hospital, consisting of 1,072,483 image frames from 54 colonoscopies containing no polyps, for which the algorithm achieved a per-frame specificity of 95%. The average processing time per image frame (that is, the latency of the system in video-image interpretation) was 77 ms on a commercial flagship graphics processor (the NVIDIA Titan X Pascal). The CADe system therefore allows the real-time recognition of polyps during colonoscopy at acceptable speed and, in view of the validation results, with acceptable accuracy.
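The per-image metrics reported above follow the standard definitions: sensitivity is the fraction of polyp-containing images correctly flagged, and specificity is the fraction of polyp-free images correctly left untagged. A short sketch with hypothetical labels (not the study’s data) makes the computation explicit:

```python
def sensitivity_specificity(y_true, y_pred):
    """Per-image sensitivity and specificity.
    y_true / y_pred: 1 = polyp present (or flagged), 0 = absent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels for ten frames: one missed polyp, one false alarm.
truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
pred  = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, pred)
print(sens, spec)  # 0.8 0.8
```

Per-polyp sensitivity, by contrast, counts a polyp as detected if it is flagged in at least one of the frames in which it appears, which is why it can reach 100% even when the per-image sensitivity is lower.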
The validation effort carried out by Liu and co-authors surpasses what has been shown so far. In principle, a CADe model developed using only static images as learning material tends to work well for static images but is less suited to video recordings, which usually contain many more low-quality frames. The over-90% sensitivity and specificity for video-based analysis therefore suggest that the authors’ algorithm is robust even for low-quality image frames. This might be due to the algorithm’s focus on local features, owing to the neural network’s small receptive field (the spatial extent of the connectivity of a given neuron). Furthermore, the authors’ algorithm helps to detect polyps that appear only partially on the screen (Fig. 1). An additional advantage of the authors’ system is its low rate of false positives: only 5% of the image frames in the polyp-free colonoscopy videos were tagged, which compares favourably with previously reported specificities of 60–75% for other CADe colonoscopy systems9,10.
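The receptive field invoked above grows in a predictable way as convolutional and pooling layers are stacked: each layer with kernel size k widens it by (k − 1) times the cumulative stride. The following sketch computes this for a hypothetical layer stack (the configuration is illustrative, not the published network’s):

```python
def receptive_field(layers):
    """Receptive-field size (in input pixels) of a stack of conv/pool layers.
    layers: list of (kernel_size, stride) tuples, first layer first."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k-1) * cumulative stride
        jump *= s             # strides compound multiplicatively
    return rf

# Hypothetical stack: three 3x3 convs (stride 1), a 2x2 pool (stride 2),
# then two more 3x3 convs.
stack = [(3, 1), (3, 1), (3, 1), (2, 2), (3, 1), (3, 1)]
print(receptive_field(stack))  # 16
```

A neuron in such a stack "sees" only a 16 × 16 patch of the input, which is consistent with the intuition that a small receptive field biases the network towards local texture cues rather than global context.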
However, Liu and co-authors’ CADe system sometimes missed small or distant polyps, because it extracts only a small portion of the pixels and therefore cannot fully capture the characteristics of such polyps. Also, the system was not evaluated under special bowel conditions, such as those resulting from inflammatory bowel disease, intestinal bleeding or inadequate bowel preparation. These limitations will need to be addressed before the technology can be implemented clinically. The system should now be tested in a prospective trial in which it is used in real time during actual colonoscopy rather than with pre-recorded colonoscopy videos (so far, no prospective studies or randomized controlled trials of automated polyp detection have been published in the peer-reviewed literature). This is important because prospective clinical evaluation provides data that cannot be appropriately obtained from a retrospective study, such as accuracy in the presence of missing data (which should be handled on the basis of best-case and worst-case scenarios), and the benefits of using CADe for experts and non-experts. Also, retrospective studies of CADe systems do not necessarily account for any additional time required when using the system, the endoscopist’s potential increase in stress owing to the need to check two video streams (one unmodified and another showing the tagged candidate polyps) and to respond to alarms (including false positives), or the system’s performance with low-quality image frames, which can account for more than 30% of clinical colonoscopy work13. Ultimately, CADe systems for colonoscopies should be evaluated in a randomized controlled trial with ADRs or interval CRC rates as a primary endpoint.
1. Zauber, A. G. et al. N. Engl. J. Med. 366, 687–696 (2012).
2. Winawer, S. J. et al. N. Engl. J. Med. 329, 1977–1981 (1993).
3. Nishihara, R. et al. N. Engl. J. Med. 369, 1095–1105 (2013).
4. Kaminski, M. F. et al. N. Engl. J. Med. 362, 1795–1803 (2010).
5. Corley, D. A. et al. N. Engl. J. Med. 370, 1298–1306 (2014).
6. van Rijn, J. C. et al. Am. J. Gastroenterol. 101, 343–350 (2006).
7. Mori, Y., Kudo, S. E., Berzin, T. M., Misawa, M. & Takeda, K. Endoscopy 49, 813–819 (2017).
8. Tajbakhsh, N., Gurudu, S. R. & Liang, J. IEEE Trans. Med. Imaging 35, 630–644 (2016).
9. Misawa, M. et al. Gastroenterology 154, 2027–2029 (2018).
10. Fernandez-Esparrach, G. et al. Endoscopy 48, 837–842 (2016).
11. Urban, G. et al. Gastroenterology https://doi.org/10.1053/j.gastro.2018.06.037 (2018).
12. Wang, P. et al. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-018-0301-3 (2018).
13. Mori, Y. et al. Ann. Intern. Med. https://doi.org/10.7326/M18-0249 (2018).
Y.M. and S.K. received speaking honoraria from Olympus Corporation.
Mori, Y., Kudo, Se. Detecting colorectal polyps via machine learning. Nat Biomed Eng 2, 713–714 (2018). https://doi.org/10.1038/s41551-018-0308-9