Abstract
The clinical application of breast ultrasound to the assessment of cancer risk has been hindered by inter-grader variability and high false-positive rates, and the application of deep learning to the classification of breast-ultrasound images has been hindered by models that do not follow Breast Imaging Reporting and Data System (BI-RADS) standards, that lack explainability features and that have not been tested prospectively. Here, we show that an explainable deep-learning system, trained on 10,815 multimodal breast-ultrasound images of 721 biopsy-confirmed lesions from 634 patients across two hospitals and prospectively tested on 912 additional images of 152 lesions from 141 patients, predicts BI-RADS scores for breast cancer as accurately as experienced radiologists, with areas under the receiver operating characteristic curve of 0.922 (95% confidence interval (CI) = 0.868–0.959) for bimodal images and 0.955 (95% CI = 0.909–0.982) for multimodal images. Multimodal multiview breast-ultrasound images, augmented with heatmaps of malignancy risk predicted via deep learning, may facilitate the adoption of ultrasound imaging in screening mammography workflows.
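The headline metric above, an area under the receiver operating characteristic curve with a percentile-bootstrap 95% confidence interval, can be sketched as follows. The data here are synthetic stand-ins for biopsy-confirmed labels and model risk scores, since the study's evaluation code is not public, and the bootstrap settings (2,000 resamples, percentile interval) are illustrative assumptions rather than the authors' stated procedure:

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the Mann-Whitney U statistic (rank-sum formulation)."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

rng = np.random.default_rng(0)

# Synthetic stand-ins: binary malignancy labels and continuous risk scores.
y_true = rng.integers(0, 2, size=500)
y_score = y_true * 0.6 + rng.normal(0.2, 0.25, size=500)

auc = auc_score(y_true, y_score)

# Percentile bootstrap over cases for a 95% confidence interval.
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if y_true[idx].min() == y_true[idx].max():  # need both classes to score
        continue
    boot.append(auc_score(y_true[idx], y_score[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC = {auc:.3f} (95% CI = {lo:.3f}-{hi:.3f})")
```

Resampling whole cases, rather than images, would more closely match a lesion-level evaluation; the per-sample bootstrap here is the simplest variant.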
Data availability
The authors declare that the main data supporting the results of this study are available within the paper and its Supplementary Information. The raw US datasets from The First Affiliated Hospital of Anhui Medical University and Xuancheng People’s Hospital of China cannot be made available for public release because of patient privacy. However, some data can be made available for academic purposes from the corresponding author on reasonable request, subject to permission from the institutional review boards of the hospitals.
Code availability
The deep-learning models were developed using standard libraries and scripts available in PyTorch. The pretrained weights for ResNet were obtained from the torchvision library. Custom code and the annotation tool for deploying the system are available for research purposes from the corresponding author on reasonable request.
Acknowledgements
We are grateful to H. Zheng, Y. Liu, X. Shuai, G. Zhang, J.-S. Zhang, W. Yao and J.-X. Zhang for participation in the reader studies. We acknowledge help from Y. Lu and R. Wodnicki in manuscript revision and editing.
Author information
Contributions
X.Q. conceived of, designed and supervised the project. J.P. and H. Zheng provided clinical expertise and guidance on the study design. X.Q., L.Y. and X.G. developed the deep-learning framework and software tools necessary for the experiments. X.Q., J.P., H. Zheng, X.X., Hao Zhang and C.H. created the datasets, interpreted the data and defined the clinical labels. X.X., C.H., Hanqi Zhang, W.Z. and Q.S. collected the raw US images and patients’ pathology results in the clinic. X.Q., L.Y., X.G., Hao Zhang and L.L. executed the research and performed the statistical analysis. K.K.S. advised on the US imaging techniques. X.Q., J.P. and H. Zheng wrote the manuscript. All authors contributed to reviewing and editing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Biomedical Engineering thanks Guy Cloutier and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information: Supplementary Figs. 1–8 and Tables 1–5.
Cite this article
Qian, X., Pei, J., Zheng, H. et al. Prospective assessment of breast cancer risk from multimodal multiview ultrasound images via clinically applicable deep learning. Nat Biomed Eng 5, 522–532 (2021). https://doi.org/10.1038/s41551-021-00711-2
This article is cited by
- A validation of an entropy-based artificial intelligence for ultrasound data in breast tumors. BMC Medical Informatics and Decision Making (2024)
- Applying deep learning to recognize the properties of vitreous opacity in ophthalmic ultrasound images. Eye (2024)
- Cardiologist-level interpretable knowledge-fused deep neural network for automatic arrhythmia diagnosis. Communications Medicine (2024)
- Diagnostic performance of deep learning in ultrasound diagnosis of breast cancer: a systematic review. npj Precision Oncology (2024)
- Artificial intelligence in liver imaging: methods and applications. Hepatology International (2024)