Predicting systemic diseases in fundus images: systematic review of setting, reporting, bias, and models’ clinical availability in deep learning studies

Li, Yitong; Zhang, Ruiheng; Dong, Li; Shi, Xuhan; Zhou, Wenda; Wu, Haotian; Li, Heyan; Yu, Chuyao; Wei, Wenbin

doi:10.1038/s41433-023-02914-0

Review Article
Published: 18 January 2024

Eye (2024)

Abstract

Background

Analyzing fundus images with deep learning techniques is promising for screening systemic diseases. However, the quality of the rapidly growing number of studies is variable and has not been systematically evaluated.

Objective

To systematically review all articles that aimed to predict systemic parameters and conditions from fundus images using deep learning, assess their performance, and provide suggestions to enable translation into clinical practice.

Methods

Two major electronic databases (MEDLINE and EMBASE) were searched up to August 22, 2023, with the keywords ‘deep learning’ and ‘fundus’. Studies using deep learning and fundus images to predict systemic parameters were included and assessed in four aspects: study characteristics, transparent reporting, risk of bias, and clinical availability. Transparent reporting was assessed with the TRIPOD statement, and risk of bias was assessed with PROBAST.

Results

The systematic search identified 4969 articles, of which thirty-one were included in the review. A variety of vascular and non-vascular systemic conditions and parameters were predicted from fundus images, including diabetes and related diseases (19%), sex (22%), and age (19%). Most studies focused on developed countries. According to TRIPOD, reporting was insufficient on how sample size was determined and how missing data were handled. Full access to datasets and code was also under-reported. One study (1/31, 3.2%) was classified as having a low overall risk of bias, whereas 30 (30/31, 96.8%) were classified as having a high risk of bias according to PROBAST. Five studies (5/31, 16.1%) used prospective external validation cohorts. Only two (2/31, 6.5%) reported model calibration. The number of publications per year increased significantly from 2018 to 2023. However, only two models (6.5%) had been deployed on a device, and no model had been applied in clinical practice.
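The proportions in the Results are all fractions of the 31 included studies. As a minimal sketch of that arithmetic (the dictionary labels and the `percent` helper are illustrative names, not taken from the article, and half-up rounding to one decimal place is an assumption about how the figures were rounded):

```python
from decimal import Decimal, ROUND_HALF_UP

TOTAL_STUDIES = 31  # number of included studies, as stated in the Results

# Counts as stated in the Results; the keys are illustrative labels only.
counts = {
    "low_risk_of_bias": 1,
    "high_risk_of_bias": 30,
    "prospective_external_validation": 5,
    "reported_calibration": 2,
    "applied_to_device": 2,
}


def percent(count, total=TOTAL_STUDIES):
    """Return count/total as a percentage, rounded half-up to one decimal."""
    value = Decimal(count) * 100 / Decimal(total)
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


percentages = {label: percent(n) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {counts[label]}/{TOTAL_STUDIES} = {pct}%")
```

Under that rounding assumption, 2/31 is about 6.45%, which rounds to 6.5% at one decimal place.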

Conclusion

Deep learning analysis of fundus images has shown great potential for predicting systemic conditions in clinical settings. Further work is needed to improve methodology and clinical application.

**Fig. 1: PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow chart of study records.**

**Fig. 2: Risk of bias assessed using Prediction model Risk Of Bias Assessment Tool (PROBAST) reporting standards for all included studies.**

**Fig. 3: Temporal distribution illustrating the number and models’ clinical availability of studies included in this review.**

**Fig. 4: Distribution illustrating the number and clinical availability of studies in 7 different prediction outcomes.**

Data availability

Raw data are available on request from the corresponding author.

Funding

This work was supported by the National Natural Science Foundation of China (82220108017, 82141128), the Capital Health Research and Development of Special (2020–1–2052), and the Science & Technology Project of Beijing Municipal Science & Technology Commission (Z201100005520045, Z181100001818003).

Author information

These authors contributed equally: Yitong Li, Ruiheng Zhang, Li Dong.

Authors and Affiliations

Beijing Tongren Eye Center, Beijing Key Laboratory of Intraocular Tumor Diagnosis and Treatment, Beijing Ophthalmology & Visual Sciences Key Lab, Medical Artificial Intelligence Research and Verification Key Laboratory of the Ministry of Industry and Information Technology, Beijing Tongren Hospital, Capital Medical University, Beijing, China
Yitong Li, Ruiheng Zhang, Li Dong, Xuhan Shi, Wenda Zhou, Haotian Wu, Heyan Li, Chuyao Yu & Wenbin Wei

Contributions

YTL and DL conceived the study and designed the study protocol. YTL, RHZ and WBW executed the search and extracted data. YTL performed the initial analysis of data, with all authors contributing to interpretation of data. All authors contributed to critical revision of the manuscript for important intellectual content and approved the final version. WBW is the study guarantor. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Corresponding author

Correspondence to Wenbin Wei.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Zhang, R., Dong, L. et al. Predicting systemic diseases in fundus images: systematic review of setting, reporting, bias, and models’ clinical availability in deep learning studies. Eye (2024). https://doi.org/10.1038/s41433-023-02914-0

Received: 28 February 2023
Revised: 10 November 2023
Accepted: 20 December 2023
Published: 18 January 2024
DOI: https://doi.org/10.1038/s41433-023-02914-0
