Dense anatomical annotation of slit-lamp images improves the performance of deep learning for the diagnosis of ophthalmic disorders


The development of artificial intelligence algorithms typically demands abundant high-quality data. In medicine, the datasets that are required to train the algorithms are often collected for a single task, such as image-level classification. Here, we report a workflow for the segmentation of anatomical structures and the annotation of pathological features in slit-lamp images, and the use of the workflow to improve the performance of a deep-learning algorithm for diagnosing ophthalmic disorders. We used the workflow to generate 1,772 general classification labels, 13,404 segmented anatomical structures and 8,329 pathological features from 1,772 slit-lamp images. The algorithm that was trained with the image-level classification labels and the anatomical and pathological labels showed better diagnostic performance than the algorithm that was trained with only the image-level classification labels, performed similarly to three ophthalmologists across four clinically relevant retrospective scenarios, and correctly diagnosed most of the consensus outcomes of 615 clinical reports in prospective datasets for the same four scenarios. The dense anatomical annotation of medical images may improve their use for automated classification and detection tasks.
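The core idea — supervising a network with both an image-level label and dense per-pixel annotations — is commonly implemented as a weighted multi-task objective. The following is a minimal illustrative sketch, not the authors' implementation: the function names, the simple cross-entropy formulation and the `alpha` weighting scheme are all assumptions made for clarity.

```python
import numpy as np

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class, given predicted probabilities.
    return -np.log(probs[label])

def multitask_loss(cls_probs, cls_label, pix_probs, pix_labels, alpha=0.5):
    """Hypothetical combined objective: a weighted sum of the image-level
    classification loss and the mean per-pixel (dense-annotation) loss."""
    cls_loss = cross_entropy(cls_probs, cls_label)
    seg_loss = np.mean([cross_entropy(p, l) for p, l in zip(pix_probs, pix_labels)])
    return alpha * cls_loss + (1 - alpha) * seg_loss
```

Under this kind of objective, the dense anatomical and pathological labels contribute gradient signal at every annotated pixel, rather than one signal per image, which is one plausible mechanism for the performance gain reported above.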


Fig. 1: Flowchart of the overall study.
Fig. 2: Examples of the labelled slit-lamp images in scenario 2.
Fig. 3: Schematic of the classification labels of pathological features in scenario 3.
Fig. 4: Flow diagram of the system.
Fig. 5: Comparison between the performances of the DSV and the diagnosis of ophthalmologists, in a retrospective external validation (scenarios 1–4).
Fig. 6: Prospective test of the DSV (scenarios 1–4).

Data availability

The main data supporting the results in this study are available within the paper and its Supplementary Information. The slit-lamp images and patient data used in this study cannot be shared publicly, but may be made available from the corresponding authors, under certain restrictions, for research purposes.

Code availability

The source code for the DSV is available on GitHub. The web server for the DSV is also available online.





We thank the following clinicians and clinical technicians for participating in data annotation: K. Chen, X. Zhao, D. Wu, T. Yu, X. Li, Y. An, Q. Wu, R. Li, X. Huang, Y. Li, C. Huang, J. Feng, J. Ye, X. Zhang, B. Lin, H. Ma, Y. Chen, X. Cui, H. Bai, T. Feng, X. Liu, J. Lu, Y. Zhou, H. Zhong, Q. Wang, Z. Wang and C. Huang. This study was funded by the National Key R&D Program of China (no. 2018YFC0116500), the Guangdong Science and Technology Innovation Leading Talents (no. 2017TX04R031), the Science and Technology Planning Projects of Guangdong Province (no. 2019B030316012), the Science Foundation of China for Distinguished Young Scholars (no. 81822010), the National Natural Science Foundation of China (nos. 81770967, 81873675 and 81800810), the Science and Technology Planning Projects of Guangdong Province (no. 2018B010109008) and the Fundamental Research Funds of the Innovation and Development Project for Outstanding Graduate Students in Sun Yat-sen University (no. 19ykyjs36). These sponsors and funding organizations had no role in the design or performance of this study.

Author information




H.L., W.L., Y.Y. and X.L. contributed to the conceptualization; W.L., Y.Y., E.L., Y.Z., C.C. and K.Z. contributed to the methodology; W.L., Z.L., X.W. and J.L. contributed to the data curation; Y.Y., K.Z., L.H. and L.Z. contributed to the formal analysis; H.L., W.L. and Y.Y. contributed to the preparation of the original draft; and E.L., K.Z., Z.L., X.W., Y.Z., Y.L., X.L., C.C. and D.Y. contributed to the review and editing. All of the authors discussed the results, commented on the manuscript and approved the final manuscript for publication.

Corresponding authors

Correspondence to Yizhi Liu or Xiyang Liu or Haotian Lin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary methods, figures and tables.

Reporting Summary

Supplementary Dataset 1

Output of the DSV, answers of the ophthalmologists and reference standards for the retrospective external-validation dataset.

Supplementary Dataset 2

Output of the DSV and reference standards for the prospective dataset.

Supplementary Video 1

Demonstration of the Visionome website.


About this article


Cite this article

Li, W., Yang, Y., Zhang, K. et al. Dense anatomical annotation of slit-lamp images improves the performance of deep learning for the diagnosis of ophthalmic disorders. Nat Biomed Eng 4, 767–777 (2020).
