The global public health burden of diabetes has been growing at an alarming rate, driven by risk factors such as obesity, physical inactivity, ageing, urbanisation and population growth. The total number of patients with diabetes is estimated to reach 300 million by 2025 and 366 million by 2030 [1, 2]. India, China and the United States will remain the three nations with the highest numbers of people with diabetes worldwide. A proportion of the affected population is also at risk of diabetic retinopathy (DR). Recent studies revealed that over 18% of patients in China [3] and over 21% in India [4] are affected by DR. Effective screening and early intervention are essential for the cost-effective prevention of irreversible blindness from DR [5]. Because this major cause of preventable blindness is asymptomatic in its early stages, early diagnosis requires systematic and extensive screening. The shortage of qualified professionals to provide effective screening and treatment of vision-threatening DR poses further challenges in both India [6] and China [7]. Even in the United States, where specialised healthcare has been relatively more accessible, eye care was reported to be available to fewer than 60% of these patients [8].

Promising results from a few research groups worldwide have shown the potential role of deep learning algorithms within quality-assured, artificial intelligence (AI)-assisted systems for DR screening. In a prospective study published earlier this year, Gulshan and colleagues demonstrated that a deep learning algorithm performed as well as or better than experienced specialists or trained graders in the diagnosis of moderate or worse DR in a cohort of 3049 patients from two hospitals in India [8]. A deep learning algorithm using a convolutional neural network achieved high accuracy in identifying vision-threatening DR in over 35,000 images from population-based cohorts of Malays, Caucasian Australians and Indigenous Australians [9]. In another large-scale multiethnic cohort, deep learning diagnosed DR of various severities with performance comparable to that of retina specialists or professional ocular image graders [10]. In 2018, IDx-DR, developed by researchers from the University of Iowa, became the first FDA-approved AI-based automated DR screening system and achieved satisfactory results in screening for referable DR in primary care [11]. In China, an AI-assisted system completed DR screening of over 28,000 patients with diabetes [12] within 1 year at the National Metabolic Management Center, which covers 542 hospitals in over 30 provincial administrative regions of China [13].

An ideal AI programme for DR screening should demonstrate both high sensitivity and high specificity to assure an optimal balance between safety and efficiency. A tremendous amount of effort has been devoted to identifying and tackling the barriers to improving the performance of deep learning algorithms, including the development of methodologies for image standardisation and preprocessing [14], the selection of reference standards [6, 15], the employment of convolutional neural networks [14], and the adoption of suitable databases for training and validation. The recently published work by Gulshan and colleagues, using the improved Inception-v4 neural network architecture, is encouraging: it demonstrates equivalent or better performance in assessing moderate or severe DR compared with manual grading by retinal specialists and trained graders. The model generalisation shown in this study, which was conducted in two different hospitals, is important for the application of deep learning algorithms in clinical settings, where consistency of performance can be affected by factors such as the cameras used and whether fundus photography was carried out under dilation. As the authors have rightly pointed out, the high sensitivity and specificity achieved by the automated AI system adopted and improved in this study mean that it can potentially serve as part of a well-integrated system combining automated AI with human graders, increasing accessibility in low-resource settings and improving cost-effectiveness in high-resource settings.
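To make the two measures discussed above concrete, the following minimal sketch computes sensitivity and specificity from the four outcome counts of a screening test compared against a reference standard. The counts, the `ScreeningCounts` class and the function names are hypothetical and chosen purely for illustration; they are not taken from any of the studies cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Outcome counts of a screening test against a reference standard."""
    true_positives: int   # disease present, test positive
    false_negatives: int  # disease present, test negative (missed cases)
    true_negatives: int   # disease absent, test negative
    false_positives: int  # disease absent, test positive (unnecessary referrals)


def sensitivity(c: ScreeningCounts) -> float:
    """Proportion of patients with disease whom the test correctly flags."""
    diseased = c.true_positives + c.false_negatives
    if diseased == 0:
        raise ValueError("no patients with disease in the sample")
    return c.true_positives / diseased


def specificity(c: ScreeningCounts) -> float:
    """Proportion of patients without disease whom the test correctly clears."""
    healthy = c.true_negatives + c.false_positives
    if healthy == 0:
        raise ValueError("no patients without disease in the sample")
    return c.true_negatives / healthy


# Hypothetical example: 1000 screened patients, 100 of whom have referable DR.
counts = ScreeningCounts(
    true_positives=90,
    false_negatives=10,
    true_negatives=850,
    false_positives=50,
)

print(f"sensitivity = {sensitivity(counts):.3f}")
print(f"specificity = {specificity(counts):.3f}")
```

In this illustration, sensitivity relates to safety (how few patients with disease are missed), whereas specificity relates to efficiency (how few patients without disease are referred unnecessarily).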

Despite the promising results achieved in the development and assessment of automated systems for diagnosing DR severity, especially at the moderate or severe level, challenges remain in adopting automated systems as the ‘gate-keeper’ in extensive screening with wide coverage. For example, the capability of deep learning algorithms in early DR detection, which involves accurately spotting minor changes such as fine microaneurysms and small haemorrhages, still needs to improve. Furthermore, the validity and accuracy of deep learning algorithms in screening for other major causes of blindness, such as glaucoma and age-related macular degeneration, are also important for the extensive application of AI systems in population-based screening programmes. There is also a lack of randomised controlled clinical trials comparing human graders with AI systems that incorporate deep learning. However, given the urgent need for DR screening in large populations in countries like China, we have to balance urgency with accuracy and aim for prompt implementation of AI-assisted DR screening in China whilst further validation is being performed.