Accuracy of deep learning, a machine-learning technology, using ultra–wide-field fundus ophthalmoscopy for detecting rhegmatogenous retinal detachment

Rhegmatogenous retinal detachment (RRD) is a serious condition that can lead to blindness; however, it is highly treatable with timely and appropriate treatment. Thus, early diagnosis and treatment of RRD is crucial. In this study, we applied deep learning, a machine-learning technology, to detect RRD using ultra–wide-field fundus images and investigated its performance. In total, 411 images (329 for training and 82 for grading) from 407 RRD patients and 420 images (336 for training and 84 for grading) from 238 non-RRD patients were used in this study. The deep learning model demonstrated a high sensitivity of 97.6% [95% confidence interval (CI), 94.2–100%] and a high specificity of 96.5% (95% CI, 90.2–100%), and the area under the curve was 0.988 (95% CI, 0.981–0.995). This model can improve medical care in remote areas where eye clinics are not available by using ultra–wide-field fundus ophthalmoscopy for the accurate diagnosis of RRD. Early diagnosis of RRD can prevent blindness.


Discussion
Our results showed that the deep learning technology for detecting RRD on the Optos fundus photographs had high sensitivity and high specificity. The results indicate the great potential for RRD diagnosis in areas where ophthalmologists are not available. If a community has Optos, noninvasive wide-field fundus images can be captured easily without pupillary mydriasis and does not cause medical complications. A deep learning technology can provide accurate RRD classification at a high rate and patients classified as RRD by the deep learning technology are at first remotely reviewed by ophthalmologists. The patients diagnosed as having RRD in this way can be encouraged to see a specialist before the condition progresses to difficult-to-treat PVR, which shows the potential of this model for ophthalmic care. Generally, RRD does not immediately deteriorate to PVR; therefore, if air travel is available for patients, this model could be used to cover considerably wide areas. Because RRD is a highly curable condition if patients receive timely appropriate surgical treatment, the telemedicine system using Optos that we suggest here could significantly reduce the risk of blindness in RRD patients who live in areas where ophthalmic care is not available.
In this study, we used a deep learning technology to classify the Optos images based on the presence of retinal detachment. Our analysis showed that the deep learning model using convolutional neural network (CNN) achieved a better AUC than that of the SVM model; however, slight difference was noted. Although the advantage of deep learning was slight, it was considered clinically important for achieving higher accuracy. It is suggested that multiple layers of non-linear processing in a deep learning model can learn features from a broader perspective with greater flexibility relative to what can be achieved by using SVM, which maps data to a feature space and  This study had several limitations. When clarity of the eye is reduced because of severe cataract or dense vitreous haemorrhage, capturing images with Optos becomes challenging; thus, such cases were not included in this study. Additionally, this study only compared the images of normal eyes and eyes with RRD and did not include eyes with any other fundus diseases. We also need to point out that use of deep learning requires a large number of data sets. In the future, further studies using larger samples and including eyes with other fundus diseases are necessary to more broadly assess the performance and versatility of deep learning.

Methods
Data Set. Of the patients diagnosed with RRD by ophthalmologists at Tsukazaki Hospital since December 2011, those who had taken a fundus photograph with Optos were included. In addition, the patients without fundus diseases were extracted from the clinical database of the Ophthalmology Department of Tsukazaki Hospital. These images were reviewed by two ophthalmologists to confirm the presence of RRD and were registered in an analytical database. Eyes that contained vitreous haemorrhage to the extent that ophthalmologists could not determine the area of RRD with Optos image were excluded. Of 831 fundus images, 420 images were from non-RRD patients and 411 images were from RRD patients. According to ophthalmologists who use Optos images, the area of detachment of RRD eyes was less than one quadrant for 93 eyes (22.6%), one to two quadrants for 163 eyes (39.7%), two to three quadrants for 120 eyes (29.2%), and three or more quadrants for 35 eyes (8.5%). In addition, the number of retinal breaks was one for 292 eyes (71.0%), two for 46 eyes (11.2%), and three or more for 14 eyes (3.4%); however, breaks could not be confirmed in 59 eyes (14.4%). Of these, 665 images (336 non-RRD and 329 RRD images) were used for training, and 166 images (84 non-RRD and 82 RRD images) were used for validation of the trained model. The image data set for training was augmented to 11,970 images (6,048 non-RRD and 5,922 RRD images) by performing contrast adjustment for brightness, gamma correction, smoothing, noise imparting and reversal processing on the original images. This research adhered to the Declaration of Helsinki and was approved by the ethics committee of Tsukazaki Hospital. Since this was a study that retrospectively reviewed the Optos images and there were no anonymous issues involved, the Institutional Review Board of Tsukazaki Hospital waived the need for consent. Fig. 3 for use in our classification system. This type of multi-layer CNNs is known to automatically learn local features in images and generate a classification model [13][14][15] . The first layer is called Convolutional Layer 1 (Conv1) and obtains the feature quantities of the target through convolutional filters. Each of the convolutional layers (Conv1, 2 and 3) are followed by rectified linear unit 16 and max pooling layers (MP1, 2 and 3) to decrease position sensitivity and allow for more generic recognition. The last two layers (FC1, FC2) are fully connected, remove spatial information from extracted feature quantities and statistically recognise the target from other feature vectors. The last layer is a classification layer, which uses feature vectors of target images acquired in previous layers and the softmax function for binary classification. This network architecture is ideal for learning and recognition of local features of complex image data with individual differences.

Deep Learning Model. We implemented a deep learning model that uses a CNN shown in
Training the Deep Convolutional Neural Network. The aspect ratio of the Optos original image is 3900 × 3072 pixels, which is not square; but for analysis, we changed the aspect ratio of all input images and resized them to 96 × 96 pixels. For training, 100 images were mini-batch processed. The initial value of the network weight was randomly given as the Gaussian distribution with a zero mean and a standard deviation of 0.05. To avoid over-fitting during the training, a dropout technique was applied to two fully connected layers (FC1, FC2) to mask out with 70% probability and improve the performance 17 . The optimisation algorithm called AdaGrad (learning rate = 0.001), one of the stochastic gradient descent methods 18 , was used to train the network weights. We trained the model by using the GPU of GeForce GTX970M by NVIDIA installed on a commercially available computer. Support Vector Machine (SVM). The SVM model was used as a reference to evaluate our deep learning recognition model 19 . For binary classification SVM, LIBSVM 20 from the Scikit-Learn library 21 with the Radial Basis Function (RBF) kernel was used. Other feature extraction or feature quantity selection was not performed so that we matched the criteria of the input data used in the deep learning training. The cost parameter (C = 10) and the RBF kernel parameter (γ = 0.001) were used, which were the results of the optimisation by grid search (i.e. C = 1, 10, 100, 1,000, 10,000; gamma = 0.000001, 0.00001, 0.0001, 0.001). Scoring for the grid search was performed on the basis of the mean AUC-receiver operating characteristic obtained by 5-fold cross validation of the grading data. The data set used in the training and validation were the same as the set used in the deep learning model.

Evaluation of Deep Learning and SVM Models.
The deep learning and SVM models were operated one hundred times each. Subsequently, the average value and 95% CI for AUC were calculated. Data availability. The Optos image datasets analysed during the current study are available with the corresponding author on reasonable request.