Optical coherence tomography-based deep-learning model for detecting central serous chorioretinopathy

Central serous chorioretinopathy (CSC) is a common condition characterized by serous detachment of the neurosensory retina at the posterior pole. We built a deep learning system (DLS) model to diagnose CSC and to distinguish chronic from acute CSC using spectral domain optical coherence tomography (SD-OCT) images. Data from SD-OCT images of patients with CSC and a control group were analyzed with a convolutional neural network. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUROC) were used to evaluate the model. For CSC diagnosis, our model showed an accuracy, sensitivity, and specificity of 93.8%, 90.0%, and 99.1%, respectively; AUROC was 98.9% (95% CI, 98.3–99.5%); and its diagnostic performance was comparable with that of VGG-16, Resnet-50, and five different ophthalmologists. For distinguishing chronic from acute cases, the accuracy, sensitivity, and specificity were 97.6%, 100.0%, and 92.6%, respectively; AUROC was 99.4% (95% CI, 98.5–100.0%); performance was better than that of VGG-16 and Resnet-50, and was as good as that of the ophthalmologists. Our model performed well when diagnosing CSC and yielded highly accurate results when distinguishing between acute and chronic cases. Thus, automated deep learning system algorithms could play a role independent of human experts in the diagnosis of CSC.

Here, we propose and evaluate a deep learning system (DLS) model for diagnosing CSC and determining its chronicity using OCT images. Recent advances in DLS techniques such as convolutional neural networks (CNNs) have provided an alternative method to characterize medical image data [18][19][20] . In ophthalmology, previous studies have reported high accuracies for CNN-based models in the detection of CSC from fundus photographs, detection and classification of diabetic retinopathy from fundus photographs, detection of AMD from fundus photographs or OCT, visual field examination of glaucoma patients, and the grading of pediatric nuclear cataracts 4,[20][21][22][23][24][25] . In this study, we propose a DLS model that uses OCT scans to distinguish between eyes with CSC and normal healthy eyes. In addition, the model distinguishes between acute and chronic CSC.

Results
A total of 2,360 images from the 220 participants were included in the study. The mean age of the participants was 43.32 ± 13.68 years in the normal group and 46.92 ± 9.54 years in the CSC group. Men constituted 79.03% and 79.75% of the participants in the normal and CSC groups, respectively. Detailed information on the data used in this study is shown in Table 1.

Model performance.
The results from the proposed model are shown in Table 2. There were 29 cases in which our model incorrectly judged CSC as normal; Table 3 shows six representative examples. Of these 29 cases, three were acute CSC with SRF and the remaining 26 were chronic CSC without SRF. The AUROC for CSC diagnosis (Fig. 1) was 98.9% (95% confidence interval [CI], 98.3-99.5%), which was lower than that of VGG-16 (99.4%) but higher than that of Resnet-50 (97.2%). The AUROC of the model for distinguishing chronic from acute CSC was 99.4% (95% CI, 98.5-100.0%), which was higher than that of both VGG-16 (97.4%) and Resnet-50 (94.2%) (Fig. 2).

Grad class activation mapping.
Representative heat maps produced from the two classifications by the Grad-CAMs are shown in Fig. 3. The heat maps highlighted regions that were comparable with the regions that retina specialists usually consider when diagnosing CSC. This suggests that our model attends to similar regions when assessing CSC images.

Discussion
In this study, we built a DLS model and investigated its performance in using SD-OCT images to diagnose CSC and distinguish chronic CSC from acute CSC. Several studies [12][13][14][15][16][17] have previously analyzed CSC using OCT; however, most of these addressed segmentation and rarely classification (Table 4). Our model showed promising performance in diagnosing CSC and performed well in distinguishing between acute and chronic CSC, where its performance was either comparable to, or better than, that of experienced retina doctors. This study is the first to evaluate the performance of a DLS model that classifies acute and chronic CSC using OCT. There were 29 false negatives where the model incorrectly classified CSC images as normal, which resulted in a relatively low sensitivity (90%). This could be due to the following reasons. First, there is no universally accepted classification of CSC 26,27 . In this study, chronic CSC without SRD was included in the chronic CSC group, and some chronic CSC images without SRF could be confused with normal images. Twenty-six of the 29 false negatives were chronic CSC, and none of these 26 cases had SRF on the images. If we had only considered CSC cases with SRF when training and testing the model, it is likely that there would have been fewer false negatives, which would consequently have improved the performance of our model. Second, we used randomly selected non-centered image cuts as well as five centered image cuts showing the typical CSC pattern. Therefore, even if the centered image cuts of an OCT volume revealed prominent acute or chronic CSC characteristics, the non-centered image cuts may have shown characteristics similar to those of normal eyes, resulting in misclassification. Interestingly, all except four of the false negative cases were non-centered images. From a clinical perspective, it is important that a classification model achieves high sensitivity by reducing false negatives. If a normal case is incorrectly classified as CSC, i.e., a false positive, it may increase the burden on the healthcare system. However, if a CSC case is incorrectly classified as normal, i.e., a false negative, it can cause a serious problem for the patient; irreversible visual impairment and deterioration of visual function may occur if the appropriate treatment is not initiated quickly. Therefore, we plan to improve the performance (especially the sensitivity) of our model on SRF-free and non-centered OCT images, even though the accuracy of our current model was comparable to that of the ophthalmologists.
Our model had better sensitivity and accuracy than most of the ophthalmologists in distinguishing between acute and chronic CSC, and correctly classified 11 cases on which the two ophthalmology experts disagreed. This implies that the model can provide useful diagnostic information even when human experts who generally agree well give differing interpretations. Our model therefore showed promising performance in distinguishing chronic from acute CSC and demonstrated the potential of DLS technology for assessing CSC based on OCT.
www.nature.com/scientificreports/
Improvements in OCT technology have increased the number of OCT images that are generated, which in turn increases the amount of OCT data to be analyzed and pushes the limits of clinical capacity. Therefore, image analysis using DLS is expected to contribute increasingly in the future. CNNs are popular neural networks with many layers that perform particularly well in image recognition 19 . Our CNN included an iterative convolutional layer structure responsible for extracting local features of the image, and a pooling layer that summarized the features of each region. Unlike conventional machine learning classifiers, the CNN can use these automatically extracted features to accurately classify an image. By applying a CNN model to OCT images, both classification and segmentation can be performed. In classification, the model predicts the class of an unlabeled image, whereas in segmentation, the model predicts the class of each pixel in an image rather than of the entire image. Hence, the final output of the model in the segmentation task can be an image comprising a set of labeled (or classified) pixels. In this study, we conducted two binary classification tasks.
To feed our DLS CNN, we preprocessed the OCT images in two steps: cropping and resizing. We cropped the original OCT images to remove unnecessary parts, and then resized the cropped image to 224 × 224 pixels, which is a widely used input size. To build a robust model applicable to a variety of input images, we performed an image augmentation process in the training phase. When we initially trained the model without data augmentation, we found that tilted images were misclassified in some instances. To address this issue, we randomly rotated the images, changed their brightness, and flipped them horizontally; this augmentation was applied only in the training phase. The augmentation notably improved the performance of our model, confirming that the preprocessing methods that we used were effective, especially when dealing with a relatively small data set.
We showed that Grad-CAM can correctly identify the pathologic region of an OCT image. The purpose of using Grad-CAM is to identify and specify the parts of an image that affect the probability scores of each class. The heat map of the regions activated by the model can identify and quantify differences, highlighting the important areas in the classification process. In clinical practice, we often observe CSC cases where time alone cannot fully explain the chronicity of the disease. For example, some patients have acute CSC findings even though their symptoms are more than 1 year old, while others have chronic CSC findings with extensive atrophic changes in the macula, despite their symptoms being less than 1 month old. Therefore, evaluating CSC chronicity with OCT as a biomarker alongside the time (actual onset time or time when the patient's symptoms were present) is more reliable than judging by the time alone.
In our study, we only used OCT images without information on the time variables in the proposed model. The highlighted part in the heat map generated by Grad-CAM refers to the location that is activated when the model predicts a specific class (i.e., normal vs. CSC or acute CSC vs. chronic CSC). Such heat maps are often beneficial when interpreting what the model considers for decision making. As shown in Fig. 3, the highlighted area on the heat map covers the inner retina, outer retina, and choroid layer, which includes the location of several known OCT biomarkers, including alteration of the RPE and outer retina morphology 2,[8][9][10][11] . Hence, we believe that using Grad-CAM during the learning process with OCT images (without any information on time) can provide useful detailed biomarkers for evaluating chronicity. Additionally, where many OCT images obtained by frequent examinations with long-term follow-up require analysis, the Grad-CAM result could be used to shorten the analysis time, avoid oversight, and help ophthalmologists arrive at a consistent prognosis.
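To make the heat-map construction concrete, the following is a minimal sketch of the Grad-CAM combination step: each feature map is weighted by the average of its gradients, the weighted maps are summed, and negative values are clipped. It is not the code used in this study; the function name `grad_cam` and the toy activations and gradients are hypothetical, and a real pipeline would obtain both arrays from the trained CNN.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(feature_maps: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Combine K feature maps into one heat map, Grad-CAM style.

    feature_maps[k][i][j]: activation of channel k at position (i, j).
    gradients[k][i][j]:    d(class score) / d(feature_maps[k][i][j]).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])

    # Channel weight alpha_k = global average of that channel's gradients.
    alphas = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]

    # Weighted sum over channels, then ReLU to keep only positive evidence.
    heat = [
        [
            max(0.0, sum(a * fmap[i][j] for a, fmap in zip(alphas, feature_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


# Toy example: two 2x2 channels with made-up activations and gradients.
toy_maps = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
toy_grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heat = grad_cam(toy_maps, toy_grads)  # [[0.5, 0.0], [0.0, 1.0]]
```

In the toy example the second channel has negative gradients, so positions where it dominates are suppressed by the ReLU, which is why only the diagonal remains highlighted.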
This study has some limitations. First, the variety and number of OCT images available were limited. External validation is necessary in future studies because all the images in this study were acquired from a single institution. However, the dataset was sufficient to demonstrate the feasibility of our DLS model to diagnose CSC and distinguish chronic from acute CSC using OCT images. Second, the model could be extended toward predicting future disease progression through a series of OCT images. In addition to determining the current status by viewing the latest image, the extended model could predict the future progression or chronicity using the longitudinal image data of CSC patients. Recurrent neural networks and long short-term memory networks can be used for such sequential predictions. Third, since we did not find any similar investigations using OCT images to classify CSC from normal retinas and chronic from acute CSC, we could not compare performance with previous studies. Regardless of the above limitations, the developed model demonstrates a reasonable and promising performance and suggests the need for further investigations on its potential impact in clinical practice.
In our study, we developed and evaluated a deep learning model that can diagnose CSC and distinguish its chronicity using SD-OCT images, which can be clinically useful in either determining the treatment plan or predicting prognosis. In clinical practice, a patient may present several macular diseases (e.g., CSC, AMD, etc.) at the same time. In addition, even if only one macular disease is present, the patient may potentially possess several other typical macular diseases such as AMD, DR, and RVO in addition to CSC. Therefore, developing a model that can identify each macular disease and its severity is crucial from a clinical perspective. Such a model can be useful in simultaneously assessing the presence or absence of several macular diseases, and help with the correct diagnosis.
In summary, we developed a DLS CNN model that performed well at diagnosing CSC and distinguishing chronic CSC from acute CSC without a segmentation algorithm. The process for assessing CSC needs to maximize its capacity to process the increasing number of images from participants who have examinations, with high accuracy. Automation of the classification process using DLS models may improve patients' quality of life by improving prognosis and may save cost and time for both healthy people and patients with CSC.

Methods
This study was conducted in accordance with the Declaration of Helsinki (1964). The Ethics Committee of Hangil Eye Hospital approved the research protocols and their implementation. The committee waived the requirement for obtaining informed consent given that this was a retrospective observational study of medical records and was retrospectively registered.
Data collection and labelling. We analyzed the records of patients who visited Hangil Eye Hospital between January 2017 and January 2020. We used spectral domain (SD)-OCT (Heidelberg Spectralis, Heidelberg Engineering, Heidelberg, Germany) images of normal participants and of patients with CSC. Of the 220 patients enrolled at the outpatient clinic during that period, 158 were diagnosed with CSC and 62 were healthy participants who were assigned to the control group. All CSC cases were diagnosed by independent retinal specialists on the basis of fundus examinations, FA, ICGA, and OCT images. A confocal scanning laser ophthalmoscope (Heidelberg Retina Angiograph, HRA; Heidelberg Engineering, Heidelberg, Germany) was used to perform simultaneous FA and ICGA on all CSC cases. One eye per patient was selected for this study, with one visit per patient. Our analysis excluded data that showed the presence of other potentially conflicting retinal pathologies such as AMD, polypoidal choroidal vasculopathy, pachychoroid neovasculopathy, and pachychoroid pigment epitheliopathy. For each OCT volume, we randomly selected 5-10 non-centered image cuts from the 25 image cuts of the volume scan, in addition to five centered image cuts showing the typical CSC pattern.
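The per-volume image selection described above can be illustrated with a short sketch. It assumes, purely for illustration, that the five centered cuts are the middle five of the 25 cuts and that the non-centered cuts are drawn uniformly without replacement; the text does not state how the centered cuts were identified, and the name `select_cuts` is hypothetical.

```python
import random
from typing import List

CUTS_PER_VOLUME = 25  # from the text: 25 image cuts per OCT volume
N_CENTERED = 5        # from the text: five centered cuts per volume


def select_cuts(rng: random.Random) -> List[int]:
    """Return sorted 0-based indices of the cuts kept from one OCT volume."""
    # Assumption: the "centered" cuts are the middle five of the 25.
    start = (CUTS_PER_VOLUME - N_CENTERED) // 2
    centered = list(range(start, start + N_CENTERED))

    # Draw 5 to 10 of the remaining cuts at random, without repeats.
    others = [i for i in range(CUTS_PER_VOLUME) if i not in centered]
    n_extra = rng.randint(5, 10)
    return sorted(centered + rng.sample(others, n_extra))


example_cuts = select_cuts(random.Random(0))
```

Under this scheme each volume contributes between 10 and 15 images, which is consistent with the reported total of 2,360 images from 220 participants (about 10.7 images per participant).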
Acute CSC: Acute CSC was diagnosed based on the presence of serous detachment of the neurosensory retina involving the macula as demonstrated by OCT, and leakage at the level of the RPE on FA. Only classic, acute CSC with a symptom duration of less than 3 months since the first episode, was included in the acute CSC group.
Chronic CSC: Based on the classification scheme of Daruich and colleagues 27 , chronic CSC was diagnosed according to the RPE status and was defined as chronic chorioretinopathy with widespread RPE decompensation, with or without SRD, and with or without an active leakage site. Following their definition, chronic CSC was diagnosed when extensive RPE atrophy was observed, regardless of the presence of SRF.
Categorization: Categorization was performed by two retina specialists (JSH and DDH) who examined all images obtained by OCT, FA, and ICGA multimodal imaging methods, and reviewed the medical charts. In cases of disagreement, a third retina specialist (JMH) confirmed the discrepancy and discussed the case with the other specialists. After a discussion, all discrepancies were resolved by consensus.

Data preprocessing.
To use the SD-OCT images as input for a DLS CNN, we first removed the unnecessary parts (such as the company logo) from the original 596 × 1264 pixel SD-OCT images, which resulted in 380 × 764 pixel RGB images. We subsequently down-sampled the 380 × 764 pixel cropped images to 224 × 224 pixel RGB images, which were fed into the DLS CNN. RGB images of 224 × 224 pixels are a widely used input format for classification models such as VGG-16 28 and Resnet-50 29 . To avoid overfitting 30 , we performed a data augmentation process to build a robust model from a variety of input images. The data augmentation process included random horizontal image flips, random brightness changes from 0.7 to 1.3, and random rotations of the image of up to 15°. This data augmentation process was only applied in the training phase.
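The preprocessing and augmentation steps above can be sketched in plain Python as follows. This is an illustration under several assumptions rather than the pipeline used in the study: the image sizes are read as height × width, the crop offsets (`top=100`, `left=500`) are hypothetical because the text does not give them, a single grayscale channel stands in for RGB, brightness is treated as a multiplicative factor, and the rotation angle is only sampled (applying it would need an interpolation routine that is omitted here).

```python
import random
from dataclasses import dataclass
from typing import List

Image = List[List[float]]  # rows of pixel intensities in [0, 1]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Keep the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Down-sample by nearest-neighbour lookup."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


@dataclass
class Augmentation:
    flip: bool         # horizontal flip
    brightness: float  # multiplicative factor in [0.7, 1.3]
    angle_deg: float   # rotation in [-15, 15] degrees (sampled, not applied)


def sample_augmentation(rng: random.Random) -> Augmentation:
    return Augmentation(
        flip=rng.random() < 0.5,
        brightness=rng.uniform(0.7, 1.3),
        angle_deg=rng.uniform(-15.0, 15.0),
    )


def apply_flip_and_brightness(img: Image, aug: Augmentation) -> Image:
    rows = [row[::-1] for row in img] if aug.flip else img
    # Clip so that brightened pixels stay within the valid range.
    return [[min(1.0, v * aug.brightness) for v in row] for row in rows]


# Usage on a synthetic 596 x 1264 image (training phase only).
raw = [[0.5] * 1264 for _ in range(596)]
cropped = crop(raw, top=100, left=500, height=380, width=764)
model_input = resize_nearest(cropped, 224, 224)
aug = sample_augmentation(random.Random(0))
augmented = apply_flip_and_brightness(model_input, aug)
```

At test time only `crop` and `resize_nearest` would be applied, in line with the statement that augmentation was restricted to the training phase.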
Model architecture. To classify a given OCT image as either CSC or normal, we built a DLS model based on the CNN architecture. As shown in Fig. 4, the proposed model comprised 13 CNN layers with a rectified linear unit (ReLU) activation function 31 , four max pooling layers, two dropout layers, and four fully connected (FC) layers. The dropout layers helped our model to avoid overfitting 28 , and the FC layers formed a traditional multilayered perceptron 32 . The final output layer with a softmax activation function was used to predict the binary classification result. The proposed CNN used 118,132,802 trainable parameters. The same architecture was also used to classify whether a CSC OCT image was acute or chronic; for this task, the final softmax output layer was replaced with one for the binary classification between acute and chronic CSC. We considered applying the transfer learning method to our model as other researchers have done 33,34 ; however, we decided not to use it because it performed poorly on our dataset.
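As a plausibility check on the reported parameter count, the sketch below counts trainable parameters for one hypothetical layout that fits the description (13 convolutional layers, four max pooling layers, four FC layers, 224 × 224 × 3 input). The channel widths are borrowed from VGG-16 and the FC widths (1024, 512, 256, 2) are assumptions; neither is stated in the text. It further assumes 3 × 3 kernels with padding that preserves spatial size and 2 × 2 pooling with stride 2.

```python
from typing import List

# Assumption: VGG-16-style output channels for the 13 convolutional layers.
CONV_CHANNELS: List[int] = [64, 64, 128, 128, 256, 256, 256,
                            512, 512, 512, 512, 512, 512]
# Assumption: widths of the four fully connected layers (last one = 2 classes).
FC_UNITS: List[int] = [1024, 512, 256, 2]
INPUT_SIZE, INPUT_CHANNELS, N_POOLS = 224, 3, 4  # from the text


def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def count_parameters() -> int:
    total, c_in = 0, INPUT_CHANNELS
    for c_out in CONV_CHANNELS:
        total += conv_params(c_in, c_out)
        c_in = c_out
    # Four 2x poolings shrink 224 to 14; pooling and dropout add no parameters.
    side = INPUT_SIZE // (2 ** N_POOLS)
    n_in = side * side * c_in  # 14 * 14 * 512 = 100,352 flattened features
    for n_out in FC_UNITS:
        total += dense_params(n_in, n_out)
        n_in = n_out
    return total


TOTAL_PARAMS = count_parameters()  # 118132802
```

Under these assumed widths the count is 118,132,802, the same as the reported figure. This shows that the layout is consistent with the description, although other layouts cannot be ruled out from the text alone.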

Gradient weighted class activation mapping.
To visualize the pathologic region of an OCT image in the classification process, we applied gradient weighted class activation mapping (Grad-CAM) 35 to generate a heat map of activated regions. Grad-CAM uses the gradients of the target label (e.g., CSC) with respect to the feature maps of a convolutional layer to highlight the regions of the image that are important for predicting the target label. The heat map illustrates the area of the image that the model uses for its classification.

Experiment setup.
The data were randomly split into a training set (1,861 images) and a test set (499 images). The test set was used only for the final evaluation of model performance, and no patient appeared in both sets. For CSC diagnosis, the CSC/normal split was 1,171/690 in the training set and 289/210 in the test set. For classifying the CSC cases, the acute/chronic split was 371/800 in the training set and 95/194 in the test set. To compare our model with VGG-16 28 (VGG with 16 layers) and Resnet-50 29 (Resnet with 50 layers), we trained and evaluated these models on the same training and test sets. All models were trained for 50 epochs with a batch size of 64 and Adam optimization (learning rate 0.0001). To evaluate our model from a clinical perspective, the classification results for the test images (788 images, which matches the 499 test images for CSC diagnosis plus the 289 CSC test images for acute versus chronic classification) were compared with those made by five ophthalmologists, including three ophthalmology residents and two experts, each having more than 10 years of clinical experience at an academic ophthalmology center.
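The requirement that no patient appears in both sets means the random split has to be made over patients rather than over images. The following is a minimal sketch of such a split; the function name `split_by_patient`, the identifier format, and the toy data are hypothetical, and the sketch does not reproduce the class balance reported above.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def split_by_patient(
    image_to_patient: Dict[str, str], test_fraction: float, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Split image IDs so that each patient's images all land in one set."""
    by_patient: Dict[str, List[str]] = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        by_patient[patient_id].append(image_id)

    # Shuffle patients, not images, to avoid leakage between the sets.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Fill the test set patient by patient until it reaches the target size.
    target = test_fraction * len(image_to_patient)
    train: List[str] = []
    test: List[str] = []
    for pid in patients:
        bucket = test if len(test) < target else train
        bucket.extend(by_patient[pid])
    return train, test


# Toy data: 20 patients with 10 images each.
toy = {f"p{p:02d}_cut{c}": f"p{p:02d}" for p in range(20) for c in range(10)}
train_ids, test_ids = split_by_patient(toy, test_fraction=0.2)
```

For reference, the split reported above corresponds to a test fraction of 499/2,360, or about 21% of the images.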

Statistical analysis.
To measure the performance of the model, the sensitivity, specificity, accuracy, and the area under the receiver operating characteristic (ROC) curve (AUROC) 36 were determined. Cohen's Kappa coefficients were used to rate the agreement level between the two experts.
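The metrics named above have simple closed forms, sketched below. The sensitivity example uses counts stated in the text (289 CSC test images, 29 false negatives); the rater labels and classifier scores in the other examples are made up for illustration, and the AUROC is computed with the pairwise (Mann-Whitney) formulation, which may differ from the software used in the study.

```python
from typing import Sequence


def sensitivity(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    return tn / (tn + fp)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    return (tp + tn) / (tp + tn + fp + fn)


def cohen_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    return (observed - expected) / (1 - expected)


def auroc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Probability that a positive case scores higher than a negative one."""
    wins = sum(
        (p > q) + 0.5 * (p == q) for p in scores_pos for q in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Counts from the text: 289 CSC test images, 29 of them judged normal.
csc_sensitivity = sensitivity(tp=289 - 29, fn=29)  # about 0.900

# Hypothetical expert labels and classifier scores.
toy_kappa = cohen_kappa(
    ["acute", "acute", "chronic", "chronic"],
    ["acute", "chronic", "chronic", "chronic"],
)  # 0.5
toy_auroc = auroc([0.9, 0.8, 0.4], [0.7, 0.3])  # 5/6
```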

Data availability
The data are not available for public access because of patient privacy concerns, but are available from the corresponding author on reasonable request.