Introduction

Refractive error is defined as the state in which light is not accurately focused on the macula due to the ocular shape. Low vision due to refractive error is a leading cause of treatable blindness worldwide [1]. A recent study reported that uncorrected refractive error was associated with visual impairment in about 100 million patients in 2010 [2]. In particular, myopia is expected to increase exponentially among East Asian populations with a large socioeconomic burden [3]. Moreover, refractive error, including both severe myopia and hyperopia, is associated with an increased risk of several retinal ocular diseases and glaucoma [4, 5]. High myopia is associated with the risk of vision-threatening complications such as myopic choroid neovascularization and chorioretinal atrophy. However, refractive error may be overlooked in many low-vision cases because clinicians tend to focus on ocular diseases with pathological lesions [6].

Optical coherence tomography (OCT) provides structural information on the posterior pole of the eye with cross-sectional images. Currently, assessments of the retina and optic disc have been widely implemented using OCT to detect vision-threatening diseases such as diabetic retinopathy, age-related macular degeneration, and glaucoma [7]. Various studies have shown that retinal structure measured by OCT is closely related to refractive error. Generally, retinal nerve fiber layer thickness is associated with refractive error [8]. A previous study reported that the choroidal thickness measured by OCT was associated with myopic eyes [9]. OCT provides a diagnostic ability for various pathological conditions in patients with high myopia, such as retinoschisis, chorioretinal atrophy, and posterior staphyloma [10]. Because refractive error can affect the accuracy for diagnosis of several conditions, such as glaucoma, macular edema, and choroid neovascularization, it still remains an important factor during assessment of OCT images [4, 11].

Deep learning has achieved promising classification performance in ocular imaging domains [12, 13]. It can also predict clinical parameters that doctors have not been able to do before by extracting hidden features of medical images [14]. Automated interpretation of OCT images with deep learning has improved the diagnostic performance for detection of retinal diseases [15]. In addition, deep learning has provided new insight about the previously unknown relationship between retinal morphology and physiological variables [14]. In particular, a previous study conducted by Google revealed that uncorrected refractive error could be predicted using fundus photographs through deep learning [16]. A recent study reported that a deep learning algorithm was able to detect eyes with high myopia with high accuracy from fundus photographs [17]. However, no studies have used OCT to directly estimate refractive error by analyzing the shape of the retina and optic disc.

To this end, we analyzed retinal OCT images from healthy subjects without refractive surgery. We aimed to use a deep learning-based regression technique to develop a novel prediction model for estimating the uncorrected refraction error using OCT images. In addition, we aimed to identify OCT imaging features that contributed to the estimation of refractive error using an explainable technique.

Materials and methods

Dataset

This retrospective study protocol was approved by the Institutional Review Board of Korean National Institute for Bioethics Policy (P01-202104-21-010), which waived the requirement for informed consent. This study adhered to the tenets of the Declaration of Helsinki. In the current study, we analyzed the preoperative ocular data of healthy patients who intended to undergo refractive surgery at the B&VIIT Eye Center from January 2016 to December 2016. We retrospectively collected preoperative refractive error and OCT measurements that were used to develop a machine learning model to exclude patients who underwent refractive surgery. All patients underwent preoperative measurements of best-corrected distance visual acuity and manifest refraction, slit-lamp examinations of the anterior segment, and dilated fundus examinations. Subjects with a history of ocular surgery, corneal diseases, cataract, glaucoma, uveitis, and retinal diseases were excluded from the study. Subjects with missing data (refractive error or OCT images) or low-quality OCT images were also excluded. The inclusion criteria at the B&VIIT Eye Center were as follows: age between 21 and 50 years, stable refraction, +5.00 to −20.00 diopters of hyperopia or myopia with astigmatism of 5.00 D or less, and examination of retina and optic disc using three-dimensional OCT (Topcon 3D OCT‐1 Maestro, Tokyo, Japan). We captured retinal cross-sections passing through the center of the retina (fovea) along the horizontal and longitudinal axes during OCT examination.

Finally, this study comprised 936 eyes of 468 healthy subjects with no history of refractive surgery. For the prospective validation design [18], we assigned the subjects who visited before September 2016 (73.5%, N = 688 eyes of 344 subjects) to the training dataset, and those after October 2016 (26.5%, N = 248 eyes of 124 subjects) were used for the internal validation dataset. In the training process, we performed fivefold cross-validation, the preferred technique for assessing performance and optimizing prediction models.

Definition of refractive error

The refractive error was measured by expert optometrists during manifest refraction without cycloplegia. The target metric for refractive error is the spherical equivalent (SE), which is defined as the average spherical power considering both the spherical refractive power and astigmatism. The SE is calculated by adding the sum of the spherical refractive power to half of the cylinder power. Specifically, we defined high myopia as an SE of −6.00 D or worse and moderate myopia as an SE between −6.00 and −3.00 D.

Algorithm

The overall dataset and the algorithm development processes are illustrated in Fig. 1. For the refractive error prediction model, the OCT images were analyzed using a deep learning architecture. In the model development, SE was the main outcome as a continuous variable, and the model inputs comprised OCT images. In this study, the deep learning model consists of two parts: a convolutional neural network feature extractor and a fully connected layer. Pretrained deep learning architectures, including ResNet50, InceptionV3, and VGG16, were adopted as feature extractors. These neural network architectures were pretrained on the ImageNet database and imported into MATLAB R2020b (MathWorks Inc., Natick, MA, USA) platform. These architectures have been extensively and successfully used in previous studies involving ophthalmic disease detection using OCT, implying state-of-the-art performance [19, 20]. The input images were resized into the original input tensor of each deep learning architecture. A fully connected layer was used to concatenate the feature information from deep learning architectures for SE prediction. For this regression, a linear activation function was used in the fully connected layer [21]. In this study, we focused on ResNet50, which reformulated the layers as learning residual functions with reference to the layer inputs using skip connection, instead of learning unreferenced functions [22]. During the training, to prevent overfitting, data augmentation was performed using linear transformation, including left and right flip, width, and height translation from −5 to +5%, random rotation from −30° to 30°, zooming from 0 to 10%, and random brightness change from −10 to 10%. The loss function consisted of the mean square error between the regression output and reference SE because it showed a more stable performance during training than other metrics. The training iterations were tuned for each deep learning model to avoid overfitting. The model was optimized with Adam with a batch size of 20 and a base learning rate of 0.00001 for transfer learning of the model. Because there are fundamental differences, such as the presence or absence of the optic disc, we built deep learning models for horizontal and longitudinal OCT domains, respectively. A saliency map was generated using the Grad-CAM technique to visualize the characteristic features of high myopia and controls for interpretability.

Fig. 1: Workflow for data assignment and deep learning model development for refractive error prediction.
figure 1

A Data management for training models and validation. B Schematic of the deep learning algorithm to predict refractive error.

Statistical analysis

To evaluate the prediction performance of deep learning models, we used metrics of the mean absolute error (MAE), median absolute prediction error (MedAE), and Pearson correlation coefficient, of SE values in fivefold cross-validation and test dataset validation. Comparison of MAE between the models was performed using a paired t-test. To detect moderate myopia and worse (SE ≤ −3.00 D) and high myopia (SE ≤ −6.00 D), we calculated the area under the receiver operating characteristic curve (ROC-AUC). Youden’s index, which is a widely used estimate of the optimal threshold giving equal weight to sensitivity and specificity, was adopted in this study.

Results

The characteristics of the subjects and images in the development dataset (N = 688 eyes of 344 subjects) and test dataset (N = 248 eyes of 124 subjects) are summarized in Table 1. The subjects had a mean age of 29.01 years and a mean SE of −5.72 D in the total dataset. There was 41.0% and 37.9% high myopia among the development and test datasets, respectively. Eyes with hyperopia were very rare, with 0.7% and 0.8% among the development and test datasets, respectively. The distribution of the two datasets showed no significant differences in age, sex, or refractive error.

Table 1 Characteristics of the subjects for development and test validation datasets.

The SE estimation performances of the deep learning models are presented in Table 2. The ResNet50 model with a horizontal axis showed the best performance with a MAE of 2.42 D and a MedAE of 2.15 D, in the fivefold cross-validation. Similarly, it also showed the lowest error with a MAE of 2.66 D and a MedAE of 2.32 D in the test validation dataset. The ResNet50 with the horizontal axis showed significantly better prediction performance than that with the longitudinal axis in both the fivefold cross-validation (P = 0.008) and test validation (P = 0.043). The difference between deep learning architectures in MAE was not statistically significant in the horizontal axis domain (P > 0.05, for all comparisons).

Table 2 Performance of deep learning models for estimating uncorrected refractive error in development (fivefold cross-validation) and test validation datasets.

Figure 2A, B shows the distribution of the actual versus predicted SE values using ResNet50 as a backbone architecture. The Pearson correlation coefficient between the actual SE and predicted SE was 0.636 (P < 0.001) using fivefold cross-validation. The test dataset validation showed a Pearson correlation coefficient of 0.588 (P < 0.001). Figure 2C, D shows the percentage of eyes within a given range of SE prediction errors. When we used the ResNet50 model with the horizontal axis, which showed the lowest error to predict SE, 24% of the eyes were within the error range of ±1.00 D and 63% were within the error range of ±3.00 D in the fivefold cross-validation. The test dataset validation showed that 22% of the eyes were within the error range of ±1.00 D and 66% were within the error range of ±3.00 D.

Fig. 2: Performance of the developed ResNet50 model with horizontal OCT images via fivefold cross-validation and test dataset validation.
figure 2

Distribution of the predicted spherical equivalent against the actual spherical equivalent A in the fivefold cross-validation and B in the test dataset validation. Distribution of the error to show the accuracy of the predicted spherical equivalent to the actual spherical equivalent C in the fivefold cross-validation and D in the test dataset validation.

In our test dataset, we attempted to detect moderate myopia and worse (SE ≤ −3.00 D) and high myopia (SE ≤ −6.00 D) using the developed models. Precision–recall curves and ROC curves are shown in Fig. 3A, B. The ResNet50 model with the horizontal axis worked with the highest ROC-AUC of 0.789 (95% confidence interval [95% CI], 0.689–0.888) to detect moderate myopia and worse. Based on the Youden index, the ResNet50 model showed a classification performance with an accuracy of 79.8% (95% CI, 74.3–84.7%), sensitivity of 82.3% (95% CI, 76.3–87.3%), and specificity of 68.9% (95% CI, 53.4–81.8%). Similar findings have been observed in the prediction of high myopia. The ResNet50 model gave a ROC-AUC of 0.813 (95% CI, 0.744–0.881), accuracy of 71.4% (95% CI, 65.3–76.9%), sensitivity of 87.2% (95% CI, 78.8–93.2%), and specificity of 61.7% (95% CI, 53.5–69.4%). Figure 3C shows examples of the results from the developed ResNet50 model predicting the uncorrected refractive error with a saliency map. The inner retinal layers and relatively steepened curvatures were highlighted by the saliency map to detect high myopia based on the Grad-CAM technique. Grad-CAM frequently highlighted the fovea and optic nerve when the model detected eyes without high myopia.

Fig. 3: Classification performance of deep learning models with horizontal OCT images to detect myopia in the test dataset validation.
figure 3

A Precision–recall curves and ROC curves for moderate myopia and worse (SE ≤ −3.00 D) and B for high myopia (SE ≤ −6.00 D). C Examples of the developed deep learning (ResNet50) results predicting the uncorrected refractive error with a saliency map using the Grad-CAM technique.

Discussion

In this study, we applied a deep learning technique to estimate the uncorrected refractive error using OCT images containing the retina and optic disc. Our work shows that morphological factors of OCT images contribute to the detection of eyes with refractive errors, and artificial intelligence has the potential to accurately estimate uncorrected refractive error. This prediction using OCT has not been carried out by ophthalmologists before, indicating the development of novel medical artificial intelligence to aid in their diagnosis. To the best of our knowledge, this is the first automated model for estimating refractive error in posterior segment OCT images. Several previous studies have developed the deep learning models using OCT to predict degenerative conditions in patients with myopia [23, 24]. However, it is also important for patients with myopia to determine the risk of development before degenerative changes occur. In the future, our algorithm will be very useful to detect patients with refractive error if OCT is used as an eye disease screening device for general populations.

We observed that the output result from Grad-CAM highlighted visualizations of the characteristic OCT features of myopic eyes to predict eyes with high myopia. Our results are explicable based on previous literature studying OCT [10]. There were regional changes in retinal thickness within the macular region, thickness variations that correlated with axial length [25]. The thickness of the retinal nerve fiber layer is negatively associated with axial length [8]. Because axial length critically affects refractive error of the eyes, OCT of the macular region can provide clues to estimate refractive error. The steepened curve of the posterior pole may be a sign of thin sclera and long axial length of the eye [10]. Optic disc changes due to stretching of the scleral canal and lamina cribrosa are also frequent in eyes with high myopia [26].

In the current analysis, the accuracy of the deep learning algorithm for refractive error estimation using OCT (MAE = 2.66 D for validation) is lower than that using fundus photography (0.56 D for UK biobank and 0.91 D for AREDS datasets), which was reported by the Google team [16]. A previous study conducted in Singapore showed that the ROC-AUC for high myopia detection using fundus photographs was 0.978, which was higher than our result using OCT (AUC = 0.813) [17]. This may be because the fundus photograph contains distorted images with a relatively wide retinal area and vessel arcades, while the OCT provides only information from a narrow area that is relatively undistorted. The distorted retinal image may provide information about the curvature of the cornea and lens of the anterior segment. A small amount of training data in this study might also affect training, resulting in a lower accuracy of refractive error estimation.

Until now, deep learning researchers using OCT have focused on developing models for diagnosing specific ocular diseases, such as age-related macular degeneration, macular hole, and central serous chorioretinopathy [19, 20]. Recently, studies have been widely conducted to extract hidden clinical information as well as ocular pathological information from fundus photography using artificial intelligence [27]. One previous study predicted that artificial intelligence, which is able to diagnose dementia using OCT, will emerge soon using a large neuroimaging and OCT database [28]. We also focused on finding signs in OCT images that we have not seen before. In this study, we observed that OCT images alone provided previously elusive information for refractive error prediction by analyzing the cross-sectional shape of the retina and optic disc.

In recent decades, OCT has become the standard imaging tool for examination of structural information of the retina and optic disc [29]. We believe that this method will expand the utility of OCT imaging for more accurate diagnosis. For example, using our method, it is possible to estimate the severity of preoperative refractive error in patients who have undergone refractive or cataract surgery. Since the risk associated with high myopia remains after refractive surgery, our method is very useful in estimating such risk in screening ocular diseases using OCT. Preoperative refractive error can also provide a more accurate diagnosis of glaucoma and macular edema in patients with refractive surgery because myopic eyes have thinner baseline retinal and nerve fiber layer thickness [10].

A strength of this study was the use of uncorrected refractive error for deep learning algorithm development from posterior segment OCT. However, this study had several limitations. First, the datasets consisted of an East Asian population from a single center. A previous study showed that structural retinal parameters measured by OCT varied by ethnicity [30]. Therefore, it is not confirmed whether our method can be used in other ethnic groups. Second, the developed model was not validated using an independent external dataset. There are many factors that can affect OCT images, such as product specifications, software, and inspector skill. Their impact can be evaluated when the algorithm is verified on a variety of external datasets. Third, the distribution of SE in the datasets was biased toward myopia. Because there were very few subjects with hyperopia, we aimed to build a model to predict moderate or high myopia. Therefore, the developed deep learning model could not be fully validated in emmetropic and hyperopic eyes. Fourth, our study did not analyze axial length, which is highly associated with retinal thickness and refractive error [26]. Ocular biometry data were excluded from the analysis because they were not fully digitalized at the time of the study in the B&VIIT Eye Center. Based on the current findings, axial length is also presumed to be related to the cross-sectional shape of the retina, which should be addressed in future studies.

In summary, we developed a deep learning model to estimate the uncorrected refractive error from posterior segment OCT images. A deep learning algorithm shows that OCT could potentially be used as an imaging modality to estimate refractive error. This method will facilitate the evaluation of refractive error to prevent clinicians from overlooking the risks associated with refractive error during OCT assessment. We hope that the developed artificial intelligence will contribute to reducing the global burden of refractive error.

Summary

What was known before

  • Retinal structure measured by OCT is closely related to refractive error. OCT provides a diagnostic ability for various pathological conditions in patients with high myopia.

What this study adds

  • A deep learning technique can directly estimate the uncorrected refractive error using posterior segment OCT. This method will facilitate the evaluation of refractive error to prevent clinicians from overlooking the risks associated with refractive error.