This work aimed to use artificial intelligence to predict subjective refraction from wavefront aberrometry data processed with a novel polynomial decomposition basis. Subjective refraction was converted to power vectors (M, J0, J45). Three gradient boosted trees (XGBoost) algorithms were trained to predict each power vector using data from 3729 eyes. The model was validated by predicting subjective refraction power vectors of 350 other eyes, unknown to the model. The machine learning models were significantly better than the paraxial matching method for producing a spectacle correction, resulting in a mean absolute error of 0.301 ± 0.252 Diopters (D) for the M vector, 0.120 ± 0.094 D for the J0 vector and 0.094 ± 0.084 D for the J45 vector. Our results suggest that subjective refraction can be accurately and precisely predicted from novel polynomial wavefront data using machine learning algorithms. We anticipate that the combination of machine learning and aberrometry based on this novel wavefront decomposition basis will aid the development of refined algorithms which could become a new gold standard to predict refraction objectively.
Globally it is estimated that 153 million people aged 5 or above are visually impaired due to uncorrected refractive errors1.The ability to automatically refract a patient and provide a spectacle prescription, equivalent to the time consuming current gold standard of subjective refraction, is an elusive goal that has intrigued many ophthalmic clinicians and researchers2,3,4.
One such automated and objective method is optical wavefront sensing using aberrometry which allows mathematical reconstruction and analysis of lower and higher order monochromatic aberrations of the eye. This has led many to believe that this objective method had the potential to be the new standard for optimizing correction of refractive errors by converting aberrometry data to accurate sphero-cylindrical refractions5,6,7.
Though several small sample studies showed promising results in terms of accuracy and precision of objective refraction from several methods related to wavefront analysis to date5,6,7,8,9,10,11,12,13 no study has found a validated method that can be used to prescribe a spectacle correction. It was found that results from the aberrometer, autorefractor and subjective refraction, though comparable with each other, were not accurate enough to prescribe spectacles directly from either instruments14. A recent publication found that a visual image quality metric could predict subjective refraction in myopic eyes but not habitually undercorrected hyperopic eyes, though the data set was again small15. Variability in the gold standard subjective refraction measurements themselves were also thought to be a source of poor precision7.
The ocular wavefront error is most commonly described by the Zernike polynomials16. To satisfy orthogonality constraints with low order modes, some higher order Zernike polynomials contain low order terms in their analytical expression leading to lack of accuracy when predicting the sphero-cylindrical refraction17. It is known that conventional therapies such as spectacles or contact lenses correct just the lower-order aberrations but the presence of higher-order aberrations influences the prescription itself12. An important finding by Cheng et al.6 showed that subjective judgment of best focus does not minimize RMS wavefront error (Zernike defocus = 0), nor create paraxial focus (Seidel defocus = 0), but makes the retina conjugate to a plane between these two. The levels of spherical aberration (Z4°) and secondary astigmatism (Z4±2) influenced the levels of defocus and primary astigmatism that produced the best visual performance. These objective metrics were tested based on an assumption that the Zernike polynomial decomposition was producing a clear distinction between the low and high order components of the wavefront error. This assumption could explain the poor correlation between subjective and objective refraction, especially when it came to large amounts of higher order aberration5,6,7,9,11,12,18,19.
A new series of polynomials, labeled LD/HD (Low Degree/ High Degree), have been proposed to provide a more mathematically relevant separation of the higher and lower modes20. In this decomposition, the normalized higher order modes are devoid of low order terms and mutually orthogonal within but not with lower order aberrations. With this approach, the low order wavefront component is equal to the paraxial curvature matching of the wavefront map.
Machine learning is already in use in Ophthalmology for image analysis in medical retina21,22,23, and glaucoma (Visual Fields and Disc Photos)24,25 as well as recent developments for use in diagnoses including retinopathy of prematurity26. It is also used in regression tasks, notably in IOL calculations27. Deep learning has been applied to predict refractive error from fundus images and other image analysis techniques also28,29. Attempts to predict subjective refraction from Zernike polynomials have also been tried using a multilayer perceptron with two hidden layers30.
Our aim was to build and evaluate a set of predictive machine learning models to accurately and precisely objectively refract a patient using wavefront aberrometry with LD/HD polynomial decomposition, and to evaluate the relative importance of each polynomial in the prediction process for each vector.
Patients demographics are presented in Table 1. Training set and test set were comparable in terms of patient ages, sex-ratio, side repartition, mean refractive spherical equivalent and mean refractive cylinder.
Prediction performance of the different methods
The performance of prediction methods are presented in Table 2. Figure 1 illustrates the prediction performances of the different approaches for the M vector and Fig. 2 illustrates those for the J0 and J45 vectors. Statistical testing for differences between the various prediction methods is presented in Table 3. The XGBoost models using all the polynomials, resulted in a Mean Absolute Error of 0.30 ± 0.25 Diopters (D) for the M vector, 0.12 ± 0.09 D for the J0 vector and 0.09 ± 0.08 D for the J45 vector, whilst the Paraxial matching method resulted in a Mean Absolute Error of 0.40 ± 0.35 D for the M vector, 0.17 ± 0.14 D for the J0 vector and 0.14 ± 0.1 D for the J45 vector. The XGBoost models using only the low-degree polynomials resulted in a Mean Absolute Error of 0.35 ± 0.29 D for the M vector, 0.16 ± 0.14 D for the J0 vector and 0.12 ± 0.10 D for the J45 vector. Bland-Altman plots showed a good agreement between subjective refraction and the predictions obtained with the machine learning models, with no systematic error depending on degree of refractive error (Fig. 4). Paired t-test were not significant.
Pairwise prediction methods comparison
The XGBoost models using all the polynomials performed statistically better than Paraxial matching for every vector and every metric, except for accuracy for the J45 vector prediction. They also performed better than the XGBoost models trained with low-degree polynomials only, although the difference was not significant for precision in predicting the M vector and accuracy in predicting the J0 and J45 vectors.
SHAP value analysis for the three XGBoost models trained with the full set of polynomials is presented in Fig. 4. It showed that G20(defocus) was by far the most influential feature to predict the M vector, with G4° (primary spherical aberration) being the second most important feature. The bottom two graphs demonstrate that G22 (Vertical astigmatism) and G4−2 (Oblique secondary astigmatism) were the most important features to predict the J0 vector, while G2−2 (Oblique astigmatism) and G42 (Vertical secondary astigmatism) were the most important features to predict the J45 vector.
The machine learning approach using LD/HD polynomials was more effective than the paraxial matching method for predicting the results of conventional, sphero-cylindrical refraction from wavefront aberrations used by Thibos et al. previously7. Interestingly, the XGBoost models trained using low-order aberrations only proved more accurate than paraxial matching. This could suggest that those low-order polynomials interact, in some circumstances, in a more complex way than previously thought. The best precision and accuracy were obtained when all the novel polynomials coefficients were used as predictors, demonstrating the significant influence of the higher order aberrations on the spectacle correction.
Gradient boosting creates new models that predict the residual errors of prior models during the training process. The models are used together to predict the target value. XGBoost is an implementation of gradient boosted trees focused on performance and computer efficiency. It can perform both regression and classification tasks. It was chosen because of its recognized performances and its resistance to overfitting31.
Feature importance G2° (defocus) was unsurprisingly the most influential feature to predict the M vector, with G4° (primary spherical aberration) being the second most important feature. One interesting finding was that G4−2 (Oblique secondary astigmatism) was the second most important feature to predict J0, and G42 (Vertical secondary astigmatism) the second most important feature to predict J45, while the inverse would be more intuitive. This demonstrates the interest in the machine learning approach, that allows us to discover new patterns and relationships between predictors by disregarding previous assumptions.
Our results confirm the prevalence of 4th order aberrations within the higher order coefficients influencing the sphero-cylindrical refraction as it has been previously shown6. The LD/HD modes being devoid of defocus terms (radial degree 2), they unambiguously confirm the influence of the radial degree 4 of the wavefront error, on sphero-cylindrical refraction.
The predictive influence of the variables used in the model does not explain their exact role, and that is a weakness of such machine learning algorithms, as interpretability and model comprehension are limited by the big number of decision trees, their complexity and depth.
Our study had some unavoidable limitations, among which is accommodation. We created a study design using undilated refraction, mirroring the real life clinical environment where spectacle correction is provided in adults, as well as allowing preservation of data volume. We did not test children or elderly patients so cannot generalize to these groups. By virtue of the technique, it is not possible to objectively refract patients with strabismus, corneal scarring, cataracts or vitreous opacity that would preclude clear wavefront analysis.
Precision may be masked by the imprecision of the gold standard of subjective refraction. Of note the examiner was aware of the autorefraction. We hope our study results will enable future development of machine learning algorithms from the LD/HD polynomials and objective refraction techniques, to prescribe glasses efficiently, not only to adults but also to children and vulnerable adults without need for their input or prolonged cooperation.
Patients and dataset constitution
This study was approved by the Institutional Review Board at Rothschild foundation and followed the tenets of the Declaration of Helsinki. Informed consent was obtained from all participants. A total of 2890 electronic medical records of patients (6397 eyes) evaluated for refractive surgery at the Laser Vision Institute Noémie de Rothschild (Foundation Adolphe de Rothschild Hospital, Paris) were retrieved and consenting patients data was analyzed. We excluded patients with strabismus and any other ocular abnormalities except ametropia. After data cleaning, eyes with subjective refraction and a valid wavefront aberrometry examination were randomly split into a 350 eyes test set and a training set, with no cross over of same patient data. A manual review of medical records of eyes in the test set was checked to ensure the quality of data, leaving 3729 eyes for the training set.
Wavefront analysis was obtained using the OPD-Scan III (Nidek, Gamagori, Japan). The aberrometer was specially configured to run using beta-software incorporating the new series of LD/HD polynomials, noted Gnm using the same double index scheme of the Zernike polynomials. The wavefronts were decomposed up to the 6th order. We chose to stop our polynomials analysis at the 6th order. This cut-off is beyond the number of polynomials that was determined by the members of the Vision Science and its Applications Standards task force (VSIA) to be necessary to describe the HOA of the human eye with sufficient accuracy in 200036. It applied to the paraxial matching analysis as well as the machine learning approach. The first three polynomials (Piston, Tilt, Tip) were removed from the features because of their low relevance in this work. Defocus, Vertical Astigmatism and Oblique Astigmatism constituted the Low order polynomial group, and all the others constituted the High order polynomials group. A 4 mm pupil disk diameter was chosen to obtain the coefficients and any pupil less than 4 mm during the acquisition of the wavefront with the OPD-Scan III, was an exclusion criterion. A 4 mm pupil diameter analysis cut-off was used because it is close to the mean physiological photopic pupil diameter in different studies37,38,39. Our results may not reflect the results that could be found using very large or very small pupils.
Corresponding non-cycloplegic subjective refractions conducted on the same day by an experienced optometrist were analyzed. The maximum plus rule was used to the nearest 0.25 D to minimize accommodation and maximize the depth of focus7.
Power vector analysis
Each refraction in Sphere S, Cylinder C, and axis A format was transformed into 3D dioptric vector space (M, J0, J45) where the three components are orthogonal. Refraction data sets were vectorized using standard power vector analysis with the components M, J0 and J4540 using Eqs. (1), (2) and (3).
Machine learning methodology
Three machine learning models were separately trained to predict each vector component from the new series of polynomials. We used a Gradient Boosted Trees algorithm (XGBoost)41. Parameter tuning was performed using 5-folds randomized search cross-validation. Mean squared error regression loss was chosen as the evaluation metric. We used Python 3.6.8 with the following libraries: Jupyter 4.4.0. Pandas 0.23.4, Scikit-learn 0.20.2, Matplotlib 3.0.2, Seaborn 0.9.0, XGBoost 0.81.
Feature importance analysis
SHAP (SHapley Additive exPlanations) values were calculated for each model in order to determine the most influential polynomials (Fig. 3).
SHAP value is a recently described tool that aims to increase machine learning models interpretability42. It allows us to understand how a specific feature negatively or positively participates in the target variable prediction by computing the contribution of each feature towards the prediction. This allows better estimation of the importance of each feature in the prediction process. A random variable was introduced as a predictive feature during the training in order to help differentiate useful features from the others.
Model evaluation and statistical analysis
Performances of the machine learning models were evaluated on the test set never seen by the model nor used for the hyperparameters choice, to avoid overfitting. For each machine learning approach (using low order polynomials only, and using every polynomial), the three vectors of the refraction were predicted one by one using the three machine learning models. Paraxial matching predictions were calculated using Eqs. (4–6)
where Gnm is the nth order LD/HD coefficient of meridional frequency m, and r is the pupillary radius. It is important to note that as high order LD/HD coefficients are devoid of low-order aberrations, this calculation is equivalent to paraxial curvature matching calculated by computing the curvature at the origin of the Zernike expansion of the Seidel formulae for defocus and astigmatism using Zernike polynomials as described by Thibos et al.7.
Mean absolute errors were calculated for each prediction method. Accuracy of the predictions for each vector was defined as the mean value of the prediction error. Precision was defined as two times the standard deviation (SD) of the prediction error7. Each prediction method was evaluated against each other. Mean absolute prediction errors and mean prediction errors were evaluated using the Wilcoxon-signed rank test with Bonferroni correction. Differences in precision were evaluated using the Levene test with Bonferroni correction. We used a similar confidence ellipse to Thibos et al. to graphically report our results7,43. Bland-Altman plots and paired t-test were conducted to study the agreement between subjective refraction and the machine learning models predictions. A p-value less than 0.05 was considered significant.
The datasets generated during and analyzed during the current study are available from the corresponding author on reasonable request.
Resnikoff, S., Pascolini, D., Mariotti, S. P. & Pokharel, G. P. Global magnitude of visual impairment caused by uncorrected refractive errors in 2004. Bull. World Health Organ. 86, 63–70 (2008).
Bühren, J., Martin, T., Kühne, A. & Kohnen, T. Correlation of aberrometry, contrast sensitivity, and subjective symptoms with quality of vision after LASIK. J Refract. Surg. 25, 559–568 (2009).
Pesudovs, K., Parker, K. E., Cheng, H. & Applegate, R. A. The precision of wavefront refraction compared to subjective refraction and autorefraction. Optom. Vis. Sci. 84, 387–392 (2007).
Bullimore, M. A., Fusaro, R. E. & Adams, C. W. The repeatability of automated and clinician refraction. Optom. Vis. Sci. 75, 617–622 (1998).
Watson, A. B. & Ahumada, A. J. Jr. Predicting visual acuity from wavefront aberrations. J. Vis 8(17), 1–19 (2008).
Cheng, X., Bradley, A. & Thibos, L. N. Predicting subjective judgment of best focus with objective image quality metrics. J. Vis. 4, 310–321 (2004).
Thibos, L. N., Hong, X., Bradley, A. & Applegate, R. A. Accuracy and precision of objective refraction from wavefront aberrations. J. Vis. 4, 329–351 (2004).
Maeda, N. Clinical applications of wavefront aberrometry - a review. Clin. Experiment. Ophthalmol 37, 118–129 (2009).
Kilintari, M., Pallikaris, A., Tsiklis, N. & Ginis, H. S. Evaluation of image quality metrics for the prediction of subjective best focus. Optom. Vis. Sci. 87, 183–189 (2010).
Applegate, R. A., Marsack, J. D., Ramos, R. & Sarver, E. J. Interaction between aberrations to improve or reduce visual performance. J. Cataract Refract. Surg. 29, 1487–1495 (2003).
Jaskulski, M., Martínez-Finkelshtein, A. & López-Gil, N. New Objective Refraction Metric Based on Sphere Fitting to the Wavefront. J. Ophthalmol. 2017, 1909348 (2017).
Thibos, L. N. Unresolved issues in the prediction of subjective refraction from wavefront aberration maps. J Refract. Surg. 20, S533–6 (2004).
Marsack, J. D., Thibos, L. N. & Applegate, R. A. Metrics of optical quality derived from wave aberrations predict visual performance. Journal of Vision 4, 8 (2004).
Bennett, J. R., Stalboerger, G. M., Hodge, D. O. & Schornack, M. M. Comparison of refractive assessment by wavefront aberrometry, autorefraction, and subjective refraction. J. Optom 8, 109–115 (2015).
Hastings, G. D., Marsack, J. D., Nguyen, L. C., Cheng, H. & Applegate, R. A. Is an objective refraction optimized using the visual Strehl ratio better than a subjective refraction? Ophthalmic Physiol. Opt. 37, 317–325 (2017).
Lakshminarayanan, V. & Fleck, A. Zernike polynomials: a guide. J. Mod. Opt. 58, 1678–1678 (2011).
Klyce, S. D., Karon, M. D. & Smolek, M. K. Advantages and disadvantages of the Zernike expansion for representing wave aberration of the normal and aberrated eye. J Refract. Surg. 20, S537–41 (2004).
Guirao, A. & Williams, D. R. A method to predict refractive errors from wave aberration data. Optom. Vis. Sci. 80, 36–42 (2003).
Applegate, R. A., Ballentine, C., Gross, H., Sarver, E. J. & Sarver, C. A. Visual acuity as a function of Zernike mode and level of root mean square error. Optom. Vis. Sci. 80, 97–105 (2003).
Gatinel, D., Malet, J. & Dumas, L. Polynomial decomposition method for ocular wavefront analysis. J. Opt. Soc. Am. 35, 2035 (2018).
Choi, J. Y. et al. Multi-categorical deep learning neural network to classify retinal images: A pilot study employing small database. Plos One 12, e0187336 (2017).
Tufail, A. et al. An observational study to assess if automated diabetic retinopathy image assessment software can replace one or more steps of manual imaging grading and to determine their cost-effectiveness. Health Technol. Assess. 20, 1–72 (2016).
Ting, D. S. W. et al. Artificial intelligence and deep learning in ophthalmology. Br. J. Ophthalmol 103, 167–175 (2019).
Asaoka, R., Murata, H., Iwase, A. & Araie, M. Detecting Preperimetric Glaucoma with Standard Automated Perimetry Using a Deep Learning Classifier. Ophthalmology 123, 1974–1980 (2016).
Zhu, H., Poostchi, A., Vernon, S. A. & Crabb, D. P. Detecting abnormality in optic nerve head images using a feature extraction analysis. Biomedical Optics Express 5, 2215 (2014).
Gupta, K. et al. A Quantitative Severity Scale for Retinopathy of Prematurity Using Deep Learning to Monitor Disease Regression After Treatment. JAMA Ophthalmology 137, 1029 (2019).
Lee, A., Taylor, P., Kalpathy-Cramer, J. & Tufail, A. Machine Learning Has Arrived! Ophthalmology 124, 1726–1728 (2017).
Varadarajan, A. V. et al. Deep Learning for Predicting Refractive Error From Retinal Fundus Images. Investigative Ophthalmology & Visual Science 59, 2861 (2018).
Libralao, G, Almeida, O, Carvalho, A. Classification of ophthalmologic images using an ensemble of classifiers. Innov Appl Artif Intell., 6–13 (2005).
Ohlendorf, A., Leube, A., Leibig, C. & Wahl, S. A machine learning approach to determine refractive errors of the eye. Invest Ophthalmol Vis Sci 58, 1136 (2017).
Reinstein, I. XGBoost, a Top Machine Learning Method on Kaggle, Explained, https://www.kdnuggets.com/2017/10/xgboost-top-machine-learning-method-kaggle-explained.html.Last accessed 2/1/2020 (2017).
Asgari, S. et al. OPD-Scan III: a repeatability and inter-device agreement study of a multifunctional device in emmetropia, ametropia, and keratoconus. International Ophthalmology 36, 697–705 (2016).
Hamer, C. A. et al. Comparison of reliability and repeatability of corneal curvature assessment with six keratometers. Clinical and Experimental Optometry 99, 583–589 (2016).
Guilbert, E. et al. Repeatability of Keratometry Measurements Obtained With Three Topographers in Keratoconic and Normal Corneas. Journal of Refractive Surgery 32, 187–192 (2016).
McGinnigle, S., Naroo, S. A. & Eperjesi, F. Evaluation of the auto-refraction function of the Nidek OPD-Scan III. Clinical and Experimental Optometry 97, 160–163 (2014).
Thibos, L. N., Applegate, R. A., Schwiegerling, J. T., Webb, R. & VSIA Standards Taskforce Members. Standards for Reporting the Optical Aberrations of Eyes. Vision Science and its Applications (2000).
Sanchis-Gimeno, J. A., Sanchez-Zuriaga, D. & Martinez-Soriano, F. White-to-white corneal diameter, pupil diameter, central corneal thickness and thinnest corneal thickness values of emmetropic subjects. Surgical and Radiologic Anatomy 34, 167–170 (2012).
Hashemi, H. et al. Distribution of Photopic Pupil Diameter in the Tehran Eye Study. Current Eye Research 34, 378–385 (2009).
Oshika, T. et al. Influence of Pupil Diameter on the Relation between Ocular Higher-Order Aberration and Contrast Sensitivity after Laser In Situ Keratomileusis. Investigative Ophthalmology & Visual Science 47, 1334 (2006).
Thibos, L. N., Wheeler, W. & Horner, D. Power vectors: an application of Fourier analysis to the description and statistical analysis of refractive error. Optom. Vis. Sci. 74, 367–375 (1997).
Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International conference on knowledge discovery and data mining, ACM, pp. 785–794 (2016).
Lundberg, S. M. & Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (eds. Guyon, I. et al.) 4765–4774 (Curran Associates, Inc., 2017).
Bland, J. M. & Altman, D. G. Regression Analysis. Lancet 327, 908–909 (1986).
D.G. is a consultant for Nidek but received no funding for this study. G.D., R.R. and J.M. declare no potential conflict of interest.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Rampat, R., Debellemanière, G., Malet, J. et al. Using Artificial Intelligence and Novel Polynomials to Predict Subjective Refraction. Sci Rep 10, 8565 (2020). https://doi.org/10.1038/s41598-020-65417-y