Fixation instability, astigmatism, and lack of stereopsis as factors impeding recovery of binocular balance in amblyopia following binocular therapy

Dichoptic therapy is a promising method for improving vision in pediatric and adult patients with amblyopia. However, a systematic understanding about changes in specific visual functions and substantial variation of effect among patients is lacking. Utilizing a novel stereoscopic augmented-reality based training program, 24 pediatric and 18 adult patients were trained for 20 h along a three-month time course with a one-month post-training follow-up for pediatric patients. Changes in stereopsis, distance and near visual acuity, and contrast sensitivity for amblyopic and fellow eyes were measured, and interocular differences were analyzed. To reveal what contributes to successful dichoptic therapy, ANCOVA models were used to analyze progress, considering clinical baseline parameters as covariates that are potential requirements for amblyopic recovery. Significant and lasting improvements have been achieved in stereoacuity, interocular near visual acuity, and interocular contrast sensitivity. Importantly, astigmatism, fixation instability, and lack of stereopsis were major limiting factors for visual acuity, stereoacuity, and contrast sensitivity recovery, respectively. The results demonstrate the feasibility of treatment-efficacy prediction in certain aspects of dichoptic amblyopia therapy. Furthermore, our findings may aid in developing personalized therapeutic protocols, capable of considering individual clinical status, to help clinicians in tailoring therapy to patient profiles for better outcome.


Methods
Participants. The tests were performed according to the tenets of the Declaration of Helsinki and were approved by the United Ethical Review Committee for Research in Psychology of Hungary (EPKEB; approval numbers 2014/5 and 2016/4). Participants were 45 amblyopic patients: 27 pediatric (mean age = 8.55 ± 2.12 years) and 18 adult (mean age = 34.09 ± 8.97 years) patients. Three children discontinued the training after 10 sessions and were removed from all analysis, leaving a total of 24 pediatric patients (mean age = 8.79 ± 2.13 years). Informed consent was obtained from all participants-and the legal guardians of pediatric patients-who underwent complete ophthalmic and orthoptic examinations at each follow-up examination. In the case of children, refractive error was measured under cycloplegia in a separate session. Inclusion criteria were as follows: ages of 5-13 years and 18-60 years for pediatric and adult patients, respectively, logMAR VA of 0.1-1.0 in the amblyopic, 0.1 or better in the dominant eye, with at least one line difference between eyes, heterotropia aligned with surgery or spectacle correction to within 10 prism diopters at near, absence of ophthalmological diseases, other than strabismic and/or anisometropic amblyopia, and the absence of neurological diseases that could affect the visual system. Our choice of including minimally amblyopic patients (i.e. one line difference between eyes; only 10% of our patient population according to the line assignment method) is contrary to common practice but it was to assist the prediction analysis with a wide range in depth of amblyopia. In addition, patients had to be past refractive adaptation (minimum of 4-6 weeks after new spectacle correction) and were not allowed to simultaneously participate in any other forms of amblyopia treatment other than the one under investigation. Thus, patching was also discontinued for the duration of the training. 
Table 1 summarizes participants' information including etiology, refraction, baseline VA, and stereoacuity. Patients were assigned into etiological groups based on the following criteria: patients with ≥ 1 D difference across the most anisometropic meridian were categorized as anisometropic 61 ; patients who had heterotropia on examination either at distance or near or had a history of strabismus were categorized as strabismic (S); while patients who were affected by both were categorized as mixed etiology (SA) 60,61 . For anisometropic patients, a distinction was made whether they had purely spherical anisometropia (A)-≥ 1 Dsph difference between eyes-or both spherical and astigmatic anisometropia (AA)-≥ 1 Dsph & ≥ 1 Dcyl difference between eyes 62 . Furthermore, astigmatism was also considered by itself: a patient was categorized as having astigmatism if the amblyopic eye had ≥ 0.75 cylinders regardless of the dominant eye's refraction status.
All participating subjects underwent binocular treatment of 20 h of game-play within a three-month period and underwent examinations to test visual functions: best-corrected visual acuity at near and distance (nVA & dVA), spatial contrast sensitivity (CS) function, and stereoacuity both before treatment (baseline visit: V BL ) and after 20 sessions (V 20h ). In addition, pediatric patients were evaluated two additional times during the training period: after 10 sessions (V 10h ), and one month after treatment was discontinued (follow-up visit: V FU ) to monitor changes more closely as occlusion therapy for children was suspended throughout the entire study (i.e. up until V FU ). Patients were instructed to play at least twice a week with a maximum break of three days. Children attended training sessions in the research institute, while half of the adults received the set-up for home-training with online progress monitoring.
Visual functions. Best-corrected distance and near visual acuity (BCVA) were measured using crowded tumbling E logMAR visual acuity charts for both children and adults. The Tumbling E Series ETDRS chart (Precision Vision Ltd., La Salle, IL, USA) was used for distance vision measurement and was viewed from 4 m using the corresponding backlit illumination cabinet at a luminance of 500 cd/m². The Tumbling E Runge Pocket Near Vision Test Card (Precision Vision Ltd., La Salle, IL, USA) was used for near vision measurement and was viewed from 40 cm. A forced-choice testing method was applied, and visual acuity was scored using the standard technique of subtracting 0.02 logMAR units for each correctly identified optotype.
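The per-optotype scoring rule can be illustrated with a minimal sketch. It assumes a chart layout that is not stated in the text: a top line of 1.0 logMAR, five optotypes per line, and 0.1 logMAR steps between lines. The function name and its parameters are hypothetical.

```python
def logmar_score(n_correct, top_line_logmar=1.0, letters_per_line=5, per_letter=0.02):
    """Letter-by-letter logMAR score (assumed chart layout, see lead-in).

    Each correctly identified optotype subtracts `per_letter` logMAR units
    from the score that corresponds to reading nothing on the chart.
    """
    if n_correct < 0:
        raise ValueError("n_correct must be non-negative")
    # Score before any optotype is read: one full line above the top line.
    start = top_line_logmar + letters_per_line * per_letter
    return round(start - per_letter * n_correct, 2)


# Reading exactly the top line (5 optotypes) yields the top-line value.
print(logmar_score(5))   # 1.0
# Reading 38 optotypes in total.
print(logmar_score(38))  # 0.34
```

Under these assumptions, each additional optotype improves the score by 0.02 logMAR, so a full line of five optotypes corresponds to one 0.1 logMAR step.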
Table 1. Patients' information. A spherical anisometropia, AA spherical and astigmatic anisometropia, S strabismus, SA mixed amblyopia, G gender, F female, M male, BCVA best-corrected visual acuity, D distance, N near, ET esotropia, XT exotropia, EP esophoria, XP exophoria, Ecc eccentric fixation, Occl occlusion therapy. *Patients who dropped out of the therapy. †Patients who had undergone strabismus corrective surgery.

Contrast sensitivity function was measured using a standard lit Sine Wave Contrast Sensitivity Chart (Stereo Optical Company Inc., Chicago, IL, USA) at a viewing distance of 3 m. The test comprised eight columns of tilted Gabor patches with decreasing contrast at 1.5, 3, 6, 12 and 18 cycles-per-degree (cpd). The patients' task was to indicate the tilt direction of a given patch (leftward, vertical, or rightward), corresponding to a three-alternative forced choice (3AFC) task. The test began with the uppermost row, corresponding to 1.5 cpd, and proceeded in decreasing contrast order until the patient made an incorrect choice. If patients gave an incorrect response, the preceding (i.e. higher) contrast value was retested. The same procedure was repeated at each spatial frequency. Contrast sensitivity was the inverse of the contrast at the last correctly identified patch. For characterizing changes in contrast sensitivity, a broad contrast sensitivity metric, the area under the log contrast sensitivity function (AULCSF), was used 63,64. This was calculated by fitting a third-order polynomial to the log contrast sensitivity versus log spatial frequency data of each subject and integrating between the lowest and highest spatial frequencies 65.

Stereopsis was measured for near distance (40 cm) using the Titmus graded circles stereo test as in the Stereo Fly Test (Stereo Optical Company Inc., Chicago, IL, USA). It shows graded circles containing nine panels with stimulus disparity spanning from 800 to 40 arc seconds. Each panel contained four contoured circles, of which only one had a crossed disparity. Subjects wearing polarized glasses were asked to identify the circle that appeared to pop out of the plane (4-AFC task). The test began with the panel containing the largest disparity and proceeded panel by panel until the patient made an incorrect choice. If there was an incorrect response, the preceding panel was retested. If subjects could not identify the stimulus with the highest disparity (800″), the stereo fly diagram was shown and pinching of the fly wings was required to achieve 3500 arc seconds. If they succeeded, this value was entered as their stereoacuity; otherwise they were assigned 10,000 arc seconds (log stereoacuity of 4), corresponding to nil stereoacuity. For statistical analysis, the log-transformed values were used.
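For the contrast sensitivity summary metric, the AULCSF computation described above (cubic fit in log-log coordinates, then integration between the lowest and highest tested spatial frequency) can be sketched as follows. This is an illustrative pure-Python version, not the authors' code; the helper names and the example sensitivities are hypothetical.

```python
import math


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c[0] + c[1]*x + ... coefficients."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum x^(i+j) and b[i] = sum y*x^i.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - s) / A[i][i]
    return coef


def aulcsf(spatial_freqs_cpd, contrast_sensitivities, degree=3):
    """Area under the fitted log CS vs. log spatial frequency curve."""
    xs = [math.log10(f) for f in spatial_freqs_cpd]
    ys = [math.log10(s) for s in contrast_sensitivities]
    coef = polyfit(xs, ys, degree)

    def antiderivative(x):
        # Term-by-term integral of the fitted polynomial.
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coef))

    return antiderivative(max(xs)) - antiderivative(min(xs))


freqs = [1.5, 3, 6, 12, 18]          # cpd, as tested on the chart
sensitivity = [40, 70, 60, 25, 8]    # hypothetical values for one eye
print(round(aulcsf(freqs, sensitivity), 3))
```

An interocular AULCSF value would then be the fellow-eye area minus the amblyopic-eye area, following the FE − AE convention given for CS in the statistical analysis.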
Macular sensitivity and fixation stability were measured in adult patients using the Expert Protocol of the Macular Integrity Assessment (MAIA; CenterVue, Padova, Italy) microperimeter. Pupillary dilatation was not used. As with conventional perimeters, the patient's task was to press a button whenever a light spot was detected. The expert protocol allowed recording macular sensitivity at 37 macular points up to 10° of central visual field using Goldmann-standardized stimuli. In addition, the microperimeter, equipped with a scanning laser ophthalmoscope with a real-time eye tracking system, provided fixation information at a sampling rate of 25 Hz. Only fixation stability obtained during the initial fixation phase of the microperimetric measurements is presented, which lasted 10 s with only the fixation point present. Due to the difficulty in obtaining microperimetric data in younger patients, a different fixation paradigm was employed for children using an IView X binocular infrared eyetracker (SensoMotoric Instruments GmbH, Teltow, Germany) with a sampling rate of 350 Hz. Each child underwent a separate fixation session prior to the binocular treatment, in which they were required to fixate a 3° cross for 10 s. This procedure was repeated 10 times; out of the reliable trials, in which children attended to the fixation point throughout the entire trial, only the one with the best overall fixation (smallest dispersion measure averaged across eyes) was used, to facilitate comparison with adult measurements. Blinks and large saccades exceeding 1.5° were removed from the raw fixation data. Then the data were demeaned: the average of all fixation coordinates was subtracted from the data for each trial to evaluate the relative dispersion. This compensates for any possible shift during the measurement. For each eye position measurement (i.e., a pair of [x,y] coordinates), the geometrical distance from the fixation point was calculated.
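The demeaning and distance steps can be sketched in a few lines; the median of the resulting distances is the stability measure used in this study. The sketch assumes that blinks and large saccades have already been removed, and that the interocular value is taken as amblyopic minus fellow eye (the direction is an assumption here). Function names are hypothetical.

```python
import math
from statistics import mean, median


def median_fixation_distance(samples):
    """Median distance of gaze samples from their own mean position.

    `samples` is a list of (x, y) gaze positions in degrees, assumed to be
    already cleaned of blinks and large saccades.
    """
    cx = mean(x for x, _ in samples)
    cy = mean(y for _, y in samples)
    # Demeaning removes any constant offset between gaze and fixation point.
    return median(math.hypot(x - cx, y - cy) for x, y in samples)


def interocular_fixation_difference(amblyopic, fellow):
    """Assumed convention: amblyopic-eye dispersion minus fellow-eye dispersion."""
    return median_fixation_distance(amblyopic) - median_fixation_distance(fellow)


# Hypothetical samples: the amblyopic eye scatters twice as widely.
fellow_eye = [(0.1, 0.1), (0.1, 0.3), (0.3, 0.1), (0.3, 0.3)]
amblyopic_eye = [(1.2, 1.2), (1.2, 1.6), (1.6, 1.2), (1.6, 1.6)]
print(interocular_fixation_difference(amblyopic_eye, fellow_eye) > 0)  # True
```

Because the mean position is subtracted first, a constant calibration offset in either eye does not change the result; only the spread of the samples matters.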
The median distance was used as a measure of fixation stability in each subject, separately for each eye, with higher distance values meaning less stable fixation 66,67. In addition, we also calculated the more standard bivariate contour ellipse area (BCEA), which, however, relies on normality assumptions and assumes that fixation data are elliptically distributed, as opposed to the assumption-free distance calculation 67. The calculation of BCEA and the results obtained with it can be found in the "Supplementary Methods and Results". To make the two types of measurements from the groups comparable, an interocular difference was calculated, and this was used as a predictor in the analyses along with the factor 'group' to account for any remaining difference between the measurement types.

Binocular treatment. The treatment comprised a playful software program titled Stereopia in a fully immersive stereo 3D augmented reality environment built on the Leonar3Do equipment (Leonar3Do International Inc., Budapest, Hungary), which enables both viewing virtual 3D images on a passive 3D display and manipulating them in real time with a handheld 3D mouse. The display was an LG D2342 3D capable monitor with 1920 × 1080 resolution and interleaved 3D presentation, viewed through LG polarized glasses and the patients' optical correction. The Stereopia software, developed in collaboration with Leopoly Ltd. (Budapest, Hungary), uses an interactive stereo 3D approach to directly train stereovision and eye-hand coordination, as both the virtual image and the real manipulating hand can be seen at the same time. The program involves a highly engaging children's video game to capture young patients' attention. The goal of the game was to capture 3D caterpillars emerging from a round fruit with a bird head sitting virtually at the tip of the 3D mouse. Players had to orient the mouse such that it was parallel with the caterpillar's motion vector and was also at the right depth (Fig. 1).
Amblyopic and fellow eyes viewed stereo counterparts of the same image on a dark background. If the patient did not see a stereoscopic percept during game play, the luminance (and, as a result, the contrast) of the fellow eye was decreased until the stereo percept was obtained. The game had four levels, while game speed was manipulated throughout the game based on performance using a 2 up/1 down staircase procedure. Luminance of the fellow eye and depth threshold were also assessed at the start of each session using built-in tests measuring suppression and stereoacuity, with the gaming parameters set accordingly. Children's attention and motivation to play the game were high throughout the training. Most adults also enjoyed the game and found it challenging. Their main motivation, however, was the ability to finally do something to counter amblyopia.

Statistical analysis. Monocular data obtained from each eye were used to calculate interocular values by subtracting the smaller value from the higher one (i.e. amblyopic eye (AE) − fellow eye (FE) for acuity and FE − AE for CS). Thus, higher interocular values indicated a bigger difference between eyes. Shapiro-Wilk's test was used to verify normal distribution. In the case of paired comparisons of post- vs. pre-treatment and post-treatment vs. follow-up values, the distributions of the differences were verified. If they met the criteria of a normal distribution at the 1% significance level, post- vs. pre-treatment data were analyzed via Student's t-test. Otherwise, a nonparametric test was used (Wilcoxon matched pairs test, as reported in the Results).

To analyze the treatment time course effects, data from the baseline visit and visits following 10 and 20 h of treatment (V BL , V 10h , V 20h ) were entered into general linear mixed models, with 'subject' as a random factor, 'time' as a continuous [0, 1, 2] fixed factor, and their interaction to account for individual differences in the slope of the improvement in the model.
This model, when 'time' was significant, indicated a gradual linear change in the visual functions as an effect of the treatment, proportional to treatment time. Stereoacuity, interocular distance (dVA) and near (nVA) visual acuity, and interocular contrast sensitivity were analyzed. For each measured visual function, the distribution of the pooled data from all three visits was evaluated in terms of normality and, where normality was not met, transformed towards a normal distribution using a square-root transformation. The transformation was performed by shifting the minimum of the distribution close to zero, where applicable, and taking the square root of each value. In the case of interocular near logMAR VA values, a second square-root transformation was necessary.
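A minimal sketch of the shift-and-square-root transformation described above is given below. The size of the residual offset after shifting ("close to zero") is not stated in the text, so the `offset` parameter is an assumption, as is the function name.

```python
import math


def shifted_sqrt(values, offset=0.0, times=1):
    """Shift the minimum of `values` to `offset`, then take square roots.

    `times=2` applies the square root twice, as described for the
    interocular near logMAR VA values.
    """
    shift = min(values) - offset
    out = [v - shift for v in values]
    for _ in range(times):
        out = [math.sqrt(v) for v in out]
    return out


# Hypothetical pooled interocular values (can be negative before shifting).
pooled = [-0.1, 0.0, 0.3, 0.8]
print(shifted_sqrt(pooled))
print(shifted_sqrt(pooled, times=2))
```

Shifting first guarantees that all values are non-negative, so the square root is always defined.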
For modelling treatment outcome, general linear models (GLM) were used aided by multiple regression analyses to identify the important predictor variables. Change-from-baseline (CFB) stereoacuity and interocular distance (dVA), near visual acuity (nVA) and contrast sensitivity (CS) measures were used as dependent variables. (Monocular amblyopic measures were also analyzed, the results of which can be found in the "Supplementary Results".) For possible predictors, the following variables were used: baseline measurements of stereoacuity and interocular values of dVA, nVA, and CS log transformed for normality, interocular fixation stability, and cylindrical diopter as continuous variables (i.e. covariates), and group (children vs. adults), age-group (< 9 years, 10-19 years, 20-39 years, or > 40 years), etiology (A, AA, S, or SA), heterotropia (present vs. not at V BL ), sightedness (myopic vs. hyperopic), presence of astigmatism (≥ 0.75 Dcyl in the AE), orientation of astigmatic axis (WTR, ATR, OBL or none), past occlusion (ever occluded vs. not occluded), the presence of stereopsis at baseline, and post-treatment poor dVA (≥ 0.4 logMAR) in the amblyopic eye (the latter only in the case of contrast sensitivity) as categorical factors. As a first step, all of these predictors were entered into a multiple regression analysis with a forward stepwise approach separately for each dependent variable, which automatically eliminated the predictors that did not influence the respective dependent variable and kept only those that explained significant variance in the dependent variable. Next, these automatically selected predictors were entered into a GLM analysis (ANCOVA), where also interactions were considered between predictors based on visual inspection of the data and common sense. 
For each dependent variable, several models were created in an iterative manner, aiming to maximize model fit (adjusted R²) while minimizing residual error and the number of variables used to explain the data. The most economical model was chosen as the final model.
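The adjusted R² used as the model-selection criterion penalizes the number of predictors. A short sketch with the standard formula shows that the multiple R values, sample sizes, and numerator degrees of freedom reported in the Results reproduce the reported adjusted R² values to two decimals. The function name is hypothetical, and the inputs are the rounded values from the text.

```python
def adjusted_r2(r_multiple, n_obs, n_params):
    """Adjusted R^2 from multiple R, sample size, and number of model terms.

    n_params is the numerator degrees of freedom of the model F-test, so that
    n_obs - n_params - 1 equals the denominator degrees of freedom.
    """
    r2 = r_multiple ** 2
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_params - 1)


# Stereoacuity model: R = 0.90, F(3,28), 32 patients.
print(round(adjusted_r2(0.90, 32, 3), 2))   # 0.79
# Interocular dVA model: R = 0.91, F(12,25), 38 patients.
print(round(adjusted_r2(0.91, 38, 12), 2))  # 0.75
# Interocular CS model: R = 0.93, F(10,27), 38 patients.
print(round(adjusted_r2(0.93, 38, 10), 2))  # 0.81
```

The dVA model illustrates the penalty: its raw R² (about 0.83) is higher than that of the stereoacuity model (0.81), but with 12 model terms its adjusted value is lower.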

Results
Compliance was high in both groups. By the end of the three months, 24 (89%) children had completed ≥ 75% of the required training, while 3 (11%) dropped out of the training after 10 sessions and their results could not be analyzed. Out of the 24 children who followed through with the training, 19 (79%) completed all 20 sessions, and the remaining 5 (21%) completed ≥ 90% of the required sessions. In adults, half of whose training was only remotely supervised, all 18 (100%) completed ≥ 75% of the required training sessions within three months, 15 (83%) of them completing all 20 sessions.

Figure 1. The apple, the caterpillars (target) and the bird head (targeting object), which sits virtually at the tip of the 3D mouse, are perceived in 3D in front of the background, while the 3D mouse is held in hand. All 3D objects are part of an augmented reality and are thus intermixed with real objects (i.e. the patient's hand), the percept adjusting upon change in perspective.

Effects of the training on visual functions. In the pediatric population, significant improvement was found in stereoacuity (Fig. 2a; Wilcoxon matched pairs test: Z = 3.66, p = 0.0002), while the improvements were significantly stronger for the amblyopic compared with the fellow eye in the cases of near VA (nVA) and contrast sensitivity (CS), as their interocular difference showed a significant decrease as a result of the training (Fig. 2b,c; Z = 3.02, p = 0.0025 and paired t-test t (23) = 3.64, p = 0.0014, respectively). In the case of distance VA (dVA), however, the decrease was less pronounced and failed to reach the specified Bonferroni-corrected significance threshold (Fig. 2d; p < 0.05/3 = 0.016; t (23) = 2.43, p = 0.023).
Looking at the adult population, the change was significant in the cases of stereoacuity and nVA (Z = 3.08, p = 0.0021 and t (17) = − 3.54, p = 0.0025, respectively), but not in the cases of dVA and CS (t (17) = 1.32, p = 0.20 and t (17) = 1.76, p = 0.096, respectively), because of similar improvements in both amblyopic and fellow eyes. When compared directly, however, the two groups similarly benefited from the treatment as the change from baseline (CFB) in stereoacuity (two-samples t-test: t (40)

The pediatric patients were further divided into subgroups based on the tendency of change in their interocular values over time: patients improving, worsening, and stagnating, based on the exponent of their exponential fit being negative, positive, and close to zero, respectively (Figs. 3c, 4c, 5c, 6c). Approximately 60% of children showed improvements, ~ 30% did not change during the therapy, and ~ 10% declined. Nevertheless, the latter few mostly stabilized at their baseline value at V FU .
The stability of improvements was also investigated with results obtained one month after the cessation of the training, during which children were instructed to refrain from occlusion and thus did not receive any treatment. Three patients were lost to follow-up, whereas one patient resumed occlusion therapy in this period; hence, her data were excluded. Evaluating the remaining subjects, we observed stable improvements in visual functions from V 20h to V FU in the cases of stereoacuity (Wilcoxon matched pairs test: Z = 1.43, p = 0.15), nVA and CS (Z = 1.21, p = 0.22, t (19) = − 0.29, p = 0.77, respectively). Interocular dVA also did not change in the follow-up period (t (19)

Stereorecovery is only possible with stable fixation. The best prediction model for stereoacuity changes had a model fit of R multiple = 0.90 and adjusted R² = 0.79 (F (3,28) = 40.27, p < 0.0001), explaining 79% of the variance in the data. The final and most economical model included 'baseline stereoacuity', 'baseline relative fixation stability', and the interaction between these variables as predictors. Data from patients whose improvements fell outside the measurement range (40″-3500″) were excluded, because their improvements, if any, could not be quantified. Four additional patients did not have fixation data; therefore, 32 patients were included in this prediction model. Baseline stereoacuity had the strongest effect on stereoacuity improvement (main effect: F (1,28) = 120.48, p < 0.0001): the worse the stereoacuity was for a given patient, the more that patient could improve, indicating that the training strongly modulated stereoacuity (Fig. 3d). Importantly, relative baseline fixation stability had a significant effect on therapy outcome (main effect: F (1,28) = 7.87, p = 0.0090) as well as a significant interaction with baseline stereoacuity (F (1,28) = 9.67, p = 0.0043): patients with better interocular fixation stability (i.e. smaller difference in fixation between the eyes) had a higher potential for stereoacuity improvement. However, as the interaction indicated, this effect was dependent on baseline stereoacuity. In fact, interocular fixation stability was a strong predictor for patients with only coarse or nil stereoacuity (≥ 3500″). The same pattern, albeit with a slightly lower goodness of fit, was present if fixation stability was calculated using the more conventional method of bivariate contour ellipse area (BCEA; see "Supplementary Results"). This was further supported by the following analyses: (1) partial correlation between stereoacuity changes and interocular baseline fixation stability, controlling for the effect of baseline stereoacuity, showed a highly significant effect for the patient group with coarse or nil stereoacuity (r partial = 0.94, p = 0.002, N = 8), but failed entirely when this group was excluded (r partial = 0.075, p = 0.73, N = 24); (2) there was a significant positive correlation between fixation stability and stereoacuity improvements for the coarse or nil stereoacuity group (Fig. 3e; Spearman rho = 0.95, p < 0.0001, N = 8); (3) out of the three patients excluded from this analysis because of non-measurable stereoacuity before and after the intervention, one with good fixation stability starting from suppression managed to achieve fusion by the end of the training, while the other two with poor fixation stability (within the worst 10%) did not progress from fusion to stereovision. Thus, the overall conclusion is that, contrary to common expectation, poor stereopsis, or lack thereof, is not at all a contraindication for binocular treatment approaches. Rather, for patients lacking stereovision (outside of the Titmus stereo test range), a relatively stable fixation in the amblyopic eye is required for stereoacuity improvements using the 3D therapy reported here.
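The partial correlation used in analysis (1) asks how strongly two variables are related once a third is held constant. A minimal sketch using the standard first-order formula on Pearson correlations follows; whether the authors used Pearson or rank-based partial correlations is not stated, so this choice is an assumption, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy data: x and y correlate (r = 0.5) only because both contain z.
z = [1, 1, -1, -1]
x = [2, 0, 0, -2]   # z plus a component unrelated to y
y = [2, 0, -2, 0]   # z plus a component unrelated to x
print(round(pearson(x, y), 2))               # 0.5
print(abs(partial_corr(x, y, z)) < 1e-12)    # True
```

In the study's terms, x would be the stereoacuity change, y the interocular baseline fixation stability, and z the baseline stereoacuity being controlled for.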
Even though the fixation measurements in the two groups were different, the inclusion of either 'group' or 'age-group' (either case: ps > 0.6) did not change the fit of the model; thus, possible differences between groups were not accounted for in the above model. Nevertheless, we have included separate analyses for each group. Despite the low number of data points, there was a similar pattern in the case of adult patients, where there was a significant interaction between baseline stereoacuity and relative fixation stability (F (1,9) = 5.65, p = 0.041) as well as a significant main effect of baseline stereoacuity (F (1,28) = 112.51, p < 0.0001). Importantly, this model had a very high adjusted R² value (0.92), demonstrating that the interplay between baseline stereoacuity and relative fixation stability is enough to explain the observed gain in stereoacuity in the case of adults. On the other hand, the interaction was not significant in the case of children (F (1,15) = 1.47, p = 0.24), even though the same pattern of correlation was observable for the three children with coarse or nil stereoacuity (Fig. 3e). The discrepancy between groups could, however, result from the lower quality of the children's fixation data or from the fact that there were too few children with coarse or nil stereoacuity in our dataset.
Visual acuity improvement is limited by astigmatism in children. Interocular distance visual acuity. The final prediction model fit for dVA, with R multiple = 0.91 and adjusted R² = 0.75 (F (12,25) = 10.32, p < 0.0001) explaining 75% of data variance, included the full factorial model of {'baseline interocular distance visual acuity (dVA)', 'group', and 'presence of astigmatism'}, and 'baseline interocular near visual acuity (nVA)', 'etiology', and 'past occlusion' as predictors. Four patients were classified as outliers (i.e., standard residual ≥ 2 SD) and were removed; therefore, the final prediction model included 38 patients. As expected, baseline dVA had a significant, although not the most pronounced, effect on therapy outcome (main effect: F (1,25) = 23.64, p < 0.0001). This could not be attributed to a simple tendency of patients with higher interocular difference in dVA being able to improve more, as there was no Spearman correlation between baseline dVA and dVA improvements (rho (N=38) = 0.009, p = 0.96). Importantly, this main effect was significantly modified by the predictors 'group' and 'presence of astigmatism', which indicated that astigmatism had a significant impact on interocular dVA improvements overall. The effect was also different between groups and in how baseline dVA affected improvement across groups (main effect of 'presence of astigmatism': F (1,25) . The presence of astigmatism was a significant limiting factor for dVA improvements only in the pediatric group (Fig. 4d). Non-astigmatic children showed progressively greater improvements as a function of baseline dVA (i.e. the worse the baseline dVA, the greater the dVA improvement; rho (N=8) = 0.93, p = 0.0008), while, surprisingly, there was an opposite, i.e. negative, correlation for astigmatic children (rho (N=13) = − 0.84, p = 0.0003).
Pediatric patients showing dVA improvements were non-astigmatic children with at least three dVA lines difference between the eyes and astigmatic children with two or fewer lines difference in dVA between the eyes at the baseline examination. On the other hand, adults did not improve regardless of their baseline dVA or whether they had astigmatism in their amblyopic eye (Fig. 4e). Baseline nVA had a significant effect on the amount of dVA improvement (main effect F (1,25) = 10.05, p = 0.0040). Interestingly, nVA showed an opposite trend: the lower the interocular nVA, the higher the dVA gain provided by the training. This was confirmed by partial correlations, in which either nVA or dVA was controlled. There was a significant positive partial correlation between baseline dVA and dVA gain (r partial (N=38) = 0.48, p = 0.002) and, on the contrary, there was a significant negative correlation between baseline nVA and dVA gain (r partial (N=38) = − 0.53, p = 0.001). The effect of baseline nVA on the training outcomes was similar between groups regardless of astigmatism; hence these interactions were not accounted for in the final model. Thus, dVA gain, the main target of amblyopic treatments, was dependent on both dVA and nVA baseline values in an opposite manner, suggesting that moderate cases of amblyopia, considering interocular nVA and dVA, may be prognostic

modified by predictors 'group' and 'presence of astigmatism', meaning that astigmatism had a significant effect overall, between groups and on how baseline nVA affected improvements (main effect of 'presence of astigmatism' F (1,27) = 5.12, p = 0.032; 'group × presence of astigmatism': F (1,27) = 5.82, p = 0.022; and 'baseline nVA' × 'presence of astigmatism': F (1,27) = 9.78, p = 0.0042, while the main effect of 'group', the 'group × baseline nVA' interaction, and the three-way interaction between them were not significant, all Fs ≤ 0.86, ps ≥ 0.36).
In fact, astigmatism was a strong limiting factor for nVA improvements only in children (Fig. 5d): the majority of children with astigmatism either failed to improve or improved much less than expected based on their baseline interocular nVA (post-hoc astigmatic vs. non-astigmatic children p = 0.039). On the other hand, astigmatism had no effect on nVA improvements in adult patients ( Fig. 5e; post-hoc astigmatic vs. non-astigmatic adults p = 0.63). This was confirmed by Spearman correlations: there was a significant positive correlation between baseline nVA and nVA gain for the adults (rho (N=16) = 0.78, p = 0.0004), while there was no significant correlation for the children (rho (N=23) = 0.33, p = 0.12). Importantly, however, the latter was explained by the presence of astigmatism: when astigmatic children were excluded from correlation, it became significant (rho (N=8) = 0.91, p = 0.002). The few astigmatic children, who were the exception to the above 'rule' , almost exclusively had pure astigmatism without spherical refractive error. Finally, two additional factors played significant roles in explaining the amount of interocular nVA improvements: sightedness and etiology. Hypermetropia (far-sightedness) was predictive of a more robust nVA improvement especially in the adult population. Far-sighted patients, whose vision could have been compromised for near vision before they received correction, improved significantly more than near-sighted (i.e. myopic) patients (F (1,27) = 17.42, p = 0.0003), even though there was no baseline difference between the groups either in amblyopic or interocular nVA. This emphasizes the potential use of binocular approaches to treat adult far-sighted patients, who are overaged for standard monocular (patching) treatments. Etiology also had a significant effect on nVA improvement (F (3,27) = 6.49, p = 0.0019). 
Subjects, who had both spherical and astigmatic anisometropia (AA) gained significantly less compared with all other etiology groups (post-hoc p = 0.0003, p = 0.0005, and p = 0.093 for AA vs. A, AA vs. SA, and AA vs. S, respectively).
Contrast sensitivity deficits can be treated above a critical visual acuity in the presence of stereopsis. Interocular contrast sensitivity (CS) changes were reliably predicted with a model fit of R multiple = 0.93 and adjusted R 2 = 0.81 (F (10,27) = 16.47, p < 0.0001) explaining 81% of the variance. The final and most economical model included 'baseline interocular CS' , 'age-group' (i.e., < 9 years, 10-19 years, 20-39 years, and > 40 years), 'baseline interocular CS × age-group' interaction, 'measurable stereopsis at baseline' , 'post-treatment poor dVA (≥ 0.4 logMAR) in the amblyopic eye' , 'measurable stereopsis × poor amblyopic dVA' interaction as predictors. The final prediction model included 38 patients, as four patients were classified as outliers (i.e. standard residual ≥ 2 SD) and were removed. Baseline CS had the strongest effect on CS improvements (main effect: F (1,27) = 79.03, p < 0.0001): the worse the CS, the better the CS improvement (Fig. 6d). Importantly, age-group also had a significant effect on the therapeutic outcomes (main effect: F (3,27) = 3.89, p = 0.020) and a significant interaction with baseline CS (F (3,27) = 5.71, p = 0.0037): the 20-39 years age group showed the least overall improvement, which was significantly different from the larger improvement of the < 10 years child age group (post-hoc p = 0.038). Moreover, as the interaction indicated, the correlation between baseline CS and CS improvements was age-group dependent. In the 20-39 years age group, the individual CS improvement was independent of baseline CS. On the other hand, there were consistent dependencies in the other groups. 
This was corroborated by Spearman correlations: there were significant positive correlations between baseline CS and CS improvements in the < 10 years, 10-19 years, and > 40 years groups (rho (N=11) = 0.65, p = 0.032, rho (N=12) = 0.82, p = 0.0012, and rho (N=5) = 1.00, p = 0.017, respectively), while the correlation was not significant in the 20-39 years age group (rho (N=10) = −0.15, p = 0.67). This dependency was most notable in the 10-19 years age group, in which data points fell on a relatively straight line, indicating that baseline CS deficits were close to being completely resolved in patients who are generally regarded as too old to be treated.
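The fit statistics reported for the CS model can be cross-checked with a short calculation. The sketch below assumes that the 10 numerator degrees of freedom in F (10,27) correspond to the number of model terms besides the intercept and that 38 patients entered the model; the function name is hypothetical, and the standard adjusted R² formula is used, which may differ in detail from the authors' software.

```python
def adjusted_r_squared(multiple_r, n_obs, n_terms):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    r_squared = multiple_r ** 2
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_terms - 1)


# Values taken from the text: multiple R = 0.93, 38 patients in the model.
# Assumption: 10 model terms, read off the numerator df of F (10,27).
n_patients = 38
n_terms = 10

residual_df = n_patients - n_terms - 1
adj_r2 = adjusted_r_squared(0.93, n_patients, n_terms)

print(f"residual df = {residual_df}")      # matches the 27 in F (10,27)
print(f"adjusted R^2 = {adj_r2:.2f}")      # matches the reported 0.81
```

Under these assumptions, the residual degrees of freedom and the adjusted R² agree with the reported values, which suggests that the sample size, the number of predictors, and the fit statistics are mutually consistent.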
Importantly, however, two factors limiting CS gain were found: poor amblyopic dVA at the end of the treatment, and non-measurable stereopsis at the baseline examination. The criterion for poor dVA was ≥ 0.4 logMAR (≤ 20/50 Snellen acuity), the Snellen equivalent corresponding to the highest contrast stimulus with a 12 cpd grating, which was the lowest spatial frequency at which most of the patients had worse CS in the amblyopic compared with the fellow eye. Thus, the amblyopic dVA achieved was converted into a binary predictor indicating whether a dVA better than the limiting 0.4 logMAR had been achieved. While this factor does not have any predictive value in the classical sense, as it can only be obtained as an outcome of the training, the inability to resolve the spatial frequencies demonstrating amblyopic CS deficit does explain the lack of improvement. Significant effects for both predictors (F (1,27) = 6.92, p = 0.014 and F (1,27) = 16.28, p = 0.0004 for 'poor dVA' and 'measurable stereopsis', respectively), and, more importantly, a significant interaction between them (F (1,27) = 16.38, p = 0.0004) were observed. While both poor post-treatment amblyopic dVA and non-measurable stereopsis hindered CS improvement, it was the combination of the two factors that prevented CS gain (Fig. 6e). Thus, measurable stereopsis and relatively preserved amblyopic dVA were required for recovering amblyopic CS regardless of age. In fact, most patients in this category showed improved amblyopic CS: 20 out of 24 patients achieved CS within the normal range defined for the SWCT test and could therefore be regarded as fully treated.
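The dVA criterion and the two binary predictors described above can be made concrete with a small sketch. It uses the standard relation between logMAR and the Snellen fraction (denominator = 20 × 10^logMAR); the function names and the way the two flags are combined are illustrative assumptions, not the authors' analysis code.

```python
def snellen_denominator(logmar, numerator=20):
    """Snellen denominator for a logMAR value: numerator * 10 ** logMAR."""
    return numerator * 10 ** logmar


def is_poor_dva(post_treatment_logmar, threshold=0.4):
    """Binary predictor: True if amblyopic dVA is 0.4 logMAR or worse."""
    return post_treatment_logmar >= threshold


def both_limiting_factors(post_treatment_logmar, stereopsis_measurable):
    """True when poor dVA coincides with non-measurable baseline stereopsis,
    the combination that the text reports as preventing CS gain."""
    return is_poor_dva(post_treatment_logmar) and not stereopsis_measurable


# 0.4 logMAR corresponds to roughly 20/50 Snellen acuity.
denominator = snellen_denominator(0.4)
print(f"0.4 logMAR is about 20/{round(denominator)}")

# Share of patients without limiting factors who reached normal CS (20 of 24).
share_normal_cs = 100 * 20 / 24
print(f"{share_normal_cs:.0f}% reached the normal CS range")
```

Note that larger logMAR values denote worse acuity, so the "≥ 0.4 logMAR" criterion and the "≤ 20/50" Snellen criterion describe the same set of patients.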

Discussion
The results showed that the 3D augmented reality interactive game training reported here was able to improve visual functions in both pediatric and adult patients with amblyopia, which could be predicted considering baseline clinical parameters. The training successfully improved stereoacuity in amblyopic patients. Moreover, significant monocular improvements in near visual acuity and contrast sensitivity were observed even if amblyopic changes were corrected for possible learning effects (i.e. normalized to the changes observed in the dominant eye). Importantly, critical factors strongly limited or even prevented improvements: (1) astigmatism in children limited visual acuity improvement both at near and distance, (2) comparable fixation stability between the eyes was necessary for stereopsis recovery in stereoblind patients, and finally, (3) stereopsis and a critical minimal visual acuity were required for contrast sensitivity improvements. This is the first report, to our knowledge, demonstrating patterns of predictive values in clinical parameters other than the isolated baseline values to estimate treatment effectiveness. Based on these results, we propose a unified treatment protocol.
In the present investigation, the binocular-training-induced interocular change in near visual acuity (nVA) was assessed and proved to be more sensitive to treatment modulation compared with distance visual acuity (dVA), which did not show substantial improvement after interocular normalization. This is consistent with pediatric clinical experience that improvement in nVA precedes that of dVA and underlines the importance of measuring nVA when assessing the effectiveness of occlusion therapy in children. However, nVA improvements were surprisingly more evident in adult compared with pediatric patients, likely because of the presence of astigmatism in 60% of the children, which was observed to be a limiting factor for nVA improvements in the pediatric group. In fact, astigmatism in children strongly limited VA improvements in general: only astigmatic patients with very mild amblyopia showed VA improvements. Interestingly, while astigmatism in the amblyopic eye had little effect on adults' nVA improvement, patients with both spherical and astigmatic anisometropia had less chance of improving regardless of age. These results are in line with findings from Hussein et al. 49 , who reported clinical factors limiting the success of occlusion therapy in a retrospective study and also confirmed that children with astigmatism (≥ 1.5D) were unlikely to achieve the desired outcomes. Most studies evaluating the relationship between astigmatism and amblyopia were conducted retrospectively, focusing on the orientation of the astigmatic meridian 68-70 . In fact, astigmatic children are at risk for visual dysfunctions 71 . For instance, young infants at about 6 months of age show lower grating visual acuity with proper astigmatic correction compared with non-astigmatic children 72 . Moreover, mild to moderate amblyopic children with astigmatism have significantly worse stereoacuity compared with hyperopic or myopic patients without astigmatism 73 . 
Even though large astigmatic refractive errors, especially in anisometropic patients, can be challenging to measure reliably in children 74 , it is crucial for astigmatic children to receive the best correction, as this alone could improve visual functions over time 75 . Unfortunately, not enough emphasis is given to astigmatism when prescribing optical correction in children, which may contribute to its limiting effect on visual improvements in amblyopic children.
Foveal fixation depends on a diversity of voluntary and involuntary eye movements, and eccentric (extrafoveal) fixation is closely associated with long-term visual acuity decrease after amblyopic treatment 76 . It has been previously established that (1) poor fixation stability is associated with poor monocular and binocular functions in amblyopic patients 16 ; (2) the treatment duration is shorter for patients with better fixational abilities of the amblyopic eye in occlusion therapy 56-58 ; and (3) fixation stability can be improved in childhood 80 and even in adult patients over the critical period of development 81-85 . This is the first report, to our knowledge, showing that the more similar the fixation stability between the eyes, the more likely a patient classified as stereoblind according to the Titmus test is to develop a certain level of stereopsis as a result of binocular treatment, regardless of etiology or severity of amblyopia and, more importantly, regardless of age. Supporting this, fixational eye movement abnormalities, i.e. fusion maldevelopment nystagmus syndrome (FMNS) and nystagmus without FMNS, have also been found to prevent and limit stereopsis improvements, respectively 57 , with no difference in the stereoacuity gain among amblyopic etiologies. Taking previous reports and the present data together, future investigations may consider proper fixation stability as a clinical requirement for visual improvements. The present results might also be relevant for visual scientists and clinicians planning and designing study protocols for upcoming clinical trials to treat amblyopia 26 . Visual acuity is still the standard clinical parameter for characterizing amblyopic status and for the management of several diseases affecting the visual system, even though it has been extensively reported that luminance contrast sensitivity (CS) is highly related to the quality of vision 86 . 
Moreover, CS provides a more complete picture of spatial vision compared to visual acuity, besides its potential to measure binocular balance in amblyopic patients 3,64,87 . In line with our results, significant contrast sensitivity improvements have been reported in adult amblyopic eyes following dichoptic training 44 and perceptual learning 45,46 . Here we have further demonstrated that CS improvements can be achieved regardless of age or amblyopic etiologies, but only if at least a coarse stereovision or a minimum amblyopic visual acuity is present (≤ 0.4 logMAR), to allow for reading the higher frequencies of the SWCT chart. Importantly, our results also show that in the absence of the above limiting factors, almost complete CS recovery is possible with the binocular approach reported here. In teenage patients (age 10-19 years), the slope of the regression line between baseline interocular CS and change in interocular CS was close to −1 (β1 = −0.79), resulting in interocular CS difference of less than 10% of that of the dominant eye. As a matter of fact, 83% of patients without limiting factors for CS had achieved amblyopic CS that was in the normal range.
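To make the meaning of a slope close to −1 explicit, the following sketch models the change in interocular CS as a linear function of the baseline interocular CS difference. It assumes a sign convention in which a positive difference means a worse amblyopic eye and a negative change means improvement; the intercept is not reported in this section, so it is treated as a hypothetical parameter set to zero, and the function names are illustrative only.

```python
def predicted_change(baseline_diff, slope=-0.79, intercept=0.0):
    """Linear model: change in interocular CS = intercept + slope * baseline.

    slope = -0.79 is the value reported for the 10-19 years group;
    intercept = 0.0 is an assumption made for illustration.
    """
    return intercept + slope * baseline_diff


def predicted_residual(baseline_diff, slope=-0.79, intercept=0.0):
    """Interocular CS difference remaining after training under the model."""
    return baseline_diff + predicted_change(baseline_diff, slope, intercept)


# With a slope of exactly -1 and zero intercept, any baseline difference
# would be fully eliminated; with -0.79 about 21% of it would remain.
unit_baseline = 1.0  # arbitrary unit, not a measured value
full_recovery = predicted_residual(unit_baseline, slope=-1.0)
reported_slope = predicted_residual(unit_baseline)

print(f"residual at slope -1.00: {full_recovery:.2f}")
print(f"residual at slope -0.79: {reported_slope:.2f}")
```

Under this reading, the closer the slope is to −1, the larger the fraction of the initial interocular deficit that is removed independently of its size, which is why the slope is a convenient summary of how complete the recovery was.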
The present study utilized a new binocular method focused on stereo image presentation in an immersive 3D AR environment, an approach that only a handful of studies have pioneered so far 39,43 . Thus, it was conducted as a critical first step in exploring and proving its potential on a patient population moderate in size. This inherently imposes some limitations on our study. First, it is unclear whether our findings would be completely generalizable to longitudinal studies in a larger group of patients or to dichoptic treatment approaches that have already undergone clinical investigation. Second, the study design did not include a control (non-treated/occluded) group of amblyopic patients, which could have been used to evaluate the effectiveness of the present training. Nevertheless, by calculating interocular values in the present study, amblyopic improvements were normalized to that of the dominant eye, considering the decrease in interocular values over time. Such a decrease is not expected to arise from a simple learning effect or the test-retest variability of the conducted tests, as ETDRS VA test-retest variability is comparable for amblyopic, fellow, and control eyes 88 . However, the evaluation of stereoacuity improvements could have been influenced by learning or test-retest variability. Therefore, these findings require further confirmation with a larger group of patients including treated and control groups, especially given that randomized clinical trials have so far failed to prove dichoptic treatments superior to control treatments 26,60,89-91 .
Taken together, the present results emphasize the benefits of 3D binocular training in the management of amblyopia: significant, lasting improvements of monocular and binocular vision in both pediatric and adult patients, supporting its efficacy even after the critical period of visual development. Moreover, the findings shed light on specific clinical parameters that may help to anticipate the magnitude of visual improvements induced by binocular treatments, which may contribute to a better understanding of monocular/binocular interactions following binocular training. Figure 7 shows a meaningful integration of the different existing therapeutic approaches into a combined treatment protocol for amblyopia based on our results. In practice, best refractive correction is provided with attention to the proper correction of the astigmatism 92 , especially for young children, as it has been demonstrated here and elsewhere 49 that larger astigmatism can be a serious limiting factor in visual acuity improvement during occlusion and dichoptic therapies. After visual acuity improvement gained from optical correction has plateaued, occlusion therapy has its place if amblyopic visual acuity is still worse than 0.4 logMAR, or the child is too young to be treated using dichoptic games. Meanwhile, if significant fixation instability of the amblyopic eye is observed, the treatment should target fixation stability balance between the eyes, which was shown here to be required for stereovision recovery in the case of stereoblind patients. After the best possible fixation stability is achieved, a treatment scheme can commence involving both binocular (i.e. stereo) and dichoptic (2D complementary images) stimulation in an interactive and engaging format to provide better stereoacuity or at least coarse stereopsis and robust visual acuity improvement, respectively. 
When amblyopic visual acuity has achieved a mild-moderate range and stable stereovision is measured, contrast sensitivity improvement or even normalization could be expected. Lastly, with binocular functions and interocular balance restored, full and lasting visual acuity recovery in amblyopia may be attained with further treatment 40 . This protocol may help clinicians to recommend therapeutic solutions for a personalized and more reliable visual restoration in amblyopia.