Introduction

Contextual modulation is a general phenomenon that relates to changes in the perceived appearance of targets or objects when they are presented within the context of other targets or objects. Some well-known types of contextual modulations are visual masking (including center-surround), crowding, grouping and several types of contextual illusions. However, most research interest has focused on masking (spatial and spatio-temporal) and crowding; both phenomena refer to reduced performance on a target stimulus when the mask stimulus is presented within a small spatio-temporal window1,2,3,4,5,6,7,8.

Perceptual learning has a major influence on our understanding of the development and plasticity of visual processes such as masking and crowding. It is considered to be highly specific to the particular characteristics of the stimuli used during training (e.g., the location in the visual field and orientation), which is thought to reflect encoding in early visual areas9,10,11,12,13. However, recent studies show that learning and transfer may depend on several training properties such as the task, attention, difficulty and the paradigm's manipulations such as training on two tasks simultaneously, the sequence of stimulus presentation (roving vs. fixed stimuli), among others11,14,15,16,17,18,19,20,21,22,23,24,25. Some insight into the mechanism underlying learning comes from lateral masking experiments26. In such experiments, when participants are trained to detect a low-contrast Gabor target embedded between two similar Gabor flankers, higher sensitivity to the target in the presence of flankers compared with that of the target alone (termed the facilitation effect) and an expansion of the target-flanker distance that induced facilitation are observed. These effects are found only when the target and flankers have the same orientation and are positioned along a collinear direction27,28,29. The lateral facilitation effect is largely explained and modeled in terms of spatial processing such as a) the propagation of lateral excitation from the flankers through the horizontal connections in the primary visual cortex1,26,27,29,30,31, b) contrast integration of the flankers and the target within large simple32, or c) complex33 receptive fields. Quantitative models suggest that the flanker effects are multiplicative terms applied to both the excitatory and inhibitory terms of a divisive inhibition response function34,35. Top-down modulation of the target response was also considered36,37,38. 
Another study shows that similar training shortens the processing time needed for target detection39. More specifically, it suggests that practice aimed at improving the spatial and temporal lateral interactions increases the efficacy of the interactions between neighboring neurons and improves the processing speed; hence, it enables the practice-based improvement to generalize to other untrained visual functions16,40.

Studies have shown that training effects on lateral interactions can be generalized to non-trained visual functions such as visual acuity41,42, contrast sensitivity16,41,42,43, contrast discrimination40 and reading speed40. However, most of these results were obtained for impaired vision following abnormal visual development such as amblyopia41,42,43, developmental visual form agnosia44, or in the case of blurred retinal inputs in the aging eye (presbyopia)40. It was shown that the extent of the improvement is proportional to the initial level of the visual function42,45. Thus, these remarkable improvements may be found only in cases of impaired visual functions that lead to initial sub-normal vision. In addition, the generalization of these effects might critically depend on the initial (pre-training) sub-normal vision. A similar procedure, when applied to young participants with normal vision, resulted in reduced backward masking effects, shortened reaction times and shortened latencies of an EEG component that is thought to reflect visual integration39.

Visual information processing takes time, whether for simple tasks such as target detection or for more complex tasks such as reading, searching, or object tracking. Thus, in order to enable appropriate behavior, processing at all stages must be coordinated in time and completed within a limited time window46. It was shown that categorization of visual images involves several stages, with increasing time needed to process the information, e.g., fast for the early detection processing stage and longer for the later identification stage47. Visual information processing may be compromised if any of the processing stages are inefficient, for example, due to noisy retinal input40, slow neural processing16, masking2,48, or crowding3,4,5,49. Thus, improved processing speed through perceptual learning may enable a processing gain within the limited time window and lead to the observed generalization of the training effect to many untrained visual functions16,40 including the transfer from contrast detection (masking) to letter identification (crowding).

The relationship between masking (spatial and spatio-temporal) and crowding (letter acuity)

Both masking and crowding describe situations in which the response to a target stimulus is degraded by other stimuli, called masks. In crowding, the surrounding masks are usually presented simultaneously with the central stimulus; in masking, the mask can appear before the stimulus (forward masking), after the stimulus (backward masking), or simultaneously with it, as in crowding.

The literature on masking distinguishes between pattern masking (the mask and target presented at the same retinal location) and lateral masking (the mask location does not overlap with the target location)1,5. Likewise, the crowding effect is measured when the target and flankers are not overlapping; thus it parallels the lateral masking measurements. Since both crowding and lateral masking share similar properties such as dependency on the distance between the target and flankers (spacing) and an increase of the effect with increasing eccentricity, some studies suggest that masking and crowding are related1,50,51,52 and some even view crowding as a type of masking1,49,52,53.

On the other hand, visual crowding extends throughout large parts of the visual field3,4,54,55 (mostly found in peripheral vision but in some studies it has been found in the normal fovea56,57,58 and in the foveal region of people with strabismic amblyopia3,59) and – compared to lateral masking – up to longer distances between the target and flankers. Furthermore, since masking is assumed to affect the detection level (the stimulus is rendered invisible) and crowding is assumed to affect the identification level (the stimulus can be detected but not identified), the general view, supported by many studies, considers crowding to be a different process than ordinary masking, especially in the periphery3,4,5,55,60.

Recently it was shown that young adults with normal foveal vision exhibit crowding for very short presentation times or when the availability of the stimulus is limited by backward masking49, indicating that processing of targets under crowding conditions requires a longer processing time. Therefore, here we hypothesize that increasing the processing speed can lead to reduced crowding effects. In this study we investigated how perceptual learning affects the visual processing of healthy young people using the GlassesOff application, which is used to improve vision in presbyopia40. In a study with presbyopes using this technique, it was shown that training, which focused on improving spatio-temporal processing by strengthening lateral interactions, resulted in improved visual performance. More specifically, it enabled the participants to read smaller font sizes and to increase their reading speed and thus to overcome and/or delay some disabilities imposed by the aging eye. This improvement was achieved without changing the optical characteristics of the eye. It was also shown that visual acuity deteriorates when the presentation time is shortened61. In the current study we determined whether training on contrast detection of a Gabor target, under conditions that limit the processing time, leads to generalization and hence to an improvement in spatial and temporal visual functions such as letter recognition under crowding conditions with a short presentation time.

Our second aim in this study was to determine whether training on near distance will transfer to improvement in visual functions tested at far distances. It is generally thought that perception is invariant to the viewing distance if the retinal image size is the same (retinal spatial frequency). However, this notion of distance invariance is surprising, given that early and recent studies62,63 have consistently shown lower visual resolution for near rather than for far viewing and that this difference is related to the difference in the accommodation power needed for fixation from far to near viewing. This is further supported by a study that contradicts the basic assumption of distance-invariant perception and shows that perception of retinal spatial frequency might be affected by the context64.

We believe that investigating this issue will provide very useful information for future experiments, for example, about the appropriateness of collecting data using near presentations and hand-held devices. Thus, here we examined whether training on tasks involving fixation for near viewing (hence, involving accommodation) transfers to visual tasks involving far viewing and whether the same visual mechanisms process these different tasks. This transfer of improvements between the two domains is not trivial and has not been previously reported.

Results

Spatial processing: contrast detection and lateral masking (Gabor targets)

We measured several distance visual functions on a PC screen (with a viewing distance of 150 cm) before and after near vision training (detecting Gabor targets, 1.3 to 8 cycles per degree [cpd]) from a 40 cm viewing distance using personal iDevices (iPhones or iPods) to determine whether training from near viewing transfers to distant visual functions.

We found that distant contrast sensitivity, i.e. the ability to detect a target at low contrasts, significantly improved after training, as displayed in Figure 1c. A 2-way ANOVA with factors training (pre vs. post) and spatial frequency (5, 6.5, 8.5 and 13 cpd) revealed a significant main effect of training (F(1,13) = 9.215, p = 0.0096) driven by improvement at spatial frequencies of 5, 6.5 and 13 cpd (post-hoc paired 2-tailed t-tests for 5, 6.5 and 13 cpd, respectively: t(13) = 4.19, p* = .0011; t(13) = 2.735, p = .017; t(13) = 2.198, p = .047; *significant at Bonferroni corrected alpha level = .0125).
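The Bonferroni-corrected level quoted above can be reproduced with a short calculation. The sketch below assumes a family-wise alpha of 0.05 divided over the four tested spatial frequencies; this is an inference from the reported value of .0125, not something the text states explicitly, and the function and variable names are illustrative only.

```python
# Minimal sketch: Bonferroni correction for the post-hoc paired t-tests.
# Assumption: family-wise alpha = 0.05 split over 4 comparisons, which
# reproduces the corrected alpha of .0125 quoted in the text.

def bonferroni_alpha(family_alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha under a Bonferroni correction."""
    return family_alpha / n_comparisons

# p-values as reported in the text for the post-hoc tests.
reported_p = {"5 cpd": 0.0011, "6.5 cpd": 0.017, "13 cpd": 0.047}

corrected_alpha = bonferroni_alpha(0.05, 4)
survives = {sf: p < corrected_alpha for sf, p in reported_p.items()}

for sf, p in reported_p.items():
    print(f"{sf}: p = {p}, below corrected alpha: {survives[sf]}")
```

Under this assumption only the 5 cpd comparison falls below the corrected level, in line with the asterisk in the text; the 6.5 and 13 cpd comparisons are below .05 only at the uncorrected level.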

Figure 1

Improvement of contrast sensitivity and masking effects following training: (a) Example of a single Gabor target used in the experiments. (b) Example of lateral masking with different target-mask separations used in the experiments. The lateral masking consisted of a target in the presence of two collinear flankers. (c) Sensitivity to an isolated Gabor target in log units (y axis) against the spatial frequency (x axis). (d) Sensitivity in log units to a target under the lateral masking condition (y axis) against a target-mask separation in λ (wavelengths) units (x axis). Red lines and filled diamonds denote the results before training and the blue lines and filled circles denote the results after training. (e) Threshold elevation (sensitivity of the target under masking conditions (see d) normalized by sensitivity to the target alone (see c)) in log units. We found significant post-training improvements in (c) and (d) but not in (e). Error bars denote the standard error of the mean (n = 14).

Previous studies showed that practice increases the range of the lateral interactions (the distance up to which the presence of flankers modulates the target detection threshold), but only when the flankers are collinear with the target26,65. This finding suggests that practice increases the efficacy between neighboring neurons along the collinear direction, an effect that enables connectivity with remote neurons via a cascade of local interactions. Previous studies also show that training does not improve the sensitivity to the target alone65 when the training is limited to one spatial frequency. Here we investigated how lateral interactions at distant vision (from 1.5 m) are modulated by near vision training (from 40 cm) when the training included spatial frequencies between 2 and 8 cpd and target-flanker separations of 1.5, 2, 3 and 4 wavelengths (λ) during the training. We tested lateral interactions before and after training at a spatial frequency of 6.5 cpd, which was identical for the near vision training and for the far distance pre and post training testing sessions; this is a frequency at which performance is typically neither at floor nor at ceiling levels. We found that the sensitivity to detect a distant target (from 1.5 m) when it is embedded in collinear flankers increased significantly following the near vision training (see Figure 1d). A 2-way ANOVA with factors training (pre vs. post) and target-flanker separations (4, 3, 2 and 1.5 λ) revealed a significant main effect of training (F(1,13) = 8.25, p = .0131), an expected main effect of separation (F(3,39) = 58.9; p < 10−4) and no interaction (F(3,39) < 1; p = 0.54). 
Post-hoc t-tests revealed that the improvement following training resulted from a significant improvement in the 4λ target-flanker separation (2-tailed paired t(13) = 3.15, p = .0076, with Bonferroni corrected alpha level = .0125), a trend for improvement in 3λ (t(13) = 1.702, p = 0.11), whereas the other target-flanker separation showed no significant improvements (all t's < 1.43, p's > 0.17).

The results presented in Figure 1e are in line with the typical effects of target detection modulation, namely, collinear facilitation (the presence of collinear flankers improves target detection, above the y = 0 line) at 3 and 4λ as well as collinear suppression (reduced target detection in the presence of collinear flankers, below the y = 0 line) at 1.5λ1. However, after training, unlike previous findings26, there was no significant change in the modulation effects. The lack of a significant change in the modulation effects is due to a parallel improvement in the sensitivities to the target alone (Figure 1c) and the target within the collinear configuration (Figure 1d). Here the participants were trained on the tested parameters (spatial frequency, orientation and target-mask separations) for a very limited number of trials (1–2 blocks) and sessions (only 2) before they moved on to the next parameters, whereas in the previous studies the participants were extensively trained at the same spatial frequency and orientation26. This short training per stimulus feature may prevent deterioration within a session66 and enable transfer between different tasks. Moreover, here we show improvement in the target-alone condition in parallel with improvement under the lateral masking condition. This effect may be due to training on a wide range of spatial frequencies and orientations40, whereas the previous studies used only one spatial frequency and orientation26,65. However, here, owing to the parallel increase in the sensitivity to the target under both conditions, we did not observe an appreciable effect of enhanced facilitation.
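The argument that a parallel improvement leaves the modulation unchanged can be illustrated with a small calculation. The sketch below treats the modulation in Figure 1e as the difference between the log sensitivity with flankers and the log sensitivity to the target alone; the numeric values are hypothetical and chosen only for illustration, and they are not data from this study.

```python
# Minimal sketch: flanker modulation as a difference of log sensitivities.
# All numeric values below are hypothetical, for illustration only.

def flanker_modulation(log_sens_flanked: float, log_sens_alone: float) -> float:
    """Positive values indicate facilitation, negative values suppression (log units)."""
    return log_sens_flanked - log_sens_alone

# Hypothetical case: both conditions improve by the same 0.15 log units.
pre = flanker_modulation(log_sens_flanked=1.60, log_sens_alone=1.50)
post = flanker_modulation(log_sens_flanked=1.75, log_sens_alone=1.65)

print(f"modulation before training: {pre:+.2f} log units")
print(f"modulation after training:  {post:+.2f} log units")
```

Because the same amount is added to both terms, the difference is the same before and after training, even though each sensitivity on its own has improved.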

Temporal processing: backward masking (Gabor targets)

Previous studies showed that presenting collinear masks after the collinear flankers and the target (lateral masking) abolishes the facilitation effect1,67. Consistent with these results, Figure 2 shows that the effect of suppression by backward masking is larger for short inter-stimulus intervals (ISIs) and that it decreases with longer ISIs. Figure 2b shows the reduced thresholds of target detection with training (pre vs. post) as a function of increasing the length of the ISIs (60, 90, 120 and 150 ms). A 2-way ANOVA with the factors training (pre vs. post) and ISI revealed a significant main effect of training (F(1,13) = 11.4, p = .0049), a main effect of ISI (F(3,39) = 5.88, p = .0021) and a significant interaction (F(3,39) = 3.803, p = .0175), resulting from the significant improvement in the short ISIs (post-hoc t-tests for 60 ms: t(13) = 3.84, p = .002, 90 ms: t(13) = 2.38, p = .03). Figures 2b and 2c show the effect of training on the threshold change in the target that was presented with the two flankers (lateral masking), followed by backward masking of the two flankers. Figure 2b presents the unnormalized data (contrast detection thresholds (log units)) and Figure 2c shows the data as threshold elevations (normalized to the contrast detection threshold without backward masking (but with lateral flankers)). After training (blue line, filled circles), the backward masking effect was significantly reduced only for the short ISIs. A 2-way ANOVA with training and ISI as factors revealed no main effect of training (F(1,13) = 1.704, p = .214), a main effect of ISI (F(3,39) = 5.885, p = .0021) and a significant interaction (F(3,39) = 3.8, p = .0175). Here too, this effect was revealed due to the large reduction for the shortest ISI (from 0.4 to 0.15 log units, 78%, ISI = 60 ms, 2-tailed, p = 0.0029; for all other ISIs, 2-tailed, p > .18), reaching almost a “flat” level across ISIs. 
Before training, the backward masking effect for short ISIs of 60 and 90 ms was significantly different from the one resulting from longer ISIs of 120 and 150 ms (2-tailed p < 0.0223). However, after training the performance for the shorter ISIs improved and became as good as for the longer ISIs (not significantly different from longer ISIs). The results, presented in Figure 2, show that the slope after training has changed, indicating that after training the participants were able to process the information much faster and could overcome backward masking effects. This result supports our hypothesis that our training leads to improved processing speed.
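The 78% figure quoted above for the 60 ms ISI is consistent with converting the change in log units to a ratio. The source does not state how the percentage was computed, so the conversion below is one reading that happens to reproduce the reported value; the function name is illustrative.

```python
# Minimal sketch: convert a difference in log10 units to a percentage change.
# Assumption: percentage = (10 ** delta - 1) * 100; the text does not give
# the formula, but this reading reproduces the reported 78%.

def log_units_to_percent(delta_log: float) -> float:
    """Percentage change corresponding to a difference of delta_log log10 units."""
    return (10 ** delta_log - 1) * 100

pre_elevation = 0.40   # log units, ISI = 60 ms, before training (from the text)
post_elevation = 0.15  # log units, ISI = 60 ms, after training (from the text)

percent = log_units_to_percent(pre_elevation - post_elevation)
print(f"{percent:.0f}%")
```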

Figure 2

Reduction of temporal (backward) masking effects following training: (a) Example of the stimuli; the backward masking consisted of a target and two collinear flankers followed by another two collinear flankers presented after varying times (ISIs). Target detection threshold in log units (y axis) against an inter-stimulus interval (ISI, x axis). (b) Detection thresholds of the target under the backward masking conditions. (c) Threshold elevation (threshold of the target under the backward masking conditions normalized to the threshold without backward masking). Red lines and filled diamonds denote the results before training and the blue lines and filled circles denote the results after training. The post-training results for the short inter-stimulus intervals (ISI) of 60 and 90 ms significantly improved. Error bars denote the standard error of the mean (n = 14).

Temporal processing: visual acuity under temporal crowded conditions (E letters)

Before and after training we also measured crowding (the crowded condition) as a function of presentation time (30, 60, 120 and 240 ms) using E letters on an iPod from a distance of 40 cm. The results, presented in Figure 3b, show that the effect of crowding by E letters is similar to the effect of temporal masking by Gabors (cf. Figures 2b and 2c). To measure crowding, an E target was embedded in a matrix of randomly oriented E letters, with 0.4 inter-letter spacing. An adaptive method was used to measure the smallest E whose facing direction could be identified. The y axis denotes visual acuity in LogMAR (log of the minimal angle of resolution) units, where 0 denotes a visual acuity of 6/6 (a minimal angle of resolution of 1 arcmin). Before training, the results showed significant crowding for short presentation times of 30 and 60 ms (p = 0.022), which decreased with increasing presentation time. After training, the crowding was significantly reduced for stimulus durations of 120, 60 and 30 ms. A 2-way ANOVA with the factors training (pre vs. post) and stimulus duration (30, 60, 120 and 240 ms) showed a main effect of training (F(1,13) = 24.342, p = .0003) and a main effect of duration (F(3,39) = 30.098, p < .0001). For the 30 ms presentation time, the crowding was reduced from 0.26 to 0.09 log units (41%), for 60 ms, from 0.2 to 0.04 (45%) and for 120 ms, from 0.11 to 0.01 (26%). The interaction was marginally significant (F(3,39) = 2.75, p = .056). Note also that after training, the participants achieved a vision level better than the normal 6/6 at about 240 ms. Very interestingly, the participants were able to isolate the target faster: as Figure 3b shows, before training they were able to identify a crowded letter equivalent to a visual acuity of 6/6 (0 LogMAR) in about 240 ms, whereas after training they almost reached this level in about 120 ms. When this effect was calculated for each participant, the change was from 204 to 123 ms (Figure 3c).
This slope change supports the notion that the training led to an improvement in the processing speed and not to an improvement in sensitivity per se.
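The per-participant duration needed to reach 6/6 (Figure 3c) is the point at which an acuity-versus-duration curve crosses 0 LogMAR. The text does not describe how this crossing was estimated, so the sketch below is only one plausible approach: linear interpolation of LogMAR against log duration between the two tested durations that bracket the criterion. The function name and the example curves are hypothetical.

```python
import math
from typing import Optional, Sequence

# Minimal sketch: estimate the presentation time at which acuity reaches a
# criterion (0 LogMAR = 6/6). Assumption: linear interpolation on a
# log-duration axis; the source does not specify the method actually used.

def duration_at_criterion(durations_ms: Sequence[float],
                          logmar: Sequence[float],
                          criterion: float = 0.0) -> Optional[float]:
    """Shortest (interpolated) duration at which logmar <= criterion, or None."""
    if logmar[0] <= criterion:
        return float(durations_ms[0])  # criterion already met at the shortest duration
    for i in range(len(durations_ms) - 1):
        v0, v1 = logmar[i], logmar[i + 1]
        if v0 > criterion >= v1:
            frac = (v0 - criterion) / (v0 - v1)
            log_d0 = math.log(durations_ms[i])
            log_d1 = math.log(durations_ms[i + 1])
            return math.exp(log_d0 + frac * (log_d1 - log_d0))
    return None  # criterion never reached within the tested range

durations = [30, 60, 120, 240]
# Hypothetical single-participant curves, for illustration only (not study data).
pre_curve = [0.30, 0.20, 0.08, -0.02]
post_curve = [0.12, 0.05, -0.01, -0.08]

pre_ms = duration_at_criterion(durations, pre_curve)
post_ms = duration_at_criterion(durations, post_curve)
print(f"pre: {pre_ms:.0f} ms, post: {post_ms:.0f} ms, gain: {pre_ms - post_ms:.0f} ms")
```

Applied to the group averages reported in the text, the corresponding gain is 204 − 123 = 81 ms.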

Figure 3

Reduced crowding and improved processing speed following training: (a) Stimulus example; E target (center) surrounded by E masks. (b) Visual acuity (VA) under crowded conditions in logMAR units (log of the minimal angle of resolution, y axis) as a function of the presentation time (x axis) in the pre-training (first) and post-training (second) testing sessions for the training and control groups. The VA corresponding to the smallest identifiable target is presented in logMAR units. The zero line denotes a VA of 6/6 (a minimal angle of resolution of 1 arcmin). The training group is denoted by solid lines and circles. The control group is denoted by dashed lines and triangles. Open symbols and red lines stand for the pre-training results and filled symbols and blue lines denote the post-training results. The control group's second test took place after a break that lasted as long as the training period. Following training, the trained group improved significantly for all durations, whereas the control group did not (see Results). (c) Reduced stimulus duration required to reach a VA of 6/6 (0 logMAR on the y axis) following training. Whereas before training, the average exposure duration required to reach a VA of 6/6 was 204 ms (red bar), after training this exposure duration was reduced to 123 ms (blue bar). Error bars denote SEM (trained group: n = 14, controls: n = 19).

To test whether this improvement was merely due to a test-retest effect, we tested a control group (n = 19) on this task in two sessions separated by an interval equal to the duration of the training. A 3-way ANOVA on temporal visual acuity with the between-subject factor group (training, control) and the within-subject factors testing session (pretest, posttest) and duration revealed, unsurprisingly, a significant effect of duration (F(3,90) = 74.64, p < .001), a significant effect of group (F(1,30) = 40.49, p < .001), a significant effect of testing session (F(1,30) = 10.74, p = .003), and, importantly, a significant interaction between the group and the testing session (F(1,30) = 17.52, p < .001). These results, presented in Figure 3, show that there was no significant learning effect in the control group because their scores on this temporal visual acuity test did not change from the first to the second session (a 2-way ANOVA on the control group's temporal visual acuity with the factors duration and testing session revealed no significant effect of testing session: F(1,17) < 1, p = .498 and no interaction between duration and testing session: F(3,51) < 1, p = .721). Thus, we contend that the significant improvements reported here are due to the training (a significant effect of the testing session in the group receiving training: F(1,13) = 24.34, p < .001) and not due to test-retest effects.

Spatial processing: improvement of static visual acuity as measured on ETDRS clinical charts

Previous studies reported that the near visual acuity is significantly worse than the far visual acuity62,63. Here we measured the visual acuity of all participants on near (40 cm) and far (3 meters) ETDRS clinical charts, for the training group, before and after the training and for the control group in the first and second testing sessions (see above). In the first testing session, the average near visual acuity of all participants (N = 33, −0.1 ± 0.01 (SE) LogMAR (1 line better than 6/6)) was significantly worse than their far visual acuity (−0.15 ± 0.01 (1.5 lines better than 6/6); far vs. near 2-tailed paired t-test: t(31) = 2.95, p < .006) by 12%. This effect further supports our study's aim to explore whether the mechanisms processing near and far vision are the same. Thus, our results suggest that the distance-invariant notion is more complex than the received view and that vision and visual acuity may be affected not only by the physical image present on the retina but also by the distance of the image.
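The LogMAR values above can be related to chart lines and to the quoted 12% difference with a few conversions. The sketch below relies on the standard ETDRS convention that one chart line corresponds to 0.1 LogMAR; the percentage is computed as the ratio of the minimal angles of resolution, which is an assumption that reproduces the reported 12% rather than a formula given in the text. The function names are illustrative.

```python
# Minimal sketch: LogMAR conversions for the near/far acuity comparison.
# Assumption: the 12% difference is the ratio of minimal angles of resolution,
# (10 ** delta_logmar - 1) * 100; the text does not state the formula.

def lines_better_than_6_6(logmar: float) -> float:
    """Number of ETDRS lines better than 6/6 (one line = 0.1 LogMAR)."""
    return -logmar / 0.1

def snellen_denominator_m(logmar: float) -> float:
    """Denominator x of the metric Snellen fraction 6/x."""
    return 6 * 10 ** logmar

def percent_mar_difference(logmar_worse: float, logmar_better: float) -> float:
    """How much larger the minimal angle of resolution is, in percent."""
    return (10 ** (logmar_worse - logmar_better) - 1) * 100

near_va = -0.10  # mean near acuity in LogMAR (from the text)
far_va = -0.15   # mean far acuity in LogMAR (from the text)

print(f"near: {lines_better_than_6_6(near_va):.1f} lines better than 6/6, "
      f"6/{snellen_denominator_m(near_va):.1f}")
print(f"far:  {lines_better_than_6_6(far_va):.1f} lines better than 6/6, "
      f"6/{snellen_denominator_m(far_va):.1f}")
print(f"near vs. far: {percent_mar_difference(near_va, far_va):.0f}% larger MAR")
```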

In order to examine the effect of training on visual acuity, we ran a 3-way ANOVA with the between-subject factor group (training, control) and the within-subject factors testing session (pre, post) and VA-measurement distance (near, far). Interestingly, we found a significant effect of VA distance (F(1,31) = 11.44, p = .002) and a tendency towards a three-way interaction (F(1,31) = 2.36, p = .071). Post-hoc analysis revealed that prior to training the visual acuities of the two groups were not significantly different (far: t(31) = 1.16, p = .256; near: t(31) = 1.57, p = .127; two-sample t-test). However, after training, the near visual acuity of the trained group improved slightly but significantly (7%, t(13) = 3.85, p = .002, paired t-test), whereas that of the control group did not (t(18) < 1, p > .7). The far visual acuity of both groups remained unchanged (t's < 0.91, p's > 0.38). The significant difference between the far and near VA that was evident in the trained group prior to training was no longer present following training (t(13) = 0.78, p > .44).

This specific improvement in near visual acuity (40 cm), which did not transfer to far visual acuity (3 meters), may suggest that the spatial processing involved in letter resolution for near viewing is different from that for far viewing. However, we noted that the training for near (40 cm) did transfer to improved detection of far Gabor targets, as measured on a PC (1.5 m). Thus, a conclusion regarding the transfer of improvement between distances may be confounded by the possibility that the far visual acuity was already very good and may have reached nearly the best level (a ceiling effect), thus not enabling further improvement. It was shown that the extent of improvement in particular visual functions is proportional to the initial level of these visual functions before training42,45. Thus, further studies may consider testing this issue in populations with reduced far visual acuity, in which there is room for improvement.

Discussion

Here we trained young adults with normal or corrected to normal vision using a visual paradigm that combined spatial and temporal Gabor detection tasks at near vision. We found that visual improvements were not specific to the trained tasks and that they generalized to other non-trained visual functions such as detection under crowded conditions and importantly, to far vision (1.5 meters). Although these results are consistent with previous results in atypical vision showing generalization of improvements41,44, this is the first study to show generalization of improvements in normal young adult vision, including several novel effects of perceptual learning that are discussed next.

Faster temporal processing for detection (Gabors) and identification (letter crowding)

A previous study suggested that visual improvements following perceptual learning may result from improved contrast sensitivity and/or processing speed39. Here we directly tested the improvement in spatial and temporal processing. We found robust temporal improvements (a gain of 81 ms in the processing of letter acuity) despite only subtle improvement in contrast sensitivity. Therefore, our results provide evidence favoring an alternative explanation, namely, that the improvements following visual training are due to faster processing of visual information, together with a reduction of crowding and masking effects. Recently, it was emphasized that crowding is an essential bottleneck in perceptual processing3,4. Since the processing of visual information takes time, in order to mediate relevant behavior, the processing must be completed within a limited time window. Thus, the gain in temporal processing speed may enable one to overcome the bottleneck of crowding and may provide a better stream of visual information for perceptual processing; thus, it may improve cognitive functions such as decision-making39 and reading40. A previous study showed that visual recognition, as measured by letter size (visual acuity), takes more time with decreasing letter size61. Here we show that before training the participants needed 204 ms to recognize a letter (of a size that leads to 6/6), but that they were able to do so within only 123 ms following training. Interestingly, following training, at 240 ms they reached a better than normal vision level. This result further supports our hypothesis that improved processing speed may underlie the generalization of improvement in many visual functions40.

There has been some controversy about the nature, size and even the existence of crowding in the fovea3,4,5,57,58,68. Here we show robust foveal crowding for short presentation times. This result is consistent with earlier studies showing that Vernier acuity (measured at the fovea) is affected by crowded displays and by their distinctiveness from the targets57 or at very short exposures58 (<100 ms). Moreover, a recent study49 showed that both target identification and reaction time are affected when a foveal target is presented for a short time, or when the processing time is limited by backward masking. These findings suggest that extra processing time is required to overcome foveal crowding49,58. These results are consistent with our current findings showing improved contrast detection under backward masking conditions and improved letter identification (visual acuity) under crowded conditions with short presentation times. All together, the results suggest that the improved processing is achieved in stages, where an early detection stage is followed by a later identification stage69. After training, the detection task was accomplished within a shorter time period, suggesting that the overall processing was much faster. This could be attributed to faster processing of either the first (detection) or the second (identification) stage.

A few possible neural changes at different levels of the visual processing hierarchy may underlie improved performance, leading to improved processing speed. One possibility is that neurons at the early processing levels (e.g. in V1) may improve their sensitivity47, resulting from sharpening of the orientation tuning curves70 or a reduction in the receptive field size71. Other possibilities are related to the retuning of internal templates72, or to noise reduction72,73. A previous study showed that increasing contrast is associated with increased neural responses and decreased neural latencies of single neurons in the primary visual cortex74. It was also shown that training reduces the internal noise in human visual processing73 and thereby improves sensitivity. However, neurons in the visual cortex are extensively connected to other neurons, enabling them to integrate lateral inputs (which are noisy as well). Thus, noisy responses may also result from lateral influences. Moreover, imbalanced excitation-inhibition inputs may contribute to noisy activity. It has been suggested that reduced inhibition in the visual cortex underlies increased noise75 or reduced processing speed76. It was shown that collinear facilitation reduces the noise of neural responses77, that similar training shortens the response latency in young participants39 and that collinear facilitation expedites the brain's processing78. Thus, we can conclude that our training, which attempts to improve the efficiency of the spatial and temporal interactions at early visual areas, might improve processing speed directly by changing the excitation-inhibition balance, or possibly indirectly by reducing internal noise and improving neural sensitivity.

Generalization of improvement: Transfer between tasks

An important result of our study is the transfer of improvement following training on contrast detection of Gabor patches to improvement in a letter visual acuity task (visual acuity under crowded conditions presented for short times). Although transfer of improvement was shown previously in clinical populations (amblyopia42,43, presbyopia40 and developmental visual agnosia44), our results are novel since a) we found that the improvement is greater for shorter presentation times and when measured under crowded conditions, whereas previous studies showed improvements in visual acuity using static clinical charts; b) we provide data for young participants with normal/corrected-to-normal vision rather than for participants with impaired vision (clinical cases); c) we showed transfer from near vision training to both improved near visual acuity and to far temporal processing, while previous studies showed transfer of visual functions only for a trained viewing distance (either far visual acuity improvement following far visual training in amblyopia, or near visual acuity improvement following near visual training in presbyopia). We found that following training, static near visual acuity improved, whereas static distance/far visual acuity did not improve. This may suggest that near and far visual acuity, as measured by static charts, do not rely on joint mechanisms. However, since the spatial visual acuity for distance vision, as measured on the static ETDRS chart, may have reached a ceiling performance in the training group, our results do not allow us to reach such a conclusion.

One can claim that the improvements reported here may be due to a retest effect, i.e. very fast learning taking place already during the pretest. It has been established that many perceptual learning studies show improvement after just a few sessions9,10,14,79, mainly if the effects are not robust. However, to date, no study has shown rapid, remarkable improvement in visual acuity. Moreover, previous training studies that have used similar methods found no improvement in lateral facilitation, contrast sensitivity and backward masking for the control group just by retesting or placebo training for 50 hours80,81. No improvement in contrast sensitivity was found even after 10 sessions of training39. An appreciable improvement in collinear facilitation requires many sessions of training at the same orientation and spatial frequency26. Furthermore, we show (Figure 3) the results from a control group, tested on the novel temporal visual acuity task under the crowded condition. This group did not undergo training and was retested after the same time period as the training group (~2 months). The results showed that for the control group there was no improvement at any of the presentation times. It is also worth noting that studies show that the magnitude of the improvement is related to the initial level of the participants' performance, being maximal for worse vision and minimal for good vision42,45. In our study, the initial level of the spatial processing of the young participants was very good and therefore, it is expected that the improvement will not be robust, as found for the improvement of contrast sensitivity or for the static visual acuity for distance. Moreover, the main novel result of our study is an improvement in temporal processing. Indeed, the initial vision for short presentation times was reduced and it improved remarkably after the training. Given that the control group showed no such improvement, we contend that the training on Gabor patches transfers to spatio-temporal gains in letter resolution and crowding.

Using iDevices for training

Here we show, for the first time, conclusive evidence of significant effects of training on hand-held iDevices using the GlassesOff application. The results provide encouraging news for future research in the field of perceptual learning. Training on hand-held devices may increase training efficiency, simplify future research and make the training much easier for potential users. Such training can be used effectively for testing and training children and special populations and can also bypass transportation limitations.

The relationship between spatio-temporal masking and crowding

We recently showed51 that masking and crowding behave similarly in the fovea and in the periphery for a particular range of spatio-temporal parameters. Those results suggest that a joint mechanism might exist and that it may mediate these masking and crowding effects. Both masking and crowding may be related to the size of the human perceptive fields in the fovea1,82,83,84,85 as well as in the periphery76,85. Participants with larger perceptive fields exhibit greater effects of masking and crowding and vice versa. However, the mere correlation between masking and crowding does not necessarily suggest that they operate by mutual processes.

Accumulating evidence suggests that multi-dimensional parameters and multiple factors may affect the relationships between masking and crowding. Thus, masking and crowding may be determined by multiple sources of interference operating at several levels of cortical processing51,86 and each of them might affect the task. Among these factors are a) the proximity between the target and the flankers, which depends on the eccentricity3,4,85, b) the duration for which the target is visible [in the fovea, longer presentation times reduce crowding and masking such that at presentation times longer than 120 ms there are no crowding effects49,51; in the periphery, presentation times longer than 250 ms do not affect crowding88, even though such extended presentation times can involve eye movements that potentially increase crowding87; whether presentation times shorter than 250 ms affect peripheral crowding is still unclear], c) the temporal order (dynamics) of the presentation (backward, simultaneous, or forward masking49,58,89), d) the global configuration and grouping between the mask and the target elements, where collinear configuration seems to produce the maximal effect57,86,90, e) contrast, where higher crowding is found with a higher contrast threshold and f) attention91. Thus, crowding and masking may or may not be correlated, depending on the particular spatio-temporal parameters chosen in the study.

Conclusions

Since the processing of visual information takes time, in order to mediate relevant behavior, the processing must be completed within a limited time window. Thus, the gain in temporal processing speed may enable one to overcome the bottleneck of crowding and may provide a better stream of visual information for perceptual processing and thus may improve perceptual functions such as contrast detection, identification and object recognition and cognitive functions such as decision-making39 and reading40. The results of our current study show that improved processing speed also improves the temporal processing of both crowding (using letters) and masking (using Gabors), suggesting that the two phenomena are at least partly related49,51. Thus, improved processing speed may lead to overcoming foveal crowding and might be the enabling factor for generalizing to other visual functions.

Methods

The paradigm used in this study is similar to the paradigm used in our earlier studies in presbyopic [aging eye] participants40 in terms of behavioral tasks and temporal conditions. Visual acuity, spatial contrast sensitivity, crowding and backward masking were tested before (pretest) and after (posttest) the treatment using a PC at a distance of 150 cm in the laboratory.

Participants

Twenty-three young participants with no neurological conditions and with normal or corrected-to-normal vision in both eyes volunteered to participate in the training study. Fourteen of them (aged 24 ± 5 years old, mean ± STD) completed the training and returned for the posttest. Twenty additional participants enrolled in a control group and completed the pretest. Nineteen of them (aged 24 ± 5 years old, mean ± STD) returned for the posttest after the same time as the group undergoing training but without any training. The procedures were approved by the ethics committee of the Charité and all participants gave informed written consent to participate in the study. They were paid for participation in pre- and posttests and voluntarily completed the training phase. The study was performed at the Visual Perception Laboratory, Charité – Universitätsmedizin Berlin, Germany. All experimental protocols were performed in accordance with the guidelines provided by the committee approving the experiments.

Apparatus

Pretest and posttest were measured at the lab on a Samtron 98PDF 19″ CRT screen (1024 × 768 pixels at a 100 Hz refresh rate; the effective screen diagonal was 43.6 cm) controlled by a PC.
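On a 100 Hz display each frame lasts 10 ms, so stimulus durations can only be realized as whole numbers of frames. The following minimal sketch converts durations to frame counts under that assumption; the helper name `ms_to_frames` and the choice of example durations (the 60 ms intervals and the 800 ms gap of the PC test) are ours for illustration and do not describe the software actually used.

```python
REFRESH_HZ = 100  # CRT refresh rate stated in the text


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Convert a duration to frames; reject durations that are not whole frames."""
    frames = duration_ms * refresh_hz / 1000.0
    if abs(frames - round(frames)) > 1e-9:
        raise ValueError(
            f"{duration_ms} ms is not a whole number of frames at {refresh_hz} Hz"
        )
    return int(round(frames))


# Example durations taken from the PC test description (labels are illustrative).
durations_ms = {"stimulus interval": 60, "gap between intervals": 800}
frame_counts = {name: ms_to_frames(ms) for name, ms in durations_ms.items()}

for name, n_frames in frame_counts.items():
    print(f"{name}: {n_frames} frames")
```

Rejecting non-integer frame counts, rather than silently rounding, makes it obvious when a requested duration cannot be shown exactly at a given refresh rate.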

Visual acuity before and after training

We measured near and far visual acuity with ETDRS charts. Far visual acuity was measured from a viewing distance of 3 meters using a wall-mounted ETDRS chart (Precision Vision) and near visual acuity was measured from 40 cm using a hand-held chart.

Psychophysical measurements before and after training: stimuli and paradigms

PC test: The stimuli were vertically oriented localized gray-level gratings (Gabor patches, GPs) with an equal luminance distribution (STD, σ, allowing a minimum of 2 cycles in the GP) and the viewing distance was 150 cm. A 2AFC paradigm was used and participants were asked to report which interval contained the target. Target detection contrast threshold was determined for each condition, using a separate adaptive method for each block that converged to 79% correct. Participants started each trial by pressing the middle mouse button. A visible fixation circle was presented in the center of the screen until the participants pressed the button again to start the intervals. The two intervals were 60 ms each with an 800 ms gap between them. The first interval was preceded by a 300 ms blank period with a temporal jitter of 500 ms on average. The target GP was presented in only one of the intervals (the order was randomized). Participants were asked to report which interval contained the target by pressing a mouse button (left for the first interval and right for the second). Across trials, the target presentation was equally distributed between the two intervals. Participants were instructed to maintain their fixation at the center of the monitor and to avoid eye movements during the trials.
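The text does not name the adaptive rule. A common procedure that converges near 79% correct in a 2AFC task is the 3-down/1-up transformed staircase, whose equilibrium point is 0.5^(1/3) ≈ 0.794. The sketch below illustrates that rule only as an assumption about what such a method could look like; the class name, the starting contrast and the step factor of about 0.1 log units are hypothetical and are not taken from the study.

```python
class ThreeDownOneUp:
    """Transformed staircase: harder after 3 consecutive correct, easier after 1 error."""

    def __init__(self, start_contrast, step_factor=1.26, n_down=3):
        self.contrast = start_contrast   # hypothetical starting contrast (0..1)
        self.step_factor = step_factor   # hypothetical step, roughly 0.1 log units
        self.n_down = n_down
        self._streak = 0                 # consecutive correct responses so far

    def update(self, correct):
        """Register one response and return the contrast for the next trial."""
        if correct:
            self._streak += 1
            if self._streak == self.n_down:
                self.contrast /= self.step_factor   # make the task harder
                self._streak = 0
        else:
            # A single error makes the task easier; contrast is capped at 100%.
            self.contrast = min(1.0, self.contrast * self.step_factor)
            self._streak = 0
        return self.contrast


def convergence_level(n_down):
    """Proportion correct at which an n-down/1-up staircase is in equilibrium."""
    return 0.5 ** (1.0 / n_down)


print(f"3-down/1-up converges near {convergence_level(3):.3f} correct")
```

In use, one such staircase object would be created per block and `update` called after every trial, with the threshold typically estimated from the contrasts at the last few reversals.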

Psychophysical measurements included the following: 1) contrast sensitivity: The task was to detect a single Gabor patch target with a spatial frequency of 5, 6.5, 8.5, or 13 cycles per degree (cpd) presented for 60 ms; 2) lateral masking (LM): Detection of a Gabor target masked by two high-contrast (60%) collinear Gabor flankers with a target-flanker distance of 1.5, 2, 3 and 4 wavelengths (λ) (presented for 60 ms) with a spatial frequency of 6.5 cpd occupying 0.31 degrees of visual angle; 3) temporal masking: Backward masking following lateral masking, composed of LM followed by another mask, identical to the two flanking collinear GPs used in LM, presented at the same location but with varied time intervals (inter-stimulus interval, ISI) after LM. The ISIs were 60, 90, 120 and 150 ms. The target-flanker distance was 2λ and the target and flankers had a spatial frequency of 6.5 cpd.
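Because target-flanker distances are given in wavelengths, it can help to express them in degrees of visual angle. One wavelength is the reciprocal of the spatial frequency, so at 6.5 cpd λ ≈ 0.154 deg. The short sketch below performs this conversion for the four distances listed above; the function names are ours. The 2λ separation works out to about 0.31 deg, which coincides with the figure quoted in the text, although the text does not state which quantity that figure refers to.

```python
def wavelength_deg(spatial_freq_cpd):
    """One grating cycle in degrees of visual angle."""
    return 1.0 / spatial_freq_cpd


def separation_deg(n_wavelengths, spatial_freq_cpd):
    """Target-flanker distance in degrees for a distance given in wavelengths."""
    return n_wavelengths * wavelength_deg(spatial_freq_cpd)


SPATIAL_FREQ_CPD = 6.5                 # spatial frequency used for lateral masking
DISTANCES_LAMBDA = (1.5, 2, 3, 4)      # target-flanker distances from the text

separations = {
    n: round(separation_deg(n, SPATIAL_FREQ_CPD), 3) for n in DISTANCES_LAMBDA
}

for n, deg in separations.items():
    print(f"{n} wavelengths = {deg} deg")
```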

Training on iDevices using the GlassesOff application

The paradigm is a structured perceptual learning training method originally developed for improving visual functions in presbyopia (GlassesOff applications for iDevices). The results of each session are sent via the internet to a remote server that analyzes the results. The training difficulty for the next session is adapted individually for each user according to the user's performance in the previous session. Thus, the pace of the progress is determined according to the individual's results. The initial number of sessions is individually set after an initial evaluation of the temporal visual acuity92 (see below) and is continuously updated throughout the training, based on the user's performance. The participants were instructed to perform at least 3 sessions per week and completed 24 ± 3 sessions (mean ± STD, range 20–33); one participant performed 33 and 2 participants performed 20 sessions on different days not including the days of the pretest and posttest.

Training on iDevices

Recent technology enabled the use of high-resolution screens on iDevices known as retina displays. The pixel size of the retina display is 0.078 × 0.078 mm, about 4 times smaller than the standard pixel size of PC monitors. This provides the advantage of presenting high spatial frequencies viewed from short distances. In this study we were able to train the participants using high spatial frequencies up to 8 cpd. We recently showed93 that the contrast sensitivity measured on a retina display is much better than that measured on PC monitors and that this improves the visual functions of presbyopes. The screen resolution of the iPod and iPhones was 960 × 640 pixels at a 60 Hz refresh rate, whereas the effective training area from 40 cm was a circle with a diameter of 4.9 cm.
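The advantage of the small pixel size can be made concrete with a little geometry. The sketch below, which is only illustrative, computes the number of pixels per degree at 40 cm, the corresponding sampling (Nyquist) limit in cpd, the number of pixels available per cycle at 8 cpd and the angular size of the 4.9 cm training area. It also compares the retina pixel size with the pixel pitch of the CRT described under Apparatus, taking that CRT as a stand-in for a standard PC monitor and assuming square pixels that fill the effective diagonal; both are our assumptions.

```python
import math


def visual_angle_deg(size_mm, distance_mm):
    """Visual angle subtended by an object of size_mm centered at distance_mm."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))


RETINA_PITCH_MM = 0.078     # pixel size of the retina display (from the text)
VIEW_DIST_MM = 400.0        # 40 cm training distance (from the text)
TRAINING_AREA_MM = 49.0     # diameter of the effective training area (4.9 cm)

pixels_per_degree = 1.0 / visual_angle_deg(RETINA_PITCH_MM, VIEW_DIST_MM)
nyquist_cpd = pixels_per_degree / 2.0          # two pixels are needed per cycle
pixels_per_cycle_8cpd = pixels_per_degree / 8.0
training_area_deg = visual_angle_deg(TRAINING_AREA_MM, VIEW_DIST_MM)

# Assumption: the lab CRT (1024 x 768, 43.6 cm effective diagonal) has square pixels.
crt_pitch_mm = 436.0 / math.hypot(1024, 768)
pitch_ratio = crt_pitch_mm / RETINA_PITCH_MM

print(f"{pixels_per_degree:.1f} pixels/deg, Nyquist limit {nyquist_cpd:.1f} cpd")
print(f"{pixels_per_cycle_8cpd:.1f} pixels per cycle at 8 cpd")
print(f"training area spans {training_area_deg:.2f} deg")
print(f"CRT pitch {crt_pitch_mm:.3f} mm, {pitch_ratio:.1f} times the retina pitch")
```

Under these assumptions an 8 cpd grating is sampled by roughly 11 pixels per cycle, well below the sampling limit of the display.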

To avoid variability in resolution, screen size and luminance among the different iDevices, the pre- and posttesting of the temporal acuity was performed on the same device for all participants: an iPod (retina display) in a controlled environment at the lab. The training was performed using the participants' personal devices, which all had retina displays, except for one user who used an iPhone 3 (pixel size 0.156 × 0.156 mm, still finer than that of a PC monitor). Nevertheless, the application sets the overall luminance and the image size of the Gabor patches (by compensating for the known pixel size of each device) at the beginning of each session to be the same among the different devices.

The luminance of the screen was controlled throughout the training by automatically setting it to its maximal value (120 cd/m2) at the beginning of each session and restoring the user's preferred setting at the end of the session. The participants were instructed to train at home, at their convenience, in a dark environment from 40 cm with both eyes open. Each participant was provided with a ribbon of this length so that they could easily adjust the distance from the device to their eyes at home.

Training paradigm

Participants were trained on contrast detection of Gabor targets under lateral and backward masking conditions, by posing spatial and temporal constraints on the visual processing. The training covered a range of spatial frequencies (2–8 cpd; the size of the Gabor patches ranged from 0.18 to 1 deg) and included 4 orientations (0, 45, 90 and 135 deg) that were modified in accordance with the improved performance. Each session included 6 blocks that included the target alone and 5 blocks composed of two of the above-described 4 conditions (contrast sensitivity, lateral masking, spatial masking (crowded configuration), temporal masking) and a fifth condition, pedestals: contrast discrimination of the target while the two flankers served as pedestals at either 1.5 λ or 0 λ. The selection of the conditions was determined by an automated algorithm that advanced the conditions, the difficulty level (spatial frequency, orientation and target contrast) and ISI according to the participant's performance. Each condition was repeated twice during different successive sessions on different days.

The ISIs were 60, 90, 120, 150, 180, 210, or 240 ms. A 2AFC paradigm was used, identical to the one used in the pretest and posttest and the participants were asked to report which interval contained the target. Auditory and visual feedback were provided. ISI, the duration of the presentation of target and flanking Gabors, as well as their orientation and spatial frequency were modified between sessions, one parameter at a time, according to the performance in the preceding session. The duration of the stimulus presentation was 60 ms. The spatial distance between the target and the flankers varied from 0 to 4 λ. The orientation of the Gabor patches was always the same for the target and masking GPs (i.e., collinear, side-by-side or cross: ‘collinear + side-by-side’).

Visual acuity under temporal crowded conditions using E letters on an iPod (at the lab)

We applied here, on iPods, the same paradigm that we used before49,59 in order to investigate crowding (letter resolution) at different presentation times. This method accurately predicts the near visual acuity, as measured on near ETDRS charts92. It is a LogMAR chart equivalent, monitor-based paradigm that uses E-patterns presented for durations ranging from 30 to 240 ms. Five rows of five E-patterns each, facing one of four directions, with a 0.1-log unit size difference between the rows were presented. These stimuli correspond to a subset of the LogMAR chart, with a baseline pattern size corresponding to the baseline (i.e. 6/6 vision) of the LogMAR chart. The central pattern (the center of the middle row) was always the target for identification. The patterns were dark gray on a gray background and the viewing distance was 40 cm. For each trial the task was to determine the direction of the central E (the target). An adaptive procedure in which the pattern size and spacing were modified in 0.1 log unit steps was used to determine the size for 50% correct (the chance level was 25%). Different auditory feedback was given for correct and incorrect responses. To determine crowding, we used a crowded condition (0.4 letter spacing)49 for each presentation time. This measurement was taken twice in the lab using the same iPod in a controlled environment. The second measurement (posttest) took place immediately after the training period for the group undergoing training and after the same time period but without intervening training for the control group.
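To relate the 0.1 log unit steps to physical letter sizes, recall that on a LogMAR chart a letter at logMAR 0 (6/6) subtends 5 arcmin and each 0.1 logMAR step scales the size by a factor of 10^0.1. The sketch below converts logMAR values to letter height at the 40 cm viewing distance and lists the sizes of five rows centered on a given value. Centering the rows on the middle row and listing them from smallest to largest are our assumptions, as are all function names; the actual layout used in the application is not specified in the text.

```python
import math


def letter_height_arcmin(logmar):
    """Optotype height in arcmin; logMAR 0 (6/6) corresponds to 5 arcmin."""
    return 5.0 * 10.0 ** logmar


def letter_height_mm(logmar, distance_mm=400.0):
    """Physical letter height at the given viewing distance (40 cm by default)."""
    angle_rad = math.radians(letter_height_arcmin(logmar) / 60.0)
    return 2.0 * distance_mm * math.tan(angle_rad / 2.0)


def row_logmars(center_logmar, n_rows=5, step=0.1):
    """logMAR of each row, assuming the rows are centered on the middle row."""
    half = n_rows // 2
    return [round(center_logmar + (i - half) * step, 1) for i in range(n_rows)]


for logmar in row_logmars(0.0):
    print(f"logMAR {logmar:+.1f}: {letter_height_mm(logmar):.3f} mm")
```

At 40 cm a 6/6 letter is therefore only about 0.58 mm high, which is a few pixels on a display with a 0.078 mm pixel size.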