Visual crowding, a form of context modulation, reduces the ability to recognize objects in clutter and sets a fundamental limit on visual perception and object recognition. Crowding is generally considered to be absent in the fovea, and extensive efforts to explore crowding in the periphery have yielded various models that consider several aspects of spatial processing. Studies have shown that spatial and temporal crowding are correlated, suggesting a tradeoff between spatial and temporal processing of crowding. We hypothesized that limiting stimulus availability should decrease object recognition in clutter. Here we show, for the first time, that robust contour interactions exist in the fovea for much larger target-flanker spacing than reported previously: participants overcome crowded conditions for long presentation times but exhibit contour interaction effects for short presentation times. Thus, given enough processing time in the fovea, contour interactions can be overcome, enabling object recognition. Our results suggest that contemporary models of context modulation should include both temporal and spatial processing.
Visual crowding is the inability to recognize objects in clutter. It sets a fundamental limit on conscious visual perception and object recognition throughout most of the visual field1,2 and is most pronounced in peripheral vision or in the fovea of people with strabismic amblyopia1,2,3,4. Crowding is a contextual modulation that is viewed either as masking5,6 or, more generally, as unlike masking7,8,9. Most theories of crowding suggest the existence of multi-stage processing2,5,7,8,9,10,11, whereby features are detected independently in the first stage and are integrated for object recognition at later stages. Both crowding and masking are affected by similar factors such as the distance between the flankers and the target, their relative similarities and global arrangement1,8,12,13, as well as attention10,14. The main difference between crowding and other masking effects comes from observations that masking is general in the fovea, whereas crowding is rare there but very strong in the periphery1,2,7,8,15. Since the extent of crowding in the fovea is still controversial16,17, here we use the term contour interaction to describe the effect of reduced letter recognition at the fovea. However, in the discussion we raise the question of whether this effect can be regarded as a crowding effect.
The classical study of Flom, Weymouth and Kahneman (1963)4 measured contour interactions at the fovea of people with normal vision and of those with amblyopia. They found an effect at the fovea for target-flanker separations of less than 5 arc minutes for people with normal vision but a much greater effect at the fovea of amblyopic people. They concluded that the contour interaction effect is related to the visual acuity (minimum angle of resolution, MAR), a conclusion that is not widely accepted1,2,8. An effect of crowding at about 0.5 deg at the normal fovea has been reported5, and it has become well known that crowding exists in the fovea of amblyopic subjects1,2,3. Recently we showed that spatial and temporal crowding are correlated in the fovea of amblyopic participants3, and we noted during studies of visual training18,19 that participants with presbyopia (aging eye), who have reduced near visual acuity at the fovea, exhibited an increased contour interaction effect in the fovea. This effect may result either from lower near visual acuity18, as predicted by Flom et al. (1963), or from deterioration of the processing speed with age18,19,20. In this study our aim was to explore the effect of crowded conditions at the fovea of people with normal vision and of people with presbyopia using the contour interaction paradigm. We further hypothesized that limited processing times may reveal the effect of contour interactions at the fovea. To explore this premise, we used our previously described method for measuring crowding3,21, which is very similar to the method of Flom et al. (1963) but differs by using a limited presentation time.
First, we measured the contour interaction effect as a function of the duration of the target presentation. We used a method originally termed “contour interactions” to measure the surround effect on recognition of a single E letter3,21. Here, we manipulated the presentation time of the stimuli. The E target was presented for 30 to 120 msec, either alone or embedded in a surround array of E letters3 (Fig. 1a). First, the spacing between the target and the surround was one letter size (0.18 deg., 10.8 arc min, VA = 0.3 LogMar), which is larger than the critical distance known to produce crowding1,2,4,8. The observer's task was to report whether the E was pointing to the right or to the left side. The results show (Fig. 1b), as expected, no significant effect at any duration. There was a slight reduction in the percentage of correct answers at presentation times of 30 and 60 msec, from 96% to 92% and from 98% to 95%, respectively, but the effect is not significant (paired t-test, p = 0.2618). However, the reaction times clearly show (Fig. 1c) a consistent and significant (paired t-test, p = 0.0007) slowing of about 50 msec under the crowded conditions for all presentation times. These results suggest that extra processing time is required to overcome the effect of contour interaction in the fovea.
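The size conversions quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the logMAR estimate assumes the common optotype convention that the letter height equals five times the minimum angle of resolution, which the text does not state explicitly.

```python
import math

def deg_to_arcmin(deg: float) -> float:
    """Convert a visual angle from degrees to arc minutes."""
    return deg * 60.0

def letter_logmar(letter_arcmin: float) -> float:
    """Estimate logMAR from letter height.

    Assumption (not stated in the text): letter height = 5 x MAR,
    so MAR in arc minutes is the letter height divided by 5.
    """
    return math.log10(letter_arcmin / 5.0)

letter_deg = 0.18  # letter size of the first experiment, from the text
letter_arcmin = deg_to_arcmin(letter_deg)
print(f"letter size: {letter_arcmin:.1f} arc min")
print(f"estimated acuity level: {letter_logmar(letter_arcmin):.2f} logMAR")
```

Under this assumption, 0.18 deg corresponds to 10.8 arc min and to roughly 0.33 logMAR, close to the value of 0.3 given in the text.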
To further explore this effect, we repeated the experiment with eleven new young participants using a smaller letter size (0.12 deg., 7.2 arc min, VA = 0.22 LogMar) and a target-flanker spacing of either one letter size (Figure 2a) or 0.4 letter size (Figure 2b). As in Figure 1b, the results show a sign of contour interaction at presentation times of 30 and 60 msec for one letter spacing (Figure 2a), where the percentage correct is reduced slightly, and significantly only for 30 msec (from 90 to 86 percent correct; paired t-test, p = 0.019). However, for 0.4 letter spacing, the effect is remarkably and significantly apparent for all presentation times (paired t-test, p < 0.0005 for presentation times of 30, 60 and 120 msec; p = 0.02 for 240 msec). Here also the reaction times (Figures 2c, d) clearly show a consistent and significant slowing (paired t-test, p < 0.0005 for all presentation times) of about 50 msec (one letter spacing) and about 100 msec (0.4 letter spacing) under the crowded conditions for all presentation times. These results support the idea that extra processing time is required to overcome the effect of contour interaction in the fovea.
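The comparisons above rest on paired t-tests between the target-alone and crowded conditions within each observer. As a minimal sketch of that computation, the code below derives the t statistic and degrees of freedom from per-observer differences using only the standard library. The scores are hypothetical and are not data from the study; obtaining a p-value additionally requires the t distribution (for example via `scipy.stats.ttest_rel`).

```python
import math
import statistics

def paired_t(condition_a, condition_b):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(condition_a) != len(condition_b):
        raise ValueError("paired samples must have equal length")
    # Work on the within-observer differences.
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical percent-correct scores for four observers (illustration only).
target_alone = [90, 92, 88, 94]
crowded = [86, 89, 85, 88]
t_stat, df = paired_t(target_alone, crowded)
print(f"t({df}) = {t_stat:.2f}")
```

The function assumes that the differences are not all identical; with zero variance the statistic is undefined.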
Next, we present data from presbyopic (aging eye) participants whose near vision is blurred due to deterioration of the accommodation power18,19. In Figure 3a (N = 97, age 51.091 ± 0.64) we present data for a single letter and for crowded conditions with a target-flanker separation of 1 letter spacing. In Figure 3b (N = 41, age 50.32 ± 0.13) we present data for crowded conditions with target-flanker separations of 1 letter and 0.4 letter spacing. In Figure 3c we present the results obtained from young participants with normal vision (N = 18, average age 25.4 ± 0.77). Here we used the adaptive method (staircase) that we used previously to measure the crowding effect in the fovea of controls and amblyopic subjects3,21. The method is similar to that of Flom et al. (1963)4, whereby the contour interaction is calculated as the difference between the recognition threshold for an isolated target letter E and the threshold for the same target embedded in a matrix of E letters.
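The difference measure described above can be written out explicitly. The following sketch assumes that both thresholds are expressed in logMAR; the function names and all threshold values are hypothetical and serve only to show the arithmetic.

```python
def contour_interaction(crowded_logmar, single_logmar):
    """Threshold elevation in log units: crowded minus single."""
    return crowded_logmar - single_logmar

def size_ratio(elevation_log_units):
    """Convert a log-unit elevation to a ratio of threshold letter sizes."""
    return 10 ** elevation_log_units

# Hypothetical thresholds (logMAR) keyed by presentation time in msec.
single = {30: 0.30, 60: 0.20, 120: 0.10, 240: 0.05}
crowded = {30: 0.45, 60: 0.35, 120: 0.25, 240: 0.20}

for duration in sorted(single):
    elevation = contour_interaction(crowded[duration], single[duration])
    print(duration, round(elevation, 2), round(size_ratio(elevation), 2))
```

Because the thresholds are on a log scale, the subtraction normalizes the crowded threshold by the observer's own single-letter acuity: an elevation of 0.1 log units means that the crowded threshold letter is about 1.26 times larger than the isolated one.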
In all cases, consistent with a previous study22, the visual acuity decreased as the presentation time was shortened. In Figure 3a (presbyopic participants) the visual acuity decreased similarly, from 0.48 to 0.67 LogMar, for both the single target and the crowded condition, showing no effect of contour interaction for one letter spacing. Figure 3b shows a similar reduction in visual acuity with a shortened presentation time, but the reduction is greater for 0.4 letter spacing (triangles) than for 1 letter spacing (circles) at all presentation times, showing the effect of contour interaction. The effect is significant for all presentation times (paired t-test, p < 0.001). A similar effect is shown in Figure 3c for young participants, who have good visual acuity but still exhibit reduced visual acuity as the presentation time is shortened. They also show a consistent contour interaction effect for 0.4 letter spacing (circles) for all presentation times (paired t-test, p < 0.001). Note also that for both presbyopes and young participants the effect is nearly constant relative to their visual acuity at each presentation time. This is consistent with the original suggestion by Flom et al. (1963) that the contour interaction effect is relative to the visual acuity. It is also interesting and surprising that the young group exhibits a larger effect of contour interaction than the presbyopic group. This difference may result from the fact that the initial visual acuity of presbyopes is worse, so that the testing was performed with a much larger letter size, leading to a larger separation in terms of visual angle. It is also probable that the worse acuity resulted in a ceiling effect in the measurement of the presbyopic group. This finding warrants further exploration of the effect of crowding as a function of age in future studies.
This dependency of the effect of contour interaction on presentation time in the fovea of young and presbyopic participants is novel and is not predicted by the contemporary models of crowding1,2,8, which explicitly assume that there is no crowding effect in the fovea (and which may regard this effect as different from crowding). Thus, we hypothesized that limiting the stimulus availability would reveal the spatial crowding effect even in cases where the spacing is larger, as shown in Fig. 1. Therefore, we imposed the condition of backward masking (Figure 4a), which interferes with the processing of the target23,24,25 and enables one to estimate the processing time. We found, consistent with our hypothesis, that contour interaction appears when the stimulus availability is limited (Figure 4b). The results show a robust and significant reduction in the percentage of correct responses for presentation times of 30 and 60 msec (paired t-test, p = 0.0002 for 30 msec; p = 0.0005 for 60 msec) at inter-stimulus intervals between 30 and 120 msec. Interestingly, the effect of backward masking on the crowded conditions is much stronger, reducing the percent correct from 92% to 66% for short inter-stimulus intervals. For the target alone, however, the effect of backward masking is weaker; it is maximal for a presentation time of 30 msec, where the percent correct was reduced from 96% to 77% at the short inter-stimulus interval of 30 msec. The effect of backward masking on the target alone diminished (performance returned to the dashed lines) after a stimulus onset asynchrony of 120 msec (presentation time + inter-stimulus interval), whereas under the crowded conditions the effect is apparent for stimulus onset asynchronies of more than 180 msec.
Thus, for the target alone, a stimulus availability of 120 msec is enough for correct processing, whereas under the crowded conditions the stimulus availability needs to be much longer for correct processing of the target. This result supports the idea that crowded conditions impose longer processing times. In parallel to the backward masking effect on the percent correct, the reaction time becomes significantly slower (paired t-test, p = 0.00001 for 30 msec; p = 0.0001 for 60 msec) in all cases of backward masking, by 50–140 msec under the crowded conditions. These results clearly show that letter recognition under crowded conditions requires more processing effort, as revealed by the longer time needed for a decision (reaction time).
Most previous studies focused on the effect of crowding in the periphery, where the effect is very pronounced, assuming that crowding is absent in the normal fovea1,2,8,10. However, recently there has been some controversy about the nature, size and even the existence of crowding in the normal fovea9,16,17. Here, for the first time, we show a robust reduction in performance (contour interactions) under crowded conditions in the fovea when the stimulus availability is limited. Since the effect is revealed at target-flanker spacings much larger than those known to produce contour interactions, we will next argue that it demonstrates a crowding effect at the fovea. However, since our results are not consistent with recent attempts to characterize the crowding effect1,7, one may continue to characterize this as being “not the crowding effect”. The major assumption in these studies is that crowding is absent in the fovea. Because of this assumption, the criteria were derived from studies of crowding in the periphery. Moreover, the temporal domain of crowding is either explicitly ignored4, acknowledged but not included1, seen as temporal interference26, or assumed not to be part of crowding7. Thus, these criteria are not directly applicable to our results, which were revealed in the fovea using the temporal domain. Moreover, our results are supported by a recent study showing that, even at the periphery, the critical distance for crowding is not fixed but is larger for shorter presentation times27, deviating from the basic rule of a fixed window of crowding in these criteria1,7. Most models of crowding consider two stages of processing. Two stages of processing naturally cost processing time, which is not included in the spatial models of crowding.
However, a recent study28 found “two processing stages: an early ‘detection’ stage, whereby only locations of high-contrast energy in the image are selected, followed (after 100 ms) by an ‘identification’ stage, whereby image intensity at selected locations is used to determine the identity (whether bright or dark) of the target.” These results are consistent with our finding that extra time is needed to overcome the effect in the fovea. Thus, taken together, there is emerging support for the idea that classical spatial crowding behaves differently under changing temporal conditions when the available processing time is short.
At the fovea, each point in the visual field is processed by several overlapping receptive fields. Moreover, at the centermost 0.75° of the fovea, the cortical representations for both V2 and V3 are larger than those of V129, indicating that more processing power is dedicated to second-level analysis in this small but important part of the visual field. Thus, given enough processing time, smaller receptive fields in V1 responding to the small target (0.33°) may participate in the processing of crowding, following the initial processing by V2 and V329 neurons that have larger receptive fields, and thus overcome the crowding. Overcoming the crowding effect may also be achieved by delayed lateral facilitation30,31. Thus, there may be a tradeoff between the time of stimulus availability, which enables longer processing times, and improvement in spatial processing. Such a tradeoff of time for better performance is known in visual search. Moreover, cortical processing is capable of extracting better information than the retinal input provides, as in hyperacuity32, and of recovering noisy input from an aging eye after training18.
It was suggested that the suppression effect at short target-flanker separations in the lateral masking experiment may explain the crowding effect6. This suggestion is supported by earlier studies, suggesting that crowding is related to “the size of the receptive field (and hence to the resolving capacity) associated with the retinal region used to fixate the target”4. It is possible that larger receptive fields are activated first for short durations, whereas smaller receptive fields are activated later after a longer presentation time33. Thus, for short durations the processing of the target and the surround may take place within the same receptive field, which is larger than the target alone. For longer presentation times, smaller and more optimal receptive fields may respond to the target, an effect that can increase the spatial resolution and enable one to overcome the crowding effect. Our recent study shows that similar rules (suppression and facilitation) apply for the fovea and the periphery when estimating the size of the human perceptive field34, consistent with the cortical magnification factor35. Moreover, we also show that crowding and masking effects are highly correlated with the size of the perceptive field36.
The dichotomy between masking and crowding may arise from anatomical and functional differences between the fovea and periphery aimed at providing a processing advantage. The amount of simultaneous visual stimuli that stimulate the visual field is enormous and cannot be processed reliably due to the limited processing resources. Thus, it is a great advantage that most of the processing power is allocated to the fovea, which specializes in processing high-resolution objects. To achieve this aim, the crowding in the periphery is an automated process, thus reducing the amount of information with increasing eccentricities. This effect is achieved due to several factors that characterize the periphery: RF's increased size34, less processing power, lack of second-order processing of V2 and V329 and a special “foveal” area for object recognition (LOC)37, as well as limited attentional resolution14,15 and eye movements. However, the results of this study provide additional insight that contour interactions at the fovea may behave as crowding under limited time processing conditions, suggesting that future studies of context modulations (masking, contour interaction and crowding) should consider the time domain as a necessary factor.
A total of 178 participants, 40 young and 138 presbyopes (aging eye), participated in the experiments. The number of participants differs between experiments and is detailed in the Results section. The participants signed an informed consent form that was approved by the local Institutional Review Board of Sheba Medical Center. All experimental protocols were performed in accordance with the guidelines provided by the committee approving the experiments.
Visual stimuli and procedures
The targets were “Tumbling-E” patterns that were always presented at the fovea, at the fixation location, for durations ranging from 30 to 120 msec. A forced-choice paradigm was used in which the subjects were asked to report whether the open side of a visible letter E (Fig. 2) was to the right or left side; they reported their answer by pressing the left or the right mouse key. The size of the letters differed between experiments and is described in the Results section. A visible fixation circle appeared in the center of the screen before each trial (thus directing attention to the target location in the fovea) and disappeared when the participants pressed the “ready” button, after which a blank screen appeared for 300 msec; thereafter the trial began. The subjects were informed of a wrong answer by auditory feedback after each presentation throughout the experiment. The stimuli were viewed from a distance of either 150 cm or 40 cm, which will be described later. The experiments were conducted using a blocked procedure in which only one time duration (30, 60, 120 or 240 msec) was used per block, and the target was presented for 100 trials per data point in the fixed-size experiment or until the size of the letter reached a threshold using the adaptive method. The order of the blocks was random.
In cases where there was a crowding condition, an array of randomly oriented E letters (flankers) surrounding the target was added. The target-flanker separation was either one letter size (Figure 1) or 0.4 letter size (Figure 2) and remained constant for all the presentation times. The percentages of correct answers for the target alone and under the crowded conditions were measured separately. The crowding effect was indicated by a reduction of the percent correct under the crowded conditions relative to the target alone.
In cases of presbyopia we applied the same paradigm that we had used before3,21 in order to investigate crowding at different presentation times. It consists of a LogMAR chart equivalent, monitor-based paradigm that uses E-patterns presented for durations ranging from 30 to 240 msec. Three rows of five E-patterns each, facing one of two directions, with a 0.1 log unit size difference between the rows, were presented. These stimuli correspond to a subset of the LogMAR chart, with a baseline pattern size corresponding to the baseline (i.e. 6/6 vision) of the LogMAR chart. The central pattern (the center of the middle row) was always the target for identification. The patterns were black on a gray background with a luminance of 40 cd/m2, and the viewing distance was 40 cm. For each trial the task was to determine the direction of the central E (the target) presented for durations of 30, 60, 120 and 240 msec. An adaptive procedure in which the pattern size and spacing were modified in 0.1 log unit steps was used to determine the size for 79% correct (chance was 50%). Different auditory feedback was given for correct and incorrect responses. To determine crowding, we used separate runs for the target alone (single) and the crowded (crowded) conditions for each presentation time. We then computed the crowding value as crowded – single (a difference on a log scale), i.e. normalizing the crowded conditions by the acuity of a single pattern. We recently showed that this procedure is highly correlated with the measure of near visual acuity on an ETDRS chart38.
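The text specifies 0.1 log unit steps and a 79% correct target but does not name the staircase rule. As a sketch only, the code below assumes a three-down/one-up rule, which is known to converge near 79.4% correct, and simulates it against a hypothetical observer. The psychometric function, its slope, the starting level, the trial count and all function names are assumptions made for illustration, not the procedure used in the study.

```python
import math
import random

STEP = 0.1  # step size in log units, as stated in the text

def update(level, run, correct, step=STEP):
    """Apply one trial of an assumed 3-down/1-up rule.

    level: current letter size (logMAR); run: consecutive correct count.
    Returns the (new_level, new_run) pair.
    """
    if not correct:
        return level + step, 0      # error: make the letters larger
    run += 1
    if run == 3:
        return level - step, 0      # three correct in a row: make them smaller
    return level, run

def p_correct(level, threshold, slope=0.05):
    """Hypothetical observer: logistic psychometric function with a 50% floor."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(level - threshold) / slope))

def simulate(threshold, start=0.6, n_trials=300, seed=1):
    """Run the staircase and average the levels visited in the last 100 trials."""
    rng = random.Random(seed)
    level, run, levels = start, 0, []
    for _ in range(n_trials):
        correct = rng.random() < p_correct(level, threshold)
        level, run = update(level, run, correct)
        levels.append(level)
    return sum(levels[-100:]) / 100

estimate = simulate(threshold=0.1)
print(f"estimated size threshold: {estimate:.2f} logMAR")
```

In this sketch a separate run would be made for the single and the crowded condition at each presentation time, mirroring the separate runs described in the text.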
In cases of backward masking (Figure 4), a matrix of 5 × 5 randomly oriented E letters appears for the same duration as the target's duration and is delayed by 30–120 msec (inter-stimulus interval, ISI) after the target presentation.
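The timing of a backward-masking trial follows directly from this description: the mask has the same duration as the target, and the stimulus onset asynchrony (SOA) is the presentation time plus the ISI. The sketch below lays out these event times; the function name is hypothetical, and the particular ISI values sampled are an assumption, since the text gives only the 30–120 msec range.

```python
def masking_timeline(presentation_ms, isi_ms):
    """Event times (msec from target onset) for one backward-masking trial."""
    mask_onset = presentation_ms + isi_ms  # this is the SOA
    return {
        "target_onset": 0,
        "target_offset": presentation_ms,
        "mask_onset": mask_onset,
        # The mask is shown for the same duration as the target.
        "mask_offset": mask_onset + presentation_ms,
        "soa": mask_onset,
    }

for presentation_ms in (30, 60):
    for isi_ms in (30, 60, 120):  # assumed sample of the 30-120 msec ISI range
        t = masking_timeline(presentation_ms, isi_ms)
        print(presentation_ms, isi_ms, t["soa"], t["mask_offset"])
```

For example, a 30 msec target followed by a 90 msec ISI gives an SOA of 120 msec, the value at which masking of the isolated target is reported to diminish.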
Stimuli were displayed on a Philips 107P color monitor. The experiments were controlled by a Dell PC. Screen resolution was 1024 × 768 pixels occupying a 9.2° × 12.2° area. The mean display luminance was 40 cd/m2 in an otherwise dark environment. Gamma correction was applied.
Whitney, D. & Levi, D. M. Visual crowding: a fundamental limit on conscious perception and object recognition. Trends cogn sci 15, 160–168, 10.1016/j.tics.2011.02.005 (2011).
Levi, D. M. Crowding–an essential bottleneck for object recognition: a mini-review. Vision res 48, 635–654, 10.1016/j.visres.2007.12.009 (2008).
Bonneh, Y. S., Sagi, D. & Polat, U. Spatial and temporal crowding in amblyopia. Vision res 47, 1950–1962 (2007).
Flom, M. C., Weymouth, F. W. & Kahneman, D. Visual resolution and contour interaction. J Opt Soc Am 53, 1026–1032 (1963).
Chung, S. T., Levi, D. M. & Legge, G. E. Spatial-frequency and contrast properties of crowding. Vision res 41, 1833–1850 (2001).
Polat, U. & Sagi, D. Lateral interactions between spatial channels: suppression and facilitation revealed by lateral masking experiments. Vision res 33, 993–999 (1993).
Pelli, D. G., Palomares, M. & Majaj, N. J. Crowding is unlike ordinary masking: distinguishing feature integration from detection. J Vis 4, 1136–1169 (2004).
Pelli, D. G. & Tillman, K. A. The uncrowded window of object recognition. Nat neurosc 11, 1129–1135 (2008).
Chakravarthi, R. & Cavanagh, P. Recovery of a crowded object by masking the flankers: determining the locus of feature integration. J Vis 9, 4 1–9, 10.1167/9.10.4 (2009).
He, S., Cavanagh, P. & Intriligator, J. Attentional resolution and the locus of visual awareness. Nature 383, 334–337 (1996).
Parkes, L., Lund, J., Angelucci, A., Solomon, J. A. & Morgan, M. Compulsory averaging of crowded orientation signals in human vision. Nat neurosc 4, 739–744 (2001).
Polat, U. & Sagi, D. The architecture of perceptual spatial interactions. Vision res 34, 73–78 (1994).
Livne, T. & Sagi, D. Multiple levels of orientation anisotropy in crowding with Gabor flankers. J Vis 11, 18, 10.1167/11.13.18 (2011).
Carrasco, M. Visual attention: the past 25 years. Vision res 51, 1484–1525, 10.1016/j.visres.2011.04.012 (2011).
Coates, D. R. & Levi, D. M. Contour interaction in foveal vision: A response to. Vision res, 10.1016/j.visres.2013.10.016.
Siderov, J., Waugh, S. J. & Bedell, H. E. Foveal contour interaction for low contrast acuity targets. Vision res 77, 10–13, 10.1016/j.visres.2012.11.008 (2013).
Intriligator, J. & Cavanagh, P. The spatial resolution of visual attention. Cognit Psychol 43, 171–216 (2001).
Polat, U. et al. Training the brain to overcome the effect of aging on the human eye. Sci rep 2, 278, 10.1038/srep00278 (2012).
Polat, U. Making perceptual learning practical to improve visual functions. Vision res 49, 2566–2573, 10.1016/j.visres.2009.06.005 (2009).
Owsley, C. Aging and vision. Vision res 51, 1610–1622 10.1016/j.visres.2010.10.020 (2011).
Bonneh, Y. S., Sagi, D. & Polat, U. Local and non-local deficits in amblyopia: acuity and spatial interactions. Vision res 44, 3099–3110 (2004).
Baron, W. S. & Westheimer, G. Visual acuity as a function of exposure duration. J Opt Soc Am 63, 212–219 (1973).
Sterkin, A., Yehezkel, O., Bonneh, Y. S., Norcia, A. & Polat, U. Backward masking suppresses collinear facilitation in the visual cortex. Vision res 49, 1784–1794, 10.1016/j.visres.2009.04.013 (2009).
Breitmeyer, B. G. Visual masking: an integrative approach. Vol. 4 (Oxford University Press, 1984).
Polat, U. & Sagi, D. The relationship between the subjective and objective aspects of visual filling-in. Vision res 47, 2473–2481 10.1016/j.visres.2007.06.007 (2007).
Westheimer, G. & Hauske, G. Temporal and spatial interference with vernier acuity. Vision res 15, 1137–1141 (1975).
Tripathy, S., Cavanagh, P. & Bedell, H. Large Interaction Zones for Visual Crowding for Briefly Presented Peripheral Stimuli. J Vis 13, 571, 10.1167/13.9.571 (2013).
Neri, P. & Heeger, D. J. Spatiotemporal mechanisms for detecting and identifying image features in human vision. Nat neurosc 5, 812–816 10.1038/nn886 (2002).
Schira, M. M., Tyler, C. W., Breakspear, M. & Spehar, B. The foveal confluence in human visual cortex. J neurosci 29, 9050–9058, 10.1523/JNEUROSCI.1760–09.2009 (2009).
Polat, U. & Sagi, D. Temporal asymmetry of collinear lateral interactions. Vision res 46, 953–960 (2006).
Sterkin, A. & Polat, U. Response similarity as a basis for perceptual binding. J Vis 8, 17 11–12, 10.1167/8.7.17 (2008).
Westheimer, G. Editorial: Visual acuity and hyperacuity. Invest Ophthalmol 14, 570–572 (1975).
Bar, M. et al. Top-down facilitation of visual recognition. P Nat Acad Sci USA 103, 449–454 (2006).
Lev, M. & Polat, U. Collinear facilitation and suppression at the periphery. Vision res 51, 2488–2498, 10.1016/j.visres.2011.10.008 (2011).
Levi, D. M., Klein, S. A. & Aitsebaomo, A. P. Vernier acuity, crowding and cortical magnification. Vision res 25, 963–977 (1985).
Lev, M. & Polat, U. When masking is like crowding. J Vis 12, 333, 10.1167/12.9.333 http://www.journalofvision.org/content/12/9/333.abstract?sid=7d3f1e40-d6d3-42bf-986b-83777a38128b (2012).
Malach, R. et al. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. P Nat Acad Sci USA 92, 8135–8139 (1995).
Yehezkel, O., Sterkin, A., Lev, M. & Polat, U. Digital precise remote near visual acuity evaluation using mobile devices. Ass Res Vis and Ophthal (2013). http://www.arvo.org/webs/am2013/abstract/sessions/128.pdf.
This study was supported by grants from the Israel Science Foundation (ISF188/2010) and Glassesoff, Inc.
U.P.'s work has been funded by Glassesoff Inc. He has received compensation as a consultant and as a member of the scientific advisory board and owns stock in the company. O.Y.'s work has been funded by Glassesoff Inc. as an employee, and O.Y. owns company options as an employee. M.L. declares no competing financial interest.
Cite this article
Lev, M., Yehezkel, O. & Polat, U. Uncovering foveal crowding?. Sci Rep 4, 4067 (2014). https://doi.org/10.1038/srep04067