Introduction

The purpose of the visual system is to perceive objects or scenes from an input image. Since the function of the early visual mechanism is to extract local features in an image1,2, in order to perceive a coherent object, the visual system has to combine local features into global patterns. This perceptual grouping process requires a higher-order visual mechanism which can integrate the local features extracted by the early visual mechanism.

Furthermore, most objects and scenes are colored. The chromaticity of objects or scenes does play an important role in vision. It has been shown that the early visual mechanism can extract chromatic local features in an image. A human observer can detect periodic patterns modulated only in chromaticity3,4. The mechanisms which detect chromatic patterns also show a tuning to spatial frequency5,6,7,8,9,10,11,12 and orientation5,11,12,13 like those that detect luminance patterns. However, while there are many studies on the properties of the early visual mechanism for extracting chromatic local features, how the visual system integrates these chromatic local features into a global pattern is less well understood. Here, we approached this issue of integrating chromatic information with symmetry detection.

A visual stimulus is symmetric if one part of the stimulus is a reflection of another part about an axis, called the symmetry axis. To determine whether an image is symmetric, the visual system first has to find correspondence between local features with an idiosyncratic filter14,15,16,17,18,19,20,21 or reverse mapping22,23,24,25,26,27. Then, a higher-order mechanism takes the output of these early mechanisms and decides whether an image is symmetric23.

There have been studies on detecting symmetric patterns with more than one chromaticity. Troscianko28 showed that an observer can discriminate a yellow symmetric pattern from a random pattern on an isoluminant green background. This suggested that chromaticity alone can support symmetry detection. However, since Troscianko28 used only one pair of chromaticities, it is not clear whether the symmetry detection mechanism responds selectively to a specific chromaticity, as in the early visual mechanism29,30, or to symmetric patterns in any chromatic contrast.

Pashler and his colleagues31,32 presented their observers with image patterns composed of colored squares arranged to be either completely symmetric about a vertical axis or with one or two pairs of corresponding squares mismatched in color. They showed that the response time for the observers to judge whether a pattern was symmetric was longer when the pattern contained four colors than when the pattern contained only two colors. They argued that their result could be explained by a color-blind symmetry mechanism, guided by attention shift, which assesses symmetry sequentially from one color to the other. Their results, however, can also be explained by attention shift between color-selective symmetry mechanisms which can extract symmetry in each individual color. Whether the mechanisms underlying symmetry detection are color-blind or color selective is still inconclusive. In addition, in their stimuli the image elements were not equiluminant and thus it is possible that their result was contaminated by luminance differences between image elements.

In this study, we are interested in whether the mechanisms underlying symmetry detection are color selective or not. We approached this issue with a noise masking paradigm, which is a well-established paradigm for evaluating the color selectivity of a visual mechanism33,34,35. In a typical noise masking experiment, the task of the observers is to detect a target embedded in a noise mask. The detection threshold for the target measured with a noise mask is then compared with that measured without a mask. The rationale of noise masking is that, to detect a target, the response of the visual mechanism for detecting the target has to be intense enough to overcome the limitation imposed by the noise. A noise mask that affects the detection mechanism perturbs the mechanism's response and thus provides, in addition to the intrinsic random activity in the system, another source of noise. Thus, if the noise mask is effective, it is more difficult for an observer to detect the target and the threshold rises. If the noise mask is ineffective, there is little change in the threshold when the mask is presented. Hence, the change in target threshold reflects the sensitivity of the target mechanism to the image properties in the noise mask.

In our experiment, the observer was to detect a symmetric pattern (target) embedded in a noise pattern (mask). We manipulated the chromaticity of the symmetric target and mask and then observed how the detectability of the symmetric target was affected by the presence of the mask. If a symmetry detection mechanism was color selective, it would only be able to find correspondence between local elements of a small set of colors. The presence of a noise mask in the color preferred by the target detection mechanism would lead to more mismatches in that mechanism and in turn degrade the performance. If the noise color was different from that preferred by the target detection mechanism, it would not interfere with that mechanism and thus have no effect on symmetry detection. On the other hand, if the symmetry detection mechanisms were not color selective, they would find correspondence between all local elements, regardless of their color. Hence, all noise masks would increase the chance of mismatch and in turn degrade symmetry detection performance, regardless of the color of the target or mask.

Results

Figure 1 shows the density threshold for detecting a symmetric target. We used a two-alternative forced choice (2AFC) paradigm to measure the threshold at the 75% correct level. In each trial, a vertically symmetric target was randomly presented in one of the two intervals and a random dot control in the other. The observers' task was to determine which interval contained the symmetric target. The chromaticity was defined in polar coordinates of a cone contrast space36,37, in which the azimuth angle denotes the deviation from the “red” color (L-M), the elevation angle denotes the luminance modulation and the distance denotes the contrast.

Figure 1

The example of the stimuli.

Panel a represents a red target superimposed on a red noise mask. Panel b represents a red target superimposed on green noise. The mask density was 1%.

Columns A-C of Figure 1 represent the results for the red, blue and white targets respectively. Each row represents the data from one observer. The circles in each panel represent the target density threshold in the presence of noise masks of various colors. The triangles and dashed lines represent the target density threshold measured without a mask. The smooth curves are fits of an 8th-power Gaussian function to the target density threshold in log units, $y = a \exp(-x^8 / 2\sigma^8) + b$, where x is the azimuth or elevation difference between the target and the mask in degrees, a and b are constants and σ is the scale parameter (standard deviation) of the Gaussian function.
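As a minimal sketch of the fitted function described above, the following evaluates the 8th-power Gaussian at the mask deviations used in the experiment. The function name and the parameter values (a, b, sigma) are illustrative assumptions, not fitted values from this study.

```python
import math

def tuning_curve(x_deg, a, b, sigma):
    """8th-power Gaussian: y = a * exp(-x^8 / (2 * sigma^8)) + b (log threshold)."""
    return a * math.exp(-(x_deg ** 8) / (2.0 * sigma ** 8)) + b

# Illustrative parameters only (assumed, not taken from the reported fits).
a, b, sigma = 1.2, 0.1, 75.0

# Target-mask deviations (degrees) listed in the Methods.
deviations = [0.0, 22.5, 45.0, 67.5, 90.0, 180.0]
for x in deviations:
    print(f"{x:6.1f} deg -> {tuning_curve(x, a, b, sigma):.3f} log units")
```

Compared with an ordinary Gaussian, the 8th power gives a flat plateau near zero deviation and a steep fall-off around x = σ, so σ acts as a direct estimate of the tuning width.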

As shown in Figure 1A, the elevation of density threshold for the red target was greatest in the presence of the red mask. The target density threshold increased by 1.08 to 1.37 log units, corresponding to a 12- to 23-fold difference. The difference was statistically significant for all three observers (all t > 9.67, p < .0001). This masking effect decreased as the mask hue deviated from the target. The bandwidth, defined as the scale parameter of the 8th-power Gaussian function, averaged across observers, was 75.4 degrees (SD = 2.7). The orthogonal (90° deviation from the target on the isoluminance plane) and opponent (180° deviation) masks showed little, if any, masking effect. Their effects were only 8.8 to 9.1% of the effect of the same-hue mask. This reduction in the masking effect was statistically significant (t(2) = 14.40, p = .0024 for the orthogonal mask and t(2) = 43.61, p = .00026 for the opponent mask).
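The correspondence between log units and fold differences quoted above follows from thresholds being expressed in log10 units. A minimal sketch of the conversion (the function name is hypothetical):

```python
def log_units_to_fold(delta_log10):
    """Convert a threshold change in log10 units to a multiplicative factor."""
    return 10.0 ** delta_log10

# Range of same-hue masking effects quoted for the red target.
for delta in (1.08, 1.37):
    print(f"{delta:.2f} log units -> {log_units_to_fold(delta):.1f}-fold")
```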

The blue target condition showed similar results, though the bandwidth was narrower (Figure 1B). For all three observers, masking was greatest when the mask and target had the same hue. The target density threshold increased by 1 to 1.3 log units, or a 10- to 20-fold difference, in the presence of the same-hue mask. This difference was statistically significant for all observers (all t > 8.73, p < .00016). The target density threshold dropped rapidly when the hue of the mask deviated from that of the target. The averaged bandwidth was 5.98 degrees (SD = 4.55). The orthogonal and opponent masks showed a small masking effect. The target density threshold increment was no more than 0.26 log units and was only 7.6% to 10.2% of the same-hue mask effect. This reduction in masking effect from that of the same-hue mask was statistically significant (t(2) = 16.34, p = .0029 for the orthogonal mask and t(2) = 12.04, p = .0034 for the opponent mask).

The luminance target conditions also showed a similar trend to the isoluminant conditions (Figure 1C). The masking effect was largest when the target and the mask had the same luminance. For all three observers, the masking effect was 0.97 to 1.2 log units, or a 9- to 16-fold difference (all t > 7.72, p < .0003) when the mask and the target had the same luminance. The masking effect decreased rapidly as the difference in luminance between the target and mask increased. The bandwidth, estimated from the 8th-power Gaussian function, was 17.6 degrees (SD = 21.77), averaged across three observers. The isoluminant mask (90° deviation from the target) and the black mask (180° deviation) showed little masking effect. They increased the target density threshold by about 0.03 to 0.24 log units (1.08- to 1.75-fold). This was only 9.7% to 12.4% of the effect produced by a mask of the same luminance polarity. The reduction in masking effect was statistically significant (t(2) = 9.70, p = .0052 for the orthogonal mask and t(2) = 19.48, p = .0013 for the opponent mask).

Discussion

We measured the masking effect of noise on symmetry detection. We showed that a noise mask with the same chromaticity as a symmetric target always produced the largest masking effect on target detection. The masking effect decreased as the difference in chromaticity between the target and mask increased. This result suggests that the symmetry detection mechanisms are color selective. Each symmetry detector only extracts symmetry from image elements of a specific chromaticity. The luminance condition showed a similar masking effect. Hence, the symmetry mechanisms are also selective for luminance polarity.

Both orthogonal and opponent masks had little, if any, effect on target detection. Thus, the target detector treated opponent color the same as orthogonal color. This is consistent with the property of the half-wave rectified cortical color vision mechanism, which responds to one polarity of opponent colors30. Thus, the mechanism underlying color selectivity in symmetry detection should be cortical in origin.

In addition, the tuning functions for the three targets were all limited in bandwidth. Hence, no single mechanism can cover symmetry detection in all colors. That is, there are multiple color selective mechanisms underlying symmetry detection. On the other hand, the visual system may not require a large number of color selective mechanisms. The bandwidths of the tuning functions for the red and blue targets, expressed as half-height-half-width, together roughly cover a whole quadrant of the isoluminant plane (78.54° + 6.23° = 84.77°). Hence, there is little need for mechanisms selective for intermediate colors.
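The half-height-half-width values can be recovered from the scale parameters reported in the Results by solving exp(−x^8/2σ^8) = 0.5 for x, which gives x = σ(2 ln 2)^(1/8). The sketch below applies this to the averaged red and blue bandwidths; the function name is hypothetical.

```python
import math

def hwhm_from_sigma(sigma_deg, power=8):
    """Half-height half-width of exp(-x^p / (2 * sigma^p)).

    Setting the exponential equal to 0.5 gives x = sigma * (2 ln 2)^(1/p).
    """
    return sigma_deg * (2.0 * math.log(2.0)) ** (1.0 / power)

# Scale parameters (degrees) reported for the red and blue targets.
red_hwhm = hwhm_from_sigma(75.4)
blue_hwhm = hwhm_from_sigma(5.98)

print(f"red HWHM:  {red_hwhm:.2f} deg")
print(f"blue HWHM: {blue_hwhm:.2f} deg")
print(f"combined:  {red_hwhm + blue_hwhm:.2f} deg of a 90 deg quadrant")
```

For the 8th power the conversion factor is only about 1.04, so the half-height-half-width and the scale parameter are nearly interchangeable.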

Chromaticity processing in grouping mechanisms

In addition to symmetry, there are other phenomena that require the visual system to integrate local information into a global percept. There have been studies concerned with the role of color in these integration phenomena, most of which did not show color selectivity. For example, Gheorghiu and Kingdom38 measured the perceived spatial frequency of a waved contour consisting of a string of Gabor elements, before and after the observers adapted to a waved texture. They found that whether the adaptor and the test were modulated in the same color direction had little effect on the magnitude of the aftereffect. Hence, they suggested that the texture-shape mechanism was nonselective for color. Some other studies using Glass patterns, which contain randomly distributed dot pairs or dipoles whose orientations are determined by certain geometric transforms39,40, came to the same conclusion. Wilson and his colleagues41,42 manipulated the color difference between dipoles to see how the global mechanism pools local orientation information across dipoles to extract a global pattern. They found no effect of the difference in luminance and isochromaticity between local features on Glass pattern detection. Hence, they argued that the global Glass pattern detection mechanisms are not color selective. Evidence from the adaptation paradigm also supports the claim of no color selectivity. Rentzeperis and Kiper43 found that adapting to Glass patterns significantly elevated the threshold for the subsequent detection of patterns of the same form, regardless of their color or luminance polarity. With a noise masking paradigm similar to ours, Wilson and Switkes41 also showed that noise masks always raise Glass pattern detection thresholds regardless of the chromaticity of the target and mask. This also went against the prediction of the color selective mechanism hypothesis.

On the other hand, there are also studies showing that the global pattern mechanisms are color selective. Using a noise masking paradigm, but different experimental settings from Wilson and Switkes41, Cardinal and Kiper44 found that detection thresholds of Glass patterns were highest when the chromaticity of the noise was equal to or near that of the signal and that the threshold decreased as the difference between signal and noise colors increased. Hence, they suggested that global pattern processing is color selective.

There may be two reasons for the discrepancy in color selectivity in these global pattern mechanisms. The first is that different types of stimuli require different visual mechanisms and those visual mechanisms have different color properties. Some are color selective while others are not. Second, the discrepancy may be due to the difference in task requirements and in turn the mechanisms involved. Some tasks may rely more heavily on local mechanisms than others. Hence, some experiments may just pick up the color properties of the local mechanisms rather than that of the global pattern mechanisms. For example, detecting Glass patterns requires local mechanisms to process the local orientation information provided by individual dipoles45,46,47,48,49,50. If these local pattern mechanisms are color selective, a task which requires the observers to rely on local information will display the color selectivity of the local mechanisms rather than the global one41,42,51.

Our result, however, cannot come from the local mechanisms. Unlike other global pattern integration phenomena, such as Glass patterns or texture contour, where the local feature grouping plays a crucial role, symmetry detection requires the visual system to find long-range pairs23. The possibility of local information intrusion may come from local elements near the symmetry axis. We avoided this intrusion by inserting a blank region covering the symmetry axis, removing the local information available to the observer. Hence, contamination from the local mechanism was largely reduced in our experiment. Therefore, the color selectivity properties we found here should be due to a unique symmetry mechanism that is different from those for other global pattern integration phenomena.

Methods

Participants

Three observers participated in this experiment. Among them, CCW was one of the authors of this paper while the other two were naïve to the purpose of the experiment. All observers had corrected-to-normal (20/20) visual acuity. This study was approved by the IRB of National Taiwan University Hospital. Written consent was obtained from all observers.

Apparatus

The stimuli were presented on a 24-inch LCD monitor controlled by a Macintosh computer via a Radeon 7200 graphics board which provided 10-bit digital-to-analog converter depth. The LCD monitor was calibrated with a PhotoResearch PR655 radiometer for both luminance and chromaticity. The display had a mean luminance of 76.81 cd/m² and mean chromaticity at (0.33, 0.33) in CIE 1931 xy coordinates. The refresh rate of the monitor was 60 Hz. The viewing distance was set such that each pixel subtended 1′ of visual angle.

Stimuli

The stimuli were composed of dots randomly distributed in a 46 (W) by 40 (H) grid system. The width of each cell was 0.21° visual angle. The display had a 9.9° visual angle extent. The position of a dot was jittered within the cell. The purpose of using the grid system was to avoid overlapping dots within one image.

Each dot was defined by an 8th-power Gaussian function, $K(x, y) = BG + BG \mathbin{.*} C \exp\left(-\left(x^8/2\sigma^8 + y^8/2\sigma^8\right)\right)$, where x and y were the distances in degrees from the fixation point; σ = 0.11° was the space constant; BG was a 3 by 1 vector that specified the cone excitation coordinates of the background; C was the 3 by 1 cone contrast column vector52 that specified the color modulation; and the symbol “.*” denoted element-by-element multiplication of the two vectors.
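A minimal sketch of this dot profile follows. It assumes a negative exponent, so that the modulation decays with distance as in the 8th-power Gaussian used for the tuning fits, and it takes x and y as offsets from the dot center. The function names, the background vector and the cone contrast values are hypothetical placeholders rather than the calibrated values used in the experiment.

```python
import math

SIGMA_DEG = 0.11  # space constant from the text

def dot_envelope(x, y, sigma=SIGMA_DEG):
    """Spatial envelope exp(-(x^8 + y^8) / (2 * sigma^8)); equals 1 at the center."""
    return math.exp(-((x ** 8) + (y ** 8)) / (2.0 * sigma ** 8))

def dot_cone_excitation(x, y, background, cone_contrast, sigma=SIGMA_DEG):
    """K = BG + BG .* C * envelope, computed element by element for (L, M, S)."""
    env = dot_envelope(x, y, sigma)
    return [bg + bg * c * env for bg, c in zip(background, cone_contrast)]

# Hypothetical background excitations and cone contrast vector (assumed values).
background = [1.0, 1.0, 1.0]
cone_contrast = [0.1, -0.2, 0.0]

print(dot_cone_excitation(0.0, 0.0, background, cone_contrast))  # dot center
print(dot_cone_excitation(0.5, 0.0, background, cone_contrast))  # far from the dot
```

Because of the 8th power, the envelope stays close to 1 within roughly one space constant of the center and then drops sharply to the background, which gives each dot a nearly uniform interior with a smooth edge.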

The cone contrast vector $C = [C_L, C_M, C_S]^T$ was a column vector with three elements. Among them, the L-cone contrast, $C_L$, was defined as $\Delta L / L_0$, where $L_0$ was the L-cone excitation produced by the background and $\Delta L = L - L_0$, where L was the L-cone excitation to the center of a dot. If there was a decrement in cone excitation, the cone contrast was negative. The M-cone and S-cone contrasts, denoted by $C_M$ and $C_S$ respectively, were defined similarly. Cone excitations were the product of the spectral power distribution of the light and the estimated spectral sensitivity functions of the corresponding cones53.

Each cone contrast vector can be separated into two parts: a scalar value for contrast and a normalized cone contrast vector, C/|C|, where |C| denotes the length of the vector C. The contrast of a stimulus was defined as $c = (C_L^2 + C_M^2 + C_S^2)^{0.5} / 3^{0.5}$. This measure was proportional to the square root of the cone contrast energy and varied between 0 and 1. Contrast was expressed in dB re 1, which equaled $20 \log_{10} c$. In the experiment, the contrast of each stimulus was set at three times the threshold of that stimulus for each observer. The experiment for threshold measurement is described in the Supplementary Method online.
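The contrast measure and its decibel form can be sketched as follows. The function names are hypothetical and the example vectors are chosen only to make the arithmetic easy to check.

```python
import math

def contrast_magnitude(c_l, c_m, c_s):
    """c = sqrt(C_L^2 + C_M^2 + C_S^2) / sqrt(3); equals 1 when all cone contrasts are 1."""
    return math.sqrt(c_l ** 2 + c_m ** 2 + c_s ** 2) / math.sqrt(3.0)

def contrast_db(c):
    """Contrast in dB re 1: 20 * log10(c)."""
    return 20.0 * math.log10(c)

# A luminance modulation with 10% contrast in every cone type.
c = contrast_magnitude(0.1, 0.1, 0.1)
print(f"c = {c:.3f}, {contrast_db(c):.1f} dB")
```

Dividing by the square root of 3 normalizes the vector length so that a modulation of equal size in all three cone types yields a contrast equal to that per-cone value.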

The stimuli were defined in a color space spanned by one isochromatic and two isoluminance axes. The isochromatic axis defined luminance modulation from the background. The two directions of this axis were white and black, with the normalized cone contrast vectors $[0.577, 0.577, 0.577]^T$ and $[-0.577, -0.577, -0.577]^T$ respectively. The two isoluminance axes were computed as the null space of the CIE2007 luminous efficiency function $V_\lambda$54. That is, we first computed the relative contribution of each cone type to the luminosity function, which corresponded to the normalized cone vector $[0.853, 0.522, 0]^T$. We then computed the null space of this cone excitation vector and scaled the result by the background to obtain the two isoluminant axes. One isoluminance axis defined the modulation between red and green, with normalized cone contrast vectors $[0.416, -0.909, 0]^T$ and $[-0.416, 0.909, 0]^T$ respectively, while the other axis defined the modulation between blue and yellow, with normalized cone contrast vectors $[0, 0, 1]^T$ and $[0, 0, -1]^T$ respectively. Notice that any color on the plane spanned by the two isoluminant axes was also isoluminant.

For a better visualization, one can also represent the color of a stimulus as a point in a polar cone contrast space36,37,55 with the distance from the origin to the point representing contrast; elevation, luminance; and azimuth, hue. In this space, the elevations −90° and +90° represent black and white respectively. The azimuth angles 0°, 180°, 90° and 270°, with zero elevation, represent red, green, blue and yellow respectively. Table 1 lists polar representations of our stimuli and their corresponding cone contrast vectors.

Table 1 The coordinates of the color space and chromoluminance cone contrast space of the color

In each trial, the stimulus consisted of a noise mask composed of random dots, on which was superimposed either a target of vertically symmetric dots or a noise control composed of random dots. The density of the noise mask was fixed at 0.01. That is, there were about 18 noise dots in the display. In a symmetric target, half of the image was a reflection of the other half, about a vertical axis. That is, a pixel at position (x,y) of the symmetric target I has the property I(x, y) = I(−x, y). The density of the target and the noise control varied from one trial to the next according to the experimental procedure discussed below. The target and noise control were either red, blue or white. The normalized cone contrast vector of the noise mask deviated from the target by 0°, 22.5°, 45°, 67.5°, 90°, or 180°, either on the isoluminant plane or on the plane spanned by the red-green axis and the luminance axis. To prevent observers from using local information near the axis to make a judgment, no dots were presented in the region 0.7° to the left and right of the symmetry axis. Figure 2 shows examples of our stimuli.
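The stimulus geometry described above can be sketched as follows. This is an illustration under stated assumptions rather than the experiment code: the jitter range, the treatment of density as a fraction of all grid cells, and the function names are assumptions, while the grid size, cell width, blank strip and mask density come from the text.

```python
import random

GRID_W, GRID_H = 46, 40   # grid cells (width, height), from the text
CELL_DEG = 0.21           # cell width in degrees, from the text
GAP_DEG = 0.7             # blank strip on each side of the axis, from the text
MASK_DENSITY = 0.01       # fixed noise mask density, from the text

def cell_center(col, row):
    """Center of a grid cell in degrees, with the symmetry axis at x = 0."""
    x = (col - GRID_W / 2 + 0.5) * CELL_DEG
    y = (row - GRID_H / 2 + 0.5) * CELL_DEG
    return x, y

def symmetric_target(density, rng, jitter=0.25):
    """Return dot positions (x, y) that satisfy I(x, y) = I(-x, y).

    `jitter` is the assumed maximum displacement as a fraction of a cell.
    """
    # Right-half cells that lie entirely outside the blank strip.
    cells = [(c, r) for c in range(GRID_W // 2, GRID_W) for r in range(GRID_H)
             if cell_center(c, r)[0] - CELL_DEG / 2 >= GAP_DEG]
    n_pairs = round(density * GRID_W * GRID_H / 2)
    dots = []
    for c, r in rng.sample(cells, n_pairs):   # distinct cells, so no overlap
        x, y = cell_center(c, r)
        x += rng.uniform(-jitter, jitter) * CELL_DEG
        y += rng.uniform(-jitter, jitter) * CELL_DEG
        dots.append((x, y))     # dot on the right of the axis
        dots.append((-x, y))    # its mirror image on the left
    return dots

# Number of noise dots implied by the fixed mask density.
n_noise_dots = round(MASK_DENSITY * GRID_W * GRID_H)

rng = random.Random(0)
target = symmetric_target(0.1, rng)
print(len(target), "target dots;", n_noise_dots, "noise dots")
```

Sampling cells only on one side and mirroring them guarantees exact reflection about the vertical axis, and drawing distinct cells keeps dots from overlapping, which is the stated purpose of the grid.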

Figure 2

The results of the experiment.

Panel A represents the results for the isoluminant red target condition. The red symbols denote the target density thresholds for a red target superimposed on a noise mask of various colors respectively. The pink symbol denotes the red target density threshold when there was no mask, serving as a baseline. Panel B represents the results for the isoluminant blue target condition. The blue symbols denote the target density thresholds for a blue target superimposed on a noise mask of various colors respectively. The cyan symbol denotes the blue target density threshold when there was no mask, serving as a baseline. Panel C represents the results for the luminant white target condition. The black symbols denote the target density thresholds for a white target superimposed on a noise mask of various colors respectively. The gray symbol denotes the white target density threshold when there was no mask, serving as a baseline. Each graph in three panels represents data from one observer.

Procedures

A temporal 2AFC paradigm was used to measure the density thresholds of the symmetric target detection. In each trial, the symmetrical target was randomly presented in one of the two intervals while the noise control was presented in the other interval. Both were embedded in the noise mask that was presented in both intervals.

The stimulus duration was 233 ms and the inter-stimulus interval (ISI) was 600 ms. An audio tone indicated the beginning of each interval. The task of the observers was to judge which interval contained a symmetric target. The observers were informed that the symmetry axis was vertical. Auditory feedback on the response was provided. The PSI threshold-seeking algorithm56 was used to measure the density threshold at the 75% correct level. There were 40 trials for each threshold measurement. Each data point reported was an average of four to eight repeated measurements. The experiment was blocked by target chromaticity to reduce uncertainty. Within each block, the order of noise type was randomized.