Interocularly merged face percepts eliminate binocular rivalry

Faces are important visual objects for humans and other social animals. A complex network of specialized brain areas is involved in the recognition and interpretation of faces. This network needs to strike a balance between being sensitive enough to distinguish between different faces with similar features, and being tolerant of low-level visual changes so that a given face is stably perceived as a particular individual. Such stability may require feedback from higher brain regions down to the level where details are represented. Here, we describe a phenomenon in which interocular competition between face features is stabilized and eliminated when observers attend high-level face characteristics. Two different face images presented to the individual eyes do not cause the perceptual fluctuations that are typically observed in binocular rivalry. Instead, they merge into a stable percept of an intermediate face that combines features from both eyes’ images. The stability of the intermediate face percept depends on the observer attending holistic face properties such as identity or gender. It disappears when observers explicitly attend facial features, suggesting a crucial role of top-down stabilizing feedback from high-level areas that represent holistic faces back to lower processing levels where detailed face features compete for conscious representation.

Faces are important visual objects for humans and other social animals. A complex network of specialized brain areas is involved in the recognition and interpretation of faces. This network needs to strike a balance between being sensitive enough to distinguish between different faces with similar features, and being tolerant of low-level visual changes so that a given face is stably perceived as a particular individual. Such stability may require feedback from higher brain regions down to the level where details are represented. Here, we describe a phenomenon in which interocular competition between face features is stabilized and eliminated when observers attend high-level face characteristics. Two different face images presented to the individual eyes do not cause the perceptual fluctuations that are typically observed in binocular rivalry. Instead, they merge into a stable percept of an intermediate face that combines features from both eyes' images. The stability of the intermediate face percept depends on the observer attending holistic face properties such as identity or gender. It disappears when observers explicitly attend facial features, suggesting a crucial role of top-down stabilizing feedback from high-level areas that represent holistic faces back to lower processing levels where detailed face features compete for conscious representation.
In binocular rivalry, two different images are presented to the individual eyes. This set-up typically leads to perceptual fluctuations between the two images every few seconds 1,2 . Because of this clear dissociation between visual input and conscious visual experience, vision scientists often use binocular rivalry demonstrations to inform a general audience about the decisive role of the brain in visual perception. On one such an occasion, we attempted to demonstrate the binocular rivalry phenomenon with conflicting male and female face images and serendipitously noticed that, instead of the expected fluctuations between perceiving male and female faces, we perceived a stable intermediate, androgynous face (Fig. 1a). The fact that the stable face percept eliminated typical binocular rivalry alternations suggests a role of top-down stabilization mechanisms in face perception, whereas the intermediate androgynous nature of the percept implies that face feature information from both eyes is used to generate the percept. We set out to formalize this phenomenon and investigate its mechanism in a series of experiments.
It was later brought to our attention that interocular face merging had already been informally described in a letter to Charles Darwin, sent in 1877 and discussed in the works of Sir Francis Galton 3 . It was qualitatively explored about 80 years later 4 , yet despite this early exploratory study encouraging "further study to provide a satisfactory account of the effect" 4 , we believe that our current work offers the first controlled investigation of the phenomenon. switches than the more conventional rivalry between orthogonal gratings or house and face images but significantly more than between the two faces (one-way ANOVA with factor stimulus type F(3,28) = 77.8, p < 0.001; Tukey's HSD multiple comparison's tests, all p < 0.001, except G:G vs. H:F p = 0.80). This suggests that part of the stability of interocularly merged face percepts may be due to the high level of similarity between the two face images, but another important part is presumably due to mechanisms that are typical for face perception. We investigate this further in the remaining experiments. Experiment 2 addressed the apparent interocular perceptual merging of monocularly presented face images. We briefly (1,000 ms) presented observers with a male face image in one eye and a female face image in the other eye (M:F) and asked them to rate the gender of the face they perceived by setting a slider between 100% male and 100% female. In other conditions of this experiment we presented the two eyes with identical images of either male or female faces (M:M/F:F), or a morph image that we constructed to be exactly intermediate between male and female (I:I). The distributions of reported gender ratings were compared among stimulus conditions with the rationale that if male-female face image combinations rival, observers would either perceive the male or the female face and gender rating distributions would be bimodal and comparable to the combined distribution of the all-male and all-female conditions. If, however, male-female face combinations merge into a stable intermediate face, gender rating distributions are more likely to be unimodal, centered around the middle of the male-female scale and comparable to the intermediate morph condition. The latter is what we found (Fig. 1c). Gender rating distributions for combinations of male and female faces were unimodal (Hartigan's dip test, D = 0.012, p = 1) 5  Attention to holistic faces stabilizes rivaling face features. In Experiment 3A, we investigated the interocular face merging phenomenon in a different way. We reasoned that if the intermediate face percept is indeed a composite of face feature information from both eyes, then removing the image from one eye during the stable percept should always cause a change in perceived face identity as some of the features which contributed to the mixed percept will disappear. If, however, the two face images would rival as in traditional binocular rivalry, then removing one of them would only cause a perceptual change if the removed image was perceptually dominant at the time of removal. In addition, to test whether the stabilization of face rivalry could be a specific consequence of the specialized face perception system we also performed the experiment with upside-down faces that are thought to engage the brain's face perception system to only a limited extent [6][7][8][9][10] .

Interocular face combination yields intermediate face percepts.
Face images were presented to the two eyes for 1,500 ms. After a brief blank interval (200 ms), only one of the images returned on the screen for another 1,500 ms and observers reported whether face identity changed across the blank interval. As expected, for all conditions with two identical images (M:M, F:F, or I:I) observers rarely reported a change (False positive rate = 7.2%) (Fig. 2a). For the upright male-female face condition, however, observers reported more identity changes than the expected rate of 50% for rivalling stimuli (one-tailed t-test, t(7) = 5.54, p < 0.001). This was a significantly higher rate than that observed for upside down faces (t(7) = 6.44, p < 0.001) for which the reported rate was indistinguishable from 50% (t(7) = −0.52, p = 0.69). Moreover, while changes from the interocular perceptual morph were on average reported at a slightly higher rate when probed with a male face (76% ± 5% SEM) compared to when probed with a female face (66% ± 4%)(two-tailed t-test, t(7) = 2.73, p < 0.03), both cases had a reported change rate that was significantly higher than 50% (one-tailed t-tests, both p's < 0.01).
To ensure that these results were not due to difficulties in recognizing identity changes in upside-down faces, we carried out a control experiment (Experiment 3B) in which we repeated those conditions of Experiment 3 A in which the same images were presented to the two eyes in the first interval (i.e. M:M, F:F and I:I), but now in half of the trials we presented a different face image in the second interval, expecting this to lead to high levels of identity change detection. For M:M and F:F, changes were detected well above 50% (one-tailed t-tests, p < 0.01) and observers were equally accurate for upright and upside-down faces (t(7) = 1.08, p = 0.32). In all other conditions, performance was slightly better for upright than for upside-down faces (all p < 0.05), but both changes and no-changes were reported well above 50% for both orientations (all p < 0.01).
Since the continuous rivalry paradigm in Experiment 1 showed that rivalry between two similarly looking house images also evoked significantly less perceptual switching than rivalry between gratings or between houses and faces, we wondered whether the interocular merging we observed for face images would also to some extent be present for house images. To test this idea, we performed another control experiment (Experiment 3C). Here, we presented the same or different house images to the two eyes. After a brief blank interval, only one of the house images re-appeared and observers reported whether the houses they perceived before and after the blank were the same or different. The results were strikingly different from those with face stimuli in Experiment 3A (Fig. 2b). Changes were reported in the rivalry condition (i.e., two different houses presented before the blank) on approximately 50% of the trials (50% ± 2%; two-tailed t-test against 50%, t(7) = 0.17, p = 0.87). For individual observers, biases were observed towards one of the two houses dominating the rivalry (i.e., a high probability that observers reported a perceptual change when the other house image was shown after the blank). This effect was however not significant at the group level (change rate when probed with house 1: 65% ± 8%, t(7) = 1.86, p = 0.11; change rate when probed with house 2: 35% ± 6%, t(7) = −2.31, p = 0.05; house 1 vs. house 2: t(7) = 2.11, p = 0.07). In cases where the same images were shown to the two eyes before the blank, observers reliably reported both changes (96% ± 1%, t(7) = −29.11, p < 0.001) and repetitions (7% ± 1%, t(7) = 56.06, p < 0.001).
Since the face perception system is thought to process holistic faces and facial features with independent mechanisms 11-15 , we also investigated whether the interocular merging of face images is specific for observers attending holistic face features, like identity (Experiment 4). To this end we asked observers to focus on the eyes of the faces and report changes in this feature. Using the same paradigm as for Experiment 3A, observers were presented with either two identical face images (male or female), or with a male face image in one eye and a female face image in the other eye. Only one of the images was presented again after the blank interval and observers were asked whether they perceived any changes specifically in their perception of the eyes before and after the blank. In contrast to when subjects attended the identity of the face, the observers reported changes in the eye-percept on approximately 50% of trials. This proportion was independent of whether the second image was a male (48% ± 5%) or female face (54% ± 4%)(two-tailed t-test, t(18) = −0.89, p = 0.39), suggesting that face features indeed engaged in rivalry (Fig. 2c). The results were similar regardless of whether the faces were presented in an upright or inverted orientation (comparison to 50% level, upright: t(9) = 1.22, p = 0.26, upside-down: t(9) = 0.11, p = 0.91; upright vs. upside-down: t(9) = 0.87, p = 0.41). The results did not reflect an increased tendency to report eye changes as observers rarely reported a change when the post-blank face was identical to the pre-blank faces (12.0% and 12.5% for upright and upside-down faces respectively).

Monocular grouping cues prevent interocular merging of face percepts.
Our results thus far point to a mechanism in which face features may engage in binocular rivalry before being interocularly combined into an intermediate face percept that comprises visual information from both eyes. Once established, this face percept is then stabilized through top-down influences that rely on holistic face processing. We hypothesized that this mechanism would depend on a lack of monocular feature grouping cues and/or interocular feature contrast. To test this idea, we performed Experiment 5, a version of the gender-rating experiment (Experiment 2) in which we

Discussion
Binocular rivalry is typically characterized as fluctuations between perceptual representations of conflicting stimuli presented to the two eyes. The perceptual fluctuations are however hardly ever this clear-cut and, depending on stimulus conditions, there can be a substantial proportion of the observation time when mixtures of the two stimuli are perceived instead [16][17][18][19] . The amount of interocular mixing and suppression can provide insight into the neural mechanisms of perceptual organization 18 . For conflicting face images, we have shown that in the absence of explicit monocular grouping cues or additional feature competition between the eyes, face information of the two eyes is integrated into a single percept.
Faces are a somewhat special class of visual objects, thought to engage a highly specialized neural machinery 11,14,15,20 . An obligatory face detection stage has been suggested to gate access of visual information to this machinery based on an initial comparison of the incoming visual features against a rough but specific face template 20 . If the matching to this template depends on specific contrast values of the face and on its upright orientation, it would explain why many of the effects that distinguish face processing from object processing are absent if faces are displayed upside-down or with inverted contrast [6][7][8][9]21 . In our experiments, the interocular merging of face information was absent when faces were presented upside-down, suggesting that this phenomenon specifically relies on properties of these specialized face processing modules.
The primate face perception system comprises a complex network of specialized brain areas 20,[22][23][24][25][26][27] . While different aspects of face perception have been shown to preferentially engage the left and right hemispheres [28][29][30][31] , recent findings from studies with monocularly presented images suggest that earlier cortical or even subcortical structures have specific roles in face perception as well [32][33][34] . In these studies, the detection of low-level differences between face images shown before and after a blank interval was better for face images that were monocularly shown to the same eye compared to images shown to different eyes. This advantage disappeared when observers were probed for more complex face features, like gender, implying that either the interpretation mechanisms that rely on holistic face percepts operate on binocular visual information only, or that estimating a gender difference does not require detailed low-level analysis of the individual face features. The interocular mixing of face features into holistic face percepts we observe in the current experiments suggests the former. Our results have important implications for the understanding of binocular rivalry. The fact that participants perceived an ambiguous face composed of features originating from different eyes is difficult to explain as a result of global competition between the two eyes. Interocular combination of visual information into coherent percepts has previously been shown to occur in binocular rivalry 19 . However, in this previous study, interocularly grouped percepts were perceived only fleetingly and more recent studies have questioned whether these grouped percepts may have arisen due to a more local form of eye competition 35,36 . Our results are consistent with a local form of eye competition in which a random selection of male and female monocular features win the competition for representation at binocular levels (Fig. 4). What is remarkable is that this interocularly grouped percept appeared almost completely immune to rivalry fluctuations. This result suggests that once the initial, local competition between the eyes establishes the chimeric face percept, this representation is somehow stabilized to prevent rivalry with a face composed of the non-selected features.
One likely source of this stabilizing factor is feedback from neurons representing the identity of the face. If these neurons actively select their preferred features and/or suppress their non-preferred features, then this may be sufficient to bias the competition between different features so that the identity of the face remains 'locked-in' . This stability could be broken by specifically attending to one of the features (i.e., the eyes) of the face. In this case, the eyes appeared to rival as would be expected in traditional binocular rivalry. This result is consistent with the view that the stabilizing factor comes from a level higher than the processing of individual face features. This stabilization of face identity may be an important evolutionary adaptation allowing the face recognition system to be robust to changes in the incoming visual input (for example small changes in viewing angle or lighting conditions) and extract the unchanging identity of the face.
Is the stabilization of an interocularly merged percept specific for faces? Two dichoptically presented grating stimuli do generally not stabilize into a merged plaid percept, but we did observe that competing house images perceptually alternate at a lower rate than gratings or a house image competing with a face image. While this suggests that the amount of interocular conflict could play a role in the phenomenon, we found no evidence for the interocular merging of house images in the change detection paradigm. This could however be a matter of expertise. Faces are somewhat special visual objects because they are abundantly present, highly socially relevant, and stereotypical in their feature configuration. As a result, most humans are face perception experts and effortlessly assign high-level meaning to faces, like gender or emotion. It is conceivable that comparable expertise for another stimulus class could yield similar results, but it is difficult to imagine a stimulus class where such expertise exists. Perhaps expert house observers, like architects, will perceptually merge two dichoptically presented house images into an intermediate house percept that, to them, is clearly distinguishable from both constituent images on the basis of a holistic attribution that is inaccessible to non-experts. It would furthermore be interesting for future studies to investigate how two dichoptically presented face images are perceived by observers that lack the common holistic face perception expertise, for instance those that are diagnosed with prosopagnosia or autism spectrum disorder [37][38][39] .

Methods
Participants. Observers (Exp 1: n = 8; Exp 2: n = 9; Exp 3 A,B,C: n = 8; Exp 4: n = 10; Exp 5: n = 8) had normal or corrected-to-normal visual acuity and ranged in age between 20 and 37 years. They gave informed consent and all, except one author, were naive with respect to the aims of the study. All experimental procedures were approved by the Ethics Committee from the Faculty of Social and Behavioural Sciences at the University of Amsterdam and performed in accordance with all relevant guidelines and regulations.
Stimuli. All stimuli were generated and displayed using the PsychToolbox extensions 40,41 in Matlab R2010a (The MathWorks Inc., Natick, MA) running under Microsoft Windows 7. They were presented on a gamma-corrected DELL Trinitron 21″ CRT Monitor (1280 × 1024, 85 Hz) and viewed through a mirror stereoscope with a chinrest at a distance of 51 cm. To facilitate binocular fusion, all rivalry images were 3 degrees wide, 4 degrees high and drawn inside a 3.5 by 4.5 degrees alignment frame, on top of an alignment cross with a horizontal section of 5 degrees and a vertical section of 6 degrees (Fig. 1a). To further aid fusion, 100 circles with a random size between 0.5 and 2 degrees and a random luminance between 30 and 70% of the screen's white intensity, were randomly positioned on the neutral gray background.
Ten male (M) and ten female (F) face images, Caucasian, with frontal viewpoints, forward gaze, and neutral expressions, were selected from the Radboud Faces Database (RaFD) 42 . These images were cropped within an elliptical aperture, converted to grayscale, matched for contrast, and facial features were aligned. The house images and gratings (oriented at +45 and −45 degrees) of Experiment 1 were treated similarly. For ten pairs of unique male and female faces we used the Morph Age Pro (Creaceed, Mons) software to create a series of morph images between the male and female face based on 5 fiducial markers. The middle image of each series was selected as the 'Intermediate morph face' (I) of the two original images.
Procedures. In Experiment 1, pairs of images (Grating:Grating, House:M, House:F, M:F, House:House) were shown for two minutes while observers reported any perceptual change by pressing a key on the keyboard. Each condition was shown twice, in random order and counterbalanced over the eyes. We analysed the rate of reported perceptual changes per condition.
In Experiment 2, pairs of face images were shown to the two eyes (M:M same, F:F same, M:F, I:I same) for 1,000 ms after which a slider appeared on the screen and participants used the keyboard to indicate how they judged the gender of their face percept by adjusting the slider between 100% male and 100% female. A total of 240 trials were performed per observer, equally spread over conditions, presented in random order, and counterbalanced over the eyes. We calculated the distributions of gender estimates across conditions. Unimodality of these distributions was evaluated with Hartigan's Dip Test 5 . We determined whether gender-rating distributions were statistically distinguishable using Kolmogorov-Smirnov tests and estimated whether pairs of distributions differed in their similarity by bootstrapping Kolmogorov-Smirnov tests and comparing the resulting distributions of KS statistics with Mann-Whitney U tests.
In Experiment 3A, pairs of face images were shown to the individual eyes (M:M same, F:F same, M:F, I:I same) in upright or upside-down orientation for 1,500 ms. After a 200 ms blank interval during which only the alignment elements were displayed, only one of the original face images returned on the screen for another 1,500 ms and observers indicated whether they perceived a change of face identity across the blank by pressing one of two buttons on a keyboard. A total of 320 trials were performed per observer, equally spread over conditions, presented in a random order, and counterbalanced over the eyes. The probability of a reported change was evaluated with t-tests against the 50% level or between conditions. In Experiment 3B, pairs of images (M:M, F:F, or I:I) were again shown for 1,500 ms followed by a blank 200 ms blank interval. In the 1,500 ms after the blank either one of the original images or a new image from the same category (M, F, or I) was shown to one of the eyes and observers reported whether they perceived an identity change by pressing one of two buttons on a keyboard. Both upright and upside-down faces were tested. A total of 160 trials were performed per observer, equally spread over conditions, presented in random order, and counterbalanced over the eyes.
Experiment 3C was similar to 3A and 3B, but this time pairs of either the same or different house images were shown to the individual eyes for 1,500 ms. After a blank period of 200 ms blank interval, only one of the original house images was shown to one of the eyes and observers reported whether this house was the same as, or different from, the house they perceived before the blank. 240 trials were performed per observer, equally spread over conditions, presented in random order, and counterbalanced over the eyes. Experiment 4 replicated Experiment 3A with a subset of image pairs (M:M, F:F, M:F). Both images (either upright of upside-down) were again shown for 1,500 ms, followed by a 200 ms blank, and a 1,500 ms interval in which one of the images was shown again. Observers this time reported whether they detected a difference in the eyes before and after the blank by pressing one of two buttons on a keyboard. A total of 160 trials were performed per observer, equally spread over conditions, presented in random order, and counterbalanced over the eyes. Experiment 5 replicated Experiment 2 with the only difference that each eye's image was displayed in a different color, red. RGB intensities of 100%, 50% and 50% (red) and 32.5%, 32.5% 65% (green) were chosen based on subjective determination of being balanced during rivalry. The color-to-eye assignment was counterbalanced and pseudorandom. Data availability. Both the datasets generated during and/or analyzed during the current study and the experiment code are available from the corresponding author on reasonable request.