The other-race effect and holistic processing across racial groups

Wong, Hoo Keat; Estudillo, Alejandro J.; Stephen, Ian D.; Keeble, David R. T.

doi:10.1038/s41598-021-87933-1


Article
Open access
Published: 19 April 2021


Scientific Reports volume 11, Article number: 8507 (2021)


Abstract

It is widely accepted that holistic processing is important for face perception. However, it remains unclear whether the other-race effect (ORE) (i.e. superior recognition for own-race faces) arises from reduced holistic processing of other-race faces. To address this issue, we adopted a cross-cultural design where Malaysian Chinese, African, European Caucasian and Australian Caucasian participants performed four different tasks: (1) yes–no face recognition, (2) composite, (3) whole-part and (4) global–local tasks. Each face task was completed with unfamiliar own- and other-race faces. Results showed a pronounced ORE in the face recognition task. Both composite-face and whole-part effects were found; however, these holistic effects did not appear to be stronger for own-race faces than for other-race faces. In the global–local task, Malaysian Chinese and African participants demonstrated a stronger global processing bias compared to both European- and Australian-Caucasian participants. Importantly, we found little or no cross-task correlation between any of the holistic processing measures and face recognition ability. Overall, our findings cast doubt on the prevailing account that the ORE in face recognition is due to reduced holistic processing in other-race faces. Further studies should adopt an interactionist approach taking into account cultural, motivational, and socio-cognitive factors.


Introduction

The other-race effect (ORE; also known as the own-race bias) is a well-documented phenomenon showing that people are generally better at recognizing faces of their own race, compared to faces of different races. It exists across different countries and ethnic groups¹ and is evident not only in laboratory settings but also in real-world scenarios². Although the ORE has been extensively studied for the last four decades, the specific mechanisms underlying this effect are still poorly understood. The present paper aims to shed light on this issue by exploring the holistic processing account of the ORE³.

According to a long-standing scientific tradition, holistic processing is the hallmark of adults’ expert face recognition⁴. While the exact definition of holistic processing is a matter of ongoing debate, it is widely accepted that when adults perceive faces holistically, the facial components (e.g., eyes, nose, mouth) are integrated into a whole or gestalt-like representation^4,5. Two experimental paradigms have been widely employed as standard measures of face-specific holistic processing: the whole-part task and the composite face task. In the whole-part task^6,7, recognition memory of a facial part (e.g., the eyes) is more accurate when it is presented in the context of a whole face than in isolation, suggesting that facial features are embedded into a holistic face percept. In the composite face task⁴, observers’ performance on matching two identical top face halves is better when these top halves are misaligned (i.e., spatially offset) with different bottom halves than when the top and the bottom parts are aligned. This composite effect demonstrates that the face parts are not perceived independently from the whole face.

Holistic processing has been proposed as one important mechanism underlying the ORE. According to this view, in contrast to own-race faces, people are inefficient at integrating facial components from other races into a whole representation^8,9, and therefore other-race faces might be subject to weaker holistic processing than own-race faces. Although stronger holistic processing for own-race faces than for other-race faces has been reported using the whole-part task⁹ and the composite face task¹⁰, these results are not always replicated^11,12,13. In fact, the results obtained from the composite task are very inconsistent^8,11,14, and certainly not as consistent as those from the whole-part task. The discrepancy in the holistic effect results may stem from methodological differences between studies (e.g., face size¹⁵, measuring methods^10,16, limited construct validity of holistic processing^17,18,19, and independent sample collection from race groups who have differing levels of interracial experience^9,10,12). Yet, these observations lend support to the claim that the holistic mode of processing faces allows efficient encoding of an individual face²⁰ and can be moderated by the race of the observer²¹.

Limited experience with other-race faces has been proposed as one of the causes of the reduced holistic processing for other-race faces, and therefore the robust ORE. For example, in the aforementioned studies, Caucasian observers had very limited exposure to Asian faces in either daily life or the media; in contrast, Asian participants in these studies were international students in Western universities and reported having a similar amount of social contact with own-race and other-race individuals^8,22. Yet, this experience-based explanation of holistic processing has been questioned because other studies have found equivalent levels of holistic processing for both own- and other-race faces in Asian participants with limited exposure to other-race faces^10,12,13,23,24.

An explanation for the roughly equivalent holistic processing magnitude for own-race and other-race faces found in the Asian samples is that, compared to Caucasians, Asians are more prone to holistic processing of both face and non-face visual stimuli. For example, Asian observers exhibit a stronger global processing bias in the classical Navon task than Caucasian observers²⁵. Not only does this theoretical explanation underline the cultural differences in cognitive styles between Caucasians and Asians, but it also implies that holistic processing detected for other-race faces in Asian participants may be attributable to domain-general global processing bias instead of specialised higher-level mechanisms for face recognition, as argued by Michel et al.^10,26. Based on such a general cognitive style, Asians may maintain a relatively broad facial representation that is advantageous for recognising both own- and other-race faces, thereby reducing the ORE. This may further explain why some researchers failed to observe the ORE in Asian samples^27,28. Although empirical studies have set out to explore the association between domain-general global processes, face recognition ability, and face-specific holistic processing^29,30, only a few studies have directly evaluated this association by comparing multiple ethnic groups using face stimuli of different races. For instance, the conclusions of DeGutis et al.¹⁶ and Wang et al.³¹ that recognition ability is strongly linked to the magnitude of holistic processing lack external validity, as the former study tested only a Caucasian participant sample with Caucasian faces, whereas the latter study did not report the race of participants and used only Asian face stimuli.

The present study

The widespread assumption in the face perception literature is that the whole-part and the composite face tasks measure the same underlying (holistic) mechanisms^32,33,34,35,36. However, a recent study found no association between these two tasks³⁷, suggesting that they, in fact, tap different perceptual mechanisms. So far, only one recent study¹³ has employed both composite-face and whole-part tasks to index holistic processing while comparing two different race groups (Caucasian vs. Chinese). Mondloch et al. reported evidence that the magnitude of holistic processing for own-race and other-race faces did not differ in either Caucasian or Chinese adults. However, this cross-racial study did not measure participants’ face recognition memory, and therefore it remains unclear to what extent holistic processing affects the ORE in recognition memory.

In the present study, we investigate whether the ORE in face memory can be attributed to reduced holistic processing (as indexed by both composite-face and whole-part effects) of unfamiliar other-race faces. To increase the generalizability of our results, we test face recognition ability and holistic processing in Malaysian Chinese, African, European Caucasian, and Australian Caucasian young adults using three races of faces (Chinese, Caucasian and African faces). If holistic processing is important for recognising faces and individual-level face discrimination experience is crucial for holistic processing to develop, we would expect that participants from different race groups will show the typical ORE in face memory, and stronger holistic processing for own-race faces than other-race faces. Alternatively, if holistic processing can be generalised to facial morphologies that are less visually experienced without extensive individuating (e.g.^38,39,40), both own- and other-race faces would elicit holistic effects of similar magnitudes across race groups.

In addition, we used Navon figures to compare global–local processing differences between the four race groups. Based on the accumulated evidence of stronger global processing but weaker local processing in East Asians compared to Western Caucasians⁴¹, we predicted that Malaysian Chinese would be more susceptible to global–local interference (GLI), an index of the tendency to globally process general objects, than the Caucasian groups (European and Australian). Such a perceptual difference indicates that information-gathering strategy (global versus local processing) for general stimuli can be culture-dependent^25,42, with collectivist societies (i.e., the East) producing a preference for integrating context, and individualist societies (i.e., the West) producing a preference for ignoring context⁴³. Like South-East Asian cultures, African cultures are also considered collectivistic⁴⁴, but research on cultural differences in perceptual processing bias has often neglected this population. To ensure valid theoretical conclusions, we also tested African participants from collectivistic societies and hypothesised that they would show an evident GLI (i.e., faster and more accurate global processing).

Furthermore, if the mechanisms involved in holistic processing can apply to other object classes (e.g. Navon letters) and are not specialised for faces per se (“domain-generality hypothesis”), then GLI scores would vary systematically with performance on both the whole-part task and the composite face task. Conversely, if special mechanisms are involved in processing faces holistically (“domain-specificity hypothesis”), the magnitude of GLI would not correlate with holistic face processing measures and face recognition ability, such that perceptual biases for general information processing are not necessarily generalisable to high-level, specialised face processing.

Method

Participants

Thirty-one Malaysian-Chinese (16 females; M_age = 21.65, SD = 2.6), 30 European Caucasians (14 females; M_age = 22.40, SD = 3.10), 30 Australian Caucasians (23 females; M_age = 21.03, SD = 4.45), and 30 Africans (12 females; M_age = 26, SD = 5.5) took part in this study. All participants self-reported single rather than mixed-race descent. Malaysian Chinese were students studying at the University of Nottingham Malaysia. They were all born and grew up in Malaysia. None of them reported spending more than 9 months outside Malaysia. European Caucasian and African participants were international students recruited at the University of Nottingham Malaysia. European-Caucasians were mostly British (one Italian, one Dutch) who had resided in Malaysia for 6.5 months on average. None reported spending more than 2 years in a predominantly Asian country. African participants were mostly Nigerians (five Kenyans, two Zimbabweans, one Zambian, one Somali) who had resided in Malaysia for 1.5 years on average. Australian-Caucasian participants were recruited from Macquarie University, Sydney. All were born in Australia and had not lived in a predominantly Asian country for more than 4 months (M = 5.4 days, SD = 21, range 0–120 days). All participants reported having normal or corrected-to-normal vision and having no difficulty with face recognition. All experimental protocols were approved by the University of Nottingham Malaysia, Faculty of Science Ethics Committee, and all methods were carried out in accordance with guidelines of the British Psychological Society. The individuals depicted in all figures signed a written informed consent for their images to be published. Participants gave written informed consent prior to the experiment and received either course credit or monetary compensation of RM10 (approximately US$3) for their participation.

A priori power analysis using G*Power 3.1.9.2⁴⁵ showed that, for all of the terms in our analyses that directly related to our hypotheses (all of which are 4 × 3 within-between interactions in mixed ANOVAs), this sample size gave sufficient power to detect effect sizes of η_p² ≥ 0.06 (a small-medium effect size), with α = 0.05, and power (1 − β) = 0.80.
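For readers reproducing the calculation: G*Power parameterises ANOVA effect sizes as Cohen's f rather than partial eta squared. Assuming the standard conversion (the text does not state which value was actually entered, nor the assumed correlation among repeated measures), the 0.06 threshold corresponds to roughly f = 0.25:

```latex
f = \sqrt{\frac{\eta_p^2}{1 - \eta_p^2}} = \sqrt{\frac{0.06}{0.94}} \approx 0.25
```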

Apparatus, stimuli and procedure

Chinese, Caucasian and African faces were used. Chinese facial images were collected from a student population at the University of Nottingham Malaysia Campus; Caucasian faces were obtained from students at Macquarie University, Australia. African faces were requested from Coetzee’s⁴⁶ face database. All stimuli used in the face tasks were frontal images of young adult faces (both male and female) with neutral expression, and no glasses, facial hair, or distinctive blemishes (see Fig. 1). Individual face identities did not appear in more than one task. Considering that face photograph memorability is influenced by a combination of facial properties such as distinctiveness and attractiveness⁴⁷, 216 face images (72 for each race) were originally sampled according to the results of a prior experiment in which each face race was matched in terms of attractiveness and distinctiveness as rated by 95 young adult participants (24 Chinese, 24 Malay, 25 Indian, and 21 Caucasian) on a 7-point Likert scale⁴⁸. This selection criterion minimised potential confounds of facial distinctiveness and attractiveness on participants’ recognition performance. The original images were first cropped to form an ellipse shape that excluded external features (leaving a roughly oval shape with no hair on the top and sides). To minimise the low-level image cues (e.g., skin colour information), all face images were transformed into 8-bit grayscale images in Adobe Photoshop CS6 and were aligned on the eyes’ position using Psychomorph software⁴⁹ (http://users.aber.ac.uk/bpt/jpsychomorph/, Version 6). Stimuli were presented on a 15.6″ monitor (resolution 1366 × 768). Participants were tested individually in a quiet dimly lit room with three face tasks (yes–no recognition task, composite task, and whole-part task), in counterbalanced order. Participants also performed a global–local task; however, as this task induces holistic or featural processing biases⁵⁰, it was always performed last. 
Participants completed all tasks in approximately one hour, including breaks between each task.

Yes–no recognition task

Sixteen faces of each race group (eight females) were selected to form the experimental set. Each face was presented only once on a light grey background and sized 7.5° horizontal by 10.5° vertical at an approximate viewing distance of 60 cm. During the learning phase, participants were asked to passively view and learn 24 faces (eight per race group). On each trial, a face was presented randomly in one of the four quadrants for 5 s, preceded by a central fixation cross for 1 s. In the recognition phase, the 24 learned faces were randomly intermixed with 24 novel faces. For learned faces, the facial expression (neutral or smiling) changed between the learning and recognition phases to avoid a trivial image matching strategy. On each trial, participants were required to indicate as quickly and as accurately as possible whether they had seen the face in the learning phase. The face was presented for up to 5 s and no trial-by-trial feedback was given. If participants did not respond within the first 5 s, a blank screen would appear until they responded. Both response times and accuracy were recorded. Faces were presented in a random order, with the constraint that no more than three trials involving a given race occurred in immediate succession. The experimental procedure is illustrated in Fig. 2.
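The ordering constraint described above (no more than three consecutive trials of the same face race) can be illustrated with a simple reshuffle-until-valid routine. This is a minimal sketch, not the authors' presentation script: the function names, the `(race, face_id)` trial encoding, and the rejection-sampling approach are all assumptions made for illustration.

```python
import random

def max_run_length(labels):
    """Return the length of the longest run of identical consecutive labels."""
    longest = current = 0
    previous = object()  # sentinel that compares unequal to every label
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest

def constrained_order(trials, max_run=3, rng=None, max_attempts=10_000):
    """Reshuffle (race, face_id) trials until no race repeats more than max_run times in a row."""
    if rng is None:
        rng = random.Random()
    order = list(trials)
    for _ in range(max_attempts):
        rng.shuffle(order)
        if max_run_length([race for race, _ in order]) <= max_run:
            return order
    raise RuntimeError("no valid order found; relax the constraint")

# 48 recognition trials: 16 faces (8 learned, 8 novel) for each of three face races
RACES = ("Chinese", "Caucasian", "African")
trials = [(race, face_id) for race in RACES for face_id in range(16)]
order = constrained_order(trials, rng=random.Random(0))
```

Rejection sampling is used here only because it is easy to verify; any method that yields an order satisfying the run-length constraint would serve the same purpose.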

Whole-part task

Stimuli were created from 36 face images: 12 target faces (two of each race and sex) and 24 distractor faces containing four faces of each race and sex. Within each race and sex category, a standard face outline template was used, and each target face was created by aligning eyes, nose, and mouth features into the template. Distractor faces for the whole trials were created by replacing one feature (i.e., eyes, nose, or mouth) in the target face with the respective feature of another face of the same race and sex. Part stimuli were created by extracting the eye, nose, or mouth region from each of the target faces and the distractor faces. Target and distractor stimuli for the part trials displayed only the critical feature (see Fig. 3). At a viewing distance of approximately 60 cm, whole faces were of 7.5° horizontal by 10.5° vertical and for isolated features the sizes were: eyes 6.5° × 2.2°; nose 2.6° × 2.2°; mouth 3.8° × 1.9°.
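Stimulus sizes are reported in degrees of visual angle at a viewing distance of approximately 60 cm. The sketch below converts such angles to physical on-screen extent using the standard relation size = 2 · d · tan(θ/2); the function name is hypothetical, and the conversion is offered only as an aid to replication, not as the procedure the authors used.

```python
import math

def visual_angle_to_size(angle_deg, distance_cm=60.0):
    """Physical extent (cm) that subtends angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Whole-face stimulus: 7.5 deg horizontal by 10.5 deg vertical at about 60 cm
face_width_cm = visual_angle_to_size(7.5)
face_height_cm = visual_angle_to_size(10.5)
```

Under these assumptions a whole face spans a little under 8 cm horizontally and about 11 cm vertically on the screen.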

The task comprised three study-test race-blocks (Chinese, Caucasian and African faces). During the study phase, participants were instructed to memorise four faces (two males) and their associated names (e.g., John, James, Jill, and Jane). Each face-name pair was shown for 5 s with an inter-stimulus interval of 1 s. Participants entered the test phase only when they could correctly identify every face-name pair in a single loop; otherwise, an additional reminder would be presented after three iterations. This ensured that participants were familiarised with each face. On each trial in the test phase, a question was presented (e.g. “Which is John’s nose?”), followed by a choice of two alternative images presented on the left and right sides of the screen, both horizontally centred. In the part condition, the display consisted of two isolated features (two eyes, two noses, or two mouths), one from the target face and the other from the distractor face. In the whole condition, the display contained two whole faces, with the target and a distractor face differing only with respect to one face part. Participants were required to indicate if the target stimulus was on the left or on the right. The image pair remained on the screen until response.

Stimuli were matched between the two conditions, such that facial parts tested in the part condition were also tested in the whole condition. The whole and part conditions were randomly intermixed. Each block consisted of 24 part and 24 whole trials. The order of block presentation was counterbalanced across participants.

Composite task

Faces were generated from 60 images (20 for each race; half females) of Chinese, Caucasian, and African faces. Each face image was divided into two halves horizontally across the middle of the nose using Adobe Photoshop CS6. The top and bottom halves from same-gender faces of different individuals were then recombined at random, leaving a 3-pixel gap between the two parts. The top and bottom halves were presented either aligned or misaligned (see Fig. 4a). In the misaligned trials, the top and bottom face parts were misaligned by shifting the top half horizontally to the left by half a face width. The same composite faces were used in both conditions. This resulted in 40 aligned and 40 misaligned composite faces in total for each race category. Stimuli in the aligned condition were 7.5° horizontal by 10.5° vertical while stimuli in the misaligned condition were 11.2° horizontal by 10.5° vertical.

Following Gauthier and Bukach¹⁷ (Fig. 4a), in congruent trials, the top and bottom parts of the face were created either from the same faces or from different faces (i.e., top-same and bottom-same or top-different and bottom-different). On the other hand, in incongruent trials, one of the face halves was created from the same face, while the other half was created from different faces (i.e., top-same and bottom-different or top-different and bottom-same). This paradigm allows the calculation of a bias-free measure of sensitivity, d′^51,52.
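As a reminder of how the sensitivity index is obtained, d′ is the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The sketch below assumes that a "same" response on a same-top-half trial counts as a hit and a "same" response on a different-top-half trial counts as a false alarm. The log-linear correction for extreme rates is also an assumption, since the text does not say how rates of 0 or 1 were handled.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count and 1 to each total) keeps
    both rates strictly between 0 and 1. The correction is an assumption of
    this sketch, not a detail taken from the paper.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return z(hit_rate) - z(fa_rate)

# Hypothetical counts for one condition (20 same trials, 20 different trials)
example = d_prime(hits=18, misses=2, false_alarms=2, correct_rejections=18)
```

In a design like this one, d′ would be computed separately for each combination of congruency, alignment and face race before entering the ANOVA.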

Each trial started with a central fixation cross for 500 ms, followed by a centred face for 200 ms. After a Gaussian noise mask of 500 ms, a test face appeared randomly in one of eight locations, each placed 1.2° from the screen’s centre, for 200 ms. Next, a blank screen was presented until a response was made. The participants’ task was to judge as quickly and accurately as possible whether the top half of the test face was identical to the preceding study face while ignoring the task-irrelevant bottom half. They were instructed to indicate their decision by pressing one of two keys on a keyboard (see Fig. 4b). On each trial, both faces within a pair were either aligned or misaligned, and these two conditions were intermixed. Trials were blocked by face race, and the order of blocks was counterbalanced across participants. Hence, each participant performed three experimental blocks of 80 trials (40 aligned and 40 misaligned), half of which consisted of face pairs that shared an identical top half (same trials), and half of which consisted of face pairs with different top halves (different trials). Order of trial presentation was fully randomised across participants. Participants first completed 12 practice trials to ensure that they understood the task.

Global–local task

This task is a variant of Navon’s⁵³ task used in Wang et al.³¹ and assesses participants’ bias to attend to the global shapes versus local shapes, or vice versa. In congruent shapes, the global and the local objects forming the shapes shared an identity (e.g., local squares forming a global square). In incongruent shapes, the shapes at the two levels had different identities (e.g., local circles forming a global square). In addition to congruent and incongruent conditions, we also included a neutral (baseline) condition at both global and local levels in which a task-irrelevant object (an X) forms the global or local shapes (see Fig. 5). The Navon stimuli consisted of shapes (circle, square or cross) with white outline presented on a black background. Each local shape was 0.5° × 0.5°; the local shapes were arranged to form a global square (4.9° × 4.9°), global circle (5.6° × 5.6°), or a global cross (4.9° horizontal × 5.3° vertical).

There were two blocks of trials, each containing 18 practice and 108 test trials. Each block was preceded by instructions to identify the target shapes (circle and square) at either the global or local level as quickly and accurately as possible. In each block, there were 36 congruent trials, 36 incongruent trials and 36 neutral trials (18 local, 18 global). The neutral trials were included to serve as a baseline measure. The three main types of trials were randomly intermixed. Each trial began with a blank screen (500 ms), followed by a central fixation cross (700 ms). Then, a shape stimulus appeared randomly in one of the eight possible locations (0.49° away from the centre of the screen) for 150 ms, followed by a mask (48 × 48 array of diamonds each 0.19° × 0.19°) for 500 ms. Participants were asked to indicate whether the target shape they saw was a circle or a square as fast as possible. This task took approximately 3 min. Each participant completed 216 trials in total (108 local-level and 108 global-level), with 18 practice trials in each block.

Results

Distributions were normal as indicated by Kolmogorov–Smirnov tests (all ps ≥ 0.1). The assumptions of homogeneity of variance were met in the three main measures (i.e., d′, accuracy, and mean response time) and no violations were detected (Levene’s test, all p > 0.05). Prior to each analysis for these three measures, outliers further than two standard deviations from the mean were removed. For each ANOVA, Greenhouse–Geisser corrections were applied whenever sphericity was violated. Follow-up tests were conducted using post-hoc tests with Bonferroni correction for significant main effects and planned comparisons for significant interaction effects. Bonferroni-corrected p values were reported. To ensure there was no speed-accuracy trade-off, analyses on face task performance were repeated using mean response times (RTs) as the dependent variable. Given that the pattern of results was similar in the accuracy and RT data, in the interest of brevity, we report the response time results in the Supplementary Text.
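The outlier rule stated above (values further than two standard deviations from the mean are removed) can be sketched as follows. The use of the sample standard deviation and a single, non-iterative pass are assumptions; the text does not specify either, nor whether trimming was applied per condition or per group.

```python
from statistics import mean, stdev

def trim_outliers(values, n_sd=2.0):
    """Keep values within n_sd sample standard deviations of the mean (single pass)."""
    centre = mean(values)
    spread = stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [v for v in values if abs(v - centre) <= n_sd * spread]

# Hypothetical response times in ms, with one extreme value
rts = [510, 495, 530, 505, 520, 515, 500, 525, 490, 1900]
kept = trim_outliers(rts)
```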

It is frequently argued that p-values larger than the alpha level cannot be taken as support for the null hypothesis (e.g.^54,55,56). Thus, in addition to reporting the traditional null hypothesis significance tests, we also performed Bayesian analyses^57,58 using the statistical software JASP⁵⁹ (0.14.0.0, https://jasp-stats.org/) and the JASP default prior^60,61 (Cauchy prior, r = 0.707; JASP Team, 2020). Bayesian analysis has the pragmatic benefit that it is not based on the evaluation of significance levels that can be interpreted incorrectly, particularly when the results are non-significant⁶². The Bayes Factor (BF₁₀) is the ratio of the probability of the data given the alternative hypothesis (H₁) to the probability of the same data given the null hypothesis (H₀). A BF₁₀ value between 1 and 3 provides anecdotal evidence for H₁; a value between 3 and 10 provides moderate evidence for H₁; a value above 10 provides strong evidence for H₁; a value between 1/3 and 1 provides anecdotal evidence for H₀; a value between 1/10 and 1/3 provides moderate evidence for H₀; and a value less than 1/10 provides strong evidence for H₀.
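The evidence categories listed above can be written as a small lookup. This is an illustrative helper rather than part of JASP; the function name is hypothetical, and the handling of values that fall exactly on a boundary (1, 3, 10 and their reciprocals) is an assumption, since the text leaves those cases open.

```python
def interpret_bf10(bf10):
    """Map a Bayes factor BF10 to the verbal evidence category used in the text."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence for either hypothesis"  # boundary case; an assumption
    # Express the strength of evidence as a ratio >= 1 in favour of one hypothesis
    favoured, strength = ("H1", bf10) if bf10 > 1 else ("H0", 1.0 / bf10)
    if strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "anecdotal"
    return f"{label} evidence for {favoured}"
```

For example, `interpret_bf10(0.2)` falls between 1/10 and 1/3 and is therefore read as moderate evidence for the null hypothesis.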