Introduction

Humans are experts in facial recognition but can more accurately remember faces of their own race relative to faces of individuals of other races. This phenomenon has been called the other-race effect or own-race bias1. While individuals tend to better remember own-race faces, less time is needed to categorize faces of other races, an effect referred to as other-race category advantage2,3,4,5. Several theoretical explanations have been proposed to account for these phenomena, including perceptual experience theory6, the race-feature hypothesis7,8, the multidimensional space hypothesis9, and social cognitive theory10. Through integration of previous theories, Hugenberg and Young11 posited the categorization-individuation model (CIM) that there were two different ways of processing faces during the encoding stage: categorization and individuation. Categorization refers to the classification of the face into a certain group based on facial characteristics (e.g., skin tone). Individuation consists of distinguishing and identifying a face as belonging to a specific individual. Although categorization and individuation play an important role in shaping racial face encoding, the two ways do not have equal influences. Categorization, the first stage of facial encoding, occurs faster for other-race faces than for own-race faces, which causes observers to have less motivation to individualize other-race faces, whereas does not influence the automatic individuation of own-races faces. Consequently, other-race faces are classified by race faster, whereas own-race faces are more accurately recognized. Thus far, numerous studies have used various tasks to investigate differences in categorization and individuation of faces.

A variety of classical tasks have been used to investigate racial face processing, including the learning-recognition paradigm12,13,14, the oddball paradigm15,16, face orientation judgment17, face/race categorization task2,18, visual search task8, remember–know–guess judgment19, and the repetition lag paradigm20. For example, Levin8, using a visual search task, showed that faces of other races were more easily detected than faces of the individual’s own race. Moreover, using the learning-recognition paradigm, Wiese, Kaufmann, and Schweinberger13 asked participants to complete an additional racial categorization task in the learning phase and found that own-race faces were categorized more slowly, but were better recognized than other-race faces. Among the wide variety of research paradigms mentioned above, the racial categorization task and learning-recognition paradigm are highly favored in addressing racial face processing. By using these two tasks, researchers can directly compare the categorical and individual processing of racial faces.

Although, these studies provide some evidence that supports the categorization-individuation model (CIM), Hugenberg and Young11 pointed out that differences between categorization and individuation processing occur almost exclusively during facial encoding (rather than during storage or retrieval). Event-related potentials (ERPs) are an appropriate method to test individuation and categorization encoding of racial faces. Of the possible time points to measure ERPs, N170 has been shown to be the most important component in face related studies. N170 is a negative deflection that peaks at approximately 170 ms over occipito-temporal sites, and is thought to reflect the structural encoding of faces21 or configural processing of facial features22. Interestingly, results from previous studies at N170 under these two tasks are inconsistent. For racial categorization tasks, the majority of previous studies have found a non-significant difference in N170 of own- and other races3,23,24,25,26,27,28,29; while other studies have reported either a larger N170 for own- relative to other-race faces30, or a reversed trend, in which other-race faces exhibit a larger N170 when compared to own-race faces2,5. Previous results examining N170 during a learning-recognition task are even more complicated, as many studies embedded other tasks in the learning phase, such as a racial categorization task, or facial attractiveness evaluation task12,13,14,27,31. Although the result of N170 was reported in many of these studies, it was derived from different stages of the learning-recognition task, including the learning12,13,14 and test phases27. Ito and Senholzi4 suggested that these inconsistent results raised the question as to whether categorical and individuation processing of racial faces might be brought about by task demands, rather than racial face processing per se. Similarly, Wiese27 posited that task demands play an important role in the emergence of the N170 racial effect.

In this current study, we aimed to answer the simple and important question of whether categorization and individuation differ for processing of own- and other-race faces during the encoding stage. To address this aim, we used racial categorization and learning-recognition tasks, to provide information for future studies related to racial face encoding.

In addition to the N170 component, several other ERP components were of interest in this study. The early component, P1, which is a positive deflection over occipital areas, peaks at approximately 100 ms after the presentation of a visual stimulus. P1 is sensitive to basic stimulus properties, such as spatial frequency, contrast, and luminance32, but is also modulated by stimulus category33 and influenced by selective attention34. Recent studies have found P1 might also be modulated by task demands. Colombatto and McCarthy35, using a categorization task demand, found own-race faces had a larger P1 than other-race faces. As for demand of an individualization task, Wang and colleagues36 found P1 was influenced by race of faces and facial configural processing when face was attended to.

The midline P2 component is a positive deflection peaking approximately between 100–250 ms, which has been associated with selective attention37, and is considered an index of other racial-processing38. Research has shown larger P2 amplitudes for other-race faces, revealing greater attention orientation to other-race faces15,16,26,38,39,40,41,42,43,44, with a majority of these studies have using a racial categorization task demand. This larger amplitude of P2 for other-race faces appeared robust, even being observed when the categorization task was focused on gender15,44,45. Volpert-Esmond, et al.46 compared a race-irrelevant task (evaluation priming) and racial categorization task, and observed a larger P2 for Black than White faces from White participants for both tasks. However, Ito, et al.47 asked participants to passively look at racial faces, and think about whether they like or dislike the particular face being viewed, and did not observe a P2 difference in Black and White faces from White participants. At minimum, the results of these two studies suggest a potential influence of task demands on the amplitude of P2 for racial faces. Furthermore, when Asian faces are taken into consideration, results are also mixed41, Willadsen-Jensen and Ito42 found that both Asian and White participants have larger P2 for other-race faces using a racial categorization task; whereas, He, et al.48 did not observe a difference in P2 between White and Asian faces for White participants using a racial irrelevant task. Nevertheless, the CIM framework describing the two methods of encoding own-face and other-race faces was applied in the current study to directly compare how individualization and categorization task demands affect P2 when encoding racial faces, so that we could explore whether attentional involvement of P2 results from differences in processing quantity of facial categories and whether task demands also play a role in this processing for Asian participants.

P300 is the third ERP component to be investigated in our study. It is the most positive peak between 300 and 600 ms and is particularly sensitive to social category information15,39, task complexity49, degree of relevance, and motivational significance of stimuli50. Specifically, P300 amplitude varies between different task demands, reflecting focused attention on facial processing51 and discrimination between differences in identity52. While some studies related to racial faces have observed larger P300 amplitude for other-race faces when compared to own-race faces in racial categorization tasks28,29, another study, utilizing an identity matching task did not observe racial differences in P30036. However, despite the White and Asian faces used in these three studies, the racial results of P300 still varied between task demands. In the current study, we hoped to provide evidence of this important component related to racial face processing under recognition and racial categorization tasks at the encoding stage.

The present study aimed to investigate the effect of task demand on neural processing of own- and other-race faces during the encoding stage. By comparing the categorization task with the learning phase of the recognition task, based on the CIM, we hypothesized that task demands would not automatically affect the individual processing of own-race faces. Thus, own-race faces should demonstrate similar N170 for both categorization and recognition tasks, whereas N170 would be larger for recognition tasks for other-race faces, when compared to categorization tasks. Since we were uncertain how the task demands influenced racial face encoding related to P1, P2, and P300, we had no strong predictions regarding such conditions.

Methods

Participants

Twenty-seven Chinese participants (13 male, mean age 24.13 ± 2.00 SD) were initially selected. Ethical approval for the study was granted by Northwest Normal University in accordance with the Declaration of Helsinki. All participants signed informed consent before the experiment.

Stimuli

Stimuli consisted of 100 White (50 female) and 100 Asian (50 female) faces with neutral expressions that were selected from the CAL/PAL Database53. All faces were converted to grayscale, and placed on a black background (240 × 240 pixels), informed consent to publish identifying images was obtained.

Procedure

Throughout the experiment, participants sat in a stationary chair in a dimly lit room, holding a distance of approximately 60 cm. A 23-inch monitor with a resolution of 1920 × 1080 pixels was used for the experiments. Before the formal experiment, participants completed 8 practice trials to familiarize themselves with the experimental procedure and response button. Each participant completed eight task demand blocks (four blocks of categorization tasks and four blocks of recognition tasks) in the formal experiment, which were counterbalanced across participants (e.g., ABABABAB and BABABABA). There were 128 faces randomly assigned to four recognition task blocks, with 32 faces (half female, half Asian) in each block. Another different 64 faces were used in four categorization task blocks with 16 faces of each (half female, half Asian).

Each recognition task block included a learning and a test phase. During the learning phase, 16 faces (half female, half Asian) were presented randomly once, and then learned again in the same order. Participants were asked to remember the presented faces. Each trial started with a fixation (500 ms), followed by presentation of the face (2000 ms) that participants were asked to passively observe and remember. The test phase started when participants indicated they were ready. In the test phase, 32 faces were presented randomly (16 previously presented faces and 16 new faces). In the test phase, after the fixation (500 ms), the test face was presented for a maximum duration of 5000 ms or when the participant made a response. Participants were asked to press "F" for previously-viewed faces or "J" for new faces; then, the next trial was started (see Fig. 1).

Figure 1
figure 1

Experimental procedure of recognition and racial categorization tasks.

For each categorization block, 16 faces were randomly presented. Participants were asked to determine whether the presented face was White or Asian as accurately and quickly as possible. Each trial started with a fixation (500 ms), then a racial face was displayed for a maximum of 5000 ms, or until the participant pressed a response key: "F" for White faces and "J" for Asian faces (counterbalanced across participants) (see Fig. 1).

EEG recording and analysis

EEG was recorded using 64 Ag/AgCl electrodes mounted on an elastic cap (eego sports, ANT Neuro, Inc) and continuously sampled at 500 Hz by an AsaLab Amplifier (www.ant-neuro.com), which was referenced on-line to the vertical central parietal (Cpz). All electrode impedances were kept below 5 KΩ. EEG data processing was performed off-line using EEGLab, recalculated to average reference, and band-pass filtered (0.1–30 Hz). Contributions of eye blink artifacts were corrected. The EEG was segmented from − 200 to 800 ms relative to stimulus onset, with the first − 200 ms as baseline. To eliminate artifacts, we used an independent component analysis (ICA) from the EEGLAB toolbox54 with ICLabel55 to distinguish and exclude non-brain components. The remaining artifacts exceeding ± 100 mV amplitude were rejected. After preprocessing, the mean number of remaining trials were 110.0 and 108.9 (SD = 25.0, 25.0) for White and Asian faces of the learning phase of the recognition task, respectively, and 47.6 and 47.8 (SD = 17.1, 17.5) in the categorization task, respectively. Electrode selection was based on previous studies28,30,31. The peak latency of each component was selected based on the grand average waveform across all participants and conditions (as recommended by Luck and Kappenman56). Then, a ± 20 ms-window around the grand average peak latency for P1 and N170, ± 30 ms-window for P2, and a ± 200 ms-window for P300, were used to calculate the mean amplitude of each respective measure56, with P1 (120–160 ms O1/O224), N170 (180–220 ms P7/P8/PO7/PO857), P2 (180–240 ms Fz/ Cz/ Pz26,41), and P300 (400–800 ms Fz, Cz, Pz, Oz28). Since the main purpose was to compare the perceptual differences of racial faces under the two different task demands, we only recorded the learning phase of the recognition task to enable direct comparison with the categorization task to be made. Greenhouse–Geisser corrections were used to address violations of sphericity. A Bonferroni correction was applied for each pairwise comparison of all experiments in this study to control for type I error.

Results

Behavioral results

The mean corrected reaction times (RT) of the two face types under the recognition task demand (White: 1046.90 ± 212.25 ms, Asian: 1069.72 ± 260.71 ms) the and categorization task demand (White: 779.29 ± 154.98 ms, Asian: 820.70 ± 167.63 ms) were analyzed by a repeated-measures ANOVA, with race (White vs. Asian) and task demands (recognition vs. categorization) as within-subject factors. The main effect of task demands was significant, F(1, 26) = 90.536, p < 0.001, ηp2 = 0.777, with a faster response speed in the categorization task than the recognition task. The main effect of race was also significant, F(1, 26) = 6.989, p = 0.014, ηp2 = 0.212, with White faces exhibiting faster response speed when compared to Asian faces. However, the interaction between task demands and face race was not significant (p = 0.524, η2p = 0.016).

We analyzed accuracy of race under the recognition (White: 0.75 ± 0.07, Asian: 0.74 ± 0.09) and categorization (White: 0.93 ± 0.07, Asian: 0.94 ± 0.07) task demands using the same analysis method of RT. The main effect of task demand was significant, F(1, 26) = 117.005, p < 0.001, ηp2 = 0.818, reflecting more accuracy in the categorization task relative to the recognition task. Both the main effect of race and the interaction between task demands and race were not significant (ps > 0.338, ηs2p < 0.035).

EEG results

For P1, a 2 (race: White, Asian) × 2 (task demand: recognition, categorization) × 2 (hemisphere: O1, O2) repeated-measure ANOVA was conducted (see Fig. 2A). The main effect of task demand was significant, F(1, 26) = 9.926, p = 0.004, η2p = 0.276, with the categorization task (1.56 ± 0.48 µV) exhibiting a larger amplitude than the recognition task. The interaction between task demand and hemisphere was also significant F(1, 26) = 10.381, p = 0.003, η2p = 0.285, with the categorization task demonstrating a larger amplitude than the recognition task in both the left (1.24 ± 0.48 vs. 0.59 ± 0.40; p = 0.039) and right (1.88 ± 0.52 vs. 0.68 ± 0.41 µV; p < 0.001) hemispheres, but the variations between the two task-evoked P1 amplitudes were different, with a larger difference noted in the left hemisphere. Other main effects and the interaction did not reach significant levels (ps > 0.174, ηs2p < 0.070).

Figure 2
figure 2

Grand average waveforms. (A) P1 at O1/O2 in the 120–160 ms time window, (B) N170 at P7/8 and PO7/8 in the 180–220 ms time window, (C) P2 at Fz/ Cz/ Pz in the 180–240 ms time window, P300 at Fz, Cz, Pz, Oz in the 400–800 ms time window.

For N170, a 2 (race: White, Asian) × 2 (task demand: recognition, categorization) × 2 (hemisphere) × 2 (site: P7/8, PO7/8) repeated-measure ANOVA was performed (see Fig. 2B). The main effect of task demand for N170 was significant F(1, 26) = 14.786, p = 0.001, η2p = 0.363, with recognition (− 3.95 ± 0.64 µV) demonstrating a larger N170 amplitude than categorization (− 2.65 ± 0.56 µV) task demand. The main effect of race and the interaction effect of race and task demand were not significant (ps > 0.215, ηs2p < 0.059). The interaction between the hemisphere and task demand was significant F(1, 26) = 8.856, p = 0.006, η2p = 0.254; further analysis found that the right hemisphere demonstrated larger N170 amplitudes than the left hemisphere for both categorization (− 3.06 ± 0.66 vs. − 2.23 ± 0.50; p = 0.024) and recognition tasks (− 4.44 ± 0.75 vs. − 3.24 ± 0.57; p = 0.001), but with differing magnitudes. The interaction between hemisphere and race was significant F(1, 26) = 8.225, p = 0.008, η2p = 0.240, with White faces (− 4.02 ± 0.67 µV) evoking larger N170 amplitude than Asian faces (− 3.79 ± 0.69 µV) in the right hemisphere (p = 0.022, η2p = 0.186), and the N170 amplitude of White faces (− 2.71 ± 0.52 µV) and Asian faces (− 2.76 ± 0.51 µV) was not significantly different at the left hemisphere (p = 0.713, η2p = 0.005). The interaction between site, task demand, and race was significant F(1, 26) = 8.016, p = 0.009, η2p = 0.236. Further separate analysis found that the interaction between site and race was significant for the recognition task F(1, 26) = 4.400, p = 0.046, η2p = 0.145, with a marginally significantly larger N170 amplitude for White (− 4.81 ± 0.66 µV) than Asian (− 4.55 ± 0.68 µV) faces at P7/8 (p = 0.090), whereas a non-significant difference of N170 amplitude was found for both White (− 3.16 ± 0.64 µV) and Asian (− 3.28 ± 0.63 µV) faces at PO7/8 (p = 0.458). The interaction between the site and race was not significant in the categorization task F(1, 26) = 3.067, p = 0.092, η2p = 0.106. Other post hoc separation analyses of site, task and race did not exhibit significant two-way interaction effects (ps > 0.147, ηs2p < 0.079). The interaction between task demand, race, and hemisphere was not significant F(1, 26) = 2.510, p = 0.125, η2p = 0.088, for the left hemisphere (P/PO7), the interaction between task demand and race was not significant F(1, 26) = 0.900, p = 0.352, η2p = 0.033. Regarding the right hemisphere (P/PO8), the interaction between task demand and race did not reach a significant level F(1, 26) = 0.153, p = 0.69, η2p = 0.006. The main effect of site was significant F(1, 26) = 39.848, p < 0.001, η2p = 0.605, with P7/8 (− 4.01 ± 0.60 µV) demonstrating a larger N170 amplitude than PO7/8 (− 2.58 ± 0.57 µV). The main effect of the hemisphere was significant F(1, 26) = 10.036, p = 0.004, η2p = 0.279, so larger N170 amplitudes were observed for the right (− 3.86 ± 0.68 µV) versus (− 2.74 ± 0.51 µV) the left hemisphere. The other main and interaction effects were not significant (ps > 0.071, ηs2p < 0.120).

Analysis of P2 was conducted with a 2 (race: White, Asian) × 2 (task demand: recognition, categorization) × 3 (site: Fz, Cz, Pz) repeated-measured ANOVA (see Fig. 2C). A significant main effect for task demand was observed, F(1, 26) = 31.402, p < 0.001, η2p = 0.547, with the recognition (1.62 ± 0.27 µV) task exhibiting a larger P2 amplitude than the categorization (0.77 ± 0.24 µV) task. The main effect of race was significant, F(1, 26) = 4.377, p = 0.046, η2p = 0.144, with White faces (1.26 ± 0.25 µV) resulting in a larger P2 than Asian(1.13 ± 0.24 µV) faces. The main effect of site was also significant, F(2, 52) = 15.132, p < 0.001, η2p = 0.368, reflecting a larger amplitude of P2 at Fz than Pz (1.96 ± 0.36 vs. 1.72 ± 0.38 µV; p = 0.001), and Cz than Pz (1.72 ± 0.38 vs. − 0.101 ± 0.26 µV; p < 0.001). No other main or interaction effects reached significance (ps > 0.057, ηs2p < 0.132).

For P300, a 2 (race: White, Asian) × 2 (task demand: recognition, categorization) × 4 (site: Fz, Cz, Pz, Oz) repeated-measured ANOVA was performed (see Fig. 2C). The main effect of task demand was significant F(1, 26) = 16.538, p < 0.001, η2p = 0.389, revealing that the recognition task (0.81 ± 0.15 µV) demonstrated a larger P300 than the categorization task (0.39 ± 0.20 µV). The main effect of site was significant F(3, 78) = 4.071, p = 0.027, η2p = 0.135, reflecting a larger amplitude of P300 at Pz (1.41 ± 0.32 µV) than Fz (− 0.32 ± 0.33 µV; p = 0.024) and Cz (0.84 ± 0.26 µV; p = 0.035). The interaction between the site and race was significant, F(3, 78) = 6.610, p = 0.003, η2p = 0.203. Post-hoc analysis found that for Asian faces, a smaller amplitude of P300 was found at Fz (− 0.47 ± 0.34 µV) than Cz (0.69 ± 0.29 µV; p = 0.036) and Pz (1.34 ± 0.31 µV; p = 0.018); for White faces, a smaller amplitude of P300 was found at Fz (− 0.18 ± 0.33 µV) than Cz (0.97 ± 0.25 µV; p = 0.020). The other main and interaction effects were not significant (ps > 0.100, ηs2p < 0.101).

To confirm that the given number of participants in the study were allowed to detect of sufficient power, we conducted a sensitivity power analysis using G*Power software58. The sensitivity power analysis can assess the minimal effect size to achieve 80% power when the sample size and alpha level are fixed59. A repeated measurement of ANOVA Analysis with 27 participants in our study would be sensitive to the effects of f = 0.230 (η2p = 0.050) with 80% power (α = 0.05), revealing that an observed effect size larger than η2p = 0.050 would have 80% power to detect the effect.

Discussion

Using a within-subject design, the present study investigated the effect of task demand on neural processing of own- and other-race faces during the encoding stage. The behavioral results demonstrate that other-race faces had a faster categorization speed than own-race faces. With regard to ERP results, we observed a larger P1 for the categorization task than the learning phase of the recognition task. Conversely, N170 and P2 demonstrated a larger amplitude for the learning phase of the recognition task than the categorization task. Moreover, both N170 and P2 revealed larger amplitudes for other relative to own-race faces, but this result of N170 amplitude was only observed at the right hemisphere. Specifically, for N170, a marginally enhanced N170 for other-race faces was observed during learning of the recognition task at anterior locations (P7/8). Finally, P300 was larger for recognition relative to racial categorization task. These ERP results are discussed in the following paragraphs.

The P1 analysis revealed that P1 was larger for the categorization task than the learning phase of the recognition task. Although previous studies have suggested that the P1 component is sensitive to visual properties, such as luminance60,61 and contrast32, the main effect of enhanced P1 for the categorization task compared to the learning phase of the recognition task is unlikely due to the variation of these low-level differences, since the racial faces were randomly assigned across two tasks, and the presentation of the two tasks was counterbalanced across participants. Stahl, et al.12, using a recognition task, and with additional instruction in the learning phase that either asked participants to complete a attractiveness rating task or a racial categorization task, found that the categorization task demonstrated a larger P1 than the attractiveness rating task. Similarly, such differences between task demands of P1 in our experiment also reflected differences of early attentional arousal in early visual processing of different task demands.

We found that other-race faces elicited larger N170 amplitudes than own-race faces at the right hemisphere. The racial effect was consistent with that of previous studies2,5,13. However, if we narrow down the facial race stimuli to White and Asian and the race of participants to Asian and White, the majority of studies did not observe that the racial effect was on N1703,23,24,25,26,27,28,29, and only Wiese et al.13 found a larger N170 for other-race faces in the learning phase. Note that in this study, the learning phases of the recognition task embedded an additional racial categorization task. Indeed, participants were invited to complete two different racial tasks, similar to our study. It seems that when participants were asked to process racial faces that simultaneously included racial categorization and individualization demands, a larger N170 for other than own-race faces would be observed. The possible explanation is that the simultaneous completion of categorization and individuation demands requires the processing of both racial category and face identity information, leading to deeper encoding than the single-task demand, thus causing differences between the racial faces to emerge. Considering the decreased amplitude of N170 reflects deeper structural encoding of faces21, such as upright compared to inverted faces21,62. Therefore, the increased N170 amplitudes of other-race faces observed in the current study reflect that the structural processing of own-race faces is stronger than other-race faces for Chinese participants at the right hemisphere.

Additionally, we found larger N170 amplitude in the learning phase of the recognition task compared to categorization task. The task demand of recognition typically demonstrates deeper overall individual encoding when compared to the racial categorization. This might be interpreted as the memorization instruction in the recognition task inducing individual processing12 of racial faces, resulting in deeper individual face encoding in the recognition task compared to race categorization task. Therefore, our result indicates that the task demands alone could induce the differences observed in N170, which may enhance the individual identity processing of racial faces.

Most importantly, we observed an interaction between site, facial race, and task demand. Marginal differences between own- and other-race faces at P7/8 were observed for the recognition task. Although this racial effect only reached marginal significance in our study, the trend of observing the racial effect of N170 at anterior electrodes for the recognition task is consistent with that of Wiese, et al.13. This trend suggested that when the task demand asked the participant to perform individual-level encoding of racial faces, own-race faces probably elicited stronger structural encoding than other-race faces, which is not in line with the current CIM-based hypothesis. In fact, this result does not appear to be explained by other theories, such as perceptual experience theory6 or social cognitive theory10. Thus, the theory that accounts for these differences remains to be elucidated.

The current results show that other-race faces elicited a larger P2 than own-race faces, which is consistent with the results of the majority of previous studies15,16,26,38,39,40,41,42,42,43,44. Considering that smaller amplitude of P2 reflects less attentional orientation to racial faces40, and Willadsen-Jensen and Ito41,42 found the consistent result that both Asian and White participants had a larger P2 for other-race faces, we can infer that the overall encoding of own-race faces employs less attention than other-race faces for Chinese participants. Additionally, we observed that larger P2 amplitudes for recognition task compared to the racial categorization task. P2 has been shown to be sensitive to racial categorization information, with the enhancement of P2 reflecting increased attentional involvement40,45. This result indicates that the recognition task may require enhanced attention allocation compared to the categorization task. Furthermore, in previous studies41,Willadsen-Jensen and Ito42 observed a larger P2 for other-race faces using a racial categorization task for both Asian and White participants. However, He, et al.48 did not observe a difference in P2 between White and Asian faces for White participants. This inconsistency might be accounted by the task demand, as they asked participants to complete a racially irrelevant task (gender categorization). By using two racial related tasks, racial categorization and recognition task demands, we still observed that other-race faces elicited a larger P2 than own-race faces. Thus, we can infer that this race-related P2 component has inter-task stability, which can help us understand the attention resource allocation of racial face encoding in further studies.

In addition, when comparing the results of N170 and P2 together, we found some interesting patterns. Overall, other-race faces elicited larger amplitudes than own-race faces of N170 (at the right hemisphere) and P2 (at both hemispheres). Furthermore, both N170 and P2 observed larger amplitudes for the recognition task relative to categorization task. These similar trends of results of N170 and P2 raise an alternative explanation of P2; that is, P2 may share similar cognitive facial processes as N17045, which may be the so-called vertex positive potential (VPP)63. Meanwhile, the analyzed electrodes of P2 in our study overlapped with the electrodes’ sites of VPP (peaking at fronto-central sites). Just as Volpert-Esmond and Bartholow45 posited that the studies related to P2 may be integrated with VPP. In our study, the midline P2 component may alternatively be a measurement of VPP. So far, the facial-related P2 and VPP have not been integrated yet, but our results provide some evidence that can be used in future studies.

As for P300, the recognition task elicited an enhanced amplitude of P300 when compared to the categorization task. In previous studies, P300 was influenced by the task complexity49, and discrimination between differences in identity52, social category information15,39, which require less cognitive resources, resulted in a lower amplitude of P30064. Therefore, in our experiment, the larger P300 of the recognition task may demonstrate an increase in the cognitive resources needed when compared to the racial categorization task. Furthermore, a meta-analysis by Meissner and Brigham65 found that recognition tasks have similar processing mechanisms to perceptual individualization tasks, with the instruction to memorize faces given in the recognition paradigm possibly constituting individualization12. Hence, one possible inference is that the individualization processing underlying the recognition task requires more cognitive resources than categorization of racial face. However, considering that the significance of P300 might be a consequence of the difference found for P2, we should be cautious about this conclusion.

As only Chinese participants were involved in our study, we could not estimate how the experience affected our results. To thoroughly investigate the question that we posited, an ideal study design should be a complete cross-over design (i.e., both Asian and White participants). In addition, although the sensitivity power analysis provided evidence that supported the robustness of the results in the current study, there still existed the possibility that the overall power is underpowered because of the small sample size66. A study with overall low power would reduce the chance of detecting the true effect and the reproducibility of significant results66. Further studies should adopt larger sample size to increase the possibility of detecting a true effect66, to ensure that the effect size would not be inflated66,67, and to avoid false-positive results68. Besides, O'Toole et al.69 suggested that race typicality is related to racial face memorization. In future studies, using more other-race faces and adopting a cross-over design of participants may help reveal clearer evidence relating to how memory goals and task demand play a role in racial face processing.

Conclusions

The present results provide evidence that task demands not only influence the encoding of racial faces, but also has a more profound effect on the encoding stage of a recognition task for other-race faces when compared to own-race faces. More specifically, own-race faces elicit deeper structural encoding than other-race faces (N170), with less attentional involvement (P2). Moreover, a recognition task might promote individual-level encoding (N170) requiring more attentional resources in the early (P2) stage that are potentially maintained until relatively late (P300) stages. These results may assist researchers in making clearer inferences in future racial face studies.