Differences in configural processing for human versus android dynamic facial expressions

Humanlike androids can function as social agents in social situations and in experimental research. While some androids can imitate facial emotion expressions, it is unclear whether their expressions tap the same processing mechanisms used for human expressions, for example configural processing. In this study, the effects of two configuration manipulations, global inversion and asynchrony between facial features, were compared for android and human dynamic emotion expressions. Seventy-five participants rated (1) emotion recognition (anger and happiness) and (2) arousal and valence for upright or inverted, synchronous or asynchronous, android or human dynamic emotion expressions. Asynchrony significantly decreased all ratings (except valence for angry expressions) for human expressions, but did not affect android expressions. Inversion did not affect any measure regardless of agent type. These results suggest that dynamic facial expressions are processed in a synchrony-based configural manner for humans, but not for androids.

The second method is to manipulate asynchrony between facial features. The processing of synchronous and asynchronous expressions recruits activity in different brain centres, potentially reflecting global versus local processing of facial expression information 17. Furthermore, synchronous motion helps bind motion features in facial expressions, a binding that is disrupted when facial muscles move asynchronously 18. Thus, an asynchrony manipulation can disrupt the configural processing of dynamic emotional expressions.
To investigate the role of configural processing of an android's emotional expressions, we presented videos of angry and happy dynamic facial expressions of humans and the android Nikola, in upright vs. inverted orientation and synchronous (normal) vs. asynchronous mode. If configural processing is analogous for humans and androids, then inversion and asynchrony should both disrupt emotion recognition for human and android agents. To further investigate the effects on emotion processing, the dimensional measures of valence (pleasure-displeasure) and arousal (physiological excitation) from the circumplex model, a widely used model for the assessment of facial emotion expressions, were used 19.
The hypotheses were as follows: (1) Inversion reduces the ability to recognize angry and happy expressions for both human and android agents. (2) Asynchrony reduces the ability to recognize angry and happy expressions for both human and android agents. (3) Inversion reduces valence and arousal ratings for human and android expressions. (4) Asynchrony reduces valence and arousal ratings for human and android expressions.

Emotion recognition
Within-subject ANOVAs were conducted on the ratings of the angry and happy recognition scales with orientation, asynchrony, agent, and emotion as factors. Results for both angry and happy ratings are depicted in Fig. 1.
For the main effect of orientation, however, the follow-up Bonferroni-corrected Tukey tests showed no significant differences between upright and inverted expressions (t(1111) = 1.44, p adj = 0.151).
For the interaction between asynchrony, agent, and emotion, Bonferroni-corrected Tukey tests on the effect of asynchrony were analysed for each agent's target emotion condition. Asynchrony decreased emotion recognition of angry human expressions (t(1105) = 6.16, p adj < 0.001), but not of angry android expressions (t(1105) = 0.01, p adj = 1).
For the recognition of happiness, the results showed a significant main effect of orientation. The pattern of post-hoc Bonferroni-corrected Tukey tests was identical to that for angry expressions. For the main effect of orientation, no differences between orientation conditions were found (t(1111) = −0.7, p adj = 0.48). For the interaction between asynchrony, agent, and emotion, the tests of asynchrony showed that asynchrony reduced happy recognition for happy human (t(1105) = 10.95, p adj < 0.001), but not happy android (t(1105) = 1.18, p adj = 0.478) expressions.
In summary, asynchrony decreased the ability to correctly recognize human expressions, but not android expressions. No inversion effects were observed.
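As an illustration, a within-subject ANOVA of the kind reported above can be fitted with `statsmodels`' `AnovaRM`. The sketch below uses simulated ratings and, for brevity, only two of the four factors; all numbers and the effect pattern (asynchrony lowering ratings for human agents only) are hypothetical stand-ins, not the study's data.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulated long-format ratings: each subject rates every cell of a
# 2 (agent) x 2 (asynchrony) within-subject design.
rng = np.random.default_rng(1)
rows = []
for subj in range(1, 21):
    base = rng.normal(0, 5)  # subject-level offset
    for agent in ("human", "android"):
        for sync in ("synchronous", "asynchronous"):
            # assumed pattern: asynchrony lowers ratings for humans only
            drop = 15 if (agent == "human" and sync == "asynchronous") else 0
            rows.append({"subject": subj, "agent": agent, "asynchrony": sync,
                         "rating": 70 - drop + base + rng.normal(0, 5)})

df = pd.DataFrame(rows)
res = AnovaRM(df, depvar="rating", subject="subject",
              within=["agent", "asynchrony"]).fit()
print(res.anova_table)  # F and p for each main effect and the interaction
```

An agent-specific asynchrony effect would surface as a significant agent × asynchrony interaction, which is then decomposed with post-hoc tests as in the text.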

Valence and arousal
Within-subject ANOVAs were conducted on valence and arousal ratings with orientation, asynchrony, agent, and emotion as factors.
In summary, the effect of asynchrony on valence and arousal again differed between agents: while asynchrony decreased valence and arousal ratings for human expressions, it did not affect any of the android's expressions. Again, no evidence for an inversion effect was found.

Discussion
The study's goal was to investigate the effects of inversion and asynchrony on the processing of emotions in human and android expressions. Contrary to previous research, inversion did not affect emotion recognition for either android or human agents. Meanwhile, asynchrony reduced the ability to correctly recognize angry and happy expressions in human faces, while it did not affect android faces. Furthermore, arousal and valence ratings decreased only for angry and happy human (not android) faces, and no effects of inversion were observed. Thus, hypotheses 1 and 3 (inversion effects) were not supported, and hypotheses 2 and 4 (asynchrony effects) were confirmed only for human, not android, expressions. Previous research found that inversion reduces the ability to recognize emotions in dynamic facial expressions [10][11][12]. Meanwhile, inversion effects on arousal and valence ratings have been more mixed [20][21][22]. In this study, however, no inversion effect on any variable was observed. It is possible that, owing to the limited range of emotional expressions in this study (angry and happy) and because expressions could be watched indefinitely, participants could rely more on feature-based expression recognition in the inverted conditions. Alternatively, certain features of the expressions (e.g., an open mouth in happy faces) may have facilitated feature-based processing 23. In fact, inversion effects are not consistently found for emotion expression recognition, especially not for happy expressions [24][25][26], which were used in this study.
Asynchrony decreased both emotion recognition and arousal and valence ratings for human expressions, indicating that asynchrony disrupts the typical processing of human emotional expressions. Interestingly, asynchrony did not affect emotion processing for android expressions. One possibility is that featural processing is increased for android expressions, so that observing individual AU motions, rather than the synchrony of the whole expression, is sufficient to recognize the emotion. However, as no inversion effects were observed, this agent effect cannot be explained by differences in configural processing.
Alternatively, participants may be more sensitive to asynchronies in real human faces than in android faces. Thus, the same level of asynchrony may have stronger effects on human than on android expressions: face-related processing is decreased for robot and android faces compared to human faces 27,28. However, both previous studies used mechanical-looking faces rather than realistic, humanlike faces such as Nikola's; hence, it is unclear whether this decreased face-related processing of android faces applies to this study. Asynchronies can disrupt configural processing of facial expressions 17,18. However, no effects of inversion were observed in this study, complicating interpretations of the involvement of configural processing. Furthermore, even though the instructions asked participants to observe the stimuli as they were presented, participants may have turned their heads for inverted expressions, thus negating orientation effects. However, online experiments do find inversion effects in face rating tasks 32, indicating that unsupervised participants generally do not rotate their screens for inverted stimuli. Finally, response times were not measured in this experiment. As delayed responses may indicate difficulty and uncertainty, response-time analysis may provide an additional indicator of disturbed emotion recognition processing in future research.
Previous research on asynchrony and inversion used computer-generated (CG) face stimuli 17,18, while this study is the first to investigate the role of asynchrony in human expressions. CG faces recruit decreased levels of configural processing, and human responses to CG emotion expressions tend to be impoverished compared to responses to human expressions 29,30. The shallower emotion processing of CG faces may not survive global inversion, thus diminishing asynchrony-related processing. Meanwhile, the deeper processing of human facial expressions may remain present even when stimuli are inverted.

Participants
Seventy-five Japanese participants (36 female, 37 male, 2 not reported; age: M = 30.85, SD = 4.3, range 18–35) were recruited via CrowdWorks (Tokyo, Japan). The sample size was determined via an a-priori power analysis. We planned a 2 × 2 × 2 × 2 repeated-measures analysis of variance (ANOVA) with an α of 0.05, power (1 − β) of 0.80, effect size f of 0.10 (weak), and a correlation among repeated measures of 0.5. The results showed that more than 60 participants were needed. All participants provided informed consent before participating in the study. The study was approved by the RIKEN Ethics Committee and performed in accordance with the Declaration of Helsinki.
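A power analysis of this kind can be sketched with a noncentral-F computation, in the style of G*Power's within-factors procedure. The exact settings used (e.g., the number of measurements entered) are not fully reported, so the sketch below illustrates the calculation under stated assumptions rather than reproducing the reported threshold of 60 participants; all parameter defaults are assumptions.

```python
from scipy.stats import f as f_dist, ncf

def rm_anova_power(n, f_eff=0.10, n_meas=16, rho=0.5, df1=1, alpha=0.05):
    """Approximate power for one within-subject effect of a repeated-measures
    ANOVA, G*Power-style: correlation among repeated measures (rho) inflates
    the noncentrality parameter."""
    lam = f_eff ** 2 * n * n_meas / (1 - rho)  # noncentrality
    df2 = (n - 1) * df1                        # error degrees of freedom
    f_crit = f_dist.ppf(1 - alpha, df1, df2)   # critical F under H0
    return 1 - ncf.cdf(f_crit, df1, df2, lam)  # P(reject H0 | H1)

# Smallest n reaching the target power of 0.80 under these assumptions:
n = 5
while rm_anova_power(n) < 0.80:
    n += 1
print(n, rm_anova_power(n))
```

Varying `n_meas`, `df1`, or the effect-size convention shifts the resulting n considerably, which is why reporting the exact power-analysis inputs matters for reproducibility.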

Materials
Video clips (1.25 s each) of emotion expressions were used, crossing two agents (android, human), two emotions (angry, happy), two orientations (upright, inverted), and two asynchrony levels (synchronous, asynchronous). In the asynchrony conditions, the upper right half of the face moved with a 500 ms delay, and the upper left half with a 1000 ms delay, starting after motion onset. Human videos were created from the AIST Expression Database 31, and asynchronies were created using the cropping tool of the Adobe Premiere video-editing software. Android videos were created by filming the front face of the android Nikola while it expressed angry and happy emotions, and asynchronies were created by delaying the programmed motion onset of the relevant actuators. Actuators (which imitate specific face AUs) were chosen according to previous research on Nikola's empirically validated basic emotions 9. Specifically, for angry expressions, the following AUs were used: 4 (brow lowerer), 5 (upper lid raiser), 7 (lid tightener), 23 (lip tightener), and 25 (lips part). For happy expressions, the following AUs were used: 1 (inner brow raiser), 6 (cheek raiser), 12 (lip corner puller), 15 (lip corner depressor), and 25 (lips part).
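The onset-delay manipulation can be illustrated with a frame-index sketch: a delayed face region holds its neutral first frame for the delay duration and then plays the original motion, truncated to the clip length. The frame rate and helper names below are assumptions for illustration, not details from the study.

```python
def delayed_onset(frames, delay_frames, neutral):
    """Return a frame sequence in which the region's motion onset is
    delayed: hold `neutral` for `delay_frames`, then play `frames`,
    truncated to the original clip length."""
    held = [neutral] * delay_frames
    return (held + frames)[:len(frames)]

fps = 24                           # assumed frame rate
frames = list(range(30))           # stand-in for a 1.25 s clip at 24 fps
delay_500ms = int(0.5 * fps)       # upper right half: 500 ms delay
delay_1000ms = int(1.0 * fps)      # upper left half: 1000 ms delay

upper_right = delayed_onset(frames, delay_500ms, neutral=frames[0])
upper_left = delayed_onset(frames, delay_1000ms, neutral=frames[0])
```

In the actual stimuli the same logic applies per face region, either by compositing cropped video layers (human videos) or by shifting the commanded onset times of the corresponding actuators (android videos).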
To control the stimuli, all videos were manipulated to have a white background and to have the agents' noses at the same height, with cut-offs at the neck (bottom), head (top), and ears (left and right). A total of 16 videos were used in the study. Android stimuli are depicted in Fig. 4. Because the AIST prohibits the distribution of stimulus material due to the risk of public familiarization as a confounding variable, human stimuli are not depicted.

Stimulus validation
To validate objective and subjective comparability between the android's and human's emotion expressions, two analyses were conducted.

Objective validation
First, facial expressions were analysed using OpenFace (version 2.2.0) 33. The intensities of facial action units (AUs), as indicators of angry and happy expressions respectively, over the course of the video are depicted in Fig. 5. For angry expressions, AU4 (brow lowerer) was used, and for happy expressions, AU12 (lip corner puller). Figure 5 indicates analogous trajectories for the human and android actors.
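OpenFace writes per-frame AU intensity columns (e.g., `AU04_r`, `AU12_r`) to a CSV file, from which trajectories like those in Fig. 5 can be extracted. A minimal sketch, with made-up intensity values standing in for real OpenFace output:

```python
import csv
import io

# Stand-in for an OpenFace output file; real files contain many more
# columns (gaze, head pose, and all AU intensity columns "AU.._r").
sample = """frame, timestamp, AU04_r, AU12_r
1, 0.000, 0.10, 0.05
2, 0.040, 0.80, 0.06
3, 0.080, 1.90, 0.04
"""

def au_trajectory(csv_text, au_col):
    """Return the per-frame intensity values for one AU column."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [float(row[au_col]) for row in reader]

angry = au_trajectory(sample, "AU04_r")   # brow lowerer
happy = au_trajectory(sample, "AU12_r")   # lip corner puller
print(angry, happy)
```

Plotting such trajectories for the human and android clips side by side gives the kind of objective comparison shown in Fig. 5.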

Figure 1 .
Figure 1. Mean (with standard error) target emotion ratings divided by asynchrony and agent conditions. Error bars indicate standard errors, and asterisks show significant differences (which were found for human agents only).

Figure 2 .
Figure 2. Average valence ratings divided by emotion, agent, and asynchrony conditions. Error bars indicate standard errors, and asterisks show significant differences (which were found only for happy human expressions).

Figure 3 .
Figure 3. Average arousal ratings divided by emotion, agent, and asynchrony conditions. Error bars indicate standard errors, and asterisks show significant differences (which were present only for human agents).
https://doi.org/10.1038/s41598-023-44140-4

Subjective validation
An online pilot study (n = 11) was conducted using single-scale items of angry and happy recognition as well as arousal and valence, ranging from 0 to 100. Within-subject ANOVAs were conducted with actor type (android, human) as predictor. No significant main effects of actor were found for angry recognition (F(1,10) = 1.42,

Figure 4 .
Figure 4. Android expression stimuli divided by condition. Note: the baseline (neutral) expression is depicted on the left, followed by the synchronous and 500 ms-delay asynchronous expressions. The top and bottom rows show angry and happy expressions, respectively.