Introduction

Facial mimicry is an involuntary, rapid and automatic response in which an individual mimics the facial expression of another individual. This phenomenon can be distinguished from voluntary and cognitive forms of imitation1,2 because of the rapidity of the response, which exclusively involves the face. Numerous studies document that people mimic emotional facial expressions of others within 1,000 ms3. Rapid facial mimicry (RFM) has been widely described in children4,5 and adult humans, Homo sapiens6, whose congruent reactions are elicited more frequently and rapidly in response to a dynamic facial expression compared to a static one7.

RFM has been proposed to be grounded in the automatic perception-action coupling of sensorimotor information that occurs in motor brain areas8. Neurophysiological evidence of this coupling is derived from the discovery of mirror neurons in the premotor and parietal cortices of monkeys9,10,11. These neurons fire both when a monkey performs an action and when it observes a similar action performed by another individual9. Functional brain imaging studies in humans showed that the observation of facial emotions activates shared motor representations, similarly to monkeys, not only in premotor and parietal areas but also in the insular and cingulate cortices, the latter being directly involved in processing visceromotor sensations. During the observation of a specific facial expression, the observer's covert motor activation results in the experience of a matching emotional state12,13,14,15. In this perspective, human RFM has been theorized to be central in emotionally connecting two individuals. This theoretical account is also supported by behavioural studies showing that the frequency of RFM is higher among friends and kin than among unfamiliar individuals16,17,18. Therefore, RFM could be advantageous in promoting social connections and affiliative behaviours among individuals19,20.

Considering the role that RFM might play in social interactions, it has been proposed that this phenomenon is more widespread than previously reported and not confined to humans. In nonhuman primates, RFM has been investigated only in the orang-utan, Pongo pygmaeus. In that study, orang-utans viewed a playful facial expression performed by a playmate and then produced a congruent expression within 1 sec. Such a response appears to be homologous with RFM in hominoidea21.

Here, we investigated the presence of RFM in a cercopithecoid species, the gelada (Theropithecus gelada), by focussing on the facial expressions typically performed in the playful context. Geladas are a good model species because in captivity they engage in high levels of social play even as adults22. They show high levels of social affiliation spending much time in grooming sessions23 and, finally, they frequently use playful facial displays to fine-tune their playful interactions24: the play face (PF, the mouth is opened with only the lower teeth exposed) and the full play face (FPF, the lower and upper teeth and gums are exposed via the active retraction of the lips)25. In addition to these important behavioural features, a recent finding of yawn contagion in geladas suggests that they are sensitive to the facial expressions of conspecifics with whom they are closely affiliated26. Considering the features characterizing the species under study and the presence of the neural bases linked to the imitative phenomena in monkeys9, we expect that RFM is present in geladas (Prediction 1).

During RFM the subjects share not only the same facial expression but also the emotional experience underlying that facial expression (emotional contagion). This phenomenon represents one of the most basic forms of empathy2,19 which, in an evolutionary perspective, is probably rooted in the emotional contagion characterizing the strongest and most basic of the social bonds, the mother-infant one. Therefore, if RFM is favoured by inter-subject familiarity and/or has a genetic basis, we predict that mother-infant dyads are characterized by more frequent, accurate and faster facial responses compared to unrelated dyads (Prediction 2).

Results

Prediction 1

The frequency of the three types of response (congruent, incongruent, and no-response; for the definitions see Methods) differed significantly both in adult (Friedman's χ2 = 20.26, n = 18, d.f. = 2, p = 0.0001) and in immature subjects (Friedman's χ2 = 23.84, n = 16, d.f. = 2, p = 0.0001) (Figure 1).

Figure 1

Rapid facial mimicry in adult and immature individuals (PF and FPF as stimuli) - RFM events per number of trigger stimuli, when the observer was an adult (on the left) and an immature individual (on the right).

Thick horizontal lines indicate medians; height of the boxes corresponds to inter-quartile range; thin horizontal lines indicate range of observed values.

When the trigger stimulus was lip-smacking (LS), the frequency of the three types of response (congruent, incongruent, and no-response; for the definitions see Methods) differed significantly in immature subjects (Friedman's χ2 = 21.72, n = 16, d.f. = 2, p = 0.0001) but fell just short of the conventional 0.05 significance threshold in adults (Friedman's χ2 = 5.89, n = 11, d.f. = 2, p = 0.052) (Figure 2). Seven adult subjects were excluded from the analysis because they never received LS as a stimulus during the playful context.

Figure 2

Rapid facial mimicry in adult and immature individuals (LS as stimulus) - RFM events per number of trigger stimuli, when the observer was an adult (on the left) and an immature individual (on the right).

Thick horizontal lines indicate medians; height of the boxes corresponds to inter-quartile range; thin horizontal lines indicate range of observed values.

The frequency of congruent responses was higher than incongruent responses, when the trigger stimulus was a PF or a FPF (Wilcoxon's T = 0, ties = 4, n = 34, p = 0.0001). When the trigger stimulus was a LS the congruent responses did not significantly differ from the incongruent responses (Wilcoxon's T = 80.50, ties = 0, n = 22, p = 0.134).

Congruent responses were faster than incongruent responses (Wilcoxon's T = 42.00, ties = 2, n = 23, p = 0.016).

Prediction 2

Immature individuals were equally likely to play with their mothers and with unrelated adults (Wilcoxon's T = 52.00, ties = 0, n = 16, p = 0.433), suggesting that playful motivation did not differ between mother-infant and unrelated dyads. Yet, immature individuals showed higher levels of RFM with their mothers than with unrelated adults (Wilcoxon's T = 15.00, ties = 0, n = 13, p = 0.032; Figure 3). Three immature subjects were excluded from the RFM analysis because they never had the opportunity to perceive a stimulus given by unrelated adults during their play sessions.

Figure 3

Rapid facial mimicry: infant-mother dyads vs infant-unrelated adult dyads - Frequency of the congruent responses (RFM event per number of PF and FPF perceived) exchanged between infants and their mothers and between infants and other unrelated group-members.

Thick horizontal lines indicate medians; height of the boxes corresponds to inter-quartile range; thin horizontal lines indicate range of observed values.

Congruent responses exchanged between the two playmates were faster between mothers and their offspring than between unrelated adults and immature individuals (Wilcoxon's T = 5.00, ties = 1, n = 13, p = 0.006; Figure 4).

Figure 4

Time latency of congruent response: infant-mother dyads vs infant-unrelated adult dyads - Time latency (10 msec) of the congruent responses exchanged between infants and their mothers and between infants and other unrelated group-members.

Thick horizontal lines indicate medians; height of the boxes corresponds to inter-quartile range; thin horizontal lines indicate range of observed values.

Discussion

In the present study, we provide evidence that RFM occurs in a non-ape species, the gelada (Prediction 1 supported). Specifically, both immature and adult subjects mimicked play faces (PF/PF; FPF/FPF) (Figures 1 and 5) but showed no clear RFM response to lip-smacking (LS) (Figures 2 and 6). LS is a signal that can have different meanings and functions depending on the context in which it is used, on the target animal to which it is directed and on the species27,25. Therefore, the low RFM frequency for LS could be due to the fact that, in the current study, it was recorded and analysed exclusively during playful activity. Additional data collected outside the playful context could clarify this issue.

Figure 5

An example of congruent response in RFM - RFM during a play session between an adult (left) and an immature individual (right).

The immature mimics the adult's full play face (FPF). (Photo by P.F. Ferrari).

Figure 6

An example of incongruent response - Infant's incongruent response (right) to the facial expression of an unrelated female (left).

Infant is performing a play face (PF) and adult female lip smacking (LS). (Photo by P.F. Ferrari).

Why do play faces elicit mimicry responses? Primate play faces are considered homologous to human laughter28 which, being the external manifestation of joy and happiness, is found across many different human cultures29. Unlike LS, primate PF and FPF are strongly linked to an unambiguous positive emotion arising from play, an emotionally positive and self-rewarding behaviour30. The primate play face can, through RFM, evoke in the perceiver the same positive emotional state31,32. Indeed, this ability to instantly recognize and generate the same emotional states as others is adaptive, as it allows an individual to foresee the playmate's intentions33 and fine-tune its own motor sequences accordingly25. Such an ability is a prerequisite for avoiding misunderstandings, managing a playful interaction successfully and promoting social affiliation34.

In humans, mimicking others' facial expressions facilitates the recognition of the emotional state underlying those expressions28,18. For example, Stel and van Knippenberg35 showed that blocking mimicry influenced the speed of facial expression recognition in women, but not their ability to categorize facial expressions as positive (i.e. happiness, joy) or negative (i.e. sadness, anger). Moreover, humans showing high levels of RFM also tended to show high levels of empathy28. Taken together, these findings strongly suggest that RFM is important in the recognition process when it requires fine distinctions between similar facial expressions conveying subtle differences in meaning25, such as the processing of different smile types in humans28.

In terms of the proximate mechanisms responsible for RFM, it has previously been hypothesized that the activation of shared motor representations could explain this phenomenon, at least in part. Individuals can understand the meaning of an action performed by another individual through a direct activation of a corresponding motor representation10,36,37,38,9. Normally, during action observation the motor output (i.e., the cortico-spinal tract, the muscles, etc.) is suppressed because some of the components of the motor network are not active. However, neural matching mechanisms, in conjunction with other motor areas, can produce an overt activation of the observed behaviours39,40,41. This mechanism has been described, by means of different electrophysiological techniques, in some macaque species and in humans2,8,36, and it has been shown to be involved in imitative phenomena from very early in development8,42. It is likely that a homologous mechanism is also present in the gelada and might contribute to several behavioural and psychological processes, including RFM. In humans, such mirroring activity may have implications for the capacity of individuals to empathize with others18,14,43. While the correlation between the activity of the mirror system and empathy is supported by several functional magnetic resonance imaging (fMRI) investigations, more recently it has also been shown that behavioural synchrony and matching activate neural circuits involved in reward and positive affect21,28. In fact, research using infrared spectroscopy demonstrated that in both mothers and infants there is an increase in activation of the orbitofrontal cortex in response to the smile of one's own infant or mother, respectively44. Despite the hypothetical link between RFM and interpersonal emotional connection, no study has ever empirically tested the emotional connection hypothesis in a non-human primate species.

In line with this, RFM has never been investigated in mother-infant interactions, in which emotional engagement is extremely high and which could therefore represent an optimal social model to test this hypothesis. Our findings, although far from definitively demonstrating an actual link between RFM and emotional connection, suggest that RFM differs both quantitatively (frequency; Figure 3) and qualitatively (time latency; Figure 4) according to the genetic and emotional closeness between playmates: the mother-infant dyads showed the highest level of RFM and the fastest responses (Prediction 2 supported). The temporal coordination of face-to-face interaction that occurs between mothers and infants has been extensively documented in humans16. Such moments of affective matching are important for the neuro-physiological maturation and for the functional attachment relationship of the infant with the caregiver45. In non-human primates, RFM could reflect one of the core elements of the mother-infant relationship and might represent an important indicator of the quality of that relationship.

Methods

Subjects and housing

The colony of geladas housed at the NaturZoo (Rheine, Germany) was composed of two one-male units (OMUs) including 2 adult males, 18 adult females and 18 immature subjects (1–6 months, black infants; 7 months–2 years, infants; 3–4 years, juveniles). Kin relations were known.

Individual identification was based on sex, age and on distinctive external features (scars, size, patterns of fur patches, fur colour and facial traits). The two OMUs were housed in two enclosures, both with an indoor (rooms of about 36 m2) and outdoor facility (islands of 2,700 m2 surrounded by a boundary ditch). The outside enclosures were located in an open, naturally hilly area equipped with trees, branches, ropes and dens. The animals were fed with grass, vegetables and pellets, which were scattered on the ground twice a day (9:30 a.m., 2:30 p.m.). Water was available ad libitum. No stereotypic or aberrant behaviours have ever been observed in this group. The research complied with current laws of Germany, Italy and the European Community. The local committee of the NaturZoo approved this study. The study was purely observational (with no manipulation whatsoever) and subjects were observed in their natural social setting. Thus, the ethical committees of the University of Pisa and of the University of Parma waived the need for a permit.

Data collection procedure

The 1,121 dyadic play bouts involving 18 adults and 16 immature subjects were video-recorded during a 4-month (June–September, 2009) and a 2-month period (July–August, 2010). Video-analysis was conducted using Kinovea v. 0.7.10 software.

A play session began when one partner invited another individual, or directed any playful pattern toward it. A session ended when playmates ceased their activities, one of them moved away, or when a third individual interfered, thus interrupting the interaction. If another play session began after a delay ≥10 s, that session was counted as new25.
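The session rule above can be illustrated with a minimal sketch. It assumes that play bouts of a single dyad are available as (start, end) times in seconds; the function name and the data layout are hypothetical and are not part of the original coding procedure. Only the 10-s threshold comes from the text.

```python
from typing import List, Tuple

Bout = Tuple[float, float]  # (start_s, end_s); hypothetical data layout

GAP_S = 10.0  # from the text: a delay of >= 10 s marks a new session


def split_sessions(bouts: List[Bout], gap_s: float = GAP_S) -> List[List[Bout]]:
    """Group the play bouts of one dyad into sessions.

    A bout opens a new session when it starts at least `gap_s` seconds
    after the end of the previous bout; otherwise it is appended to the
    current session.
    """
    sessions: List[List[Bout]] = []
    for start, end in sorted(bouts):
        previous_end = sessions[-1][-1][1] if sessions else None
        if previous_end is not None and start - previous_end < gap_s:
            sessions[-1].append((start, end))
        else:
            sessions.append([(start, end)])
    return sessions
```

With bouts at 0–5 s, 8–12 s and 25–30 s, the first two fall in one session (3-s gap) and the third opens a new one (13-s gap).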

We defined RFM as the visible response of facial musculature by an observer to match the facial gestures in another individual's facial expression. This congruent response must be rapid: within 1 sec of the emission of the facial stimulus. To examine the presence of RFM we collected data on the playful expressions: the play face (PF) and the full play face (FPF)25. Since during play geladas frequently lip smacked (LS, lips are protruded and then smacked together repeatedly, sometimes alternated with tongue protrusions) toward conspecifics, we measured LS as a control. Like PF and FPF, LS involves motor muscles of the orofacial area and it is a facial display used to signal benign intentions. Unlike PF and FPF, LS is not a context-specific signal, as it occurs in a variety of different contexts46,47.

Videometric analyses were primarily conducted by G.M. Interobserver reliability was tested by G.M. and E.P. with one-frame accuracy (one frame/10 msec). The mean Cohen's kappa values were 0.78 for PF, 0.81 for FPF and 0.76 for LS.
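For readers unfamiliar with the agreement index reported above, the following sketch computes Cohen's kappa for two coders. It assumes that each coder assigned one categorical label per scored event (for example, display present or absent); this layout and the function name are illustrative assumptions, not a description of how the reliability data were actually organised.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must label the same non-empty set of events")
    n = len(coder_a)
    # Proportion of events on which the two coders gave the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance from each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than chance.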

To test for the presence of RFM, we measured the facial displays of one individual (the observer, hereafter) to see whether the observer's expressions varied as a function of the facial displays of the play partner (the trigger, hereafter) within a 1-s time window. The triggers were the first playmates who emitted a facial stimulus (PF, FPF, or LS). In order to reliably assess that the facial expression performed by the observer was actually elicited by the facial expression performed by the trigger, we considered only those interactions in which the observer looked at the face of the trigger and did not show any facial expression in the 1 s prior to the trigger's stimulus. Chewing behaviours and biting transitional facial expressions were excluded from the analysis to reduce ambiguities during the analysis.

Each play session could involve more than one triggered event. In this case, we considered as a new event the subsequent triggered event that occurred after the two playmates had interrupted visual contact for at least 2 sec. This made it possible to collect more than one triggered event for each observed individual. Due to the variability among subjects in play frequency and in the facial expressions performed, the analysis was carried out at the individual level for a more conservative statistical approach, which is usual when dealing with data collected under natural conditions.
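An individual-level analysis of this kind amounts to reducing the triggered events of each observer to a single value before any test is run. The sketch below computes, per observer, the proportion of perceived stimuli that were followed by a congruent response. The (observer, category) layout, the identifiers and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def congruent_rate_per_observer(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return, for each observer, congruent responses / stimuli perceived.

    `events` holds one (observer_id, response_category) pair per triggered
    event, so each observer contributes a single data point to the analysis.
    """
    totals: Dict[str, int] = defaultdict(int)
    congruent: Dict[str, int] = defaultdict(int)
    for observer, category in events:
        totals[observer] += 1
        if category == "congruent":
            congruent[observer] += 1
    return {observer: congruent[observer] / totals[observer] for observer in totals}
```

Working with one value per subject avoids treating repeated events from the same animal as independent observations.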

After the trigger emitted a specific play signal (stimulus: PF or FPF), we categorized the observer's responses into three possible categories: congruent, incongruent and no-response. The congruent-response occurred when the observer mirrored the same facial display of the trigger (stimulus PF/response PF; stimulus FPF/response FPF). When the observer responded with a LS, the response was labelled as incongruent. When the observer did not show any facial reaction (neutral face) we categorized the absence of response as no-response. The same analysis was conducted considering LS as the stimulus. Observers who never displayed PF, FPF, or LS in response to a stimulus were excluded from the analysis.
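The categorization can be written as a small decision function. The sketch below follows the definitions given above for PF and FPF stimuli. Two points are assumptions, because the text does not spell them out: that a play face in response to an LS stimulus counts as incongruent, and that a play face of the other type (for example, FPF in response to PF) is left as "undefined". The function name and labels are illustrative.

```python
from typing import Optional

PLAY_FACES = {"PF", "FPF"}
DISPLAYS = PLAY_FACES | {"LS"}


def classify_response(stimulus: str, response: Optional[str]) -> str:
    """Categorize an observer's response to a trigger stimulus.

    `response` is None when the observer kept a neutral face.
    """
    if stimulus not in DISPLAYS:
        raise ValueError(f"unknown stimulus: {stimulus!r}")
    if response is None:
        return "no-response"
    if response not in DISPLAYS:
        raise ValueError(f"unknown response: {response!r}")
    if response == stimulus:
        return "congruent"  # PF/PF, FPF/FPF, and LS/LS when LS is the stimulus
    if stimulus in PLAY_FACES and response == "LS":
        return "incongruent"  # as defined in the text
    if stimulus == "LS" and response in PLAY_FACES:
        return "incongruent"  # assumption: mirror image of the rule above
    return "undefined"  # e.g. PF answered by FPF; not specified in the text
```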

The latencies were measured frame-by-frame starting from the onset of the trigger stimulus and ending with the onset of the observer's facial response with 10 msec accuracy.
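As a worked illustration of the latency measure, the sketch below converts onset frame indices into milliseconds and checks the 1-s criterion. The frame duration (10 msec) and the window (1 sec) come from the text; the function names are hypothetical, and treating a latency of exactly 1,000 ms as rapid is an assumption.

```python
FRAME_MS = 10     # from the text: one frame per 10 msec
WINDOW_MS = 1000  # from the text: the response must occur within 1 sec


def latency_ms(stimulus_onset_frame: int, response_onset_frame: int) -> int:
    """Latency between the onset of the stimulus and the onset of the response."""
    if response_onset_frame < stimulus_onset_frame:
        raise ValueError("the response cannot start before the stimulus")
    return (response_onset_frame - stimulus_onset_frame) * FRAME_MS


def is_rapid(latency: int, window_ms: int = WINDOW_MS) -> bool:
    """True when the latency falls within the window (inclusive by assumption)."""
    return latency <= window_ms
```

For example, a stimulus starting at frame 120 and a response starting at frame 163 are 43 frames apart, that is 430 ms, which falls within the window.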

Statistical analysis

Due to non-normal data distribution, we employed nonparametric statistics48. To compare the frequency and the latency of the observer's response we applied the Friedman test when k > 2 and the Wilcoxon's test when k = 2. The Mann-Whitney U-test was used to compare the frequency of responses for immature subjects and adults. Statistical analyses were performed using SPSS 17.0 software. Exact tests were applied to all the analyses49.
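To make the reported Wilcoxon statistics easier to read, the following sketch computes the signed-rank statistic T for paired per-subject values. It is a plain re-implementation for illustration, not the SPSS procedure used in the study, and it does not compute exact p-values. It assumes that the reported "ties" are pairs with a zero difference, which are dropped before ranking; the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def wilcoxon_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int, int]:
    """Return (T, zero_differences, pairs_ranked) for paired samples x and y.

    T is the smaller of the positive and negative rank sums.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be paired")
    diffs = [a - b for a, b in zip(x, y)]
    nonzero = [d for d in diffs if d != 0]
    n_zero = len(diffs) - len(nonzero)

    # Rank the absolute differences; equal values share their average rank.
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks: List[float] = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    rank_sum_pos = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    rank_sum_neg = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    return min(rank_sum_pos, rank_sum_neg), n_zero, len(nonzero)
```

Under this definition, T = 0 means that every non-tied pair differs in the same direction.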