Introduction

Faces possess a special status across different domains of cognitive functioning due to their social relevance: they convey valuable information for effective interpersonal interaction and non-verbal communication. Many neuropsychiatric, neurodevelopmental, and psychosomatic disorders are characterized by impairments in visual social cognition, body language reading, and facial assessment of a social counterpart1,2,3,4,5 that may lead to interpersonal awkwardness. Face processing is widely believed to be atypical in autism6,7,8,9,10,11,12.

Autism spectrum disorders (ASD) represent a range of neurodevelopmental conditions characterized by impairment in social interaction and communication, co-occurring with restricted interests and repetitive behaviors, such as persistent fixations on parts of objects13. The disorder varies widely in symptom severity, intelligence quotient (IQ) level, performance on cognitive tasks, and brain neuroanatomy. ASD individuals display diminished orientation towards faces or face disinclination, impaired eye contact, and other deficits in face processing and recognition. Yet the origin of these impairments is still poorly understood, and the experimental evidence is controversial8, 14. It is postulated that ASD individuals preferentially exhibit featural face encoding. They are biased toward detailed facial information over global form processing, and recognize isolated facial cues and inverted faces better than typically developing (TD) peers12, 15, 16. Autistic individuals prefer single face elements, in particular those located in the lower part of the face such as the mouth15, 17,18,19, whereas for TD individuals, eyes play an important role in face decoding and recognition. Individuals with autism fail to engage with the emotionally or socially relevant content of social scenes, devoting substantially more time to the mouth area than to the eyes20. Face identity discrimination in autism is more difficult when access to local cues is minimized, and when dependence on integrative analysis is increased21. Other studies, however, demonstrate deficits of ASD individuals in configural face processing (for review, see ref. 8): they point to the substantial face inversion effect in ASD along with intact sensitivity to the Thatcher illusion22, 23 (a perceptual phenomenon indicating that in typical development, display inversion severely weakens configural face processing. This illusion is named after the late former British Prime Minister Margaret Thatcher, on whose photograph the inversion effect was first demonstrated by Peter Thompson in 1980; ref. 24).

Face tuning is an automatic, rapid and primarily subconscious process, constituting one of the core components of social perception25. Faces can be easily seen in non-face images such as grilled toast, clouds or landscapes26. This phenomenon, termed face pareidolia, reflects high tuning to faces. TD infants are reported to be well tuned to pareidolic faces, i.e., protofaces or schematic faces27. Studies with face-like non-face objects indicate that TD children aged 24–60 months are more likely than ASD individuals to direct their first fixation towards upright face-like objects, which points to poor face orientation and tuning in autism28. Yet high-functioning adolescents with ASD are as sensitive to faces as TD peers29. Most recently, however, it has been shown that ASD children aged 8–18 years identify substantially fewer depictions of face-like objects as faces in a sequence of ambiguous stimuli30. Both adolescents and adults with ASD showed preferential detection of upright protofaces (schematic faces) under continuous flash suppression31, 32 (CFS, a technique in which a target presented to one eye is rendered invisible by high-contrast masks flashed into the other eye; ref. 33). Under this condition, visual stimuli are suppressed from awareness, and cortical face processing is strongly reduced, whereas subcortical brain areas continue to respond to invisible stimuli.

The present work aimed to investigate face tuning in individuals with ASD using the recently created Face-n-Food task34. This task consists of a set of food-plate images composed of food ingredients (fruits, vegetables, sausages, etc.) in a manner slightly bordering on the style of Giuseppe Arcimboldo (1526–1593), an Italian painter best known for creating imaginative portraits composed entirely of fruits, vegetables, plants, flowers, books, and even body parts35, 36 (Figs 1 and 2). One can perceive a Face-n-Food image either as a composition of elements (fruits, vegetables, etc.) or as a Gestalt (a face). As mentioned earlier34, the primary advantage of these images is that single components do not explicitly trigger face-specific processing, whereas in face images commonly used for investigating face perception (such as photographs or depictions), the mere occurrence of typical features or cues (such as a nose or mouth) already signals the presence of a face. For individuals with ASD, the use of such images provides an additional benefit of not being confounded by social features present in real faces, notably the eyes: the eye region of the face is perceived by ASD individuals as socially threatening, and elicits an increased physiological response as indicated by heightened skin conductance and amygdala activity14. In addition, in the Face-n-Food task, face tuning occurs spontaneously without being explicitly cued. We assume, therefore, that if face tuning in autistic individuals is deficient because of weakened configural face processing, they would experience more difficulties on the Face-n-Food task than TD controls. This task also benefits from using unfamiliar ‘face’ images, which is of importance in clinical settings37. Tuning to faces in the Arcimboldo paintings emerges early in perceptual development: infants as young as 7–8 months prefer the Arcimboldo portraits over the same images presented ‘wrong way up’38. Overall, TD adults and children show a strong bias for seeing faces in Arcimboldo-like images.

Figure 1

Example of the Giuseppe Arcimboldo style. The painting ‘Vertumnus’ by Giuseppe Arcimboldo (1526–1593), an Italian painter best known for creating fascinating imaginative portraits composed entirely of fruits, vegetables, plants, and flowers, depicts the Holy Roman Emperor Rudolf II as Vertumnus, the Roman god of metamorphoses (http://vangoyourself.com/paintings/vertumnus; public domain).

Figure 2

Examples of images from the Face-n-Food task. The images least resembling a face (left panel) and most resembling a face (right panel) from the Face-n-Food task (from Pavlova MA, Scheffler K, Sokolov AN. 2015. Face-n-Food: Gender Differences in Tuning to Faces. PLoS ONE 10(7): e0130363. doi:10.1371/journal.pone.0130363; the Creative Commons Attribution (CC BY) license).

Results

Participants were presented with the set of Face-n-Food images, one by one, in a predetermined order from the least to the most resembling a face (images 1 to 10). Both TD and ASD individuals described each food-plate image either in terms of food composition (non-face response, 0) or as a face (face response, 1). Figure 3 shows the thresholds for face tuning (i.e., the average image number on which a face response was first reported on the Face-n-Food task) separately for ASD individuals and TD controls. ASD individuals had more difficulty in spontaneously recognizing the images as a face: TD controls first reported seeing a face on average at image 3.56 ± 1.59 (mean ± SD), whereas ASD individuals (14 out of 16) gave their first face response on average at image 5.36 ± 1.91. Two out of 16 ASD individuals completely failed on the Face-n-Food task: they did not spontaneously recognize even the most recognizable image, number 10, as a face. ASD individuals differed significantly from matched TD controls in face recognition thresholds (t(28) = 2.25, p < 0.016, two-tailed, with an effect size Cohen’s d = 1.02).
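As an illustration of the threshold measure described above, the following minimal sketch derives a participant's face tuning threshold from the coded responses (0 = non-face, 1 = face). The function name and the two response sequences are hypothetical and are not taken from the study data; a participant who never gives a face response has no threshold, which mirrors the two ASD individuals excluded from the threshold analysis.

```python
from typing import Optional, Sequence


def face_tuning_threshold(responses: Sequence[int]) -> Optional[int]:
    """Return the 1-based image number of the first face response (1).

    Returns None when no face response occurs on any image.
    """
    for image_number, response in enumerate(responses, start=1):
        if response == 1:
            return image_number
    return None


# Hypothetical coded responses for images 1 to 10 (not real data).
example_a = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]  # first face report on image 4
example_b = [0] * 10                        # no face report at all

threshold_a = face_tuning_threshold(example_a)
threshold_b = face_tuning_threshold(example_b)
print(threshold_a, threshold_b)
```

Group thresholds would then be averaged only over participants for whom the function returns a number.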

Figure 3

Tuning to faces. The average image number on which a face response was initially reported on the Face-n-Food task, shown separately for ASD individuals and TD controls. Vertical bars represent SEM. The significant difference in thresholds for face tuning between ASD individuals and TD controls is indicated by an asterisk.

Once seen as a face, Arcimboldo paintings are often processed with a strong face-dominating bias. However, both ASD and TD individuals delivered non-face responses on some subsequent Face-n-Food images. As some participants did not report seeing a face on all subsequent images after an initial face report, an additional analysis was performed on the total number of face responses. The difference in the overall percentage of face responses between TD (70.63 ± 16.52; mean ± SD) and ASD individuals (48.13 ± 25.09) was significant (t(30) = 2.74, p < 0.005, two-tailed, with an effect size Cohen’s d = 1.06).
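The two quantities in this analysis can be sketched as follows. The first function computes a participant's percentage of face responses; the second computes Cohen's d from group summary statistics using a pooled standard deviation. The pooled-SD formula is an assumption about how the effect size was obtained, and the function names and the example response sequence are hypothetical. Applied to the reported means and SDs (16 participants per group), it yields a value of about 1.06, in line with the reported effect size.

```python
import math
from typing import Sequence


def face_response_percentage(responses: Sequence[int]) -> float:
    """Percentage of images described as a face (responses coded 0 or 1)."""
    return 100.0 * sum(responses) / len(responses)


def cohens_d_from_summary(mean1: float, sd1: float, n1: int,
                          mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d using the pooled standard deviation (assumed formula)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Hypothetical participant: a non-face response after the first face report.
percent_example = face_response_percentage([0, 0, 0, 1, 0, 1, 1, 1, 1, 1])

# Summary statistics as reported in the text (TD vs. ASD, n = 16 each).
d_percent = cohens_d_from_summary(70.63, 16.52, 16, 48.13, 25.09, 16)
print(percent_example, round(d_percent, 2))
```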

Figure 4 represents the percentage of face responses for each Face-n-Food image for ASD individuals and TD controls. As seen from this figure, individuals with ASD reported seeing a face much later and overall gave far fewer face responses. As indicated by multiple stepwise nominal logistic regression analysis, the effect of group (TD vs. ASD) is significant (χ2(1) = 24.9, p < 0.0001). For the first five images, ASD individuals provided far fewer than 50% face responses, whereas the TD group gave almost 50% face responses as early as the third image. Starting from images 5–6, controls quickly reached the ceiling level of performance. By contrast, ASD individuals attained the maximal number of face responses for their group much later than controls, and still gave only 87.7% face responses even for image number 10, which most resembles a face. Although the face recognition level of ASD individuals is much lower, there is no significant interaction between group and image number (χ2(9) = 6.77, p = 0.661, n.s.): spontaneous face recognition is uniformly shifted down in ASD individuals as compared to TD controls.

Figure 4

Percentage of face responses for ASD individuals and TD controls. The image number reflects its face resemblance (1 – the least recognizable, 10 – the most recognizable as a face). Vertical bars represent 95% CI.

Figure 5 represents odds ratios for all consecutive pairs of Face-n-Food images independent of group. Leaps in face recognition occurred from image 1 to image 2 with an odds ratio of 6.07 (95% CI, confidence interval, 0.005 to 0.658; p < 0.01) and from image 4 to image 5 with an odds ratio of 6.24 (95% CI, 0.075 to 0.742; p < 0.01). The odds ratios for all other pairs (3/2, 4/3, 6/5, 7/6, 8/7, 9/8, and 10/9) are not significantly greater than 1, which indicates the lack of a substantial increase in face recognition. As shown by the likelihood ratio analysis, the odds ratio for TD controls, as compared to ASD individuals, to give a face response to each Face-n-Food image in the set is 6.38 (95% CI, 3.21 to 13.6; p < 0.0001).
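For readers less familiar with odds ratios, the sketch below shows how an odds ratio and a Wald-type 95% confidence interval can be obtained from a simple 2 × 2 table of face and non-face responses. This is a simpler technique than the logistic regression and likelihood ratio analysis used in the study, and it will not reproduce the values reported above. The counts are hypothetical and serve only to illustrate the computation.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a, b: face and non-face responses in the first condition
    c, d: face and non-face responses in the second condition
    All four counts must be greater than zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not study data): 24 face / 8 non-face responses
# in one condition versus 12 face / 20 non-face responses in the other.
or_value, ci_low, ci_high = odds_ratio_wald_ci(24, 8, 12, 20)
print(or_value, ci_low, ci_high)
```

An odds ratio is usually regarded as significantly greater than 1 when the lower bound of its confidence interval lies above 1.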

Figure 5

Odds ratios of face recognition between pairs of Face-n-Food images. Vertical bars represent 95% CI. The significant leaps in face recognition occur from image 1 to image 2 and from image 4 to image 5. The odds ratios for all other pairs do not differ from 1, indicating the lack of an increase in face recognition.

In ASD individuals, no correlation was found between their performance on the Face-n-Food task (face response rate) and the general IQ scores (Spearman’s rho = 0.239, p = 0.37, n.s., two-tailed; Fig. 6). This indicates that the impaired performance on the Face-n-Food task in ASD individuals stems from the face tuning deficits rather than from general cognitive disabilities that may affect task performance.
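The rank correlation used here can be sketched in a few lines. The example below computes Spearman's rho as the Pearson correlation of average ranks, which handles tied values. The helper names and the IQ and face response values are hypothetical and are not the study data; the sketch does not compute a p-value.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical IQ scores and face response rates (not study data).
iq_scores = [85, 92, 100, 104, 110, 121]
face_rates = [0.5, 0.3, 0.6, 0.4, 0.7, 0.5]
rho = spearman_rho(iq_scores, face_rates)
print(rho)
```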

Figure 6

Face resemblance and the general IQ scores in ASD individuals. No substantial link occurs between face resemblance (as proportion of face responses) and the IQ scores.

Discussion

Using the recently created Face-n-Food task, which consists of a set of food-plate images composed of food ingredients such as fruits, vegetables, and sausages34,35,36, we investigated face tuning in autism. The key benefit of these images is that their single components do not explicitly trigger face processing. The findings indicate that ASD individuals exhibit poor face tuning on the Face-n-Food task. Thresholds for recognition of the Face-n-Food images as a face were substantially higher in ASD individuals than in matched TD controls: ASD individuals did not report seeing a face on images that matched TD controls easily recognized as a face, and gave fewer face responses overall.

The most plausible explanation for this outcome is that deficits on the Face-n-Food task are triggered by difficulties of ASD individuals in visual feature integration: one can perceive a Face-n-Food image either as a composition of elements (fruits, vegetables, etc.) or as a whole or Gestalt (a face). Once seen as a face, the Face-n-Food images are processed with a strong face-dominating bias and, therefore, top-down influences may substantially affect bottom-up visual processing of these images. In line with this, recent findings in TD adults indicate that original Arcimboldo hidden-face portraits are judged as being more ambiguous by perceivers with a local as compared with a global perceptual style39, 40.

Recently, the Face-n-Food paradigm was used to examine face tuning in another neurodevelopmental disorder, namely, in individuals with Williams-Beuren syndrome (WS)36. Strikingly, although WS individuals possess a hypersocial personality profile that is manifested as a drive for social interaction and particular face fascination, their tuning to faces is extremely poor. As compared to individuals with WS, the sample of ASD individuals appears to be less impaired on the Face-n-Food task. Although the two groups differ with respect to gender, cultural background, and age of participants (WS individuals in Pavlova et al.36 were aged 23.3 ± 10.6 years with age range 8 to 44 years, whereas ASD individuals in the present study were aged 14.13 ± 1.86 years with age range 11 to 17; chronological age can be considered one of the predictors of featural face decoding in autistic individuals, but less so in WS individuals41), it appears that WS individuals have more difficulty in spontaneously recognizing the images as a face. The threshold of face tuning on the Face-n-Food task is much higher in WS than in ASD individuals (image 8.18 ± 1.47 vs 5.36 ± 1.91, respectively). Most interestingly, as indicated by the form and slopes of the fitted face recognition curves (Fig. 4 of this paper and Fig. 4 of Pavlova et al.36), the recognition dynamics are remarkably different. Although both groups gave about 85% face responses even on image 10, which most resembles a face, WS individuals did not recognize the first five images as a face at all, and were close to 50% face recognition even on image 9, whereas in ASD individuals, face recognition increased continuously, exceeding 50% already on images 6–7, and reached the ceiling level for their group on images 8–10. This means that face tuning scarcity in ASD and WS individuals may be of diverse origin. Clarification of the precise nature of this deficit is of immense value.
Our assumption about the diverse origin of face processing impairment dovetails well with electroencephalographic (EEG) findings. Differences in EEG gamma band oscillatory brain activity (which is thought to underlie visual binding of elements) suggest that, although both ASD and WS individuals tend to rely more on featural processing in face recognition, the precise machinery of featural processing differs between the two disorders: in autism, apparently normal bursts of gamma activity occur, but they are rather similar for upright and inverted faces, whereas in WS individuals, no clear gamma peaks are observed for either upright or inverted faces42.

To date, there is no consensus on whether face encoding abilities are preserved in autism. The evidence on the origin of face encoding deficits in ASD remains controversial, with some arguments for typical holistic processing and other arguments for atypical development with a preference for featural encoding8, 41. The fusiform face area (FFA), a region known to be heavily involved in typical face processing, is less or differently activated in autism43,44,45,46. Moreover, lower functional connectivity of the FFA with the frontal cerebral cortex has been reported46. Other indispensable parts of the social brain, such as the superior temporal sulcus (STS) and the amygdala, which is heavily involved in affective processing with and without awareness47, also exhibit atypical activation in autism48,49,50,51. In ASD, EEG shows an atypical face-specific N170 component of the event-related potentials (ERPs) with a bilateral, as compared to the typically right-lateralized, voltage distribution52. Yet functional magnetic resonance imaging (fMRI) studies found normal FFA activation when gaze patterns and attention to faces, and notably to the eyes, were controlled48, 53. It is also assumed that aberrant face encoding in ASD primarily stems from an eye-avoidance strategy resulting from the eye region of the face being perceived as socially threatening14. In ASD individuals, a positive correlation is found between amygdala activation and gaze time spent in the region of the eyes44. The present work suggests a limited ability of ASD individuals to see faces in the Face-n-Food images, and may be considered a further step towards bringing the Face-n-Food task into clinical settings. One possible explanation for this outcome is that the Face-n-Food test may be more sensitive to preferences for the featural/local face encoding strategy.

Only a few brain imaging studies in TD individuals have investigated the brain response to Arcimboldo-like images, and the outcome of these studies appears controversial. In the occipito-temporal network underpinning face processing (including the FFA), bilateral superior and inferior parietal cortices, and the right inferior frontal gyrus, Arcimboldo portraits elicit a greater fMRI response than Renaissance portraits and non-artistic face representations (photographs)39. When contrasted with the same paintings presented upside down, Arcimboldo portraits activate the right FFA and posterior STS54, two essential parts of the social brain. In the right hemisphere, the face-sensitive N170 component of the ERP is the same in response to Arcimboldo portraits and natural faces, whereas in the left hemisphere the N170 amplitude is larger for natural faces55. In 7–8-month-old infants, near-infrared spectroscopy indicates that the left temporal area is more responsive to the Arcimboldo portraits than to single elements (such as vegetables) constituting these images38.

As mentioned earlier36, an important step in further elaborating the Face-n-Food task would be recording functional brain activity. Comparison of topographic patterns and temporal dynamics of the neural circuitry underpinning facial processing (with hubs in the FFA and posterior STS, which are considered pivots of the social brain) between individuals with diverse neurodevelopmental disorders such as ASD and WS can add essential information on typical and atypical face processing.

In the light of profound non-face visual perceptual deficits in ASD56, it is essential to figure out whether face processing has a special status in ASD individuals. In the present study, face tuning on the Face-n-Food task does not relate to general cognitive abilities as measured by general IQ. This indicates that the poor performance on this task in ASD individuals does not relate to possible intellectual or cognitive disability. In other words, the impaired task performance in ASD most probably stems from face tuning as opposed to a number of other alternative explanations.

As only one female participant with ASD was enrolled in this work, one of the study’s limitations is that possible sex differences in autism could not be investigated. It is well known that males have a higher risk for developing ASD than females, with a sex ratio of about 4:1 (ref. 57) or even higher. Yet females are affected much more severely, and therefore, in high-functioning autistic individuals, this ratio may be higher. Furthermore, females are at higher risk of under-identification of autism due to a possible ‘female camouflage effect’58. Taken together, this suggests that in clinical settings, access to female ASD individuals is more difficult. The lack of studies in females with autism calls for a thorough investigation of their neurobiological profile2. Neuroanatomy of autism differs between females and males59. Structural MRI reveals sex-specific morphology in young children (aged 2–7 years) with autism: in boys with ASD, two male-specific regions of increased gray matter volume (the left middle occipital gyrus, BA 19, and right superior temporal gyrus, BA 22) are reported, whereas in girls with ASD, increased gray matter volumes are found in the bilateral frontal regions, right anterior cingulate cortex (BA 32), and the right cerebellum60. Studies investigating gender/sex differences in face encoding in autism are extremely sparse. It is reported that face identity and face recognition are more strongly impaired in autistic males61, 62. In autistic girls, but not boys, an atypical face-sensitive N170 component of the ERP is associated with symptom severity63. In the light of profound gender effects on the Face-n-Food task in TD adults34, 35, it is worthwhile for future research to take a close look at the potential gender impact on face tuning in ASD.

Resume

The outcome of this work indicates that autistic individuals exhibit substantial deficits in seeing faces in the Face-n-Food images represented by a composition of food ingredients in a manner bordering on the Giuseppe Arcimboldo style. In autism, thresholds for recognition of the Face-n-Food images as a face are substantially higher: they did not report seeing a face on the images, which TD matched controls easily recognized as a face. This outcome not only lends support for atypical face tuning, but provides novel insights into the origin of face encoding deficits in autism. The precise nature of this aberration including the brain mechanisms underlying face encoding and gender impact on this deficit remains to be clarified. In addition, comparison of face tuning in ASD and individuals with Williams syndrome36 suggests that face tuning scarcity in ASD and WS individuals may be of diverse origin.

Methods

Participants

Sixteen individuals with ASD (1 female, 15 males) were enrolled in the study. They were recruited at the Unit of Child and Adolescent Neurology and Psychiatry of Asst Spedali Civili (Civil Hospital) of Brescia, Italy. In addition to clinical expertise, autism13 was diagnosed using the Autism Diagnostic Interview - Revised (ADI-R)64 and the Autism Diagnostic Observation Schedule - Generic (ADOS-G)65. Participants were aged 14.13 ± 1.86 years (mean ± SD; age range, 11 to 17 years). Their general IQ (GIQ, WISC) was on average 100.06 ± 13.08 (ranging from 78 to 128; 14 of them had a general IQ higher than 90). Most of them (12 out of 16) had severity level 1 (DSM-5), three out of 16 had level 2, and the only female participant had level 3 (her general IQ was in the normal range, 95). Sixteen TD controls, pairwise matched with ASD individuals for gender and age, were recruited from the local community of Brescia, Italy. Participants were tested individually. All of them had normal or corrected-to-normal vision. None had previous experience with such images and tasks. The study was conducted in line with the Declaration of Helsinki and was approved by the local Ethics Committee of Asst Spedali Civili (Civil Hospital) of Brescia, Italy. Informed written consent was obtained from all participants or their care providers. Participation was voluntary, and the data were processed anonymously.

The Face-n-Food task

The Face-n-Food task was administered to participants. This task is described in detail elsewhere34,35,36. For this task, a set of ten images was created that were composed of food ingredients (fruits, vegetables, sausages, etc.) and resembled faces to different degrees. The images slightly border on the Giuseppe Arcimboldo style (Figs 1 and 2). Participants were presented with the set of images, one by one, in a predetermined order from the least to the most resembling a face (images 1 to 10). This order was determined in a previous study with TD volunteers34, and was used because, once seen as a face, Face-n-Food images are often processed with a strong face-dominating bias. On each trial, participants had to perform a spontaneous recognition task: they were asked to briefly describe what they saw. Their reports were recorded, and then analyzed by independent experts. For further data processing, the responses were coded as either a non-face (0) or a face (1) report. No immediate feedback was provided. To avoid time pressure that can potentially cause stress and negative emotional and physiological reactions blocking cognitive processes, there was no time limit on the task. With each participant, the testing procedure lasted no longer than 20–25 min.