Introduction

Eye contact plays a crucial role in human social communication. In typically developing populations, eye contact facilitates emotion recognition and mental state understanding1, 2 and increases positive affect3. During mutual gaze, the pupil size of typically developing individuals adapt to that of the interacting partner so that larger pupils in others elicit greater dilation4, 5. This involuntary effect termed pupillary contagion is observed in typically developing populations as young as four months of age6, 7 and is relatively independent of age thereafter8. Pupil size is directly controlled by joint innervation from both branches of the autonomic nervous system (ANS)9 and pupillary contagion is therefore interpreted as being driven by spontaneous arousal synchronization7. In typically developing populations, pupillary contagion correlates with an increased activity in brain regions involved in mentalizing and social cognition including the superior temporal sulcus and medial prefrontal cortex10.

Arousal synchronization between humans may promote social closeness and interaction11. Consistently, pupillary contagion is found to facilitate mutual trust and empathy in typically developed adults and could therefore have an important role in social communication4, 12. If this is correct, altered pupillary contagion would be expected in neurodevelopmental conditions associated with differences in social interaction and eye contact. This theory remains largely unexamined.

The current study examined pupillary contagion in Williams syndrome (WS), a rare genetic disorder caused by a hemideletion of approximately 25 genes at chromosome 7q11.23. WS is associated with a striking behavioral phenotype, characterized by high social approach motivation, increased reward value of social stimuli13 and increased attention to other’s faces14,16,17,17. Because of its distinct phenotype and known genetic cause, WS has been seen as a model condition for studies of social brain development and function18. Many individuals with WS have an intellectual disability (ID) in the mild to moderate range19 and while variable, many require a high degree of societal support20. Despite a heightened social drive and interest in social interaction, people with WS often experience challenges in social communication and cognition, even relative to other populations with intellectual disability. These include social communication differences21, altered Theory of Mind22, 23 and challenges establishing social reciprocity24. People with WS often find it difficult to judge the trustworthiness of others, which can expose them to risks25, 26. Reduced facial emotion recognition skill has also been reported27, although the results are inconsistent28. The combination of high social drive and social skill challenges seemingly contrasts with influential theoretical accounts of neurodevelopmental conditions such as autism, where low social motivation might fuel social cognition and interaction alterations29.

Autism is a neurodevelopmental condition associated with atypical social interaction and eye contact30, 31. A recent study reported typical pupillary contagion associated with autism32 suggesting that reduced pupillary contagion is not a necessary correlate of neurodevelopmental conditions associated with reduced sociability. To better understand the role of pupillary contagion in human social interaction and motivation, it is crucial to determine whether it is atypical in rare developmental conditions associated with increased rather than reduced sociability.

Several studies have documented atypical attention to other’s eyes in individuals with WS, although the exact nature and consequences of these alterations remain unclear33,35,36,36. Ridley and colleagues36 analyzed parent reports of eye contact in WS and found descriptions of an unusual quality of eye contact in a large proportion of participants, including staring, and brief glances. Experimental studies reported altered attention to the eye region of static images of faces in WS which seem time-scale dependent. When visual attention is averaged over several seconds, people with WS were found to attend more to the eye region than controls14, 34 whereas a recent study including a relatively large sample with WS (n = 37) found a reduced tendency to orient to others’ eyes directly following stimulus onset33. A study using the standardized Autism Diagnostic Observation Schedule (ADOS)37, a first-choice clinical scale to assess autistic symptoms, reported unusual eye contact in 50% of young children with WS38. In the ADOS, eye contact is judged as being either typical or atypical based on the examiner’s impression of eye gaze being appropriate, flexible, socially integrated and modulated or not.

Several studies have documented differences in social perception and decision-making between people with WS and idiopathic autism, that is autism of unknown origin (e.g., no presence of a genetic syndrome)15, 23, 27, 39, 40. However, studies using the ADOS have documented elevated levels of autistic symptoms in WS38, 41, and may therefore explain inter-individual variability in social processing.

To the best of our knowledge, pupillary contagion has not been studied in WS. However, the unique combination of heightened social drive, atypical eye contact, and social cognitive challenges means that the condition is ideal for disentangling how pupillary contagion varies, based on genetic and neurodevelopmental conditions and in human development in general. Since pupillary contagion has previously been found to be relatively stable throughout development, we compared participants with WS not only to age-parallel-matched typically developing participants but also to typically developing infants aged 4 to 6 months, i.e., the earliest age when pupillary contagion is consistently found. These groups were included to address some of the challenges associated with the selection of comparison groups in studies of rare genetic conditions associated with intellectual disability. Age-matched typically developing controls will differ from study participants on multiple parameters, including cognitive. Controls matched for cognitive ability will typically be much younger and therefore also differ from the study group on multiple parameters. Finally, a comparison group with intellectual disability of unknown genetic origin may be included. This avoids some of the problems with typically developing control groups but is also likely to result in a comparison group with high but unknown genetic heterogeneity, since most cases of intellectual disability in the moderate to mild range are known to be genetic42.

Methods

Stimuli

Images of the eye region of human faces with a neutral emotional expression (see Fig. 1) were presented. The original images were taken from the Karolinska Directed Emotional Faces database43 and digitally edited to create three versions of each image with different pupil sizes, henceforth referred to as constricted, medium, and dilated. Constricted pupils were 40% smaller than medium pupils, which were in turn 40% smaller than dilated pupils. The three stimulus categories had identical mean image brightness6 and covered 21.74° of the visual field horizontally and 8.03° vertically. The full stimulus set consisted of images of six models (3 males, 3 females) (Image IDs: BF19NES, BF13NES, BF01NES, AM14NES, AM10NES, and AM08NES), editing described in Ref.6. For examples, see Fig. 1.

Figure 1
figure 1

Example of stimulus images.

Procedure

Participants completed one of two versions of the task. The long version included images of all six models which were presented twice with each pupil size, resulting in 36 trials, following the original study where the stimuli were used6. The short version of the task included four of the six models (3 male, 1 female) presented once with each pupil size, resulting in 12 trials. Data collection started using the shorter version of the task to minimize the attentional demands of participants. Qualitative observations during testing indicated that the task was not too demanding for most participants. To facilitate comparisons with the first study using the stimuli6, and increase the number of data points, we therefore switched to the longer version of the task. Stimuli were shown for 3 s in counterbalanced order, preceded by a fixation cross shown for one second. Data collection took place at a lab facility at the Karolinska Institute, at the Uppsala University Child and Baby Lab (infants), or at the Ågrenska Foundation in Gothenburg, Sweden. The study is part of a larger project which examines the behavioral, genetic, neural, and medical aspects of rare genetic disorders13, 33, 44.

Participants

The study included individuals with WS and two comparison groups: typically developed children and adolescents, and typically developing infants. Since WS is a rare condition, it was not deemed feasible to determine a sample size a priori. Instead, recruitment was conducted continuously and stopped when information about the study had been spread through all available channels (see Williams syndrome) and when the sample size was comparable to previous studies finding a pupil contagion effect6, 732, 45. Sample size in the comparison groups was determined based on previous studies of pupil contagion in typically developing individuals57, 45, 46.

Demographic information, baseline pupil size, and number of included trials by group and condition are shown in Table 1. TD-infants were (by definition) younger than the other groups. No other differences were found.

Table 1 Demographic characteristics and number of valid trials.

Williams syndrome

Individuals with WS (n = 44, age range 9–57 years) were recruited through contacts with Habilitation and Health Care services, the Swedish Williams Syndrome Association, and the Ågrenska Foundation. Five additional participants were tested but excluded because of invalid data. Protocols from genetic assessments were available for 22 participants. These had typical deletions at the 7q11.23 region. In individuals not genetically tested in the current study, a diagnosis of WS was confirmed from medical records and caregiver report (n = 4) or parental report only when medical records could not be obtained (n = 18). Demographic characteristics are shown in Table 1. Fifteen of the 44 participants completed the longer version of the task.

Psychiatric assessment

Twenty-two individuals, and their caregivers, completed a larger assessment including the Wechsler Adult Intelligence Scale, 4th Edition (WAIS-IV47) or the Wechsler Intelligence Scale for Children, 5th Edition (WISC-V48) depending on their age, and a diagnostic interview for psychiatric disorders conducted by a clinical psychologist or psychiatrist using the Mini International Neuropsychiatric Interview (MINI)49. Co-occurring psychiatric conditions were specific phobia, n = 8, Attention deficit/hyperactivity disorder (ADHD), n = 3, panic disorder, n = 4, post-traumatic stress disorder (PTSD), n = 1, tics disorder, n = 1, depression, n = 1, autism, n = 1. These participants completed the ADOS-2 with module selected depending on the participant's age and verbal ability (Module 2: n = 1, Module 3: n = 3; Module 4: n = 4). ADOS-2 gives separate scores for the social communication (termed social affect) and restricted and repetitive behavior (RRB) domains of autistic symptoms as well as a total score. Following50, raw scores were converted to calibrated severity scores (CSS) to facilitate comparisons across ADOS-2 modules.

Typically developing children and adults (TD, age range 9 to 63 years) were recruited from a database of individuals expressing interest in participating in research and through advertising in social media. Exclusion criteria were any ongoing psychiatric or neurological condition. Initially, 66 individuals expressed an interest and participated. Of these, the absence of an ongoing psychiatric disorder was confirmed by an experienced clinical psychologist using the MINI in 51 cases, and by self-report in 1549. Of the initial sample, one was excluded because of a lack of valid data, resulting in a final sample of 65 individuals of which 50 completed the short, and 15 the long version of the task.

Typically developing infants (TD-infant, age range 4 to 6.5 months) were recruited through a database of families expressing interest in participating in developmental research. The final sample consisted of 79 infants of which 31 completed the short, and 48 the long version of the task. In addition, eight participants were seen but excluded because one parent had a diagnosis of a neurodevelopmental condition (n = 4), gestational age less than week 35 (n = 1), birth complications (n = 1), suspected genetic syndromes in multiple second-degree relatives (n = 1), or low data quality (n = 16). Infants completing the shorter version of the task were recruited specifically for the current study, whereas the group completing the longer version were tested in the context of a previous study using the same stimuli6.

Ethical approval and consent to participate

The study was conducted in accordance with the declaration of Helsinki and approved by Regional Ethics Committee of Stockholm, Sweden (dnr 2018/1218-31 with subsequent amendments). Participants aged 15 or above gave verbal and written consent to participate. Legal guardians of participating children gave written consent.

Pupil dilation recording and analysis

Data were recorded using Tobii corneal reflection eye trackers (Tobii Inc., Danderyd, Sweden). Eighty-two participants (TD, n = 50, WS, n = 29) were tested with a Tobii X-120 which samples the pupil at 40 HZ and gaze at 120 HZ, 30 participants with a Tobii Spectrum (TD, n = 15, WS, n = 15) at 1200 HZ. The infant group was tested with a Tobii T120 at 60 HZ (n = 48) or a TX300 at 120 HZ (n = 31).

Data were first downsampled to 40 HZ. Subsequently, gaps in the data shorter than 8 samples were interpolated. Samples directly preceding and succeeding an absolute sample-to-sample change in pupil size > 3* the median absolute distance (MAD) were removed, since sudden changes of this magnitude are physiologically unlikely. After that, data were filtered with a moving average filter covering 8 samples. Initial visualization of the data (see Fig. 3) showed a pupillary light reflex caused by change in luminance at stimulus onset approximately within the 0–1.5 s time period following stimulus onset. Since the pupillary light reflex is not sensitive to attention and has other neural correlates than task-evoked pupil dilation9, this interval was removed from analysis in line with previous studies5, 8. The pupil dilation response was defined as the peak pupil dilation during the 1.5–3 s interval, baseline corrected by subtracting mean pupil size during the last 0.4 s of the preceding fixation cross to correct for differences in tonic pupil size6.

Trials with < 30% valid samples during the baseline or trial interval were rejected (WS: 11.80%, TD: 3.68%, TD-infant: 28.10%). To maximize the number of included data points, all valid trials were included, i.e., participants were not excluded if they only contributed a small number of trials. However, results did not change when participants completing less than two valid trials per condition were excluded. Saccades were identified with an I-VT filter algorithm with a velocity threshold set to 25°/second. Fixations were defined as periods between saccades. Subsequent fixations at a distance shorter than 0.5° and occurring within 75 ms were merged. Eye movement data were analyzed at the original sampling rate.

Statistical analysis

Data were analyzed using Bayesian linear mixed effects models (LMM) with random effects for individual. LMMs model simultaneously model data at the levels of trial and participant, which is preferable to traditional analysis of variance (ANOVA) in designs with multiple trials per participants and potential imbalance between conditions in the number of included trials due to data loss51. Perhaps most importantly for the current study, LMMs do not assume an equal number of trials per participants and can account for potential variation between stimuli within a condition by specifying random factors. The statistical models included pupil dilation as dependent variable and trial number, percent gaze recorded during the trial, sex, and age as covariates. Random factors were participant id (i.e., trials from the same individual were treated as repeated measures) and stimulus model. Eye tracker model was included as an additional covariate in the WS and TD groups, but not in infants since all infants tested at 120 HZ were older than four months, leading to multicollinearity between age and eye tracker.

A set of exploratory analyses examined the relationships between individual estimates of the pupillary contagion effect and symptoms of autism and IQ. Participants with four or more valid trials in the medium and dilated conditions and available clinical data were included in these analyses (n = 16, note that 22/44 participants completed the longer clinical assessment described under Psychiatric Assessment). Visualizations of the ADOS-2 total and RRB scores indicated a bimodal distribution, and they were therefore transformed to a binary variable with two levels: no- to minimal symptoms (0–1 points) or symptoms present (2 or above). Continuous variables were screened for outliers defined as values outside the median + /− 3*MAD range. This threshold for outlier detection can be considered as “very conservative”52. One outlier value in ADOS-2 Social Affect was detected and replaced by the highest acceptable value.

We reasoned that evidence for a lack of pupillary contagion in WS would be equally informative as evidence for it, as a lack of pupillary contagion would indicate a potentially important deviation from normative development. Because of this, we chose to use Bayesian statistics for the current analysis. Bayesian statistics directly compare the likelihood of the hypothesis (e.g., that a pupillary contagion effect exists) to that of a null hypothesis (e.g., that no contagion effect exists). Therefore, three potential conclusions are possible: (1) that the null hypothesis is best supported, (2) that the hypothesis is best supported, and (3) that the data is inconclusive. This contrasts with traditional null hypothesis significance testing (NHST) based on p-values. In NHST, a result can only indicate that the null hypothesis should, or should not be rejected, but not the degree to which it is supported, even if p-values are high. In studies of rare genetic disorders, it may be particularly important be able to differentiate between support for the null hypothesis and insufficient support for the hypothesis (is there evidence for absence of an effect, or absence of evidence?). Specifically for our study, we tested whether there is evidence to support the existence of pupillary contagion in WS (H1), evidence to support its absence (H0), or whether the results are inconclusive.

Statistical decisions were based on Bayes factors (BF10) which quantifies the relative support for the hypothesis over the null. By convention, a BF10 > 3 indicates support for the hypothesis, BF10 > 20 strong support, and BF10 > 150 very strong support. A BF10 > 0.5 and < 3 indicates inconclusive results53. A BF10 can be converted to reflect the degree of support for the null (BF01) through the formula 1/BF10 which can be interpreted at the same scale. Since the pupillary contagion effect implicates larger pupil dilation to relatively larger pupils, Bayes factors are expressed for directed hypotheses. To facilitate comparison with traditional null hypothesis significance testing, we report p-values based on the probability of direction (pd) measure. Pd ranges between 0.5 and 1 and indicates the degree to which the posterior distribution goes in the observed direction. The value 1-pd is equivalent to a one-sided p-value54.

Bayesian parameter estimation results in a distribution of plausible parameter values in which a highest density interval (HDI) can be defined, which covers the most plausible values. Following55, we compared the 95% HDI (the credibility interval, CI) of the estimated parameters to a region of practical equivalence (ROPE) covering 0 + /0.1 SD of the dependent variable corresponds to a small effect size56. Overlap between the ROPE and the HDI indicate that the parameter value is practically equivalent to zero. The package rstanarm57 in R58 was used to fit all models. The BayestestR library was used to calculate Bayes factors, 95% HDIs and ROPE intervals. Weakly informative Cauchy priors (μ = 0, scale = 0.707) were used in all analyses. Visualizations indicated that the residuals from all models were approximately normally distributed. R-hat values > 0.999 for all parameters, indicating good model convergence.

Results

Covariates

Looking time at the stimuli (%gaze) was not associated with pupil dilation in any of the groups (BF01 > 3), see Fig. 2). Higher age was associated with lower pupil dilation in typically developing children and adults (BF10 = 10.24) with a small effect size, indicated by almost complete overlap with the ROPE. No relation between pupil dilation and age was seen in the other groups (BF10 < 1). Other covariates were not associated with pupil dilation (all BF10 < 2.7, see Table 2).

Figure 2
figure 2

Heatmaps of gaze allocation to constricted, medium sized, and dilated pupils by group. Gaze data were smoothed with a Gaussian filter with sigma corresponding to 1 degree of the visual field. The scale goes from blue to red, with more gaze allocations at more red areas.

Table 2 Results of Bayesian linear effect models in typically developing groups.

Main analysis: typically developing groups

As expected, higher pupil dilation to eyes with dilated as compared to medium pupils was found in typically developing children and adults (BF10 = 17.69) and infants (BF10 = 5.23), see Fig. 3. As can be seen in Table 2, the posterior distributions had only a small overlap with the 95% ROPE, providing further evidence for the hypothesis. Evidence for a difference between dilated and constricted pupils was inconsistent in both typically developing groups (children and adults: BF10 = 1.20, infants: BF10 = 0.36). However, it can be noted that the estimated effect was in the expected direction with a one-sided p-value of p = 0.01 for children and adults and p = 0.032 for infants.

Figure 3
figure 3

Mean pupil dilation while viewing constricted, medium, and dilated pupils as a function of time split by group. Shaded areas cover + / − 1 standard error of the mean.

Main analysis: Williams syndrome

The data strongly supported the null hypothesis that pupil dilation responses did not differ between conditions (all BF10 ≤ 0.04, all BF01 ≥ 25.00, see Fig. 3). As seen in Table 3, the posterior distributions of the effects had large overlaps with the ROPE, suggesting that the effect sizes can be considered practically equivalent to zero.

Table 3 Results of Bayesian linear effect models in Williams syndrome.

Relationship between pupil contagion, autistic symptoms, and IQ in Williams syndrome

A negative relationship was found between ADOS-2 Social Affect and pupillary contagion (b =  − 0.037, CI =  − 0.071, 0.004, BF10 = 3.278, p = 0.028, see Fig. 4 and Table 4), indicating that one point increase in social communication symptoms corresponded to a 0.037 mm lower pupillary contagion effect, approximately 75% of the pupillary contagion effect seen in TD-adults and children. In contrast, inconclusive evidence for a relationship between pupillary contagion and ADOS-2 total and RRB scores and IQ were found (see Fig. 4 and Table 4).

Figure 4
figure 4

Relationship between pupillary contagion and ADOS-2 measures of autistic symptoms of social affect (A), repetitive behaviors and restricted interests (B), and ADOS-2 Total scores (C) in individuals with Williams syndrome. (B,C) show dichotomous variables due to a bimodal distribution.

Table 4 Relationship between pupillary contagion and clinical and demographic factors in Williams syndrome.

Discussion

Pupillary contagion is a developmentally early emerging phenomenon believed to facilitate social affiliation and group interaction. The current study examined pupillary contagion in WS, a condition associated with a combination of a hypersocial behavioral phenotype and social interaction challenges. The data gave strong support for the null hypothesis that, at the group level, pupillary contagion is not seen in WS. Although the null hypothesis was supported, there was considerable variability between individuals. A follow-up analysis showed that this variability was partly explained by autistic symptoms of social communication, so that individuals with lower symptom levels showed a more normative pupillary contagion effect.

Taken together, these results suggest that high social drive, a cardinal feature of WS, does not necessarily result in spontaneous synchronization of arousal through the eyes. Instead, lack of spontaneous synchronization of arousal may be part of the underlying mechanism of social communication challenges faced by individuals with WS. For example, people with WS often have challenges judging the trustworthiness of others4, 59, an ability previously linked to pupil contagion4.

The present results may also partly explain why a high level of interest in others’ eyes as commonly seen in WS does not lead to social interaction skills. As noted in the introduction, studies examining attention to others’ eyes over periods ranging from seconds to minutes demonstrated a high level of attention to this region14, 17, 34. In a striking contrast, studies examining the time scale33 and communicative use36, 38 of eye contact in people with WS demonstrated clear differences from the typical developmental pattern. This underlines the importance of treating social attention in neurodevelopmental conditions as a multifaceted concept involving multiple cognitive and attentional mechanisms60.

Replicating previous studies, pupillary contagion was found in neurotypical comparison groups from four-month-old infants to adults completing the same task. This suggests that the reduced pupillary contagion effect found in WS represents an altered developmental trajectory at a very early stage. It should be noted that Bayesian statistics directly quantifies the support for the null, and that the null result in WS does not merely indicate inconclusive data, as would be the case with a high p-value. In the present study, a negative relationship between pupillary contagion and social communication symptoms was seen with an observed effect size of 0.37 mm (CI − 0.037, − 0.071) corresponds to approximately 75% of the average pupillary contagion effect observed in typically developing children and adults, suggesting that a practically meaningful part of the individual variability was explained. Only a subset of 16 individuals were available for this analysis. Although this small subsample merits caution, it is on par with the majority of previous eye tracking studies of WS. It should also be noted that Bayesian statistics are more robust to false positives than traditional null hypothesis significance testing53.

Although the neural mechanisms underlying diminished pupil contagion in WS remain to be understood, it should be noted that functional and structural atypicalities have been noted in medial prefrontal brain regions in WS18, 61 which have also been suggested to underlie pupil contagion in typically developing adults10. Atypical tonic and phasic activity in the ANS have also been suggested in previous studies.

Although the current study included one of the largest groups with WS in experimental research, it should be noted that the age range was wide. We were also not able to examine the longitudinal predictive relationships between pupillary contagion and later outcome. While the present study demonstrates altered pupillary contagion in WS compared to typically developing population across the development, additional studies involving genetically homogeneous populations with intellectual disability are needed to examine whether this is a syndrome-specific alteration.

The current study suggests that reduced synchronization of arousal may be one of the reasons why people with WS often have challenges with social interaction, despite their hyper-social personality and strong approach motivation. It is also in line with previous suggestions that, despite its hypersocial phenotype, WS is linked to altered processing of other’s eyes33, 35. These results contribute to our understanding of human sociability.