Infants’ looking preferences for social versus non-social objects reflect genetic variation

To what extent do individual differences in infants’ early preference for faces versus non-facial objects reflect genetic and environmental factors? Here, in a sample of 536 5-month-old same-sex twins, we assessed attention to faces using eye tracking in two ways: initial orienting to faces at the start of the trial (thought to reflect subcortical processing) and sustained face preference throughout the trial (thought to reflect emerging attention control). Twin model fitting suggested an influence of genetic and unique environmental effects, but there was no evidence for an effect of shared environment. The heritabilities of face orienting and face preference were 0.19 (95% confidence interval (CI) 0.04 to 0.33) and 0.46 (95% CI 0.33 to 0.57), respectively. Face preference was positively associated with later parent-reported verbal competence (β = 0.14, 95% CI 0.03 to 0.25, P = 0.014, R² = 0.018, N = 420). This study suggests that individual differences in young infants’ selection of perceptual input (social versus non-social) are heritable, providing a developmental perspective on gene–environment interplay occurring at the level of eye movements.


Supplementary Tables
Supplementary Table 1. Univariate saturated model for the number of objects explored (in 0-10 seconds) including covariates (age and sex).
The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
Model definitions. The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model).
-2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 2. Univariate twin model fit statistics and parameter estimates for number of objects explored (in 0-10 seconds).
The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony). -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 3. Univariate saturated model for face orienting (proportion of first look to faces) including covariates (age and sex).
The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model). -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 4. Univariate twin model fit statistics and parameter estimates for face orienting (proportion of first look to faces).
The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony). -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.
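As a rough illustration of how the Δχ2 and Δdf columns in these tables translate into a p-value, the sketch below runs a likelihood-ratio test between a nested model and its comparison model. The function names and the example numbers are hypothetical and are not taken from the tables. The sketch assumes that df is reported as observations minus free parameters, so that the more constrained model has the larger df. The χ2 tail probability uses the standard closed-form series for integer degrees of freedom rather than a statistics package.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in x.
    tail = math.erfc(math.sqrt(half))
    if df == 1:
        return tail
    term, total = 1.0, 1.0
    for i in range(1, (df - 1) // 2):
        term *= x / (2 * i + 1)
        total += term
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


def likelihood_ratio_test(m2ll_nested, df_nested, m2ll_full, df_full):
    """Return (delta_chi2, delta_df, p) for a nested vs. comparison model.

    Assumes df = observations minus free parameters, so df_nested > df_full.
    """
    delta_chi2 = m2ll_nested - m2ll_full
    delta_df = df_nested - df_full
    if delta_df < 1:
        raise ValueError("the nested model must have more degrees of freedom")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical numbers, for illustration only.
d_chi2, d_df, p = likelihood_ratio_test(1003.841, 530, 1000.0, 529)
```

With these made-up inputs the constrained model sits right at the conventional .05 boundary; a clearly smaller p would indicate a significant decrement in fit relative to the comparison model.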

Supplementary Table 5. Univariate saturated model for the face preference (proportion looking time on face) including covariates (age and sex).
The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
Model definitions. The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model).
In bold: models with a significantly poorer fit compared with the saturated model. -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 7. Assumption testing for the bivariate model between face orienting (proportion of first look to faces) and face preference.
The χ2 distribution and associated p-value were used to test the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
Model definitions. The Fully Sat. model is the fully saturated model of the observed data, which models the means and variances for both variables, and the phenotypic and cross-twin-cross-trait correlations between the two variables, separately for each twin in a pair and across zygosity. 5. In the bivariate model fitting, the constrained saturated model equates means, variances, phenotypic and cross-twin-cross-trait correlations across twins within a pair and across zygosity, for both variables of interest. The best-fitting model (in bold) was the non-significant and most parsimonious model, as well as the one with the lowest AIC.
-2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.
Model definitions. The Fully Sat. model is the fully saturated model of the observed data, which models the means and variances for both variables, and the phenotypic and cross-twin-cross-trait correlations between the two variables, separately for each twin in a pair and across zygosity.

In bold: the best-fitting model, which was the non-significant model with the lowest AIC.
-2LL = fit statistic, which is minus two times the log-likelihood of the data.

Supplementary Table 9. Summary of the two Generalized Estimating Equations models including genome-wide polygenic scores (GPSs) for autism, ADHD, bipolar disorder, major depressive disorder, and schizophrenia, 10 principal components of ancestry, and age and sex, as predictors of face orienting (proportion of first look to faces) or face preference, with twin pair id as cluster-defining variable. Significant predictors are in bold (adjustments were made for multiple comparisons by setting the alpha threshold according to the number of outcomes, α = .025).

1) Continuous raw eye tracking data were resampled to 60 Hz.
2) Off-screen gaze was marked as missing gaze.
3) X and Y coordinates were averaged when binocular data were present (data from one eye were used when the other eye was missing).
4) Large AOIs (centre, face, noise, car, bird, phone) were defined around each stimulus (see Supplementary Figure 2).
a. Raw data were assigned to AOIs by coding logical vectors (n=5) of gaze samples inside (1) and outside (0) each AOI.
b. AOI vectors were interpolated to fill in gaps of missing data shorter than 200 ms (i.e., recoded from 0 to 1).
c. Any runs of samples in an AOI vector shorter than 50 ms were recoded to 0 (trigger tolerance for AOI activation), to ensure that a minimum of 50 ms of gaze data was accumulated in an AOI for a look to be computed.
5) For each trial and AOI we computed:
a. Whether the AOI was looked at, coded as true if at least one contiguous run of samples lasting 50 ms was identified inside the AOI.
b. The latency of the first look (if a. was true), coded as the duration from the start of the trial to the first sample in the AOI.
c. The looking ratio in the AOI, coded as the number of samples in the AOI divided by the number of samples in all AOIs.
6) For each trial we also computed:
a. The number of objects looked at, coded as the count of AOIs looked at (i.e., 5. a. was true).
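Steps 4b to 6a above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, each AOI vector is assumed to be encoded as True (gaze inside the AOI), False (gaze elsewhere) or None (missing gaze), and a gap is only filled when it is flanked on both sides by samples inside the same AOI. The 60 Hz rate and the 200 ms and 50 ms thresholds come from the steps listed above.

```python
from itertools import groupby
from typing import Dict, Iterator, List, Optional, Tuple

RATE_HZ = 60  # sampling rate after resampling (step 1)


def shorter_than(n_samples: int, ms: int) -> bool:
    """True if a run of n_samples lasts strictly less than `ms` milliseconds."""
    return n_samples * 1000 < ms * RATE_HZ


def runs(values: List) -> Iterator[Tuple[object, int, int]]:
    """Yield (value, start_index, length) for each run of identical values."""
    pos = 0
    for value, group in groupby(values):
        length = len(list(group))
        yield value, pos, length
        pos += length


def clean_aoi(samples: List[Optional[bool]]) -> List[bool]:
    """Apply steps 4b and 4c to one AOI vector (assumed True/False/None encoding)."""
    filled = list(samples)
    # Step 4b: fill missing-data gaps shorter than 200 ms. Requiring in-AOI
    # samples on both sides of the gap is an assumption of this sketch.
    for value, start, length in runs(samples):
        end = start + length
        flanked = (start > 0 and end < len(samples)
                   and samples[start - 1] is True and samples[end] is True)
        if value is None and flanked and shorter_than(length, 200):
            filled[start:end] = [True] * length
    binary = [s is True for s in filled]
    # Step 4c: discard in-AOI runs shorter than 50 ms.
    for value, start, length in list(runs(binary)):
        if value and shorter_than(length, 50):
            binary[start:start + length] = [False] * length
    return binary


def first_look_latency_ms(binary: List[bool]) -> Optional[float]:
    """Step 5b: time from trial start to the first in-AOI sample, or None."""
    if True not in binary:
        return None
    return binary.index(True) * 1000 / RATE_HZ


def looking_ratio(aois: Dict[str, List[bool]], name: str) -> Optional[float]:
    """Step 5c: samples in one AOI divided by samples in all AOIs."""
    total = sum(sum(vec) for vec in aois.values())
    return sum(aois[name]) / total if total else None


def objects_looked_at(aois: Dict[str, List[bool]]) -> int:
    """Step 6a: number of AOIs with at least one look (step 5a true)."""
    return sum(any(vec) for vec in aois.values())
```

Because step 4c has already removed runs shorter than 50 ms, any remaining True sample implies a valid look, which is why step 5a reduces to `any(vec)` here. The caller is assumed to pass only the object AOIs when counting objects explored.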

Supplementary Methods 2. Sensitivity analyses for face orienting
To fulfil the pre-registered analysis plan, we report here the univariate analyses for face orienting operationalized as a composite average score of the proportion of trials in which the infant looked at the face as the first AOI and the mean latency to look at the face (latencies shorter than 120 ms were excluded prior to averaging across valid trials; see more details in the Methods section "Computation of primary measures"). The latency score was reversed and both measures were z-scored before averaging them. A higher score on this measure indicates faster and more frequent orienting to faces.

Model definitions. The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model). In bold: models with a significantly poorer fit compared with the saturated model. -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 12. Univariate twin model fit statistics and parameter estimates for the composite score of face orienting. Note that this is not the same measure as reported in the manuscript (proportion of first look to face). The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony). -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Model definitions. The Fully Sat. model is the fully saturated model of the observed data, which models the means and variances for both variables, and the phenotypic and cross-twin-cross-trait correlations between the two variables, separately for each twin in a pair and across zygosity. 5. In the bivariate model fitting, the constrained saturated model equates means, variances, phenotypic and cross-twin-cross-trait correlations across twins within a pair and across zygosity, for both variables of interest.
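The composite face-orienting score described in Supplementary Methods 2 can be sketched as below. This is an illustrative reading of the description, not the authors' code: the function names are hypothetical, z-scores are assumed to be computed across infants with the sample standard deviation, and infants with missing values are assumed to have been removed beforehand.

```python
from statistics import mean, stdev
from typing import List, Optional


def mean_valid_latency(latencies_ms: List[float]) -> Optional[float]:
    """Mean first-look latency for one infant, excluding latencies < 120 ms."""
    valid = [lat for lat in latencies_ms if lat >= 120]
    return mean(valid) if valid else None


def zscores(values: List[float]) -> List[float]:
    """Standardize across infants (sample SD is an assumption of this sketch)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def face_orienting_composite(prop_first_look: List[float],
                             mean_latency_ms: List[float]) -> List[float]:
    """Average of z(proportion of first looks) and reversed z(latency)."""
    z_prop = zscores(prop_first_look)
    # Reversing the latency makes shorter latencies give higher scores.
    z_lat_reversed = [-z for z in zscores(mean_latency_ms)]
    return [(p + l) / 2 for p, l in zip(z_prop, z_lat_reversed)]
```

An infant who looks at the face first on many trials and does so quickly ends up with a high composite, matching the stated interpretation of the measure.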

In bold: the best-fitting model, which was the non-significant model with the lowest AIC.
-2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Methods 3. Sensitivity analyses for number of objects explored (in 0-20 seconds)
To fulfil the pre-registered analysis plan, we report here the univariate analyses for number of objects explored operationalized as the number of objects looked at in the whole trial (20 seconds, in contrast with the measure reported in the main manuscript, where only the first half of the trial was included; see more details in the Methods section "Computation of primary measures"). As in the reported measure, the twin correlations suggested no familial influences on the number of objects explored in the entire trial (ICC MZ=.08, 95% CI [-.10, .25]; ICC DZ=.12, 95% CI [-.05, .28]). This suggests that the finding that variability in visual exploration is best explained solely by unique environmental factors (which include measurement error) is not driven by ceiling effects in the distribution of this measure.
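For readers who want a back-of-the-envelope reading of these twin correlations, the sketch below applies Falconer's classical approximations (A ≈ 2(rMZ - rDZ), C ≈ 2rDZ - rMZ, E ≈ 1 - rMZ). This is not the maximum-likelihood model fitting used in the analyses reported here, the function name is hypothetical, and the point estimates ignore the confidence intervals, both of which include zero.

```python
from typing import Dict


def falconer_estimates(r_mz: float, r_dz: float) -> Dict[str, float]:
    """Rough variance components from MZ and DZ twin correlations."""
    return {
        "A": 2 * (r_mz - r_dz),  # additive genetic
        "C": 2 * r_dz - r_mz,    # shared environment
        "E": 1 - r_mz,           # unique environment (incl. measurement error)
    }


# Twin correlations for objects explored in 0-20 seconds, as reported above.
estimates = falconer_estimates(r_mz=0.08, r_dz=0.12)
```

With rMZ = .08 and rDZ = .12 this gives roughly A = -0.08, C = 0.16 and E = 0.92. Since the MZ correlation does not exceed the DZ correlation, the approximation gives no indication of a genetic contribution, and almost all variance is attributed to E.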
Supplementary Table 14. Descriptive statistics for the number of objects explored during the entire trial length (0-20 seconds). Note that this is not the same measure as reported in the manuscript (number of objects explored in the first 0-10 seconds).
Model definitions. The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model).

-2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 16. Univariate twin model fit statistics and parameter estimates for the number of objects explored (in 0-20 seconds). Note that this is not the same measure as reported in the manuscript (number of objects explored in the first 0-10 seconds). The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony).

… (i.e., excluding looking time to the face). These measures can be seen as approximate analogues of the reported face attention variables, while at the same time being mathematically independent of these variables.
Car preference (mean proportion = .34) and car orienting (mean proportion = .33) were significantly above the chance level (which is .25 because we excluded the face). There were no significant genetic effects on either car orienting (the E model had the lowest AIC) or car preference (the univariate AE model had the lowest AIC, but the genetic effect estimate was not significant; see table below). In the bivariate model between car orienting and car preference (rPh = .25, 95% CI [0.17, 0.33]), the E model had the lowest AIC, with most E on car preference being unique to that variable (unique E = 0.94, 95% CI [0.89, 0.97]) and just a small but significant proportion being shared with car orienting (shared E = 0.06, 95% CI [0.03, 0.11]). These analyses suggest that while social (face) preference and orienting have a clear genetic contribution, etiological influences on preference for and orienting to the second most salient object (car) are different and do not seem to include substantial familial effects.

Supplementary Table 17. Twin correlation coefficients (95% confidence intervals are shown in parentheses) for the primary face looking measures, separate for MZ and DZ pairs.

The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model).

-2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 20. Univariate twin model fit statistics and parameter estimates for car orienting (proportion of first look to car) and car preference, including covariates. The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony).

… precision (root mean square of the gaze inter-sample Euclidean distances) measured in four post-hoc calibration stimuli presented in randomized positions at fixed time points during the task battery. Associations between accuracy and precision, and the visual attention measures, were tested within the GEE framework (one linear model with all gaze quality covariates as predictors was run for each primary variable).
Accuracy was found to be significantly related to face preference (in addition to proportion of missing gaze); no other links were found.
Accuracy and proportion of missing data were regressed out of face preference before repeating the analyses.
Results were the same as before for the univariate twin modelling of face preference, for the bivariate twin modelling (face preference and face orienting), and for the associations with questionnaire data and with polygenic scores.

Model definitions. The Fully Sat. model is the fully saturated model of the observed data, which models the means and variances for both variables, and the phenotypic and cross-twin-cross-trait correlations between the two variables, separately for each twin in a pair and across zygosity. 5. In the bivariate model fitting, the constrained saturated model equates means, variances, phenotypic and cross-twin-cross-trait correlations across twins within a pair and across zygosity, for both variables of interest. -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.
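The precision measure mentioned above (root mean square of the gaze inter-sample Euclidean distances) can be computed as in the following sketch. The function name is hypothetical, the input is assumed to be the gaze samples recorded while one calibration stimulus was on screen, and the result is in whatever units the coordinates use.

```python
import math
from typing import Sequence


def rms_precision(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Root mean square of distances between successive gaze samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long coordinate series with >= 2 samples")
    squared = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for x1, y1, x2, y2 in zip(xs, ys, xs[1:], ys[1:])
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Lower values indicate less sample-to-sample jitter. How the values from the four calibration stimuli were combined is not specified in the text above, so any averaging across stimuli would be a further assumption.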

Supplementary Table 30. Assumption testing for the bivariate model between face orienting (proportion of first look to faces) and face preference, including covariates (corrected age and sex). The χ2 distribution and associated p-value were used to test the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models. A = additive genetic influences. C = shared environment influences. D = non-additive genetic influences. E = non-shared environment influences. A-C/D-E.1 = variance on Phenotype 1 (proportion of first look to face). A-C/D-E.12 = variance on Phenotype 2 (face preference) that is shared with Phenotype 1 (proportion of first look to face). A-C/D-E.2 = unique variance on Phenotype 2 (face preference). In bold: the best-fitting model, which was the non-significant model with the lowest AIC.
Model definitions. The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model).
In bold: the best-fitting model, which was the non-significant model with the lowest AIC.

Supplementary Table 6. Univariate twin model fit statistics and parameter estimates for face preference (proportion looking time on face). The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony).

Supplementary Methods 1. Steps in gaze offline pre-processing:
Supplementary Table 11. Saturated model for the composite score of face orienting (proportion of first look to face and latency), including covariates (age and sex). Note that this is not the same measure as reported in the manuscript (only proportion of first look to face). The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
The twin correlations suggested genetic influences on the composite score (ICC MZ=.25, 95% CI [.08, .39]; ICC DZ=.13, 95% CI [-.06, .30]). The twin models confirmed these genetic effects, but the assumptions for twin modelling were not met.
Supplementary Table 10. Descriptive statistics for face orienting composite score and latency to look at the face AOI.
In bold: the best-fitting model, which was the non-significant model with the lowest AIC.
Supplementary Table 15. Saturated model for the number of objects explored (in 0-20 seconds), including covariates (age and sex). Note that this is not the same measure as reported in the manuscript (number of objects explored in the first 0-10 seconds). The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).

Supplementary Table 18. Descriptive statistics for car orienting and preference.
Columns: N (twin pairs*), car orienting, preference for car.
Supplementary Table 19. Saturated model for car orienting (proportion of first look to car) and car preference, including covariates (age and sex). The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).

Supplementary Table 21. Bivariate twin model fit statistics for car orienting and car preference. The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony).
In bold: the best-fitting model, which was the non-significant model with the lowest AIC. -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 22. Univariate saturated model for face preference (proportion looking time on face, regressed on proportion of missing gaze and accuracy) including covariates (age and sex). The χ2 distribution and associated p-value were used to test the effect of the covariates (there was evidence of an effect if there was a significant decrement in fit compared to the saturated model) and the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
Model definitions. The baseline model is the fully saturated model of the observed data, which models the means and variances separately for each twin in a pair and across zygosity. Age: testing the significance of age. Sex: testing the significance of sex. 1. Equating means across twins within a pair. 2. Equating means across zygosity. 3. Equating variances across twins within a pair. 4. Equating variances across zygosity (i.e., the constrained saturated model).
In bold: models with a significantly poorer fit compared with the saturated model. -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 23. Univariate twin model fit statistics and parameter estimates for face preference (proportion looking time on face, regressed on proportion of missing gaze and accuracy). The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony).
In bold: the best-fitting model, which was the non-significant model with the lowest AIC. -2LL = fit statistic, which is minus two times the log-likelihood of the data. df = degrees of freedom. AIC = fit statistic; lower values denote better model fit. Δχ2 = difference in -2LL statistic between two models, distributed as χ2. Δdf = difference in degrees of freedom between two models.

Supplementary Table 24. Assumption testing for the bivariate model between face orienting (proportion of first look to faces) and face preference (regressed on proportion of missing gaze and accuracy). The χ2 distribution and associated p-value were used to test the twin model-fitting assumptions (assumptions were violated if there was a significant decrement in fit compared to the saturated model).
Best-fitting model in bold. The best-fitting model was selected based on non-significance (meaning that there was no decrement in fit compared to the saturated or the genetic model, indexed by the χ2 distribution) and the AIC fit statistic (Akaike information criterion, which incorporates information about both explained variance and parsimony). In bold: the best-fitting model, which was the non-significant model with the lowest AIC.
Model definitions. The Fully Sat. model is the fully saturated model of the observed data, which models the means and variances for both variables, and the phenotypic and cross-twin-cross-trait correlations between the two variables, separately for each twin in a pair and across zygosity. 5. In the Bivariate model fitting,