Genetic contribution to ‘theory of mind’ in adolescence

Difficulties in ‘theory of mind’ (the ability to attribute mental states to oneself or others, and to make predictions about another’s behaviour based on these attributions) have been observed in several psychiatric conditions. We investigate the genetic architecture of theory of mind in 4,577 13-year-olds who completed the Emotional Triangles Task (Triangles Task), a first-order test of theory of mind. We observe a small but significant female-advantage on the Triangles Task (Cohen’s d = 0.19, P < 0.01), in keeping with previous work using other tests of theory of mind. Genome-wide association analyses did not identify any significant loci, and SNP heritability was non-significant. Polygenic scores for six psychiatric conditions (ADHD, anorexia, autism, bipolar disorder, depression, and schizophrenia), and empathy were not associated with scores on the Triangles Task. However, polygenic scores of cognitive aptitude, and cognitive empathy, a term synonymous with theory of mind and measured using the “Reading the Mind in the Eyes” Test, were significantly associated with scores on the Triangles Task at multiple P-value thresholds, suggesting shared genetics between different measures of theory of mind and cognition.

all or none. This may be manifested in children with autism developing theory of mind abilities later than age and IQ matched typical controls. Adults with autism also show difficulties in theory of mind, using age-appropriate tasks 14,15 . Similarly, a meta-analysis identified significant impairments in tasks involving theory of mind in individuals with schizophrenia 16 . Difficulties in theory of mind have also been identified in people with unipolar and bipolar disorders [17][18][19] , eating disorders 20 , and attention deficit and hyperactivity disorder or ADHD 21 . For example, a recent meta-analysis of theory of mind in people with eating disorders (15 studies, 677 cases and 514 controls) identified significant deficits in theory of mind in individuals with anorexia compared to typical controls 22 . In ADHD, theory of mind difficulties are associated with deficits in executive function 23 . Theory of mind is also predicted by measures of general cognition such as IQ and working memory 24,25 .
These differences in performance on tests of theory of mind could be due to underlying biology, or other environmental processes that mediate performance on tests of theory of mind in individuals with psychiatric conditions. Here, we test the genetic correlates of first-order theory of mind using the Emotional Triangles Task (Triangles Task) 6 . In the Triangles Task, participants are required to attribute mental states to animated triangles (e.g., "the triangle is angry"). In the original version of this task, the participant is simply asked to describe what they see, and the spontaneous narratives are coded for the number and type of mental state attribution. In the version used in the current study, participants are asked to pick the right mental state in a forced choice format based on motion-cues of the triangles. The sample comprised 4,577 13-year olds from the Avon Longitudinal Study of Parents and Children for whom we had both genetic and phenotypic data, after quality control. They took the Emotional Triangles task during adolescence, a period marked by key changes in neural architecture, in peer-relationship, and in hormonal profile 26,27 . Interrogating the genetic relationship between theory of mind at this age and risk for psychiatric conditions with known difficulties in theory of mind (autism, ADHD, anorexia nervosa, bipolar disorder, depression, and schizophrenia) may identify genetic biomarkers.
This study has three specific aims: 1. To determine the SNP heritability of theory of mind in 13-year-olds, measured using the Emotional Triangles Task; 2. To identify genes and genetic loci associated with the Triangles Task; and 3. To test if polygenic risk scores for six psychiatric conditions (ADHD, anorexia, autism, bipolar disorder, depression and schizophrenia), cognitive aptitude, and two different measures of empathy (the Empathy Quotient (EQ) 28 , which is a self-report measure of empathy, and the Eyes Test 10 , which is a measure of cognitive empathy) predict performance on the Triangles Task in 13-year-olds.

Methods
Phenotype and participants. Theory of mind was measured using the Emotional Triangles Task (Triangles Task) 6 . All participants were 13 years of age (born in April 1991-Dec 1992), and measures were collected as a part of the ongoing Avon Longitudinal Study of Parents and Children (ALSPAC). Data was queried using the fully searchable data dictionary, which is available online at http://www.bristol.ac.uk/alspac/researchers/access/. ALSPAC consists of 14,541 initial pregnancies from women resident in Avon, UK resulting in a total of 13,988 children who were alive at 1 year of age. In addition, children were enrolled in additional phases, which are described in greater detail elsewhere 29 .
The study received ethical approval from the ALSPAC Ethics and Law Committee, and written informed consent was obtained from parents or a responsible legal guardian for the child to participate. Assent was obtained from the child participants where possible. In addition, we also received ethical permission to use de-identified summary genetic and phenotype data from the Human Biology Research Ethics Committee at the University of Cambridge. All research was performed in accordance with the Helsinki Declaration.
The Triangles Task is a test of theory of mind where participants have to attribute mental states to non-living shapes (animated triangles). Test-retest reliability of the mental state coding scheme has identified an interclass correlation of 0.69 and a technical error of measurement of 0.66 6 . The Triangles Task consists of 28 questions −16 scored questions and 12 control questions. In each question, a 5-second animation of a triangle is shown and a question is asked about the mental state of the triangle (e.g. Was the triangle angry?). Participants are asked to choose from 0 to 5 (a Likert-scale) to respond to the question, where 0 indicates that the triangle did not possess the mental state described in the question, and 5 indicates that the triangle definitely possessed the mental state described in the question. In total, four mental states were tested (happy, sad, angry, and scared). For each mental state, there were two positive questions, where the mental state of the triangle matched the mental state described in the question, and two negative questions, where the mental state of the triangle did not match the mental state described in the question (e.g. the triangle is shown to be happy, and the subsequent question is "Was the triangle sad?"). Hence, in total, there were 16 questions that were scored. Control questions comprised asking if the triangle was living, and participants, again, had to choose between 0-5.
We calculated the total score by adding the score for all the positive questions and subtracting the score for the negative items. To avoid negative scores, we added 40 to the total score, giving the score a range from 0-80. We removed participants for whom we did not have any genetic data. We removed participants who had chosen the same answer for more than 50% of the items, including the control items, suggesting that they were not attending to the task. This was done after carefully evaluating the options selected and the reaction times. We noticed that, for these participants that 1. The reaction time was small for three or more consecutive items at various points in the test 2; the same option was chosen for three or more consecutive items at various points on the tests; 3. the same option was chosen for both the positive and negative questions; and 4. The control questions were answered incorrectly at various points in the test. After removing these participants, we had phenotypic and genetic data on 4,577 participants (n = 2,217 females, and n = 2,360 males).
Genotyping and Imputation. Genotyping  of America), and with support from personal genomics company 23andMe., Inc. This resulting raw genome-wide data were subjected to the following quality control procedures: Individuals were removed with discordant gender information, if there was excessive or low genetic heterozygosity, if missingness was >3%, if they were of non-European ancestry as measured using multidimensional scaling analyses compared with Hapmap II (release 22), and if there was evidence of cryptic relatedness (>10% IBD). SNPs were removed if they had a minor allele frequency <1%, deviated from Hardy-Weinberg equilibrium (P < 5 × 10 −7 ), and had a call rate <95%. This resulted in a total of 526,688 genotyped SNPs. Using SNP data from mothers and children (477,482 common SNPs between mothers and children), haplotypes were estimated using ShapeIT (v2.r644) 30 . Imputation was performed using Impute2 V2.2 31 against the 1000 genomes reference panel (Phase 1, Version 3), using all 2186 reference haplotypes including non-Europeans. Imputed SNPs were excluded from all further analyses if they had a minor allele frequency <1% and info <0.8, which resulted in a total of 8,282,911 SNPs.

Genome-wide association analyses and gene based analyses.
We conducted a genome-wide association analyses (GWAS) on the total score on the Triangles Task. In addition to the quality control procedure described above, we conducted additional quality control steps for the participants included in this study. We included only those SNPs with a minor-allele frequency >1%. We excluded all SNPs that were not in Hardy-Weinberg equilibrium (P < 5 × 10 −7 ), had a per-SNP missing rate >5%. Similarly, we excluded all individuals who had a genotype missing rate >10%. Regression analyses was run using a linear regression model in Plink 1.9 32 , with sex and the first two genetic ancestry principal components as covariates. BGEN files were converted to Plink format using hard calls. Calls with uncertainty greater than 0.1 were treated as missing. After quality control, 4,577 were included in the GWAS. Gene-based analyses was conducted using MAGMA 33 . Significant genes were identified after Bonferroni correction (P < 2.74 × 10 −6 ).
Heritability and Polygenic scores. SNP heritability was estimated using GCTA v1.26 34 (GREML) and LDSC 35 . For GCTA GREML, heritability was calculated after including sex and the first two genetic principal components as covariates. We investigated for inflation in Chi-square statistics due to uncorrected population stratification using LDSC 35 . Given that the cohort was comprised of unrelated individuals and individuals with cryptic relatedness were removed (IBD > 10%) during quality control of the raw genotype data, we calculated the genetic relatedness matrix using all individuals in the study.
Given the different polygenicity and power of the GWAS used as the training datasets, we constructed polygenic score using PRSice 36 at eight different P-value thresholds (0.01, 0.05, 0.1, 0.15, 0.20, 0.5, 0.8 and 1.0). We chose these thresholds so as to balance the signal to noise ratio in GWAS results used as training datasets. PRSice calculates a weighted-average score of all the relevant alleles. Clumping was performed in PRSice using an r 2 of 0.1. We used a linear model to regress the polygenic scores and the covariates against the scores on the Triangles Task. Polygenic scores were constructed using summary data for 6 psychiatric conditions (ADHD 37 (n = 55,374), anorexia 38 (n = 14,477), autism 39 (n = 15,954), bipolar disorder 40 (n = 16,731), major depressive disorder 41 (n = 16,610), and schizophrenia 42 (n = 79,845), cognitive aptitude 43 (n = 78,803), self-reported empathy (EQ.) 28 (n = 46,861), and cognitive empathy (the Eyes Test) 10 (n = 89,553). We excluded educational attainment from the current analyses given that the summary GWAS data available for the educational attainment GWAS included participants from the ALSPAC and is likely to increase the probability of false positives. We included sex, and the first two ancestry principal components as covariates for polygenic scoring. We did not include age as all participants were tested at approximately 13 years of age. We used Benjamini Hochberg FDR correction to correct for the tests conducted.
Data availability. Data were downloaded from the Psychiatric Genomics Consortium (PGC) website for the 6 psychiatric conditions (http://www.med.unc.edu/pgc/results-and-downloads), and the Complex Traits Genomics website for cognitive aptitude (https://ctg.cncr.nl/software/summary_statistics). Data for the EQ and the Eyes Test were obtained from 23andMe, Inc.

Results
Phenotypic distribution. The range of the scores of the participants was 28-80. Inspection of the frequency histogram and quantile-quantile plot suggested a normal distribution (Fig. 1). The mean score of the participants was 56.93 (SD = 7.43). Females scored significantly higher than males (Females: 57.68, SD = 7.43; Males: 56.22, SD = 7.36; P < 0.001, unpaired, two-tailed T-test), though the effect size was small (Cohen's d = 0.19). Genome-wide association analyses and heritability. Genome-wide association analyses did not identify any significant SNP. The sentinel SNP (SNP with the lowest P-value at the locus) at the most significant locus was rs2120452 (P = 6.8 × 10 −7 ) on chromosome 1. The SNP lies in a non-coding RNA LOC105372904. The sentinel SNP at the second-most significant locus was rs17753687 (P = 8.6 × 10 −7 ), which is an intronic SNP in BBS4, a gene implicated in Bardet-Biedl syndrome. Investigation of the QQ plots and LD score regression intercept did not reveal any inflation in effect sizes due to population stratification (LD score regression intercept = 0.99 ± 0.0063). SNP heritability was small and nonsignificant (LDSR -h 2 SNP = 0.13 ± 0.10; P = 0.16; GCTA -h 2 SNP = 0.072 ± 0.069; P = 0.29). Manhattan and QQ-plots are provided in Fig. 2.
Gene-based analysis. We conducted gene-based analysis using MAGMA, and did not identify and significant genes at a genome-wide P-value threshold of 2.74 × 10 −6 . The most significant gene was MARK4 at 19q13.32 (P = 2.96 × 10 −6 ). MARK4 is involved in phosphorylating microtubule associated proteins. It has high expression in the brain and in testes, according to GTEx 44 .
Polygenic scores. As the heritability was non-significant to conduct genetic correlation analyses, we conducted polygenic score regression with 6 psychiatric conditions, cognitive aptitude, cognitive empathy, and self-reported empathy. We tested polygenic score at 8 P-value thresholds. We did not identify a significant polygenic score after FDR-based correction for any of the six psychiatric conditions investigated (Fig. 3). Overall, the polygenic scores predicted limited variance for psychiatric conditions (Table 1). However, polygenic scores in both cognitive aptitude and cognitive empathy were significantly associated with scores in the Triangles Task across all the thresholds tested using after FDR correction. For cognitive aptitude and at two P-value thresholds for cognitive empathy, these results were significant even after using a more stringent Bonferroni correction ( Fig. 3 and Table 1). This likely reflects both the greater statistical power of the two datasets when compared to the other GWAS datasets in the condition and the underlying pleiotropy between theory of mind and cognition as previous studies have identified a modest, positive correlation between different measures of theory of mind and cognition 10,45 .

Discussion
We investigated the genetic correlates of first-order theory of mind using the Triangles Task. In total, 4,577 13-year-olds completed the Triangles Task, making this the largest investigation of genetic correlates of theory of mind at a specific age. At a phenotypic level, the scores on the Triangles Task were normally distributed and we observed a small but significant female-advantage on the Triangles Task. This is similar to what has been observed in other studies of cognitive empathy 46 and facial expression recognition 47 . The current study finds limited evidence for a genetic contribution to the Triangles Task in 13-year-olds in this sample. Genome-wide association analyses did not identify any significant loci at P < 5 × 10 −8 . Furthermore, gene-based analyses also did not identify any significant genes. We note, however, that the current study is statistically underpowered. Previous work on the genetics of cognitive empathy, which is related to theory of mind had identified that the per-SNP variance explained for the most significant SNP was 0.013% after correcting for winner's curse 10 . Post-hoc power calculations suggest that a sample two orders of magnitude larger than the current sample would be required to identify genome-wide significant loci, if the effect sizes are similar. This, however, is challenging given the nature of the task, which demands that participants spend at least half an hour to complete the task. We also note that we are statistically underpowered to identify significant additive SNP heritability, assuming a true additive SNP heritability of 5%, which is similar to the SNP heritability reported elsewhere 10 . These calculations preclude us from conducting genetic correlation analyses using the current cohort.
We also investigated if polygenic scores from 6 psychiatric conditions, empathy, cognitive empathy, and cognitive aptitude are associated with performance on the Triangles Task. We used PRSice and investigated the predictive power of polygenic scores at eight different P-value thresholds providing reasonable resolution. We note that the sample sizes for the training GWAS set are varied, although all the GWAS datasets had more than 10,000 participants. However, polygenic scores for none of the psychiatric conditions were significantly associated with performance on the Triangles Task across the six different P-value thresholds. In contrast, polygenic scores for cognitive empathy as measured using the Eyes Test, and cognitive aptitude significantly predicted variance in the Triangles Task, underscoring previously observed results 10 .
Our results indicate that genetic risk for psychiatric conditions do not explain much of the variance in theory of mind ability in adolescents in this sample. We speculate that this must be due to different reasons. First, it is likely that the current task does not capture the entire variance in theory of mind. Indeed, as mentioned earlier, theory of mind is complex and designing a task to capture the intrinsic variance in theory of mind is challenging. The Triangles task only considers first order mental state attributions, and the range of mental states is very limited, compared to for example the Eyes Test 14 . It is also likely that the difficulties in theory of mind observed in individuals with psychiatric conditions may be due to other processes that mediate theory of mind. Interrogation  of the genetic architecture of diverse phenotypes that contribute to social behaviour and theory of mind will help understand how they contribute to genetic risk for various psychiatric conditions. We can also not exclude the possibility that using either a larger training dataset and/or a larger target dataset will help improve the statistical significance of the polygenic score association. Finally, we cannot ignore non-genetic contributors to theory of mind. Certainly, twin studies do suggest that for certain theory of mind tasks, the genetic contribution is negligible. Previous work from our lab investigated the genetic architecture of cognitive empathy measured using the Eyes test in a sample of more than 88,000 individuals of European ancestry 10 . Here we draw several distinctions between the current study and the earlier study. First, the Triangle Task requires making inferences about mental states to animate non-social shapes, while the Eyes Test requires identifying the mental state from photographs of human eyes. Second, the current study investigates the genetic architecture of theory of mind at a specific age (13 years old). This allows for interrogation of the genetic contribution in adolescence when individuals are particularly vulnerable to several psychiatric conditions. However, we note that this is an age when the participants are undergoing puberty. Specific aspects of theory of mind develop differently during the course of puberty 48,49 . We were unable to account for differences in pubertal development in the current study.
In conclusion, this study does not find a significant genetic contribution to first-order theory of mind in adolescents. We find limited evidence that genetic variants that contribute to risk for psychiatric conditions predict variance in theory of mind ability in adolescents. However, we do find that genetic variants contributing to cognitive aptitude and cognitive empathy are significantly associated with theory of mind ability in adolescence. We speculate that observed differences in theory of mind in individuals with psychiatric conditions may be due to both biological and non-biological factors, or other biological phenotypes that mediate performance on tasks of theory of mind.  Table 1. Results of the Polygenic score analyses. * Indicates significant polygenic score association after Bonferroni correction (P < 6.94 × 10 −4 ).