
# Genome-wide analyses of self-reported empathy: correlations with autism, schizophrenia, and anorexia nervosa

## Abstract

Empathy is the ability to recognize and respond to the emotional states of other individuals. It is an important psychological process that facilitates navigating social interactions and maintaining relationships, which are important for well-being. Several psychological studies have identified difficulties in both self-report and performance-based measures of empathy in a range of psychiatric conditions. To date, no study has systematically investigated the genetic architecture of empathy using genome-wide association studies (GWAS). Here we report the results of the largest GWAS of empathy to date using a well-validated self-report measure of empathy, the Empathy Quotient (EQ), in 46,861 research participants from 23andMe, Inc. We identify 11 suggestive loci (P < 1 × 10−6), though none were significant at P < 2.5 × 10−8 after correcting for multiple testing. The most significant SNP was identified in the non-stratified analysis (rs4882760; P = 4.29 × 10−8), and is an intronic SNP in TMEM132C. The EQ had a modest but significant narrow-sense heritability (0.11 ± 0.014; P = 1.7 × 10−14). As predicted, based on earlier work, we confirmed a significant female advantage on the EQ (P < 2 × 10−16, Cohen’s d = 0.65). We identified similar SNP heritability and high genetic correlation between the sexes. Also, as predicted, we identified a significant negative genetic correlation between autism and the EQ (rg = −0.27 ± 0.07, P = 1.63 × 10−4). We also identified a significant positive genetic correlation between the EQ and risk for schizophrenia (rg = 0.19 ± 0.04; P = 1.36 × 10−5), risk for anorexia nervosa (rg = 0.32 ± 0.09; P = 6 × 10−4), and extraversion (rg = 0.45 ± 0.08; P = 5.7 × 10−8). This is the first GWAS of self-reported empathy. The results suggest that the genetic variations associated with empathy also play a role in psychiatric conditions and psychological traits.

## Introduction

Empathy is the ability to identify other people’s thoughts, intentions, desires, and feelings, and to respond to others’ mental states with an appropriate emotion1. It plays an important role in social interaction by facilitating both making sense of other people’s behaviour and responding appropriately to it. For these reasons, it is considered a key component of prosocial behaviour, social cooperation, and social cognition2. Aspects of empathy are observed in humans and other animals, and empathy is thought to have evolved to support a range of prosocial and cooperative behaviours2.

Differences in various fractions of empathy have been observed in several psychiatric conditions including autism1, bipolar disorder3, schizophrenia4,5,6, and major depressive disorder3,7,8. Two major fractions of empathy include affective empathy (the drive to respond to another’s mental state with an appropriate emotion) and cognitive empathy (the ability to recognize another’s mental state). These differences vary between psychiatric conditions: for example, individuals with schizophrenia are more likely to report higher personal distress and emotional contagion6, whereas individuals with autism are likely to show difficulties with cognitive empathy but not affective empathy1,9. These may reflect causal risk mechanisms where alterations in empathy contribute to higher risk for developing a psychiatric condition. Equally, differences in empathy may also be due to the presence of a psychiatric condition, which may not allow individuals to understand and respond to another person’s mental state effectively.

Whilst empathy is clearly shaped by early experience, parenting, and other social factors, different lines of evidence suggest that empathy is partly biological. Empathy is modestly heritable (approximately a third of the variance is heritable)10,11,12, and a few candidate gene association studies have investigated the role of various genes in empathy13,14,15. In addition, several studies have identified a role for the oxytocinergic and the foetal testosterone systems in modulating empathy2,16,17,18. Neuroimaging studies have identified distinct brain regions implicated in different aspects of empathy including the amygdala and the ventromedial prefrontal cortex19,20. Empathy also shows a marked sex difference: females, on average, score higher on different measures of empathy1,21. A longitudinal study suggests that this female advantage grows larger with age22. Sex differences in the mind arise from a combination of innate biological, cultural, and environmental differences. Studies in human infants have identified sex differences in the developmental precursors to empathy, such as neonatal preferences for faces over objects23, at an age when environmental and cultural influences are minimal, lending support to the idea that sex differences in empathy are at least partly biological24.

Because empathy difficulties are found in a range of psychiatric conditions, it is an important phenotype for investigation. Understanding the biological networks that partly determine empathy may help us understand how it contributes to psychiatric phenotypes, an approach that has been used for other traits such as neuroticism25, creativity26, and cognitive ability27. We investigate the genetic architecture of self-reported empathy using the Empathy Quotient (EQ)1. The EQ is listed in the Research Domain Criteria (RDoC)28 as a self-report measure under the domain of ‘Understanding Mental States’ (https://www.nimh.nih.gov/research-priorities/rdoc/units/self-reports/151133.shtml). The EQ is widely used, has excellent test–retest reliability (r = 0.83, P = 0.0001)29, high internal consistency (Cronbach’s α ~0.9)30,31 and is significantly correlated with factors in the Interpersonal Reactivity Index and the Toronto Empathy Questionnaire, two other measures of empathy, suggesting good concurrent validity29,32. Psychometric analysis of the EQ in 3334 individuals from the general population suggests that the EQ is a good measure of empathy which can be measured across a single dimension33. By focusing on a self-report measure of empathy, we were able to obtain phenotypic and genetic data from a large number of participants, increasing the statistical power of the study. Previous work from our lab investigated the genetic correlates of cognitive empathy using a specific performance measure, the ‘Reading the Mind in the Eyes’ Test (the Eyes Test)34. The Eyes Test has a low correlation with the EQ (r ~ 0.10)30,35 and measures only one facet of empathy, namely cognitive empathy. Cognitive empathy is also referred to as employing a ‘theory of mind’, or ‘mentalizing’. The EQ includes some items that measure cognitive empathy, others that measure affective empathy, and still others that involve both.

In this study, we aim to answer three questions: 1. What is the polygenic architecture of empathy? 2. Is empathy genetically correlated to various psychiatric conditions, psychological traits, and education? 3. Is there a genetic contribution to sex differences in empathy? We performed sex-stratified and non-stratified genome-wide association analyses of empathy in research participants from 23andMe, a personalized genetics company. We calculated the narrow-sense heritability explained by all the single-nucleotide polymorphisms (SNPs) tested, and investigated sex differences. Finally, we conducted genetic correlation analyses with six psychiatric conditions (anorexia, attention deficit hyperactivity disorder (ADHD), autism, bipolar disorder, major depressive disorder, and schizophrenia), psychological traits, and educational attainment.

## Methods

### Participants

Research participants were drawn from the customer base of 23andMe, Inc., a personal genetics company, and are described in detail elsewhere36,37. There were 46,861 participants (24,543 females and 22,318 males). All participants included in the analyses provided informed consent and answered surveys online according to a human subjects research protocol, which was reviewed and approved by Ethical & Independent Review Services, an AAHRPP (The Association for the Accreditation of Human Research Protection Programs, Inc.)-accredited private institutional review board (http://www.eandireview.com). All participants completed the online version of the questionnaire accessible via the research tab of their password-protected 23andMe personal online account. Only participants who were primarily of European ancestry (97% European Ancestry) were selected for the analysis using existing methods38. Unrelated individuals were selected using a segmental identity-by-descent algorithm39.

### Measures

The Empathy Quotient (EQ)1 is a self-report measure of empathy, and includes items relevant to both cognitive and affective empathy. It comprises 60 questions and has good test–retest reliability29. Of the 60 questions, 20 are filler questions; on each of the remaining 40 questions, participants can score a maximum of 2 points and a minimum of 0 points. Therefore, in this study, total scores could range from 0 to 80.

### Genotyping, imputation, and quality control

DNA extraction and genotyping were performed on saliva samples by the National Genetic Institute, USA. Participants were genotyped on one of four different platforms: V1, V2, V3, and V4. The V1 and V2 platforms have a total of 560,000 SNPs largely based on the Illumina HumanHap550+BeadChip. The V3 platform has 950,000 SNPs based on the Illumina OmniExpress+Beadchip and has custom content to improve the overlap with the V2 platform. The V4 platform is a fully customized array and has about 570,000 SNPs. All samples had a call rate greater than 98.5%. A total of 1,030,430 SNPs (including Insertion/Deletion or InDels) were genotyped across all platforms. Imputation was performed using the March 2012 (v3) release of the 1000 Genomes Phase 1 reference haplotypes. First, we used Beagle (version 3.3.1)40 to phase batches of 8000–9000 individuals across chromosomal segments of no more than 10,000 genotyped SNPs, with overlaps of 200 SNPs. SNPs were excluded if they were not in Hardy–Weinberg equilibrium (P < 10−20), had a genotype call rate less than 95%, or had discrepancies in allele frequency compared to the reference European 1000 Genomes data (χ2 P < 10−15). We then imputed each phased segment against all-ethnicity 1000 Genomes haplotypes (excluding monomorphic and singleton sites) using Minimac241, using 5 rounds and 200 states for parameter estimation. We restricted the analyses to only SNPs that had a minor allele frequency of at least 1%. For genotyped SNPs, those present only on platform V1 or in chromosome Y and mitochondrial chromosomes were excluded due to small sample sizes and unreliable genotype calling respectively. Next, using trio data from all research participants in the 23andMe dataset, where available, SNPs that failed a parent offspring transmission test were excluded. For imputed SNPs, we excluded SNPs with average r2 < 0.5 or minimum r2 < 0.3 in any imputation batch, as well as SNPs that had strong evidence of an imputation batch effect. 
The batch effect test is an F-test from an analysis of variance of the SNP dosages against a factor representing imputation batch; we excluded results with P < 10−50. After quality control, 9,955,952 SNPs were analysed. Genotyping, imputation, and preliminary quality control were performed by 23andMe.
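The filters applied to imputed SNPs can be summarized in a short sketch. The thresholds below are those stated in the text (minor allele frequency of at least 1%, average r2 of at least 0.5, minimum r2 of at least 0.3 in every batch, and batch-effect P of at least 10−50); the record layout, field names, and example SNPs are hypothetical and are not taken from the 23andMe pipeline.

```python
from dataclasses import dataclass


@dataclass
class ImputedSnp:
    """Per-SNP summary values. All field names are hypothetical."""
    snp_id: str
    maf: float      # minor allele frequency
    avg_r2: float   # imputation r2 averaged across imputation batches
    min_r2: float   # lowest imputation r2 observed in any batch
    batch_p: float  # P-value of the batch-effect F-test


def passes_qc(snp: ImputedSnp) -> bool:
    """Return True if the SNP survives the imputed-SNP filters in the text."""
    if snp.maf < 0.01:                        # keep MAF of at least 1%
        return False
    if snp.avg_r2 < 0.5 or snp.min_r2 < 0.3:  # imputation quality
        return False
    if snp.batch_p < 1e-50:                   # strong imputation batch effect
        return False
    return True


# Made-up SNPs, each failing at most one filter.
snps = [
    ImputedSnp("snp_a", maf=0.20, avg_r2=0.95, min_r2=0.90, batch_p=0.40),
    ImputedSnp("snp_b", maf=0.005, avg_r2=0.95, min_r2=0.90, batch_p=0.40),
    ImputedSnp("snp_c", maf=0.20, avg_r2=0.60, min_r2=0.25, batch_p=0.40),
    ImputedSnp("snp_d", maf=0.20, avg_r2=0.95, min_r2=0.90, batch_p=1e-60),
]
kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```

Only the first example SNP passes all three filters; the others fail on allele frequency, minimum imputation quality, and batch effect, respectively.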

### Genetic association

We performed a linear regression assuming an additive model of genetic effects. Age and sex along with the first five ancestry principal components were included as covariates. Additionally, we performed a male-only and a female-only linear regression analysis to identify sex-specific loci. Since we were performing two independent tests for each trait (male-only and female-only, and males and females combined with sex as a covariate which is equivalent to a meta-analysis of the two sex-stratified genome-wide association studies (GWAS)), we used a threshold of P < 2.5 × 10−8 (5 × 10−8/2) to identify significant SNPs. Leading SNPs in each locus were identified after pruning for linkage disequilibrium (LD; r2 > 0.8) using Plink version 1.9. We calculated the variance explained by the top SNPs using a previously used formula42:

$$\frac{R_{g|c}^2}{1 - R_c^2} = \frac{t^2}{n - k - 1} \times 100 \tag{1}$$

$$\frac{{R_{g|c}^2}}{{1 - R_c^2}}$$ is the proportion of variance explained by the SNP after accounting for the effects of the covariates (four ancestry principal components, age, and, additionally, sex for the non-stratified analyses), t is the t-statistic of the regression coefficient, k is the number of covariates, and n is the sample size. Winner’s curse correction was conducted using false discovery rate (FDR) Inverse Quantile Transformation43.
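As a worked illustration of Eq. 1, the sketch below evaluates the formula exactly as written. The inputs are assumptions for illustration only: t = 5.5 roughly corresponds to a two-sided P near 4 × 10−8, n is the full sample size, and k = 6 assumes four principal components plus age and sex.

```python
def variance_explained_pct(t_stat: float, n: int, k: int) -> float:
    """Percentage of variance explained by a SNP, following Eq. 1.

    t_stat: t-statistic of the SNP regression coefficient
    n:      sample size
    k:      number of covariates
    """
    return t_stat ** 2 / (n - k - 1) * 100


# Illustrative inputs (assumed, not taken from the supplementary tables).
example = variance_explained_pct(t_stat=5.5, n=46861, k=6)
print(f"{example:.3f}% of variance explained")
```

With these assumed inputs the result is about 0.06%, which shows how small the per-SNP effects are even for the strongest association signals.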

### Genomic inflation factor, heritability, and functional enrichment

We used the Linkage Disequilibrium Score Regression (LDSR) coefficient to calculate genomic inflation due to population stratification44 (https://github.com/bulik/ldsc). Heritability and genetic correlation analyses were performed using extended methods in LDSR45. The difference in heritability between males and females was quantified using46:

$$Z = \frac{h_{males}^2 - h_{females}^2}{\sqrt{SE_{males}^2 + SE_{females}^2}} \tag{2}$$

where Z is the Z-score for the difference in heritability for a trait, $$h_{males}^2 - h_{females}^2$$ is the difference between the SNP heritability estimates in males and females, and SE is the standard error of the corresponding heritability estimate. Two-tailed P-values were calculated, and reported as significant if P < 0.05. We identified enrichment in genomic functional elements for the traits by partitioning heritability in LDSR47. In addition to the baseline partitions, we conducted four additional enrichment analyses. Enrichment for central nervous system (CNS)-specific histone marks was conducted using cell type-specific partitioned heritability analysis. For genes that are intolerant to loss-of-function mutations, we identified gene boundaries of genes with probability of loss-of-function intolerance scores >0.9 from the Exome Aggregation Consortium48, and conducted partitioned heritability analysis for all common SNPs within the gene boundaries identified. Similarly, for sex differentially enriched genes, we identified gene boundaries of genes with sex differential expression in cerebral cortex and associated structures (Brain Other)49 (Supplementary Table 8). We divided this into two separate lists: genes with higher expression in males and genes with higher expression in females, each with an FDR-corrected P-value < 0.05. Partitioned heritability analyses were conducted to identify enrichment using LDSR.
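The test in Eq. 2 can be sketched in a few lines. The function below computes the Z-score and its two-tailed P-value under a standard normal distribution; the heritability estimates and standard errors in the example call are made-up values for illustration, not the sex-stratified estimates from this study (those are in Supplementary Table 5).

```python
import math


def heritability_difference(h2_males: float, se_males: float,
                            h2_females: float, se_females: float):
    """Z-score (Eq. 2) and two-tailed P-value for a male-female h2 difference."""
    z = (h2_males - h2_females) / math.sqrt(se_males ** 2 + se_females ** 2)
    # Two-tailed P under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


# Hypothetical inputs for illustration only.
z, p = heritability_difference(0.12, 0.03, 0.10, 0.04)
print(f"Z = {z:.2f}, two-tailed P = {p:.3f}")
```

Because the two standard errors are combined in quadrature, the test treats the male and female estimates as independent, which is reasonable for non-overlapping sex-stratified samples.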

### Genetic correlations

LDSR was also used to calculate genetic correlations. We restricted our primary analyses to only the non-stratified GWAS dataset due to the unavailability of sex-stratified GWAS data in the phenotypes investigated. We calculated initial genetic correlations using LD Hub50 for schizophrenia51, bipolar disorder, major depressive disorder, depressive symptoms, educational attainment (years of schooling), NEO-openness to experience, NEO-conscientiousness, subjective well-being, and neuroticism. For anorexia nervosa52 and ADHD53, we used the data available from the Psychiatric Genomics Consortium (PGC) webpage (https://www.med.unc.edu/pgc/results-and-downloads) to conduct genetic correlation analyses as these are in larger samples and, consequently, have greater statistical power than the datasets available on LD Hub. For autism, we used summary statistics from the PGC-iPSYCH meta-analysis54, details of which are provided in the Supplementary Note. We also conducted genetic correlation for extraversion55 separately as the data were unavailable on LD Hub at the time of analysis. For the anorexia nervosa, autism, and extraversion analyses, the North West European LD scores were used and the intercepts were not constrained as the extent of participant overlap was unknown. We report genetic correlations as significant if they had a Bonferroni-corrected P < 0.05, which we acknowledge is conservative. For anorexia nervosa and autism, we also conducted genetic correlation analyses using the sex-stratified EQ dataset due to the significant sex differences observed in these conditions. We corrected for these additional tests using Bonferroni correction, and report correlations as significant at a corrected P < 0.05.

### Gene-based analysis

Gene-based analyses for the non-stratified GWAS were performed using MAGMA56, which integrates LD information between SNPs to prioritize genes. Genes were significant if they had Bonferroni corrected P < 0.05. In addition, we also investigated enrichment in Gene Ontology (GO) terms using MAGMA.

### Genome-wide colocalization

Pairwise genome-wide colocalization analyses were conducted using GWAS-PW57 by dividing the genomes into segments containing approximately 5000 SNPs each. We considered the posterior probability of model 3, i.e., the model wherein SNPs in the same locus influence both the traits. We used a rigorous threshold of posterior probability >0.95 to identify significant loci that influenced both the traits. We conducted pairwise colocalization for empathy (non-stratified), and schizophrenia51, anorexia nervosa52, and autism.

### Data availability

Summary statistics for the EQ GWAS can be requested directly from 23andMe, and will be made available to qualified researchers subject to the terms of a data transfer agreement with 23andMe that protects the privacy of the 23andMe research participants. Please contact David Hinds (dhinds@23andMe.com) for more information. Top SNPs can be visualized here: https://ghfc.pasteur.fr/eq/.

## Results

### Phenotype description

To understand the genetic architecture of empathy, we collaborated with 23andMe to conduct a GWAS of empathy (n = 46,861) using the EQ, which was normally distributed. A flowchart of the study protocol is shown in Fig. 1. The mean score for all participants was 46.4 (sd = 13.7) out of a total of 80 on the EQ, which is similar to the mean score reported in 90 typical participants in the first study describing the EQ (42.1, sd = 10.6)1. The mean age of the participants was 48.9 (sd = 15.7). Females scored higher than males on the EQ (41.9 ± 13.5 in males, 50.4 ± 12.6 in females) (Fig. 2), as previously observed1. There was a significant age effect, with scores increasing with age (β = 0.08 ± 0.003; P = 3.3 × 10−104), and a significant sex effect, with females scoring higher than males (β = 8.4 ± 0.11; P ~ 0).
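The size of this sex difference can be expressed as a standardized effect. The sketch below computes Cohen's d with a pooled standard deviation from the summary statistics above, assuming that the ± values are standard deviations and that the group sizes are those given in the Participants section (24,543 females and 22,318 males). The pooled-variance form of Cohen's d is an assumption here; the text does not state which variant was used.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Females first, males second, so a positive d indicates higher female scores.
d = cohens_d(50.4, 12.6, 24543, 41.9, 13.5, 22318)
print(round(d, 2))
```

Under these assumptions the value rounds to 0.65, in line with the effect size reported for the sex difference.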

### Genome-wide association analyses

We conducted three GWAS analyses: a male-only analysis, a female-only analysis, and a non-stratified analysis, using a linear regression model with age and the first four ancestry principal components as covariates (Methods). We corrected for the three different tests using a conservative threshold of P = 2.5 × 10−8. The LDSR coefficient suggested non-significant genomic inflation due to population stratification (Supplementary Figure 13). We did not identify any genome-wide significant SNPs (Supplementary Figures 1–3 and Supplementary Table 1). We identified 11 suggestive loci (P < 1 × 10−6) in the three GWAS. The most significant SNP was identified in the non-stratified analysis (rs4882760; P = 4.29 × 10−8), and is an intronic SNP in TMEM132C. Regional association plots for all suggestive loci are provided in Supplementary Figure 4.

To investigate if the top SNPs from the EQ also contribute to cognitive empathy as measured by the Eyes Test, we conducted SNP lookup of all the 11 suggestive loci in the Eyes Test GWAS. None of the 11 loci were significant in the Eyes Test GWAS, and only 7 out of the 11 SNPs had concordant effect directions in the two traits (P = 0.54; two-sided binomial sign test).
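The sign test on effect directions can be reproduced with an exact binomial calculation. The sketch below assumes the null hypothesis that each SNP is equally likely to be concordant or discordant (probability 0.5) and sums the probabilities of all outcomes that are no more likely than the observed one; this is one common definition of a two-sided exact test and may differ in detail from the software the authors used.

```python
from math import comb


def sign_test_two_sided(concordant: int, total: int) -> float:
    """Exact two-sided binomial sign test with null probability 0.5."""
    probs = [comb(total, k) * 0.5 ** total for k in range(total + 1)]
    observed = probs[concordant]
    # Sum every outcome at most as likely as the observed count
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in probs if p <= observed * (1 + 1e-12))


p_value = sign_test_two_sided(concordant=7, total=11)
print(f"P = {p_value:.4f}")
```

For 7 concordant SNPs out of 11 this gives P ≈ 0.549, consistent with the reported value up to rounding, so the concordance is no better than chance.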

The most significant SNP in each GWAS analysis explained 0.06–0.13% of the total variance (Supplementary Table 2). However, this reduced to 0.0006–0.016% after correcting for winner’s curse (Supplementary Table 2).

### Gene-based association, heritability, and enrichment in functional categories

Gene-based analysis identified two significant genes for the EQ: SEMA6D (P = 9.14 × 10−7) and FBN2 (P = 1.68 × 10−6) (Supplementary Table 3). Analysis for enrichment in GO terms did not identify any significant enrichment (Supplementary Table 4). The most significant GO process was negative regulation of neurotransmitter secretion.

We used LDSR44 to calculate the heritability explained by all the SNPs tested (Methods) and identified a heritability of 0.11 ± 0.014 for the EQ (P = 1.7 × 10−14) (Fig. 2, Supplementary Table 5). Partitioning heritability by functional categories did not identify any significant enrichment after correcting for multiple testing (Supplementary Table 6). We also investigated if there was an enrichment in heritability for histone marks in cells in the CNS47, but did not find a significant enrichment (enrichment = 3.67 ± 1.45; P = 0.077).

Recent studies have identified an enrichment of associations in or near genes that are extremely intolerant to loss-of-function variation in schizophrenia58, autism59,60, and developmental disorders61, conditions that are often accompanied by difficulties in social behaviour and empathy. We investigated if there was a significant enrichment in GWAS signal for the EQ in these genes that are extremely intolerant to loss-of-function variation. We did not identify a significant enrichment after correction for multiple testing (proportion h2SNP = 0.19, proportion SNP = 0.09, enrichment = 1.83 ± 0.42; P = 0.044).

### Sex differences

Sex differences in empathy62 may reflect genetic as well as non-genetic factors (such as prenatal steroid hormones, and postnatal learning)63. In our dataset, there was a significant female advantage on the EQ (P < 2 × 10−16, Cohen’s d = 0.65) (Fig. 2). To investigate the biological basis for the sex difference observed in the traits, we first tested the heritability of the sex-stratified GWAS analyses for the EQ. Our analyses revealed no significant difference between the heritability in the males-only and the females-only datasets (P = 0.48 for male–female difference in the EQ) (Fig. 2 and Supplementary Table 5). Additionally, there was a high genetic correlation between the males-only and females-only GWAS (rg = 0.82 ± 0.16, P = 2.34 × 10−7), indicating a high degree of similarity in the genetic architecture of the traits in males and females. This was not significantly different from 1 (P = 0.13, one-sided Wald test). We investigated the heterogeneity in the 11 SNPs of suggestive significance in both the sexes using Cochran’s Q-test, and did not identify significant heterogeneity (Supplementary Table 7).
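The one-sided Wald test against a genetic correlation of 1 can be checked from the reported estimate and standard error. The sketch below assumes the test statistic is (1 − rg)/SE and that it follows a standard normal distribution under the null; the function name is illustrative.

```python
import math


def one_sided_wald_p(estimate: float, se: float, null_value: float = 1.0) -> float:
    """One-sided P-value for H0: parameter = null_value vs H1: parameter < null_value."""
    z = (null_value - estimate) / se
    # Upper-tail probability of a standard normal: P(Z > z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2))


# Male-female genetic correlation reported above: rg = 0.82, SE = 0.16.
p = one_sided_wald_p(0.82, 0.16)
print(f"P = {p:.2f}")
```

With rg = 0.82 and SE = 0.16 the statistic is 1.125 and the one-sided P-value is approximately 0.13, matching the reported result.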

Sex differences may also arise by differential expression of specific genes in different neural tissues at different developmental stages49,64. This could be due to multiple factors, including sex-specific transcription factors and sex-specific DNA methylation. We investigated this by performing enrichment analysis of the non-stratified GWAS in genes with higher expression in either males or females in cortical tissue samples (Supplementary Table 8). We did not identify a significant enrichment for either genes with higher expression in males (enrichment = 1.95 ± 0.70, P = 0.17) or females (enrichment = 0.28 ± 0.84, P = 0.39).

### Genetic correlations

To investigate how the EQ correlates with psychiatric conditions, psychological traits and educational attainment, we performed genetic correlation (Methods) with six psychiatric conditions (autism, ADHD, anorexia nervosa, bipolar disorder, depression (major depressive disorder and the larger depressive symptoms dataset) and schizophrenia), six psychological traits (NEO-extraversion, NEO-openness to experience, NEO-conscientiousness, neuroticism, and subjective well-being), and educational attainment (a proxy measure of IQ, measured using years of schooling) (Supplementary Table 9). With psychiatric conditions, three genetic correlations were significant following Bonferroni correction: EQ-autism (rg = −0.27 ± 0.07, P = 1.63 × 10−4), EQ-schizophrenia (rg = 0.19 ± 0.04; P = 1.36 × 10−5) and EQ-anorexia nervosa (rg = 0.32 ± 0.09; P = 6 × 10−4) (Fig. 3).

As anorexia nervosa is primarily diagnosed in women, and autism is primarily diagnosed in men, we further tested sex-specific correlations. After Bonferroni correction, we identified significant genetic correlations between the EQ in females (EQ-F) and anorexia (rg = 0.48 ± 0.12; P = 8.46 × 10−5) and the EQ in males (EQ-M) and autism (rg = −0.3 ± 0.08, P = 3 × 10−4).

With psychological traits, we identified one significant correlation after Bonferroni correction: EQ with extraversion (rg = 0.45 ± 0.08; P = 5.76 × 10−8). Additionally, we identified two nominally significant correlations: the EQ with subjective well-being (rg = 0.19 ± 0.07; P = 7.8 × 10−3) and NEO-conscientiousness (rg = 0.39 ± 0.14; P = 8.8 × 10−3) (Fig. 3). All three correlations were in the predicted direction, as studies have identified a positive phenotypic correlation between all three traits and the EQ31,65. We previously reported a small positive correlation between the EQ and the Eyes Test (rg = 0.18 ± 0.06; P = 0.007)34, mirroring previously reported estimates of phenotypic correlation in the general population35 and estimates in our database from 916 neurotypical adults (r = 0.11 ± 0.032; P = 0.003, Pearson's correlation).
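The distinction between Bonferroni-significant and nominally significant correlations can be illustrated with the P-values reported above. The number of tests used below (13) is an assumption obtained by counting the phenotypes named in the text (seven psychiatric datasets, five psychological traits, and educational attainment); the exact number used by the authors is given in Supplementary Table 9 and may differ.

```python
# P-values as reported in the text for the non-stratified EQ GWAS.
reported_p = {
    "autism": 1.63e-4,
    "schizophrenia": 1.36e-5,
    "anorexia nervosa": 6e-4,
    "extraversion": 5.76e-8,
    "subjective well-being": 7.8e-3,
    "conscientiousness": 8.8e-3,
}

n_tests = 13                  # ASSUMPTION: counted from phenotypes named in the text
threshold = 0.05 / n_tests    # Bonferroni-corrected significance threshold

significant = sorted(t for t, p in reported_p.items() if p < threshold)
nominal_only = sorted(t for t, p in reported_p.items() if threshold <= p < 0.05)

print(f"threshold = {threshold:.4f}")
print("Bonferroni-significant:", significant)
print("nominal only:", nominal_only)
```

Under this assumed test count, the four correlations described as significant pass the corrected threshold, while subjective well-being and conscientiousness remain only nominally significant, which agrees with the classification in the text.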

### Bayesian genomic colocalization

As there were significant genetic correlations between the EQ, anorexia nervosa, schizophrenia, and autism, we investigated if there are genomic regions that influence both empathy and one of the psychiatric conditions (colocalization) by estimating the Bayesian posterior probability. We did not identify any regions associated with both empathy and any of the three conditions at our threshold (posterior probability >0.95). The most significant region identified in this analysis was in Chr11p12, posterior probability = 0.78 (Supplementary Figure 5) in the empathy–anorexia analysis. The most significant SNPs in this region for both anorexia and empathy are intronic SNPs in the gene LRRC4C, which is implicated in excitatory synapse development66,67. Further, this gene is highly intolerant to loss-of-function mutations (probability of loss-of-function intolerance = 0.95). We did not identify any expression quantitative trait loci in this region in neural tissues. These results are preliminary and a cautious interpretation is warranted as the probability is influenced by the modest power of both the GWAS57.

## Discussion

This is the first GWAS to investigate the genetic architecture of self-reported empathy. We identified four significant genetic correlations between the EQ and psychiatric conditions and psychological traits (autism, anorexia nervosa, schizophrenia, and extraversion), providing insights into the shared genetic architecture. Although we did not identify any significant SNPs after correcting for multiple testing at P < 2.5 × 10−8, we identified 11 SNPs of suggestive significance (P < 1 × 10−6). Males and females perform differently on the tests, but there was limited evidence of sex-specific genetic architecture.

We identified a significant negative genetic correlation between the EQ and autism. Several studies have identified lower self-reported empathy in individuals with autism, and our results mirror these studies1,68. This is likely to be driven by difficulties in understanding the mental states of others rather than responding to them. We also identified significant genetic correlations for the EQ with schizophrenia and anorexia nervosa. The empirical literature generally reports deficits in cognitive empathy4,69, but preserved or stronger affective empathy6,69 and emotional contagion/personal distress6 in individuals with schizophrenia compared to controls. Studies with anorexia nervosa, on the other hand, have yielded mixed results. Some studies suggest preserved empathy70, some identify reduced cognitive empathy71,72,73, and others identify greater emotional contagion/personal distress74 in individuals with anorexia nervosa compared to controls. These studies are typically conducted in small samples, which may explain the different results in these heterogeneous conditions. Our results suggest that genetic variants associated with self-reported empathy slightly increase the risk for schizophrenia and anorexia nervosa; the latter remained significant even after using the females-only EQ dataset. A previous study34 identified a significant genetic correlation between cognitive empathy (measured using the Eyes Test75) and anorexia nervosa, underscoring the importance of empathy as a genetic risk factor in anorexia nervosa. However, both cognitive empathy and anorexia nervosa are positively correlated with educational attainment34,76, and it is possible that the correlation between cognitive empathy and anorexia nervosa may be mediated by educational attainment.

Here, self-reported empathy is not genetically correlated with educational attainment. Further, while cognitive empathy was not correlated with schizophrenia, self-reported empathy was positively and significantly correlated with schizophrenia, suggesting distinct roles for the two phenotypes in neuropsychiatry. Schizophrenia and anorexia share significant positive genetic correlation (rg = 0.23 ± 0.06)76, and it is possible that the pleiotropy between these two conditions may, in part, be mediated by genetic variants that contribute to empathy. This needs to be tested. Together with the GWAS on cognitive empathy34, this study provides evidence for the distinct roles of different social processes in various psychiatric conditions.

Investigating genetic correlations with psychological traits and measures of cognition further helped elucidate the genetic architecture of self-reported empathy. The EQ was significantly correlated with extraversion and nominally correlated with subjective well-being and conscientiousness. Both extraversion and conscientiousness correlate with empathy65 which, in turn, contributes to subjective well-being31. Of the five personality factors, extraversion, conscientiousness, and agreeableness have modest correlations with self-reported empathy65. We did not test for genetic correlation with agreeableness due to the low heritability of the trait. The direction of our genetic correlation results mirrors observed phenotypic correlations and provides additional evidence for the positive role of self-reported empathy in subjective well-being.

This is also the first study to provide estimates of additive heritability explained by all the SNPs tested for self-reported empathy, and approximately 11% of variance was explained by SNPs. One study12, investigating the heritability of the reduced EQ (18 items) in 250 twin pairs, identified a heritability of 0.32. The literature on the heritability of empathy and prosociality is inconsistent, with heritability estimates ranging from 0.20 (ref. 10) to 0.69 (ref. 77), although a meta-analysis of different studies identified a heritability estimate of 0.35 (95% confidence interval: 0.21–0.41)78. Our analysis therefore suggests that approximately a third of the heritability (0.11/0.35 ≈ 0.31) can be attributed to common genetic variants. Like IQ79, the heritability of empathy and prosocial behaviour changes with age10. We did not investigate the effect of age on heritability in our study.

We did not find any significant differences in heritability between males and females. Further, the male–female genetic correlation was high and not statistically different from 1. Despite the high genetic correlation, sex-specific correlations with anorexia nervosa were significant only for the females-only GWAS dataset. This suggests that the sex-specific genetic component of empathy can contribute differentially to psychiatric conditions. Several other factors may explain the observed phenotypic sex difference. For example, genetic variants for empathy may be enriched in sex-specific gene expression pathways. We conducted preliminary analysis by investigating if there is an enrichment in sex differentially expressed genes in cortical tissue samples, but did not find significant enrichment. However, sex-specific gene expression is a dynamic process with both spatial and developmental differences49,64. Investigating across different tissues and developmental time points in well-powered gene expression datasets will help better understand sex differences in empathy.

A few limitations need to be taken into consideration when interpreting our results. The EQ is a self-report measure, and while it has excellent psychometric properties and construct validity, it is unclear how much of the intrinsic biological variation in this trait it captures. Further, while this is the largest GWAS to date of self-reported empathy, it is still only modestly powered, as reflected in our inability to identify genome-wide significant loci. This modest statistical power influences the subsequent analyses, and we highlight several nominally significant results for further investigation in larger datasets.

In conclusion, the current study provides the first narrow-sense heritability estimate for empathy. While there is a highly significant difference on the EQ between males and females, heritability is similar, with a high genetic correlation between the sexes. We also identified significant genetic correlations between empathy and some psychiatric conditions and psychological traits, including autism. This global view of the genomic architecture of empathy will allow us to better understand psychiatric conditions, and improve our knowledge of the biological bases of neurodiversity in humans.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. Baron-Cohen, S. & Wheelwright, S. J. The Empathy Quotient: an investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. J. Autism Dev. Disord. 34, 163–175 (2004).
2. Decety, J., Bartal, I. B.-A., Uzefovsky, F. & Knafo-Noam, A. Empathy as a driver of prosocial behaviour: highly conserved neurobehavioural mechanisms across species. Philos. Trans. R. Soc. Lond. B Biol. Sci. 371, 20150077 (2015).
3. Derntl, B., Seidel, E.-M., Schneider, F. & Habel, U. How specific are emotional deficits? A comparison of empathic abilities in schizophrenia, bipolar and depressed patients. Schizophr. Res. 142, 58–64 (2012).
4. Bora, E., Gökçen, S. & Veznedaroglu, B. Empathic abilities in people with schizophrenia. Psychiatry Res. 160, 23–29 (2008).
5. Michaels, T. M. et al. Cognitive empathy contributes to poor social functioning in schizophrenia: evidence from a new self-report measure of cognitive and affective empathy. Psychiatry Res. 220, 803–810 (2014).
6. Lehmann, A. et al. Subjective experience of emotions and emotional empathy in paranoid schizophrenia. Psychiatry Res. 220, 825–833 (2014).
7. Weightman, M. J., Air, T. M. & Baune, B. T. A review of the role of social cognition in major depressive disorder. Front. Psychiatry 5, 179 (2014).
8. Thoma, P., Schmidt, T., Juckel, G., Norra, C. & Suchan, B. Nice or effective? Social problem solving strategies in patients with major depressive disorder. Psychiatry Res. 228, 835–842 (2015).
9. Baron-Cohen, S. Autism: the empathizing-systemizing (E-S) theory. Ann. N. Y. Acad. Sci. 1156, 68–80 (2009).
10. Davis, M. H., Luce, C. & Kraus, S. J. The heritability of characteristics associated with dispositional empathy. J. Pers. 62, 369–391 (1994).
11. Emde, R. N. et al. Temperament, emotion, and cognition at fourteen months: the MacArthur Longitudinal Twin Study. Child Dev. 63, 1437–1455 (1992).
12. Hatemi, P. K., Smith, K., Alford, J. R., Martin, N. G. & Hibbing, J. R. The genetic and environmental foundations of political, psychological, social, and economic behaviors: a panel study of twins and families. Twin Res. Hum. Genet. 18, 243–255 (2015).
13. Warrier, V., Baron-Cohen, S. & Chakrabarti, B. Genetic variation in GABRB3 is associated with Asperger syndrome and multiple endophenotypes relevant to autism. Mol. Autism 4, 48 (2013).
14. Uzefovsky, F. et al. The dopamine D4 receptor gene shows a gender-sensitive association with cognitive empathy: evidence from two independent samples. Emotion 14, 712–721 (2014).
15. Chakrabarti, B. et al. Genes related to sex steroids, neural growth, and social-emotional behavior are associated with autistic traits, empathy, and Asperger syndrome. Autism Res. 2, 157–177 (2009).
16. Decety, J. The neurodevelopment of empathy in humans. Dev. Neurosci. 32, 257–267 (2010).
17. Auyeung, B. et al. Foetal testosterone and the child systemizing quotient. Eur. J. Endocrinol. 155, S123–S130 (2006).
18. Chapman, E. et al. Fetal testosterone and empathy: evidence from the Empathy Quotient (EQ) and the ‘Reading the Mind in the Eyes’ test. Soc. Neurosci. 1, 135–148 (2006).
19. Siegal, M. & Varley, R. Neural systems involved in ‘theory of mind’. Nat. Rev. Neurosci. 3, 463–471 (2002).
20. Morelli, S. A., Rameson, L. T. & Lieberman, M. D. The neural components of empathy: predicting daily prosocial behavior. Soc. Cogn. Affect. Neurosci. 9, 39–47 (2014).
21. Holgado Tello, F. P., Delgado Egido, B., Carrasco Ortiz, M. A. & Del Barrio Gandara, M. V. Interpersonal reactivity index: analysis of invariance and gender differences in Spanish youths. Child Psychiatry Hum. Dev. 44, 320–333 (2013).
22. Mestre, M. V., Samper, P., Frías, M. D. & Tur, A. M. Are women more empathetic than men? A longitudinal study in adolescence. Span. J. Psychol. 12, 76–83 (2009).
23. Connellan, J., Baron-Cohen, S., Wheelwright, S. J., Batki, A. & Ahluwalia, J. Sex differences in human neonatal social perception. Infant Behav. Dev. 23, 113–118 (2000).
24. Christov-Moore, L. et al. Empathy: gender effects in brain and behavior. Neurosci. Biobehav. Rev. 46, 604–627 (2014).
25. de Moor, M. H. M. et al. Meta-analysis of genome-wide association studies for neuroticism, and the polygenic association with major depressive disorder. JAMA Psychiatry 72, 642 (2015).
26. Power, R. A. et al. Polygenic risk scores for schizophrenia and bipolar disorder predict creativity. Nat. Neurosci. 18, 953–955 (2015).
27. Clarke, T.-K. et al. Common polygenic risk for autism spectrum disorder (ASD) is associated with cognitive ability in the general population. Mol. Psychiatry 21, 419–425 (2015).
28. Insel, T. et al. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am. J. Psychiatry 167, 748–751 (2010).
29. Lawrence, E. J., Shaw, P., Baker, D., Baron-Cohen, S. & David, A. S. Measuring empathy: reliability and validity of the Empathy Quotient. Psychol. Med. 34, 911–919 (2004).
30. Melchers, M. C., Montag, C., Markett, S. & Reuter, M. Assessment of empathy via self-report and behavioural paradigms: data on convergent and discriminant validity. Cogn. Neuropsychiatry 20, 157–171 (2015).
31. Bos, E. H. et al. Preserving subjective wellbeing in the face of psychopathology: buffering effects of personal strengths and resources. PLoS ONE 11, e0150867 (2016).
32. Spreng, R. N., McKinnon, M. C., Mar, R. A. & Levine, B. The Toronto Empathy Questionnaire: scale development and initial validation of a factor-analytic solution to multiple empathy measures. J. Pers. Assess. 91, 62–71 (2009).
33. Allison, C., Baron-Cohen, S., Wheelwright, S. J., Stone, M. H. & Muncer, S. J. Psychometric analysis of the Empathy Quotient (EQ). Pers. Individ. Dif. 51, 829–835 (2011).
34. Warrier, V. et al. Genome-wide meta-analysis of cognitive empathy: heritability, and correlates with sex, neuropsychiatric conditions and cognition. Mol. Psychiatry (2017) [Epub ahead of print].
35. Baron-Cohen, S. et al. The ‘Reading the Mind in the Eyes’ test: complete absence of typical sex difference in ~400 men and women with autism. PLoS ONE 10, e0136521 (2015).
36. Tung, J. Y. et al. Efficient replication of over 180 genetic associations with self-reported medical data. PLoS ONE 6, e23473 (2011).
37. Do, C. B. et al. Web-based genome-wide association study identifies two novel loci and a substantial genetic component for Parkinson’s disease. PLoS Genet. 7, e1002141 (2011).
38. Eriksson, N. et al. Novel associations for hypothyroidism include known autoimmune risk loci. PLoS ONE 7, e34442 (2012).
39. Henn, B. M. et al. Cryptic distant relatives are common in both isolated and cosmopolitan genetic samples. PLoS ONE 7, e34267 (2012).
40. Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).
41. Fuchsberger, C., Abecasis, G. R. & Hinds, D. A. minimac2: faster genotype imputation. Bioinformatics 31, 782–784 (2015).
42. Hibar, D. P. et al. Common genetic variants influence human subcortical brain structures. Nature 520, 224–229 (2015).
43. Bigdeli, T. B. et al. A simple yet accurate correction for winner’s curse can predict signals discovered in much larger genome scans. Bioinformatics 32, 2598–2603 (2016).
44. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
45. Bulik-Sullivan, B. K. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
46. Ge, T. et al. Phenome-wide heritability analysis of the UK Biobank. PLoS Genet. 13, e1006711 (2017).
47. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
48. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
49. Chen, C.-Y. et al. Sexual dimorphism in gene expression and regulatory networks across human tissues. bioRxiv (2016). https://doi.org/10.1101/082289.
50. Zheng, J. et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics 33, 272–279 (2016).
51. Ripke, S. et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
52. Duncan, L. E. et al. Significant locus and metabolic genetic correlations revealed in genome-wide association study of anorexia nervosa. Am. J. Psychiatry 174, 850–858 (2017).
53. Demontis, D. et al. Discovery of the first genome-wide significant risk loci for ADHD. bioRxiv (2017). https://doi.org/10.1101/145581.
54. Pedersen, C. B. et al. The iPSYCH2012 case–cohort sample: new directions for unravelling genetic and environmental architectures of severe mental disorders. Mol. Psychiatry (2017). https://doi.org/10.1038/mp.2017.196.
55. van den Berg, S. M. et al. Meta-analysis of genome-wide association studies for extraversion: findings from the genetics of personality consortium. Behav. Genet. 46, 170–182 (2016).
56. de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, 1–19 (2015).
57. Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).
58. Singh, T. et al. Rare schizophrenia risk variants are enriched in genes shared with neurodevelopmental disorders. bioRxiv (2016). https://doi.org/10.1101/069344.
59. Samocha, K. E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
60. Kosmicki, J. A. et al. Refining the role of de novo protein-truncating variants in neurodevelopmental disorders by using population reference samples. Nat. Genet. 49, 504–510 (2017).
61. McRae, J. F. et al. Prevalence and architecture of de novo mutations in developmental disorders. Nature 542, 433–438 (2017).
62. Baron-Cohen, S. Empathizing, systemizing, and the extreme male brain theory of autism. Prog. Brain Res. 186, 167–175 (2010).
63. Auyeung, B., Lombardo, M. V. & Baron-Cohen, S. Prenatal and postnatal hormone effects on the human brain and cognition. Pflug. Arch. 465, 557–571 (2013).
64. Shi, L., Zhang, Z., Su, B., Thompson, P. M. & Thiel, G. Sex biased gene expression profiling of human brains at major developmental stages. Sci. Rep. 6, 21181 (2016).
65. Melchers, M. C. et al. Similar personality patterns are associated with empathy in four different countries. Front. Psychol. 7, 290 (2016).
66. Song, Y. S., Lee, H.-J., Prosselkov, P., Itohara, S. & Kim, E. Trans-induced cis interaction in the tripartite NGL-1, netrin-G1 and LAR adhesion complex promotes development of excitatory synapses. J. Cell Sci. 126, 4926–4938 (2013).
67. Seiradake, E. et al. Structural basis for cell surface patterning through NetrinG-NGL interactions. EMBO J. 30, 4479–4488 (2011).
68. Baron-Cohen, S. et al. Attenuation of typical sex differences in 800 adults with autism vs. 3,900 controls. PLoS ONE 9, e102251 (2014).
69. Horan, W. P. et al. Structure and correlates of self-reported empathy in schizophrenia. J. Psychiatr. Res. 66–67, 60–66 (2015).
70. Hambrook, D., Tchanturia, K., Schmidt, U., Russell, T. & Treasure, J. Empathy, systemizing, and autistic traits in anorexia nervosa: a pilot study. Br. J. Clin. Psychol. 47, 335–339 (2008).
71. Morris, R., Bramham, J., Smith, E. & Tchanturia, K. Empathy and social functioning in anorexia nervosa before and after recovery. Cogn. Neuropsychiatry 19, 47–57 (2014).
72. Harrison, A., Sullivan, S., Tchanturia, K. & Treasure, J. Emotion recognition and regulation in anorexia nervosa. Clin. Psychol. Psychother. 16, 348–356 (2009).
73. Brewer, R., Cook, R., Cardi, V., Treasure, J. & Bird, G. Emotion recognition deficits in eating disorders are explained by co-occurring alexithymia. R. Soc. Open Sci. 2, 140382 (2015).
74. Beadle, J. N., Paradiso, S., Salerno, A. & McCormick, L. M. Alexithymia, emotional empathy, and self-regulation in anorexia nervosa. Ann. Clin. Psychiatry 25, 107–120 (2013).
75. Baron-Cohen, S., Wheelwright, S. J., Hill, J., Raste, Y. & Plumb, I. The ‘Reading the Mind in the Eyes’ Test revised version: a study with normal adults, and adults with Asperger syndrome or high-functioning autism. J. Child Psychol. Psychiatry 42, 241–251 (2001).
76. Duncan, L. et al. Significant locus and metabolic genetic correlations revealed in genome-wide association study of anorexia nervosa. Am. J. Psychiatry 174, 850–858 (2017).
77. Knafo-Noam, A., Uzefovsky, F., Israel, S., Davidov, M. & Zahn-Waxler, C. The prosocial personality and its facets: genetic and environmental architecture of mother-reported behavior of 7-year-old twins. Front. Psychol. 6, 112 (2015).
78. Knafo-Noam, A. & Uzefovsky, F. in The Infant Mind: Origins of the Social Brain (eds Legerstee, M. et al.) 97–121 (The Guilford Press, New York, 2013).
79. Bouchard, T. J. The Wilson Effect: the increase in heritability of IQ with age. Twin Res. Hum. Genet. 16, 923–930 (2013).

## Acknowledgements

We thank Richard Bethlehem, Florina Uzefovsky, and Paula Smith for discussions of the results. We are grateful to Brendan Bulik-Sullivan, Hillary Finucane, and Donna Werling for their help with the analytical methods. This study was funded by grants from the Medical Research Council, the Wellcome Trust, the Autism Research Trust, the Templeton World Charity Foundation, Inc., the Institut Pasteur, the CNRS, and the University Paris Diderot. VW is funded by St. John’s College, Cambridge, and Cambridge Commonwealth Trust. The research was funded and supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care East of England at Cambridgeshire and Peterborough NHS Foundation Trust. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR, or the Department of Health. We acknowledge with gratitude the generous support of Drs Dennis and Mireille Gillings in strengthening the collaboration between SBC and TB, and between Cambridge University and the Institut Pasteur. We thank the research participants and employees of 23andMe for making this work possible. We specifically thank the following members of the 23andMe Research Team: Michelle Agee, Babak Alipanahi, Adam Auton, Robert K. Bell, Katarzyna Bryc, Sarah L. Elson, Pierre Fontanillas, Nicholas A. Furlotte, Bethann S. Hromatka, Karen E. Huber, Aaron Kleinman, Nadia K. Litterman, Matthew H. McIntyre, Joanna L. Mountain, Carrie A.M. Northover, Steven J. Pitts, J. Fah Sathirapongsasuti, Olga V. Sazonova, Janie F. Shelton, Suyash Shringarpure, Chao Tian, Joyce Y. Tung, Vladimir Vacic, and Catherine H. Wilson. This work was supported by the National Human Genome Research Institute of the National Institutes of Health (grant number R44HG006981). The iPSYCH (The Lundbeck Foundation Initiative for Integrative Psychiatric Research) team acknowledges funding from The Lundbeck Foundation (grant no. R102-A9118 and R155-2014-1724), the Stanley Medical Research Institute, the European Research Council (project no: 294838), the Novo Nordisk Foundation for supporting the Danish National Biobank resource, and grants from Aarhus and Copenhagen Universities and University Hospitals, including support to the iSEQ Center, the GenomeDK HPC facility, and the CIRRAU Center.

A full list of the authors and affiliations in the iPSYCH-Broad autism group is provided in the Supplementary Information.

## Author information

### Author notes

1. David A. Hinds, Thomas Bourgeron and Simon Baron-Cohen are joint senior authors.

### Affiliations

1. Department of Psychiatry, Autism Research Centre, University of Cambridge, Cambridgeshire, UK
   • Varun Warrier & Simon Baron-Cohen
2. Human Genetics and Cognitive Functions Unit, Institut Pasteur, Paris, France
   • Roberto Toro & Thomas Bourgeron
3. CNRS UMR 3571: Genes, Synapses and Cognition, Institut Pasteur, Paris, France
   • Roberto Toro & Thomas Bourgeron
4. Human Genetics and Cognitive Functions, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
   • Roberto Toro & Thomas Bourgeron

6. The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
   • Anders D Børglum & Jakob Grove
7. Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
   • Anders D Børglum & Jakob Grove
8. Department of Biomedicine - Human Genetics, Aarhus University, Aarhus, Denmark
   • Anders D Børglum, Jakob Grove & Simon Baron-Cohen
9. Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
   • Jakob Grove
10. 23andMe, Inc., Mountain View, CA, 94041, USA
   • David A. Hinds
11. CLASS Clinic, Cambridgeshire and Peterborough NHS Foundation Trust (CPFT), Cambridgeshire, UK
   • Simon Baron-Cohen

### Conflict of interest

DH and the 23andMe Research Team are employees of 23andMe, Inc. The remaining authors declare that they have no conflict of interest.

### Corresponding authors

Correspondence to Varun Warrier or Simon Baron-Cohen.