Shared genetic aetiology between cognitive performance and brain activations in language and math tasks

Cognitive performance is highly heritable. However, little is known about common genetic influences on cognitive ability and brain activation when engaged in a cognitive task. The Human Connectome Project (HCP) offers a unique opportunity to study this shared genetic etiology with an extended pedigree of 785 individuals. To investigate this common genetic origin, we took advantage of the HCP dataset, which includes both language and mathematics activation tasks. Using the HCP multimodal parcellation, we identified areals in which inter-individual functional MRI (fMRI) activation variance was significantly explained by genetics. Then, we performed bivariate genetic analyses between the neural activations and behavioral scores, corresponding to the fMRI task accuracies, fluid intelligence, working memory and language performance. We observed that several parts of the language network along the superior temporal sulcus, as well as the angular gyrus belonging to the math processing network, are significantly genetically correlated with these indicators of cognitive performance. This shared genetic etiology provides insights into the brain areas where the human-specific genetic repertoire is expressed. Studying the association of polygenic risk scores, using variants associated with human cognitive ability and brain activation, would provide an opportunity to better understand where these variants are influential.

The genetic contribution to intellectual ability in human has been studied for almost a century 1 . These first studies on twins and families constitute the starting point of the behavioral genetic field. One of the first goal was to estimate the heritability of behavioral traits, meaning the proportion explained by genetics in the variance of an observed trait. These heritability studies were historically conducted in dizygotic and monozygotic twins for which a sensible model known as Falconer formula 2 links the variance of a trait to the a priori genetic and environment shares, conveniently determined in twins. Among the behavioral traits, language and math functions in humans have been extensively studied in fundamental neuroscience as distinctive abilities of human lineage. They are frequently assessed through standard behavioral tests as regular phenotypes but also with neuroimaging to provide brain characteristics, named endophenotypes 3,4 . Those two kinds of phenotype are used jointly as a way to classify the broad behavioral symptoms of language impairments into stable characteristics that in turn are candidates to search for potential associations with either medical treatment responses or genetic profiles 5,6 . Structural properties observed using magnetic resonance imaging (MRI) or activations observed with functional MRI (fMRI) have been used to produce such endophenotypes 7 . These can reveal differences between control and disease groups in language-specific regions 8 or distinguish disorder subtypes such as grammatical-SLI (specific language impairment) 9 . In this study, the endophenotypes correspond to fMRI activations, which are hypothesized to reflect brain spatial display of underlying molecular mechanisms when a subject is tested. This raises the question of their potential common genetic roots with cognitive ability assessed by tests. Imaging-genetics resources, such as the Human Connectome Project (HCP) provide an unprecedented opportunity to study the variability of behavioral endophenotypes along with behavioral tests in control subjects, as well as to determine their potential heritability or association with genetics. This dataset provides a unique opportunity to link in one study issues addressed by the genetics, neuroimaging and psychology communities because it contains: pedigree information, fMRI images and behavioral tests. There are existing bridges between these fields, but to the best of our knowledge data is rarely available at the same time. The rationale of this study is thus to look into the heritability of fMRI activations and behavioral scores separately and then to investigate the shared genetics roots between the two. Following up on these ideas, we first proposed to study the genetic influence involved in fMRI activation differences among typically developed individuals. We used the pedigree data from the HCP language comprehension and verbal math fMRI tasks. These tasks recruit regions directly implicated in brain disorders, such as Broca's area in SLI 10 , the angular gyrus in developmental dyslexia 3 and the intraparietal sulcus in dyscalculia 4 . A few studies have already attempted to estimate the narrow sense heritability of brain activations for various tasks. They notably include digit and n-back working memory 11,12 , visual math subtraction 13 , and stimuli such as written words, faces and spoken language 14 . However, these studies had relatively small sample sizes for reliably detecting heritability estimates ranging between 25 and 50%. In addition to including a larger sample size, the HCP data were processed using state of the art methods, providing 2-mm isotropic resolution and finer inter-individual registration. In particular, the so-called grayordinate activations are computed on the surface of the cortex 15 for each individual, and inter-subject fMRI alignment is performed using areal-feature-based registration 16 . The grayordinate approach refers to fMRI analyses performed on the cortical surface, as opposed to a volume-based approach. We decomposed our analysis onto the HCP's multimodal parcellation of the human cerebral cortex 17 , which enabled us to map the genetic influence on fMRI activations on a very fine scale.
Furthermore, it is known that neural activation endophenotypes from MRI may reflect not only impairments in language but also differences in cognitive scores. For example, weaker left-lateralizations have been reported for some developmental language disorders 18 , and in normal populations, fMRI activations during simple tasks correlate with various cognitive scores. Notably, single digit calculation fMRI activations are predictive of high school math scores 19 , and the fronto-parietal functional connectivity in children performing a task that required them to match Arabic numbers to an array of dots correlated with their score on a standardized math test 20 . Using HCP data and in line with these approaches, we show in this paper how variations in language related fMRI activations correlate with cognitive abilities assessed by the median reaction time (RT), average accuracy and difficulty level during the HCP language and math tasks. Remarkably, recent studies have shown that, beyond the age-related heritability of general cognitive ability 21,22 and of various indicators of academic performance 23 , these scores are highly pleiotropic 21,24 [pleiotropy occurs when one gene regulates one or more phenotypic traits].
This raises the question of the potential pleiotropy between neural activations and cognitive abilities. Thus, as a second contribution, we studied the shared genetic variance of fMRI activations and cognitive performance scores measured during the MRI session or behavioral scores acquired independently from the task. We studied behavioral variables measured by the HCP using standardized tests from the National Institute of Health (NIH): fluid intelligence, working memory, and language assessments such as vocabulary comprehension and oral reading decoding. Details of these variables and their heritability estimates can be found in Table S1, and how well they correlate phenotypically and genetically with the behavioral scores measured during the task is reported in Table S2.
Recent genome wide association studies have unveiled new loci and genes influencing human cognitive performance (e.g. human intelligence 25 , general cognitive function 26,27 and educational attainment 27 ) and possibly intelligence as a construct in differential psychology 21 . However, for these human-specific characteristics, little is known about the underlying integration mechanism of molecular functions or the brain areas where they are most influential. The shared genetic etiology investigated in this work provides new perspectives to decipher the basis of cognitive abilities such as language in humans. This study had two major aims: (1) to estimate the heritability of fMRI activations during story comprehension and math tasks; and (2) to determine the shared genetic etiology between these activations and cognitive performance.

Results
Task fMRI Activations in MATH and STORY tasks. Figure 1 shows the activations for MATH (vs the intercept of the general linear model (GLM) being considered as baseline), STORY and the contrast STORY -MATH. The intercept reflects the mean of the residual BOLD time series after removing variance explained by all other regressors. Both tasks show clear activations in the planum temporale and Heschl's gyrus area, reflecting the fact that the stimuli were presented in the auditory modality. The MATH task, in which participants were requested to perform addition and subtraction, activates areas traditionally implicated in mathematical calculations, that is, the intraparietal sulcus, the middle frontal and the inferior temporal regions 28,29 . The story listening task activates the language understanding network, encompassing bilateral temporal regions and left frontal regions 30,31 . As expected, the group activations for the STORY task are more left lateralized, notably in the left posterior superior temporal and inferior frontal regions, which correspond to Wernicke's and Broca's areas, respectively. Moreover, regions implicated in inhibition networks are also activated by these tasks 32,33 , notably the middle frontal gyrus in the math task and the medial prefrontal cortex, implicated in motivation and execution, and above the anterior cingulate cortex, controlling selective attention 34,35 . In addition, both tasks activate complementary networks; in particular, the math task deactivates the semantic and episodic memory processes, known as the default mode network, which is also active in resting or passive states 36 . This last remark makes the STORY-MATH contrast particularly relevant for studying the genetic influence on activation specifically elicited by math and story tasks.

Univariate Genetic Analyses.
We performed a cortex-wise heritability analysis on the median activation (β-value) in the 360 areals of the HCP multi-modal parcellation. After stringent Bonferroni correction (p < 0.05/360 ≈ 1.4·10 −4 ), we found 54 regions whose activations during the MATH task are heritable and 46 regions for the STORY task. These results are summarized in Fig. 2  Tables S3-S6. The details and names of the areals can be found in the Supplementary Information of the paper describing the multimodal parcellation of the human cerebral cortex 17 . For the MATH (resp. STORY) task, the heritability estimates range from 0.23 to 0.45, with the maximum in the left "Area PGp" corresponding to the angular gyrus (resp. 0.22 to 0.55, with the maximum in the left "PeriSylvian Language Area"). In addition, we performed heritability analysis using the median z-stat value in each areal instead of the median parameter estimate (β-value) and obtained similar results (Fig. S1).  The univariate genetic analysis of the activations associated with verbal math emphasizes mainly areals spanning the math network, including the intraparietal sulcus, middle frontal, inferior temporal and angular gyri. The analysis of activations associated with story comprehension distinctively underlines regions of the language network as bilaterally heritable. Among these regions are the superior temporal sulcus dorsal and ventral parts, Brodmann area (BA) 47 in Broca's area, and the middle frontal gyrus at the junction with the precentral sulcus. Interestingly, the heritability networks of the MATH and STORY tasks overlap very little except in the auditory cortex, around the planum temporale, in the frontal cortex (BA 8), and in the inferior temporal region. Table 1 presents the heritability estimates of the behavioral scores gathered during the MRI scans. The global accuracy on the HCP language tasks, averaging the scores in the MATH and STORY tasks, was significantly heritable, with h² = 0.34, close to the traditionally high heritability estimate of cognitive performance 37,38 . The heritability estimates for the median reaction time (RT) were approximately 0.2. Furthermore, RT Story and RT Math were significantly correlated (phenotypic correlation: ρ p = 0.34, genetic correlation: ρ g = 0.45, Table S2). Regarding the accuracy and average difficulty level of the HCP language task, we observed that the MATH task variables have higher heritability estimates than those of the STORY task. This result might indicate a higher genetic influence on performance during simple arithmetic tasks than during language comprehension. However, this result needs to be considered in light of the different distribution patterns of MATH and STORY accuracies. The STORY accuracy reported by the HCP displays discrete values and might not be sufficiently informative (Fig. S2). Table S2 underlines a significant correlation between math and story accuracies (ρ p = 0.15, ρ g = 0.27). The discrete distribution of STORY accuracy likely occurs because each story lasted approximately 20-30 s, few story questions were presented to the subjects, and most subjects tended to choose the correct answer in the two-alternative forced-choice question.
Bivariate Genetic Analyses. We performed bivariate genetic analyses to quantify the shared genetic influence between intellectual performance, represented by the behavioral measures, and the neural activation in each areal. The genetic correlation estimates are usually subject to substantial sampling errors and therefore inaccurate. The large sample size of the HCP offers the opportunity to reduce the standard errors. The distribution of STORY accuracy values is concentrated on a small number of values ( Fig. S2), thus, we chose to use the average of the STORY and MATH accuracies as the behavioral score to characterize the individual performance. Thus, the results presented here concern the relationship between this average score and the activations or deactivations revealed by the contrast STORY-MATH (activation map shown in Fig. 1c). As a first step of our analysis and to filter out the areals for which the neural activation was not significantly correlated with the behavioral score, we computed the phenotypic correlation between these two variables in each areal of the HCP multi-modal parcellation. Figure 3(a,b) summarizes the phenotypic correlation and associated p-values, for areals significant after Bonferroni correction (p < 0.05/360). The language network is clearly encompassed along the left superior temporal sulcus (STS) and Broca's area, as well as in the anterior part of the right STS. The activations in the angular gyrus (area PGp), supporting the manipulation of numbers in verbal form 39 , were also significantly correlated with the behavioral scores.
Among the 360 areals of the HCP multimodal parcellation, 39 (resp. 38) were significantly phenotypically correlated and were kept for the bivariate analysis in the right (resp. left) hemisphere. The shared genetic variance estimates for these areals are presented Fig. 3(c,d) (p < 0.05 without correction), and detailed values can be found in Tables S7 and S8. With stringent Bonferroni correction the p-value threshold for ρ g is p < 0.05/(39 + 38) ≈ 6.5·10 −4 . Among the areals with significant shared genetic variance, we found the left anterior ventral insular area (AVI, ρ g = 0.61)and the right angular gyrus (PGp, ρ g = −0.40). Noticeably, in the left hemisphere areals, parts of the language network in the posterior STS had activations that shared significant genetic variance with language accuracy. These include the posterior ventral (STSv posterior, ρ g = 0.47) and dorsal (STSd posterior, ρ g = 0.45) parts of the STS, adjacent to the auditory 5 complex area (A5, ρ g = 0.54), the perisylvian language area (PSL, ρ g = 0.47) and the temporo-intraparietal junction (PGi, ρ g = 0.54). On the left hemisphere internal face, we also found the superior frontal language area (SFL, ρ g = 0.61) and, adjacent to this areal the Brodmann 8 decomposed into medial (8Bm, ρ g = 0.75) and lateral (8Bl, ρ g = 0.73) parts. Additionally, we noted two right hemisphere regions implicated in language processing and significantly genetically correlated with the fMRI task average score: the temporal pole (area TG dorsal, TGd, ρ g = 0.65) and the lateral part of Brodmann area 47 (47 l, ρ g = 0.51). The latter is adjacent to Brodmann areas 44 and 45 in the inferior frontal, which are connected through the arcuate fasciculus with the language temporal regions. Additionally, we extended our analysis to behavioral variables measured by the HCP following a standardized NIH protocol. Among these, we selected the variables that are most likely to reflect cognitive performance. Then, we estimated their heritability (Table S1) and correlations with the behavioral scores measured during the fMRI task (Table S2). In this set of variables, fluid intelligence (heritability: h 2 = 0.43, correlations with language accuracy: ρ p = 0.36, ρ g = 0.61), working memory (h 2 = 0.52, ρ p = 0.34, ρ g = 0.50), vocabulary comprehension (h 2 = 0.64, ρ p = 0.40, ρ g = 0.57) and oral reading decoding (h 2 = 0.67, ρ p = 0.46, ρ g = 0.67) were the ones with the highest heritability estimates and correlations with the average accuracy of the two fMRI tasks. Thus, we performed a bivariate genetic analysis between the STORY-MATH activations (difference between β STORY and β MATH ) and these four variables. Regardless of whether one considers the NIH scores or the ones directly related to the fMRI tasks, the study of shared genetic influence with the median activation yields approximately the same set of regions (Figs 4 and 5). This observation reinforces our claim that these regions have common genetic roots with the parts of general cognitive performance accounted for by the four cognitive variables under scrutiny, namely fluid intelligence, working memory, vocabulary comprehension and reading decoding.

Discussion
In this paper, we have shown that brain activation pattern in the language and math networks are heritable. Additionally, we highlighted a particular set of regions along the superior temporal sulcus and in the inferior frontal whose activations share a common genetic basis with some aspects of general cognitive ability, assessed through fMRI task accuracy and behavioral scores.
Previous studies that have estimated the heritability of fMRI task activation had low sample sizes for significantly estimating heritably values ranging from 25% to 50%. The previous samples included 30 subjects (10 triplets of male monozygotic (MZ) twins with one additional brother) 11 , 64 subjects (19 MZ and 13 dizygotic (DZ) pairs) 13,14 or 319 subjects (75 MZ and 66 DZ pairs, 37 unpaired) 12 . We must emphasize that these results correspond to fMRI activations associated with verbal math and semantic comprehension tasks. Thus, regions not recruited by the tasks cannot be found to be significantly correlated with cognitive ability in our case, because activations in these regions are incoherent across individuals. Notably, the visual word form area, related to literacy, is not activated in our oral tasks because they did not require word reading.
Furthermore, combining data from various cohorts is unfeasible, because neural activations from different tasks are not comparable when estimating inter-individual variance. This highlights the necessity of utilizing large cohorts with standardized fMRI protocols to perform such genetic analyses. To our knowledge, this is the first study to address the heritability of fMRI activation cortex-wise on a multimodal parcellation of the human cerebral cortex. Our results confirm the genetic influence on the formation of neural circuits implicated in language 40 and math 13 . Using the HCP    Fig. S4. Genetic correlation was investigated only for areals that were significantly phenotypically correlated (Fig. 4). fine scale parcellation 17 allowed us, for instance, to distinguish the genetic effects on the temporo-parietal junction implicated in language 30 (area PGi) and on the adjacent angular gyrus (area PGp), which is particularly involved in the manipulation of numbers in verbal form 39 . Indeed, these two areas, part of BA 39, present different cytoarchitectonic properties, such as a slightly broader layer II for PGi 41 , which might explain their involvement in different tasks. In a previous work, Pinel and Dehaene also found the left angular gyrus and the posterior superior parietal lobule bilaterally to be heritable 13 . Adding to these observations, our results underline a left hemisphere intraparietal sulcus specificity, with more heritable areals and slightly higher heritability compared to the right for the MATH task. This finding is consistent with results reported by Vogel and colleagues demonstrating a correlation of activations in the left intraparietal sulcus modulated by age, which was not observed in the right intraparietal sulcus 42 . Heritability represents the proportion of observed inter-individual phenotypic variance that is explained by genetics. Thus, it might be that inter-individual variance is not sufficiently pronounced in the right hemisphere, whereas activations have evolved over one's lifetime in the left hemisphere. Overall, the heritability maps for the STORY and MATH tasks pinpoint regions known to be disrupted in neurodevelopmental disorders. For instance, the inferior frontal area and the temporo-parietal junction activations are impaired in developmental dyslexia 3,43 , and the intraparietal region activations are less modulated by the numerical distance between two numbers being compared in developmental dyscalculia 4,44,45 . Highlighted areas might provide new insights into brain regions where normal gene expression might be disrupted, leading to brain dysfunction and neurodevelopmental disorders. Frequently replicated genes associated with neurobehavioral disorders, such as developmental dyslexia or SLI, likely play such a role in structural brain maturation by interfering with neuronal migration and neurite growth 6 . Several studies have already described some phenotypic correlations between cognitive abilities and neural activations in language [46][47][48] and math 19,20,49,50 . Our study replicates these observations, notably the correlation with language processing regions, including Broca's area and the posterior superior temporal gyrus 46 . Moreover, we estimated the genetic proportion in these phenotypic correlations. Hence, we demonstrated a shared genetic etiology between brain activations and cognitive performance, assessed in our study by the following tests: fluid intelligence, working memory, vocabulary comprehension and reading decoding. Interestingly, in the right hemisphere, mainly the anterior STS was found to be genetically correlated with the language task accuracy. This result seems consistent with the hypothesized role of the right anterior STS in the processing of figurative language, likely involved in the Aesop's fable metaphors presented to the subjects 51 .
The observed genetic correlations shed light on the genetic links between cognitive performance and activation level in cognitive task-related fMRI. These links might be related to the development and maturation of myelin, enhancing brain connectivity. In children with difficulties processing syntactically complex sentences, arcuate fasciculus maturation was incomplete compared to adults 52,53 . Thus, we could look for additive genetic effects implicated in the various levels of fiber tract maturation, which improves brain connectivity and efficiency. Indeed, Skeide and colleagues reported an example of such a genetic risk variant for dyslexia. They showed that this variant is related to the functional connectivity of left fronto-temporal phonological processing areas during the resting state 54 . Similarly, children with higher arithmetic scores present a more mature response modulation in their left intraparietal lobe 49 . Our study suggests that a proportion of the observed inter-individual variance in cognitive performance partly results from the same additive genetic effects as those contributing to brain activation variance. The moderate shared genetic basis suggests that a crucial interaction occurs between the environment and gene networks to enable the brain to develop to its full potential.
In this study, we could not accurately model the share environment between siblings because HCP did not provide household information to model the shared environment. However, we pinpointed in the method section that we used the education level to model environmental differences in family socioeconomic status to mitigate the absence of common environment information. This choice is likely conservative and our heritability estimates and shared genetic are likely underestimated. Indeed, this model is conservative because the number of year of education shares genetic roots with the general cognitive ability 55 . In addition, the genotyping material from HCP was not available at the time of our study, thus we did not study gene×environment interaction.
Recently, with the emergence of large cohorts, such as UK Biobank, new loci and genes influencing human cognitive ability have been discovered [25][26][27] . However, little is known about how these genes contribute to this human-specific trait. Our study pinpoints brain regions where activations genetically correlate with global cognition scores. These regions might help elucidate the mechanism in which these genes are implicated. When the HCP genotyping data are released, a polygenic score of these newly discovered variants could be used to determine the explained proportion of the neural activation variance in these regions.

Material and Methods
Subjects. This study utilized the dataset of the Human Connectome Project (HCP). The HCP scans and data were released in April 2017 (humanconnectome.org). The details of the release are available in the HCP reference manual. In this project, 1046 subjects aged between 22 and 37 years old (µ ± σ = 28.8 ± 3.7 years) completed the fMRI language task in the HCP S1200 release. To avoid population stratification, we only included the 785 Caucasian individuals (372/413 M/F) that were classified as race = "White" by the HCP; among these, the ethnicity of 69 was "Hispanic/Latino". This subgroup of the HCP contains 178 twin pairs (117 monozygotic twins (MZ) with 103 siblings and 61 dizygotic twins (DZ) with 61 siblings and 1 half sibling), 203 siblings, 1 half sibling and 60 unpaired individuals. The unpaired individuals did not contribute to the genetic parameter estimation but allowed a more accurate estimation of mean and variance effects. Subjects were chosen by the HCP consortium to represent healthy adults beyond the age of major neurodevelopmental changes and before the onset of neurodegenerative changes 56 . They underwent a battery of tests to determine if they met the inclusion/exclusion criteria of the HCP 56 . All subjects provided written informed consent on forms approved by the Institutional Review Board of Washington University. All of the following methods were carried out in accordance with relevant guidelines and regulations. All experimental MRI protocols were approved by the Institutional Review Board of Washington University. Image acquisition and processing. MR images were acquired using a 3 T Connectome Scanner, adapted from Siemens Skyra, housed at Washington University in St Louis, using a 32-channel head coil. T1-weighted images with 256 slices per slab were acquired with the three-dimensional magnetization-prepared rapid gradient echo (3D-MPRAGE) sequence: TR = 2400 ms, TE = 2.14 ms, TI = 1000 ms, flip angle = 8°, FOV = 224 × 224 mm, and resolution 0.7 mm isotropic. T2-weighted images, 256 slices per slab, were acquired with a 3D-T2SPACE sequence: TR = 3200 ms, TE = 565 ms, variable flip angle, FOV = 224 × 224 mm, and resolution 0.7 mm isotropic. fMRI data acquisition parameters were as follows: TR = 720 ms, TE = 33.1 ms, flip angle = 52 deg, BW = 2290 Hz/ Px, in-plane FOV = 208 × 180 mm, 72 slices, and 2.0 mm isotropic voxels, with a multi-band acceleration factor of 8. Two runs of each task were acquired, one with right-to-left and the other with left-to-right phase encoding 2; each run interleaved 4 blocks of a story task with 4 blocks of a math task. The lengths of the blocks varied (average of approximately 30 seconds), but the task was designed so that the math task blocks matched the length of the story task blocks, with some additional math trials at the end of the task to complete the 3:57 (min:sec) run. The details of the HCP data analysis pipelines are described elsewhere 15,57 . Briefly, they are primarily built using tools from FSL 58 and Freesurfer 59 . The HCP fMRIVolume pipeline generates "minimally preprocessed" 4D time series that include gradient unwarping, motion correction, fieldmap-based EPI distortion correction, brain-boundary-based registration of EPI to structural T1-weighted scans, non-linear (FNIRT) registration into MNI152 space, and grand-mean intensity normalization 57 . For the S500 release, two smoothing approaches were chosen by the HCP: volume-based smoothing or smoothing constrained to the cortical surface and subcortical gray-matter parcels. For the former, standard FSL tools can be applied for analysis, while for the latter, the HCP adapted these tools to this the 'grayordinate' approach 15,57 . The grayordinate approach refers to fMRI analyses performed on the cortical surfacen, as opposed to a volume-based approach. This is more accurate spatially because activation occurs in gray, not white, matter. Unconstrained volume-based smoothing causes blurring effects by mixing signals from cortex regions adjacent in volume but not on the surface. For these reasons, our study analyses were carried out on the surface of the cortex.
The HCP fMRISurface pipeline brings the time series from the volume into the CIFTI grayordinate standard space. This is accomplished by mapping the voxels within the cortical gray matter ribbon onto the native cortical surface, transforming them according to the surface registration onto the 32k Conte69 mesh, and mapping the set of subcortical gray matter voxels from each subcortical parcel in each individual to a standard set of voxels in each atlas parcel. The result is a standard set of grayordinates in every subject (i.e., the same number in each subject, with spatial correspondence) with 2 mm average surface vertex and subcortical volume voxel spacing. These data are smoothed with surface and parcel constrained smoothing of 2 mm FWHM (full width half maximum) to regularize the mapping process 57 .
The language task. The HCP language task was developed by Binder and colleagues 36 . In the story blocks, participants were presented with brief auditory stories adapted from Aesop's fables, followed by a 2-alternative forced-choice question to check the participants' understanding of the story topic. The example provided in the original paper is "For example, after a story about an eagle that saves a man who had done him a favor, participants were asked, "Was that about revenge or reciprocity?"". In the math blocks, participants were also presented auditory series of addition and subtraction (e.g., "fourteen plus twelve"), followed by "equals" and then two choices (e.g., "twenty-nine or twenty-six"). To ensure similar level of difficulty across participants, math trials automatically adapted to the participants responses. As shown by Binder and colleagues 36 , the story and math trials were well matched in terms of duration, auditory and phonological input, and attention demand. Furthermore, they were likely to elicit distinct brain activation -on the one hand, anterior temporal lobes classically involved in semantic processing, and parietal cortex on the other hand, classically involved in numerical processing, thus spanning a broad set of regions involved in conceptual semantic processing.
HCP task fMRI analysis. The analysis of fMRI data was carried out by the HCP consortium and we describe briefly their pipeline 15 . The Story predictor covered the variable duration of a short story, question, and response period (~30 s). The Math predictor covered the duration of a set of math questions designed to roughly match the duration of the story blocks. The grayordinate data for individual task runs were processed in a level 1 analysis. Activity estimates were computed for the preprocessed functional time series from each run using a general linear model (GLM) implemented in FSL's FILM (FMRIB's Improved Linear Model with autocorrelation correction) 60 . Predictors were convolved with a double gamma "canonical" hemodynamic response function 61 to generate the main model regressors. The two runs for each task and subject were then combined in a level 2 fixed-effects analysis 15 , which we used as our phenotype. Fixed-effects analyses were conducted using FEAT (fMRI Expert Analysis Tool) to estimate the average effects across runs within-subjects, and then mixed-effects analyses treating subjects as random effects were conducted using FLAME (FMRIB's Local Analysis of Mixed Effects) to estimate the average effects of interest for the group third-level analysis.
Phenotype definitions. To define our phenotypes, we consider separately the regression analyses on STORY and MATH tasks, and the contrast STORY-MATH. We used the beta values (pe1.dtseries.nii files) of the results of the level 2 analysis, which essentially average the level 1, i.e., the individual, runs. The contrasts were defined by the HCP in level 1 and averaged for level 2: thus, the grayordinate values of the beta and contrast values (cope1.dtseries.nii) are identical in this case, as they did not define any "new" contrasts specifically at level 2. Therefore, we could have used the cope1.dtseries.nii.files with no difference in results. We used the MSMAll registered the functional analysis results from HCP and the HCP multimodal parcellation 17 . We analyzed each of the 180 areals separately. We computed the median beta values in each areal for both hemispheres. These phenotypes constitute our proxy to estimate the activation in each part of the brain. Moreover, we also included in our phenotypes the accuracy, reaction time and average difficulty level for the MATH and STORY tasks. We called these "behavioral scores", as opposed to the grayordinate activation phenotypes previously defined.
The STORY-MATH contrast allows to cancel out: (i) the mean of the residual BOLD signal, which reflects the mean of the random error, (ii) signal in areas activated by both tasks, for example the primary auditory area. In addition, as emphasized by Binder and colleagues in the paper introducing this contrast 36 , the story and math tasks activate complementary networks. In particular, the math task deactivates the semantic and episodic memory processes, known as the default mode network, which remains active in resting or passive states. The STORY task activates regions traditionally involved in the semantic comprehension of phrases along the superior temporal sulcus, including Wernicke's area, and in the inferior frontal region, including Broca's area 62 . This set of activation is often left lateralized as observed here (Fig. 1). The MATH task involves brain regions in the intraparietal, middle and inferior frontal regions, as well as the angular gyrus, when the math task is performed orally 39 .
In the STORY-MATH contrast positive (respectively negative) beta weights reflect brain regions more activated by the STORY task than the MATH task, or deactivated by the MATH task (respectively more activated by the MATH task than the STORY task, or deactivated by the STORY task). Brain regions that are jointly activated by the STORY and MATH tasks have approximately null beta weights.
Univariate analysis of additive genetic variance. The variance components method, as implemented in the Sequential Oligogenic Linkage Analysis Routines (SOLAR) software package 63 , was used for the heritability estimations of the phenotypes under analysis, such as the median activation in each areal 17 . The SOLAR algorithms use maximum variance decomposition methods derived from the strategy developed by Amos 64 . The covariance matrix Ω for a pedigree of individuals is given by: Ω = 2·Φ·σ g 2 + I · σ e 2 , where σ g 2 is the genetic variance due to the additive genetic factors, Φ is the kinship matrix representing the pair-wise kinship coefficients among all individuals, σ e ² is the variance due to individual-specific environmental effects, and I is the identity matrix.
Narrow sense heritability is defined as the fraction of the phenotype variance σ p 2 attributable to additive genetic factors: h² = σ g 2 /σ p 2 .
The significance of the heritability is tested by comparing the likelihood of the model in which σ g ² is constrained to zero with that of the model in which σ g ² is estimated. Before testing for the significance of heritability, phenotype values for each individual within the HCP cohort were adjusted for the following covariates: sex, age, age², age·sex interaction, age²·sex interaction, ethnicity (Hispanic or not) and education level. We used the number of years of education as a proxy for the education level to account for environmental differences in family socioeconomic status. This is a conservative approach because the number of years of education was shown to be associated not only with the family socioeconomic status (7%) but also with the general cognitive ability (3.5%) 55 . Thus, it likely has shared environmental ground with the former and shared genetic origin with the latter. HCP data do not contain the information that would disentangle this issue. Following this last remark, one should note that the heritability estimates and shared genetic variances, described in the next section, were underestimated.

Bivariate genetic analyses.
To assess the relationship between math dexterity/language comprehension and activation in brain areas, we computed the Pearson correlation between the median activation in each of the 180 areals of both hemispheres, and the behavioral scores.
Furthermore, we assessed the degree of shared genetic variance in the areals for which activation was significantly correlated with the behavioral scores; we performed a genetic correlation analysis using SOLAR, relying on the following model: , where Pearson's phenotypic correlation ρ p is decomposed into ρ g and ρ e . ρ g is the proportion of variability due to shared genetic effects and ρ e that due to the environment, while h a 2 and h b 2 correspond to the previously defined narrow sense heritability for phenotypes a and b, respectively. In our case, one corresponds to the heritability of fMRI activation in one areal, while the second is the heritability of one of our behavioral scores.

Data Availability statement
HCP (https://db.humanconnectome.org) is a publicly available dataset. Investigators need to apply to be granted access to restricted data.