The phenotypic correlation between human intelligence and brain volume (BV) is considerable (r ≈ 0.40), and has been shown to be due to shared genetic factors. To further examine specific genetic factors driving this correlation, we present genomic analyses of the genetic overlap between intelligence and BV using genome-wide association study (GWAS) results. First, we conduct a large BV GWAS meta-analysis (N = 47,316 individuals), followed by functional annotation and gene-mapping. We identify 18 genomic loci (14 not previously associated), implicating 343 genes (270 not previously associated) and 18 biological pathways for BV. Second, we use an existing GWAS for intelligence (N = 269,867 individuals), and estimate the genetic correlation (rg) between BV and intelligence to be 0.24. We show that the rg is partly attributable to physical overlap of GWAS hits in 5 genomic loci. We identify 92 shared genes between BV and intelligence, which are mainly involved in signaling pathways regulating cell growth. Out of these 92, we prioritize 32 that are most likely to have functional impact. These results provide information on the genetics of BV and provide biological insight into BV’s shared genetic etiology with intelligence.
The relation between brain volume (BV) and cognitive ability has long been a fundamental question to cognitive sciences1,2,3,4. Before the advent of magnetic resonance imaging (MRI), BV was typically determined from autopsy records5, or inferred by external head size measures. A literature meta-analysis conducted by Vernon et al.6 estimated the correlation between external head measures and intelligence to be around 0.19. Five years later, a meta-analysis of 24 studies using MRI-based measures of BV (with the subtle title; “Big brains are better”) found a substantially higher correlation of 0.33 with intelligence7. Recently, an exhaustive meta-analysis8, including 88 studies comprising 148 samples, was conducted, aiming to readdress the much-debated relation between BV and intelligence. This study found that BV and intelligence were, indeed, robustly correlated, albeit lower (r = 0.24) than what was reported in the literature. The difference may in part be explained by publication bias, with strong, positive correlations more often being reported than small, non-significant ones. Recently, these findings were supported by another large-scale study reporting a correlation of r = 0.19 between BV and fluid intelligence9 (referring to problem-solving and logical skills that do not depend on previously acquired knowledge). Interestingly, previous studies did not find evidence for sex differences in the relation between BV and intelligence8.
The conservative correlation of ~0.20, implies an explained variance of just 4%. This is low and therefore BV is not a good predictor of intelligence, and vice versa, intelligence is not a good predictor of BV. However, knowing that this correlation is at least partly due to shared genetic factors10 suggests that knowledge of which specific genetic factors are involved in both traits may provide insight into the observed covariance between BV and intelligence. With the availability of larger genotype data and genome-wide association studies (GWAS) that identify increasing numbers of genes related to BV11,12,13 and intelligence14,15,16, analysis of the shared genes and pathways has become feasible.
In the current study, we utilize large-scale genetic data and advanced statistical genetics tools, to examine the genetic factors that underlie the observed correlation between BV and intelligence. We conduct a GWAS of BV in the UK Biobank (UKB) sample, and meta-analyze the results with two additional cohorts for which data on intracranial volume (ICV) and head circumference (HC), a proxy measure for BV, is available in the public domain. Using extensive follow-up analyses and GWAS summary statistics of intelligence we aim to zoom in on the genetic factors overlapping between BV and intelligence.
We identify 92 genes affecting both BV and intelligence, which are mainly involved in signaling pathways regulating cell growth. A series of follow-up analyses prioritizes 32 out of these 92 genes that are most likely to have functional impact based on the properties of these genes.
GWAS meta-analysis of BV
To identify genetic variants associated with BV, we first performed a GWAS (Supplementary Fig. 1) using data of 17,062 participants from the UKB17, with BV estimated from structural (T1-weighted) MRI by summing total gray and white matter volume, and ventricular cerebrospinal fluid volume (Supplementary Fig. 2). GWAS analyses in UKB were corrected for the Townsend deprivation index (TDI)18, a measure that correlates with socioeconomic and health-related factors19,20, as well as for age, sex, genotype array, assessment center, standing height, and the first 10 genetic principal components (PCs). Within the UKB data, we identified 3,610 genome-wide significant (GWS; P < 5 × 10−8) variants, tagging 9 independent genomic loci. The single nucleotide polymorphism (SNP)-based heritability (h2SNP) of BV, estimated through linkage disequilibrium score regression (LDSC21; Methods), was 35.3% (SE = 4.1%). The LDSC intercept of 1.02 is close to 1, suggesting that the observed inflation (λGC = 1.118) in genetic signal is mostly due to polygenicity rather than population stratification22 (see Supplementary Note 1). The UKB GWAS results were then meta-analyzed with GWAS results from two previously published studies: one on ICV from the ENIGMA consortium13 (N = 11,373), the other on HC, a proven proxy of BV23,24,25 (also see Supplementary Note 2), from a recent GWAS meta-analysis11 of adults and children (N = 18,881). This led to a total sample size of 47,316 unrelated Europeans (Supplementary Fig. 3; Supplementary Table 1). LDSC showed high concordance of SNP associations between the three samples (UKB and ENIGMA: rg = 1.25, SE = 0.20; UKB and HC-GWAS: rg = 0.75, SE = 0.09; ENIGMA & HC-GWAS: rg = 0.94, SE = 0.20; Supplementary Data 1), justifying subsequent meta-analysis (we note that the rg outside [−1, 1] is likely due to the unbiased rg estimation by LDSC, the large standard error of the estimate and the high genetic correlations between these meta-analyzed cohorts). Sample-size weighted fixed-effects meta-analysis was carried out using METAL26 (Methods) resulting in 24 linkage disequilibrium (LD) independent lead variants (r2 < 0.1), residing in 18 genomic loci (Fig. 1a–b; Supplementary Data 2; Supplementary Figure 4; Supplementary Note 3), representing 4,155 GWS variants (of which 371 were indels, Supplementary Data 3) associated with BV. Of these 18 loci, 14 were not identified in a recent GWAS study of ICV12 (2 loci reported in this earlier study were not GWS in the current meta-analysis). To see if our results were driven by only one of the three samples or were supported by all three, we examined the direction of effect of all GWS variants across the three individual cohorts. As expected, we found high concordance: all variants that were GWS in the meta-analysis, and that were present in all three cohorts, had the same direction of effect. Of the 4,155 variants that were GWS in the BV meta-analysis, just 2.4% was GWS in at least two of the individual cohorts, however, 75.0% had a P-value < 0.05 in at least two cohorts (and 21.2% in all three cohorts). The h2SNP of BV estimated by LDSC from the meta-analytic results was 21.4% (SE = 1.8%), and the LDSC intercept approximated 1 (1.025, SE = 0.009), suggesting that the inflation in test statistics (λGC = 1.18) in the meta-analysis was also largely due to polygenicity22. Functional annotation of 4,683 ‘candidate’ variants (i.e., variants in the loci with a GWAS P-value of P < 10−5 and LD r2 > 0.6 with one of the independent significant variants; Methods; Supplementary Note 4) carried out in FUMA27 showed that these variants were most abundant in intronic (n = 2,444, 52,1%) or intergenic regions (n = 751, 16.0%), and 27 (0.6%) variants were exonic nonsynonymous SNPs (ExNS) altering protein structures of 13 genes (Supplementary Figure 5; Supplementary Data 4, 5). One gene, SPPL2C, contained 8 ExNS (all in exon 1, and in the same inversion region). SPPL2C codes for the signal peptide peptidase-like 2C, which plays a role in the degradation of signaling peptides in the brain28.
To test whether specific functional categories of variants contribute disproportionally to the heritability of BV, we used LDSC to partition the genetic signal over different variant annotations29 (Methods). We observed significant heritability enrichment in eight variant categories (Fig. 1c; Supplementary Table 2), with the strongest enrichment of variants in (evolutionary) conserved regions, (enrichment = 15.1, SE = 2.7, P = 1.19 × 10−6, suggesting a 15-fold enrichment in h2 conveyed by variants in these regions compared to the proportion of variants in these regions), transcription start sites (TSS; enrichment=11.1, SE = 3.1, P = 1.2 × 10−3), and H3K9ac peaks (i.e., specific histone modification that is correlated with active promoters; enrichment = 10.3, SE = 3.0, P = 1.24 × 10−3).
BV gene analyses
To gain insight into which genes may be involved in BV, we mapped the 4683 candidate variants implicated in the BV meta-analysis to genes, using positional mapping, eQTL mapping, and chromatin interaction mapping as implemented in FUMA27, and through gene-based association tests as implemented in MAGMA30 (Methods, Supplementary Fig. 6; Supplementary Note 5). In total, 343 unique genes were implicated by at least one of these methods (of these, 321 genes were mapped based on the 18 identified risk loci). Specifically, the 18 risk loci were mapped to 119 genes based on position, 207 genes by eQTL association, 192 genes through chromatin-chromatin interactions (Fig. 2a; Supplementary Data 6), and gene-based association testing identified 69 genes (Fig. 2b; Supplementary Data 7). Of the 343 genes, 73 genes were also found in previous studies (38 were not), while 270 were not previously implicated (Supplementary Data 8). Overall, 16 genes were implicated by all four methods (FRZB, FOXO3, WBP1L, PRR13, MAP3K12, RAB5B, SUOX, RPS26, ERBB3, PITPNM2, C12orf65, INA, HMGA2, PLEKHM1, MAPT, and KANSL1) of which the first 11 were not identified in previous studies of BV at the time of analysis11,12,31,32,33.
Variation in BV can be seen as a continuum, with abnormally small or large brains as the extremes. Since monogenic Mendelian disorders are often characterized by abnormal brain development, which frequently results in deviations in BV, we performed look-up of the 343 genes implicated in the BV meta-analysis in the Online Mendelian Inheritance in Man database34 (OMIM). We identified 143 monogenic disorders caused by high-penetrance mutations in 89 of the 343 BV-implicated genes (25.9% vs. 16.4% (59 genes) that is expected based on the number of genes reported in OMIM; Fisher exact test P = 2.33 × 10−5) many of which are commonly associated with abnormal brain development (including microcephalia (CDK6), macrocephalia (PTEN) and megalencephaly (AKT3)), and abnormal growth (tetra-amelia syndrome (WNT3)) (Supplementary Data 9).
To identify functional pathways related to BV, we performed gene-set and gene-expression analysis in MAGMA30 (Methods). Of the 12,191 tested gene sets (including canonical pathways, gene ontology (GO) gene sets, and gene-expression in tissue/cell types), 18 were significantly associated after Bonferroni correction (Gene-set PBONF = 3.90 × 10−6 (0.05/(12,191 gene sets + 53 tissue types + 565 cell types)); Fig. 2c; Supplementary Data 10–11; Supplementary Note 6). Among these significant gene sets were several cell-signaling pathways, including the ERBB3 pathway, the PI3K/AKT signaling in cancer pathway and a pathway related to GRB7 events in ERBB2 signaling. Other pathways were related to (regulation of) developmental processes such as mesenchyme and muscle organ development, or to growth. That is, key cell-signaling pathways were implicated that are involved in normal brain development35 and involved in several brain developmental abnormalities36,37. Pairwise conditional gene set analysis allows one to disentangle overlapping associations of gene sets (see Methods). Conditional gene-set analyses can be used to examine whether a gene set is independently associated with a trait, rather than being associated because it is nested within a larger associated gene set38. These analyses indicated that nine gene sets constitute largely independent associations, defined as remaining significant in the majority of the conditional analyses (Supplementary Figs. 7–9, Supplementary Data 12). Conversely, conditional analyses suggest that a number of signaling pathways (BIOCARTA: ERBB3 pathway, BIOCARTA: PTEN pathway) represent a shared underlying signal, since conditional P-values were substantially higher than the unconditional P-values.
Next, we aimed to identify tissue categories and neuronal cell types that are enriched for gene signal of BV, by linking gene P-values to gene-expression in 53 tissue types39 and 565 brain cell types40 (Methods). None of the associations of tissue or cell types passed our stringent multiple testing correction (i.e. correcting for all tested gene sets, tissues, and cell types, thus 0.05/12,809). The strongest evidence of association was observed for three cell types: TNR-BMP4 polydendrocytes in the subthalamic nucleus, (P = 4.67 × 10−3), layer 5a cortical (P = 8.55 × 10−3), and hippocampal endothelial stalk FLT1-LCN2 neurons (P = 0.01) (Supplementary Data 10). Polydendrocytes are thought to be precursor cells of oligodendrocytes41, a cell-type involved in supporting neuronal health and myelinization of the brain42. The endothelial neurons in the hippocampus support formation of the blood–brain barrier, and are involved in the regulation of neurogenesis43.
Genetic correlations between BV and other traits
Earlier studies have shown substantial overlap between BV and behavioral traits, reporting phenotypic correlations between BV and, e.g., autism44, and genetic correlations between BV and intelligence45,46,47,48 and schizophrenia49. We used LDSC to estimate the overlap in genetic signal between BV and 25 brain-related and neuropsychiatric traits for which published genome-wide summary statistics based on large samples were available (Methods). Significant genetic correlations (P < 0.002; 0.05/25) were observed between BV and 7 traits, amongst which were positive genetic correlations with educational attainment (rg = 0.22, SE = 0.03, P = 3.58 × 10−17; Fig. 1d; Supplementary Table 3), intelligence (rg = 0.21, SE = 0.03, P = 9.87 × 10−12), BMI (rg = 0.17, SE = 0.03, P = 1.22 × 10−9) and birth weight (rg = 0.26, SE = 0.07, P = 1.13 × 10−4), and negative genetic correlations with ADHD (rg = −0.17, SE = 0.05, P = 3.56 × 10−4), neuroticism (rg = −0.10, SE = 0.03, P = 4.65 × 10−4), and insomnia (rg = −0.12, SE = 0.03, P = 5.76 × 10−6), confirming previously reported overlap12,50,51
Genetic overlap with intelligence
The development of human intelligence has coincided with a strong increase in total size of the brain52. Indeed, epidemiological studies have shown overlap in genetic factors between BV and intelligence46,53. To further explore the nature of the genetic overlap between intelligence and BV, we used GWAS summary statistics of a large recent meta-analysis of intelligence from Savage et al.53, based on data from 14 cohorts (N = 269,867) using different types of intelligence measures. Except for one, all of these cohorts used intelligence measures that adhere to the idea of a single latent trait g (Spearman’s g) that underlies multiple domains of cognitive ability. Extensive previous research has shown that a) scores on different, validated tests of cognitive ability correlate highly, and b) all tests index the same underlying (statistical) construct, i.e., g10,54.
Our analyses confirmed the previously established genetic overlap between BV and intelligence: the estimated rg between BV and intelligence was 0.24 (Supplementary Data 2; Supplementary Table 3), in line with prior evidence of shared genetic factors between intelligence and gray and white matter volume obtained from twin studies46. To gain insight into the specific genetic factors driving this genetic correlation, we explored the overlap in variants, loci, and genes associated with both traits. We first determined whether strongly associated variants show the same direction of effect in both traits, by performing a look-up of the lead variant of BV in the intelligence GWAS, and vice versa. We observed a complete sign concordance of the 24 BV lead variants in intelligence (sign concordance = 100%, P = 1.19 × 10−7, Fig. 3a), and weaker but still considerable sign concordance of the 243 lead variants of intelligence in BV (sign concordance = 62.1%, P = 1.39 × 10−4, Fig. 3b). Similarly, we found that the majority of genes that reached gene-wide significance in the MAGMA gene-based test of BV (69) also showed low P-values in intelligence (507), and vice versa (Supplementary Fig. 10).
To explore possible causal relations between BV and intelligence, we carried out GWAS summary statistics-based Mendelian Randomization (MR) analyses, using the Generalized Summary-data-based MR package55 (GSMR; Methods), essentially using independent lead variants as instrumental variables (LD: r2 < 0.1). GSMR analyses demonstrated a directional effect of BV on intelligence (bxy = 0.154, SE = 0.015, P = 1.88 × 10−23; Fig. 3c) and a less strong yet still highly significant directional effect of intelligence on BV (bxy = 0.139, SE = 0.026, P = 7.34 × 10−8; Fig. 3d), suggesting a bidirectional association between these phenotypes, in line with the previous reports53. We do note, however, that GSMR relies on several strong assumptions, such as the absence of a third, mediating, factor, and no horizontal or vertical pleiotropy56, which may not always hold. The current results should be interpreted conditional on these assumptions.
To investigate whether specific variants or genes could be identified that drive the genetic overlap between intelligence and BV, we performed several cross-trait analyses of variants and genes significantly implicated in both traits. We observed physical overlap in 5 out of the 18 genomic loci for BV, and overlap in genes implicated by FUMA (ngenes = 81) and MAGMA (ngenes = 24), resulting in 92 unique overlapping genes (Fig. 3e–f; Supplementary Data 13). Conversely, of the 343 genes that were significant for BV, 251 genes were not associated to intelligence (Fig. 3e). Lookup of gene functions of the 92 overlapping genes in the online GeneCards57 repository showed strong involvement of these genes in a wide variety of cellular processes and key factors in cell division. When comparing the probability of being loss-of-function (LoF) intolerant (pLI score >0.9) between overlapping genes (n = 92) and genes observed for only one of these traits (n = 1926), we observe a slightly higher fraction of genes being LoF intolerant in the genes that play a role in both traits (24.4% (shared) vs. 21.2% (intelligence-only), 18.0% (BV-only), and 17.7% (all genes), Fig. 3g). The mutation intolerance of these overlapping genes is further demonstrated by a slightly higher fraction of genes in this category that are associated with monogenic disease (18.3%) in the OMIM database compared to BV-only (16.8%) and intelligence-only (17.4%) genes (all genes: 16.4%). Furthermore, we observed a higher proportion of genes that cause monogenic developmental disorders (overlap: 16.0%, BV-only: 10.0%, intelligence-only: 13.7%, all genes: 12.3%) in the DECIPHER database of developmental disorders58 (Fig. 3g).
Previous studies have sought to narrow down the link between intelligence and BV to specific subcortical regions, such as the caudate nuclei59 or the thalamus60. If specific brain regions are related to intelligence, one might expect genes associated with intelligence (or BV) to be differentially expressed across regions. Using data obtained from the Allen Human Brain Atlas (AHBA)61, we examined whether cortical gene expression profiles of the 92 genes that overlap between BV and intelligence differed from the gene expression profiles of all the 1900 genes associated to BV and/or intelligence (Fig. 3h). Specifically, using permutation analyses we compared the expression of these 92 overlapping genes across 57 brain regions to that of 10,000 randomly selected, equally large, sets of genes drawn from the 1900 genes related to either or both of the traits. Although we observed overexpression of the 92 overlapping genes in the anterior part of the fusiform (two-sided permutation test N = 10,000 permutations P = 0.015) and the parahippocampal gyrus (P = 0.005), none of the associations survived a conservative Bonferroni correction (P = 8.77 × 10−4; 0.05/57); Supplementary Data 14). To examine whether clusters of more homogeneously expressed genes exist within the 92 overlapping genes, we additionally performed clustering analysis on the correlations between the expression profiles of genes, aiming to maximize intra-cluster cohesion, while at the same time maximizing differences between clusters (see Methods). Two clusters were identified and only for cluster 2 (representing 42 of the 92 overlapping genes) one of the 57 expression profiles was significantly different from that of random sets, suggesting overexpression in the anterior fusiform gyrus (P = 6.0 × 10−4; Supplementary Data 14). Interaction analyses in MAGMA (see Methods), testing whether gene expression in specific brain regions contributed significantly to the genetic relationship between BV and intelligence, revealed no evidence for interaction (Supplementary Data 15). Based on these results, we are unable to conclude that the set of overlapping genes is predominantly expressed in any particular cortical region compared to other brain regions.
To prioritize from the 92 overlapping genes those genes that are more likely to have causal effects on both BV and intelligence, we filtered the set of overlapping genes based on either one of three conditions: (1) the gene contains an ExNS SNP, (2) the gene is part of one of the significant gene sets for BV or intelligence, and (3) the GWAS signal of either trait colocalizes with eQTL signals, using COLOC62 (colocalization of GWAS and eQTL signals is compatible with the hypothesis of a common causal variant; Supplementary Data 13). Filtering the 92 overlapping genes on the basis of these 3 conditions yielded a selection of 32 potentially functionally interesting genes. For example, we prioritized two genes on chromosome 3 (out of 4 associated genes in the same region) that may prove particularly interesting candidates for functional follow-up aimed at characterizing the genetic relation between BV and intelligence: both USP19 and ARIH2 are part of a significantly associated gene set for BV, and in addition USP19 contains an ExNS variant that was significantly associated to intelligence.
To examine whether the gene-set associations are unique to BV and intelligence, respectively, we conducted conditional gene-set analyses (i.e., we conditioned the gene-based Z-scores in BV on the gene-based Z-scores for intelligence, and vice versa). Interestingly, these analyses indicated that the gene sets identified for BV (18) and intelligence (9) are trait specific, since the association remains virtually unchanged when conditioning on the other trait (Supplementary Table 4). Fisher’s exact tests showed that the 92 overlapping genes were not significantly enriched in the 27 gene sets associated to BV or intelligence (Supplementary Table 5). However, several genes related to intelligence were included in gene sets observed for BV, including ERBB3 and FOXO3 (see Fig. 4), suggesting that, although no single gene set was shared between BV and intelligence, several genes associated with intelligence are located along important signaling pathways implicated in BV.
We studied the genetics of BV and report 14 loci and 270 genes that had not been previously associated to BV. The gene-findings converged into 18 significantly associated gene-sets related to cellular signaling, division, and growth, mostly appearing independent in conditional analyses.
We found a significant positive genetic correlation with intelligence using previously published GWAS result53, confirming the genetic overlap estimated from twin studies46. We then explored whether specific genes or gene-sets drive this genetic correlation, and identified 92 genes that are associated with both traits. Of these 92, 32 genes were indicated as the most plausible ones explaining the shared genetic etiology. Several of these genes are involved in cell growth and differentiation pathways (e.g., FOXO3, ERBB3, DDR, and HBEGF, Fig. 4). We also tested whether the set of 92 shared genes, had specific biological characteristics that would set them apart from the genes that were associated with either BV or intelligence. Except for an increased intolerance to LoF mutations, we did not find any other of the tested biological functions to be different. We note that that the current set of genes associated with both traits, likely underestimates the true number of shared genes, as current sample sizes of especially the BV GWAS may limit the number of significantly associated genes.
The nature of the genetic overlap between intelligence BV remains largely unresolved. The large numbers of genes (>1000) that well-powered GWAS are now able to identify53,63,64 provide a starting point for understanding how genetic variation contributes to intelligence, yet it does not directly provide insight into the genetic relation with BV. Investigating genes that are related to both brain function (intelligence) and structure (BV), may provide further insight into the nature of the relation between intelligence and BV. We emphasize that in addition to shared pathways and biological pleiotropy, some genes linked to intelligence may also be associated to BV through gene-environment correlation with factors that indirectly influence BV development65.
In conclusion, we explored and refined the genetic architecture of BV, and identified and characterized a set of genes that likely drives this association.
Samples and phenotypes
UK Biobank - Brain volume: The UKB (UKB, www.ukbiobank.ac.uk) constitutes a large data set, combining a wide range of phenotypes with genetic and imaging information. The UKB study received ethical approval from the National Research Ethics Service Committee North West-Haydock (reference 11/NW/0382), and all study procedures were in accordance with the World Medical Association for medical research. The current study was conducted under UKB application number 16406. Here, we used processed data of a subset of N = 21,407 individuals who underwent a MRI procedure: data were released in the third quarter of 2018. After filtering on quality of the imaging results, relatedness, European ancestry and the availability of relevant covariates (discussed in more detail below), we arrived at a final sample size of N = 17,062. The BV phenotype was approximated as follows: BV = white matter volume + gray matter volume + cerebrospinal fluid volume (UKB field codes 25008, 25006, and 25004, respectively). The UKB obtained ethical approval from the National Research Ethics Service Committee North West–Haydock (reference 11/NW/0382), and all study procedures were performed in accordance with the World Medical Association for medical research. The current study was conducted under UKB application number 16406.
ENIGMA - ICV: We used GWAS summary statistics on ICV obtained in a collaboration of cohorts included in the meta-analysis of ICV of the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA)13 consortium. ICV was calculated from brain MRI data collected in 50 individual cohorts (N = 11,373). Given the moderate phenotypic and relatively low, but still significant, genetic correlation between BV and height (phenotypic r = 0.55 in UKB; rg = 0.26 between height in UKB and uncorrected ICV summary statistics from ENIGMA13) we considered height an important covariate to include in GWAS on volumetric brain measures. Therefore, we used mtCOJO to correct for height, by conditioning on height using UKB-derived sumstats (this procedure is described in more detail below). Details on the individual cohorts included in the ENIGMA meta-analysis and genotyping, imputation and analysis procedures are described elsewhere13. All participants were of European descent and provided informed consent. All studies included in the ENIGMA meta-analysis were approved by their local institutional review board or local ethics committee.
HC GWAS: In contrast to the two samples discussed above, the GWAS summary statistics obtained from the HC GWAS are based on genetic analysis of an indirect measure of BV, i.e., HC in children and adults (see Supplementary Note 2 for more information on the relation between HC and BV). These summary statistics were generated by meta-analyzing 11 population-based cohorts, resulting in a total sample size of N = 18,881. All studies included in the meta-analysis were approved by the local ethics committees, and informed (or parental) consent was obtained from all participants. Data collection, genotyping, imputation and analysis procedures are described in detail elsewhere (see Haworth et al.11). Since not all individual studies that were included in the HC meta-analysis corrected for height, we first conditioned the HC GWAS summary statistics on height using a procedure similar to that described for the ENIGMA data (rg = 0.39 between height in UKB and uncorrected HC summary statistics from Haworth et al.11.
Genotyping, imputation and quality control
The genotype data that we used was released by the UKB in March 2018, and concerns an updated version of data released earlier (July 2017). Details on the collection and processing of the genotype data are described elsewhere17. To summarize, the UKB genotyped in total 489,212 individuals on two custom made SNP arrays (UK BiLEVE AxiomTM array covering 807,411 markers (n = 49,950) and UKBTM Axiom array covering 825,927 markers (n = 438,427), both by Affymetrix). The genotype arrays shared 95% of marker content. Quality control executed by the UKB team resulted in a total of 488,377 individuals and 805,426 unique markers in the released data. In the version of the data used for the current study (genotype data release of March 2018), genotypes were phased and imputed by the UKB team to a combined reference panel of the Haplotype Reference Consortium and the UK10K. Finally, imputed and quality-controlled genotype data was available for 487,422 individuals and 92,693,895 genetic variants.
Prior to analysis, imputed variants were converted to hard call using a certainty threshold of 0.9. In our own quality control procedure, we excluded variants with a low imputation score (INFO score <0.9), low minor allele frequency (MAF < 0.005) and high missingness (>0.05). Furthermore, indels that had the same chromosomal location were excluded. This resulted in a total of 9,203,453 variants used for downstream analysis.
To reduce bias of the results due to population stratification, we only included individuals from European descent in our analyses. To this end, we projected principal components (PC’s) from the 1000 Genomes reference populations onto the called genotypes available in the UKB data. Participants for whom the projected principal component score was closest (using the Mahalanobis distance) to the average score of the European 1000Genomes sample66 were considered to be of European descent. Participants having a Mahalanobis distance >6 S.D. were excluded from further analysis. In an additional quality control step, we excluded participants that (1) had withdrawn their consent, (2) were related to other participants according to the UKB team (i.e., subjects with most inferred relatives, third degree or closer, were removed until no related subjects were present), (3) reported a gender that did not match their genetic gender, or (4) showed sex-chromosome aneuploidy. After filtering availability of imaging data, and MRI scan quality, 17,062 individuals remained for analysis.
GWAS of BV in UKB
The genome-wide association analysis (GWAS) of BV in the UKB data was conducted in PLINK67,68, using a linear regression model with additive allelic effects. In order to correct for potential subtle population stratification effects, we included 10 genetic PC’s as covariates. Genetic PC’s were computed using FlashPCA269 in the QC’ed subset of unrelated European subjects, retaining only independent (r2 < 0.1), relatively common (MAF > 0.01) and genotyped or very high imputation quality (INFO = 1) variants (n = 145,432 markers). We refer to Supplementary Note 7 for a brief comparison of the UKB BV GWAS results including either the first 10 or the first 100 genetic PC’s as covariates. Additional covariates included in the analysis were: age, sex, genotype array, assessment center, and standing height (since the genetic correlation between BV and height in the UKB is rg = 0.26). We also included the TDI as covariate to correct for potential confounding factors. TDI is strongly associated with individual levels of deprivation70 and health-related factors including self-reported health19, life style factors (smoking71, vegetable intake72), chronic illness73, and all cause mortality20. We believe that inclusion of TDI as a covariate is the most appropriate correction for SES and health-related factors, since this measure has a correlational pattern with a wide variety of potential confounding factors, and because of its availability in all UKB participants.
Conditional GWAS (mtCOJO)
Multi-trait-based conditional and joint analysis using GWAS summary data (mtCOJO55) can be used to conduct conditional GWAS analyses. The mtCOJO method is integrated in the GCTA74 software, and requires summary-level GWAS data to obtain GWAS results for phenotype A that are conditioned on the genetic signal of phenotype B. In the current study, we used mtCOJO to correct the GWAS summary statistics of ICV (ENIGMA) and head circumference (HC-GWAS) for their overlap with the genetic signal of height. The height summary statistics used in mtCOJO were obtained from a GWAS that we ran on standing height in the UKB sample (N = 385,748). The conditioned GWAS results were subsequently included in the GWAS meta-analysis of BV described in the following section.
GWAS meta-analysis of BV
Before carrying out the meta-analysis of BV, we performed additional filtering and prepared the summary statistics for each of the individual cohorts. First, within the ENIGMA data and the HC-GWAS data, variants that were available for N < 5000 were excluded. Second, in the UKB data we identified some instances where there were multiple variants on the same position, whilst having different alleles. These variants were excluded from further analysis. Lastly, we aligned the allele coding of indels in the ENIGMA data to match the coding in the UKB data. A number of indels that were not present in the UKB data were excluded from analysis, since we used the UKB data as reference data for downstream analyses. The summary statistics from the UKB, ENIGMA, and the HC-GWAS consortium showed strong genetic correlations (rg ranging between 0.75 and 1; see Supplementary Data 1), supporting our choice for a meta-analytic approach.
Using a sample-size weighted z-score method in METAL26 we combined the GWAS on BV in the UKB data (N = 17,062), the GWAS on ICV (conditioned on height) in the ENIGMA results (N = 11,373), and the GWAS on HC (conditioned on height) in the HC-GWAS results (N = 18,881), resulting in a total sample size of N = 47,316 (see Supplementary Fig. 1).
Intelligence - GWAS summary statistics
We used recently published GWAS meta-analysis summary statistics of intelligence53 to study the genetic overlap with BV. Data collection procedures and methods are described in detail elsewhere53. In comparing the results of BV and intelligence, we updated some of the downstream analyses for intelligence (e.g., we used FUMA version 1.3.5 instead of 1.3.0, we explored a slightly different collection of gene sets, and used GTEx data v7 instead of v6.1). Therefore, gene-based findings for intelligence presented in the current study may deviate slightly from those presented in the original study53 (Supplementary Data 6). We applied Bonferroni correction for multiple testing to the meta-analytic variants P-values to identify to intelligence associated genetic variants.
Genomic risk loci and functional annotation
We used FUMA27 (v1.3.5) for the functional annotation of the BV meta-analysis results. FUMA is an online platform that takes GWAS summary statistics as input, and subsequently annotates, prioritizes, and visualizes the results.
Prior to defining genomic risk loci, FUMA identifies variants that are GWS (5 × 10−8), and independent (r2 < 0.6) as independent significant variants. In the next step, independent significant variants that are independent from each other at r2 < 0.1 are denoted lead variants. Genotypes from the UKB were used as reference data to infer LD. Finally, FUMA characterizes genomic risk loci by merging LD blocks that are located close to each other (<250 kb apart). Thus, it is possible that one genomic risk locus contains multiple independent significant variants or lead variants.
In order to obtain information on the functional consequences of variants on genes, FUMA performs ANNOVAR75 gene-based annotation using Ensembl genes (build 85) for all variants in LD (r2 > 0.6) with one of the independent significant variants and having an association P-value lower than 1 × 10−5. In addition, Combined Annotation Dependent Depletion (CADD) scores76, Regulome DB scores77, and 15-core chromatin state78,79 are annotated to variants by matching chromosome position, reference, and alternative alleles. CADD scores can be used to prioritize genetic variants that are likely to be pathogenic and/or deleterious (CADD scores >12.37 suggest a variant is deleterious). The score is a single measure combining various annotations, and has been shown to correlate with pathogenicity, disease severity, and experimentally measured regulatory effects and complex trait associations76. RegulomeDB scores77 characterize variants by their likelihood to have a regulatory function (with lower scores indicating higher probability of regulatory function). Scores range from 7, meaning that there is no evidence of the variant having a regulatory function, to 1a, meaning that a variant is likely to affect binding and is linked to the expression of a gene target77. Chromatin state was predicted by ChromHMM79 for 127 cell types, using 15 states to classify and describe variants.
Using FUMA27, all variants in genomic risk loci that were GWS (P < 5 × 10−8) or were in LD (r2 > 0.6) with one of the independent significant variants, were mapped to genes. Variants could be annotated to a gene by either of three strategies. First, positional mapping maps variants to protein-coding genes based on physical proximity (i.e., within 10 kb window). Second, eQTL mapping maps variants to genes whose expression is associated with allelic variation at the variants level. Information on eQTLs was derived from three publicly available data repositories; GTEx80 (v7), the Blood eQTL browser81, and the BIOS QTL browser82. This strategy maps variants to genes up to 1 Mb apart (cis-eQTLs). We applied a false discovery rate (FDR) of ≤0.05 to limit the results to significant variant gene pairs. Third, variants were mapped to genes based on significant chromatin interactions between promoter regions of genes (250 bp up- and 500 bp downstream of the TSS) and a genomic region in a risk locus. In contrast to eQTL mapping, and in the absence of a distance boundary, chromatin interaction mapping may involve long-range interactions. The resolution of chromatin interactions was defined as 40 kb, and hence, interaction regions may comprise multiple genes. In order to prioritize genes implicated by chromatin interaction mapping, information on predicted enhancers and promoters in 111 tissue/cell types from the Roadmap Epigenomics Project78 was integrated. We used FUMA to filter on chromatin interactions for which one interaction region overlapped with predicted enhancers, and the other with predicted promoters 250 bp up- and 500 bp downstream of the TSS site of a gene. At the time of writing, FUMA contained Hi-C data of 14 tissue types from the study of Schmitt et al.83. An FDR of 1 × 10−5 was used to define significant interactions.
Genome-wide gene-based analysis (GWGAS) has the potential to identify genes associated to a trait of interest despite the genetic signal of individual variants in or nearby the gene not reaching genome-wide significance in variant-based analyses. Using the P-values from the variant-based analysis as input, GWGAS tests the joint signal of all variants in a gene with the phenotype, while accounting for LD between those variants. In addition to gene-mapping in FUMA, we therefore also conducted gene-based analysis in MAGMA30 to assess the joint effect of all variants within all 19,427 protein-coding genes included in the NCBI 37.3 database. MAGMA requires as input the P-values derived from variant-based analyses, in this case, the BV meta-analysis results. All variants in our BV meta-analysis were annotated to genes, resulting in 18,168 genes that contained at least one variant in the BV meta-analysis. Besides variants located within a gene, we also included variants lying within 2 kb before and 1 kb after the TSS of the gene. We used Entrez ID as the primary gene ID. MAGMA’s gene-based analysis uses a multiple linear PC regression, where an F-test is used to compute the gene P-value. The model takes linkage disequilibrium between variants into account. Genes were considered to be genome-wide significantly associated if the P-value survived a Bonferroni correction for multiple testing (0.05/number of genes tested: P < 2.75 × 10−6).
Functional gene set analysis
Gene set analysis was performed using MAGMA30, testing 12,191 predefined gene sets in an exploratory fashion. Selected gene sets included canonical pathways (n = 2199) and GO gene sets (n = 9996). All gene sets were obtained from the Molecular Signatures Database (MSigDB, version 7.0). For all gene set analyses, competitive, rather than self-contained, P-values are reported. Competitive gene set analysis tests whether the joint association of genes in a gene set with the phenotype of interest is stronger than that of a randomly selected set of genes of similar size. This approach provides stronger evidence for association of the gene set compared to a self-contained test, where the joint association of genes in a gene set with the phenotype is tested against the null hypothesis of no effect.
To assess whether genes associated to our traits of interest are disproportionately expressed in certain tissue- and cell types, we applied MAGMA’s gene-expression analysis to investigate associations with several gene expression profiles. First, we tested tissue gene-expression in 53 different tissue types obtained from the GTEx portal (v.7), which include gene-expression data from 13 brain tissue types39. Secondly, we tested gene-expression in 565 distinct adult mouse brain cell types from Dropviz40. These data were collected through the use of the Drop-seq technique84 by assessing RNA expression in 690,000 individual cells from nine brain regions of the adult mice brain, which were subsequently grouped to 565 transcriptionally distinct groups of cell types.
Gene set and gene-expression analyses were Bonferroni corrected for the total number of gene sets, tissue types and single-cell types tested in MAGMA (P < 3.90 × 10−6 (=0.05/(12,191 + 53 + 565)).
Conditional gene set analyses
In order to gain more insight in the genetic pathways associated to BV and intelligence, we performed conditional gene set analyses using MAGMA30. Conditional analyses are conducted to identify MsigDB gene sets that represent independent associations (i.e., in a regression-based framework, we assessed the association between our trait of interest (e.g., BV) and a gene set, conditional on another trait (e.g., intelligence)). Specifically, we determined which gene set associations remain for BV when we condition on intelligence, and vice versa. This approach provides information on whether gene sets are uniquely associated to BV or intelligence, or rather, shared between both traits. For example, if the P-value of association between a gene set and BV increases when conditioning on intelligence (i.e., less significant), then this suggests that the gene set association is likely shared between both traits, while if the P-value between a gene set and BV is unaffected by conditioning on intelligence, then this implies that the association is specific to BV, i.e., not shared with intelligence. Note however that even when the association between a gene set and a phenotype is largely explained by a second, overlapping, gene set, its conditional P-value will still be reduced compared to its marginal association P-value. Therefore, de Leeuw et al.38 advise to assess the relative degree of association remaining for each gene set. In addition, we conducted pairwise conditional gene set analysis for all 18 gene sets that were significantly associated to BV. Gene-sets that showed a significant conditional P-value in more than half of the conditional analyses were denoted as ‘largely independent’ gene-set associations.
SNP-based heritability and genetic correlations
We used LDSC21 to estimate the proportion of phenotypic variance that can be explained by common variants, a statistic known as SNP-based heritability, h2SNP. We used precomputed LD scores that were calculated using 1000 Genomes European data.
Genetic correlations (rg) between the signal from our BV meta-analysis, intelligence53, and 24 psychiatric, behavioral, and lifestyle-related traits for which summary-level data were available, were also calculated using LDSC85. The rg estimated by LDSC is an unbiased estimate, and may exceed [−1, 1] when standard errors are large, and the genetic correlation between studies is high.
Genetic correlations for which the P-value survived the correction for multiple testing (Bonferroni-corrected P < 0.002 (=0.05/25)) were considered significant.
In order to determine whether some functional categories of the genome contribute more than others to the SNP-heritability (h2SNP) of BV, we performed stratified LDSC29. Using this method, we calculated whether any of the 28 specific genomic categories included in the analysis was enriched for variants that contribute to h2SNP. Enrichment here is defined as the proportion of h2SNP in a given category divided by the proportion of variants in that category (e.g., if enrichment in intronic regions is 4.75, this indicates that this functional category has a 4.75-fold higher contribution to h2SNP relative to the proportion of variants annotated to that category).
MR analysis was performed using generalized summary-data-based Mendelian randomization (GSMR55). The main goal was to examine whether the genetic correlation between BV and intelligence (rg = 0.24) might be explained by directional effects. Analyses were conducted using forward and reverse GSMR, testing for uni- and bidirectional effects between BV and intelligence. Variants that showed clear pleiotropic effects on the exposure phenotype and the outcome (i.e., instruments that are associated to both BV as well as intelligence) were excluded from analysis using the HEIDI outlier method. In addition, we excluded variants that are in LD (r2 > 0.1) with one another across both tests (i.e., variants that are associated to either BV or intelligence, but are in LD with a second variant that is used as an instrument for the other trait).
Linking expression of overlapping genes to brain regions
We investigated whether genes overlapping between BV and intelligence show different brain expression patterns than non-overlapping genes, which may provide evidence of whether certain brain areas are more involved in the genetic association between BV and intelligence. To this end, we first performed gene-mapping with MAGMA and FUMA based on the summary statistics of the GWAS of each trait separately, after which we extracted the set of genes implicated by any of the gene-mapping strategies in both traits. Subsequently, we extracted the cortical gene-expression profile for each of the overlapping genes from the AHBA61, which describes gene-expression data per gene across distinct cortical areas. An expression profile of the set was obtained by taking the average expression across the 92 genes. We also performed clustering on the individual correlation patterns for each of the expression profiles of each of the 92 genes separately, by computing a 92 × 92 gene-to-gene correlation matrix, with the level of correlation between two genes taken as the Pearson correlation coefficient between their cortical expression profiles. This matrix was subdivided into clusters (also called modules) using Newman’s modularity algorithm86, maximizing for each cluster the intra-cluster cohesion and minimizing cohesion between genes of different clusters. Per cluster, a cluster-level expression profile was computed by calculating the average gene expression profile across all genes in the given cluster, resulting in clusters of genes showing distinct brain maps.
Interaction gene set analyses
Aiming to determine whether genes associated to both BV and intelligence are over- (or under-) expressed in specific brain regions, we performed interaction analysis as implemented in MAGMA (v1.07b). This analysis tested whether the combined involvement of a gene set (here: genes related to BV) and continuous gene properties (here: gene expression) is different from their individual effects on intelligence. A positive interaction effect would suggest that, within the set of BV-related genes, the relation between expression and gene-based Z-scores for intelligence is stronger compared to other genes (i.e., all genes for which both expression data from AHBA, as well as gene-based Z-scores for intelligence were available). This would indicate that gene expression in that specific region plays a particular role in the genetic relation between BV and intelligence.
An interaction analysis was conducted separately for gene expression in each of 57 brain regions defined in the AHBA61. To control for potential variation in general expression levels across genes, we conditioned on the marginal effects of the BV-related gene set and average gene expression across all 57 regions, as well as for their interaction. The set of BV-related genes consisted of all genes that were significant in MAGMA’s gene-based analysis and/or mapped from variant-level results in FUMA. Interaction effects were deemed significant if their P-value exceeded the Bonferroni-corrected threshold of P < 8.77 × 10−4 (0.05/57).
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Our policy is to make genome-wide summary statistics publicly available. Summary statistics from our BV meta-analysis and the height-corrected summary statistics in UKB are available for download at the website of the Department of Complex Trait Genetics, CNCR: https://ctg.cncr.nl/software/summary_statistics. Data restrictions do not allow redistribution of the summary statistics for ICV and head circumference. To replicate our results, the uncorrected summary statistics can be requested for download from the ENIGMA and Max Planck Institute websites. The variant effect sizes can be subsequently corrected for height (summary statistics available from GWAS ATLAS; ID 3187) using the mtCOJO software. The GWAS summary statistics from the intelligence meta-analysis conducted by Savage et al.53 are also available from the CTG website.
Hogan, M. J. et al. Cerebellar brain volume accounts for variance in cognitive performance in older adults. Cortex 47, 441–450 (2011).
MacLullich, A. M. J. et al. Intracranial capacity and brain volumes are associated with cognition in healthy elderly men. Neurology 59, 169–174 (2002).
Ritchie, S. J. et al. Beyond a bigger brain: multivariable structural brain imaging and intelligence. Intelligence 51, 47–56 (2015).
Lee, J. J., McGue, M., Iacono, W. G., Michael, A. M. & Chabris, C. F. The causal influence of brain size on human intelligence: evidence from within-family phenotypic associations and GWAS modeling. Intelligence 75, 48–58 (2019).
Epstein, H. T. & Epstein, E. B. The relationship between brain weight and head circumference from birth to age 18 years. Am. J. Phys. Anthr. 48, 471–474 (1978).
Vernon, P. A., Wickett, J. C., Bazana, P. G. & Stelmack, R. M. in Handbook of Intelligence 245–264 (Cambridge University Press, 2000).
McDaniel, M. A. Big-brained people are smarter: a meta-analysis of the relationship between in vivo brain volume and intelligence. Intelligence 33, 337–346 (2005).
Pietschnig, J., Penke, L., Wicherts, J. M., Zeiler, M. & Voracek, M. Meta-analysis of associations between human brain volume and intelligence differences: how strong are they and what do they mean? Neurosci. Biobehav. Rev. 57, 411–432 (2015).
Nave, G., Jung, W. H., Karlsson Linnér, R., Kable, J. W. & Koellinger, P. D. Are bigger brains smarter? Evidence from a large-scale preregistered study. Psychol. Sci. 30, 43–54 (2019).
Deary, I. J., Penke, L. & Johnson, W. The neuroscience of human intelligence differences. Nat. Rev. Neurosci. 11, 201–211 (2010).
Haworth, S. et al. Low-frequency variation in TP53 has large effects on head circumference and intracranial volume. Nat. Commun. 10, 357 (2019).
Adams, H. H. H. et al. Novel genetic loci underlying human intracranial volume identified through genome-wide association. Nat. Neurosci. 19, 1569–1582 (2016).
Hibar, D. P. et al. Common genetic variants influence human subcortical brain structures. Nature 520, 224–229 (2015).
Savage, J. E. et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat. Genet. https://doi.org/10.1038/s41588-018-0152-6 (2018).
Sniekers, S. et al. Genome-wide association meta- Analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat. Genet. https://doi.org/10.1038/ng.3869 (2017).
Davies, G. et al. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53 949). Mol. Psychiatry https://doi.org/10.1038/mp.2014.188 (2015).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Townsend, P. Deprivation. J. Soc. Policy 16, 125–146 (1987).
Adams, J., Ryan, V. & White, M. How accurate are Townsend Deprivation Scores as predictors of self-reported health? A comparison with individual level data. J. Public Health27, 101–106 (2005).
Smith, G. D., Whitley, E., Dorling, D. & Gunnell, D. Area based measures of social and economic circumstances: cause specific mortality patterns depend on the choice of index. J. Epidemiol. Community Heal 55, 149–150 (2001).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Yang, J. et al. Genomic inflation factors under polygenic inheritance. Eur. J. Hum. Genet. 19, 807–812 (2011).
Hshieh, T. T. et al. Head circumference as a useful surrogate for intracranial volume in older adults. Int Psychogeriatr. 28, 157–162 (2017).
Martini, M., Klausing, A., Lüchters, G., Heim, N. & Messing-Jünger, M. Head circumference - a useful single parameter for skull volume development in cranial growth analysis? Head. Face Med. 14, 1–8 (2018).
Wolf, H., Kruggel, F., Hensel, A., Wahlund, L. & Arendt, T. The relationship between head size and intracranial volume in elderly subjects. Brain Res. 973, 74–80 (2003).
Willer, C. J., Li, Y., Abecasis, G. R. & Overall, P. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Watanabe, K., Taskesen, E., Bochoven, A. van & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Müller, S. A., Scilabra, S. D. & Lichtenthaler, S. F. Proteomic substrate identification for membrane proteases in the brain. Front. Mol. Neurosci. 9, 96 (2016).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Publ. Gr. 47, 1228–1235 (2015).
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: Generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, 1–19 (2015).
Ikram, M. A. et al. Common variants at 6q22 and 17q21 are associated with intracranial volume. Nat. Genet. 44, 539–544 (2012).
Hibar, D. P. et al. Common genetic variants influence human subcortical brain structures. Nature 520, 224–229 (2015).
Taal, H. R. et al. Common variants at 12q15 and 12q24 are associated with infant head circumference. Nat. Genet. 44, 532 (2012).
McKusick, V. A. Mendelian Inheritance in Man and its online version, OMIM. Am. J. Hum. Genet. 80, 588–604 (2007).
Tschopp, O. et al. Essential role of protein kinase Bγ (PKBγ/Akt3) in postnatal brain development but not in glucose homeostasis. Development 132, 2943–2954 (2005).
Lee, J. H. et al. De novo somatic mutations in components of the PI3K-AKT3-mTOR pathway cause hemimegalencephaly. Nat. Genet. 44, 941 (2012).
Poduri, A. et al. Somatic activation of AKT3 causes hemispheric developmental brain malformations. Neuron 74, 41–48 (2012).
de Leeuw, C. A., Stringer, S., Dekkers, I. A., Heskes, T. & Posthuma, D. Conditional and interaction gene-set analysis reveals novel functional pathways for blood pressure. Nat. Commun. 9, 3768 (2018).
Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Saunders, A. et al. Molecular diversity and specializations among the cells of the adult mouse brain. Cell 174, 1015–1030 (2018).
Nishiyama, A., Komitova, M., Suzuki, R. & Zhu, X. Polydendrocytes (NG2 cells): multifunctional cells with lineage plasticity. Nat. Rev. Neurosci. 10, 9–22 (2009).
Pfeiffer, S. E., Warrington, A. E. & Bansal, R. The oligodendrocyte and its many cellular processes. Trends Cell Biol. 3, 191–197 (1993).
Wilhelm, I., Nyúl-Tóth, Á., Suciu, M., Hermenean, A. & Krizbai, I. A. Heterogeneity of the blood-brain barrier. Tissue Barriers 4, e1143544 (2016).
Sacco, R., Gabriele, S. & Persico, A. M. Head circumference and brain size in autism spectrum disorder: a systematic review and meta-analysis. Psychiatry Res. - Neuroimaging 234, 239–251 (2015).
van der Schot, A. C. et al. Influence of genes and environment on brain volumes in twin pairs concordant and discordant for bipolar disorder. Arch. Gen. Psychiatry 66, 142–151 (2009).
Posthuma, D. et al. The association between brain volume and intelligence is of genetic origin. Nat. Neurosci. 5, 83 (2002).
Posthuma, D. et al. Genetic correlations between brain volumes and the WAIS-III dimensions of verbal comprehension, working memory, perceptual organization, and processing speed. Twin Res. Hum. Genet. 6, 131–139 (2003).
Posthuma, D. et al. Multivariate genetic analysis of brain structure in an extended twin design. Behav. Genet. 30, 311–319 (2000).
Smeland, O. B. et al. Genetic OVerlap between Schizophrenia and Volumes of Hippocampus, Putamen, and Intracranial Volume Indicates Shared Molecular Genetic Mechanisms. Schizophr. Bull. 44, 854–864 (2018).
Okbay, A. et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature 533, 539 (2016).
Zhao, B. et al. Genome-wide association analysis of 19,629 individuals identifies variants influencing regional brain volumes and refines their genetic co-architecture with cognitive and mental health traits. Nat. Genet. 51, 1637–1644 (2019).
Roth, G. & Dicke, U. Evolution of the brain and intelligence. Trends Cogn. Sci. 9, 250–257 (2005).
Savage, J. E. et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat. Genet. 50, 912–919 (2018).
Johnson, W., Bouchard, T. J., Krueger, R. F., McGue, M. & Gottesman, I. I. Just one g: consistent results from three test batteries. Intelligence 32, 95–107 (2004).
Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 9, 1–12 (2018).
Davies, N. M., Holmes, M. V. & Davey Smith, G. Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ 362, k601 (2018).
Rebhan, M., Chalifa-Caspi, V., Prilusky, J. & Lancet, D. GeneCards: a novel functional genomics compendium with automated data mining and query reformulation support. Bioinformatics 14, 656–664 (1998).
Firth, H. V. et al. DECIPHER: database of chromosomal imbalance and phenotype in humans using ensembl resources. Am. J. Hum. Genet. 84, 524–533 (2009).
Grazioplene, R. G. et al. Subcortical intelligence: caudate volume predicts IQ in healthy adults. Hum. Brain Mapp. 36, 1407–1416 (2015).
Bohlken, M. M. et al. Genes contributing to subcortical volumes and intellectual ability implicate the thalamus. Hum. Brain Mapp. 35, 2632–2642 (2014).
Sunkin, S. M. et al. Allen Brain Atlas: an integrated spatio-temporal portal for exploring the central nervous system. Nucleic Acids Res. 41, D996–D1008 (2012).
Wallace, C. et al. Statistical colocalization of monocyte gene expression and genetic risk variants for type 1 diabetes. Hum. Mol. Genet. 21, 2815–2824 (2012).
Davies, G. et al. Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function. Nat. Commun. 9, 1–16 (2018).
Sniekers, S. et al. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat. Genet. 49, 1107 (2017).
Solovieff, N., Cotsapas, C., Lee, P. H., Purcell, S. M. & Smoller, J. W. Pleiotropy in complex traits: challenges and strategies. Nat. Publ. Gr. 14, 483–495 (2013).
Webb, B. T. et al. Molecular genetic influences on normative and problematic alcohol use in a population-based sample of college students. Front. Genet. 8, 1–11 (2017).
Purcell, S. et al. REPORT PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, s13742–015 (2015).
Abraham, G. & Inouye, M. Fast principal component analysis of large-scale genome-wide data. PLoS ONE 9, 1–5 (2014).
Stafford, M. & Marmot, M. Neighbourhood deprivation and health: does it affect us all equally? Int. J. Epidemiol. 32, 357–366 (2003).
Shohaimi, S. et al. Residential area deprivation predicts smoking habit independently of individual educational level and occupational social class. a cross sectional study in the Norfolk cohort of the European Investigation into Cancer (EPIC-Norfolk). J. Epidemiol. Community Heal. 57, 270–276 (2003).
Shohaimi, S. et al. Residential area deprivation predicts fruit and vegetable consumption independently of individual educational level and occupational social class: a cross sectional population study in the Norfolk cohort of the European Prospective Investigation into Canc. J. Epidemiol. Community Heal 58, 686–691 (2004).
Jordan, H., Roderick, P. & Martin, D. The Index of Multiple Deprivation 2000 and accessibility effects on health. J. Epidemiol. Community Heal. 58, 250–257 (2004).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164–e164 (2010).
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310–315 (2014).
Boyle, A. P. et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22, 1790–1797 (2012).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–320 (2015).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
The GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).
Westra, H.-J. et al. Systematic identification of trans-eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).
Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet 49, 139–145 (2017).
Schmitt, A. D. et al. A compendium of chromatin contact maps reveals spatially active regions in the human genome. Cell Rep. 17, 2042–2059 (2016).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Bulik-Sullivan, B. K. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1–9 (2015).
Newman, M. E. J. Modularity and community structure in networks. PNAS 103, 8577–8582 (2006).
The full GWAS summary statistics from the ENIGMA GWAS meta-analysis were downloaded from http://enigma.ini.usc.edu/research/download-enigma-gwas-results/. Data on infant head circumference has been contributed by HC-GWAS and was downloaded from https://hdl.handle.net/1839/ff12326d-9688-4a46-bc2a-b50cbbd2b20c. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from the GTEx Portal on 06/12/2018. This study makes use of data generated by the DECIPHER community. A full list of centers that contributed to the generation of the data is available from http://decipher.sanger.ac.uk and via email from email@example.com. Funding for the project was provided by the Wellcome Trust. This work was funded by The Netherlands Organization for Scientific Research (NWO Brain & Cognition 433-09-228, NWO MagW VIDI 452-12-014, NWO VICI 435-14-005 and 453-07-001, 645-000-003). Analyses were carried out on the Genetic Cluster Computer, which is financed by the Netherlands Scientific Organization (NWO: 480-05-003), by the VU University, Amsterdam, the Netherlands, and by the Dutch Brain Foundation, and is hosted by the Dutch National Computing and Networking Services SurfSARA. This research has been conducted using the UKB Resource (application number 16406). We would like to thank the participants and researchers who collected and contributed to the data.
The authors declare no competing interests.
Peer review information Nature Communications thanks Carrie Bearden, David Howard and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Jansen, P.R., Nagel, M., Watanabe, K. et al. Genome-wide meta-analysis of brain volume identifies genomic loci and genes shared with intelligence. Nat Commun 11, 5606 (2020). https://doi.org/10.1038/s41467-020-19378-5
Nature Genetics (2021)