Despite great interest in using magnetic resonance imaging (MRI) for studying the effects of genes on brain structure in humans, current approaches have focused almost entirely on predefined regions of interest and have had limited success. Here, we used multivariate methods to define a single neuroanatomical score of how Williams Syndrome (WS) brains deviate structurally from controls. The score was trained and validated on measures of T1 structural brain imaging in two WS cohorts (training, n = 38; validating, n = 60). We then associated this score with single nucleotide polymorphisms (SNPs) in the WS hemi-deleted region in five cohorts of neurologically and psychiatrically typical individuals (healthy European descendants, n = 1863). Among 110 SNPs within the 7q11.23 WS chromosomal region, we found one associated locus (p = 5e−5) located at GTF2IRD1, which has been implicated in animal models of WS. Furthermore, the genetic signals of neuroanatomical scores are highly enriched locally in the 7q11.23 region compared with summary statistics based on regions of interest, such as hippocampal volumes (n = 12,596), and also globally (SNP-heritability = 0.82, se = 0.25, p = 5e−4). The role of genetic variability in GTF2IRD1 during neurodevelopment extends to healthy subjects. Our approach of learning MRI-derived phenotypes from clinical populations with well-established brain abnormalities characterized by known genetic lesions may be a powerful alternative to traditional region of interest-based studies for identifying genetic variants regulating typical brain development.
The morphology of an adult brain represents a holistic snapshot of a unique neurodevelopmental history; its variations are an accumulation of dynamic processes working in concert with few constraints1. Different brain regions share the same original sets of proto-structures emerging from interactive molecular signaling programs during the early embryonic stage. Post-natal brain growth, myelination, and subsequent regressive processes leading to mature functional circuits provide further overlap in the processes giving rise to adult brain morphology. These developmental processes, furthermore, are guided by distributed patterns of gene expression and interactions with the environment, and operate under spatial constraints imposed by the cranium that may link the morphology of various parts of the adult brain1,2. Consequently, the perturbation of a developmentally critical gene often results in diverse morphological abnormalities not limited to a single brain region3,4,5. Given this, it is reasonable to expect that variability introduced into neurodevelopment by a genetic variant may contribute not only to variability in the MRI-derived morphology of a single delineated brain region, but also to covariance among multiple regions2.
However, genetic studies of neuroanatomy using magnetic resonance imaging (MRI) continue to prioritize morphological measures on specific landmark-defined brain regions, such as the volumes of subcortical nuclei6 or average thickness of cortical parcellations7. Although this approach captures some genetic effects of structural variations, it bypasses the fact that the morphological state of an adult brain is the sum of previous developmental processes across brain regions. These landmark-defined regions of interest (ROIs) therefore may have lost genetically relevant information by ignoring co-varied components, while concurrently introducing irrelevant variance by combining measures from genetically unrelated neighbors8.
The limitations of this ROI approach are most evident in the context of studying effects on neurodevelopment, as the age-dependent processes have been shown to consist of a gradient spreading across the cortical surface without a discernable relationship to traditional anatomical landmarks9. Past efforts to redefine the imaging phenotypes beyond landmark-based ROIs include learning a sparse representation from patients with Alzheimer’s disease8 or redrawing ROIs based on the genetic correlations from twin studies7,10. These methods can be conceptualized as projecting the multidimensional measures of MRI onto a lower dimensional axis while filtering out components irrelevant to the genetic signals. Such methods have seldom focused, however, on neurodevelopmental disorders, such as Williams Syndrome (WS), that have larger neuroanatomical impacts and more finite candidate genetic regions attributable to the neuroanatomical differences. Since statistical power is the most critical factor for identifying genes through associations11, a redefined MRI measure that contains more relevant genetic signals and reduces the burden of multiple comparisons can greatly facilitate the discovery of neurodevelopmental genes.
WS is a multi-systemic disorder caused by hemi-deletion of roughly 27 genes on chromosome 7, resulting in cardiovascular morbidities, intellectual impairment, and hypersociability12,13. Besides a decrease of about 11% in brain size, patients with WS have aberrant regionalization of cortical surfaces as assessed with brain MRI, particularly in superior parietal regions and the orbitofrontal cortex14,15,16,17,18. Animal models have suggested GTF2IRD1, a gene encoding a general transcription factor, as one of the most promising candidate genes for neuroanatomical differences in WS4,19,20,21. Genetic perturbations of GTF2IRD1 have recently been associated with dog friendliness toward humans22. Despite such findings in animal models, associations of this gene with brain or behavioral phenotypes in the healthy human population are lacking6. Without association studies on brain phenotypes in healthy human populations, it remains unclear whether common genetic variants in those genes have an impact on typical brain development.
Here, we describe a novel two-pronged approach to capturing genetic effects on neurodevelopment. First, by using a single score to represent global neuroanatomical variation and by taking a candidate-gene approach that examines only the WS region, we limit the effect-size requirements imposed by Bonferroni correction. Second, and more important, we increase the sensitivity of the anatomical phenotype by using a single derived score calculated from multidimensional MRI measures. In our previous work, we derived a single global measure that characterizes how WS brains are structurally different from controls, across multiple parameters in multiple locations23. In this study, we demonstrate that the WS neuroanatomical score can be regarded as an MRI endophenotype, enriched in genetic information pertaining to neurodevelopment. By applying the neuroanatomical scores to five imaging genetic cohorts with brain MRI and single nucleotide polymorphism (SNP) data (n = 1863 healthy individuals of European descent), we demonstrate, for the first time, that a common variant in GTF2IRD1 is associated with variation in brain structure (Bonferroni corrected p = 0.023). The genetic signals are more enriched than those of traditionally defined ROIs, and the score has significantly high SNP-heritability (h2 = 0.82, se = 0.25, p = 5e−4). Our results provide a proof of concept for the strategy of using multivariate structural measures as a derived intermediate phenotype for genetic association studies.
Materials and methods
Healthy imaging genetics cohorts
We selected 1863 healthy imaging genetics subjects from five independent cohorts: 184 from the Alzheimer’s Disease Neuroimaging Initiative (ADNI)24, 653 from the Nord-Trøndelag Health Study (HUNT)25, 325 from the Norwegian Cognitive NeuroGenetics (NCNG)26, 250 from the Thematically Organized Psychosis study (TOP)27, and 451 from the Pediatric Imaging Neurocognition and Genetics Study (PING)28. From each study, only healthy, unrelated, European-ancestry subjects were retained for analysis. Because the WS neuroanatomical scores were trained on an adult WS cohort23, residual confounding by age might affect the association. Given that the PING study contains the youngest individuals across all cohorts, we further stratified the PING sample into two subcohorts, one for those aged 16 years and older, and the other for those younger than 16. The cut-point of 16 years was chosen based on previous studies showing that most developmental changes in structural measures reach an asymptote by the age of 16 (refs. 29,30). Each study collected 3D T1 MRI images according to comparable acquisition protocols, and all images were processed with the same FreeSurfer reconstruction protocols. The processing protocols include bias correction, registration, segmentation, and 3D surface reconstruction, as implemented in FreeSurfer29,31. Studies using the same five imaging genetic cohorts show that genetic factors can be consistently estimated, demonstrating the success of protocol homogenization despite differences in scanners and recruiting sites7. Whole-genome genotypes were imputed according to the same Mach/Minimac procedure using the 1000 Genomes Project as a reference. Estimated dosages of 110 SNPs falling within the WS hemi-deletion region (chromosome 7q11.23, 72–74 Mb, hg19) were imputed with good quality in all cohorts and selected for analysis. Demographics and detailed summaries of data acquisition and processing for each cohort are presented in the Supplementary materials.
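The SNP selection step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the record layout (`ImputedSnp`), the function names, and the imputation-quality cutoff `MIN_IMPUTATION_R2` are hypothetical, since the text states only that SNPs were "imputed with good quality in all cohorts". Only the region bounds (chromosome 7, 72–74 Mb, hg19) come from the text.

```python
from dataclasses import dataclass

# Region bounds taken from the text (chromosome 7, 72-74 Mb, hg19).
WS_CHROM = "7"
WS_START = 72_000_000
WS_END = 74_000_000

# Hypothetical cutoff: the text says only "good quality".
MIN_IMPUTATION_R2 = 0.8


@dataclass(frozen=True)
class ImputedSnp:
    """One imputed SNP as reported for a single cohort (illustrative layout)."""
    rsid: str
    chrom: str
    pos: int     # base-pair position, hg19
    r2: float    # imputation quality in this cohort


def well_imputed_in_region(snps):
    """Return the rsIDs inside the WS region that pass the quality cutoff."""
    return {
        s.rsid
        for s in snps
        if s.chrom == WS_CHROM
        and WS_START <= s.pos <= WS_END
        and s.r2 >= MIN_IMPUTATION_R2
    }


def shared_across_cohorts(cohorts):
    """Keep only SNPs that pass the filter in every cohort.

    `cohorts` maps a cohort name to its list of ImputedSnp records.
    """
    per_cohort = [well_imputed_in_region(snps) for snps in cohorts.values()]
    return sorted(set.intersection(*per_cohort)) if per_cohort else []
```

Taking the intersection across cohorts mirrors the requirement that each SNP be usable in every cohort before the per-cohort estimates are combined.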
WS neuroanatomical scores
We used a penalized regression model to calculate WS neuroanatomical scores given individuals’ MRI measures. Full details of the training and validation of the model have been published elsewhere23. Briefly, 3D T1 MRI images were obtained for 22 WS patients and 16 healthy controls. A multivariate regularized logistic regression was trained to discriminate WS patients from healthy controls on the basis of 30,760 predictors, including estimated cortical surface area10, cortical surface geometry30, and sulcal depths16 for each of 5124 reconstructed vertices and the volumes of 16 subcortical structures31. When the model was trained in the WS cohort, intra-cranial volume (ICV) was used as a covariate to ensure that overall brain size was not driving the classification. The WS neuroanatomical score therefore captures the subtle morphological reorganization of the WS brain. For each subject in our healthy imaging genetics cohorts, we applied the resulting discriminative weights to the same neuroimaging feature space, summarizing this high-dimensional data with a single, composite neuroanatomical score reflecting morphological variations on the axis between healthy individuals and patients with WS. Figure 1 illustrates the flowchart of the analytic strategy and visualization of the weights for contributing neuroimaging measures to the final composite scores. Weights of each imaging measure included in the analyses can be found in Supplementary Figure 1.
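As a minimal sketch of how trained discriminative weights can reduce a high-dimensional feature vector to one composite score, the Python example below computes a weighted sum and passes it through the logistic function. The function names and the intercept are illustrative. Whether the published score is the linear predictor itself or its logistic transform is not stated here, so the transform is an assumption, and the ICV adjustment described above is omitted.

```python
import math


def linear_predictor(features, weights, intercept=0.0):
    """Weighted sum of one subject's imaging features."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


def neuroanatomical_score(features, weights, intercept=0.0):
    """Map the linear predictor to the interval [0, 1] with the logistic function.

    Assumption: the logistic transform is used for illustration only; the
    text does not say which scale the published score is reported on.
    """
    z = linear_predictor(features, weights, intercept)
    # Numerically stable logistic function for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

In this form, a subject whose features are weighted toward the WS end of the axis receives a score closer to 1, and features with a weight of zero (as produced by a sparse penalty) do not affect the score.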
Candidate region association analysis
Each imputed SNP dosage was regressed against the WS neuroanatomical score while controlling for age, age squared, gender, and the first seven principal components of genetic ancestry as potentially confounding covariates. Although previous analyses from our group had shown consistent estimation of genetic effects across the five cohorts7, we used meta-analysis to account for potential bias resulting from scanner differences. We estimated the SNP effects in each cohort separately and combined them post hoc using an inverse variance weighted meta-analysis implemented in PLINK. To account for multiple comparisons, we used a Bonferroni adjustment for the 110 linked SNPs. Our significance threshold was set to p < 0.05/110 = 4.5e−4, which is conservative because the 110 tests are correlated. We then used CAVIAR to determine which SNP is the potential causal variant32.
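The pooling step can be sketched in a few lines of Python. This is the standard fixed-effect inverse variance weighted estimator written out by hand, not PLINK's implementation; the choice of the fixed-effect form and the normal approximation for the p-value are assumptions, and the function name is illustrative. The Bonferroni threshold follows the arithmetic given in the text.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse variance weighted meta-analysis.

    `betas` and `ses` hold one effect estimate and one standard error per
    cohort. Returns the pooled effect, its standard error, and a two-sided
    p-value from the normal approximation.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]   # precision of each cohort
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal tail
    return beta, se, p


N_TESTS = 110
BONFERRONI_ALPHA = 0.05 / N_TESTS   # about 4.5e-4, as in the text
```

Cohorts with smaller standard errors receive larger weights, so a cohort with a noisy estimate has little influence on the pooled effect. A pooled p-value would then be compared against `BONFERRONI_ALPHA`.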
Local enrichment and global SNP-heritability
To demonstrate the enrichment of local genetic signals with the newly defined WS neuroanatomical scores, we used quantile–quantile plots comparing –log10(p) between our SNP associations in the WS chromosomal region and summary statistics obtained from the traditional ROI approach as reported by the ENIGMA consortium (n = 12,596)6. Despite the scale of our cohorts, the sample size is considered modest in the context of genome-wide association studies. Therefore, to avoid under-powered genome-wide analyses while quantifying the global genetic signals of the WS neuroanatomical score, we used Genome-wide Complex Trait Analysis (GCTA)33 to estimate the variance explained by all of the SNPs across the entire genome (i.e., the SNP-heritability). The genetic relationship matrix was calculated across all cohorts using GCTA, and the SNP-heritability was then derived while controlling for age, age squared, gender, cohort membership, and the first seven principal components of genetic ancestry.
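The coordinates of a quantile–quantile plot can be computed as in the following Python sketch. The plotting position (i − 0.5)/n for the expected p-values under the null is one common convention and is an assumption here, as is the function name; the text does not specify how the plots were constructed.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a quantile-quantile plot.

    Expected values come from a uniform null distribution using the
    plotting position (i - 0.5) / n. Both axes are sorted in ascending
    order, so the smallest observed p-value is paired with the smallest
    expected p-value. All p-values must lie in (0, 1].
    """
    n = len(p_values)
    if n == 0:
        return []
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

Points lying above the diagonal (observed greater than expected) indicate more small p-values than the null predicts, which is the pattern read as enrichment when two sets of association results are compared over the same region.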
Results
The training and validation of the WS neuroanatomical scores have been published elsewhere23. In short, the derived neuroanatomical scores robustly distinguished WS from other groups in both the training set (leave-one-out cross-validation area under the curve of 100%) and the validation set (area under the curve of 100%). The composite WS score significantly mediates the cognitive differences between cases and controls, especially on tests quantifying social behaviors23. Having derived this multivariate measure which characterizes WS, we then applied the score to healthy imaging genomic cohorts. Each healthy individual’s MRI measures were combined into one single score given the derived weights of the WS neuroanatomical score (Supplementary Figure 1). The scores of cohort members were normally distributed (mean: 0.6, SD: 0.09) and not correlated with genetic ancestry (absolute Pearson correlations <0.2, p > 0.05). None of the cohort members had been identified as patients with WS, and none met the anatomical criterion for WS we derived in our earlier work23.
The associations between SNPs and the neuroanatomical score in the imaging genomic cohorts are shown in Figs. 2 and 3. One locus containing three SNPs located at GTF2IRD1 showed statistical significance after Bonferroni correction (Fig. 2, top SNP, rs2267824, p = 2.0e−4). Effect sizes of the associated SNP were consistent across cohorts (Fig. 3) except for the cohort with individuals younger than 16 years old. After excluding individuals younger than 16 years old, the association of rs2267824 became stronger (reference allele: C, coefficient: 0.018, p = 5e−5). CAVIAR confirmed that the region contains one single locus and that rs2267824 was the potential causal variant. In addition, one SNP within 250 kb of FZD9 showed nominal significance (rs2237280, p = 0.00627).
The quantile–quantile plots compared with associations from the ENIGMA study demonstrated significantly enriched genetic signals in the WS chromosomal regions when using the WS neuroanatomical score (Fig. 4). In terms of global genetic signals, the WS neuroanatomical score has high heritability (h2 = 0.82, se = 0.25, p = 5e−4) despite the fact that less than 1% of phenotypic variation can be explained by the potential causal SNP, rs2267824.
Discussion
Here we demonstrate that the WS neuroanatomical score can be regarded as an MRI endophenotype, enriched in genetic information pertaining to neurodevelopment. By applying the neuroanatomical score to five imaging genetic cohorts, we show that a common variant in GTF2IRD1 is associated with variation in brain structure. The genetic signals were more enriched than those of traditionally defined ROIs, and the score had significantly high SNP-heritability. Our results provide a proof of concept for the strategy of using multivariate structural measures as a derived intermediate phenotype for genetic association studies. An optimized multivariate MRI procedure defines an intermediate phenotype that can accurately capture the continuous nature of the underlying brain variations, thus providing greater power for detecting genetic associations.
The associations between GTF2IRD1 and the WS neuroanatomical score support a critical role of this general transcription factor for normal brain development, and specifically for one of the characteristic personality traits of WS. WS has a unique neuroarchitecture compared to other developmental disorders with intellectual impairment, but few studies have tied anatomical changes to strikingly heightened social behavior12,13,14,15,16,17,18,23. Previous case studies of partial hemi-deletions in WS indicate that the region telomeric to 7q11.23, which includes GTF2IRD1, is crucial for the changes in social behaviors characteristic of WS4,20,34. Animal models also support the role of GTF2IRD1 in brain development4,19,21. In particular, a recent study on dog friendliness found the genetic variations on GTF2IRD1 and GTF2I were positively selected for the tendency to socially engage with humans22. Together with these results, our findings provide converging evidence for the role of GTF2IRD1 in human brain development and social cognition.
The associations with GTF2IRD1 were not consistent across age: the PING sample under 16 years of age did not show significant associations between the WS neuroanatomical score and GTF2IRD1. While many factors could lead to this null association in the younger cohort, one possibility is that the developmental genetic effects need to accumulate over time to be detectable. Although the WS neuroanatomical score was validated with a WS child cohort23, score variation is limited among typically developing children, as in the PING sample. Although our current sample size is too small to systematically examine possible age-dependent genetic effects on structural neuroimaging measures, such analyses may become feasible with the large-scale imaging genetic studies now becoming available.
It is likely that other genes also affect the neuroanatomical profiles we defined here, and they may act synergistically in producing the observed phenotype. For example, a study of neuron-like cells derived from stem cells in WS demonstrated reduced neuron proliferation and enhanced dendritic elaboration resulting from perturbation of FZD9 (ref. 35). Our association analysis found a suggestive signal located at FZD9 which, although much weaker than the main GTF2IRD1 effect, may nevertheless contribute jointly to the variations in neuroanatomical profiles. This interpretation is supported by the effects of partial hemi-deletions, which spare the FZD9 gene23,35. We found that although WS neuroanatomical scores increased among these subjects, the increase was much weaker than in those with a typical hemi-deletion23. Further evidence for synergistic effects was found in studies implicating both GTF2IRD1 and FZD9 in the Wnt pathway, a well-researched signaling pathway that has been implicated in stem cell control and neuroplasticity3,34,36.
In addition, we found significantly high heritability of the observed variations in our defined neuroanatomical score, indicating polygenic contributions. Although the neuroanatomical scores were highly specific to WS status among patient groups23, the variations in scores among healthy adults can represent the accumulation of multiple developmental processes with diverse genetic perturbations, each with small effects. This phenomenon is compatible with the theory of modularized genetic networks, in which canalized phenotypes, e.g., typically developed brains, can tolerate many small genetic perturbations unless genetic hubs are drastically disturbed5,37. In this framework, the WS deletions would represent a large perturbation of a neurodevelopmental process which, in typically developing individuals, shows only small variations attributable to regulatory genes across the genome. Although our WS neuroanatomical scores were enriched for WS-relevant genetic effects, they nevertheless characterized an underlying canalized developmental process. Using our analytic strategy with diverse genetic developmental disorders may provide further insight into this enduring question about phenotype–genotype mapping.
In sum, our results provide further support for the role of GTF2IRD1 in the WS phenotype and a proof of concept for deriving multivariate MRI phenotypes for genotype–phenotype studies. This strategy may prove useful in other neurodevelopmental disorders that typically have restricted genetic deletions or alterations. In addition, more accurate measurement of the neuroanatomical phenotype should also provide greater power for genetic studies of diseases such as schizophrenia and autism spectrum disorders where the genetic basis is distributed across the genome, and should ultimately facilitate the discovery of other mediating paths from genes to disorders.
References
Stiles, J. & Jernigan, T. L. The basics of brain development. Neuropsychol. Rev. 20, 327–348 (2010).
Alexander-Bloch, A., Giedd, J. N. & Bullmore, E. Imaging structural co-variance between human brain regions. Nat. Rev. Neurosci. 14, 322–336 (2013).
Corley, S. M. et al. RNA-Seq analysis of Gtf2ird1 knockout epidermal tissue provides potential insights into molecular mechanisms underpinning Williams-Beuren syndrome. BMC Genomics 17, 450 (2016).
Tassabehji, M. et al. GTF2IRD1 in craniofacial development of humans and mice. Science 310, 1184–1187 (2005).
Wagner, G. P. & Zhang, J. The pleiotropic structure of the genotype-phenotype map: the evolvability of complex organisms. Nat. Rev. Genet. 12, 204–213 (2011).
Hibar, D. P. et al. Common genetic variants influence human subcortical brain structures. Nature 520, 224–229 (2015).
Chen, C. H. et al. Large-scale genomics unveil polygenic architecture of human cortical surface area. Nat. Commun. 6, 7549 (2015).
Vounou, M. et al. Sparse reduced-rank regression detects genetic associations with voxel-wise longitudinal phenotypes in Alzheimer’s disease. Neuroimage 60, 700–716 (2012).
Walhovd, K. B., Fjell, A. M., Giedd, J., Dale, A. M. & Brown, T. T. Through thick and thin: a need to reconcile contradictory results on trajectories in human cortical development. Cereb. Cortex 27, 1472–1481 (2017).
Chen, C. H. et al. Hierarchical genetic organization of human cortical surface area. Science 335, 1634–1636 (2012).
Visscher, P. M. Sizing up human height variation. Nat. Genet. 40, 489–490 (2008).
Martens, M. A., Wilson, S. J. & Reutens, D. C. Research review: Williams syndrome: a critical review of the cognitive, behavioral, and neuroanatomical phenotype. J. Child Psychol. Psychiatry 49, 576–608 (2008).
Pober, B. R. Williams-Beuren syndrome. N. Engl. J. Med. 362, 239–252 (2010).
Gaser, C. et al. Increased local gyrification mapped in Williams syndrome. Neuroimage 33, 46–54 (2006).
Jernigan, T. L., Bellugi, U., Sowell, E., Doherty, S. & Hesselink, J. R. Cerebral morphologic distinctions between Williams and Down syndromes. Arch. Neurol. 50, 186–191 (1993).
Kippenhan, J. S. et al. Genetic contributions to human gyrification: sulcal morphometry in Williams syndrome. J. Neurosci. 25, 7840–7846 (2005).
Meda, S. A., Pryweller, J. R. & Thornton-Wells, T. A. Regional brain differences in cortical thickness, surface area and subcortical volume in individuals with Williams Syndrome. PLoS ONE 7, e31913 (2012).
Meyer-Lindenberg, A., Mervis, C. B. & Berman, K. F. Neural mechanisms in Williams syndrome: a unique window to genetic influences on cognition and behaviour. Nat. Rev. Neurosci. 7, 380–393 (2006).
Enkhmandakh, B. et al. Essential functions of the Williams-Beuren syndrome-associated TFII-I genes in embryonic development. Proc. Natl Acad. Sci. USA 106, 181–186 (2009).
Hoeft, F. et al. Mapping genetically controlled neural circuits of social behavior and visuo-motor integration by a preliminary examination of atypical deletions with Williams Syndrome. PLoS ONE 9, e104088 (2014).
Young, E. et al. Reduced fear and aggression and altered serotonin metabolism in Gtf2ird1‐targeted mice. Genes Brain Behav. 7, 224–234 (2008).
vonHoldt, B. M. et al. Structural variants in genes associated with human Williams-Beuren syndrome underlie stereotypical hypersociability in domestic dogs. Sci. Adv. 3, e1700398 (2017).
Fan, C. C. et al. Williams syndrome-specific neuroanatomical profile and its associations with behavioral features. NeuroImage Clin. 15, 343–347 (2017).
Hua, X. et al. 3D characterization of brain atrophy in Alzheimer’s disease and mild cognitive impairment using tensor-based morphometry. Neuroimage 41, 19–34 (2008).
Honningsvåg, L.-M., Linde, M., Håberg, A., Stovner, L. J. & Hagen, K. Does health differ between participants and non-participants in the MRI-HUNT study, a population based neuroimaging study? The Nord-Trøndelag health studies 1984-2009. BMC Med. Imaging 12, 23 (2012).
Espeseth, T. et al. Imaging and cognitive genetics: the Norwegian Cognitive NeuroGenetics sample. Twin. Res. Hum. Genet. 15, 442–452 (2012).
Rimol, L. M. et al. Cortical volume, surface area, and thickness in schizophrenia and bipolar disorder. Biol. Psychiatry 71, 552–560 (2012).
Jernigan, T. L. et al. The pediatric imaging, neurocognition, and genetics (PING) data repository. Neuroimage 124(Part B), 1149–1154 (2016).
Fischl, B., Sereno, M. I. & Dale, A. M. Cortical surface-based analysis. II: Inflation, flattening, and a surface-based coordinate system. Neuroimage 9, 195–207 (1999).
Fan, C. C. et al. Modeling the 3D geometry of the cortical surface with genetic ancestry. Curr. Biol. 25, 1988–1992 (2015).
Dale, A. M., Fischl, B. & Sereno, M. I. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9, 179–194 (1999).
Hormozdiari, F., Kostem, E., Kang, E. Y., Pasaniuc, B. & Eskin, E. Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014).
Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).
Antonell, A. et al. Partial 7q11.23 deletions further implicate GTF2I and GTF2IRD1 as the main genes responsible for the Williams–Beuren syndrome neurocognitive profile. J. Med. Genet. 47, 312–320 (2010).
Chailangkarn, T. et al. A human neurodevelopmental model for Williams syndrome. Nature 536, 338–343 (2016).
Carmona-Mora, P. et al. The nuclear localization pattern and interaction partners of GTF2IRD1 demonstrate a role in chromatin regulation. Hum. Genet. 134, 1099–1115 (2015).
Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
Acknowledgements
We thank Carolina Makowski for her excellent comments and suggestions on the manuscript. T.T.B. and N.A. are funded by R01 DA038958—Examination of Neurobehavioral Development Using the PING Data Resource. PING data are disseminated by the PING Coordinating Center at the Center for Human Development, University of California, San Diego. The investigators within PING contributed to the design and implementation of PING but did not necessarily participate in this study. The NCNG study has been funded through the Research Council of Norway (including the FUGE program), the National Institutes of Health, the University of Oslo, the University of Bergen, the Bergen Research Foundation (BFS), Helse Vest, and the Western Norway Regional Health Authority, the KG Jebsen Centre for Psychosis Research, and Dr. Einar Martens Fund. We also thank the Centre for Advanced Study (CAS) at the Norwegian Academy of Science and Letters in Oslo for hosting collaborative projects and workshops between Norway, Sweden, and Scotland in 2011–2012.
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.