INTRODUCTION

Diffusion tensor imaging (DTI) is widely acknowledged as a useful tool for studying the white matter microstructure of the living brain. By mapping the diffusion of water through the brain's fibers, DTI can recover major fiber pathways in the brain, and patterns of anatomical connectivity, with broad applications in psychiatry, neurology, and brain mapping (Thomason and Thompson, 2011). DTI-based white matter abnormalities are widely reported in developmental and degenerative brain diseases including Alzheimer's disease and mild cognitive impairment (Fellgiebel et al, 2004; Naggara et al, 2006; Oishi et al, 2011), schizophrenia (White et al, 2008; Ellison-Wright and Bullmore, 2009; Patel et al, 2011), bipolar disorder (Sussmann et al, 2009; Heng et al, 2010), attention-deficit/hyperactivity disorder (Konrad and Eickhoff, 2010), and autism (Alexander et al, 2007; Ke et al, 2009). These studies show the utility of DTI in neuropsychiatric research. In several studies, treatment of neuropsychiatric patients has also been associated with changes in DTI measures (Versace et al, 2008; Yoo et al, 2007). This also shows the promise of DTI for understanding therapeutic effects.

Measures of white matter integrity derived from DTI, such as fractional anisotropy (FA), are highly heritable (Lee et al, 2008; Chiang et al, 2009; Kochunov et al, 2010; Lee et al, 2010; Patel et al, 2010; Chiang et al, 2011b). As such, they may be useful as intermediate measures or ‘endophenotypes’ (Meyer-Lindenberg and Weinberger, 2006; de Geus et al, 2008; Hall and Smoller, 2010; Marenco and Radulescu, 2010) for assessing genetic influences on the brain. Several commonly carried genetic variants have already been identified that exert small effects on the brain's white matter as detected by DTI. These include highly prevalent polymorphisms in genes coding for brain-derived neurotrophic factor (BDNF; Chiang et al, 2011a), clusterin (CLU; Braskie et al, 2011), the neuregulin 1 receptor (ErbB4; Konrad et al, 2009), neurotrophic tyrosine kinase receptor-type 1 (NTRK1; Braskie et al, 2012), catechol-O-methyl transferase (COMT; Thomason et al, 2010), and the hemochromatosis (HFE) gene (Jahanshad et al, 2012a). We therefore considered these genes as candidates in this study.

The molecular and cellular effects of these genes and their protein products have been extensively investigated. COMT is a well-studied gene and codes for one of the group of enzymes that degrade catecholamines. Catecholamine levels are altered in many neuropsychiatric disorders, thereby making this molecule an ideal target for medications. Several of the genes above are also well known for their role in brain development. BDNF's protein product is a neural growth factor or neurotrophin, vital for the healthy development and maintenance of the nervous system (Binder and Scharfman, 2004). Similarly, NTRK1 codes for TrkA, which belongs to a tyrosine kinase receptor family, to which neurotrophin growth factors bind. Neurotrophins and their receptors, not surprisingly, are also important in neuropsychiatric disease and may offer new therapeutic targets in the form of small-molecule antagonists or mimickers (Allen and Dawbarn, 2006). ErbB4 encodes another tyrosine kinase receptor, which by binding to its ligand, neuregulin-1 (coded by NRG1), participates in neural modulation and development and is thought to contribute to the pathophysiology of schizophrenia (Hahn et al, 2006). Lastly, HFE and CLU contain polymorphisms that increase the risk for neurodegenerative disease. Their protein products regulate iron metabolism—important in brain aging (Bartzokis et al, 2011)—and beta-amyloid metabolism (DeMattos et al, 2002), respectively.

White matter structure is certainly influenced by non-genetic factors such as age (Chiang et al, 2011b), and by sex (whose effects are partly genetic and partly non-genetic), but we expect a moderate and significant proportion of an individual's white matter integrity to be predictable from their genetic profile. This expectation is supported by DTI findings of high heritability for white matter microstructure. As mentioned above, individual effects of single genetic variants on white matter structure have been explored, but a multilocus approach has not yet been taken. The utility of a multilocus candidate gene model in predicting an imaging-derived outcome was recently explored in the context of structural MRI (Biffi et al, 2010) and functional MRI (Nikolova et al, 2011), but its application to DTI and to detailed three-dimensional maps of brain structure appears novel. In this paper, we incorporate a subject's genetic signature, at key loci, into a multilocus model. We hypothesized that this model would predict brain integrity, as measured by DTI-derived FA, more powerfully than a single-locus genetic test. We focus on the corpus callosum, as it is the largest white matter structure in the brain, easy to examine at the brain's midline, highly heritable (Chiang et al, 2009; Brouwer et al, 2010; Kochunov et al, 2010), and well studied in neurology and psychiatry as the primary commissure connecting the two brain hemispheres (Foong et al, 2000; Alexander et al, 2007). We chose FA as the DTI measure of white matter structure, as it has been shown to have higher heritability than other DTI parameters, such as radial and axial diffusivity (Kochunov et al, 2010).

MATERIALS AND METHODS

Participants

A total of 395 subjects from the Brisbane young adult twins and siblings study (de Zubicaray et al, 2008), for whom both 105-gradient DTI scans and genome-wide genotype information were available, were included in our study (23.7±2.2 years of age; 143 men and 252 women; 47 siblings, 141 monozygotic twins (49 pairs and 43 singletons), and 207 dizygotic twins (1 triplet, 70 pairs, and 64 singletons)). All twins in this study are Australians of European descent. Previously, principal component analysis was conducted in this cohort for population stratification analysis and correction (Medland et al, 2009). Subjects who were >6 SD from either of the top two average reference principal component scores (derived from non-Australian European populations) were identified as ancestry outliers and excluded from analysis. The first two principal components refer to differences between Africans and non-Africans and to differences between East Asians and others, respectively. Owing to migration patterns and the fact that this sample was originally recruited to study mole patterns, exclusions were usually due to Asian or Polynesian ancestry. All subjects were screened to exclude cases of pathology known to affect brain structure. Additionally, no subjects had a first-degree relative with a psychiatric disorder or reported a history of significant head injury, neurological or psychiatric illness, or substance abuse or dependence.

Diffusion Tensor Imaging

Whole-brain diffusion tensor MRI scans were collected with a 4-tesla Bruker Medspec MRI scanner. Images were acquired using single-shot echo planar imaging with a twice-refocused spin echo sequence to reduce eddy current-induced distortions. Acquisition parameters were optimized to yield the best signal-to-noise ratio for estimation of diffusion tensors (Jones et al, 1999). Imaging parameters were: 23 cm field-of-view, TR/TE 6090/91.7 ms, with a 128 × 128 acquisition matrix. 105 images were acquired for each subject: 11 with no diffusion sensitization and 94 diffusion-weighted images with gradient directions evenly distributed on the hemisphere. Standard protocols for skull-stripping and eddy current distortion correction were performed using FSL (http://www.fmrib.ox.ac.uk/fsl) and we adjusted for echo planar imaging distortions as detailed in prior studies (Leow et al, 2005; Jahanshad et al, 2010). FSL was also used to calculate tensors and scalar maps of FA from the corrected images. The LONI pipeline (http://pipeline.loni.ucla.edu) was used to parallelize the preprocessing steps.

A mean deformation template (MDT) was created for the DTI scans, to which subjects’ FA maps (obtained from DWI elastically aligned to their high-resolution T1-weighted anatomical scan) were registered as in Jahanshad et al (2010), using a 3D elastic warping technique with a mutual information cost function (Leow et al, 2005). The MDT and the registered FA maps were then thresholded at 0.25, as FA measures below this threshold may reflect contributions from non-white matter in healthy-appearing white matter. After registering the FA maps across subjects, all FA images were smoothed with a Gaussian filter with a 7-mm isotropic full width at half maximum (FWHM). The structure of the corpus callosum was identified automatically by using the Johns Hopkins University (JHU) white matter atlas (ICBM DTI 81; Mori et al, 2008), which tracks its 3D extent, extending laterally from the midline (Figure 1). The atlas FA image was linearly and then elastically registered to our study-specific FA-MDT; the transformation matrix and deformation map were then applied to the JHU set of labels using nearest-neighbor interpolation to avoid intermixing of labels. The full label of the corpus callosum (composed of three regions: splenium, body, and genu) was then extracted. This avoided subjectivity and rater dependency in defining the limits of the corpus callosum.
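
The smoothing and thresholding steps can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names and the 2-mm voxel size are hypothetical, and treating a value exactly equal to 0.25 as retained is an assumption. It shows how a 7-mm FWHM translates into the Gaussian standard deviation that smoothing routines commonly expect, and how sub-threshold FA values would be masked.

```python
import math

FA_THRESHOLD = 0.25  # threshold stated in the text
FWHM_MM = 7.0        # smoothing kernel FWHM stated in the text


def fwhm_to_sigma(fwhm):
    """Convert the FWHM of a Gaussian kernel to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def threshold_fa(fa_values, threshold=FA_THRESHOLD):
    """Set FA values below the threshold to zero.

    Keeping values exactly equal to the threshold is an assumption.
    """
    return [v if v >= threshold else 0.0 for v in fa_values]


sigma_mm = fwhm_to_sigma(FWHM_MM)

# Hypothetical isotropic voxel size; the template resolution is not stated here.
voxel_size_mm = 2.0
sigma_voxels = sigma_mm / voxel_size_mm

masked = threshold_fa([0.10, 0.31, 0.62])
print(f"sigma = {sigma_mm:.3f} mm ({sigma_voxels:.3f} voxels)")
print(masked)
```

The relation sigma = FWHM / (2 sqrt(2 ln 2)) holds for any Gaussian kernel, so the same conversion applies whatever smoothing software is used.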

Figure 1. The three-dimensional structure of the corpus callosum, as defined by the JHU white matter atlas, is displayed in axial, coronal, and sagittal views in blue, overlaid on the study template.

Genotyping and Selection of Candidate Single-Nucleotide Polymorphisms (SNPs)

We considered six candidate SNPs listed in Table 1 owing to the recent imaging genetics discoveries outlined in the introduction. These particular genetic variants are located in six different genes. All have been linked to structural differences detectable with DTI. Several of these polymorphisms (ie, rs6265 in BDNF, rs6336 in NTRK1, rs4680 in COMT, and rs1799945 in HFE) are exonic variants and lead to amino-acid changes in the protein products of these genes (val → met, his → tyr, val → met, and his → asp, respectively). These have been well studied in the neuropsychiatric literature (Egan et al, 2003; Zecca et al, 2004; Tunbridge et al, 2006; van Schijndel et al, 2011). The remaining candidate SNPs do not cause missense mutations, but have been discovered in genome-wide association and genetic risk studies of neuropsychiatric disease (Lambert et al, 2009; Silberberg et al, 2006; Konrad et al, 2009). To obtain genotype information, genomic DNA samples were analyzed on the Human610-Quad BeadChip (Illumina, San Diego, California, USA) according to the manufacturer's protocols (Infinium HD Assay; Super Protocol Guide; Rev. A, May 2008). Additionally, imputation was performed by mapping the genotyped information to HapMap (Release 22 Build 36) using the Mach software (http://www.sph.umich.edu/csg/abecasis/MACH/index.html). All candidate SNPs passed a platform-specific quality control score (>0.7) and genotype call rate (>0.95).

Table 1 Associations of Single SNPs with Mean Callosal FA

Statistical Analysis

Linear mixed-effects models were used to study the joint and individual associations of genotypes with imaging measures, to take into account the relatedness between the subjects. For n subjects and p independent predictors (SNPs or other covariates), regression coefficients (β) were obtained, using the efficient mixed-model association (EMMA) software with restricted maximum likelihood estimation (Kang et al, 2008), according to the formula:

$$y = X\beta + Zb + \varepsilon$$

Here, y represents an n-component vector of voxelwise or mean FA measures, X is a matrix of SNP genotypes (coded additively as 0, 1, or 2 for the number of minor alleles) and/or covariates (eg, sex and age), Z is the identity matrix, and b is a vector of random effects with a variance of $\sigma_g^2 K$, where K is the n by n kinship matrix for the twins and siblings (here, monozygotic twins are coded as 1, dizygotic twins and siblings as 0.5, and unrelated subjects as 0, corresponding to the expected proportion of their shared genetic polymorphisms, respectively). ε is a vector of residual effects with a variance of $\sigma_e^2 I$, and I is an identity matrix. P-values for the significance of individual and joint SNP associations with FA were assessed using an F-test, according to the formula,

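$$F = \frac{(RSS_{reduced} - RSS_{full}) / (p_{full} - p_{reduced})}{RSS_{full} / (n - p_{full})}$$

where RSS represents the residual sum-of-squares, the reduced model includes only covariates, the full model contains both SNPs and covariates, $p_{full}$ and $p_{reduced}$ are the numbers of predictors in the full and reduced models, and n is the number of subjects. For all statistical analyses, the LONI pipeline (http://pipeline.loni.ucla.edu/) was used for parallelization on a multi-CPU grid computer. The standard false discovery rate (FDR) method (Benjamini and Hochberg, 1995) was used for multiple comparison correction across voxels in the corpus callosum.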
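
As a rough sketch of the ingredients described above, the code below builds a kinship matrix with the stated coding (1 for monozygotic co-twins, 0.5 for dizygotic twins and siblings, 0 for unrelated subjects), codes a genotype additively, and evaluates a standard partial F statistic from two residual sums of squares. It does not call EMMA or fit a mixed model. The function names, the family and zygosity labels, and the example numbers are hypothetical, and whether the predictor counts include an intercept is an assumption.

```python
def additive_code(genotype, minor_allele):
    """Count copies of the minor allele in a two-letter genotype such as 'AG'."""
    return sum(1 for allele in genotype if allele == minor_allele)


def kinship_matrix(subjects):
    """Build an n x n kinship matrix from (family_id, zygosity) tuples.

    Zygosity labels ('MZ', 'DZ', 'SIB') are hypothetical. Two 'MZ' members
    of the same family are assumed to be co-twins.
    """
    n = len(subjects)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        K[i][i] = 1.0
        for j in range(i + 1, n):
            fam_i, zyg_i = subjects[i]
            fam_j, zyg_j = subjects[j]
            if fam_i != fam_j:
                continue  # unrelated subjects stay at 0
            value = 1.0 if zyg_i == zyg_j == "MZ" else 0.5
            K[i][j] = K[j][i] = value
    return K


def partial_f(rss_reduced, rss_full, n, p_reduced, p_full):
    """Partial F statistic comparing a reduced model with a full model."""
    numerator = (rss_reduced - rss_full) / (p_full - p_reduced)
    denominator = rss_full / (n - p_full)
    return numerator / denominator


# Toy family structure: one MZ pair with a sibling, one DZ pair, one singleton.
subjects = [("fam1", "MZ"), ("fam1", "MZ"), ("fam1", "SIB"),
            ("fam2", "DZ"), ("fam2", "DZ"), ("fam3", "SIB")]
K = kinship_matrix(subjects)

# Invented residual sums of squares: covariates only (sex, age) vs. covariates + 5 SNPs.
f_stat = partial_f(rss_reduced=10.0, rss_full=9.5, n=395, p_reduced=2, p_full=7)
print(K[0][:3], additive_code("AG", "A"), round(f_stat, 3))
```

In practice the residual sums of squares would come from the fitted mixed models, and the p-value would be read from an F distribution with (p_full − p_reduced, n − p_full) degrees of freedom.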

RESULTS

We assessed six candidate SNPs that have recently been implicated, to varying degrees, in affecting the brain, at the gross anatomical or microstructural level. We first used linear mixed-effects models to regress each subject's genotype at each candidate SNP against average FA measures across the corpus callosum (all callosal voxels with an FA above 0.25), to study their individual effects on white matter integrity. The regression β- and p-values for each SNP on its own, treated as an independent predictor, are shown in Table 1.

We then assessed the joint effect of our set of candidate SNPs on the corpus callosum, using a linear mixed-effects model and a partial F-test to compute p-values. When regressed against average FA across the corpus callosum, a 5-SNP model containing the top SNPs in NTRK1, CLU, COMT, ErbB4, and HFE explained 5.6% of the variance in FA (p=0.0001; the model included sex and age, and a model including only sex and age explained 0.42% of the variance). Prediction of mean callosal FA improved as candidate SNPs were added in a stepwise fashion (Table 2). Adding the BDNF SNP, however, did not improve the model. To check for multicollinearity among the genotypes, we assessed the correlation structure between the candidate SNPs; none was correlated with any of the others (Supplementary Table S1).

Table 2 Multilocus Effects on Average Callosal FA and 3-Dimensional Maps of the Corpus Callosum

We also investigated the combined influence of the candidate SNPs on more detailed spatial maps of the corpus callosum. Adding SNPs in a stepwise fashion, in order of their individual effects (strongest to weakest, as shown in Table 1), we studied multilocus effects on voxel-by-voxel maps of callosal white matter structure (Table 2). The 5-SNP model showed the most widespread statistically significant influence on the corpus callosum: 82% of voxels (encompassing the callosal body, genu, and splenium) survived the FDR correction for multiple comparisons across all callosal voxels, at a critical p-value threshold of 0.041 (Figure 2). In FDR, the critical p-value is the highest threshold that controls the FDR, so a higher critical p-value denotes a stronger effect; the effect here is therefore both widespread and strong. Both the number of statistically significant voxels and the critical p-value threshold were highest for the 5-SNP model. Figure 2 also shows the voxelwise distribution of the fraction of variance explained by the 5-SNP model in the corpus callosum.
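
To illustrate what the critical p-value means, the sketch below applies the Benjamini and Hochberg (1995) step-up rule to two small sets of invented p-values. The false discovery rate q = 0.05 is an assumption (the level is not restated in this section), and the function name is hypothetical. The critical p-value is taken as the largest sorted p-value p(k) satisfying p(k) ≤ (k/m)q, where m is the number of tests.

```python
def bh_critical_p(p_values, q=0.05):
    """Return the Benjamini-Hochberg critical p-value, or None if no test passes.

    The critical value is the largest sorted p-value p(k) with p(k) <= q * k / m.
    """
    m = len(p_values)
    critical = None
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= q * k / m:
            critical = p
    return critical


def fraction_significant(p_values, q=0.05):
    """Fraction of tests whose p-value is at or below the critical p-value."""
    critical = bh_critical_p(p_values, q)
    if critical is None:
        return 0.0
    return sum(1 for p in p_values if p <= critical) / len(p_values)


# Invented p-values for ten voxels: a weak map and a strong map.
weak_map = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
strong_map = [0.001, 0.002, 0.003, 0.004, 0.005, 0.01, 0.02, 0.03, 0.04, 0.9]

for name, p_values in (("weak", weak_map), ("strong", strong_map)):
    print(name, bh_critical_p(p_values), fraction_significant(p_values))
```

When many voxels carry small p-values, the step-up rule is satisfied further along the sorted list, so the critical p-value and the fraction of surviving voxels rise together. This is why a higher critical p-value is read as a stronger effect.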

Figure 2. Voxelwise $R^2$ and p-values are shown in three representative sagittal slices for the joint effect of five SNPs in the NTRK1, CLU, COMT, ErbB4, and HFE genes on the corpus callosum microstructure, measured by DTI fractional anisotropy (FA). (a) The coefficient of determination ($R^2$) or predictability of the 5-SNP model at each voxel is shown in the selected slices. Warmer colors represent higher fractions of variance in FA explained by the multi-SNP model. (b) P-values are shown for the 5-SNP model at each voxel; maps are corrected for multiple comparisons across voxels by applying a critical p-value threshold to control the FDR. Warmer colors represent more significant associations (greater effect sizes). For associations at each voxel, we adjusted for any effects of sex and age, and accounted for kinship structure via mixed-effects models.

In addition to the additive, linear model, we included two-way SNP–SNP interactions in the mixed-effects model of mean callosal FA. No significant interactions were found (Supplementary Table S2). We also explored prediction of voxel-by-voxel FA from the five SNPs using two popular machine-learning models (support vector regression and artificial neural networks; see Supplementary Methods) within a cross-validation framework. Similar to the mixed-effects model, these led to statistically significant predictions across the corpus callosum (Supplementary Figure S1). At each voxel, we computed the mean-squared error of FA predictions made from the candidate genotypes. The predictive errors of the artificial neural network and support vector regression models were then compared, through permutations, with those of null predictors (ie, where FA is randomly assigned to subjects). The artificial neural network and support vector regression models yielded statistically significant predictions across 75% and 40% of the corpus callosum voxels, respectively, after correcting for multiple comparisons, with critical p-value thresholds of 0.037 and 0.019.
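
The comparison against null predictors can be sketched as a permutation test at a single voxel. This is a simplified illustration under stated assumptions: the function names, the number of permutations, and the toy FA values are hypothetical, and the learning models and cross-validation scheme described in the Supplementary Methods are not reproduced here. FA values are shuffled across subjects to build the null distribution of the mean-squared error.

```python
import random


def mse(predicted, observed):
    """Mean-squared error between predicted and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def permutation_p_value(predicted, observed, n_permutations=1000, seed=0):
    """Estimate how often randomly assigned FA values fit as well as the real ones.

    Returns (count + 1) / (n_permutations + 1), where count is the number of
    shuffles whose error is at or below the error of the unshuffled data.
    """
    rng = random.Random(seed)
    observed_mse = mse(predicted, observed)
    shuffled = list(observed)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # randomly reassign FA values to subjects
        if mse(predicted, shuffled) <= observed_mse:
            count += 1
    return (count + 1) / (n_permutations + 1)


# Toy single-voxel data: invented FA values and predictions that track them closely.
observed_fa = [0.42, 0.47, 0.51, 0.55, 0.58, 0.61, 0.64, 0.66, 0.69, 0.72]
predicted_fa = [value + 0.01 for value in observed_fa]

print(permutation_p_value(predicted_fa, observed_fa))
```

Repeating this at every callosal voxel would give a map of permutation p-values, which could then be corrected for multiple comparisons in the same way as the mixed-model maps.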

DISCUSSION

In this work, we aimed to predict neuroanatomical white matter microstructure based on multiple genetic risk factors, while covarying for sex and age. Five of the six candidate polymorphisms that we considered in the study (in CLU, ErbB4, NTRK1, COMT, and HFE; Braskie et al, 2011; Konrad et al, 2009; Braskie et al, 2012; Thomason et al, 2010; Jahanshad et al, 2012a, respectively) explained 5.6% of the variability in mean callosal FA, using a linear mixed-effects model. This is a considerable fraction of the variance to be explained by only a few SNPs, taking into account the complexity of the structure and the non-genetic factors that influence it. It is also comparable to previous findings in the literature for multilocus models of a brain-imaging phenotype. Biffi et al (2010) found that 3% of the variance in MRI-derived volumes of several brain regions could be explained by a number of candidate genes for Alzheimer's disease. Nikolova et al (2011) showed that 11% of the variance in ventral striatal reactivity could be explained by their panel of five polymorphisms. We also found that our candidate polymorphisms displayed extensive, significant effects on 82% of the volume of the corpus callosum when modeled cumulatively on a voxelwise basis, which captures more spatial detail than an average measure of FA across the corpus callosum. We also confirmed significant predictions across the corpus callosum from the five SNPs using multilocus machine-learning models. These yielded similar predictions, but were less spatially widespread, as only a subset of subjects who were unrelated to each other was considered.

We focused on DTI-derived FA of the corpus callosum as our imaging measure in this study. The corpus callosum is the largest white matter structure in the brain, containing over 300 million axons (Hofer and Frahm, 2006). This fiber bundle transfers motor, sensory, and cognitive information between the two cerebral hemispheres (Huang et al, 2005). With the advent of DTI, it has become increasingly clear that the structure of the corpus callosum is impaired in several brain disorders. In a recent meta-analysis, for instance, Patel et al (2011) concluded that the splenium of the corpus callosum has significantly lower FA in patients with schizophrenia vs controls. Recent DTI studies have also identified callosal abnormalities in patients with other brain disorders such as bipolar disorder (Benedetti et al, 2011), post-traumatic stress disorder (Jackowski et al, 2008), and autism (Alexander et al, 2007). It would therefore be beneficial, clinically, to know an individual's personalized genetic risk for a corpus callosum structural abnormality. In addition, the microstructure of the corpus callosum has been shown to be highly heritable in studies including those with the same Australian twins as in this paper. Chiang et al (2009) mapped out genetic contributions to white matter structure in the Australian twins and discovered significant voxelwise effects in the callosal genu and splenium. In that paper, a classical twin design was used to estimate the overall genetic contribution to the observed variance, but effects of specific SNPs were not assessed or modeled. Similarly, Kochunov et al (2010) found mean FA values from the corpus, body, and genu of the corpus callosum were highly heritable (all with $h^2 > 0.5$) in members of the San Antonio Family Study. Similar results have also been reported in studies of young children (Brouwer et al, 2010) and in older individuals (eg, Pfefferbaum et al, 2001).
Recently, it was also shown in the same Australian population as ours that the heritability of callosal FA, particularly in the genu, is high regardless of imaging protocol differences (Jahanshad et al, 2012b). Here, we show that predictions of microstructural measures may be made based on a few common polymorphisms. We focused on the corpus callosum here, but our results may also have implications for other white matter tracts in the brain. The millions of axons in the corpus callosum connect numerous regions of the brain with each other. Genetic variants that affect this brain structure may also have roles in other white matter regions.

We selected six candidate SNPs for our study based on their reported individual effects on white matter structure on DTI and their importance in neuropsychiatric disease. The val158met missense mutation resulting from the candidate SNP in COMT causes reduced degradation and thus increased availability of dopamine, thereby leading to alterations in reward experience, executive function, and working memory, with implications for risk of neuropsychiatric disease and differential response to therapy (Tunbridge et al, 2006; Wichers et al, 2008). BDNF's val66met polymorphism, which affects the neurotrophin's secretion and its function in long-term potentiation, has been investigated in many studies and shown to alter memory performance at a young age, among other associations with neuropsychiatric disease (Egan et al, 2003; Hariri et al, 2003). Similarly, although not as fully characterized, the candidate SNP in the neurotrophin receptor gene, NTRK1, leads to a his598tyr amino-acid change in the kinase domain of TrkA, and has been significantly associated with risk for schizophrenia (van Schijndel et al, 2009; van Schijndel et al, 2011). In our study, this SNP had the strongest effect of all candidates on white matter microstructure. We also found that subjects with greater numbers of minor alleles of the NTRK1 polymorphism had lower FA, which is consistent with data suggesting that the minor allele is over-represented in schizophrenia patients (van Schijndel et al, 2011). Another receptor gene we considered was the neuregulin receptor gene, ErbB4, with an intronic variant associated with schizophrenia risk in several studies (Konrad et al, 2009; Nicodemus et al, 2006; Silberberg et al, 2006). Another intronic variant, rs11136000, in CLU has been discovered and replicated in genome-wide association studies of Alzheimer's disease (Lambert et al, 2009).
Similarly, the his63asp mutation in the iron-related HFE gene has been linked to Alzheimer's disease, along with other neurodegenerative disorders (Connor and Lee, 2006). We found that all variants except the one in BDNF contributed additively to prediction of average callosal FA as well as three-dimensional maps of voxelwise FA across the corpus callosum.

Personalized prediction of individuals’ disease-related measures is being advocated by some as a vital component of future diagnosis and treatment of brain disorders (Koslow et al, 2010). Genetic variation may account for some of the broad heterogeneity in patients’ disease status (Cummings, 2000; Folstein and Rosen-Sheidley, 2001) and in the extent to which they respond to therapy (Gordon, 2007). Multilocus models are particularly appealing for personalized prediction of disease. Several groups have explored multilocus models of candidate risk variants in the context of brain disorders. Carayol et al (2010), for instance, reported on the cumulative effect of four candidate SNPs on the risk for autism, using a case–control approach. These models are beginning to be applied to brain imaging in the context of neuropsychiatric disorders (Biffi et al, 2010; Hibar et al, 2011a; Nikolova et al, 2011), and may provide more biologically meaningful predictions with implications for personalized diagnosis and therapy.

Future studies are needed to replicate our findings in independent cohorts of subjects, even though we found significant predictions using cross-validation in the support vector regression and artificial neural network analyses. In addition, as new candidate gene studies and genome-wide searches using DTI measures (eg, Kochunov et al, 2011) identify effects of new variants, candidate genes may be added or removed from this panel, to better predict white matter integrity. We did not find evidence for two-way interactions between the SNPs in our study, which is probably reasonable, as the SNPs are likely contributing independently and additively to white matter integrity, and interactions are second-order effects (modulations of the main effect of a gene) that may require large samples to identify, if present at all. Such interactions, however, may be identified in follow-up studies particularly with SNPs that directly share the same pathway, like NTRK2 and BDNF (Perroud et al, 2009), NRG1 and ErbB4 (Nicodemus et al, 2010), or COMT and 5-HTTLPR (Borroni et al, 2006). In this paper, we took a voxelwise approach to study genetic associations with FA. In addition to voxelwise maps of FA, tract- and fiber-based measures from diffusion imaging may also be considered as predictive outputs. Such measures, along with multivariate methods that simultaneously consider not only multiple genes, but also multiple voxels (Vounou et al, 2010; Hibar et al, 2011b; Wan et al, 2011) may help provide more statistical power. For instance, our voxelwise, multilocus model improved only slightly beyond the 2-SNP model with polymorphisms in NTRK1 and CLU. This may be because of the strong effects of NTRK1 and CLU SNPs on their own, but it may also be because multiple variants do not necessarily affect the same exact voxels. This may make it difficult to obtain substantially more expansive voxelwise effects by adding more variants to the model. 
Here, we considered genetic polymorphisms as predictors, and these explained a small but significant proportion of the heritable variation in white matter structure across young, healthy individuals. Although it remains to be determined, it is plausible that a measure of white matter integrity, such as DTI-derived FA, relates to a person's lifetime risk for developing mental and neurodegenerative disorders, especially for disorders in which FA is abnormally low.