Multimodal population brain imaging in the UK Biobank prospective epidemiological study

Journal name: Nature Neuroscience
Volume: 19
Pages: 1523–1536
DOI: doi:10.1038/nn.4393

Abstract

Medical imaging has enormous potential for early disease prediction, but is impeded by the difficulty and expense of acquiring data sets before symptom onset. UK Biobank aims to address this problem directly by acquiring high-quality, consistently acquired imaging data from 100,000 predominantly healthy participants, with health outcomes being tracked over the coming decades. The brain imaging includes structural, diffusion and functional modalities. Along with body and cardiac imaging, genetics, lifestyle measures, biological phenotyping and health records, this imaging is expected to enable discovery of imaging markers of a broad range of diseases at their earliest stages, as well as provide unique insight into disease mechanisms. We describe UK Biobank brain imaging and present results derived from the first 5,000 participants' data release. Although this covers just 5% of the ultimate cohort, it has already yielded a rich range of associations between brain imaging and other measures collected by UK Biobank.

Figures

  1. Figure 1: Data from the three structural imaging modalities in UK Biobank brain imaging.

    (a) Single-subject T1-weighted structural image with minimal pre-processing: removal of intensity inhomogeneity, lower neck areas cropped and the face blanked to protect anonymity. Color overlays show automated modeling of several subcortical structures (above) and segmentation of gray matter (below). (b) Single-subject T2-weighted FLAIR image with the same minimal pre-processing showing hyperintense lesions in the white matter (arrows). (c) Group-average (n ≈ 4,500) T1 atlas; all subjects' data were aligned together (see Online Methods for processing details) and averaged, achieving high-quality alignment, with clear delineation of deep gray structures and good agreement of major sulcal folding patterns despite wide variation in these features across subjects. (d) Group-average T2 FLAIR atlas. (e) Group-average atlas derived from SWI processing of swMRI phase and magnitude images. (f) Group-average T2* atlas, also derived from the swMRI data. (g) Manhattan plot (a layout common in genetic studies) relating all 25 IDPs from the T1 data to 1,100 non-brain-imaging variables extracted from the UK Biobank database, with the latter arranged into major variable groups along the x axis (with these groups separated by vertical dotted lines). For each of these 1,100 variables, the significance of the cross-subject univariate correlation with each of the IDPs is plotted vertically, in units of −log₁₀ of the uncorrected P value. The dotted horizontal lines indicate thresholds corresponding to multiple comparison correction using FDR (lower line, corresponding to uncorrected P = 3.8 × 10⁻⁵) and Bonferroni correction (upper line, uncorrected P = 1.8 × 10⁻⁸) across the 2.8 million tests involving correlations of all modalities' IDPs against all 1,100 non-imaging measures. Effects such as age, sex and head size are regressed out of all data before computing the correlations. As an indication of the corresponding range of effect sizes, the maximum r² (fractional variance of either variable explained by the other) is calculated, as well as the minimum r² across all tests passing the Bonferroni correction. Here, the maximum r² = 0.045 and the minimum r² = 0.0058. (h) Plot relating all 14 T2* IDPs to 1,100 non-imaging variables. Maximum r² = 0.034, minimum r² = 0.0063. Marked Bonferroni and FDR multiple comparison threshold levels are presented as in g.
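The thresholds drawn in panel g can be sanity-checked with a little arithmetic. The following is a minimal sketch rather than the authors' code. It assumes a corrected significance level of 0.05 and takes the counts of 2,501 IDPs and 1,100 non-imaging variables from the Figure 6 title. The Benjamini-Hochberg step-up rule is included as one common way of obtaining an FDR threshold; the caption does not state which FDR procedure was used, and the function names and toy P values are purely illustrative.

```python
import math

N_IDPS = 2501        # imaging-derived phenotypes (count quoted in the Figure 6 title)
N_VARIABLES = 1100   # non-brain-imaging variables
ALPHA = 0.05         # assumed corrected significance level


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Uncorrected P value that a single test must beat under Bonferroni."""
    return alpha / n_tests


def benjamini_hochberg_threshold(p_values, q=ALPHA):
    """Largest uncorrected P value declared significant by the BH step-up
    procedure at FDR level q, or None if no test passes."""
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= q * rank / m:
            threshold = p  # keep the largest P that satisfies the step-up rule
    return threshold


n_tests = N_IDPS * N_VARIABLES
p_bonf = bonferroni_threshold(n_tests)
print(f"number of tests: {n_tests:,}")
print(f"Bonferroni: P < {p_bonf:.2e}, i.e. -log10(P) > {-math.log10(p_bonf):.2f}")

# Toy P values, made up for illustration only.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("BH threshold on toy data:", benjamini_hochberg_threshold(toy_p))
```

Under these assumptions the Bonferroni threshold comes out close to the 1.8 × 10⁻⁸ quoted for the upper dotted line. The FDR threshold, by contrast, depends on the observed distribution of P values and cannot be recomputed from the caption alone.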

  2. Figure 2: The diffusion MRI data in UK Biobank.

    (a) Group-average (n ≈ 4,500) atlases from six distinct dMRI modeling outputs, each sensitive to different aspects of the white matter microarchitecture. The atlases shown are: FA (fractional anisotropy), MD (mean diffusivity) and MO (tensor mode); and ICVF (intra-cellular volume fraction), ISOVF (isotropic or free water volume fraction) and OD (orientation dispersion index) from the NODDI microstructural modeling. Also shown are several group-average white matter masks used to generate IDPs (for example, pink (r) are retrolenticular tracts in the internal capsules; upper green (s) are the superior longitudinal fasciculi). (b) Tensor ellipsoids depicting the group-averaged tensor fit at each voxel for the region shown in the inset in c. The shapes of the ellipsoids indicate the strength of water diffusion along three principal directions; long thin tensors indicate single dominant fiber bundles, whereas more spherical tensors (within white matter) generally imply regions of crossing fibers (seen more explicitly modeled in corresponding parts of c). (c) Group-averaged multiple fiber orientation atlases, showing up to three fiber bundles per voxel. Red shows the strongest fiber direction, green the second and blue the third. Each fiber bundle is only shown where the modeling estimates that population to have greater than 5% voxel occupancy. Inset shows the thresholded mean FA image (copper) overlaid on the T1, with the region shown in detail in b and c. (d) Four example group-average white matter tract atlases estimated by probabilistic tractography fed from the within-voxel fiber modeling: corpus callosum (genu), superior longitudinal fasciculus, corticospinal tract and inferior fronto-occipital fasciculus. (e) Plot relating all 675 dMRI IDPs (nine distinct dMRI modeling outputs from tensor and NODDI models × 75 tract masks) to 1,100 non-imaging variables (see Fig. 1g for details). Maximum r² = 0.057, minimum r² (passing Bonferroni) = 0.0065. Dotted horizontal lines (multiple comparison thresholds) are described in Figure 1g.

  3. Figure 3: The task fMRI data in UK Biobank.

    (a) The task paradigm temporal model (time running vertically) depicting the periods of the two task types (shapes and faces); for more information on this paradigm view, see http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FEAT/UserGuide. (b) Example fitted activation regression model versus time-series data (time running horizontally) for the voxel most strongly responding to the 'faces > shapes' contrast in a single subject (Z = 12.3). (c) Percentage of subjects passing simple voxel-wise activation thresholding (Z > 1.96) for the same contrast. Note the reliable focal activation in left and right amygdala. The underlying image is the group-averaged raw fMRI image. (d) Group-averaged activation for the three contrasts of most interest, overlaid on the group-average T1 atlas (fixed-effects group average, Z > 100, voxelwise corrected P < 10⁻³⁰). (e) Plot relating the 16 tfMRI IDPs to 1,100 non-imaging variables (see Fig. 1g for details). Maximum r² = 0.018, minimum r² (passing Bonferroni) = 0.0062. Dotted horizontal lines (multiple comparison thresholds) are described in Figure 1g.

  4. Figure 4: The resting-state fMRI data in UK Biobank.

    (a) Example group-average resting-state network (RSN) atlases from the low-dimensional group-average decomposition showing 4 of 21 estimated functional brain networks, including the default mode network (red-yellow), dorsal attention network (green), primary visual (copper) and higher level visual (dorsal and ventral streams, blue). The three slices shown are (top to bottom) sagittal, coronal and axial. (b) The 55 non-artifact components from a higher dimensional parcellation of the brain (axial views). These are shown as displayed by the connectome browser (www.fmrib.ox.ac.uk/ukbiobank/netjs_d100), which allows interactive investigation of individual connections in the group-averaged functional network modeling. The 55 brain regions (network nodes) are clustered into groups according to their average population connectivity, and the strongest individual connections are shown (positive in red, anticorrelations in blue). (c) Plot relating the 76 rfMRI 'node amplitude' IDPs to 1,100 non-imaging variables (see Fig. 1g for details). Maximum r² = 0.065, minimum r² (passing Bonferroni) = 0.0059. (d) Plot relating the 1,695 rfMRI 'functional connectivity' IDPs to 1,100 non-imaging variables. Maximum r² = 0.032, minimum r² = 0.0059. Dotted horizontal lines (multiple comparison thresholds) in c and d are described in Figure 1g.
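The IDP counts quoted in panels c and d are consistent with simple network arithmetic. The sketch below is an assumption about how the counts arise, since the caption does not spell out the derivation: it takes one amplitude per node from a 21-node and a 55-node parcellation, and one connectivity value per unordered pair of nodes within each parcellation. The variable and function names are illustrative.

```python
from math import comb

# Node counts quoted in the Figure 4 caption.
LOW_DIM_NODES = 21    # networks in the low-dimensional decomposition
HIGH_DIM_NODES = 55   # non-artifact components in the higher-dimensional parcellation


def n_edges(n_nodes):
    """Number of unordered node pairs (undirected edges, no self-connections)."""
    return comb(n_nodes, 2)


# Assumed breakdown: one amplitude per node, one connectivity value per node pair.
amplitude_idps = LOW_DIM_NODES + HIGH_DIM_NODES
connectivity_idps = n_edges(LOW_DIM_NODES) + n_edges(HIGH_DIM_NODES)

print("node amplitude IDPs:", amplitude_idps)              # 21 + 55 = 76
print("functional connectivity IDPs:", connectivity_idps)  # 210 + 1485 = 1695
```

Both totals match the 76 and 1,695 IDPs given in the caption, which supports this reading without confirming it.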

  5. Figure 5: Voxel-wise correlations of participants' age against several white matter measures from the dMRI and T2 FLAIR data.

    (a) Voxel-wise (cross-subject) correlation of FA versus age. Group-average FA in white matter is shown in green, overlaid onto the group-average T1. (b) Correlation of MO versus age, using the same color scheme. Nearby areas of MO increase are shown in greater detail in f, which also shows the distinct primary fiber directions. (c) Correlation of OD versus age, including a reduction in dispersion in posterior corpus callosum. (d) Correlation of ISOVF versus age, showing increases in freely diffusing water with age in a broad range of tracts. (e) Voxel-wise correlation of T2 FLAIR intensity showing increased intensity with aging in white matter. For a–e, blue and red-yellow show negative and positive Pearson correlation with age, respectively (corrected P < 0.05, with Bonferroni correction across voxels resulting in significance at r = 0.1; dMRI n = 3,722; T2 FLAIR n = 3,781). (g) Histograms (across voxels) of the voxel-wise age correlations in the maps shown above, with correlation value on the x axis. FA and MO largely decreased with age, whereas OD and ISOVF largely increased.

  6. Figure 6: Visualization of 2.8 million univariate cross-subject association tests between 2,501 IDPs and 1,100 other variables in the UK Biobank database.

    (a) Manhattan plot showing, for each of the 1,100 non-brain-imaging variables, the statistically strongest association of that variable with each distinct imaging sub-modality's IDPs (that is, six results plotted for each x axis position, each with a color indicating a brain imaging modality; this plot differs from the other Manhattan plots, which show correlations with all IDPs). Whereas the Manhattan plots in Figures 1–4 showed associations for each brain imaging modality separately, here all associations are depicted in a single plot. (b) List of all IDP-cognitive score associations passing Bonferroni correction for multiple comparisons (corrected P < 0.05; uncorrected P < 1.8 × 10⁻⁸). The first column lists the age-adjusted correlation coefficient, and the second shows the unadjusted correlation, both being correlations between a specific brain IDP (fifth column) and a cognitive test score (sixth column). The UK Biobank cognitive tests carried out included fluid intelligence, prospective memory, reaction time (shape pairs matching), memorized pairs matching, trail making (symbol ordering), symbol digit substitution, and numeric memory. (c) IDP associations with the cognitive phenotype variables (the full set of 174 cognitive variables, repeated for each brain imaging modality). Shown behind, in gray, are the same associations without adjustment for age, with a large number of stronger associations. Dotted horizontal lines (multiple comparison thresholds) in a and c are described in Figure 1g. (d) Scatterplot showing the relationship between adjusted correlations and those obtained without first regressing out the confound variables (each point is a pairing of one IDP with one non-brain-imaging variable, 2.8 million points). The grid lines indicate Bonferroni-corrected significance level (as described in Fig. 1). (e) Example: the association between unadjusted white matter volume and fat-free body mass is high (r = 0.56) when pooling across the sexes. After adjusting for several variables (including sex), the correlation falls almost to zero.
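The effect described in panel e, where a strong pooled correlation disappears once a confound is regressed out, can be reproduced on toy data. The sketch below is illustrative only: the numbers are made up (they are not UK Biobank values), a single binary confound stands in for the several variables the authors adjusted for, and the helper names are hypothetical. Within each group the two measures are uncorrelated by construction, but the groups differ in their means.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regress_out(values, confound):
    """Residuals after an ordinary least-squares fit of values on one confound
    plus an intercept."""
    mv, mc = mean(values), mean(confound)
    slope = (sum((c - mc) * (v - mv) for c, v in zip(confound, values))
             / sum((c - mc) ** 2 for c in confound))
    return [v - mv - slope * (c - mc) for v, c in zip(values, confound)]


# Made-up data: a binary confound and two measures that differ in mean between
# groups but are uncorrelated within each group.
group = [0, 0, 0, 0, 1, 1, 1, 1]
measure_a = [-1, 1, -1, 1, 4, 6, 4, 6]
measure_b = [-1, -1, 1, 1, 4, 4, 6, 6]

pooled_r = pearson_r(measure_a, measure_b)
adjusted_r = pearson_r(regress_out(measure_a, group), regress_out(measure_b, group))

print(f"pooled r   = {pooled_r:.3f}")
print(f"adjusted r = {adjusted_r:.3f}")
```

In this toy case the pooled correlation is driven entirely by the difference in group means, so removing the group effect from both measures leaves no association. Real data would generally show a partial rather than a complete reduction.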

  7. Figure 7: Details of three modes from the doubly-multivariate CCA-ICA analyses across all IDPs and non-brain-imaging variables.

    IDPs are listed in orange and non-brain-imaging variables in black. The lists show the variables most strongly associated with each mode; where multiple very similar (and highly correlated) non-imaging variables are found, only the most significant is listed here for brevity. The first column shows the weight (strength and sign) of a given variable in the ICA mode, the second shows the (cross-subject) percentage variance of the data explained by this mode, and the third column shows the percentage variance explained in the data without the confounds first regressed out. (a) Mode 7 links measures of bone density, brain structure/tissue volumes and cognitive tests. (b) Mode 8 links measures of blood pressure and alcohol intake to IDPs from the diffusion and functional connectivity data; two functional network connections strongly involved are displayed, with the population mean connection indicated by the bar connecting the two nodes forming the connection (red indicates positive mean correlation, blue negative, and the width of the bar indicates the connection strength). The group-ICA maps are thresholded at Z > 5, and the colored text is the ICA weight shown in the table list. (c) Mode 9 includes a wide range of imaging and non-imaging variables; as well as showing three strong functional network connections, we also show two functional nodes whose resting fluctuation amplitude is associated with this mode.

  8. Figure 8: Hypothesis-driven study of age, BMI and smoking associations with subcortical T2*.

    (a) UK Biobank population-average map of T2*, overlaid with the main subcortical structures being investigated. The T2* IDPs reflect individuals' median T2* values in these regions. The relatively low T2* in putamen and pallidum likely reflects greater iron content. (b) BMI regression betas from multiple regressions of R2* (from the ASPS study) and T2* (from UK Biobank) against relevant covariates (see c). All variables are standardized so that beta values can be interpreted as (partial) correlation coefficients. R2* significance is reported as FDR-corrected P. T2* significance is reported as −log₁₀ of the uncorrected P value, with the more conservative Bonferroni correction (for corrected P = 0.05) resulting in a threshold here of 3.6. (c) Full set of univariate and multiple regression betas and significance values for all brain regions tested and all model covariates. The regression results are much sparser, reflecting the higher associational specificity obtained by reporting unique variance explained.
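The remark in panel b that standardized betas can be read as correlation coefficients is easiest to see in the univariate case, where the two are exactly equal. The sketch below demonstrates this on made-up numbers (not ASPS or UK Biobank data) with hypothetical helper names. In a multiple regression with correlated covariates the standardized betas reflect unique contributions and are no longer equal to the simple correlations, which is presumably why the caption qualifies them as "(partial)".

```python
import math


def mean(values):
    return sum(values) / len(values)


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    m = mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))


def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up values for illustration only.
covariate = [21.0, 24.5, 27.0, 30.5, 33.0, 26.0, 29.5, 23.0]
outcome = [48.0, 46.5, 45.0, 44.0, 41.5, 47.0, 43.0, 47.5]

standardized_beta = ols_slope(zscore(covariate), zscore(outcome))
r = pearson_r(covariate, outcome)

print(f"standardized beta = {standardized_beta:.6f}")
print(f"Pearson r         = {r:.6f}")
```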

  9. Supplementary Fig. 1: Asymptotic behavior of group average images and correlations with age, with increasing numbers of subjects.

    (a) Resting-state network group-averages formed from 5 different group sizes. Individual subjects' preprocessed resting-state data were used to generate subject-specific effect-size maps (arbitrary but consistent units of resting activity strength) of one of the low-dimensional resting-state networks (the default mode network, here shown with both positive=red and negative=blue involvement in this network, with the same color-coding and thresholding applied in all cases). These were then averaged across subjects for a range of subject numbers. Increasing n suppresses background noise as expected, and the non-noise network structure asymptotes towards a constant map as n rises over 100. Although any imperfections in functional spatial alignment across subjects limit the sharpness of the asymptotic group-average, as n is raised even further to 5000, there is no sign of a degradation (e.g., blurring) of the group-average map (compared with lower subject numbers), as expected. (b–d) Voxelwise correlation of the same resting-state network with increasing subject age, again to illustrate the statistical effect of increasing the number of subjects used. (b) The age correlation map when using 5000 subjects; with increasing age the correlations are dominantly negative, indicating a weakening of this cognitive network (r > 0.1, corrected P < 10⁻¹⁰). (c) The 10 voxels having the strongest (positive or negative) correlation with age, that are also at least 10 mm distant from each other, are used to form 10 plots of age-correlation against number of subjects used in the correlation, from 10 to 5000 subjects. As expected, the plots asymptote, with increasing subject numbers, towards the "true" final value, with noticeable instability up to as many as 2000 subjects. (d) From the same 10 sets of correlations, the statistical significance (−log₁₀(P)) is shown. Whereas r asymptotes towards its true final constant value with increasing n, statistical significance (for a non-null correlation) has an ever-increasing trend with increasing n. (e) Theoretical relationship between increasing statistical power and subject numbers, assuming a true correlation between any two variables of r = 0.1. As n increases (here up to 7000), the number of distinct tests that will pass Bonferroni multiple-comparison correction rises to very large numbers, here up to around 10¹⁵.
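The relationship in panel e can be approximated with a short calculation. The sketch below is not the authors' derivation: it assumes a two-sided test, a corrected significance level of 0.05, and the Fisher z-transform normal approximation for the P value of a Pearson correlation. The function names are illustrative. For a fixed r, it asks how many tests N could be performed with the observed P value still satisfying P < 0.05 / N.

```python
import math


def two_sided_p(r, n):
    """Approximate two-sided P value for a Pearson correlation r observed in
    n subjects, using the Fisher z-transform (normal approximation)."""
    z = math.atanh(abs(r)) * math.sqrt(n - 3)
    return math.erfc(z / math.sqrt(2))


def max_tests_passing_bonferroni(r, n, alpha=0.05):
    """Largest number of tests N for which P < alpha / N still holds.
    A value below 1 means the correlation is not significant even on its own."""
    return alpha / two_sided_p(r, n)


TRUE_R = 0.1  # true correlation assumed in panel e
for n in (100, 1000, 2000, 5000, 7000):
    p = two_sided_p(TRUE_R, n)
    n_max = max_tests_passing_bonferroni(TRUE_R, n)
    print(f"n = {n:5d}   -log10(P) = {-math.log10(p):5.1f}   max tests ~ {n_max:.1e}")
```

Under these assumptions the n = 7000 case comes out at roughly 10¹⁵ tests, the same order of magnitude as stated for panel e. The loop also shows why significance keeps growing with n even though r itself stays fixed.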

  10. Supplementary Fig. 2: Visualisation of 2.8 million univariate cross-subject association tests between 2501 IDPs and 1100 other variables in the UK Biobank database, showing variance explained on the y axis.

    Whereas the version of this plot in Figure 6 reported statistical significance (−log₁₀(P)), here we show variance explained (r²). The relationship between P and r is not an exactly fixed one-to-one mapping here, because different pairs of variables being tested have different numbers of valid (non-missing) data points. The Manhattan plot shows, for each of the 1100 non-brain-imaging variables, the strongest r² association of that variable with each distinct imaging sub-modality's IDPs (i.e., 6 results plotted for each x axis position, each with a color indicating a brain imaging modality; this plot differs from the other Manhattan plots, which show correlations with all IDPs).

  11. Supplementary Fig. 3: Visualisation of modes 1, 2 and 3 from the doubly-multivariate CCA-ICA analyses across all IDPs and non-brain-imaging variables.

    (a) Mode 1 links physical measures of body size, metabolic rate and hand grip strength to brain structure sizes and a range of dMRI-derived measures. (b) Mode 2 primarily links bone density measures to brain structure sizes, T2* levels and a range of dMRI-derived measures. (c) Mode 3 primarily links measures of body fat to T2* levels and resting-state activity fluctuation amplitudes. As seen in Supp Fig 8a, modes 1 and 2 are associated with aging (and sex), while mode 3 is not strongly associated; from Supp Fig 8b, we see that modes 1 and (more strongly) 3 are associated with hypertension.

  12. Supplementary Fig. 4: Visualisation of modes 4, 5 and 6 from the doubly-multivariate CCA-ICA analyses across all IDPs and non-brain-imaging variables.

    (a) Mode 4 links cardiac measures (including heart rate) to resting-state amplitudes and connectivities (rfMRI summary images are shown larger in Supp Fig 5). Observing these 3 types of measures linked together suggests that the apparent change in functional connectivity in this mode likely reflects changes in vascular processes rather than underlying neural connectivity. (b) Mode 5 links a range of lifestyle and biophysical measures (most strongly alcohol intake and smoking, red blood cell and cardiac measures) to T2* subcortical intensity (e.g., iron deposition) and the resting-state amplitudes of many brain areas (rfMRI summary images shown larger in Supp Fig 6). (c) Mode 6 links early-life measures (birth weight and breast feeding) along with several other physical and lifestyle measures to many imaging measures of both diffusivity and functional connectivity (rfMRI summary images are shown larger in Supp Fig 7).

  13. Supplementary Fig. 5: More detailed visualization of the rfMRI summary measures (resting-state fluctuation amplitudes and functional connectivity) from CCA-ICA mode 4.
  14. Supplementary Fig. 6: More detailed visualization of the rfMRI summary measures (resting-state fluctuation amplitudes and functional connectivity) from CCA-ICA mode 5.
  15. Supplementary Fig. 7: More detailed visualization of the rfMRI summary measures (resting-state fluctuation amplitudes and functional connectivity) from CCA-ICA mode 6.
  16. Supplementary Fig. 8: Associations of the nine CCA-ICA modes with confounds and other variables of interest.

    (a) As with the univariate analyses, data fed into the multivariate analyses were first adjusted for parameters that might otherwise induce apparent relationships based on potentially non-interesting factors (age, sex, head size, head motion). Here we show how, by projecting the CCA-ICA modes back onto the original data, we can estimate how strongly the discovered modes relate to these parameters. For example, modes 1, 2, 4, 5, 7 and 8 are associated with aging, whereas modes 3, 6 and 9 are not strongly associated with age. Given that all data input to the CCA had been age-adjusted, this suggests that, while these modes reflect meaningful biological processes related to aging, their identification here was not driven by trivial corruption of IDPs by aging (e.g., reduced fMRI signal due to cortical thinning). (b) Correlation of the CCA-ICA modes against several variables of interest, including some clinical outcomes (for which, at this stage, numbers are naturally quite limited). (c) Scatterplot of CCA-ICA modes 1 vs. 8 (one point per subject), showing the associations between these modes, age and sex. Colors running from green to red indicate increasing age in females; colors running from blue to pink indicate increasing age in males.

References

  1. Filippini, N. et al. Distinct patterns of brain activity in young carriers of the APOE-epsilon4 allele. Proc. Natl. Acad. Sci. USA 106, 72097214 (2009).
  2. Douaud, G. et al. Brain microstructure reveals early abnormalities more than two years prior to clinical progression from mild cognitive impairment to Alzheimer's disease. J. Neurosci. 33, 21472155 (2013).
  3. Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
  4. Allen, N. et al. UK Biobank: current status and what it means for epidemiology. Health Policy and Technology 1, 123126 (2012).
  5. Wick, J.Y. Understanding frailty in the geriatric population. Consult Pharm. 26, 634645 (2011).
  6. Stanek, K.M. et al. Obesity is associated with reduced white matter integrity in otherwise healthy adults. Obesity (Silver Spring) 19, 500504 (2011).
  7. Petersen, S.E. et al. Imaging in population science: cardiovascular magnetic resonance in 100,000 participants of UK Biobank - rationale, challenges and approaches. J. Cardiovasc. Magn. Reson. 15, 46 (2013).
  8. Van Essen, D.C. et al. The WU-Minn Human Connectome Project: an overview. Neuroimage 80, 6279 (2013).
  9. Good, C.D. et al. A voxel-based morphometric study of ageing in 465 normal adult human brains. Neuroimage 14, 2136 (2001).
  10. Karas, G.B. et al. A comprehensive study of gray matter loss in patients with Alzheimer's disease using optimized voxel-based morphometry. Neuroimage 18, 895907 (2003).
  11. Debette, S. & Markus, H.S. The clinical importance of white matter hyperintensities on brain magnetic resonance imaging: systematic review and meta-analysis. Br. Med. J. 341, c3666 (2010).
  12. Haacke, E.M., Xu, Y., Cheng, Y.-C.N. & Reichenbach, J.R. Susceptibility weighted imaging (SWI). Magn. Reson. Med. 52, 612618 (2004).
  13. Duyn, J. MR susceptibility imaging. J. Magn. Reson. 229, 198207 (2013).
  14. Beaulieu, C. The basis of anisotropic water diffusion in the nervous system - a technical review. NMR Biomed. 15, 435455 (2002).
  15. Basser, P.J., Mattiello, J. & LeBihan, D. Estimation of the effective self-diffusion tensor from the NMR spin echo. J. Magn. Reson. B. 103, 247254 (1994).
  16. Zhang, H., Schneider, T., Wheeler-Kingshott, C.A. & Alexander, D.C. NODDI: practical in vivo neurite orientation dispersion and density imaging of the human brain. Neuroimage 61, 10001016 (2012).
  17. Jbabdi, S., Sotiropoulos, S.N., Savio, A.M., Graña, M. & Behrens, T.E. Model-based analysis of multishell diffusion MR data for tractography: how to get over fitting problems. Magn. Reson. Med. 68, 18461855 (2012).
  18. de Groot, M. et al. Improving alignment in Tract-based spatial statistics: evaluation and optimization of image registration. Neuroimage 76, 400411 (2013).
  19. Wakana, S. et al. Reproducibility of quantitative tractography methods applied to cerebral white matter. Neuroimage 36, 630644 (2007).
  20. Logothetis, N.K. What we can do and what we cannot do with fMRI. Nature 453, 869878 (2008).
  21. Hariri, A.R., Tessitore, A., Mattay, V.S., Fera, F. & Weinberger, D.R. The amygdala response to emotional stimuli: a comparison of faces and scenes. Neuroimage 17, 317323 (2002).
  22. Smith, S.M. et al. Functional connectomics from resting-state fMRI. Trends Cogn. Sci. 17, 666682 (2013).
  23. Douaud, G. et al. DTI measures in crossing-fibre areas: increased diffusion anisotropy reveals early white matter alteration in MCI and mild Alzheimer's disease. Neuroimage 55, 880890 (2011).
  24. Ennis, D.B. & Kindlmann, G. Orthogonal tensor invariants and the analysis of diffusion tensor magnetic resonance images. Magn. Reson. Med. 55, 136146 (2006).
  25. Raichle, M.E. et al. A default mode of brain function. Proc. Natl. Acad. Sci. USA 98, 676682 (2001).
  26. Satterthwaite, T.D. et al. An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data. Neuroimage 64, 240256 (2013).
  27. Kuznetsova, K. et al. Brain white matter structure and information processing speed in healthy older age. Brain Struct. Funct. 221, 32233235 (2016).
  28. Madden, D.J. et al. Diffusion tensor imaging of cerebral white matter integrity in cognitive aging. Biochim. Biophys. Acta 1822, 386400 (2012).
  29. Van Der Werf, Y.D. et al. Thalamic volume predicts performance on tests of cognitive speed and decreases in healthy aging. A magnetic resonance imaging-based volumetric analysis. Brain Res. Cogn. Brain Res. 11, 377385 (2001).
  30. Fjell, A.M. & Walhovd, K.B. Structural brain changes in aging: courses, causes and cognitive consequences. Rev. Neurosci. 21, 187221 (2010).
  31. Reichle, E.D., Carpenter, P.A. & Just, M.A. The neural bases of strategy and skill in sentence-picture verification. Cognit. Psychol. 40, 261295 (2000).
  32. Hotelling, H. Relations between two sets of variates. Biometrika 28, 321377 (1936).
  33. Hyvärinen, A. Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Netw. 10, 626634 (1999).
  34. Smith, S.M. et al. A positive-negative mode of population covariation links brain connectivity, demographics and behavior. Nat. Neurosci. 18, 15651567 (2015).
  35. Smith, S.M. et al. Correspondence of the brain's functional architecture during activation and rest. Proc. Natl. Acad. Sci. USA 106, 1304013045 (2009).
  36. Yaffe, K., Browner, W., Cauley, J., Launer, L. & Harris, T. Association between bone mineral density and cognitive decline in older women. J. Am. Geriatr. Soc. 47, 11761182 (1999).
  37. Tan, Z.S. et al. Bone mineral density and the risk of Alzheimer disease. Arch. Neurol. 62, 107111 (2005).
  38. Pirpamer, L. et al. Determinants of iron accumulation in the normal aging brain. Neurobiol. Aging 43, 149155 (2016).
  39. Hamidi, M., Drevets, W.C. & Price, J.L. Glial reduction in amygdala in major depressive disorder is due to oligodendrocytes. Biol. Psychiatry 55, 563569 (2004).
  40. D'Esposito, M., Deouell, L.Y. & Gazzaley, A. Alterations in the BOLD fMRI signal with ageing and disease: a challenge for neuroimaging. Nat. Rev. Neurosci. 4, 863872 (2003).
  41. Kirk, R. Practical significance: a concept whose time has come. Educ. Psychol. Meas. 56, 746759 (1996).
  42. Gage, S.H., Davey Smith, G., Ware, J.J., Flint, J. & Munafò, M.R. G = E: what GWAS can tell us about the environment. PLoS Genet. 12, e1005765 (2016).
  43. Simpson, E. The interpretation of interaction in contingency tables. J. R. Stat. Soc. B 13, 238241 (1951).
  44. Swanson, J.M. The UK Biobank and selection bias. Lancet 380, 110 (2012).
  45. Berkson, J. Limitations of the application of fourfold table analysis to hospital data. Biometrics 2, 4753 (1946).
  46. Pearl, J. Causality: Models, Reasoning and Inference (Cambridge University Press, 2009).
  47. Duff, E.P. et al. Learning to identify CNS drug action and efficacy using multistudy fMRI data. Sci. Transl. Med. 7, 274ra16 (2015).
  48. Insel, T. et al. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am. J. Psychiatry 167, 748–751 (2010).
  49. Schram, M.T. et al. The Maastricht Study: an extensive phenotyping study on determinants of type 2 diabetes, its complications and its comorbidities. Eur. J. Epidemiol. 29, 439–451 (2014).
  50. German National Cohort (GNC) Consortium. The German National Cohort: aims, study design and organization. Eur. J. Epidemiol. 29, 371–382 (2014).
  51. Jenkinson, M., Beckmann, C.F., Behrens, T.E., Woolrich, M.W. & Smith, S.M. FSL. Neuroimage 62, 782–790 (2012).
  52. Daducci, A. et al. Accelerated Microstructure Imaging via Convex Optimization (AMICO) from diffusion MRI data. Neuroimage 105, 32–44 (2015).
  53. Glasser, M.F. et al. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage 80, 105–124 (2013).
  54. Mennes, M. et al. Optimizing full-brain coverage in human brain MRI through population distributions of brain size. Neuroimage 98, 513–520 (2014).
  55. Andersson, J.L., Skare, S. & Ashburner, J. How to correct susceptibility distortions in spin-echo echo-planar images: application to diffusion tensor imaging. Neuroimage 20, 870–888 (2003).
  56. Uğurbil, K. et al. Pushing spatial and temporal resolution for functional and diffusion MRI in the Human Connectome Project. Neuroimage 80, 80–104 (2013).
  57. Larkman, D.J. et al. Use of multicoil arrays for separation of signal from multiple slices simultaneously excited. J. Magn. Reson. Imaging 13, 313–317 (2001).
  58. Moeller, S. et al. Multiband multislice GE-EPI at 7 tesla, with 16-fold acceleration using partial parallel imaging with application to high spatial and temporal whole-brain fMRI. Magn. Reson. Med. 63, 1144–1153 (2010).
  59. Setsompop, K. et al. Blipped-controlled aliasing in parallel imaging for simultaneous multislice echo planar imaging with reduced g-factor penalty. Magn. Reson. Med. 67, 1210–1224 (2012).
  60. Feinberg, D. et al. Multiplexed echo planar imaging for sub-second whole brain FMRI and fast diffusion imaging. PLoS One 5, e15710 (2010).
  61. Jenkinson, M., Bannister, P., Brady, M. & Smith, S. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage 17, 825–841 (2002).
  62. Andersson, J., Jenkinson, M. & Smith, S. Non-linear registration aka spatial normalization. in FMRIB Technical Report (Oxford University, 2007).
  63. Zhang, Y., Brady, M. & Smith, S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Med. Imaging 20, 45–57 (2001).
  64. Smith, S.M. et al. Longitudinal and cross-sectional analysis of atrophy in Alzheimer's disease: cross-validation of BSI, SIENA and SIENAX. Neuroimage 36, 1200–1206 (2007).
  65. Patenaude, B., Smith, S.M., Kennedy, D.N. & Jenkinson, M. A Bayesian model of shape and appearance for subcortical brain segmentation. Neuroimage 56, 907–922 (2011).
  66. Mugler, J.P. III. Optimized three-dimensional fast-spin-echo MRI. J. Magn. Reson. Imaging 39, 745–767 (2014).
  67. Andersson, J.L. & Sotiropoulos, S.N. Non-parametric representation and prediction of single- and multi-shell diffusion-weighted MRI data using Gaussian processes. Neuroimage 122, 166–176 (2015).
  68. Andersson, J.L. & Sotiropoulos, S.N. An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging. Neuroimage 125, 1063–1078 (2016).
  69. Hernández, M. et al. Accelerating fibre orientation estimation from diffusion weighted magnetic resonance imaging using GPUs. PLoS One 8, e61892 (2013).
  70. Smith, S.M. et al. Tract-based spatial statistics: voxelwise analysis of multi-subject diffusion data. Neuroimage 31, 1487–1505 (2006).
  71. Wakana, S., Jiang, H., Nagae-Poetscher, L.M., van Zijl, P.C.M. & Mori, S. Fiber tract-based atlas of human white matter anatomy. Radiology 230, 77–87 (2004).
  72. Bannister, P.R., Brady, J.M. & Jenkinson, M. Integrating temporal information with a non-rigid method of motion correction for functional magnetic resonance images. Image Vis. Comput. 25, 311–320 (2007).
  73. Barch, D.M. et al. Function in the human connectome: task-fMRI and individual differences in behavior. Neuroimage 80, 169–189 (2013).
  74. Woolrich, M.W., Ripley, B.D., Brady, M. & Smith, S.M. Temporal autocorrelation in univariate linear modeling of FMRI data. Neuroimage 14, 1370–1386 (2001).
  75. Beckmann, C.F. & Smith, S.M. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Trans. Med. Imaging 23, 137–152 (2004).
  76. Salimi-Khorshidi, G. et al. Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers. Neuroimage 90, 449–468 (2014).
  77. Smith, S.M., Hyvärinen, A., Varoquaux, G., Miller, K.L. & Beckmann, C.F. Group-PCA for very large fMRI datasets. Neuroimage 101, 738–749 (2014).
  78. Kiviniemi, V., Kantola, J.-H., Jauhiainen, J., Hyvärinen, A. & Tervonen, O. Independent component analysis of nondeterministic fMRI signal sources. Neuroimage 19, 253–260 (2003).
  79. Kiviniemi, V. et al. Functional segmentation of the brain cortex using high model order group PICA. Hum. Brain Mapp. 30, 3865–3886 (2009).
  80. Smith, S.M. The future of FMRI connectivity. Neuroimage 62, 1257–1266 (2012).
  81. Smith, S.M. et al. Network modelling methods for FMRI. Neuroimage 54, 875–891 (2011).
  82. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995).
  83. Genovese, C.R., Lazar, N.A. & Nichols, T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage 15, 870–878 (2002).
  84. Sui, J. et al. A CCA+ICA based model for multi-task brain imaging data fusion and its application to schizophrenia. Neuroimage 51, 123–134 (2010).

Author information

Affiliations

  1. Oxford Centre for Functional MRI of the Brain (FMRIB), University of Oxford, Oxford, UK.

    • Karla L Miller,
    • Fidel Alfaro-Almagro,
    • Saad Jbabdi,
    • Stamatios N Sotiropoulos,
    • Jesper L R Andersson,
    • Ludovica Griffanti,
    • Gwenaëlle Douaud,
    • Thomas W Okell,
    • Mark Jenkinson &
    • Stephen M Smith
  2. Department of Electrical Engineering, Brigham Young University, Provo, Utah, USA.

    • Neal K Bangerter
  3. Institute of Neurology, University College London, London, UK.

    • David L Thomas
  4. Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, Minnesota, USA.

    • Essa Yacoub
  5. Icahn School of Medicine at Mount Sinai, New York, New York, USA.

    • Junqian Xu
  6. Department of Neuroradiology, University of Heidelberg, Heidelberg, Germany.

    • Andreas J Bartsch
  7. Siemens Healthcare UK, Frimley, UK.

    • Peter Weale &
    • Iulius Dragonu
  8. UK Biobank, Stockport, UK.

    • Steve Garratt,
    • Sarah Hudson &
    • Rory Collins
  9. Nuffield Department of Population Health, University of Oxford, Oxford, UK.

    • Rory Collins
  10. Division of Brain Sciences, Department of Medicine, Imperial College London, London, UK.

    • Paul M Matthews

Contributions

K.L.M., R.C., P.M.M. and S.M.S. provided the overall scientific strategy for UK Biobank brain imaging. K.L.M., N.K.B., D.L.T., E.Y., J.X., A.J.B., S.J., S.N.S., J.L.R.A., M.J., P.M.M. and S.M.S. developed acquisition protocols. N.K.B., K.L.M., T.W.O., P.W., I.D., S.G. and S.H. implemented the imaging protocol at the dedicated imaging center. F.A.-A., K.L.M., S.J., S.N.S., J.L.R.A., L.G., G.D., M.J. and S.M.S. developed post-processing pipelines and IDP calculation. K.L.M. and S.M.S. carried out the univariate and multivariate analyses and prepared figures. K.L.M. and S.M.S. wrote the manuscript, which was edited by all of the authors.

Competing financial interests

P.W. and I.D. are employees of Siemens Healthcare UK, the vendor of MRI scanners for UK Biobank, selected under a competitive bidding process.

Supplementary information

Supplementary Figures

  1. Supplementary Figure 1: Asymptotic behavior of group average images and correlations with age, with increasing numbers of subjects. (960 KB)

    (a) Resting-state network group-averages formed from 5 different group sizes. Individual subjects' preprocessed resting-state data were used to generate subject-specific effect-size maps (arbitrary but consistent units of resting activity strength) of one of the low-dimensional resting-state networks (the default mode network, here shown with both positive=red and negative=blue involvement in this network, with the same color-coding and thresholding applied in all cases). These were then averaged across subjects for a range of subject numbers. Increasing n suppresses background noise as expected, and the non-noise network structure asymptotes towards a constant map as n rises over 100. Although any imperfections in functional spatial alignment across subjects limit the sharpness of the asymptotic group-average, as n is raised even further to 5000, there is no sign of a degradation (e.g., blurring) of the group-average map (compared with lower subject numbers), as expected. (b-d) Voxelwise correlation of the same resting-state network with increasing subject age, again to illustrate the statistical effect of increasing the number of subjects used. (b) The age correlation map when using 5000 subjects; with increasing age the correlations are dominantly negative, indicating a weakening of this cognitive network (r>0.1, Pcorrected<10^-10). (c) The 10 voxels having the strongest (positive or negative) correlation with age, that are also at least 10mm distant from each other, are used to form 10 plots of age-correlation against number of subjects used in the correlation, from 10 to 5000 subjects. As expected, the plots asymptote, with increasing subject numbers, towards the "true" final value, with noticeable instability up to as many as 2000 subjects. (d) From the same 10 sets of correlations, the statistical significance (-log10(P)) is shown. Whereas r asymptotes towards its true final constant value with increasing n, statistical significance (for a non-null correlation) has an ever-increasing trend with increasing n. (e) Theoretical relationship between increasing statistical power and subject numbers, assuming a true correlation between any two variables of r=0.1. As n increases (here up to 7000), the number of distinct tests that will pass Bonferroni multiple-comparison correction rises to very large numbers, here up to around 10^15.

  2. Supplementary Figure 2: Visualisation of 2.8 million univariate cross-subject association tests between 2501 IDPs and 1100 other variables in the UK Biobank database, showing variance explained on the y axis. (439 KB)

    Whereas the version of this plot in Figure 6 reported statistical significance (-log10(P)), here we show variance explained (r^2). Here the relationship between P and r is not an exact one-to-one mapping, because the number of valid (non-missing) data points differs between the pairs of variables being tested. The Manhattan plot shows, for each of the 1100 non-brain-imaging variables, the strongest r^2 association of that variable with each distinct imaging sub-modality's IDPs (i.e., 6 results plotted for each x axis position, each with a color indicating a brain imaging modality; this plot differs from the other Manhattan plots, which show correlations with all IDPs).

  3. Supplementary Figure 3: Visualisation of modes 1, 2 and 3 from the doubly-multivariate CCA-ICA analyses across all IDPs and non-brain-imaging variables. (1,289 KB)

    (a) Mode 1 links physical measures of body size, metabolic rate and hand grip strength to brain structure sizes and a range of dMRI-derived measures. (b) Mode 2 primarily links bone density measures to brain structure sizes, T2* levels and a range of dMRI-derived measures. (c) Mode 3 primarily links measures of body fat to T2* levels and resting-state activity fluctuation amplitudes. As seen in Supp Fig 8a, modes 1 and 2 are associated with aging (and sex), while mode 3 is not strongly associated; from Supp Fig 8b, we see that modes 1 and (more strongly) 3 are associated with hypertension.

  4. Supplementary Figure 4: Visualisation of modes 4, 5 and 6 from the doubly-multivariate CCA-ICA analyses across all IDPs and non-brain-imaging variables. (1,432 KB)

    (a) Mode 4 links cardiac measures (including heart rate) to resting-state amplitudes and connectivities (rfMRI summary images are shown larger in Supp Fig 5). Observing these 3 types of measures linked together suggests that the apparent change in functional connectivity in this mode likely reflects changes in vascular processes rather than underlying neural connectivity. (b) Mode 5 links a range of lifestyle and biophysical measures (most strongly alcohol intake and smoking, red blood cell and cardiac measures) to T2* subcortical intensity (e.g., iron deposition) and the resting-state amplitudes of many brain areas (rfMRI summary images shown larger in Supp Fig 6). (c) Mode 6 links early-life measures (birth weight and breast feeding) along with several other physical and lifestyle measures to many imaging measures of both diffusivity and functional connectivity (rfMRI summary images are shown larger in Supp Fig 7).

  5. Supplementary Figure 5: More detailed visualization of the rfMRI summary measures (resting-state fluctuation amplitudes and functional connectivity) from CCA-ICA mode 4. (581 KB)
  6. Supplementary Figure 6: More detailed visualization of the rfMRI summary measures (resting-state fluctuation amplitudes and functional connectivity) from CCA-ICA mode 5. (363 KB)
  7. Supplementary Figure 7: More detailed visualization of the rfMRI summary measures (resting-state fluctuation amplitudes and functional connectivity) from CCA-ICA mode 6. (349 KB)
  8. Supplementary Figure 8: Associations of the nine CCA-ICA modes with confounds and other variables of interest. (319 KB)

    (a) As with the univariate analyses, data fed into the multivariate analyses were first adjusted for parameters that might otherwise induce apparent relationships based on potentially non-interesting factors (age, sex, head size, head motion). Here we show how, by projecting the CCA-ICA modes back onto the original data, we can estimate how strongly the discovered modes relate to these parameters. For example, modes 1,2,4,5,7,8 are associated with aging, whereas 3,6,9 are not strongly associated with age. Considering this in the light of the fact that all data input to the CCA had been age-adjusted suggests that, while these modes reflect meaningful biological processes related to aging, their identification here was not driven by trivial corruption of IDPs by aging (e.g., reduced fMRI signal due to cortical thinning). (b) Correlation of the CCA-ICA modes against several variables of interest, including some clinical outcomes (for which, at this stage, numbers are naturally quite limited). (c) Scatterplot of CCA-ICA modes 1 vs. 8 (one point per subject), showing the associations between these modes, age and sex. Colors running from green to red indicate increasing age in females; colors running from blue to pink indicate increasing age in males.
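
The scaling described in Supplementary Figure 1e (a fixed correlation of r=0.1 passing Bonferroni correction across ever more tests as n grows) can be illustrated with a minimal sketch. This is not the authors' calculation: it assumes a two-sided test using the Fisher z normal approximation and a family-wise alpha of 0.05, and the function names are hypothetical.

```python
import math


def p_value_for_correlation(r: float, n: int) -> float:
    """Two-sided P value for a sample correlation r from n subjects.

    Assumption: Fisher z-transform with a normal approximation,
    z = atanh(r) * sqrt(n - 3).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc keeps precision for very small tail probabilities.
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni_test_budget(r: float, n: int, alpha: float = 0.05) -> float:
    """Largest number of tests m for which P still satisfies P < alpha / m.

    A value below 1 means the correlation is not significant even
    as a single uncorrected test at this alpha.
    """
    return alpha / p_value_for_correlation(r, n)


if __name__ == "__main__":
    r = 0.1  # fixed true correlation, as assumed in Supplementary Figure 1e
    for n in (100, 500, 1000, 2000, 5000, 7000):
        p = p_value_for_correlation(r, n)
        budget = bonferroni_test_budget(r, n)
        print(f"n={n:5d}  -log10(P)={-math.log10(p):6.2f}  tests={budget:.3g}")
```

Under these assumptions the test budget at n=7000 comes out on the order of 10^15, consistent with the caption, while at n=100 the same correlation would not pass even a single test.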

PDF files

  1. Supplementary Text and Figures (18,514 KB)

    Supplementary Figures 1–8 and Supplementary Table 1

  2. Supplementary Methods Checklist (410 KB)

Additional data