Introduction

Brain development is a dynamic process that follows a well-defined trajectory during childhood and adolescence. Variations in neurodevelopmental trajectories over this critical period have direct consequence for adult functioning and mental health. During this formative period the brain undergoes profound change: increases in brain size, most rapid after birth, continue into late adolescence1; myelination processes that begin in utero continue to progress through to the second decade of life2, and synaptic pruning leads to significant reductions in synaptic density during early adolescence3. Magnetic resonance imaging (MRI) provides the opportunity to study brain development and track these developmental processes in vivo.

Analyses of structural MRI have found that grey and white matter volumes follow different trajectories during adolescence. Cortical grey matter volume peaks in childhood, then gradually decreases during adolescence4,5, whereas white matter volume increases throughout childhood and adolescence6,7. These observations were recently confirmed across four independent samples, where Tamnes et al. observed consistent developmental trajectories characterised by a decreasing cortical thickness with increasing age and childhood increases in surface area followed by subtle decreases during adolescence8. Evidence suggests that sexual dimorphism may also play a role in cerebral development. Sex differences have been observed in developmental studies of cortical thickness9 and sex-by-age interactions in area and folding have been reported in frontal and temporal cortex in adolescence10,11. These differences may result from delays in development along similar trajectories, the timing of which can coincide with pubertal onset12,13, although the accurate definition and timing of developmental peaks is difficult and may be at risk to potential model or sample biases14,15. Any differences between the sexes are likely subtle; male brains are larger throughout development16, and regional, volumetric estimates of sex differences can be diminished once corrected for global differences in scale4.

Taken together, these studies present a consensus view of typical cerebral development. However, longitudinal studies of healthy populations have shown that individual development can deviate significantly from these canonical trajectories15, highlighting the need to create models of typical growth and development of the brain that can be applied on an individual level. Establishing developmental trajectories for typical cerebral development is vital to our understanding of maturational brain change and may provide a more accurate understanding of relationships between brain maturation and behavioural phenotypes during development.

MRI data can generally be considered high-dimensional: a single image, or cortical surface, may comprise over 250,000 voxels, or vertices, many of which are highly correlated and interrelated. Manifold learning refers to a suite of dimension reduction techniques based on the intuition that high-dimensional datasets (such as MRI) reside on an embedded low-dimensional manifold or subspace. The aim is to learn a mapping between the high- and low-dimensional data representations while preserving certain statistical properties (e.g.: variance, community structure) of the original data. Such methods are well established in computer vision problems17,18,19,20 and bioinformatics21,22,23 to deal with high-throughput data and can allow for an easier and more intuitive interpretation of important model features while retaining the underlying nonlinear relationships present between individual datapoints in the original dataset.

One such example, Neighbourhood Preserving Embedding (NPE) aims to preserve the local neighbourhood structure of the data once projected onto the manifold17. That is, communities, patterns, or between-group differences that exist among datasets in the high-dimensional setting are maximally preserved within the low-dimensional subspace. Local neighbourhoods are typically defined based on the Euclidean distance between samples in the high-dimensional image space, however it is possible to introduce other constraints in order to conduct NPE in a supervised setting (Fig. 1)17,24. For example, restricting local neighbourhoods to only include subjects from the same class, or diagnostic group, to enhance separation in the low-dimensional space.

Figure 1
figure 1

Neighbourhood preserving embedding. (A) For a given datapoint, \({X}_{i}\), nearest neighbours are selected and weights assigned that can be used to approximately reconstruct \({X}_{i}\). A linear transformation, P, is then sought to project the data into a low-dimensional space while preserving the neighbourhood structure. (B) Possible supervision strategies for neighbourhood construction. In an unsupervised setting, neighbours are selected based on image similarity alone; alternatively, neighbours can be selected from within- or between-classes in order to maximise/minimise group differences in the manifold structure. Similarly, neighbours can be selected based on the weighted combination of image similarity and that of another subject-specific attribute (e.g.: age). (C) Analysis pipeline for NPE analysis. For each image metric, NPE is used for subspace projection, before the embedded data are combined and passed on for statistical modelling.

Manifold learning methods have great potential to aid in the analysis and understanding of high-dimensional MRI datasets25,26,27 with recent examples demonstrating improved discrimination between neurological disease states28, mapping of developmental trajectories in the newborn brain29, and detection of white matter lesions30. Dimension reduction leverages the redundancy often present in high-dimensional data, acting as a form of statistical regularisation and reducing the risk of overfitting models when the number of samples is much less than the number of model features. Importantly, these methods can serve to highlight latent patterns within the dataset that align with variables of interest such as age, sex or clinical diagnosis.

Recently, studies have shown that it is possible to accurately predict an individual’s age31,32,33,34,35 or sex36,37 based on MRI images alone. These methods involve the use of modern multivariate techniques to extract informative morphological features to act as neuroanatomical markers of age or sex in a predictive model. In this study, we aim to build on these observations, introducing supervised NPE as a method to isolate structured patterns of covariance within populations, and simultaneously modelling typical neuroanatomical variation due to age and sex during development. Using a multi-metric approach, we combine measures of tissue volume, cortical area and cortical thickness to build a model that predicts age and sex with high accuracy and generalises well to other developmental populations.

Results

For every participant (n = 768, age 3.0–21.0 y), we constructed maps of brain tissue volume, cortical thickness and cortical area from T1-weighted MRI (see Methods). For each image metric (volume, thickness and area), we projected the full image data matrix to a 3 dimensional manifold using NPE based on 10 nearest-neighbours (Fig. 1). In order to maximally preserve age- and sex-related variation in the embedded data, we incorporated participant attributes into the construction of the weighted neighbourhood matrices (Fig. 1B; see Methods).

Figure 2 shows manifold structure visualised for tissue volume (Fig. 2A), cortical area (Fig. 2B) and cortical thickness (Fig. 2C) calculated in the PING cohort. For each metric, an orthogonal rotation was applied to each manifold to maximise correlation with age and sex along the first and second axes respectively. The images show the (standardised) weight of the embedding vectors (model coefficients) used to project new data to the rotated manifold, thus important features are represented by a larger weight, and increases in volume, thickness or area in regions with high positive weight will result in a positive increase along the respective embedding axis and vice versa.

Figure 2
figure 2

Manifold structure for tissue volume, cortical area and cortical thickness. Manifold structure is visualised for tissue volume (A), cortical area (B) and cortical thickness (C). For each image metric, the first two embedding coordinates are plotted against each other. Each point represents a subject; the colourbar indicates age and markers denote sex (square: male; circle: female). Images show the embedding vectors for the first and second coordinates, i.e.: the model coefficients in each voxel required to transform data into the embedded subspace. Maps are Z-scored for comparison (colourbar).

For all three modalities, age increases along the first embedding coordinate, indicated by the gradation of colour along the first axis of the scatter plots Increasing age (a positive embedding coordinate) is associated with a neuroanatomical pattern represented by relatively increased tissue volume in the cerebellum, brain stem and thalamus (black arrows; Fig. 2A), and ascending white matter tracts subjacent to the primary motor cortex (positive image weights), with relative decreases in medial frontal and parietal cortices (negative image weights, Fig. 2A). In the cortex, age is associated with increasing surface area in the insula and anterior cingulate cortex (black arrows, Fig. 2B).

Separation between sexes is shown along the second dimension (squares and circles, Fig. 2). This is associated with a distributed, discriminant pattern of tissue volume alterations with increases in medial posterior cingulate (black arrow; Fig. 2A) and primary visual cortex, and brainstem, and relative decreases in the basal ganglia, frontal pole, and cerebellum (negative weights, black arrow) associated with male sex. Separation along the second dimension was also associated with regions of increased surface area in medial temporal and orbitofrontal cortices (black arrow, Fig. 2B), and decreased cortical thickness in the anterior insula (arrow, Fig. 2C).

Figure 3A shows the joint manifold structure visualised after concatenating all three image metrics and performing a final dimension reduction on the concatenated coordinate data, \({Y}_{c}=({Y}_{v},{Y}_{t},{Y}_{a})\) using PCA29. This demonstrates how individual variation associated with age and sex during development can be captured along two orthogonal dimensions in this population.

Figure 3
figure 3

Age and sex prediction with manifold embedding. (A) The first two coordinates of the joint manifold are shown, each point represents a subject; the colourbar indicates age and markers denote sex (square: male; circle: female). (B) Using 10-fold cross-validation, age and sex were predicted based on the concatenated manifold coordinates. Gaussian Process regression was used to predict age, shown plotted against chronological age (colourbar shown as in A). (C) Predicted class probabilities are shown for males (blue histogram) and females (yellow). (D) Predicted age error is shown for each site in PING separately. (E) Sex classification accuracies for each site in PING.

Model accuracy

Using 10-fold cross-validation, we predicted age in the PING cohort with a mean absolute error (MAE) of 1.54 years (correlation between chronological and predicted age = 0.926; Fig. 3B). Using a linear discriminant classifier, we predicted sex with an accuracy of 80.9% (Fig. 3C).

Our model was robust to site variation in the PING cohort: performing NPE without the additional site constraint did not affect the prediction accuracies (MAE = 1.49 y, accuracy = 80.0%) and there was no significant association between acquisition site and absolute age estimation error (ANOVA: F6,761 = 1.95, p = 0.07; Fig. 3D). Classification accuracy was similar across sites, ranging from 75.0–85.9% (Fig. 3E).

We repeated this analysis, accounting for differences in global scale by correcting voxel-(vertex-)wise measures of tissue volume, cortical surface area and thickness metrics for intracranial volume, total surface area and mean cortical thickness, respectively. After correcting for global scale, we achieved a cross-validated MAE of 1.79 years, and an accuracy of 71.4% in the PING cohort (Fig. 4).

Figure 4
figure 4

Age and sex prediction after correction for global scaling. (Top) Mean absolute error for the full manifold and for each imaging metric are shown before and after correction for individual differences in global scale. (Bottom) Corresponding sex classification accuracies for the full manifold and for each imaging metric.

Contribution of each metric

To assess the individual contribution of each set of image metrics to the final model, we also performed the analysis using only tissue volume, cortical thickness or cortical area data. We found that the joint model combining all three metrics outperformed single metric models for both age and sex prediction (tissue volume only: MAE = 1.69, accuracy = 71.9%; area: MAE = 2.77, accuracy = 72.7%; thickness: MAE = 2.07, accuracy = 71.5%). Figure 4 shows MAE and classification accuracies for each tissue metric with and without correction for variations in global scale.

After correcting for variation due to global scale, the ability to discriminate between sexes was reduced for tissue volume (classification accuracy = 58.3%) and cortical area (52.7%), but not cortical thickness (70.1%). In contrast, global scaling significantly increased error in age prediction estimates based on cortical thickness (MAE = 4.62 y), but had little impact on models using tissue volume (1.66 y) or cortical area (2.96 y).

Associations between model accuracy and age

To determine if model accuracy varied with age, we partitioned our data into 10 approximately equal-sized bins and calculated MAE and classification accuracy in each (Table 1). Age prediction error ranged from a minimum of 1.23 y at around 9 years of age, to a maximum of 2.55 y in the oldest participants (mean age = 20.3 y). In contrast, classification accuracy ranged from 69% to 87%, with discrimination lowest in the youngest participants (mean age = 4.5 y) and highest at around 16 years. After correction for global scale, the minimum MAE was 1.50 (mean age 11.7 y; maximum MAE = 2.9, mean age 22.3 y), and the lowest and highest classification accuracies were 65.3% in the youngest group, and 80.3% at 16 years.

Table 1 Age and sex prediction accuracy at different ages.

Associations between model accuracy and cognition

In order to determine if deviations from the average developmental trajectory of the brain coincided with differences in cognitive performance we compared predicted age errors (the difference between age estimated from MRI using the above model and true, chronological age) with cognitive scores in PING.

In the PING cohort, no significant associations were found between NTCB scores (corrected for age, sex, socioeconomic status and genetic ancestry) and predicted age error after correcting for multiple comparisons (Table 2).

Table 2 Associations between predicted age error and cognitive score in PING.

Alternative parameter settings

Prediction accuracies in the PING cohort were robust to altering the number of neighbours, k, used in manifold construction (k = 5: classification accuracy = 80.9%, MAE = 1.51 y; \(k\) = 20: accuracy = 79.7%; MAE = 1.53 y), and the number of manifold dimensions, d (d = 5: accuracy = 80.1%, MAE = 1.59 y; \(d\) = 10: accuracy = 81.1%, MAE = 1.59 y).

Finally, we found that using NPE for subspace projection outperformed dimension reduction with PCA (MAE = 1.61 y, accuracy = 71.2%; correcting for global scaling, MAE = 1.91 y, accuracy = 54.4%).

External model validation

To test the external validity of this modelling approach, we projected imaging data acquired from typically developing controls in two independent populations (ABIDE and ABIDE-II) onto a low-dimensional manifold constructed from the PING dataset. Using this approach, we predicted age in ABIDE with an MAE of 1.65 years (correlation: 0.825; Fig. S1A), and sex with an accuracy of 80.0% (Fig. S1B; 1.87y and 71.2% after correction for global scaling). We achieved similar results in ABIDE-II (Fig. S1; MAE = 1.54, correlation: 0.817; accuracy = 80.2; 1.80 y and 70.6% after correction for global scaling). In contrast to the PING data, a main effect of site was evident in both ABIDE and ABIDE-II age prediction error (F16,407 = 1.86, p = 0.02; F14,424 = 16.56, p < 0.001; Supplemental Information; Fig. S1C,D). When considering MAE and accuracy over sites in ABIDE and ABIDE-II, we find a positive correlation between mean cohort age and MAE (i.e: age prediction is worse in sites with older participants; ABIDE r = 0.23, ns; ABIDE II r = 0.78, p < 0.001), however, as in the PING cohort, we see an improvement in sex prediction accuracy with increasing age (ABIDE r = 0.72, p < 0.001; ABIDE-II r = 0.22 ns).

Discussion

In this study, we use manifold learning to generate a parsimonious description of typical brain development during childhood and adolescence. By combining measures of tissue volume, cortical thickness and cortical area, we show how patterns of anatomical variation can be used to accurately predict age and sex between the ages of 3 and 21 years. We show that this model is not strongly affected by site-to-site variation in image acquisition and yields accurate predictions across different study populations.

We demonstrate that supervised NPE can be used to predict biological age from MRI with a mean error of around 1.5 years. This is in line with previous reports in this population. Using a set of 231 pre-selected, image-based features from T1, T2 and diffusion-weighted MRI, Brown et al. developed a nonlinear model of cerebral maturation to predict age in the PING cohort, achieving an MAE of 1.03 years31. Using just T1-weighted image features, Brown et al. reported an MAE of 1.71 y, comparable to the MAE reported in this study. Using a similar approach, Franke et al. reported an MAE of 1.1 years in a cohort aged 5 to 1838. Using nonlinear mapping functions (i.e.: kernels) in machine learning allows for the use of linear methods to discover highly nonlinear boundaries or patterns in the original data by creating an implicit feature space39. While flexible, a limitation of these methods is the inability to identify important features in the space of the original dataset. An alternative approach may be to use regularised linear regression across all cortical regions, although this method depends upon an initial cortical parcellation35. By calculating a linear mapping between the original, high-dimensional data and the low-dimensional embedded manifold, NPE produces a set of basis vectors that – through linear combination – can approximately reconstruct the original dataset, capture nonlinear relationships and provide interpretable voxel-(vertex-)wise maps of feature importance17.

We show examples of these vectors in Fig. 2 for each image metric. The images represent distributed patterns of neuroanatomical variation that highlight regional importance within the model. As reported previously, increasing age is reflected a lower-to-higher-order trajectory characterised by reduced tissue volume in frontal and parietal regions, coupled with increased white matter tissue and brain stem volume6,40,41. Increasing age was also associated with increased cortical surface area most prominent in the insula and cingulate cortex. This agrees with previous reports of high rates of cortical surface area expansion during childhood in regions associated with higher-order intellectual function42,43.

We also find that male sex is predicted by a pattern of neuroanatomical variation including increased brain tissue volume in the posterior cingulate and occipital lobe, volumetric decreases in the basal ganglia and insula, alongside a pattern of reduced cortical surface area in medial frontal regions and decreased cortical thickness in the insula. We highlight that the manifold embedding coordinates defined by NPE are orthogonal by construction; as such, the patterns shown in Fig. 2 reflect anatomical variation independently associated with age and sex during development.

Sexual dimorphism during development is a contentious issue. Developmental trajectories for cortical grey and white matter appear similar between sexes4,6 with perceived sex differences often assigned to variation in physical size1,44. In a longitudinal study of 387 subjects aged 3 to 27, Lenroot et al. reported increased frontal grey matter volume in females and increased occipital white matter in males, after accounting for brain size13. Conversely, Sowell et al. reported thicker parietal and posterior temporal cortex in females, independent of age9. These discrepant findings may reflect the different timing of puberty or the differential effects of testosterone on brain development males and females45. In addition, male brains are larger than females throughout development16, and correcting for differences in global scale can diminish perceived sex differences in regional, cerebral volumes4. After correcting for global differences in intracranial volume, total surface area and mean cortical thickness, we were still able to achieve relatively accurate predictions of age and sex across development, although both were maximised with the inclusion of global scaling information.

Concerning differences in global scale that vary with age or sex, we report accuracies of both corrected and uncorrected metrics, using ICV, total surface area and mean cortical thickness as covariates. Previous developmental studies have used ICV, or alternatively brain tissue volume (specifically: parenchymal volume, excluding ventricular CSF) to correct regional measures of volume, both of which have advantages and disadvantages15. As a consensus on the use of ICV or brain tissue volume to correct volumetric measures is lacking4, we report uncorrected and ICV-corrected accuracies, though we note that brain tissue volume and ICV correlated strongly in the PING cohort (r = 0.87) and our model performance was not affected by the use of brain tissue as a covariate instead of ICV.

Importantly, our study confirms recent reports that multivariate analyses that consider whole-brain patterns of variation in brain morphometry can reliably and accurately discriminate sex, even if large within-class, or between region, variability exists36,37. We also show that this dimorphic pattern is evident even at very young ages, achieving a classification accuracy of around 69% (65% after global correction) in the youngest participants (aged 3–5 y). It is important to consider that this finding does not imply that all females have e.g.: a smaller posterior cingulate than all males, or even that the cingulate is on average smaller in females across populations. We suggest that this pattern of variation is one of a number (including the pattern of age-related variation described above) that exist concurrently within a population. An individual’s anatomical phenotype can then be viewed as arising from the weighted expression of these patterns, with the weight dependent on e.g.: age, sex, environmental or genetic factors. As such, tissue volume or cortical thickness, measured at a single point reflects the combination of multiple, distributed patterns of variation and, as such, may vary greatly over a population and not necessarily in line with a given covariate of interest (e.g.: sex).

In contrast to previous reports, we do not find strong evidence that deviation from a predicted developmental trajectory corresponds to adverse functional or behavioural outcome in healthy individuals. Previously, the difference between chronological age and age predicted from neuroimaging has been framed as an index of accelerated or delayed development or aging, depending on the direction of the discrepancy46. Recently, Erus et al. reported that a significant increase in predicted brain age compared to chronological age in childhood (predicted by a multi-modal MRI assessment) was indicative of precocious cognitive development (and vice versa), suggesting that complex cognitive phenotypes could be captured as variation along a single dimension of brain development33. In this study, we did not find any statistically significant associations between measures of cognitive function and age prediction error in typically-developing individuals. In the PING cohort, Akshoomoff et al., found that age, sex, socioeconomic status and genetic ancestry explained between 57% and 73% of variance in each of the NCTB scores47. After correcting for these factors, we found that brain age estimation error explained at most 0.7% of the remaining variance (in memory and pattern comparison tests).

This suggests that model error in age prediction in this context does not reflect the impact of an underlying latent variable associated with cognition. This discrepancy is likely due to differences in model construction between methods. Here, we use NPE to extract a single dimension of anatomical variation that aims to maximally preserve specific age-related structure in the data. As such, the reported pattern may lie orthogonal to neuroanatomical correlates of (non age-related) cognitive performance and as mentioned above, such patterns, while coexistent within the population, will not vary as a function of each other. Indeed, once corrected for age-related variation, measures of cognition will, by definition, lie orthogonal to age, and thus age-specific neuroanatomical variation. Other methods that predict age based on the appearance of the brain as a whole, might better reflect the conflation of cognitive and age-related ‘components’ during development, such that model error captures variation in anatomy aligned with cognition33.

We found that age prediction accuracy decreased slightly with increasing age. In the PING cohort, MAE was lowest at around 9 years, and highest at 20. Other studies have found similar discrepancies35,48 and this may indicate the increasing difficulty of discrimination between individuals in early adulthood compared to childhood and adolescence, suggesting that image-based prediction is most accurate during the periods when the rate of anatomical change is greatest1,31.

In an effort to assess the external validity of this approach, we tested the accuracy of the NPE model constructed in the PING dataset to predict age and sex in a comparative population. Overall, we found this model was able to age and sex to the same level of accuracy in the ABIDE cohorts (Fig. S1A,B). We also observed a trend towards a model-based underestimation of age in the older participants. However, although no effect of site was found in the PING cohort, there was a significant effect of site on accuracy in both ABIDE and ABIDE-II (Fig. S1). This effect appeared to be compounded by the uneven distribution of age across sites, with the mean age within site ranging from 8.15 years to 20.75 years (Table S2), and the older sites showing significantly poorer model performance (e.g.: site p in ABIDE; sites a and k in ABIDE-II; Fig. S1).

Our findings suggest that the anatomical maturation of the brain during childhood and adolescence can be accurately modelled within a low-dimensional subspace. That is, variation along two axes is sufficient to capture individual variations due to age and sex within a population with relatively high accuracy. In contrast, functional or cognitive development is not well represented by variation along these axes. This suggests that additional, orthogonal dimensions of development are required to more accurately model individual trajectories. Alternatively, this approach may benefit from incorporating information from additional imaging modalities (e.g.: functional MRI) in order to more fully capture phenotypic variation associated with cognitive development33,49.

Subspace projection methods, such as NPE, bring focus to reliable and robust patterns that can predict phenotypic characteristics based on the brain’s shape and appearance. Global, linear methods including PCA and ICA have proven highly effective in identifying patterns in MRI data analysis27,50,51. However, as PCA seeks to preserve global properties of the data it may be dominated by sources of confounding variance, e.g.: site-to-site variation52. Alternatively, nonlinear embedding methods can often achieve highly-accurate embeddings based on local geometry, allowing visualisation of complex geometry in low-dimensional space but can not be easily applied to new, unseen datapoints19,53. NPE somewhat combines these approaches, calculating a linear embedding that allows the projection of unseen data into a previously defined subspace, learned from local neighbourhood geometry. In a clinical setting, this framework could be extended to explore anatomical patterns underlying developmental or neuropsychiatric disorder, or stratifying clinical populations by locating individuals within clusters based on the expression of different neuroanatomical imaging components. Indeed, combining functional, diffusion and possibly genetic information into a larger manifold framework and considering similarities over multiple modalities to model local neighbourhoods and communities within large datasets could provide a more complete model of individual variation during this time period. In addition, the projection of longitudinal data onto the manifold could enable individuals to be tracked over time, an important consideration for developmental studies15. With increasing sample sizes from large-scale imaging studies, there is also great potential for the application of deep learning to such problems54,55.

In summary, we present a framework for modelling anatomical development during childhood. This model accurately predicts age and sex based on image-derived markers of cerebral morphology and generalises well to independent populations.

Methods

Imaging data

To model typical neurodevelopment, 3 Tesla, T1-weighted MRI data were obtained from the PING Study56. The PING cohort comprises a large, typically-developing paediatric population with participants from several US sites included across a wide age and socioeconomic range. The human research protections programs and institutional review boards at all institutions (Weil Cornell Medical College, University of California at Davis, University of Hawaii, Kennedy Krieger Institute, Massachusetts General Hospital, University of California at Los Angeles, University of California at San Diego, University of Massachusetts Medical School, and Yale University) participating in the PING study approved all experimental and consenting procedures, and all methods were performed in accordance with the relevant guidelines, regulations and PING data use agreement31. Written parental informed consent was obtained for all PING subjects below the age of 18 and directly from all participants aged 18 years or older. Exclusion criteria included: a) neurological disorders; b) history of head trauma; c) preterm birth (less than 36 weeks); d) diagnosis of an autism spectrum disorder, bipolar disorder, schizophrenia, or mental retardation; e) pregnancy; and f) daily illicit drug use by the mother for more than one trimester56. Similar proportions of males and females participated across the entire age range.

The PING cohort included 1493 participants aged 3 to 21 years, of whom 1249 also had neuroimaging data. Of these, n = 773 were available to download from NITRC (https://www.nitrc.org). T1 images were acquired using standardized high-resolution 3D RF-spoiled gradient echo sequence with prospective motion correction (PROMO) at each site, with pulse sequences optimized for equivalence in contrast properties across scanner manufacturers (GE, Siemens, and Phillips) and models56.

In addition to imaging data, participants undertook comprehensive behavioural and cognitive assessments (NIH Toolbox Cognition Battery, NTCB47; and provided a saliva sample for genome-wide genotyping. The NTCB comprises seven tests (Flanker, Picture Sequence, List Sorting, Picture Vocabulary, Reading, Dimensional Change Card Sorting, Pattern Comparison) that measure abilities across six major cognitive domains, including cognitive flexibility, inhibitory control, and working memory47.

External validation data

For model validation, a comparative neurodevelopmental population was obtained from the ABIDE and ABIDEII datasets57,58. These datasets represent a consortium effort to aggregate MRI datasets from individuals with autism spectrum disorder and age-matched typically-developing controls. Contributions per site ranged from 13 to 105 typically-developing participants per site chosen as matched controls for the ASD population at each site. For both studies, 3 Tesla, T1-weighted MRI were acquired from 17 sites; images and acquisition details are available at http://fcon_1000.projects.nitrc.org/indi/abide. All participating sites received local Institutional Review Board approval for acquisition of the contributed data.

For further details, see Supplemental Information.

Image processing

Quality control assessment for the PING data is detailed in Jernigan et al.56. In brief, images were inspected for excessive distortion, operator compliance, or scanner malfunction. Specifically, T1-weighted images were examined slice-by-slice for evidence of motion artefacts or ghosting and rated as acceptable, or recommended for re-scanning. After additional, on-site, visual quality control assessment, we removed a further 5 participants, resulting in a final sample of n = 768 (mean age = 12.3 y; range: 3.2–21.0 y; 404 male). Site-specific demographic data are shown in Table S1.

For all subjects, vertex-wise maps of cortical thickness and cortical area (estimated on the white matter surface) were constructed from T1 MRI with FreeSurfer 5.3 (http://surfer.nmr.mgh.harvard.edu). Briefly, this process includes removal of non-brain tissue, transformation to Talairach space, intensity normalisation, tissue segmentation and tessellation of the grey matter/white matter boundary followed by automated topology correction. Cortical geometry was matched across individual surfaces using spherical registration59,60,61,62. Any images that failed initial surface reconstruction, or returned surfaces with topological errors, were manually fixed using white matter mask editing and re-submitted to Freesurfer until all datasets passed inspection.

In addition, whole-brain tissue volume maps were estimated using deformation-based morphometry63,64. Each participant’s T1 image was intensity normalised, corrected for bias field inhomogeneities and aligned to MNI 152 space using diffeomorphic nonlinear registration (ANTs)65,66. Transformed images were visually inspected to ensure alignment to the template images and voxel-wise maps of volume change induced by the nonlinear deformation were characterised by the determinant of the Jacobian operator, referred to here as the Jacobian map. Each map was log-transformed so that values greater than 0 represent local areal expansion in the subject relative to the target and values less than 0 represent areal contraction.

Prior to analysis, both tissue volume maps and cortical thickness and area maps were smoothed with a Guassian kernel of 10 FWHM.

Manifold learning

The aim of Neighbourhood Preserving Embedding is to calculate a linear transformation, P, to map a high-dimensional \(n\times D\) dataset \(X=\{{X}_{1},{X}_{2},\cdots ,\,{X}_{n}\}\) into a low-dimensional n × d subspace \(Y=\{{Y}_{1},\,{Y}_{2},\,\cdots ,{Y}_{n}\}\) where \(d\ll D\) and \(Y={P}^{T}X\) while preserving the local neighbourhood structure of the data17. The process is illustrated in Fig. 1. For a given data point X i , an adjacency matrix is first constructed, placing an edge between X i and X j only if X j belongs to the set of \(k\) nearest neighbours of X i . Following this, a set of weights, W, is calculated that approximately reconstruct X i from its neighbours and a linear projection, P, sought to optimally preserve this structure in the low-dimensional space, Y 17. NPE is closely related to Locally Linear Embedding19, but one of the major benefits that NPE confers is that the solution generalises to new datapoints, allowing unseen data to be projected onto the manifold without re-calculating the embedding.

The analysis pipeline used in this study is shown in Fig. 1C. For each image metric (tissue volume, cortical thickness, cortical area), data were first mean-centred and projected to an orthogonal subspace via singular value decomposition (SVD), while retaining 95% of variance, to reduce computational complexity and avoid overfitting. NPE was then performed using k = 10 neighbours, projecting data to \(d=3\) dimensions. In order to maximally preserve age- and sex-related variation in the embedded data, we incorporated participant attributes into the construction of the adjacency matrices. Nearest neighbours were selected based on the product of two adjacency matrices, A and \(a\), where the \({(i,j)}^{th}\) element of each matrix represents the (normalised) similarity between images, \(A\), and age, \(a\), of subjects \(i\) and \(j\), respectively and \({A}_{i,j}=0\), if \({S}_{i}\ne {S}_{j}\), where S indicates the sex of the participant. In order to reduce potential bias in image similarities due to site effects, we also introduced an additional constraint: \({A}_{i,j}=0\), if \({s}_{i}={s}_{j}\), where \(s\) indicates the site/scanner of image acquisition, although this had little effect on the final embedding.

This resulted in three sets of coordinates, \({Y}_{v}\), \({Y}_{t}\), and \({Y}_{a}\), representing the low-dimensional embedding of tissue volume, \(v\), cortical thickness, \(t\), and area, \(a\), data. We concatenate the embedding coordinates to produce a final low-dimensional representation of the combined, multi-metric image data for statistical analysis.

Statistical analysis

Internal validity of the model was assessed using 10-fold cross-validation. We used 90% of the PING participants as a training set, calculating the manifold embedding coordinates for each imaging modality and combining them into a single representation. To predict age, we used a Gaussian Process Regression (GPR) model with the concatenated manifold coordinates as features and age as a dependent variable67. To predict sex, the combined coordinate set was sent to a linear discriminant classifier. Image data from the remaining 10%, the test set, were then projected onto the joint manifold and the fitted models used to predict age and sex. Mean absolute error in age estimation (MAE) and correlation between true age and predicted age are reported, alongside classification accuracies for sex. This process was repeated for each fold, reconstructing the manifold each time, such that all PING participants were part of the test set exactly once.

External validity was assessed by projecting the combined ABIDE and ABIDE-II datasets57,58 onto a manifold constructed from the full PING dataset and predicting age and sex using models trained on the PING data (see Supplemental Information).

To determine if errors in image-derived age estimation correlated with cognitive performance, we performed a further set of analyses using available cognitive data. Of the PING dataset, n = 617 had complete records for NTCB score, family socioeconomic status (household income and parental education), and genetic ancestry56,68. NTCB scores were corrected for age, sex, socioeconomic status and GAF47 and linear regression used to determine associations between age estimation error (based on cross-validated age predictions) and corrected cognitive scores.

All statistical analysis was performed in Matlab R2105b (Natick, MA).

Data and code availability

PING data are available from http://pingstudy.ucsd.edu subject to a data usage agreement.

ABIDE data are available from http://fcon_1000.projects.nitrc.org/indi/abide.

The original Matlab code to perform NPE and linear graph embedding is available from http://www.cad.zju.edu.cn/home/dengcai/Data/code/.

In addition, we have made available example code to perform supervised NPE at: http://developmentalimagingmcri.github.io.