Parcellation of the neonatal cortex using Surface-based Melbourne Children’s Regional Infant Brain atlases (M-CRIB-S)

Longitudinal studies measuring changes in cortical morphology over time are best facilitated by parcellation schemes compatible across all life stages. The Melbourne Children’s Regional Infant Brain (M-CRIB) and M-CRIB 2.0 atlases provide voxel-based parcellations of the cerebral cortex compatible with the Desikan-Killiany (DK) and the Desikan-Killiany-Tourville (DKT) cortical labelling schemes. This study introduces surface-based versions of the M-CRIB and M-CRIB 2.0 atlases, termed M-CRIB-S(DK) and M-CRIB-S(DKT), with a pipeline for automated parcellation utilizing FreeSurfer and developing Human Connectome Project (dHCP) tools. Using T2-weighted magnetic resonance images of healthy neonates (n = 58), we created average spherical templates of cortical curvature and sulcal depth. Manually labelled regions in a subset (n = 10) were encoded into the spherical template space to construct M-CRIB-S(DK) and M-CRIB-S(DKT) atlases. Labelling accuracy was assessed using Dice overlap and boundary discrepancy measures with leave-one-out cross-validation. Cross-validated labelling accuracy was high for both atlases (average regional Dice = 0.79–0.83). Worst-case boundary discrepancy instances ranged from 9.96–10.22 mm, which appeared to be driven by variability in anatomy for some cases. The M-CRIB-S atlas data and automatic pipeline allow extraction of neonatal cortical surfaces labelled according to the DK or DKT parcellation schemes.

brains 15 . Recently, we introduced the Melbourne Children's Regional Infant Brain (M-CRIB) atlases 16,17 , which are neonatal-specific, voxel-wise brain atlases. The cortical parcellations were constructed to be compliant with the DK 6 and DKT 7 adult cortical parcellation schemes. Each of the M-CRIB atlases are comprised of 10 neonatal brains that have been extensively manually parcellated to accurately reflect brain structures at this time-point. Parcellation of new data has been achieved using multi-atlas label fusion algorithms that probabilistically assign labels to each voxel after warping the set of parcellated atlases to a novel dataset 18,19 .
While accurate labelling can be achieved using voxel-based parcellation schemes 16,18,20 , surface-based registration methods lead to significantly improved alignment of cortical landmarks, including cortical folds, and therefore more accurate placement of areal boundaries 21,22 .
The Developing Human Connectome Project (dHCP) has recently provided a pipeline for segmentation and cortical extraction for T 2 -weighted images of neonatal brains 20 , along with an infant cortical surface-based atlas comprising the major lobes 23 . This process segments brain tissue into cerebral and cerebellar grey and white matter, and various subcortical grey matter structures, before extracting an inner and outer cortical surface which is automatically partitioned into lobes. These existing cortical surface extraction tools can be expanded upon by incorporating our M-CRIB brain atlases to provide accurate fine-grained cortical surface-based parcellations, avoiding projection of a non-surface parcellation or a static template. Thus, measures such as cortical thickness, surface area, and curvature for each cortical sub-division of the M-CRIB atlases can subsequently be derived for infant MRI scans.
In this study we aimed to provide neonatal average surface templates and surface-based cortical atlases based on the M-CRIB and M-CRIB 2.0 parcellation schemes, that are compatible with FreeSurfer and the dHCP pipelines 2,5,8 . Additionally, we aimed to provide companion scripts to perform cortical surface extraction, surface registration and atlas-based cortical parcellation using novel neonatal T 2 -weighted brain images. Given the compatibility of the neonatal M-CRIB-S parcellation schemes with the adult DK and DKT schemes, the proposed tools can generate parcellated neonatal cortical regions that are comparable with those obtained using existing tools such as FreeSurfer at older time points, enabling longitudinal studies, beginning from the neonatal time-point. This will be valuable for investigating neurological development and disease progression from infancy to adulthood.

Results
Surface-space versions of the M-CRIB 2.0 and M-CRIB parcellations were derived and named M-CRIB-S(DKT) and M-CRIB-S(DK), respectively. An automated pipeline to segment novel T 2 -weighted neonatal images, extract cortical surfaces, and perform cortical parcellation with the M-CRIB-S(DK) and M-CRIB-S(DKT) atlases, was delivered. Labelling accuracy of this pipeline was assessed utilizing a Leave-One-Out atlas generation and automated parcellation strategy for each of the labelled atlas training set (n = 10). Results of the Leave-One-Out analysis of labelling accuracy, comparing automated labelling to manually-defined labels in the labelled dataset (n = 10) are shown in Figs. 1, 2 and 3. Figure 1 shows the vertex-wise parcellation mismatch rates for both atlases in midline (i), lateral (ii), superior (iii), inferior (iv), frontal (v), and occipital (vi) aspects. Hemisphere-wide vertex-wise agreement rates were similar across parcellation schemes: the average for M-CRIB-S(DKT) was 0.84 [range 0.78-0.87], and the average for M-CRIB-S(DK) was 0.83 [range 0.77-0.87]. The rates of high mismatch are confined to region boundaries, indicating that the bulk of the central portions always agreed with ground truth. Exceptionally high rates of mismatch can be seen for the frontal pole and temporal pole labels in the M-CRIB-S(DK). Figure 2 displays regional Dice measures for M-CRIB-S(DKT) and M-CRIB-S(DK). Dice scores for both atlases were generally high (0.79-0.83). For the M-CRIB-S(DKT) parcellation scheme, per-region mean Dice measures were similar across hemispheres (left: mean = 0.82, SD = 0.05; right: mean = 0.83, SD = 0.05). In both hemispheres, the highest Dice scores were observed in the insula (left: 0.95; right: 0.94). The lowest Dice score observed in the left hemisphere was for the pars triangularis (0.75), and the lowest Dice score in the right hemisphere was for the posterior cingulate (0.75). For the M-CRIB-S(DK) parcellation scheme, per-region mean Dice measures were similar to those listed above for the M-CRIB-S(DKT) parcellation, and were again similar between hemispheres (left: mean = 0.79, SD = 0.10; right: mean = 0.80, SD = 0.07). The highest Dices scores in each hemisphere were again seen in the insula (left: 0.95; right: 0.94). The lowest Dice scores were outliers observed in the frontal pole in the left hemisphere (0.50), and in the banks of the superior temporal sulcus region in the right hemisphere (0.55). Figure 3 shows per-region Hausdorff distances, which measure worst-case boundary discrepancy between automatic and manual labels for M-CRIB-S parcellation schemes. For the M-CRIB-S(DKT) parcellation scheme, per-region mean Hausdorff distances were similar between hemispheres (left: mean: 10.22 mm, SD = 2.99 mm; right: mean: 10.13 mm, SD = 2.77 mm). The smallest Hausdorff distances for each hemisphere were both seen in the insula (left: 3.27 mm, right: 4.11 mm). The largest Hausdorff distance for the left hemisphere was observed in the inferior parietal region (16.18 mm), and the largest in the right hemisphere was for the superior parietal region (15.91 mm).
For the M-CRIB-S(DK) parcellation scheme, Hausdorff distances were similar to those for M-CRIB-S(DKT) and were again similar between hemispheres (left: mean = 10.3 mm, SD = 3.08 mm; right: mean = 9.96 mm, SD = 2.67 mm). The smallest Hausdorff distances observed in each hemisphere were again both in the insula (left: 3.23 mm, right: 4.11 mm). The largest Hausdorff distance seen in the left hemisphere was for the inferior parietal region (16.39 mm), and in the right hemisphere was for the superior parietal region (15.     Worst-case (i) and best-case (ii) instances of boundary discrepancies, measured by Hausdorff distances, between estimated (green) and manual (red) label boundaries. The star markers and white paths depict the traversal between nearest neighbours. Other surface vertices are shaded according to curvature, with light and dark grey denoting gyri and sulci, respectively. (2020) 10:4359 | https://doi.org/10.1038/s41598-020-61326-2 www.nature.com/scientificreports www.nature.com/scientificreports/ average surfaces. The UNC and dHCP parcellation labels are those included in the released data packages available for download. The 42-week versions of the UNC and dHCP atlases were chosen as the closest age to the mean of the M-CRIB-S cohort (41.7 weeks postmenstrual age (PMA)).

Discussion
The primary contribution of this work is the provision of atlases and tools that facilitate cortical surface extraction and parcellation of the neonatal cortex into 31 or 34 regions. Our pipeline is based on T 2 -weighted images of neonates around term equivalent age and uses a common adult-compatible parcellation scheme, with neonatal-specific training data. This work extends our previous M-CRIB and M-CRIB 2.0 voxel-based atlases to enable surface-based parcellation of the neonatal cortex.
We have applied this method to a cohort of healthy, term-born infants (mean age at scan = 42.4 weeks). The full age range of subjects suitable for processing under the proposed pipeline will depend on whether tissue intensity contrast is adequate to reliably segment brain structures and extract cortical surfaces, and whether cortical folding complexity is enough to identify all macrostructural morphological features for surface-based template registration and region identification. Tissue segmentation and cortical surface extraction using DrawEM and Deformable are designed to be compatible with T 2 -weighted white/grey matter contrasts visible between 1-5 months 24 and have been demonstrated on data acquired between 28-45 weeks PMA 20 . From 5 to 8 months' PMA the T 2 -weighted grey/white matter contrast is transitioning to become like that of an adult by about 9 months' PMA via an isocontrast phase (6-8 months) 24 . As such, tissue segmentation using these protocols would only be expected to work optimally up to approximately 5 months of age, before the T 2 -weighted image becomes isointense.
Small-scale cortical folding occurs mostly late in gestation, with many sulci and gyri that define areal boundaries within the M-CRIB-S(DK) and M-CRIB-S(DKT) atlases not being reliably detectable until approximately 40 weeks PMA 13 . It is possible that a sulcus, for example, may not have formed a sufficiently deep groove to be separated from neighbouring gyri. Thus, careful inspection of parcellation results for subjects scanned at ages below 40 weeks' PMA is required to ensure that the morphological features that define areal boundaries are present. While full-term subjects were chosen for our atlases to avoid confounds introduced by the heterogeneous effects of preterm birth on brain morphology 25 , the lack of this heterogeneity in the training set may limit applicability to preterm or other populations at high risk for brain maldevelopment or malformation. Future work extending our atlases to represent broader ages or clinical groups would be valuable. Nonetheless, the parcellation component may be appropriate for subjects below 40 weeks and beyond 5 months of age, provided cortical surface data were obtained via other preprocessing methods.
Parcellation accuracy of the automated surface-based labelling methods using the M-CRIB-S atlases was quantified using Dice overlap measures and Hausdorff distances using Leave-One-Out cross-validation. Dice overlap scores were high on average and appeared similar in the left and right hemisphere. The per-vertex mismatch rates (Fig. 1) were largely zero for the bulk of the internal portions of most regions. High rates of mismatch for the frontal pole, temporal pole and BSTS labels in the M-CRIB-S(DK) atlas were unsurprising since low labelling accuracy was reported for these regions in the original adult DK atlas 26 , and they were subsequently removed in the adult DKT atlas 27 . The inaccuracy is mainly due to the small size and somewhat arbitrary boundaries of www.nature.com/scientificreports www.nature.com/scientificreports/ these regions. As such, this is an inherent limitation of the M-CRIB-S(DK) atlas. When compared to accuracy measures presented for the adult DKT atlas, Klein and Tourville 7 presented Leave-One-Out Dice measures of overlap between FreeSurfer parcellations and manual parcellations, which ranged from 0.72-0.98. These values are similar to our Dice overlap results, suggesting that the presented pipeline provides a labelling accuracy consistent with adult parcellation tools. However, it should be noted that the Dice metrics provided in the DKT paper 7 were not calculated via surface vertices, rather, the parcellations are rasterized into the grey matter ribbon voxels, from which per-label Dice coefficients are computed. While the surface and volumetric Dice measures come from different sources, by construction the labels of the rasterized version correspond to the labels on the surfaces and thus the two are closely related.
Boundary discrepancies between manual and automated labels were measured using Hausdorff distance. The Hausdorff distance is the maximal distance travelled between any two nearest neighbours of manual and automatic label boundary contours. Most regions had Hausdorff distances between 5-8 mm for both M-CRIB-S(DKT) and M-CRIB-S(DK) atlases. Figure 4(i) shows individual instances of worst-case boundary discrepancies in the middle temporal gyrus and inferior parietal labels. The subjects shown, subjects 7 and 8, appeared to exhibit sulcation that varied more than for other subjects in the temporal and parietal cortices and, as a result, the boundaries of these regions were shifted with respect to other training set subjects. An additional confound in manual labelling was that in some instances, label boundaries in the protocols relied on landmarks that were abstract or subject to large individual variability in presence or in morphology. In contrast, the best-case boundary discrepancies (Fig. 4(ii)) feature the insula and pericalcarine regions. The high accuracy of the estimated insula boundary is likely due its particularly well-defined boundaries in the original parcellation protocols, relatively easily identifiable in images and consistent across subjects. Other literature has reported cross-validated boundary discrepancy measures between manual and automated segmentations of the adult DK atlas dataset 6 . Rather than using Hausdorff distances, discrepancies were calculated as average per-vertex distances between manual and automated label boundaries across subjects. Graphical depictions of these average distances appeared to show a maximum discrepancy of 1 mm. Average boundary mismatch is incompatible with the worst-case discrepancy used in this paper and, thus, cannot be directly compared.
The dHCP set of tools are currently available for cortical surface extraction and lobar parcellation. Makropoulos et al. 20 recently highlighted the potential value of incorporating M-CRIB parcellation in these tools 20 , as its compatibility with the DK 6 adult parcellation may facilitate comparisons of cortical measures between the perinatal and adult time points. Here, we provide a publicly available, surfaced based cortical parcellation that can accomplish this objective.
The value of this surface-based atlas and the associated processing scripts is automated parcellation of the neonatal cortex that is straightforward to employ in longitudinal studies. The processing scripts and the M-CRIB-S(DK) and M-CRIB-S(DKT) atlases were constructed to be used with FreeSurfer, to produce compatible output and give a direct correspondence between region-based statistics such as cortical thickness, surface area, and curvature measures at neonatal, childhood and adult timepoints.  28,29 . Criteria for a subject being healthy were no admissions to a neonatal intensive care or special care unit, resuscitation at birth not required, birthweight more than 2.5 kg and no evidence of congenital conditions known to affect development and growth. Ethical approval for the studies was obtained from the Human Research Ethics Committees of the Royal Women's Hospital and the Royal Children's Hospital, Melbourne and the research studies complied with the standards of the Declaration of Helsinki. Written informed consent was obtained from parents. Data that exhibited excessive movement or other corrupting artefacts were excluded. This cohort was subdivided into the following two subsets: labelled and unlabelled subsets. The labelled set comprised the ten subjects (40. MRi acquisition. All neonate subjects were scanned at the Royal Children's Hospital, Melbourne, Australia, on a 3 T Siemens Magnetom Trio scanner during unsedated sleep. T 2 -weighted images were acquired with a turbo spin echo sequence with the following parameters: 1 mm axial slices, flip angle = 120°, repetition time = 8910 ms, echo time = 152 ms, field of view = 192 × 192 mm, in-plane resolution = 1 mm 2 (zero-filled interpolated to 0.5 × 0.5 × 1 mm in image reconstruction), matrix size = 384 × 384. All T 2 -weighted images were resliced to voxel-volume-preserving size of 0.63 × 0.63 × 0.63 mm 16,30 . processing pipeline. The proposed processing pipeline and M-CRIB-S training data is graphically described in Fig. 6. www.nature.com/scientificreports www.nature.com/scientificreports/ image segmentation. Each image in the unlabelled dataset ( Fig. 6(i)) was segmented into cerebral white and grey matter (including lobar sub-divisions), cerebellum and various subcortical grey matter structures automatically using the DrawEM software package 11,31 . Briefly, this technique non-linearly registered the non-labelled T 2 -weighted images to multiple pre-labelled images. The non-labelled image was then segmented using label fusion. The proposed pipeline utilized the wrapper script neonatal-pipeline-v1.1.sh included in DrawEM for execution. Figure 6(ii) shows an example voxel-based DrawEM segmentation output.
The labelled M-CRIB atlas images were already segmented appropriately for DrawEM compatibility. Each M-CRIB segmented image comprised manually traced cerebral white and grey matter, cerebellum, basal ganglia and thalamus, cortical, ventricular and other subcortical labels. Tracing protocols for the cortical 16,17 and subcortical 30 segmentations have been previously described. Figure 6(b) shows an example M-CRIB-S segmented image. Surface extraction. DrawEM compatible segmentations containing hemispheric white matter and grey matter, cerebellar, ventricular, brainstem and subcortical grey matter labels were used as input for the Deformable module 20,32 of MIRTK (https://github.com/BioMedIA/MIRTK). Deformable used to extract the inner and outer boundaries of each hemisphere of the cerebral cortices for both labelled and unlabelled datasets. Figure 6(iii) shows inner and outer surfaces overlaid onto the original T 2 -weighted image (top), and lateral aspects of the inner (middle) and outer (bottom) surfaces in 3D, respectively. Surface inflation and spherical mapping. The proposed pipeline used the FreeSurfer tools mris_ inflate and mris_sphere 8 to construct inflated and spherical versions of the white matter surfaces, respectively. Figure 6(iv) shows exemplary inflated and spherical surface outputs. Default FreeSurfer 6.0.0 options were used for both tools with the following exception: the negative triangle removal option "-remove_negative 1" was added to mris_sphere. The inflated surfaces exhibited the same gross shape features as those seen when FreeSurfer is executed on adult brain images. Specifically, an overall elliptical appearance, a dimple in the vicinity of the insula, and the smooth protrusion of the temporal and occipital poles. curvature template generation. Surface templates, comprised of all labelled and unlabelled subjects, were constructed using the curvature-based spherical mapping, alignment and averaging method as previously described 2,8 . Briefly, spherical registration involves linear (rotation) and non-linear displacement of vertices in spherical space. The registration algorithm aims to optimise agreement of white and inflated sulcal depth maps of a subject's surfaces to a template. The use of local curvatures and sulcal depth to drive registration means that corresponding sulci and gyri are aligned. An iterative procedure of aligning spherical surfaces from both the labelled and unlabelled datasets to the current template, followed by creation of a new template, was performed. The final template curvature and sulcal depth maps were created by averaging all aligned maps (see Fig. 6(a)). www.nature.com/scientificreports www.nature.com/scientificreports/ The spherical mapping of each white matter surface onto a common spherical space (Fig. 6(v)) meant that any given point in template space could be mapped to a point on each subject's white matter surface, and those points were in correspondence across subjects. This enabled average white, pial and inflated surfaces to be constructed using the FreeSurfer tool mris_make_average_surface, by resampling surfaces onto the 6 th order common icosahedron. The 6 th order icosahedron was chosen due to having minimal density while still upsampling the original surfaces.
Surface labelling. For the 10 cases in the labelled dataset, the volumetric M-CRIB and M-CRIB 2.0 labels were projected to the corresponding white matter surface vertices using nearest labelled neighbour projection (See Fig. 6(vi) for parcellation using the M-CRIB-S(DKT) scheme). Label data were individually checked for anatomical accuracy of label placement by one author (B.A.). For both atlases, label placement was considered highly accurate. In a few instances, very minor mislabelling was identified and manually corrected on the relevant surface and corrected volumetrically in some cases for M-CRIB 2.0 data. Figure 7 depicts the projection of the M-CRIB and M-CRIB 2.0 labels projected onto the white matter surface generated by Deformable for a single labelled subject. These surface-space versions of the M-CRIB 2.0 and M-CRIB parcellations are called M-CRIB-S(DKT) and M-CRIB-S(DK), respectively. While similar, the highlighted regions demonstrate some differences including label boundary changes (for example, in lateral orbitofrontal and pars orbitalis) and region removal (banks of the superior temporal sulcus). A comprehensive description of the differences in regions and region boundaries between the M-CRIB and M-CRIB 2.0 parcellations is available in previous publications 16,17 . parcellation training set construction. Parcellation training sets were constructed using the labelled set for each M-CRIB-S(DKT) and M-CRIB(DK) cortical label, using the method of Fischl et al. 5 . Briefly, for each template, spatial prior distributions for each cortical label were constructed on the surface using the tool mris_ca_ train. The M-CRIB-S(DKT) parcellation of the average white surface is shown in Fig. 6(c,d).
template surface construction. Using both labelled and unlabelled datasets, we derived group-averaged white ( Fig. 8(i)), pial (ii), and inflated surfaces (iii) along with curvature (iv) and sulcal depth maps in a common spherical space. For interoperability with the dHCP and UNC atlases 14,23 , we also provide versions of the M-CRIB-S spherical template surfaces registered to the dHCP 42-week and UNC 42-week spherical template surfaces. M-CRIB-S(DKT) and M-CRIB-S(DK) parcellation maps in each labelled subject were transferred to the spherical template and used as the training set for the FreeSurfer tool mris_ca_label. We applied this labelling to the average white matter surface using the M-CRIB-S(DKT) to illustrate our cortical labelling approach