Deep learning-based, fully automated, pediatric brain segmentation

The purpose of this study was to demonstrate the performance of a fully automated, deep learning-based brain segmentation (DLS) method in healthy controls and in patients under 11 years of age with a neurodevelopmental disorder associated with an SCN1A mutation. The whole-brain, cortical, and subcortical volumes of 21 previously enrolled participants under 11 years of age with an SCN1A mutation, and of 42 healthy controls, were obtained using the DLS method and compared with volumes measured by Freesurfer with manual correction. Additionally, the volumes calculated with the DLS method were compared between the patients and the control group. The volumes of total brain gray and white matter obtained with the DLS method were consistent with those measured by Freesurfer with manual correction in healthy controls. Among the 68 parcellated cortical volumes, only 7 areas measured by the DLS method differed significantly from those measured by Freesurfer with manual correction, and the differences decreased with increasing age in the subgroup analysis. The subcortical volume measured by the DLS method was relatively smaller than that measured by Freesurfer. Further, the DLS method detected the reduced volumes identified by Freesurfer with manual correction in patients with SCN1A mutations, compared with healthy controls. In a pediatric population, this new, fully automated DLS method is compatible with the classic volumetric analysis using Freesurfer software and manual correction, and it can also detect brain morphological changes in children with a neurodevelopmental disorder.

Owing to these challenges, innovative methods are continually being developed to address the constraints inherent to existing template-based approaches. Recently, deep learning has emerged as a promising methodology for precise brain segmentation 18. Deep learning encompasses neural networks exceeding five layers, which facilitate the extraction of hierarchical features directly from raw images. Given their capability for autonomous learning, these networks demonstrate remarkable outcomes and broad generalizability when trained on extensive datasets 11,18-20. In the field of neonate and infant brain segmentation, including whole brains or lesions, convolutional neural networks (CNNs) are the most commonly used 11. In this context, a state-of-the-art deep learning-based segmentation (DLS) methodology, named VUNO Med-Deep Brain, has been introduced in adult cohorts with Alzheimer's disease 21.
In a previous study conducted by our team, we utilized the Freesurfer software coupled with manual adjustments to execute brain morphometric analyses. This yielded the first comprehensive findings on developmental brain alterations and volumetric disparities in regional structures among epilepsy patients harboring an SCN1A gene mutation 22. The SCN1A gene (MIM#182389), responsible for encoding the alpha 1 subunit of the voltage-gated sodium channel, has mutations that can lead to an array of neurodevelopmental disorders, including epilepsy 23.
To ascertain the efficacy of the newly devised deep learning-based morphometric analysis for pediatric cohorts, we compared the structural parameters derived from the DLS method for healthy children aged under 11 with those obtained from the Freesurfer software supplemented by manual corrections. For a more granular assessment of the age-specific accuracy of the DLS approach, we performed subgroup analyses focusing on three distinct age groups: two years or younger, older than two and up to six years, and older than six years. Furthermore, to gauge the robustness of the DLS technique, we compared regional volumes between pediatric patients (below 11 years) with an SCN1A mutation and their healthy counterparts.

Baseline demographics of patients and control group
Twenty-one patients with an SCN1A mutation and 42 healthy controls, previously selected and analyzed by our center, were re-analyzed in this study (age range: 2.0-10.5 years). To delineate the age-specific performance of the DLS method, we classified the control group into three age subgroups: Age ≤ 2 years (n = 12, 28.6%), 2 < Age ≤ 6 years (n = 18, 42.9%), and 6 < Age ≤ 10 years (n = 12, 28.6%). The baseline demographics are shown in Table 1 of a previous study 22.

Whole-brain volume analysis
The volumes of total brain, total gray matter, cortical gray matter, and total white matter measured by the DLS method did not differ from those measured by Freesurfer with manual correction. The volume of subcortical gray matter measured by DLS was significantly smaller, and the volume of the total cerebellum measured by DLS was larger, than that measured by Freesurfer with manual correction (Table 1 and Fig. 1).
In the subgroup analysis according to age, the volume differences of the subcortical and cerebellar gray matter between the two methods were consistent across all three subgroups.

Cortical parcellated volume analysis
Among the 68 parcellated areas measured by the two methods, volume differences were found in only 7 areas (Table 2). The volumes of the right caudal middle frontal, right frontal pole, left inferior parietal, left fusiform, right lateral occipital, and right insular cortex were significantly smaller, whereas the left parahippocampal volume was significantly larger, when measured by DLS than when measured by Freesurfer with manual correction. In the subgroup analysis, a significant volume difference between the two methods was found in the right insular cortex in the ≤ 2 years age group.

Subcortical volume analysis
The volumes of both thalami, both putamina, and the left caudate differed significantly between the DLS method and Freesurfer with manual correction (Table 3 and Fig. 2). In the ≤ 2 years age group, both thalami were significantly larger and the right putamen was significantly smaller when measured by the DLS method than when measured by Freesurfer with manual correction. These differences were also observed in the right thalamus in the 2 < Age ≤ 6 years group and in the right putamen in the 6 < Age ≤ 10 years group.

Whole-brain volume analysis
After adjusting for sex, age, and ICV in each group, the volumes of the total brain, total gray matter, cortical gray matter, subcortical gray matter, and total white matter were significantly smaller in the patient group than in the controls (Table 4 and Fig. 3A), consistent with a previous study 22.

Cortical parcellated volume analysis
The 34 cortical parcellated regions per hemisphere were compared between the two groups, healthy controls and patients with an SCN1A mutation (Table 4 and Supplementary Table 1). Compared with healthy controls, patients showed significantly decreased volumes in the bilateral lateral orbitofrontal, precentral, and inferior parietal cortices, and in the right isthmus cingulate, right middle temporal, left Banks of the STS, left parahippocampal, and right insular cortex.

Subcortical volume analysis
A subcortical volume analysis was performed to compare the volumes of the subcortical structures (thalamus, caudate, putamen, pallidum, accumbens area) between the patients and the controls (Table 4, Supplementary Table 1 and Fig. 3B). The patients showed significantly smaller volumes of the bilateral thalami, putamen, and caudate, and of the right pallidum, than healthy controls.

Group comparison of each volume between patients with a SCN1A mutation and healthy control measured by DLS
To assess the efficacy of our segmentation algorithm, we computed the Dice similarity coefficient (DSC), along with precision and recall, across whole, cortical, and subcortical brain regions, as delineated in Table 5. Cortical gray matter showed relatively low performance compared with other whole-brain areas. Notably, specific areas such as the frontal pole, entorhinal cortex, temporal pole, and nucleus accumbens exhibited suboptimal performance in the cortical and subcortical analyses, with performance values falling below 0.7.

Discussion
In pediatric populations, accurate brain segmentation of MR imaging can help find a diagnostic biomarker for a specific neurological disease, define a clinical course, or identify the underlying developmental patho-mechanism of various neurodevelopmental disorders 24. However, accurate segmentation of pediatric brain MR imaging is challenging due to reduced tissue contrast, increased noise, several partial volume effects, and ongoing white matter myelination 25,26. This study aimed to demonstrate the performance of the DLS method for brain segmentation of MRI in a pediatric population aged under 11 years.
To investigate the accuracy of the DLS method in measuring regional brain volume, the whole-brain volumes (Table 1 and Fig. 1), parcellated cortical volumes (Table 2), and segmented subcortical volumes (Table 3 and Fig. 2) were measured by the DLS method in healthy controls and compared with volumes previously measured using Freesurfer with manual editing. Importantly, using the DLS method, the volumes of the total brain, total cerebral gray matter, cortical gray matter, and total white matter were consistent with those measured by Freesurfer with manual editing (Table 1 and Fig. 1). In particular, the DLS method successfully delineated the gray and white matter and parcellated the total cortical gray matter (Table 1 and Fig. 1) as Freesurfer did (Table 2), indicating good performance of the DLS method for cortical volume analysis.
The volumes of only seven areas measured by the DLS method, namely the right caudal middle frontal, right frontal pole, left inferior parietal, left fusiform, left lateral occipital, right insular, and left parahippocampal cortex, were discordant with those measured by Freesurfer with manual correction. In the subgroup analysis, these differences were resolved in the Age > 2 years groups.
Since the CNN was first introduced in 1989 27, great interest in the ability of CNNs to segment the neonate and infant brain has been generated through two large-scale competitions using standardized open datasets: the Neonatal Brains Segmentation Challenge and the 2017 iSeg 6-month Infant Brain Magnetic Resonance Imaging Segmentation Challenge. These challenges concluded that the CNN approach, using deep-learning methods, could solve neonate and 6-month-old brain tissue segmentation with a respectable Dice similarity coefficient of 72.5-73.5% 6,28,29.
The UNC/UMN Baby Connectome Project (https://iseg2017.web.unc.edu/baby-connectome-project/) is currently characterizing brain and behavioral development in typically developing infants across the first 5 years of life by analyzing structural segmentation and functional connectivity 30. These data suggest that CNNs can be applied to the segmentation of young children's brains and that they are particularly effective in whole-brain and cortical gray matter volume analysis. In addition, the recently released deep learning-based, infant-dedicated cortical surface reconstruction pipeline, iBEAT V2.0, has successfully processed images acquired with various imaging protocols and scanners 31.
However, there were volume differences between Freesurfer with manual editing and the new DLS method in the cerebellum and subcortical gray matter, including both thalami and the right putamen. In the subgroup analysis, the volume differences were evident mainly in children aged under two years and were resolved in the groups older than two years, as in the cortical parcellated volume analysis. Supplementary Fig. 1 provides several illustrative examples, highlighting improvements in achieving consistent and accurate segmentation in specific brain areas using the DLS method. Notably, the examples include incorrect segmentation of subcortical gray matter (GM) as white matter (WM) (Supplementary Fig. 1A), a noisy boundary in cerebellar segmentation (Supplementary Fig. 1B), and smaller segmented volumes in the putamen and pallidum areas (Supplementary Fig. 1C and D), when measured by Freesurfer with manual editing. These findings imply that the DLS approach could be instrumental in addressing the challenges of precise segmentation of complex structures, which is a limitation observed with Freesurfer.
Atlas-based methods 32-34, which match intensity information between an atlas and target images, and pattern recognition methods 35,36, which classify tissues based on a set of local intensity features, are the classical approaches to automated segmentation 11. Unfortunately, these methods have been shown to provide inaccurate segmentation of the pediatric brain 37,38 because the template, customized to adult brains, is inappropriate for pediatric images, which have low intensity contrast and high shape variability in regional structures including the thalamus, hippocampus, parahippocampal areas, and insular cortex. For these reasons, volume differences between the two methods can be observed in these areas. Additionally, small sample sizes with large dispersions also contribute to the volume differences measured between the two methods.
To investigate the ability of the DLS method to identify brain morphometric abnormalities in patients with an SCN1A mutation, we compared the volumes between patients and healthy controls using the DLS method (Table 4). In the whole-brain analysis, the volumes of total brain, total gray matter, cortical gray matter, subcortical gray matter, and total white matter were significantly decreased in patients with an SCN1A mutation and related epilepsy compared with those of healthy controls, and these results were consistent with our previous study 22. In addition to the cortical and subcortical areas that showed reduced volume in patients using Freesurfer software with manual correction 22, both thalami, the right pallidum, and the right lateral orbitofrontal, right paracentral, right inferior parietal, left Banks of the STS, left parahippocampal, and right insular cortex were significantly smaller in patients with SCN1A mutation-related epilepsy than in healthy controls in this study.
The banks of the STS and the parahippocampal cortex are subnetworks of the default mode network (DMN), and the thalamus and insular cortex are areas correlated with the DMN in resting-state functional MRI studies 39,40. These structural alterations of the DMN and DMN-associated areas in patients with an SCN1A mutation are consistent with the results of other studies 22,41.
For cortical parcellation, FreeSurfer generates a white matter (WM) surface and a pial surface for each hemisphere. The WM surface is generated from a segmentation mask, and a copy of the WM surface is deformed towards the cerebrospinal fluid (CSF)/gray matter (GM) boundary to eventually form the pial surface. The cortical surface is mapped to a spherical atlas, and the probabilities for each cortical region are calculated with Bayesian estimation. As the parcellation pipeline includes intensive computation such as registration and surface reconstruction, the whole processing time is around 7 h, depending on the computing environment.
Since DeepBrain utilizes the fully trained deep-learning model for the parcellation pipeline that replaces manually designed algorithms such as cortical surface generation, the computation time can be significantly reduced.The total computation time for whole brain parcellation was less than 1 min, which is incomparably faster than traditional methods, such as FreeSurfer.
Despite several limitations, namely the lack of validation with the Dice coefficient on standardized open datasets or other datasets from multiple sources, and the small number of patients, the deep learning-based method with CNN could provide a key solution for accurate pediatric brain segmentation. Importantly, accurate and quick segmentation by CNN may identify the normal trajectories of developing brains, leading ultimately to the early detection and treatment of various neurodevelopmental diseases.
In a pediatric population, this new, fully automated DLS method is compatible with the classic volumetric analysis using Freesurfer software and manual correction, and it can also detect segmental brain volume changes in children with an SCN1A mutation. Further validation using larger population data may confirm that this fully automated DLS method is a good and easy tool for accurate brain segmentation in pediatric populations.

Subjects
The 21 patients with epilepsy and an SCN1A mutation, who were under 11 years of age, and the 42 healthy controls, all of whom had participated in our previous study, were reinvestigated by the DLS method 22. Patients with epilepsy and an SCN1A mutation were recruited from the pediatric neurology clinics of three medical centers in Korea: Asan Medical Center, Samsung Medical Center, and Seoul National University Children's Hospital. We included patients who satisfied the following inclusion criteria: (i) epilepsy diagnosed by a pediatric neurologist; (ii) a genetically confirmed SCN1A mutation; and (iii) a normal brain MRI. For each patient, two healthy control subjects, matched in age and sex and without alleged neurologic deficits, were recruited.
The model was originally trained with adult MR images, but we fine-tuned the model with additional pediatric brain images obtained from the OpenfMRI dataset 44,45. The fine-tuning dataset consisted of 249 images (female: 156, male: 143, age: 5.01-19.22, σ = 4.33). The input image comprised a conformed T1-weighted brain MRI, and the outputs were the segmented brain region masks and the associated volumes of 104 regions in total. The CNNs were trained using the ADAM optimizer and the generalized Dice loss function. The DLS system also includes a 3D CNN model that provides the intracranial volume segmentation, which was used to normalize the volume measurements for statistical analysis. The total processing time, including image preprocessing and normalized volume retrieval, was about 1 min on the minimum computational requirement setting (CPU: 16 GB, GPU: RTX2080Ti 11 GB). Although the model was trained with both 3 T and 1.5 T images, the use of 3 T MR images was recommended to obtain more accurate segmentation.
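The generalized Dice loss named above can be illustrated with a minimal sketch. The text only names the loss, so the weighting used here (each label weighted by the inverse of its squared reference volume, a commonly used formulation) is an assumption, as are the function name `generalized_dice_loss`, the `eps` stabilizer, and the toy inputs. It is not the training code of the DLS system.

```python
def generalized_dice_loss(probs, onehot, eps=1e-6):
    """Generalized Dice loss for one image (illustrative sketch).

    probs  -- predicted probabilities, indexed [label][voxel]
    onehot -- reference labels as 0/1 values, indexed [label][voxel]
    eps    -- small constant to avoid division by zero (assumed value)
    """
    numerator = 0.0
    denominator = 0.0
    for p_l, r_l in zip(probs, onehot):
        # Assumed weighting: inverse squared reference volume, so that
        # small structures contribute as much as large ones.
        weight = 1.0 / (sum(r_l) ** 2 + eps)
        numerator += weight * sum(p * r for p, r in zip(p_l, r_l))
        denominator += weight * sum(p + r for p, r in zip(p_l, r_l))
    return 1.0 - 2.0 * numerator / (denominator + eps)


# Toy example with two labels and four voxels (hypothetical values).
reference = [[1, 1, 1, 0], [0, 0, 0, 1]]
perfect = [[1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
inverted = [[0.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0, 0.0]]

loss_perfect = generalized_dice_loss(perfect, reference)    # close to 0
loss_inverted = generalized_dice_loss(inverted, reference)  # equals 1
```

Under this weighting, a one-voxel label and a three-voxel label influence the loss comparably, which is the usual motivation for choosing this loss when small brain regions must be segmented alongside large ones.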

Statistical analysis
We tested the absolute volume differences between the two methods in healthy controls using the paired t-test. For multiple comparisons, a Bonferroni correction was applied, and p < 0.001 was considered significant for the whole brain, the 10 subcortical, and the 34 cortical volumes.
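As a minimal sketch of the test described above, the paired t statistic and a Bonferroni-corrected threshold can be computed as follows. The function names, the sample volumes, the family-wise alpha of 0.05, and the comparison count of 50 are all assumptions for illustration; the text does not state how the p < 0.001 threshold was derived.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical regional volumes (mL) for the same four subjects.
dls_volumes = [10.0, 12.0, 11.0, 13.0]
freesurfer_volumes = [9.0, 10.0, 10.0, 11.0]

t_value, dof = paired_t_statistic(dls_volumes, freesurfer_volumes)

# Assumed family-wise alpha and comparison count, chosen only because
# they reproduce a per-test threshold of 0.001.
threshold = bonferroni_threshold(0.05, 50)
```

The p-value would then be obtained from the t distribution with `dof` degrees of freedom (for example with a statistics package) and compared against `threshold`.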
The adjusted volumes ($\mathrm{Volume}_{\mathrm{adj}}$) were used for comparisons between the control and patient groups by the DLS method, in order to adjust for the total intracranial volume (ICV). $\mathrm{Volume}_{\mathrm{adj}}$ signifies the linearly adjusted volume, calculated as $\mathrm{Volume}_{\mathrm{adj}} = \mathrm{Volume} - \beta(\mathrm{ICV} - \mathrm{ICV}_{\mathrm{mean}})$, where $\mathrm{ICV}_{\mathrm{mean}}$ is the mean ICV of each group and the parameter $\beta$ was fixed to minimize the covariance of $\mathrm{Volume}_{\mathrm{adj}}$ and ICV in each group 46.
We fitted linear models with the intercept term and the group indicator (healthy control vs. SCN1A patients) as covariates and used generalized least squares to estimate regression coefficients. The errors were allowed to be correlated with unconstrained parameters. After the Bonferroni correction for multiple comparisons, p < 0.001 was considered significant for the whole brain, the 10 subcortical, and the 34 cortical volumes.
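The linear ICV adjustment above can be sketched as follows. This assumes that β is the least-squares slope of volume on ICV within a group, which is the value that drives the sample covariance between the adjusted volume and ICV to zero; the function names and the example numbers are hypothetical.

```python
from statistics import mean


def icv_adjust(volumes, icvs):
    """Return (adjusted_volumes, beta) for one group.

    Implements Volume_adj = Volume - beta * (ICV - ICV_mean), with beta
    taken as the least-squares slope of volume on ICV (an assumption).
    """
    if len(volumes) != len(icvs) or len(volumes) < 2:
        raise ValueError("need equal-length inputs with at least 2 subjects")
    v_mean, icv_mean = mean(volumes), mean(icvs)
    sxy = sum((i - icv_mean) * (v - v_mean) for v, i in zip(volumes, icvs))
    sxx = sum((i - icv_mean) ** 2 for i in icvs)
    if sxx == 0:
        raise ValueError("ICV has no variance in this group")
    beta = sxy / sxx
    adjusted = [v - beta * (i - icv_mean) for v, i in zip(volumes, icvs)]
    return adjusted, beta


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    x_mean, y_mean = mean(xs), mean(ys)
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical group of four subjects (volumes and ICV in mL).
group_icv = [1000.0, 1100.0, 1200.0, 1300.0]
group_volume = [500.0, 560.0, 590.0, 650.0]

adjusted_volume, beta = icv_adjust(group_volume, group_icv)
residual_cov = sample_covariance(adjusted_volume, group_icv)  # approximately 0
```

Because the correction is centered on the group mean ICV, the group mean of the adjusted volumes equals the group mean of the raw volumes; only the ICV-related spread is removed.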
To determine the spatial overlap of the structures, we computed the Dice similarity coefficient (DSC), precision, and recall between the manual and automated segmentation methods. The DSC is defined as twice the total overlapping area of the predicted mask and the ground truth, divided by the sum of the total number of foreground pixels (i.e., pixel value 1) in the predicted mask and in the ground truth. The value of the DSC ranges from 0, indicating no spatial overlap between structures, to 1, indicating complete overlap 47.
The other metrics require pixel-wise evaluation of the ground truth and the predicted masks. True positive (TP) is the total number of positive pixels (belonging to the foreground) in the ground truth that are correctly predicted as positive; true negative (TN) is the total number of negative pixels (belonging to the background) in the ground truth that are correctly predicted as negative; false positive (FP) is the total number of negative pixels that are falsely predicted as positive; and false negative (FN) is the total number of positive pixels that are falsely predicted as negative. Precision is the ratio of the number of correctly predicted foreground pixels to the total number of pixels predicted as foreground, TP/(TP + FP). Recall is the ratio of the number of correctly predicted foreground pixels to the total number of actual foreground pixels, TP/(TP + FN).
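The overlap metrics defined above can be sketched for binary masks as follows. The function name `overlap_metrics`, the flat 0/1 mask representation, and the conventions chosen for empty masks are assumptions for illustration, not the evaluation code used in the study.

```python
def overlap_metrics(pred, truth):
    """Dice, precision, and recall for two flat binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    # Assumed conventions: two empty masks overlap completely (Dice = 1),
    # and precision/recall are 0 when their denominators are empty.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"dice": dice, "precision": precision, "recall": recall}


# Hypothetical 8-pixel masks: 3 true positives, 1 false positive,
# 1 false negative.
ground_truth = [1, 1, 1, 0, 0, 0, 1, 0]
prediction = [1, 1, 0, 0, 1, 0, 1, 0]

metrics = overlap_metrics(prediction, ground_truth)
```

Note that 2·TP/(2·TP + FP + FN) is the same quantity as twice the overlap divided by the sum of the two foreground counts, since the predicted foreground is TP + FP and the true foreground is TP + FN.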

Figure 1. Whole brain volume analysis in 42 healthy controls by the DLS method and Freesurfer with manual correction. (A) Whole brain volume of overall healthy controls. (B) Whole brain volume of healthy controls with age ≤ 2 years. (C) Whole brain volume of healthy controls with 2 < age ≤ 6. (D) Whole brain volume of healthy controls with age > 6. The blue dots show the volume measured by Freesurfer with manual correction; the red dots show the volume measured by the DLS method. **p-values < 0.001.

Figure 2. Subcortical volume analysis in 42 healthy controls by the DLS method and Freesurfer with manual correction. (A) Whole brain volume of overall healthy controls. (B) Whole brain volume of healthy controls with age ≤ 2 years. (C) Whole brain volume of healthy controls with 2 < age ≤ 6. (D) Whole brain volume of healthy controls with age > 6. The blue dots show the volume measured by Freesurfer with manual correction; the red dots show the volume measured by the DLS method. **p-values < 0.001.

Figure 3. Group comparison of whole brain (A) and subcortical (B) parcellated volumes by the deep learning-based segmentation (DLS) method. The blue dots show the volumes of healthy controls; the red dots show the volumes of SCN1A patients. **p-values < 0.001.

Table 2. Cortical volume analysis between the two methods, DLS and Freesurfer with manual correction, in the control group. Data are presented as differences between means. The paired t test and Bonferroni correction were applied. Bold font indicates statistical significance (p < 0.001).

Table 3. Subcortical volume analysis between the two methods, DLS and Freesurfer with manual correction, in the control group. Volumes were normalized by ICV and expressed in mL. The paired t test and Bonferroni correction were applied. Bold font indicates statistical significance (p < 0.001).

Table 4. Group comparison of whole brain, subcortical, and cortical parcellated volumes by the deep learning-based segmentation (DLS) method. General linear models were used to account for correlation by matching of age and gender. Bold font indicates statistical significance (p < 0.001) after Bonferroni correction. Volumes were normalized by ICV and expressed in mL. GM gray matter, WM white matter, SD standard deviation. **p-values < 0.001 and *p-values < 0.05: the areas with significant differences between the two groups that had been shown in the previous study using FreeSurfer with manual correction.

instead of max-pooling to minimize feature loss of small brain regions during spatial dimension reduction.