Searching for optimal machine learning model to classify mild cognitive impairment (MCI) subtypes using multimodal MRI data

Intervention at the stage of mild cognitive impairment (MCI) is promising for preventing Alzheimer’s disease (AD). This study aims to search for the optimal machine learning (ML) model to classify early and late MCI (EMCI and LMCI) subtypes using multimodal MRI data. First, tract-based spatial statistics (TBSS) analyses showed LMCI-related white matter changes in the Corpus Callosum. ROI-based tractography then identified the cortical areas connected by the affected callosal fibers. We prepared two feature subsets for ML by measuring resting-state functional connectivity (TBSS-RSFC method) and graph theory metrics (TBSS-Graph method) in these cortical areas, respectively. We also prepared feature subsets of diffusion parameters in the regions of LMCI-related white matter alterations detected by TBSS analyses. Using these feature subsets, we trained and tested multiple ML models for EMCI/LMCI classification with cross-validation. Our results showed that the ensemble ML model (AdaBoost) with the feature subset of diffusion parameters achieved better performance, with a mean accuracy of 70%. The brain regions useful for classification included the frontal and parietal lobes, Corpus Callosum, cingulate regions, insula, and thalamus. Our findings indicate that the optimal ML model using diffusion parameters might be effective for distinguishing LMCI from EMCI subjects at the prodromal stage of AD.

ADNI participants. Data used in the present article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. For up-to-date information, see www.adni-info.org. ADNI-3 began in 2016 and includes scientists at 59 research centers in the United States and Canada. To ensure sufficient statistical power to assess differences in data collected with different protocols at each scanning site, we used only the data available in ADNI-3 at the time of download. This study reflects the data available as of December 2020. In ADNI, an MCI subject is diagnosed on the following criteria: (1) subjective memory concern reported by the participant, study partner, or clinician; (2) abnormal memory function documented by scoring within the education-adjusted ranges on the Logical Memory II subscale (Delayed Paragraph Recall, Paragraph A only) from the Wechsler Memory Scale-Revised (maximum score of 25); (3) Mini-Mental State Examination (MMSE) score between 24 and 30; (4) global Clinical Dementia Rating (CDR) score of 0.5, with a Memory Box score of at least 0.5; and (5) general cognition and functional performance sufficiently preserved such that a diagnosis of AD could not be made. Participants in the present study were 34 individuals diagnosed with early MCI (EMCI) and 32 individuals diagnosed with late MCI (LMCI), based on the WMS-R Logical Memory II Story A score. The EMCI subjects were recruited with memory function approximately 1.0 SD below expected education-adjusted norms, while the LMCI subjects were approximately 1.5 SD below 2,4,5,7. The specific cutoff scores were as follows (maximum score of 25): EMCI was diagnosed for a score of 9-11 for 16 or more years of education; a score of 5-9 for 8-15 years of education; or a score of 3-6 for 0-7 years of education.
LMCI was diagnosed for a score of ≤ 8 for 16 or more years of education; a score of ≤ 4 for 8-15 years of education; or a score of ≤ 2 for 0-7 years of education. Demographic and neuropsychological information for this study is shown in Table 1 and Supplementary Figs. S1, S2.
Imaging is done exclusively on 3 T scanners. The MRI acquisition of ADNI-3 consists of a Participant Scan (3 Plane Localizer, Accelerated Sagittal MPRAGE, Sagittal 3D FLAIR, Axial T2 STAR, Axial 3D PASL, Axial DTI, Field Mapping, Axial rs-fMRI, HighResHippocampus) and a Phantom Scan (3 Plane Localizer, QC Phantom MPRAGE). The scanning protocols for T1-weighted MRI (voxel size = 1 mm³), diffusion-weighted imaging (DWI) (voxel size = 2 mm³), and functional MRI are described in detail on the ADNI website (http://adni.loni.usc.edu/methods/mri-tool/mri-analysis/). ADNI-3 utilized diffusion MRI protocols for 3 T Siemens, Philips, and GE scanners, using 2.0 mm isotropic voxels with b = 0 and 1000 s/mm² weighted volumes. The DICOM images acquired from the ADNI-3 database were converted to NIFTI format with the dcm2nii tool of MRIcroGL (https://www.nitrc.org/projects/dcm2nii/).
Diffusion MRI preprocessing. Diffusion MRI data were preprocessed using MRtrix3.0 31, FSL 6.0 (www.fsl.fmrib.ox.ac.uk) 32, and Advanced Normalization Tools (ANTs). We conducted the preprocessing based on the recommendations of Maximov et al. 33. The following steps were conducted: (1) noise correction using Marchenko-Pastur principal component analysis (MPPCA) ('dwidenoise'; MRtrix3.0 command), (2) correction for Gibbs ringing artifacts ('mrdegibbs'; MRtrix3.0 command), (3) correction for motion, eddy currents, and susceptibility distortion ('dwifslpreproc'; MRtrix3.0 command), (4) bias field correction calculated by ANTs, and (5) fitting of a diffusion tensor model at each voxel of the preprocessed diffusion image with DTIFIT in FSL.
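The five preprocessing steps listed above can be sketched as an ordered list of command lines. This is only an illustration under stated assumptions: the file names and the `build_preprocessing_commands` helper are hypothetical, the phase-encoding options passed to `dwifslpreproc` (`-rpe_none -pe_dir AP`) are placeholders that depend on the actual acquisition, and the use of `dwibiascorrect ants`, `mrconvert`, and `dwi2mask` as glue between the steps is assumed rather than taken from the study.

```python
import shlex


def build_preprocessing_commands(dwi="dwi.mif", prefix="sub01"):
    """Return the preprocessing commands in the order described in the text.

    All file names are hypothetical. Nothing is executed here; the commands
    are only assembled so that they can be inspected or passed to subprocess.
    """
    denoised = f"{prefix}_denoised.mif"
    degibbs = f"{prefix}_degibbs.mif"
    preproc = f"{prefix}_preproc.mif"
    unbiased = f"{prefix}_unbiased.mif"
    nifti = f"{prefix}_unbiased.nii.gz"
    bvec, bval = f"{prefix}.bvec", f"{prefix}.bval"
    mask = f"{prefix}_mask.nii.gz"
    return [
        # (1) MP-PCA noise correction
        ["dwidenoise", dwi, denoised],
        # (2) Gibbs ringing correction
        ["mrdegibbs", denoised, degibbs],
        # (3) motion, eddy-current, and distortion correction
        #     (phase-encoding options are assumptions, not from the study)
        ["dwifslpreproc", degibbs, preproc, "-rpe_none", "-pe_dir", "AP"],
        # (4) bias field correction estimated with ANTs (assumed wrapper)
        ["dwibiascorrect", "ants", preproc, unbiased],
        # helper steps (assumed): export to NIfTI with FSL-style gradients, make a mask
        ["mrconvert", unbiased, nifti, "-export_grad_fsl", bvec, bval],
        ["dwi2mask", unbiased, mask],
        # (5) diffusion tensor fit in FSL
        ["dtifit", "-k", nifti, "-o", f"{prefix}_dti", "-m", mask, "-r", bvec, "-b", bval],
    ]


commands = build_preprocessing_commands()
for cmd in commands:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the commands as argument lists makes it straightforward to run them one by one with `subprocess.run(cmd, check=True)` once the paths and phase-encoding settings have been adapted to the data at hand.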
Briefly, tractography was reconstructed using an ROI (region of interest)-based approach with DSI Studio (http://dsi-studio.labsolver.org). After fiber tracts were generated by whole-brain seeding, the tracts running through the ROIs were selected for analysis. The parameters for fiber tracking included a step size of 0.2 mm, minimum and maximum fiber lengths of 20 mm and 800 mm, respectively, and a turning angle threshold of 60°. Tracking progressed until the quantitative anisotropy (QA) of the fiber orientation dropped below the default threshold, until fiber tract continuity no longer met the progression criteria, or until tracking reached 10,000,000 seeds 34-36.
The HCP-MMP1.0 atlas, which is defined in a surface-based coordinate system ("greyordinates") in the CIFTI format 37, was used for the parcellation of the cerebrum. In this study, the built-in HCP MMP1.0 atlas of DSI Studio was used to convert all 180 areas from the surface-based coordinate system to volumetric coordinates.
Quantitative tractography analysis was conducted using the 'connectivity matrix' function in DSI Studio to generate matrices representing the number of fibers ending in regions of a per-subject aligned HCP MMP1.0 atlas. After the bilateral connectivity matrices were generated, the number of streamlines corresponding to each connection was divided by the total number of streamlines in each tract 34,35.
HCP1065 template. The HCP1065 template was constructed from the diffusion MRI data of 1065 subjects from the Human Connectome Project (2017 Q4, 1200-subject release). The HCP1065 data are shared under the WU-Minn HCP open access data use terms. The HCP1065 registration is based on the nonlinear ICBM152 2009a space. A multishell diffusion scheme was used, and the b-values were 1000, 2000, and 3000 s/mm². The number of diffusion sampling directions was 90, 90, and 90, respectively. The in-plane resolution was 1.25 mm, and the slice thickness was 1.25 mm. The diffusion data were reconstructed in the MNI space using q-space diffeomorphic reconstruction to obtain the spin distribution function 36. A diffusion sampling length ratio of 1.7 was used, and the output resolution was 1 mm. The analysis was conducted using DSI Studio (http://dsi-studio.labsolver.org).
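The streamline normalization described for the connectivity matrices (each connection's streamline count divided by the total for the tract) can be illustrated with a minimal sketch. The area labels are taken from the HCP-MMP1.0 atlas, but the counts and the `streamline_fractions` helper are hypothetical, and the tract total is assumed here to be the sum of the listed endpoint counts.

```python
def streamline_fractions(endpoint_counts):
    """Convert raw streamline counts per cortical area into fractions of the tract.

    endpoint_counts maps an area label to the number of streamlines ending there.
    The total is assumed to be the sum over all listed areas.
    """
    total = sum(endpoint_counts.values())
    if total == 0:
        raise ValueError("tract contains no streamlines")
    return {area: count / total for area, count in endpoint_counts.items()}


# Hypothetical counts for one callosal bundle
counts = {"10d": 120, "p10p": 60, "9p": 20}
fractions = streamline_fractions(counts)
for area, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{area}: {frac:.2f}")
```

Expressing endpoints as fractions rather than raw counts makes bundles of different sizes comparable when ranking the cortical areas they connect.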
Tract-based spatial statistics (TBSS). The preprocessed diffusion MRI data from ADNI-3 were further processed with DSI Studio (http://dsi-studio.labsolver.org). The diffusion data were reconstructed in the MNI space using q-space diffeomorphic reconstruction (QSDR), an extension of generalized q-sampling imaging (GQI), to obtain the spin distribution function (SDF) 36. GQI obtains the SDF from the shell sampling scheme used in q-ball imaging (QBI), which is more sensitive to intravoxel orientational heterogeneity than the classical diffusion tensor imaging (DTI) algorithm. Generalized fractional anisotropy (gFA), the most widely used QBI measure, is considered the QBI analog of DTI-derived FA 38. Since Corbo et al. (2014) showed the advantage of gFA-based TBSS over FA-based TBSS, we conducted TBSS using gFA instead of FA, as described previously 39. In this study, 'gFA' means generalized FA, while 'FA' means DTI-FA.
After obtaining the gFA or MD (mean diffusivity) images from the diffusion data reconstructed by DSI Studio, we conducted voxel-wise statistical analysis of the gFA or MD data using TBSS (Tract-Based Spatial Statistics) in FSL 14,32. TBSS projects all subjects' gFA or MD data onto the mean gFA or MD tract skeleton, respectively, before applying voxel-wise cross-subject statistics. TBSS aims to improve the sensitivity, objectivity, and interpretability of the analysis of multi-subject diffusion imaging studies (https://fsl.fmrib.ox.ac.uk). For all TBSS analyses, p < 0.05 was considered significant. Since the null distribution is not known, nonparametric permutation tests were used for thresholding the statistic maps to detect differences in gFA or MD between EMCI and LMCI subjects. Threshold-free cluster enhancement (TFCE) was applied to find significant clusters of voxels (p < 0.05) with family-wise error (FWE) correction for multiple comparisons.
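The idea behind the nonparametric permutation test can be sketched for a single skeleton voxel. This is a simplified illustration, not the FSL procedure: it tests only the difference in group means at one voxel, omits TFCE and FWE correction entirely, and the `permutation_p_value` helper and the sample values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Group labels are shuffled repeatedly to build the null distribution,
    so no parametric form of the null has to be assumed.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero
    return (exceed + 1) / (n_permutations + 1)


# Hypothetical gFA values at one voxel for two small groups
emci = [0.50, 0.52, 0.51, 0.49, 0.53, 0.50]
lmci = [0.40, 0.41, 0.39, 0.42, 0.40, 0.38]
p_value = permutation_p_value(emci, lmci)
print(f"p = {p_value:.4f}")
```

In a whole-skeleton analysis the same label shuffling is applied to every voxel at once, and the maximum enhanced statistic across voxels in each permutation is what provides the family-wise error control.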
The ROI-to-ROI fMRI analysis computes the temporal correlation of BOLD activity between a given area and all other areas using a General Linear Model (GLM) approach. For the segmentation of cortical areas, the HCP MMP1.0 atlas built into DSI Studio (http://dsi-studio.labsolver.org) was imported into CONN for the FC analyses. All FC measures were available in CONN for each subject and each condition (first-level analyses). Subject-specific contrast images reflecting standardized correlation coefficients were obtained for further analyses. The correlation coefficient (r) was converted to a normally distributed variable (z) by Fisher's z-transformation.
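Fisher's z-transformation mentioned above is z = arctanh(r) = 0.5 · ln((1 + r) / (1 − r)). A minimal sketch follows, assuming a plain Pearson correlation between two ROI time series in place of the GLM-based estimate used in CONN; the helper names and the short time series are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fisher_z(r):
    """Fisher r-to-z transformation; only defined for -1 < r < 1."""
    return math.atanh(r)


# Hypothetical BOLD time series for two ROIs
roi_a = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9]
roi_b = [0.2, 0.3, 0.5, 0.6, 0.7, 0.7]
r = pearson_r(roi_a, roi_b)
z = fisher_z(r)
print(f"r = {r:.3f}, z = {z:.3f}")
```

Because the transformation stretches values near ±1, z-values are better suited than raw r-values for averaging across subjects and for use as classifier features.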

Graph measures (ROI-level).
With the graph theory analysis in the CONN toolbox (https://web.conn-toolbox.org/fmri-methods/connectivity-measures/graphs-roi-level), we explored resting-state functional connectivity (RSFC) between brain areas by the ROI-to-ROI approach. All ROI-level graph measures are based on nondirectional graphs with nodes (ROIs) and edges (suprathreshold connections). For each subject, a graph adjacency matrix A is computed by thresholding the associated ROI-to-ROI Correlation (RRC) matrix r by an absolute or relative threshold. Then, based on the resulting graphs, a number of measures can be computed addressing topological properties of each ROI within the graph as well as of the entire network of ROIs 41,42.
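The thresholding step and the two segregation measures used later in this study (clustering coefficient and local efficiency) can be sketched on a toy correlation matrix. This is a minimal illustration, assuming an absolute threshold and the common definition of local efficiency as the efficiency of the subgraph formed by a node's neighbors; the 4-ROI matrix, the threshold of 0.3, and all function names are hypothetical and do not reproduce the CONN implementation.

```python
from collections import deque
from itertools import combinations


def threshold_adjacency(r, threshold):
    """Binary, undirected adjacency matrix: edge where r exceeds the threshold."""
    n = len(r)
    return [[1 if i != j and r[i][j] > threshold else 0 for j in range(n)]
            for i in range(n)]


def neighbors(adj, i):
    return [j for j, edge in enumerate(adj[i]) if edge]


def clustering_coefficient(adj, i):
    """Fraction of pairs of neighbors of node i that are themselves connected."""
    nb = neighbors(adj, i)
    k = len(nb)
    if k < 2:
        return 0.0
    links = sum(adj[u][v] for u, v in combinations(nb, 2))
    return 2.0 * links / (k * (k - 1))


def local_efficiency(adj, i):
    """Mean inverse shortest-path length within the subgraph of node i's neighbors."""
    nb = neighbors(adj, i)
    k = len(nb)
    if k < 2:
        return 0.0
    nb_set = set(nb)
    total = 0.0
    for source in nb:
        # Breadth-first search restricted to the neighbor subgraph
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors(adj, u):
                if v in nb_set and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (k * (k - 1))


# Hypothetical 4-ROI correlation matrix
rrc = [
    [1.0, 0.8, 0.6, 0.5],
    [0.8, 1.0, 0.7, 0.1],
    [0.6, 0.7, 1.0, 0.2],
    [0.5, 0.1, 0.2, 1.0],
]
adj = threshold_adjacency(rrc, 0.3)
for node in range(len(adj)):
    print(node, clustering_coefficient(adj, node), local_efficiency(adj, node))
```

Both measures are computed per node, which is why they yield one value per cortical area and can be stacked directly into a per-subject feature vector.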

Machine learning (ML)-based classifications.
We used scikit-learn (https://scikit-learn.org/), a machine learning (ML) library for Python 3, to run multiple ML classification algorithms 43. Based on the selected feature subsets, ML models were built using several classifiers, including support vector machine (SVM), K-nearest neighbors (KNN), logistic regression (LR), random forest (RF), gradient boosting classifier (GBC), and adaptive boosting (AdaBoost) 27,44,45. SVM, a supervised learning method, searches for the optimal separating hyperplane between classes, i.e., the one that maximizes the margin. LR is a statistical technique that models the relationship between a dependent variable and the independent variables, where the dependent variable is binary. KNN, a supervised learning algorithm, predicts the class of a test sample by calculating the distance between that sample and all the training points. RF, GBC, and AdaBoost are ensemble ML algorithms, based on the idea of creating a highly accurate prediction rule by boosting or bagging many relatively weak and inaccurate rules to improve generalizability and robustness over a single estimator 44,45.
Scientific Reports | (2022) 12:4284 | https://doi.org/10.1038/s41598-022-08231-y
Ten-fold cross-validation (CV) was used to evaluate each ML model 27. We used the stratified cross-validator 'StratifiedKFold (n_splits = 10, random_state = 0)' of scikit-learn (https://scikit-learn.org/) in all evaluations, which enabled us to compare classification performance under the same conditions. Briefly, in each fold the data were split into 'training' and 'test' sets. The test portion was held out exclusively for testing (evaluation): models were trained using only the 'training' set, and model performance was assessed using only the 'test' set.
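The evaluation scheme described above can be sketched with scikit-learn as follows. This is an illustration under several assumptions rather than the study's code: the feature matrix is synthetic (66 subjects with 24 features, mirroring only the group sizes reported in this study), all classifiers use default hyperparameters apart from the fixed seeds and `max_iter`, and `shuffle=True` is added to `StratifiedKFold` because recent scikit-learn releases are expected to reject a `random_state` when shuffling is disabled.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in data: 34 "EMCI" (label 0) and 32 "LMCI" (label 1) subjects
rng = np.random.default_rng(0)
y = np.array([0] * 34 + [1] * 32)
X = rng.normal(size=(y.size, 24))
X[y == 1, :4] += 0.8  # shift a few features so the classes are partly separable

classifiers = {
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=0),
    "GBC": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# The same seeded splitter is reused so every model sees identical folds
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]

results = {}
for name, model in classifiers.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    # Mean of each metric over the ten held-out test folds
    results[name] = {m: float(np.mean(scores[f"test_{m}"])) for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Because `cross_validate` refits a fresh copy of each model on the training portion of every fold, the test portion never influences training, which matches the hold-out principle stated in the text.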
The classification performance of the different classifiers was evaluated using accuracy (ACC), precision, recall, and F1 score, which were calculated from the confusion matrix of the classification results. The area under the receiver operating characteristic curve (AUC(ROC)) was calculated using 'roc_auc' in scikit-learn (https://scikit-learn.org/). We took the mean of each metric to evaluate classification performance. The definitions of ACC, precision, recall, and F1 score are given as follows, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively:
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
We first identified the cortical areas connected by the affected callosal fibers, and then conducted rs-fMRI analyses to calculate the resting-state functional connectivity (RSFC) and graph-theoretical metrics in those cortical areas, respectively. Using the ROI-to-ROI binary correlation matrix (360-by-360 matrix based on the HCP-MMP1.0 atlas) of each subject, we obtained the functional correlation coefficient (r) between connected cortical areas, which was transformed to a z value with the Fisher r-to-z transformation (i.e., the TBSS-RSFC method). The TBSS-RSFC method resulted in a feature vector with 12 elements/subject (12 = 1 correlation coefficient (z) × 12 connected areas). We also measured two graph-theory metrics representative of functional segregation (i.e., clustering coefficient and local efficiency) in the cortical areas by applying graph theory to the rs-fMRI analyses (i.e., the TBSS-Graph method) with the CONN-fMRI toolbox for SPM12. The TBSS-Graph method resulted in a feature vector with 36 elements/subject (36 = 2 graph-theory metrics × 18 HCP-MMP1.0 areas).
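The four confusion-matrix metrics used for evaluation in this section (ACC, precision, recall, and F1 score) follow their standard definitions and can be computed directly from the confusion-matrix counts. The sketch below assumes LMCI is treated as the positive class, which the text does not state; the fold counts and the `classification_metrics` helper are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test fold of 7 subjects: 3 TP, 1 FP, 2 TN, 1 FN
fold_metrics = classification_metrics(tp=3, fp=1, tn=2, fn=1)
print(fold_metrics)
```

With only six or seven test subjects per fold, a single misclassified subject changes the fold accuracy by roughly 15 percentage points, which is one reason the mean over all ten folds is reported.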

Results
The LMCI-related white matter (WM) alterations by gFA-based TBSS. To search for the optimal ML model for EMCI/LMCI classification, we first used the diffusion MRI dataset in the ADNI database (see the flowchart in Fig. 1). The gFA-based TBSS indicated LMCI-related white matter (WM) changes in the Corpus Callosum (CC), the largest bundle of commissural fibers (Fig. 2A). The LMCI-related WM changes shown in Fig. 2A were subdivided into anterior ROIs (α, a), middle ROIs (β, b), and posterior ROIs (γ, c, and δ) in the CC (Fig. 2B). Then, to determine which cortical areas are possibly connected inter-hemispherically by the callosal fibers, we conducted fiber tracking using the areas of LMCI-related WM changes as ROIs in the template brain (HCP1065) (Fig. 2C). We then quantified the cortical areas to which each bundle of streamlines projects by cortical endpoint analyses. The tables in Fig. 2D list the top three cortical areas connected by the streamlines running through each pair of ROIs, which are shown overlaid on the template brain (Fig. 2E). The cortical endpoint analyses showed that the streamlines passing through the ROIs (α-a) connect the superior and middle frontal gyri (10d, p10p, 9p, 9a) in the frontal lobes (Fig. 2C,E). Those passing through the ROIs (β-b) connect the superior motor and precentral areas (SFL, SCEF, 6mp) (Fig. 2C,E). Those passing through the ROIs (γ-c) and (δ) connect cortical regions including the paracentral and postcentral cortices (3b, 5mv), precuneus (7Am), superior parietal lobes (5L, 7AL, 7PC), and occipital visual areas (V3, V3A), respectively (Fig. 2C,E). The table in Fig. 2F shows the mean diffusion parameters (mean FA and MD) in each ROI of LMCI-related WM changes. Although individual variation exists between regions, a reduced FA value was observed in ROI α in LMCI subjects compared with EMCI subjects (p < 0.05, t-test). Statistical data for Fig. 2F are provided in Supplementary Fig. S3.

Feature extraction by ROI-based and Graph theory-based RSFC.
Then, to extract features for classifying EMCI/LMCI subjects by machine learning (ML), we conducted rs-fMRI analyses to calculate the resting-state functional connectivity (RSFC) and graph theory metrics in the above cortical areas (Fig. 2E), respectively. First, using the ROI-to-ROI binary correlation matrix of each subject, we obtained the functional correlation coefficient (z) between connected cortical areas (i.e., the TBSS-RSFC method) (12 elements = 1 coefficient (z) × 12 connected areas in Fig. 2D). In addition, previous graph theory-based studies showed that functional segregation is impaired in AD patients 21. We therefore measured graph-theoretical metrics representative of functional segregation, namely the clustering coefficient and local efficiency, in the cortical areas above by applying graph theory to the rs-fMRI analyses (36 elements = 2 metrics × 18 cortical areas in Fig. 2E), i.e., the TBSS-Graph method.
The LMCI-related white matter alterations by MD-based TBSS. We also conducted mean diffusivity (MD)-based TBSS to investigate LMCI-related WM alterations (Fig. 3A). The LMCI-related WM changes are shown in Fig. 3B and were subdivided into the frontal, parietal, temporal, and occipital lobes, the Corpus Callosum (CC) and cingulum, and the insula and thalamus regions. We measured the volume (mm³) [...]. We also investigated the mean diffusion parameters (mean FA and MD) in each region of LMCI-related changes detected by MD-based TBSS (Fig. 3D). Although individual variation exists between regions, we found a higher MD value in the right parietal lobe in LMCI subjects compared with EMCI subjects. Statistical data for Fig. 3D are provided in Supplementary Fig. S3. We then prepared two additional feature subsets of diffusion parameters (FA, MD) in the altered WM regions detected by gFA-based TBSS and MD-based TBSS, respectively (24 elements = 6 regions/hemisphere × 2 hemispheres × 2 parameters (mean FA, MD)).
Machine learning approach and performance for EMCI/LMCI classification. The main purpose of this study was to search for the optimal ML model for EMCI/LMCI classification. Using the four feature subsets above, we applied multiple ML classifiers to distinguish LMCI from EMCI subjects, including support vector machine (SVM), k-nearest neighbors (KNN), decision tree classifier (DTC), logistic regression (LR), random forest (RF), gradient boosting classifier (GBC), and adaptive boosting classifier (AdaBoost). We compared the classification performance of these classifiers by calculating accuracy (ACC), recall, precision, F1 score, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, with tenfold cross-validation (CV). We took the mean of each metric to evaluate classification performance. The table shows that the AdaBoost classifier (highlighted in gray in Fig. 4A), an ensemble ML algorithm, provided better performance, with 70% accuracy and 79% AUC (ROC), using the diffusion-parameter features from MD-based TBSS.

Discussion
In the current study, we proposed several diffusion MRI-based ML approaches for EMCI/LMCI classification, based on the hypothesis that subtle brain changes could be detected earlier in white matter microstructure by diffusion MRI. Using four feature subsets extracted from single- or multi-modal MRI data including diffusion MRI, we trained and tested multiple ML models and assessed performance with cross-validation. Our results indicated that the single-modal diffusion parameters (FA, MD) provided better performance than the multimodal methods (TBSS-RSFC and TBSS-Graph). The diffusion parameters of the frontal and parietal lobes, Corpus Callosum, cingulum, insula, and thalamus were useful classification factors. In addition, those extracted from the left hemisphere were slightly more useful for classification than those from the right hemisphere. In general, combining different neuroimaging modalities could provide more complementary information than a single modality 27. However, our results showed that the single-modal features of diffusion MRI provided higher classification performance.
Our finding of left hemisphere-dominant features for classification, which might reflect the larger volume of altered white matter in the left hemisphere, is compatible with a previous study. Goryawala et al. showed that the significant brain-volume features for EMCI/LMCI classification are from the left hemisphere 28. These results suggest that asymmetrical white matter alterations could occur during MCI progression. Additionally, our finding that features from the frontal lobe, parietal lobe, and cingulum are useful for classification is partially in agreement with previous studies. Hojjati et al. identified significantly different networks in MCI subtypes, including those in the frontal, temporal, and parietal gyri 48. Goryawala et al. showed that the significant factors for EMCI/LMCI classification are the cortical volumes of the temporal, parietal, and cingulate regions 28. Sheng et al., using graph theory metrics, selected features in the temporal or cingulate cortex 26. Further, our findings suggest an association of the insula and thalamus with the classification of MCI subtypes. Numerous studies have revealed insular gray matter loss 49, dysfunction of the insular network at the early stage of AD 50, and pre-symptomatic changes in the thalamus 51. These findings could reflect possible white matter alterations in the insula and thalamus during MCI progression.
Over the past decade, several ML approaches have been proposed for classification of AD and MCI. Current diagnostic methods for AD mainly depend on neuropsychological tests, neuroimaging, and biofluids, including cerebrospinal fluid (CSF) and serum 52 . Gurevich et al. and Kang et al. applied neuropsychological scores for discrimination of AD and cognitive impairment by ML 53,54 . Some studies used CSF and serum data for classification by ML 55,56 . A number of neuroimaging approaches have been applied in classification of AD and MCI, including positron emission tomography (PET) of Aβ-amyloid and tau deposition, structural MRI to detect brain atrophy, diffusion MRI and functional MRI 29,57-61 .
Although our results showed that single-modal features of diffusion MRI provided higher performance, a number of studies have effectively classified MCI subtypes by multi-modal MRI analyses with optimal feature selection 8,24-26,28,47,48. In general, the feature matrix extracted from MRI or PET analyses contains a large number of irrelevant or redundant features. To remove irrelevant features and reduce feature dimensions, feature selection is typically performed before classification [...] 25. Collectively, these results suggest that optimal feature selection from multi-modal MRI data might be critical for improving classification performance. Thus, previous studies have typically combined multi-modal features after extracting the data of each single modality, followed by optimal feature selection. In contrast, our method sequentially integrated diffusion MRI with rs-fMRI or graph theory. We presume that our sequential integration methods (the TBSS-RSFC and TBSS-Graph methods) over-reduced the features and lost their non-linear mutual relations, which might have led to the poor classification performance.
In general, white matter (WM) damage is considered to precede GM atrophy and network dysfunction 29. Our TBSS analyses showed LMCI-related WM changes in the Corpus Callosum (CC). A number of studies using structural and diffusion MRI have revealed WM changes in the CC in neurological diseases, including AD, bipolar disorder, schizophrenia, and Huntington's disease 62-66. WM changes can develop as a consequence of a number of factors, including demyelination, a decreased number of axons, and/or cortical grey matter (GM) atrophy 67,68. It therefore remains unclear whether WM changes in the CC are specific to each disease.
Two different mechanisms have been proposed to cause CC atrophy in AD: direct myelin damage of callosal fibers, and cell death in the GM, particularly of the large pyramidal cells in cortical layer III 69. Presumably, WM changes in the CC can affect inter-hemispheric communication. Vecchio et al. (2015) showed that the FA reduction in the CC on DTI analysis is associated with a loss of inter-hemispheric functional connectivity on resting-state EEG in MCI and AD patients 70. Further, reduced FA and increased MD in cognition-related WM tracts (e.g., cingulum, superior longitudinal fasciculus) are correlated with MMSE score in AD patients 71. These results suggest that WM alterations in MCI subjects could partly lead to the disrupted segregation of the neural network in AD 21. The pathophysiological process of AD reportedly starts 20 years or more before symptoms 2,3. The deposition of Aβ-amyloid is one of the early signs at the preclinical AD stage. Several neuroimaging studies have shown a relationship between early white matter alterations and amyloid deposition measured with amyloid-β PET 72-75. Although the results are not completely consistent, those studies have suggested that white matter microstructural changes (reduced FA and increased MD values) can be correlated with Aβ-amyloid deposition. Taken together, these findings might support our hypothesis that subtle brain changes can be detected earlier by diffusion MRI data.
This study was subject to several limitations, and several issues need to be addressed further. First, the sample size of MCI subjects, especially that of LMCI, was limited in the ADNI-3 dataset. ADNI imaging was carried out at more than 50 imaging centers, using scanners from the three major MR vendors (GE, Siemens, and Philips). Although the ADNI MRI core has established a standard set of protocols and procedures (www.adni-info.org), the different scanners could introduce inconsistencies into the analyses of imaging data. Since the ADNI-3 project is still in progress, the clinical and neuropsychological information for each subject is limited. While this paper was being prepared, additional neuropsychological scores and biomarkers became available, and we added this information to Table 1 and Supplementary Figs. S1, S2. Based on recent studies, various neuropsychological scores and biomarkers, including MMSE, RAVLT, CSF protein levels, and Apolipoprotein-E (APOE) genotype, could improve classification performance when combined with neuroimaging 28,76. The abnormal memory function in MCI was determined by a single memory score 2,4,5,7, which could lead to misclassifications that cause low accuracy and specificity in the present study. Our results have to be verified with larger datasets and follow-up longitudinal studies to reduce individual variations and validate the proposed ML model.
In conclusion, the feature set of diffusion parameters in the regions of LMCI-related WM changes was useful for distinguishing LMCI from EMCI subjects when combined with an ensemble ML algorithm.