Identification of patients with and without minimal hepatic encephalopathy based on gray matter volumetry using a support vector machine learning algorithm

Minimal hepatic encephalopathy (MHE) is characterized by diffuse abnormalities in cerebral structure, such as reduced cortical thickness and altered brain parenchymal volume. This study tested the potential of gray matter (GM) volumetry to differentiate between cirrhotic patients with and without MHE using a support vector machine (SVM) learning method. High-resolution, T1-weighted magnetic resonance images were acquired from 24 cirrhotic patients with MHE and 29 cirrhotic patients without MHE (NHE). Voxel-based morphometry was conducted to evaluate the GM volume (GMV) for each subject. An SVM classifier was employed to explore the ability of the GMV measurement to diagnose MHE, and the leave-one-out cross-validation method was used to assess classification accuracy. The SVM algorithm based on GM volumetry achieved a classification accuracy of 83.02%, with a sensitivity of 83.33% and a specificity of 82.76%. The majority of the most discriminative GMVs were located in the bilateral frontal lobe, bilateral lentiform nucleus, bilateral thalamus, bilateral sensorimotor areas, bilateral visual regions, bilateral temporal lobe, bilateral cerebellum, left inferior parietal lobe, and right precuneus/posterior cingulate gyrus. Our results suggest that SVM analysis based on GM volumetry has the potential to help diagnose MHE in cirrhotic patients.

mild to be identified by routine physical and neurological examinations. MHE patients are often misdiagnosed or left untreated because their subtle neurocognitive impairments require specific neuropsychological and neurophysiological tests to be detected 15 .
Notably, the structural abnormalities occurring in the gray matter (GM) are considered to contribute to the neuropsychological dysfunction in MHE and have been associated with the progression of HE 6,16,17 . Several studies even proposed that regional GM morphometry (such as regional volume and cortical thickness measurements) could help to predict the existence of MHE 18,19 . Given these findings, we used a support vector machine (SVM) learning method to test the extent to which GM volumetry can distinguish between cirrhotic patients with and without MHE. Additionally, this study aimed to identify the specific GM regions that contributed the most to differentiating between the two patient groups.

Subjects. This study was approved by the Research Ethics Committee of Fujian Medical University Union Hospital and was conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all the study subjects: cirrhotic patients with MHE (n = 24) and those without MHE (NHE, n = 29). Table 1 lists the demographic and clinical characteristics of the study participants. Exclusion criteria included a current diagnosis of overt HE (OHE) or other neuropsychiatric disorders, the use of psychotropic medications, the presence of uncontrolled endocrine or metabolic diseases such as thyroid dysfunction, or recent alcohol abuse (within six months prior to the study). The diagnosis of OHE was based on the West Haven criteria 15 . MHE was diagnosed using the Psychometric Hepatic Encephalopathy Score (PHES), which comprises a battery of neuropsychological assessments including the digit symbol test, number connection test A, number connection test B, serial dotting test, and line tracing test. Patients with a PHES score ≤ 5 were diagnosed with MHE. Details about the PHES examination and MHE diagnosis have been described previously 20,21 .
MRI acquisition. A 3-T MR scanner (Siemens, Verio, Germany) was used to acquire high-resolution T1-weighted images with a magnetization-prepared rapid gradient echo (MPRAGE) sequence. Image acquisition parameters were as follows: repetition time (TR) = 1900 ms, echo time (TE) = 2.48 ms, flip angle = 9°, field of view (FOV) = 256 mm × 256 mm, matrix = 256 × 256, number of sagittal slices = 176, and slice thickness = 1 mm.
MRI processing. Image processing was performed using Statistical Parametric Mapping software (SPM8) (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/). In brief, the standard unified segmentation model in SPM8 was used to separate the structural MRI images into gray matter, white matter, and cerebrospinal fluid.
Then, the Diffeomorphic Anatomical Registration Through Exponentiated Lie algebra (DARTEL) approach was employed to generate a GM template from all the images 22 , and the template was spatially registered to the tissue probability map in standard Montreal Neurological Institute space. Following this affine registration, each gray matter MR image was non-linearly warped to the above GM template with a 1.5-mm cubic resolution. The GM volume of a single voxel was calculated by multiplying the GM map by the non-linear determinants derived from the spatial normalization step. Finally, the resulting images were smoothed with an 8 mm³ full-width at half-maximum (FWHM) kernel.
Support vector machine analysis. Compared to other classification algorithms, SVMs have good performance and generalization capability when processing small-sample data 23,24 . Through the kernel transformation, SVMs can map the input objects into a higher-dimensional space. To make the classification accuracy as high as possible, a hyperplane is selected that maximizes the margin of separation between the distinct classes. The key problem for SVMs is how to construct this optimal hyperplane.
www.nature.com/scientificreports
Assuming a binary classification problem, the input training data has m samples in the form of $\langle x_i, y_i \rangle$, where $x_i$ is an n-dimensional vector and $y_i$ is the class label. The optimal hyperplane that separates the given data is then defined as
$$ w^{T}\Phi(x) + b = 0 \qquad (1.1) $$
where w is the "normal vector" perpendicular to the hyperplane, b is the offset parameter, Φ is the function of nonlinear transformation, and T represents the matrix transpose.
Through mathematical derivation, the SVM classifier with the maximum margin can be obtained by solving the following optimization problem:
$$ \min_{w,\, b,\, \xi} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{m} \xi_i \quad \text{subject to} \quad y_i \left( w^{T}\Phi(x_i) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, \dots, m \qquad (1.2) $$
where $\xi_i$ is the "slack variable" representing the amount by which each data point deviates from the separation margin, and C is a predetermined constant that controls the trade-off between maximizing the margin and tolerating training errors. Once the "normal vector" w and the offset parameter b in Eq. (1.2) are calculated, the classification (class label $y_i$) can be predicted for a new sample based on Eq. (1.1). Accordingly, as shown in Fig. 1, once the parameters w and b have been calculated, the decision boundaries can be described by the equations $w^{T}x + b = +1$ and $w^{T}x + b = -1$. These decision boundaries are chosen to achieve the maximum margin separating the two classes. The data points lying on the decision boundaries are called "support vectors".
In this study, the SVM algorithm was carried out using the PRoNTo software (Pattern Recognition for Neuroimaging Toolbox, version 2.1, http://www.mlnl.cs.ucl.ac.uk/pronto/prtsoftware.html) 25 . Each T1-weighted structural image was considered one data point in a high-dimensional space defined by its GM volume (GMV) values. In this high-dimensional space, the linear decision boundaries classified the brain scans based on their class label (i.e., the NHE and MHE groups). Specifically, the classifier was trained by providing the samples in the form of $\langle x_i, y_i \rangle$ to find the optimal hyperplane, where $x_i$ represented the input GMV feature and $y_i$ was the class label (NHE or MHE). The optimal hyperplane was computed based on the varying patterns of GMV values across the T1-weighted images.
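As an illustration of the soft-margin formulation described above, the following minimal Python sketch evaluates the linear decision function and the objective of Eq. (1.2) for a fixed, hand-picked hyperplane. It assumes the identity feature map (linear case); the toy data, the values of w and b, and all function names are hypothetical and are not taken from the study or from PRoNTo. It does not solve the optimization problem, it only shows how the slack variables and the constant C enter the objective.

```python
# Illustrative sketch only: toy 2-D data, not GMV features from the study.

def decision_value(w, b, x):
    """Linear decision function f(x) = w^T x + b (identity feature map)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Predict the class label (+1 or -1) from the sign of f(x)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def soft_margin_objective(w, b, X, y, C=1.0):
    """Return 0.5 * ||w||^2 + C * sum(xi_i) and the slack values,
    with xi_i = max(0, 1 - y_i * f(x_i))."""
    slacks = [max(0.0, 1.0 - yi * decision_value(w, b, xi))
              for xi, yi in zip(X, y)]
    objective = 0.5 * sum(wi * wi for wi in w) + C * sum(slacks)
    return objective, slacks


# Hypothetical toy samples; the last one lies on the wrong side of the plane.
X = [(2.0, 2.0), (3.0, 1.0), (-2.0, -1.0), (-1.0, -3.0), (0.2, 0.1)]
y = [1, 1, -1, -1, -1]
w, b = (0.5, 0.5), 0.0  # hand-picked hyperplane, not an optimized solution

objective, slacks = soft_margin_objective(w, b, X, y, C=1.0)
labels = [predict(w, b, x) for x in X]
print(labels, slacks, objective)
```

Only the misclassified sample receives a non-zero slack, so a larger C would penalize that training error more heavily relative to the margin term.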
We chose a linear kernel over a non-linear kernel for several reasons. First, non-linear kernels have been reported not to improve prediction accuracy in such a high-dimensional space 26,27 . More importantly, a linear kernel reduces the risk of over-fitting, greatly increases computational efficiency, and permits whole-brain classification without dimensionality reduction 28 . The similarity matrix was pre-computed using the linear kernel in the PRoNTo software and was then provided to the SVM classifier. Each element of the similarity matrix was calculated as the dot product of the GMV feature vectors of a pair of subjects. The SVM classifier then yielded the weight vector (i.e., the "normal vector" w), which served as the SVM discrimination map. The weight metric (Wi in Table 2) indicates the strength of the contribution of the GMV feature to the SVM classifier. In our study, we set the parameter C = 1, in line with previous neuroimaging studies 29,30 . Note that several factors (i.e., individual age, sex, and education level) were included as covariates and regressed out using the PRoNTo software before the SVM model was built.
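The pre-computed linear kernel and the weight vector can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not PRoNTo's implementation: the three "subjects" with three features each are toy values, the dual coefficients `alphas` are hypothetical rather than the output of a solver, and the function names are invented here. The sketch relies on the standard result that, for a linear kernel, the weight vector is the sum of the training samples weighted by their dual coefficients and labels.

```python
# Illustrative sketch only: toy feature vectors stand in for GMV maps.

def linear_kernel_matrix(X):
    """Similarity (Gram) matrix with K[i][j] = dot product of x_i and x_j."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def weight_vector(alphas, y, X):
    """Linear-kernel weight vector w = sum_i alpha_i * y_i * x_i."""
    n_features = len(X[0])
    return [sum(a * yi * xi[k] for a, yi, xi in zip(alphas, y, X))
            for k in range(n_features)]


X = [(1.0, 0.0, 2.0), (0.0, 1.0, 1.0), (2.0, 1.0, 0.0)]  # hypothetical subjects
y = [1, -1, -1]
alphas = [0.5, 0.0, 0.5]  # hypothetical dual coefficients (not solved for)

K = linear_kernel_matrix(X)
w = weight_vector(alphas, y, X)
print(K)
print(w)
```

Because the kernel matrix has one row and column per subject rather than per voxel, its size depends only on the number of subjects, which is one reason a whole-brain linear analysis stays computationally cheap.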
The "leave-one-out" cross-validation strategy was adopted in accordance with previous studies 31,32 : a single subject is held out for testing and the remaining subjects are used for training. This procedure was repeated so that every subject was held out exactly once, and the results were pooled to assess the overall accuracy of the SVM 23 . A permutation test (1000 permutations) was applied to determine the statistical significance of the classification accuracy 33,34 .
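The leave-one-out and permutation procedures can be sketched as follows. This is a minimal illustration under stated assumptions: a nearest-centroid rule is swapped in for the SVM to keep the example dependency-free, the one-dimensional toy data are hypothetical, and the p-value formula (count of permuted accuracies at least as high as the observed one, plus one, divided by the number of permutations plus one) is a common convention that may differ from what PRoNTo computes.

```python
import random

# Illustrative sketch only: nearest-centroid stand-in classifier, toy data.

def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean."""
    centroids = {}
    for label in set(train_y):
        members = [xi for xi, yi in zip(train_X, train_y) if yi == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


def loo_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and pool the results."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


def permutation_p_value(X, y, n_permutations=1000, seed=0):
    """Compare the observed accuracy with accuracies under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if loo_accuracy(X, shuffled) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


X = [(2.0,), (2.5,), (3.0,), (3.5,), (-2.0,), (-2.5,), (-3.0,), (-3.5,)]
y = [1, 1, 1, 1, -1, -1, -1, -1]
observed_accuracy, p_value = permutation_p_value(X, y, n_permutations=200)
print(observed_accuracy, p_value)
```

With this convention the smallest attainable p-value is 1 / (n_permutations + 1), so the number of permutations sets the resolution of the significance estimate.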
We analyzed the correlation between the test margin and the PHES results using Pearson correlation analysis. The test margin was computed by projecting the input GMV feature onto the "normal vector" of the hyperplane. Accordingly, a larger absolute value of the test margin meant that the subject lay further away from the hyperplane.

Results
MHE patients performed significantly worse in all five subtests of the PHES assessment (resulting in a lower final score), indicating significant cognitive deficits compared to the NHE subjects. Figure 2 shows the SVM classification performance based on GMV between the 29 NHE subjects and the 24 MHE patients. The overall accuracy rate was 83.02% (P = 0.001), with a sensitivity of 83.33% and a specificity of 82.76%. As shown in Fig. 3, the area under the receiver operating characteristic (ROC) curve was 0.94, indicating a high probability of correctly discriminating between the NHE and MHE individuals. Pearson correlation analysis indicated a positive correlation between the test margin and the PHES results (r = 0.647, P = 1.6 × 10⁻⁷). Taken together, these results suggest that subjects whose PHES score lies far from the diagnostic cutoff are unlikely to be misclassified.
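The reported percentages can be checked with simple arithmetic. The sketch below assumes MHE is the positive class; the confusion-matrix counts are not stated in the text but are inferred here as the only whole-number counts consistent with the group sizes (24 MHE, 29 NHE) and the reported rates.

```python
# Counts inferred from the reported percentages and group sizes (an assumption).

def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# 20 of 24 MHE patients and 24 of 29 NHE patients correctly classified.
accuracy, sensitivity, specificity = classification_metrics(tp=20, fn=4, tn=24, fp=5)
print(f"accuracy={accuracy:.2%} sensitivity={sensitivity:.2%} "
      f"specificity={specificity:.2%}")
```

Under these inferred counts, 44 of the 53 subjects are classified correctly, which reproduces the overall accuracy of 83.02%.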
We identified the GM regions that were more associated with MHE or more associated with NHE by setting the threshold to ≥30% of the maximum weight vector scores, as per previous studies 30,35 . Those GM regions with a high absolute value of Wi had a higher discriminant power between groups. Specifically, GM regions with positive weight values were stronger contributors to recognizing individuals in the NHE group and those with negative weight values were stronger contributors to recognizing individuals in the MHE group (Table 2).
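The thresholding rule can be sketched as follows. The region names and weights are hypothetical placeholders, and applying the 30% cutoff to the absolute value of the weights is an assumption made here so that negative (MHE-associated) weights can also pass the threshold; the sign-to-group mapping follows the description above.

```python
# Illustrative sketch only: hypothetical region names and weight values.

def threshold_weights(weights, fraction=0.30):
    """Keep regions whose |weight| is at least `fraction` of the largest |weight|."""
    cutoff = fraction * max(abs(w) for w in weights.values())
    return {region: w for region, w in weights.items() if abs(w) >= cutoff}


def associated_group(weight):
    """Positive weights favor the NHE class, negative weights the MHE class."""
    return "NHE" if weight > 0 else "MHE"


weights = {
    "region_a": 0.0100,
    "region_b": -0.0040,
    "region_c": 0.0020,
    "region_d": -0.0031,
}

kept = threshold_weights(weights, fraction=0.30)
groups = {region: associated_group(w) for region, w in kept.items()}
print(groups)
```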

Discussion
In this study, SVM classification analysis with regional GMV as the indicator yielded 83.02% accuracy (83.33% sensitivity and 82.76% specificity) in classifying the MHE and NHE groups, suggesting the usefulness of gray matter volumetry in identifying early-stage hepatic encephalopathy among cirrhotic patients. Given that GM structural abnormalities worsen progressively as HE advances in cirrhotic patients, and that changes in GM volume and thickness are correlated with cognitive impairments in cirrhosis, it is not surprising that gray matter volumetry succeeded in differentiating between MHE and NHE 6,18,19 . The PHES has been designated the current "gold standard" for MHE diagnosis 15,36 , although its disadvantages have also been noted 37 , such as its reliance on considerable motor activity and the presence of a learning effect across the multiple tests. GM volumetry may help to overcome these disadvantages and play an important role in assisting diagnosis. In terms of GMV data, the most informative regions were the bilateral frontal lobe, bilateral lentiform nucleus, bilateral thalamus, bilateral sensorimotor areas, bilateral visual regions, bilateral temporal lobe, bilateral cerebellum, left inferior parietal lobe, right precuneus, and the right posterior cingulate gyrus. All of these areas have been frequently reported to be affected by liver dysfunction, which often induces energy metabolism disorders and the deposition of neurotoxic substances in the brain 38-41 .
The GM regions that contributed to the identification of NHE patients had significantly higher GMV values in control NHE subjects than in the MHE subjects. This reduction of GMV that we observed in the MHE group reflected the previously reported regional atrophy in MHE 6,17,18,42 . For example, cirrhotic patients with MHE have consistently shown a loss of gray matter in both cortical and subcortical structures, such as the frontal and parietal lobes, limbic areas, and striatum 6,42 , and all of these regions were identified in the discrimination map obtained by our SVM procedure. Additionally, this decreased GMV occurred in several brain networks such as the frontoparietal network, the default mode network, and the primary and secondary visual networks. Therefore, network-oriented, regional GM atrophy may also be able to predict the relevant neurological dysfunctions that are common in MHE, such as executive dysfunction, attention deficits, and impaired visuospatial ability 9,43-45 . Similarly, MHE-associated neuronal loss in the basal ganglia and frontal lobe may induce the disintegration of cortico-striatal circuits subserving motor and cognitive processes 46 , and the reduction in cerebellar volume can affect sensorimotor processing in cirrhotic patients with MHE. The brain regions that contributed to the identification of MHE subjects showed significantly higher GMV in MHE patients compared with NHE subjects. In agreement with this result, previous studies also revealed similar enlargements in these specific GM regions in cirrhotic patients. For example, cirrhotic patients with cognitive impairment have demonstrated a significant increase in cortical thickness in the bilateral lingual and parahippocampal gyrus, right posterior cingulate cortex, precuneus, peri-calcarine sulcus, and the fusiform gyrus 47 . 
In addition, cirrhosis is often accompanied by an increase in thalamic volume 19,42,48 , so much so that increased GMV in the thalamus has been regarded as an additional characteristic of MHE. Accordingly, it was not unexpected to find that GM regions, such as the bilateral thalamus, bilateral precentral and postcentral gyrus, bilateral inferior temporal gyrus, bilateral occipital lobe, bilateral cerebellum, left insula, and right precuneus, were identified in our study by SVM classification in the discrimination map.
(2020) 10:2490 | https://doi.org/10.1038/s41598-020-59433-1
It is important to note that the mechanisms underlying increased GMV in MHE are not well understood. One possible reason may be the diffuse, low-grade, cerebral edema related to Alzheimer's type II astrocytes during chronic liver disease 47,49 . The existence of both decreases and increases in GMV in MHE may reflect brain structural reorganization due to chronic liver failure. This MHE-associated neural plasticity possibly represents a compensatory mechanism to balance the negative influences of neural metabolic abnormalities.
Despite the compelling results of our study, its limitations are three-fold. 1) The relatively small sample size restricts the statistical power of the results. Accordingly, we encourage future studies to validate the classification potential of GMV using a larger number of study subjects. 2) The MHE patients in our study exhibited mild heterogeneity in terms of the etiology of their cirrhosis and their history of overt HE. This may introduce bias in the classification results since these factors can induce varying degrees of structural and functional impairments in the brain 6,48,50 . 3) While we only examined the discriminative potential of gray matter changes in MHE, mapping abnormal white matter alterations may also be useful to diagnose MHE 51 , which should be investigated in future studies.

Table 2. Brain regions contributing to the identification of MHE vs. NHE. Note: The above brain regions were identified by setting the classification threshold to ≥30% of the maximum weight vector scores. The first column lists only clusters larger than 200 voxels. Wi (reported in the last column) is the weight of each cluster centroid, i.e., the value that indicates the relative contribution of the GMV feature to the SVM-based classification.