Introduction

Neurodegenerative disorders, particularly dementia, pose a serious challenge in modern healthcare because they cause a steady, long-term loss of cognitive abilities. This deterioration affects language proficiency, memory retention, and executive function, and it also causes motivational deficits, motor impairments, and emotional distress. The progression of these symptoms not only reduces the autonomy of the people affected but also has a significant impact on their overall well-being and that of their caregivers.

Mild Cognitive Impairment (MCI) is a critical point on the Alzheimer’s Disease (AD) continuum of cognitive decline1. Each year, roughly 10% to 15% of elderly people with MCI progress to dementia, and 60–70% of dementia cases are caused by AD2. MCI is crucial to early diagnosis and intervention efforts aimed at minimizing the impact of cognitive impairment because it links the pre-clinical phases of AD to the full onset of AD-related dementia.

Evaluation of motor function, including gait and balance, in the elderly population may be a helpful clinical tool for forecasting many clinical outcomes, including mortality, neurological illness3, cognitive impairment4 and fall risk5. Postural control, or balance, results from the cooperation of several body systems, including the vestibular, motor, cognitive, visual, and sensory systems. Consequently, abnormalities in any of these systems, including neuropathology and cognitive decline, can cause deficits in balance3. The presence or severity of gait and/or balance disturbances is associated with a higher risk of Alzheimer’s dementia6. One study7 reported notable differences in static balance performance between MCI groups and a typical aging group. Standing balance may therefore be a helpful biomarker for the development of neurodegenerative diseases and mild cognitive impairment.

Wearable, wireless technology such as accelerometers and gyroscopes has recently emerged as a viable alternative to clinical and laboratory testing8,9. Inertial sensors are often integrated into wearable devices, enabling real-time capture of movement and acceleration data for analyzing posture and balance. Clinical tests can be conducted in conjunction with these tools, providing data that support better-informed decisions about the extent of cognitive decline and the subsequent treatment4,10.

Research on the use of static balance biomarkers derived from wearable inertial sensors to distinguish MCI patients from cognitively normal individuals is lacking. Studies11,12 have reported correlations between cognitive dysfunction and standing postural balance in MCI patients, using force platforms to assess static balance parameters. A recent study13 developed a new balance stability indicator (area under the curve, AUC = 0.806) using a stabilometer, whereas the researchers in14 assessed and compared the static balance ability of older adults with MCI standing on soft and hard support surfaces.

Some research uses dynamic balance tests that involve moderate activity, such as walking, which older patients may find difficult to perform8,15. Static standing balance measurement may be a simpler method that places less physical load on older adults16. Furthermore, the choice of feature set is very important for improving classification accuracy when machine learning techniques are employed17.

Therefore, we aimed to identify key balance biomarkers of MCI from wearable inertial sensor signals for early diagnosis using machine learning techniques. The contributions of our paper are summarized as follows:

  • We extract and evaluate a set of objective static balance measures, obtained from wearable inertial sensors, that differ between individuals with MCI and healthy controls.

  • We examine changes in static postural sway in four distinct conditions, i.e., Eyes Open, Eyes Closed, Right Leg Lift, and Left Leg Lift, to determine which one best discriminates between the groups.

  • We analyze the performance of key balance features for early detection of MCI using various machine learning models.

Methods

Participants

Sixty participants were carefully selected at the National Research Center for Dementia, Gwangju, South Korea, to avoid demographic biases such as age, height, and weight. They were divided into two groups: 30 participants classified as Cognitively Normal (CN) and 30 diagnosed with Mild Cognitive Impairment (MCI).

All the subjects were evaluated and diagnosed based on assessments conducted by medical professionals at Chosun University Hospital and Chonnam National University Hospital in Gwangju. The Gwangju Institute of Science and Technology’s (GIST) Institutional Review Board approved the study protocols and all experimental procedures were carried out according to the approved guidelines and regulations. Before the trials began, all subjects and/or their guardians gave written, informed consent.

Each subject underwent a comprehensive set of tests, including MRI scans to investigate brain anatomy and PET scans to detect beta-amyloid plaques. To evaluate cognitive ability, the Mini-Mental State Examination (MMSE)18 was used, with MCI diagnosis determined by scores deviating by more than 1.5 standard deviations from the norm19,20. The neuropsychological assessments were conducted using the Seoul Neuropsychological Screening Battery (SNSB)21, which covers five major cognitive domains: attention, language, visuo-spatial, memory, and frontal/executive. Participants with focal brain lesions, dementia unrelated to Alzheimer’s disease, or other severe medical, neurological, or mental problems that could impair cognitive function and balance were excluded from the study.

Analysis of Variance (ANOVA) was used to compare the two groups’ demographic and cognitive characteristics. The demographic and neuropsychological outcomes for all participants, along with the corresponding p-values, are shown in Table 1. Importantly, no significant differences in age, gender, height, weight, or education level were found between the CN and MCI groups.

Table 1 Demographics of subjects.

Data Acquisition Protocol

In the experimental protocol, subjects wore the Shimmer 3 inertial sensor22, a wearable device with a tri-axial accelerometer and a tri-axial gyroscope. The sensor was securely positioned on the participant’s lower back (specifically, at the L3-L5 vertebrae) using an adjustable belt and was closely monitored by an observer. This location is close to the Center-of-Mass (COM) of the human body, and acceleration data from a sensor positioned close to the COM reveal whole-body movements23. Subjects were instructed to remain in an upright posture with their arms by their sides under four different conditions: eyes-open (EO), eyes-closed (EC), right-leg lift (RL), and left-leg lift (LL), as shown in Fig. 1. Data were collected in two trials per subject. Prior to data collection, the sensor was calibrated following the procedure outlined in24. It was configured to measure within a range of ± 4 g at a sampling rate of 64 Hz. Bluetooth was used to transfer data to a nearby laptop, and ConsensysPRO25 was used to synchronize the data in real time.

Figure 1

(a) Sensor mounted on the participant’s back; data collection in four conditions: (b) Eyes-Open (EO), (c) Eyes-Closed (EC), (d) Right-Leg Lift (RL) and Left-Leg Lift (LL).

During signal processing, the collected data were filtered with an 8th-order zero-phase low-pass Butterworth filter with a 5 Hz cutoff frequency before subsequent analyses. The sensor’s x-, y-, and z-axes corresponded to the Medio-Lateral (ML), Vertical (V) and Antero-Posterior (AP) directions of the participants, respectively.
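The filtering step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes SciPy, a hypothetical helper name `lowpass_zero_phase`, and a synthetic test signal. It also assumes that the 8th-order response arises from a 4th-order design applied forward and backward, since the text does not say whether the stated order refers to the design or to the combined zero-phase response.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS_HZ = 64.0      # sampling rate stated in the acquisition protocol
CUTOFF_HZ = 5.0   # low-pass cutoff stated in the text


def lowpass_zero_phase(acc, fs=FS_HZ, cutoff=CUTOFF_HZ, order=4):
    """Zero-phase low-pass Butterworth filter applied along the time axis.

    Assumption: a 4th-order design run forward and backward, so the
    combined magnitude response is 8th order. Hypothetical helper.
    """
    sos = butter(order, cutoff, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, acc, axis=0)


# Synthetic example: 30 s of slow 0.5 Hz sway plus 20 Hz noise on three axes.
t = np.arange(0, 30, 1 / FS_HZ)
sway = 0.05 * np.sin(2 * np.pi * 0.5 * t)
noise = 0.02 * np.sin(2 * np.pi * 20.0 * t)
raw = np.column_stack([sway + noise] * 3)   # columns stand in for x, y, z
filtered = lowpass_zero_phase(raw)
```

Forward-backward filtering cancels the phase lag of a single pass, so sway events are not shifted in time relative to the raw recording.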

Feature Extraction

We systematically assessed a large number of features drawn from the literature on balance and fall risk in aging and neurodegenerative diseases. Our study focuses on standard quantitative balance parameters in the time and frequency/spectral domains. A set of 76 postural sway measures was employed, comprising 43 time-domain features and 33 frequency/spectral features, as shown in Table 2. Every feature was computed for each of the four standing balance conditions for every subject.

Table 2 Summary and brief description of the features.

Equation (1) is used to calculate the acceleration Signal Vector Magnitude (SVM). Some parameters (36–56, 65–67, 69–71) are computed for each axis (ML, V, AP); others (0–3, 5–16, 21–28, 57–64, 72–75) are computed for the SVM as well as for each axis; a few (17–20, 32–35) are calculated for the planes AP-ML, ML-V, and AP-V; and the remaining parameters (4, 29–31, 68) are calculated for the SVM only.

$$SVM(n) = \sqrt{A_x(n)^2 + A_y(n)^2 + A_z(n)^2} \tag{1}$$
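As a small worked illustration of Eq. (1), the sketch below computes the per-sample magnitude with NumPy. The function name `signal_vector_magnitude` and the two example samples are illustrative assumptions, not part of the study's pipeline.

```python
import numpy as np


def signal_vector_magnitude(acc):
    """Per-sample magnitude of tri-axial acceleration, following Eq. (1).

    acc: array of shape (n_samples, 3) holding A_x, A_y, A_z.
    """
    acc = np.asarray(acc, dtype=float)
    return np.sqrt(np.sum(acc ** 2, axis=1))


# Two hand-checkable samples: (3, 4, 0) gives 5 and (1, 2, 2) gives 3.
example = np.array([[3.0, 4.0, 0.0], [1.0, 2.0, 2.0]])
svm = signal_vector_magnitude(example)
```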

Feature Selection

Feature ranking was performed with “Leave-One-Subject-Out (LOSO)” cross-validation for each session. The dataset was divided iteratively into training and test sets, with one subject held out at a time. Two categories of feature-ranking techniques were applied. First, filter methods, namely ANOVA (Analysis of Variance) and Mutual Information (MI), assess the relevance of features independently of the classification model. MI measures the extent to which knowledge of one quantity reduces uncertainty about another, while ANOVA compares the variation in the group means. Second, wrapper methods, namely Random Forest and Support Vector Machine (SVM), evaluate features by considering how they affect the performance of a specific classification model.

Each feature was scored by the number of times it appeared in the top-15 feature list across all 60 folds. This score reflects how consistently a feature remained significant across subject exclusions. The aim was to find robust features that were consistently significant over a range of subjects and folds, which laid the groundwork for the subsequent analysis. The scores from the feature-ranking techniques were then combined into a final score, which indicates the relative weight of each feature across the various approaches.
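The counting scheme described above can be sketched with scikit-learn. This is an illustration under stated assumptions rather than the study's implementation: the function name `loso_feature_scores` is hypothetical, the data are synthetic and smaller than the real dataset (20 subjects, 12 features, top 3 instead of top 15), and Random Forest importances and linear-SVM weight magnitudes are used as stand-ins because the text does not specify how the model-based rankings were derived.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def loso_feature_scores(X, y, subject_ids, top_k=15, seed=0):
    """Count how often each feature enters each ranker's top_k list.

    Returns an array of shape (4, n_features), one row per ranker.
    Summing over rows gives a combined score per feature.
    """
    counts = np.zeros((4, X.shape[1]), dtype=int)
    splitter = LeaveOneGroupOut()
    for train_idx, _ in splitter.split(X, y, groups=subject_ids):
        X_tr = StandardScaler().fit_transform(X[train_idx])
        y_tr = y[train_idx]
        rf = RandomForestClassifier(n_estimators=50, random_state=seed)
        svm = SVC(kernel="linear", C=1.0)
        importances = [
            f_classif(X_tr, y_tr)[0],                            # ANOVA F
            mutual_info_classif(X_tr, y_tr, random_state=seed),  # MI
            rf.fit(X_tr, y_tr).feature_importances_,             # assumed
            np.abs(svm.fit(X_tr, y_tr).coef_[0]),                # assumed
        ]
        for row, importance in enumerate(importances):
            top = np.argsort(importance)[::-1][:top_k]
            counts[row, top] += 1
    return counts


# Synthetic data: 20 subjects, 2 trials each, 12 features.
# Only features 0 and 1 carry group information.
rng = np.random.default_rng(0)
subject_ids = np.repeat(np.arange(20), 2)
y = np.repeat(np.repeat([0, 1], 10), 2)
X = rng.normal(size=(40, 12))
X[:, 0] += 2.5 * y
X[:, 1] -= 2.5 * y

counts = loso_feature_scores(X, y, subject_ids, top_k=3)
combined = counts.sum(axis=0)
ranked = np.argsort(combined)[::-1]
```

Grouping by subject ID keeps both trials of the held-out subject out of the training fold, which is one plausible reading of the LOSO procedure when each subject contributes two trials.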

A hybrid feature selection approach that combines filter and wrapper methods was proposed in34. In that study, the features were first ranked according to the ranking criteria, and a wrapper algorithm was then invoked to generate a subset from the ranked features. Our study, in contrast, used both filter and wrapper methods to assign a score to each feature.

The key features identified for each session are shown in Fig. 2 and Supplementary Table S1. This extraction process ensured that the most important features would be selected by employing multiple feature-ranking techniques. Table 3 presents the top 10 salient features found in each session.

Figure 2

Results of each ranking technique for top 15 features.

Table 3 Key features identified in each session.

Results

Model Evaluation

To analyze the variability of features, box plots are employed to display the distributional characteristics of the data. In Session-1, RMS, SEF-G-ML and Area-CC have lower values for MCI than CN (Fig. 3). MDIST exhibits lower values for MCI than CN in all conditions. The MCI group had considerably higher values than the CN group for MVELO, M-freq and TOTEX (Fig. 4). Similarly, Spect-Entropy-G-AP, Spect-Entropy-ML, SEF-ML and TOTEX-AP have higher values for MCI than CN, hence contributing to discriminating the classification groups (Figs. 5, 6).

Figure 3

Session-1 (Eyes-Open)—Illustration of balance data across MCI and Controls. Orange indicates CN and blue represents MCI subject groups.

Figure 4

Session-2 (Eyes-Closed)—Illustration of balance data across MCI and Controls. Orange indicates CN and blue represents MCI subject groups.

Figure 5

Session-3 (Right-Leg-Lift)—Illustration of balance data across MCI and Controls. Orange indicates CN and blue represents MCI subject groups.

Figure 6

Session-4 (Left-Leg-Lift)—Illustration of balance data across MCI and Controls. Orange indicates CN and blue represents MCI subject groups.

Explainability using SHAP

The feature set was further analyzed with SHAP (Shapley Additive exPlanations), proposed by Lundberg and Lee35. SHAP is based on game theory36 and provides a way to calculate each feature’s contribution to the model. The greater the contribution a feature makes to the model’s prediction, the more significant it is. The SHAP summary charts in Fig. 7 give a concise description of how the features’ impacts on the Random Forest model are distributed and how the features’ SHAP values relate to their impact. Comparing the important features based on SHAP impact value (Fig. 7) with our top features (Table 3) shows that all key features identified by our research were included in the SHAP list, which supports the effectiveness of our method in identifying important biomarkers for early detection of MCI. The greatest similarity is obtained in Session-1 (EO), followed by Session-2. This is consistent with the highest accuracy being obtained in Session-1 (Table 4). In Session-3 and Session-4, however, the similarity between identified features extends only to the top 10 or 15 features. We therefore observe that a larger set of features is required to differentiate between controls and MCI in those conditions, as frequency/spatial features are also involved in that scenario.

Figure 7

SHAP summary plots showing the relationship between the features’ SHAP values and their impact.
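To make the game-theoretic idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy three-feature model by enumerating all feature coalitions. It is an illustration only: `shapley_values` and `toy_model` are hypothetical names, the feature labels and input values are invented, absent features are replaced by a fixed baseline, and the SHAP library uses more efficient estimators than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    Features absent from a coalition take their baseline value.
    Cost grows as 2**n_features, so this is for illustration only.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(features):
    """Hypothetical model with an interaction between the first two inputs."""
    rms, mdist, mvelo = features
    return 2.0 * rms + rms * mdist - 0.5 * mvelo


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

Here the contributions come out as 3, 1, and -2: the interaction term is split equally between the first two features, and the three values sum to the difference between the prediction at `x` and at the baseline.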

Additional investigation was done to find significant features across all sessions. To achieve that, scores of top features from all sessions were combined. The top 10 features that were determined to be the most significant throughout all sessions are displayed in Fig. 8. The final score for each feature is calculated from the contribution from each of the four sessions, as given in Supplementary Table S2.

Figure 8

Top key features identified across all sessions. The contributions of scores from each of the four sessions are shown from two different perspectives.

Classification results

The top 15 features in each session were then used as the basis for the classification stage, in which Support Vector Machine (SVM)37, Random Forest38, and Ensemble Models (comprising Majority Voting and Gradient Boosting39) were employed. A Leave-One-Subject-Out (LOSO) cross-validation strategy was applied to assess the effectiveness of the chosen features for MCI and CN classification. This involved holding out subjects one at a time, training on the remaining subjects, and collecting the predictions across all 60 folds. Accuracy, precision, specificity, and sensitivity were calculated from the predicted values, providing a thorough assessment of the classification models’ performance. This phase showed a direct relationship between feature selection and prediction accuracy, providing insight into the features’ potential utility as balance biomarkers for early identification of MCI. The best results obtained by using the top features for classification are summarized in Table 4.
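The LOSO evaluation and the four reported metrics can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions: the helper name `loso_metrics` is hypothetical, the data are synthetic (60 subjects, 15 features, one row per subject), and an RBF-kernel SVM with default-like settings stands in for the models whose hyperparameters the text does not give.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def loso_metrics(X, y, subject_ids):
    """Leave-one-subject-out predictions summarized as four metrics.

    Label 1 is treated as MCI (positive class) and 0 as CN.
    """
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    y_pred = cross_val_predict(
        model, X, y, groups=subject_ids, cv=LeaveOneGroupOut()
    )
    tn, fp, fn, tp = confusion_matrix(y, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": float((tp + tn) / (tp + tn + fp + fn)),
        "precision": float(tp / (tp + fp)) if (tp + fp) else 0.0,
        "sensitivity": float(tp / (tp + fn)) if (tp + fn) else 0.0,
        "specificity": float(tn / (tn + fp)) if (tn + fp) else 0.0,
    }


# Synthetic stand-in for one session: 30 CN and 30 MCI subjects,
# 15 features, of which the first three differ between groups.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
subject_ids = np.arange(60)
X = rng.normal(size=(60, 15))
X[:, :3] += 2.0 * y[:, None]

metrics = loso_metrics(X, y, subject_ids)
```

Wrapping the scaler and classifier in one pipeline means the scaling statistics are re-estimated on each training fold, so the held-out subject does not influence preprocessing.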

MMSE is frequently administered in clinical contexts to evaluate cognitive impairment. On our dataset, MMSE achieved an area under the curve (AUC) of 0.669. Our findings show that the top significant features outperformed MMSE (Table 4).
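For a single clinical score such as MMSE, the AUC can be read as the probability that a randomly chosen MCI subject scores lower than a randomly chosen CN subject, with ties counted as one half. The sketch below computes this directly. The function name `auc_lower_is_positive` and the MMSE values are invented for illustration and are not the study's data.

```python
def auc_lower_is_positive(pos_scores, neg_scores):
    """AUC when lower scores indicate the positive class (here, MCI).

    Counts pairs in which the positive subject scores below the
    negative subject; tied pairs count as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical MMSE scores, four subjects per group.
mci_mmse = [24, 26, 27, 25]
cn_mmse = [28, 27, 29, 26]
auc = auc_lower_is_positive(mci_mmse, cn_mmse)  # 14 of 16 pairs -> 0.875
```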

Table 4 Summarized classification results of various machine learning models.

Discussion and conclusion

The study used wearable inertial sensors to analyze various static balance metrics in eyes-open, eyes-closed, right-leg lift, and left-leg lift conditions. Our research pioneers the use of wearable sensor data for distinguishing MCI patients from individuals without cognitive impairment through static balance biomarkers. Existing research has mainly focused on finding potential biomarkers and analyzing them statistically3,7,10 or on finding correlations with MMSE16. In contrast, we employed a multi-step methodology based on machine learning techniques for selecting features, combining the strengths of filter and wrapper methods34, followed by classification algorithms. Boxplots and SHAP were also used to analyze the key features.

Under several static balance conditions, we identified and analyzed features that differ significantly in MCI patients. In the time domain, the principal features of relevance are Root Mean Square (RMS), mean distance (MDIST), 95% Confidence Circle Sway Area (AREA-CC), Path Length (TOTEX), and Mean Sway Velocity (MVELO) (Table 3 and Fig. 8). In the frequency or spectral domain, the substantial balance biomarkers are Spectral Edge frequency (SEF), Mean frequency (M-Freq), Peak frequency (P-Freq), Centroid frequency (C-Freq), and Entropy (Spect-Entropy) (Table 3 and Fig. 8).

Previous findings have also identified the RMS of the acceleration signal9,31, Mean Frequency17 and 95% AREA-CE14 as important biomarkers for static balance impairment. Our results are consistent with prior research7, which found that balance metrics such as anterior-posterior (AP) sway and medio-lateral (ML) sway position were relevant discriminators in eyes-open conditions but not in eyes-closed conditions, because defective central processing of visual information is linked to the balance abnormalities associated with MCI. One possible explanation is that MCI slows down information processing and that these groups rely on visual cues to maintain postural stability.

The key features identified in the feature selection process were provided as input to different classification models: Support Vector Machine (SVM), Random Forest, and Ensemble Model (Majority Voting). The results (Table 4) show that the best classification result was obtained using SVM, which yielded 75.8% accuracy in the eyes-open condition. However, the Random Forest classifier can be regarded as the best-performing model across all sessions, obtaining more than 65% accuracy in all four conditions. Static eyes-open balance features stood out as particularly distinctive and appear promising for early identification (Table 4 and Fig. 7).

Although our study adds to the expanding corpus of research on dementia, it is critical to recognize its limitations. To achieve a thorough understanding of MCI, more research is necessary to evaluate the identified biomarkers in longitudinal cross-cultural settings as well as investigate the integration of other modalities. Furthermore, longitudinal studies are essential for monitoring the evolution of biomarkers related to balance over time.

Our study shows that wearable inertial sensors can be a viable tool for early dementia identification, and it also emphasizes the significance of postural balance measurement in MCI detection. Using these methods could make it possible to identify dementia and Alzheimer’s disease early on even in home settings. Furthermore, the study provides strong data outlining the critical parameters for assessing balance and lays the groundwork for future research aimed at improving, validating, and optimizing a standardized clinical motor assessment technique customized for people with MCI.