Selecting Clinically Relevant Gait Characteristics for Classification of Early Parkinson’s Disease: A Comprehensive Machine Learning Approach

Rehman, Rana Zia Ur; Del Din, Silvia; Guan, Yu; Yarnall, Alison J.; Shi, Jian Qing; Rochester, Lynn

doi:10.1038/s41598-019-53656-7

Published: 21 November 2019

Rana Zia Ur Rehman¹,
Silvia Del Din ORCID: orcid.org/0000-0003-1154-4751¹,
Yu Guan²,
Alison J. Yarnall^1,4,
Jian Qing Shi³ &
…
Lynn Rochester^1,4

Scientific Reports volume 9, Article number: 17269 (2019)

Abstract

Parkinson’s disease (PD) is the second most common neurodegenerative disease; gait impairments are typical and are associated with increased fall risk and poor quality of life. Gait is potentially a useful biomarker to help discriminate PD at an early stage; however, the optimal characteristics and their combination are unclear. In this study, we used machine learning (ML) techniques to determine the optimal combination of gait characteristics to discriminate people with PD from healthy controls (HC). A total of 303 participants (119 PD, 184 HC) walked continuously around a circuit for 2 minutes at a self-selected pace. Gait was quantified using an instrumented mat (GAITRite) from which 16 gait characteristics were derived and assessed. Gait characteristics were selected using different ML approaches to determine the optimal method (random forest with information gain, and the recursive feature elimination (RFE) technique with support vector machine (SVM) and logistic regression). Five clinical gait characteristics were identified with RFE-SVM (mean step velocity, mean step length, step length variability, mean step width, and step width variability) that accurately classified PD. Model accuracy for classification of early PD ranged between 73–97% with 63–100% sensitivity and 79–94% specificity. In conclusion, we identified a subset of gait characteristics for accurate early classification of PD. These findings pave the way for a better understanding of the utility of ML techniques to support informed clinical decision-making.

Introduction

Parkinson’s disease (PD) affects approximately 10 million people worldwide, with a doubling of the global burden over the past 25 years due to increasing longevity and longer disease duration¹. PD has both motor and non-motor symptoms, and diagnosis is based on clinical features^2,3. The accuracy of clinical diagnosis in differentiating PD, largely from other neurological disorders, is only 74% when performed by non-experts and 80% when performed by movement disorder specialists; this is particularly problematic in the early stages of disease⁴. The Movement Disorder Society has recently proposed new clinical diagnostic criteria for PD that incorporate non-motor manifestations⁵. However, other diagnostic aids are needed to improve accuracy. Gait performance is a marker of global health in general, predicting mortality, morbidity, falls and neurodegenerative disorders⁶. Gait impairments are a common feature of PD, appearing early and evolving over time^7,8,9,10. They could therefore inform early diagnosis¹¹. Moreover, evidence suggests they are present in the prodromal phase and could identify risk of disease at that stage^12,13, along with the possibility of different phenotypes of PD. Collectively this could lead to more personalized care and clinical trials.

Gait is typically described by its spatiotemporal characteristics such as step length, step velocity, step width, step time, swing time, stance time (mean gait characteristics) and their respective variability and asymmetry (dynamic gait characteristics)^6,14,15. A comprehensive conceptual gait model organized these spatiotemporal gait characteristics into five domains (pace, rhythm, variability, asymmetry and postural control) based on factor analysis and highlighted the importance of these domains through their association with clinical attributes, including cognitive impairment in PD¹⁵. For example, factors in the pace domain may help to differentiate mild cognitive impairment from normal cognition¹⁶, whereas postural control may act as an early biomarker for asymmetrical neurodegenerative diseases such as PD¹⁷, while variability in gait predicts falls in older adults and PD¹⁸. Currently, gait impairment is commonly described using a univariate approach, precluding an understanding of the contribution of multiple gait characteristics. Identifying the optimal combination of gait characteristics to better define PD is therefore a priority in order to develop its use as a possible tool to aid diagnosis and management of PD¹⁹.

Machine learning (ML) provides a method to identify the best combination of clinically relevant spatiotemporal gait characteristics to address questions around disease classification^20,21. Earlier work applied sequential forward selection, minimum redundancy maximum relevancy, and mutual information based methods to vertical ground reaction forces to find suitable statistical features for PD classification¹¹. A range of other methods have also been tested for selection of suitable features in neurodegenerative diseases^22,23,24. However, a feature selection method that reaches optimal results within a small search space is missing. As a starting point, a good feature selection technique should select the features that have a high correlation with the response variable (PD or healthy control classes) and minimum redundancy among the gait characteristics²⁵. Therefore, there is a need to identify suitable ML models and the optimal combination of gait characteristics for classification of PD.

Machine learning models widely reported in the literature for PD classification are support vector machine, random forest, k-nearest neighbours, classification and regression trees, neural networks, and logistic regression^{11,20,21,22,24,26,27,28,29,30,31,32,33}. However, there is no consensus and studies are difficult to compare. Therefore, a comprehensive ML approach whereby previous ML models are implemented on a larger dataset with a comprehensive combination of gait characteristics is needed in order to identify the most relevant gait features for classification of PD. The choice of gait characteristics is important for the models so that their findings are easy to interpret. Based on the literature, gait characteristics vary widely, often with no consistency across studies or rationale for feature inclusion for classification of PD^{11,20,21,22,24,26,27,28,29,30,31,32,33}. Features based upon common spatiotemporal gait characteristics that can be easily understood in relation to the underlying disease are helpful, and pre-existing gait models inform comprehensive feature selection¹⁵. For example, asymmetry may be helpful in early PD as degeneration of dopaminergic cells occurs with an asymmetrical distribution. Other limitations of previous work include participants with more severe disease, a relatively small sample size and lack of ground truth data to quantify the best gait features. Together this reduces the generalizability, validity and applicability of results. Therefore, large studies for PD classification in people with less severe disease using a selection of gait characteristics that are easily interpretable and easily quantified are needed.

This is the largest study in which a comprehensive set of clinically relevant spatiotemporal gait characteristics extracted from an early cohort is used for classification of PD. The aims of the study are to identify: 1) suitable ML models to apply to gait features to discriminate PD and healthy controls (HC); and 2) the optimal combination of clinically relevant gait characteristics for early classification of PD. In order to achieve these aims, we first need to understand the input features (gait characteristics) used as training data in ML models and then propose an ML framework for finding the optimal traditional ML models for PD classification while addressing generalizability issues.

Methods

Participants

A total of 303 subjects were recruited from the “Incidence of Cognitive Impairment in Cohorts with Longitudinal Evaluation-GAIT” (ICICLE-GAIT) study¹⁵. All the recruited subjects from ICICLE-GAIT were used for analysis without applying any additional inclusion or exclusion criteria. Among the cohort, 119 were people with early PD diagnosed according to the UK Parkinson’s Disease Brain Bank criteria³⁴ by a movement disorder specialist³⁵ and 184 were healthy control subjects (HC). Ethical approval was obtained from the “Newcastle and North Tyneside research ethics committee” (REC No. 09/H0906/82). All subjects gave written informed consent before participating in this study. All methods and experiments were performed according to the Declaration of Helsinki.

Demographic and clinical measures

Participants’ demographic characteristics such as age, height, weight, and BMI were recorded. Severity of the PD motor symptoms was assessed using Hoehn and Yahr scale³⁶ and part III of the modified version of Movement Disorder Society Unified Parkinson’s Disease Rating Scale (MDS-UPDRS)³⁷; tremor dominant and postural instability and gait difficulty (PIGD) phenotypes were calculated from MDS-UPDRS³⁸. Freezing of gait (FOG) was assessed with the new freezing of gait questionnaire³⁹ and levodopa equivalent daily dose (LEDD) was also measured. Cognition was assessed with the Mini-Mental State Examination (MMSE)⁴⁰; and balance confidence was evaluated with the balance self-confidence scale⁴¹.

Testing protocol and experimental setup

Participants were instructed to walk at their preferred pace continuously for 2 minutes on a 25 m oval circuit³⁴. Gait was repeatedly sampled as participants walked over an instrumented walkway (Platinum model GAITRite; 7.0 meters long and 0.6 meters wide) placed in the middle of the circuit (Fig. 1). GAITRite has a spatial accuracy of 1.27 cm and temporal accuracy of 1 sample (240 Hz, ~4.17 ms). PD patients were assessed whilst in a clinically defined “ON” state.

Data processing and outcome

From GAITRite, each individual step’s data were extracted with Microsoft Access. Mean gait characteristics were calculated by taking the average of all trials, and dynamic gait characteristics were calculated according to the methods described previously³⁴. In total, 16 gait characteristics were derived based on previous work and grouped to broad independent domains for easy interpretation (pace, rhythm, variability, asymmetry, and postural control)¹⁵.

Statistical analysis

Independent t-tests were used to examine the difference between groups (PD vs. HC) for demographic data and gait characteristics. The area under the curve was used to check their discriminative power for classification. Pearson’s correlation between gait characteristics was also evaluated to see the independence and redundancy.
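
The three checks described above can be sketched in Python as follows. This is a minimal illustration on synthetic data, not the analysis code of the study: the group means, spreads, and variable names are hypothetical, and folding the AUC so that it is at least 0.5 is an assumption made here about how the direction of an effect is handled.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in values (NOT study data) for one gait characteristic.
n_hc, n_pd = 184, 119
velocity_hc = rng.normal(loc=1.25, scale=0.18, size=n_hc)  # hypothetical, m/s
velocity_pd = rng.normal(loc=1.10, scale=0.20, size=n_pd)  # hypothetical, m/s

# Independent-samples t-test between groups (PD vs. HC).
t_stat, p_value = ttest_ind(velocity_pd, velocity_hc)

# AUC of the single characteristic as a discriminator (label 1 = PD).
labels = np.r_[np.zeros(n_hc), np.ones(n_pd)]
values = np.r_[velocity_hc, velocity_pd]
auc = roc_auc_score(labels, values)
auc = max(auc, 1.0 - auc)  # assumption: report AUC regardless of direction

# Pearson correlation between two characteristics to flag redundancy.
step_length = 0.55 * values + rng.normal(scale=0.03, size=values.size)  # hypothetical
r = np.corrcoef(values, step_length)[0, 1]

print(f"t = {t_stat:.2f}, p = {p_value:.3g}, AUC = {auc:.2f}, r = {r:.2f}")
```

A high absolute correlation between two characteristics, as produced for the two synthetic variables here, is the kind of redundancy that the later feature selection step is meant to remove.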

Framework for classification modelling

For supervised ML modelling, a comprehensive approach was adopted. Different ML models such as logistic regression (LR)³², linear discriminant analysis (LDA)⁴², k-nearest neighbour (KNN)⁴³, classification and regression tree (CART)¹¹, Naive Bayes (NB)²⁰, support vector machine (SVM)^21,32,33, random forest (RF), bagged decision tree (BDT), extra tree classifier (ETC), AdaBoost classifier (AC), gradient boosting classifier (GBC)⁴⁴, and voting methods²² containing LDA, NB, and SVM were employed. An ML framework was proposed for the selection and evaluation of these models with a test harness (Fig. 2).

The proposed ML framework used 16 gait characteristics as predictor variables (standardized data: zero mean, unit variance) and disease status (PD or HC) as the response variable. Firstly, based on the literature^{11,20,21,22,32,33,42,43,44}, 12 linear and non-linear models widely reported in the classification of PD were selected. Secondly, based on spot checking of the 12 models on the whole dataset with the help of 10-fold cross-validation (CV), an initial selection of five models was made for further analysis. Then a test harness was developed for further testing of the models. As a first step, the dataset was split into training (90%) and testing (10%) sets. This stage was important to compare the training and testing performance of the initially selected models and to check the distribution of the data. Model selection was then performed again on the training and testing results. The hyperparameters of the selected models were tuned further on the training data to obtain the best classification accuracy on the validation data set. A grid search method was used with 10-fold cross-validation to find appropriate hyperparameters. Selection of the gait characteristics and of their optimal number was also performed based on their contribution in the ML models. Python was used with standard libraries for ML modelling⁴⁵.
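
The stages of this framework (standardization, spot checking with 10-fold CV, a 90/10 split, and grid search on the training data) can be sketched with scikit-learn as below. This is a hedged illustration rather than the study's code: the data are synthetic, only three of the twelve models are shown, and the stratified split, the random seeds, the hyperparameter grid, and the placement of the scaler inside a pipeline are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     cross_val_score, train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a 303 x 16 gait matrix (NOT study data).
X, y = make_classification(n_samples=303, n_features=16, n_informative=5,
                           weights=[0.61, 0.39], random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Stage 1: spot check candidate models with default hyperparameters.
candidates = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM-RBF": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
spot_check = {}
for name, model in candidates.items():
    pipe = make_pipeline(StandardScaler(), model)  # zero mean, unit variance
    spot_check[name] = cross_val_score(pipe, X, y, cv=cv,
                                       scoring="accuracy").mean()

# Stage 2: hold out 10% of the data for independent testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0)

# Stage 3: tune hyperparameters on the training data only (grid is illustrative).
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(make_pipeline(StandardScaler(), SVC(kernel="rbf")),
                      param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

# Stage 4: evaluate the tuned model once on the held-out data.
test_accuracy = search.score(X_test, y_test)
print(spot_check, search.best_params_, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline means it is refitted on each training fold, which avoids leaking information from validation folds into the standardization step.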

Gait characteristics selection techniques

The recursive feature elimination (RFE) technique with RF, linear kernel SVM, and LR was used to select the optimal number of features based on their contribution to classification accuracy, evaluated through 10-fold cross-validation⁴⁶. To further validate these results, model performance was compared using the test data set. The importance of the gait characteristics was also quantified using RFE with linear kernel SVM and LR. For RF, information gain was used to determine the relative importance of the gait characteristics; this is a widely used method for feature selection in bioinformatics^47,48. The general RFE algorithm for gait characteristics selection is given below⁴⁹. For implementation, standard commands from the SciKit-learn library in Python were used.

Model inputs

    Training data (n = 303 subjects): X₀ = [x₁, x₂, x₃, …, x_k, …, x_n]^T
    Class labels (PD or HC): y = [y₁, y₂, y₃, …, y_k, …, y_n]^T

Initiation of selection process

    Selected features, N: s = [1, 2, …, N]
    Feature ranking: r = []

Recursive repetition until s = []

    Restrict the training data to good feature indices: X = X₀(:, s)
    Train the model: α = model-train(X, y)
    Compute the weight for each gait feature in s: w = \(\sum_{k} \alpha_{k} y_{k} X_{k}\)
    Compute the ranking criterion: c_i = (w_i)² for all i
    Find the feature with the smallest ranking criterion: f = argmin(c)
    Update the feature ranking list: r = [s(f), r]
    Eliminate the feature with the smallest ranking criterion: s = s(1:f−1, f+1:length(s))

Output

    Ranked gait features list r.
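
The elimination loop listed above can be written out directly in Python. The sketch below is an illustration under stated assumptions, not the implementation used in the study: it uses a linear-kernel `SVC` from scikit-learn as the trained model, synthetic data, and a hypothetical helper name `svm_rfe_ranking`. The weight vector is rebuilt from the dual coefficients so that it mirrors the weighted sum over training samples in the listing.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def svm_rfe_ranking(X0, y, C=1.0):
    """Rank feature indices from most to least important (hypothetical helper)."""
    s = list(range(X0.shape[1]))  # surviving feature indices
    r = []                        # ranked list, best feature first
    while s:
        X = X0[:, s]              # restrict the data to surviving features
        svm = SVC(kernel="linear", C=C).fit(X, y)
        # w = sum_k alpha_k * y_k * x_k over the support vectors;
        # dual_coef_ stores alpha_k * y_k (the sign convention does not
        # matter here because the weights are squared).
        w = (svm.dual_coef_ @ svm.support_vectors_).ravel()
        c = w ** 2                # ranking criterion
        f = int(np.argmin(c))     # position of the weakest feature in s
        r.insert(0, s[f])         # r = [s(f), r]
        del s[f]                  # eliminate that feature
    return r


# Synthetic stand-in for the standardized 303 x 16 gait matrix (NOT study data).
X0, y = make_classification(n_samples=303, n_features=16, n_informative=5,
                            n_redundant=4, random_state=0)
X0 = StandardScaler().fit_transform(X0)

ranking = svm_rfe_ranking(X0, y)
top_five = ranking[:5]
print(top_five)
```

Because the last feature removed is placed at the front of `r`, the first five entries of the returned list are the five features that survive longest.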

Results

Table 1 shows the demographic, cognitive and clinical characteristics of participants. In keeping with early disease, mean MDS-UPDRS III score was 25.4, and mean LEDD 175.9 mg/day. Only 11 participants had evidence of freezing of gait (FOG) with mean FOG score 0.681. In comparison with HC, PD participants (median of 4.7 months from diagnosis) were relatively younger, taller, had proportionally more males, lower balance confidence (ABC), and poorer cognition (MMSE).

Table 1 Demographic and clinical characteristics; M: Male; F: Female; BMI: Body mass index; MMSE: Mini-mental state examination; ABC: Activities specific balance confidence scale; UPDRS: Unified Parkinson’s disease rating scale; PIGD: Postural instability and gait disorder phenotype; ID: Indeterminate phenotype; TD: Tremor dominant phenotype; t(df): t-value at degree of freedom; p showing the statistical difference between PD and HC. In bold significant p values (p < 0.05).

Input features as training data

From Table 2, all gait domains differed significantly between groups and 13 out of 16 gait characteristics were significantly impaired in PD. When looking at the association between the gait characteristics in Fig. 3 we found a number of highly correlated characteristics. As these gait characteristics are not independent due to their high correlation, it was important to find the optimal combination of gait characteristics for classification of PD to avoid redundancy.

Table 2 Significant difference between PD and HC; AUC: Area under the curve; p showing the statistical difference between PD and HC. In bold significant p values (p < 0.05).

Selected machine learning models

All 16 gait characteristics (Table 2) were used for classification modelling. We adopted a comprehensive approach to select the optimal ML model. Table 3 shows spot checking results based on the whole dataset with 10-fold cross-validation under the default hyper-parameters of the models. Baseline accuracy based on a zero rule algorithm was 60.72%. All linear models performed almost the same, with around 80% classification accuracy between PD and HC. Among the other models, SVM with a radial basis function (RBF) kernel reached about 84% accuracy, and the ensemble models RF and GBC reached 86% and 85% accuracy, respectively. Following spot checking of the 12 models, the selection was refined to five models (LR, LDA, SVM-RBF, RF, and GBC) based on model domain; these were subsequently tested with the test harness described in Fig. 2, using training and testing data separately. Results are presented in Table 4.
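
As a quick check of the baseline, a zero rule classifier always predicts the most frequent class, so its accuracy equals the majority-class proportion. The sketch below assumes the baseline was computed on the whole data set of 184 HC and 119 PD participants; the variable names are illustrative.

```python
from collections import Counter

# Class labels for the whole cohort: 184 healthy controls, 119 people with PD.
labels = ["HC"] * 184 + ["PD"] * 119

# Zero rule: always predict the most frequent class.
majority_class, majority_count = Counter(labels).most_common(1)[0]
baseline_accuracy = majority_count / len(labels)

print(majority_class, round(baseline_accuracy, 3))  # HC 0.607
```

Under that assumption the baseline is 184/303, about 60.7%, in line with the value reported above; any useful model has to exceed this figure.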

Table 3 Spot checking results of the models; RBF: Radial basis function.

Table 4 Checking model performance on training and testing data; SE: Sensitivity, SP: Specificity; RBF: Radial basis function.

Training accuracy (based on 10-fold cross-validation) and testing accuracy were almost the same for RF. For GBC and SVM there was a slight decrease in testing accuracy, of at most 2%. On the other hand, both linear models (LDA and LR) had lower training accuracy but similar testing accuracy compared to other models. For further analysis, only three of these five models were selected, each from a different domain: RF from the ensemble or tree based approaches, SVM from kernel techniques, and LR from linear models. These three models were fine-tuned further to obtain optimal results. During fine-tuning on training data, the hyper-parameters (such as number of trees and regularization coefficients) were determined by cross-validation.

Selected gait characteristics for optimal results

Based on the RFE technique in Fig. 4(a–c), the optimal performance in RF and SVM was achieved with five gait characteristics. For LR, however, the optimal performance was achieved with seven characteristics, and after five characteristics the performance decreased drastically. From this analysis, it was clear that optimal performance can be achieved with five gait characteristics. Performance evaluation of the models with RFE was performed with 10-fold cross-validation (RFECV). The F1 score was used for this purpose to balance precision and recall, as shown in Fig. 4(a–c). RF had a classification score of 96.4% with five gait characteristics. Similarly, for SVM, the best F1 score was 84.5% with five gait features. For LR, however, the score was 87.5% with seven gait characteristics.
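
Choosing the number of features by cross-validated RFE with the F1 score can be sketched with the `RFECV` class from scikit-learn. This is an illustration on synthetic data, not the study's code: the estimator settings, the shuffled stratified folds, and the random seeds are assumptions, and standardizing the whole matrix before cross-validation is a simplification made here for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a 303 x 16 gait matrix (NOT study data).
X, y = make_classification(n_samples=303, n_features=16, n_informative=5,
                           n_redundant=4, random_state=0)
X = StandardScaler().fit_transform(X)  # simplification: scaled before CV

# Remove one feature per step and score each subset size by 10-fold F1.
selector = RFECV(
    estimator=SVC(kernel="linear"),
    step=1,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
    scoring="f1",
)
selector.fit(X, y)

n_selected = selector.n_features_  # subset size with the best mean F1
selected_columns = [i for i, keep in enumerate(selector.support_) if keep]
print(n_selected, selected_columns)
```

The estimator passed to `RFECV` must expose `coef_` or `feature_importances_`, which is why a linear kernel is used for the SVM in this step.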

Figure 5(a–c) show the contribution of each gait characteristic in the classification model. The six common gait characteristics among the top 10 were mean step velocity, step length, step time, stance time; step width variability; and step length asymmetry. The ML models trained with different gait characteristics (top five selected with each ML model, top 10 selected with each ML model, common among 10 in all models, and top five selected with linear-SVM) were evaluated on testing data to identify the optimal combination.

The testing results of RF, SVM-RBF, and LR are presented in Table 5, with the F1 score representing the training results based on the RFE technique. Overall, RF performed better than SVM-RBF and LR. From a total of 16 gait features, the top ten gait characteristics selected by each model gave good testing classification accuracy. RF showed 94.28% accuracy with 100% sensitivity and 89% specificity, followed by LR, which had 82.85% accuracy with 71% sensitivity and 89% specificity, and SVM-RBF, which showed 81.92% accuracy with 71% sensitivity and 89% specificity. With the common features selected by all the models, RF performance decreased slightly, SVM-RBF classification accuracy increased, and for LR it remained almost the same. With the top five gait characteristics, we observed the same classification accuracy for RF as with the top ten features. For SVM-RBF, however, the accuracy increased to 85.71% by reducing the feature set. Similarly, for LR the accuracy increased to 84.28% by reducing the feature set. Further, we also observed that when the five gait characteristics selected with linear-SVM-RFE were used as input, all the models gave optimal performance. The final optimal performance from RF was 97.14% classification accuracy with 100% sensitivity and 94% specificity. For SVM-RBF it was 85.71% accuracy with 79% sensitivity and 94% specificity. Similarly, for LR it was 84.99% accuracy with 76% sensitivity and 94% specificity. In addition, the training accuracy with 10-fold cross-validation was evaluated in terms of the F1 score to have a single measure of model performance. RF had the highest F1 score of 96.4%, followed by LR with 87.5% and SVM with 84.5%.
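
For reference, the metrics reported here follow from the confusion matrix, with PD treated as the positive class. The sketch below uses a small made-up set of labels purely to show the arithmetic; the helper name `binary_metrics` is hypothetical and the labels are not study data.

```python
def binary_metrics(y_true, y_pred, positive="PD"):
    """Accuracy, sensitivity, specificity and F1 for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall for PD
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall for HC
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up example: 4 PD all detected, 1 of 6 HC misclassified as PD.
y_true = ["PD"] * 4 + ["HC"] * 6
y_pred = ["PD"] * 4 + ["PD"] + ["HC"] * 5
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case sensitivity is 100% while specificity is 5/6, the same pattern as the RF result above, where every person with PD was detected and a small number of HC were labelled as PD.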

Table 5 Optimal classification accuracy on testing and training data; GC: Gait characteristics; SE: Sensitivity, SP: Specificity; RFE: Recursive features elimination technique; RF: Random forest; SVM-RBF: Support vector machine with radial basis function kernel; LR: Logistic regression.

Discussion

To the best of our knowledge, this is the largest classification study in PD using a comprehensive approach to determine the optimal ML model and spatiotemporal gait features. Gait features were selected according to a validated model of gait in PD in participants with relatively early disease⁶. We were able to identify both the optimal ML model and combination of gait characteristics for classification of PD. We found that by using only five gait characteristics from three independent gait domains (pace, postural control, and variability) selected by RFE-SVM we were able to achieve optimal PD classification. The highest testing classification accuracy of 97% with 100% sensitivity and 94% specificity was achieved with RF.

Sixteen gait characteristics from five domains (pace, rhythm, variability, asymmetry and postural control) were used as input features for classification (see Supplementary Fig. S1 for data distribution). Pace, rhythm, variability, asymmetry and postural control characteristics differed significantly between groups. PD walked at a slower pace (slower and with shorter steps) and rhythm, with impaired postural control (higher step length asymmetry and lower step width variability) and with a more variable and asymmetric gait pattern compared to HC. This is in line with previous research on gait impairment in PD^{6,11,15,20,50}.

Only a few studies reported the feature selection processes used to identify the importance of gait characteristics in ML modelling, with walking speed (step velocity), step/stride length, stride time, and step time asymmetry identified as important features for classification of PD^3,21,22. There are some notable exclusions, as none of these studies included gait related postural control features in their models (e.g. step width and step width variability), which have been shown to be important and sensitive indicators of gait impairment in PD^6,15. In this study, we first presented a feature selection phase and identified step velocity, step length, step width variability, step width and step length variability as important characteristics to classify PD. Selection of step velocity and step length is in line with previous studies^3,21,22. We were surprised to see that both were included because of the high correlation between these characteristics. Adopting a data driven approach, however, indicates that the spatial component of step velocity (e.g. step length) retains additional, independent information, explaining its selection. Conversely, it seems that step time does not contribute additional information to step velocity and hence was not selected as an important feature in the top five. Step width and step width variability (standard deviation of step widths⁵¹) had no or low correlation with other gait characteristics and were highly relevant in the selection process, so we suggest that these variables be included in future classification studies. Only one study used step width in machine learning and did not report its importance; the accuracy achieved in that study was 93%²⁰. From a clinical perspective, it also makes sense that gait related postural control features are important for classification, based on evidence that postural control is a specific biomarker for neurodegenerative diseases and in particular for PD^6,15.

Selection of the ML models for PD classification was based on an extensive ML framework. First, a comprehensive approach was utilized to include models used in previous studies^{3,20,22,32,33,42}, and models such as LR, LDA, KNN, CART, NB, SVM, RF, BDT, ETC, AC, GBC, and the voting method were therefore implemented. Previously, LR was used with eight feet force sensor data for classification between PD and HC³². LDA was trained on the statistical features extracted from two Shimmer sensors and obtained a classification accuracy of 82%⁴². Similarly, KNN^11,20,22,43, SVM^11,20,22,42 with linear³² and non-linear kernels^3,21,30, CART²², NB^20,22, RF^11,20,32,44, and majority voting²² were used to get reasonable classification accuracy. Based on spot-checking in our study, we found ensemble models such as RF, GBC, BDT, ETC and AC performed better, with an overall classification accuracy of 86%. The non-linear SVM-RBF model gave a classification accuracy of 84%. Among linear models, LR and LDA gave a similar classification accuracy of 80%. Based on these results in the initial model selection phase, five classification models (RF, GBC, SVM, LDA, and LR) were therefore selected.

The deployment of ML models in real-world practice is still uncertain due to issues with generalizability. In order to test the robustness of ML models, independent/external datasets which have not been used in the training of the model should be used to validate model performance. To our knowledge, there are only two studies that used independent datasets for checking the performance of the proposed models^42,43. Their classification accuracy ranged from 81% to 85.71%. However, most studies used 10-fold cross-validation methods, due to their small sample sizes. As a consequence, generalizability of the models remains unclear. To date the largest study (PD:156, HC:424)⁴³ reported a classification accuracy of 85.71% but did not include the sensitivity and specificity of their models. Using a smaller dataset (PD:12, HC:20)²¹, accuracy was up to 100% with sensitivity ranging between 9.03–100% and specificity ranging between 86.7–100%; however, model overfitting can bias the results and provide an unstable performance when other evaluation metrics are included (e.g., F1 score). To overcome these limitations we proposed a test harness in the ML framework where our five selected models were trained and tested separately using independent data. Different split ratios for training and testing (70/30%, 80/20%, and 90/10%) were evaluated; because the results were similar, and in order to have more data for training, 90% of the data were used for training and 10% for independent testing in the final analysis. Overall training accuracy of 76–88% based on 10-fold cross-validation was achieved and a similar accuracy of 81–87% was achieved on the test data. Further hyper-parameter tuning and feature reduction in RF, SVM, and LR gave the highest test classification accuracy in a range of 81–97.14% with 71–100% sensitivity and specificity of 89–94%. In this context we report 100% sensitivity to indicate that models were able to classify all people with PD correctly. With 94% specificity, some of the HC were classified as PD. In the real world, high sensitivity compared to specificity may be optimal to avoid misdiagnosis in initial screening. We also reported another commonly used metric (F1 score), which was between 85–96%. All these results are higher than or comparable to those of previous studies^{3,11,20,21,22,30,32,33,42,44}.
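
A comparison of split ratios like the one described in this paragraph could be set up as follows. This is a rough sketch on synthetic data rather than the study's test harness: the random forest settings, the stratified splits, and the random seeds are assumptions, and only one of the five models is shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 303 x 16 gait matrix (NOT study data).
X, y = make_classification(n_samples=303, n_features=16, n_informative=5,
                           random_state=0)

results = {}
for test_size in (0.30, 0.20, 0.10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=0)
    model = make_pipeline(
        StandardScaler(),
        RandomForestClassifier(n_estimators=100, random_state=0))
    # Training performance: 10-fold CV accuracy on the training portion only.
    cv_accuracy = cross_val_score(model, X_tr, y_tr, cv=10,
                                  scoring="accuracy").mean()
    # Testing performance: fit on all training data, score on held-out data.
    test_accuracy = model.fit(X_tr, y_tr).score(X_te, y_te)
    results[test_size] = (cv_accuracy, test_accuracy)

for test_size, (cv_acc, te_acc) in results.items():
    print(f"test_size={test_size:.2f}  cv={cv_acc:.3f}  test={te_acc:.3f}")
```

A large gap between the cross-validated and held-out accuracy at any ratio would point to overfitting or to a split that does not reflect the distribution of the full data set.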

The motivation for using ML for feature selection was to extract the discriminatory features while suppressing the redundant features. Even though the data between groups overlapped, based on some extracted features we can see that ML models effectively classify PD and HC groups. In order to select the optimal gait characteristics for the model we chose the recursive feature elimination (RFE) wrapper based method, which has advantages over filter based methods⁵². This is an iterative method where features are removed one by one rather than in combination. As the ranking of features is based on a single gait characteristic, this technique will have no effect on methods using correlations⁴⁹. The dimensionality of the gait characteristic space is reduced with RFE, and the least related gait characteristics are removed one by one without having an effect on the training error.

ML models can also give different importance weights to features depending upon the nature of the models. Based on our analysis, the top five gait characteristics were enough for optimal PD classification. These characteristics belong to the pace, variability and postural control domains, followed by the asymmetry and rhythm domains, of the gait model¹⁵. In this study, RF gave a relatively high importance to step width variability, step time asymmetry, swing time asymmetry, step velocity, and step length. RFE also gave similar results with the SVM and LR models, where the same four features were selected (step velocity, step length, step width variability, step width), with a difference in the fifth gait characteristic (step length variability with SVM and step length asymmetry with LR). The features selected with SVM gave the highest classification accuracy, and this model is in line with previous work¹¹.
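
The way different models weight the same features can be illustrated by comparing impurity based importances from a random forest with the absolute coefficients of a logistic regression. The sketch below runs on synthetic data with placeholder feature names (`gc_0` to `gc_15`); the entropy criterion is used as a stand-in for information gain, and all settings are assumptions rather than the study's configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a standardized 303 x 16 gait matrix (NOT study data).
X, y = make_classification(n_samples=303, n_features=16, n_informative=5,
                           n_redundant=4, random_state=0)
X = StandardScaler().fit_transform(X)
names = [f"gc_{i}" for i in range(X.shape[1])]  # placeholder feature names

# Random forest: mean decrease in entropy across trees (normalized to sum to 1).
rf = RandomForestClassifier(n_estimators=200, criterion="entropy",
                            random_state=0).fit(X, y)
rf_importance = rf.feature_importances_

# Logistic regression: absolute coefficient size on standardized inputs.
lr = LogisticRegression(max_iter=1000).fit(X, y)
lr_importance = np.abs(lr.coef_).ravel()


def top_k(scores, k=5):
    """Names of the k features with the largest scores."""
    order = np.argsort(scores)[::-1][:k]
    return [names[i] for i in order]


rf_top5 = top_k(rf_importance)
lr_top5 = top_k(lr_importance)
shared = sorted(set(rf_top5) & set(lr_top5))
print(rf_top5, lr_top5, shared)
```

Coefficient magnitudes are only comparable across features because the inputs are standardized; on raw units the ranking from the linear model would depend on the scale of each characteristic.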

It is possible that some of the results may be influenced by more severe PD (HY III), despite the fact that subjects had their gait assessment at a median of 4.7 months from clinical diagnosis and were on relatively low doses of dopaminergic medication. To check the original results, the analysis was re-run after removing the 21 subjects at HY III. In this analysis, classification performance ranged between 75.75–96.11% with 76–95% sensitivity and 78–95% specificity. RF gave the best classification performance on the features selected with RFE-SVM. The same first four gait characteristics (step velocity, step length, step width variability, step width) were selected and the same model (RF) gave the optimal performance. The performance of the models was comparable to that on the whole data set including HY III, with slightly lower sensitivity and high specificity. Due to the heterogeneous nature of PD, even in early disease, there will be a range of motoric and cognitive abilities. The inclusion of these participants ensures that our dataset is generalizable to those seen in clinics. Therefore we used the entire dataset for the final analysis.

ML methods appear to be more sensitive to overall variability in the data compared to simple statistical methods, which is important to understand for classification of early PD. In classification studies for healthcare application, the addition of a feature selection and reduction phase using ML plays an important role to tackle the problem of model overfitting, limiting the impact of the noise in the data during the classification phase. Further, feature reduction can help to improve model accuracy, as seen in our study where the accuracy of the models increased to 97% when redundant features were removed. This also reduces training time, augmenting the overall ML performance and implementation. From a clinical perspective, classification with ML techniques can help clinicians to use ML as a tool to support diagnosis of PD and provide an explanation for informed decision making. Our findings also help pave the way to enhance the utility of ML for clinicians.

There are some limitations in this study. Only one model of gait, with its specific gait characteristics, was included in this work. Although this model is comprehensive, other reported models and outcomes should be considered in the future to identify the best measure (or combination of measures) for classification of early PD. Due to the large cohort size, there was a gender imbalance and there were statistical differences in gender, age, and height between groups. This is reasonable for achieving model generalizability on a diverse dataset; however, classification results may improve with a more homogeneous dataset. In this early cohort, participants with HY III PD were included; they had very low FOG scores, relatively low LEDD intake, and low MDS-UPDRS III scores, consistent with a mildly affected group. Due to the heterogeneous nature of PD, inclusion of these participants ensures that our dataset is generalizable to those seen in clinics. These were early-stage PD cases without post-mortem confirmation; therefore, it is possible that a small number may have an alternative diagnosis. However, participants continue to be followed up every 18 months, with alternative diagnoses considered at that time. This was not a de novo group, which may limit generalisability, although our cohort reflects clinical practice. In this study, gait characteristics were derived from an instrumented mat (GAITRite); a similar analysis should also be performed with wearable sensors to investigate the contribution of the characteristics in ML classification models. Only single-task gait characteristics were analysed in this study; in future, the contribution of dual-task gait characteristics to PD classification models will also be investigated. The findings in this study are based on the test results (10-fold cross-validation and 10% testing data) on our cohort dataset (in a controlled setting), which may not generalise well to other cohorts. In future, we aim to evaluate our findings on much larger datasets (with diverse cohorts) in more naturalistic environments.
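The evaluation scheme mentioned above (10-fold cross-validation plus 10% testing data) can be sketched as an index split. The sketch assumes that the 10% test set is held out first and that the folds are formed on the remaining 90% without stratification. The source does not spell out these details, and the function name and seed are hypothetical.

```python
import random
from typing import List, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, validation indices)


def holdout_and_folds(n_samples: int, test_fraction: float = 0.10,
                      n_folds: int = 10, seed: int = 0) -> Tuple[List[int], List[Split]]:
    """Hold out a test set, then build k-fold splits on the remaining indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    n_test = round(n_samples * test_fraction)
    test_idx, dev_idx = indices[:n_test], indices[n_test:]

    # Deal the development indices into n_folds nearly equal folds.
    folds = [dev_idx[i::n_folds] for i in range(n_folds)]

    splits: List[Split] = []
    for k in range(n_folds):
        val = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, val))
    return test_idx, splits


# 303 participants, as reported for this cohort (119 PD, 184 HC).
test_idx, splits = holdout_and_folds(303)
print(len(test_idx), len(splits), len(splits[0][0]), len(splits[0][1]))
```

Because the test indices never appear in any fold, performance on them estimates how the tuned model behaves on unseen participants from the same cohort. It does not say how the model would transfer to other cohorts, which is the limitation noted above.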

Conclusion

In this study, comprehensive ML approaches were used to identify suitable models and the most important combination of spatial-temporal gait characteristics for classification of early PD. The best classification models for our dataset were RF, SVM, and LR. Following feature selection, model performance improved by 10%. RF gave the highest testing classification accuracy, 97%, using the features selected with RFE-SVM: mean step velocity, mean step length, step width variability, mean step width, and step length variability. These features not only give better results but also pave the way for an enhanced understanding of ML for clinicians. The findings are a first step towards demonstrating the potential of ML as a complementary tool to support clinical practice; however, further external validation is needed to confirm them.

Data availability

All the digital gait characteristics are presented in Table 2 of the manuscript, and the distribution of the data is shown through violin plots in Supplementary Fig. S1. Due to data privacy and the sharing agreement, the complete dataset is not publicly available. However, it can be made available upon reasonable request from the corresponding author (lynn.rochester@ncl.ac.uk).


Acknowledgements

This work was supported by the “Keep Control” project, a European Union Horizon 2020 research and innovation ITN program under the Marie Sklodowska-Curie grant agreement No. 721577. The ICICLE-Gait study was supported by Parkinson’s UK (J-0802, G-1301) and by the National Institute for Health Research (NIHR) Newcastle Biomedical Research Centre based at Newcastle Upon Tyne Hospital NHS Foundation Trust and Newcastle University (REC number: 09/H0906/82). The work was also supported by the NIHR/Wellcome Trust Clinical Research Facility (CRF) infrastructure at Newcastle upon Tyne Hospitals NHS Foundation Trust. The authors would like to thank all the participants of the ICICLE study and the PD UK framework for supporting the study. All opinions are those of the authors and not the funders.

Author information

Authors and Affiliations

Institute of Neuroscience/Institute for Ageing, Newcastle University, Newcastle Upon Tyne, NE4 5PL, UK
Rana Zia Ur Rehman, Silvia Del Din, Alison J. Yarnall & Lynn Rochester
School of Computing, Newcastle University, Newcastle Upon Tyne, NE4 5TG, UK
Yu Guan
School of Mathematics, Statistics, and Physics, Newcastle University, Newcastle Upon Tyne, NE1 7RU, UK
Jian Qing Shi
The Newcastle Upon Tyne Hospitals NHS Foundation Trust, Newcastle Upon Tyne, NE7 7DN, UK
Alison J. Yarnall & Lynn Rochester


Contributions

Rana Zia Ur Rehman performed data analysis, statistical analysis, drafting and critical revision of the manuscript. Silvia Del Din helped in data analysis, interpretation of data and critical revision of the manuscript for important intellectual content. Yu Guan and Jian Qing Shi provided support for statistical analysis, interpretation, and critical revision of the manuscript for important intellectual content. Alison J. Yarnall was involved in interpretation of data and critical revision of the manuscript. Lynn Rochester conceptualized and designed the study, helped in interpretation of data, and critically revised the manuscript for important intellectual content.

Corresponding author

Correspondence to Lynn Rochester.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Figure S1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rehman, R.Z.U., Del Din, S., Guan, Y. et al. Selecting Clinically Relevant Gait Characteristics for Classification of Early Parkinson’s Disease: A Comprehensive Machine Learning Approach. Sci Rep 9, 17269 (2019). https://doi.org/10.1038/s41598-019-53656-7

Received: 24 April 2019
Accepted: 23 October 2019
Published: 21 November 2019
DOI: https://doi.org/10.1038/s41598-019-53656-7
