Introduction

Cervical spondylotic myelopathy (CSM), a common degenerative disease of older adults, has become a worldwide health problem that places a serious burden on individuals and society1. The pathogenesis of CSM is thought to be related to spinal cord compression, but the exact pathophysiological mechanism remains unclear2. Currently, the main treatment for CSM is surgical decompression of the spinal cord, but the efficacy of surgery varies between individuals3,4. Preoperatively, disease severity is evaluated mainly from the patient's symptoms and imaging parameters, but this approach has limitations in predicting the treatment outcome5. Previous studies have shown that preoperative factors such as age, severity of spinal cord compromise, duration of symptoms, comorbidities and cervical sagittal alignment can affect prognosis6,7,8,9,10. Nevertheless, in most of these studies patients were classified according to the research purpose and clinical experience, which limited the patient characteristics considered, omitted patient-based assessments such as quality of life, and left confounding factors unaccounted for.

Cluster analysis is an exploratory data analysis method that groups a collection of samples into categories of similar samples, so that samples are highly similar within groups and clearly different between groups. Cluster analysis classifies samples directly from the data without relying on classification criteria given by researchers in advance; from the perspective of machine learning, it is therefore an unsupervised learning process. Cluster analysis based on unsupervised machine learning can improve the classification of disease phenotypes and patients in studies11,12, and when more clinical features are incorporated, previously unobvious data associations and structures may be revealed13. In this study, we assumed that existing CSM patients contain clinically relevant groups that transcend earlier a priori classifications. Hierarchical clustering was applied to explore patient types, and the resulting types were analyzed to identify the preoperative factors with predictive significance in this mixed system and to determine which patients benefit most from surgery.

Data and methods

General information

Based on the clinical big data research platform of our institution, data of CSM patients who received surgical treatment in our hospital from January 2012 to December 2020 were collected. Inclusion criteria: (1) CSM patients diagnosed by orthopedics and undergoing surgical treatment (anterior cervical discectomy and fusion [ACDF] or laminoplasty [LP]); (2) Age > 18 years old; (3) Complete preoperative baseline data and follow-up data including at least one short-term (≤ 6 months) and one long-term (≥ 12 months) follow-up. Exclusion criteria: (1) previous history of neck surgery; (2) unqualified follow-up time; (3) patients with traumatic myelopathy or cervical spine deformities.

Baseline data

1. Population characteristics: gender, age, smoking and alcohol consumption history.

2. Clinical symptoms: numbness, neck and shoulder pain, chest and abdominal banding sensation, plantar cotton-stepping sensation, fine motor loss, gait abnormality, sympathetic symptoms.

3. Physical examination: muscular atrophy, muscle strength loss, abnormal reflexes, positive pathological signs (including Rossolimo’s sign, Hoffmann’s sign or Babinski’s sign), positive Eaton test, positive Spurling test.

4. Scoring information: the modified Japanese Orthopedic Association score (mJOA)14 and the Quality of Life Short Form 36 (SF-36) scale15 were used to assess cervical spinal cord function and preoperative quality of life, respectively.

In the above data, gender is a binary variable, while age and score data are continuous variables. Smoking history, alcohol consumption history, clinical symptoms and physical examination were all defined as "yes/no" binary variables.

Follow-up data

Short-term follow-up data within 6 months and long-term follow-up data over 12 months after surgery were included in this study. If a patient had multiple follow-ups within a period, the timepoint farthest from the operation was selected for that period. The follow-up included the mJOA score and the SF-36 scale. The SF-36 data comprised scores for eight dimensions, namely physical function (PF), role-physical (RP), bodily pain (BP), vitality (VT), social function (SF), role-emotional (RE), mental health (MH) and general health (GH), together with the SF-36 health transition (HT) score.
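The selection rule for repeated follow-ups can be illustrated with a minimal Python sketch. It assumes the windows stated in the inclusion criteria (≤ 6 months for short-term, ≥ 12 months for long-term); the `Visit` tuple, the function name `select_followups` and the example scores are hypothetical and are not taken from the study database.

```python
from typing import List, Optional, Tuple

# One follow-up visit: (months after surgery, mJOA score). Illustrative layout only.
Visit = Tuple[float, float]


def select_followups(visits: List[Visit]) -> Tuple[Optional[Visit], Optional[Visit]]:
    """Return (short_term, long_term) visits for one patient.

    Within each window the visit farthest from the operation is kept.
    Assumed windows: short-term <= 6 months, long-term >= 12 months.
    """
    short_window = [v for v in visits if 0 < v[0] <= 6]
    long_window = [v for v in visits if v[0] >= 12]

    def latest(group: List[Visit]) -> Optional[Visit]:
        # None signals that the patient has no visit in this window
        return max(group, key=lambda v: v[0]) if group else None

    return latest(short_window), latest(long_window)


# Made-up visits for a single patient
example_visits = [(1, 12), (3, 13), (6, 14), (9, 15), (12, 15), (24, 16)]
short_term, long_term = select_followups(example_visits)
```

A patient for whom either returned value is `None` would not meet the follow-up requirement of the inclusion criteria.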

Patient prognosis assessment

In this study, the spinal cord recovery ratio (RR) was used to evaluate the prognosis of patients. It was calculated as follows16: RR = (follow-up mJOA score − baseline mJOA score)/(17 − baseline mJOA score) × 100%. An RR of > 50% was defined as a good prognosis17.
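As a worked illustration of the recovery ratio, the following Python sketch applies the formula and the 50% threshold given above. The function names and the example scores are hypothetical; the handling of a baseline score of 17, where the denominator is zero, is an assumption of this sketch and is not described in the study.

```python
def recovery_ratio(baseline_mjoa: float, followup_mjoa: float, max_score: float = 17.0) -> float:
    """Recovery ratio in percent: (follow-up - baseline) / (17 - baseline) * 100."""
    if baseline_mjoa >= max_score:
        # Assumption of this sketch: RR is treated as undefined at the maximum baseline score
        raise ValueError("RR is undefined when the baseline mJOA score is already 17")
    return (followup_mjoa - baseline_mjoa) / (max_score - baseline_mjoa) * 100.0


def is_good_prognosis(rr_percent: float, threshold: float = 50.0) -> bool:
    """Good prognosis is defined in the text as an RR strictly above 50%."""
    return rr_percent > threshold


# Made-up scores for illustration
rr_improved = recovery_ratio(baseline_mjoa=11, followup_mjoa=15)  # 4 of 6 possible points regained
rr_worsened = recovery_ratio(baseline_mjoa=12, followup_mjoa=10)  # score fell after surgery
```

Because the numerator becomes negative when the follow-up score is below baseline, RR can take negative values, which is one reason the RR values of a cohort can show a large standard deviation.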

Statistical analysis

SPSS (IBM, 24.0) was used for statistical analysis of the data.

(1) Hierarchical clustering was selected as the clustering method, and the clustering features were the mJOA and SF-36 scores at short-term follow-up. The squared Euclidean distance was used to measure the similarity of objects, and Ward's method was used for the clustering calculation18. The optimal clustering number was selected from the dendrogram generated by hierarchical clustering and from the elbow graph of the sum of squared errors (SSE) against the clustering number k.

(2) For the patients in each group determined by hierarchical clustering, the spinal cord recovery ratio (RR) based on the mJOA score was calculated to show the prognosis of each group, and the prognosis of each group during long-term follow-up was analyzed to test the effectiveness of the cluster analysis. Finally, the impact of baseline data on the prognosis of patients in each group was analyzed.

(3) Continuous variables such as age and score data were expressed as mean ± standard deviation, and the Shapiro–Wilk test was used to evaluate whether the data were normally distributed. For normally distributed data, one-way ANOVA was used to test the differences between groups. For non-normally distributed data, the rank sum test (Kruskal–Wallis test) was used. For the categorical variables, the expected frequency was calculated first. For data with expected frequency ≥ 5, the Pearson Chi-square test (χ2) was used to analyze the differences among all groups. For data with expected frequency < 5, Fisher's exact test was used.
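The choice between the Chi-square test and Fisher's exact test in step (3) depends on the expected frequencies of the contingency table. The Python sketch below shows only that selection step; the tests themselves were run in SPSS in this study. The function names and the example tables are hypothetical, and applying the threshold to the smallest expected cell count is an assumption about how the rule was applied.

```python
from typing import List


def expected_frequencies(table: List[List[int]]) -> List[List[float]]:
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def choose_categorical_test(table: List[List[int]], threshold: float = 5.0) -> str:
    """Pick the test by the smallest expected count (assumed reading of the rule in the text)."""
    smallest = min(min(row) for row in expected_frequencies(table))
    return "pearson_chi_square" if smallest >= threshold else "fisher_exact"


# Made-up 2 x 2 tables: rows = sign present / absent, columns = two groups
balanced_table = [[20, 30], [30, 20]]  # every expected count is 25
sparse_table = [[1, 9], [8, 2]]        # smallest expected count is 4.5

balanced_choice = choose_categorical_test(balanced_table)
sparse_choice = choose_categorical_test(sparse_table)
```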

Ethical approval

This study was approved by the Ethics Committee of Peking University Third Hospital (2021-60-02). Informed consent was obtained from all subjects in the database. All analysis was performed in accordance with relevant regulations of the committee and the Declaration of Helsinki.

Results

In this study, 476 patients with CSM (249 males and 227 females) with an average age of 52.0 ± 11.2 years were included, of whom 71 (14.9%) had a history of smoking and 40 (8.4%) had a history of alcohol consumption. Clinical symptoms: 389 patients (81.7%) with numbness, 106 patients (22.3%) with neck and shoulder pain, 28 patients (5.9%) with chest and abdominal banding sensation, 185 patients (38.9%) with plantar cotton-stepping sensation, 59 patients (12.4%) with fine motor loss, 40 patients (8.4%) with gait abnormality, and 96 patients (20.2%) with sympathetic symptoms. Physical examination: 49 patients (10.3%) with muscular atrophy, 255 patients (53.6%) with decreased muscle strength, 43 patients (9.0%) with abnormal reflexes, 337 patients (70.8%) with positive pathological signs, 183 patients (38.4%) with a positive Eaton test, and 92 patients (19.3%) with a positive Spurling test. Among all patients, 355 (74.6%) had surgery involving multiple levels of the cervical spine. A total of 288 patients (60.5%) underwent ACDF, and 188 (39.5%) underwent laminoplasty (LP). There was no significant difference in prognosis between the ACDF and LP groups, with a short-term mJOA RR (%) of 43.3 ± 60.8 vs 41.6 ± 51.3 (P = 0.078) and a long-term mJOA RR (%) of 44.0 ± 79.5 vs 39.1 ± 68.5 (P = 0.073). Baseline and follow-up data of mJOA score and SF-36 score for the entire cohort are shown in Table 1 and Fig. 1.

Table 1 Baseline data and follow-up data of mJOA score and SF-36 score for the entire cohort.
Figure 1

Baseline data and follow-up data of mJOA score and SF-36 score for the entire cohort. The change of each SF-36 score is different, and the SF-36 data of each patient is quite different (A). The mJOA score shows a trend of improvement in both the ACDF group and the LP group (B).

Results of hierarchical cluster analysis

The results of hierarchical clustering of the 476 patients are shown in the dendrogram (Fig. 2). Based on the relation between the sum of squared errors (SSE) and the clustering number k (Fig. 3), the optimal clustering number in this study was 4. Table 2 shows the score data of the total sample of CSM patients and of the 4 clusters generated by hierarchical clustering. The short-term follow-up scores were the features used for hierarchical clustering and differed significantly among all clusters (P < 0.001). Among the preoperative scores, SF-36 physical function (P = 0.008) and mJOA score (P = 0.042) differed significantly among clusters, whereas the other preoperative scores did not (P > 0.05, Table 2).

Figure 2

Dendrogram from application of unsupervised hierarchical clustering. It shows 2 definite clusters at the first branch, 3 clusters at the second branch and 4 clusters at the third branch (A). Each dot represents a patient of the sample and colors represent the portions of the dendrogram in each of the four patient clusters. Green represents cluster 1, yellow cluster 2, blue cluster 3 and gray cluster 4.

Figure 3

“Elbow” relation between the sum of the squared errors (SSE) and clustering number k. SSE is the clustering error of all samples. As the clustering number k increases, each category becomes more compact and SSE becomes smaller. While k is less than the real cluster number, each increase of k greatly improves the compactness of the categories, and SSE decreases sharply. Once k reaches the real clustering number, the gain in compactness from a further increase of k falls off rapidly, so SSE flattens as k continues to increase. The plot of SSE against k therefore resembles an "elbow", and the value of k at the inflection point of the elbow was 4.
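The elbow relation can be reproduced on toy data with a minimal Python sketch of agglomerative clustering under Ward's criterion, recording SSE at each cluster count k. This is an illustration under stated assumptions and not the SPSS procedure used in the study: the function names are hypothetical, the two-dimensional points are made up to form four well-separated groups, and the naive search over all cluster pairs is only practical for small samples.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a cluster."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(clusters: List[List[Point]]) -> float:
    """Sum of squared distances from every sample to its own cluster centroid."""
    total = 0.0
    for members in clusters:
        center = centroid(members)
        total += sum(squared_distance(p, center) for p in members)
    return total


def ward_sse_by_k(points: List[Point]) -> Dict[int, float]:
    """Merge clusters bottom-up with Ward's criterion; return {k: SSE} for every k."""
    clusters = [[p] for p in points]  # start with one cluster per sample
    curve = {len(clusters): 0.0}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Ward's merge cost: the increase in SSE caused by merging a and b
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * squared_distance(centroid(a), centroid(b)))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
        curve[len(clusters)] = sse(clusters)
    return curve


# Made-up data: four tight groups of three points each
toy_points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0),
              (0, 10), (0, 11), (1, 10), (10, 10), (10, 11), (11, 10)]
curve = ward_sse_by_k(toy_points)
elbow_table = [(k, round(curve[k], 2)) for k in range(1, 7)]
```

On this toy data SSE falls steeply up to k = 4 and changes little afterwards, which is the flattening that the elbow criterion looks for.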

Table 2 Score data of patients in each cluster.

The prognosis of each cluster was mainly reflected by the recovery ratio of the mJOA score after surgery. Clusters 2 and 4 had higher mean mJOA RR, while clusters 1 and 3 had lower mean mJOA RR, with significant differences among clusters (P < 0.001, Table 3). Patients in each cluster showed similar results during long-term follow-up: the mJOA RR in clusters 2 and 4 was significantly higher than that in clusters 1 and 3 (Fig. 4).

Table 3 mJOA score and RR of patients in each cluster.
Figure 4

mJOA score and RR of patients in each cluster. The mJOA scores of patients in each cluster were improved postoperatively (A). The recovery ratio of mJOA score (mJOA RR) was used to evaluate the prognosis of patients in each cluster (B). The mJOA RR of patients in clusters 2 and 4 was more than 50%, indicating a good prognosis, while patients in clusters 1 and 3 had a poor prognosis.

Table 4 shows the differences in the distribution of population characteristics, clinical symptoms, physical examination findings and surgical procedures among all clusters. Among the population characteristics, age differed among clusters (P = 0.021). Among the preoperative clinical symptoms, neck and shoulder pain (P = 0.044) and gait abnormality (P = 0.012) differed among clusters. Among the preoperative physical examination findings, the proportion of patients with positive pathological signs differed among clusters (P = 0.006). There was no significant difference in the other characteristics (P > 0.05, Table 4).

Table 4 Population characteristics, clinical symptoms and physical examination features of patients in 4 clusters.

Discussion

Advantages of cluster analysis

Cluster analysis is an unsupervised machine learning method that can find more homogeneous groups in different data sets19. This approach can identify more complex data patterns in more realistic hybrid systems and classify patients or interventions based on observable features. In cluster analysis, the determination of patient types and intervention categories is purely data-driven and does not rely on a priori assumptions, so it is an effective complement to supervised learning.

Hierarchical clustering, one of the methods of cluster analysis, does not require the number of clusters to be specified in advance and can show the clustering process of large samples directly. Ames et al.13 proposed that hierarchical clustering based on artificial intelligence can include and simultaneously analyze more patient population characteristics, symptom factors, imaging and functional scores than existing patient classification schemes. Similarly, the method has been used to describe groups of patients in various diseases, including adult spinal deformities, pulmonary hypertension, asthma, mental disorders, and malignancies20,21,22,23, and to study the factors influencing patient benefit after surgical intervention24.

Significance of CSM patient categories obtained based on hierarchical clustering method

To select the baseline data, we reviewed the literature on CSM from recent years, summarized the common symptoms and signs, and combined them with the data of our hospital to screen for the high-frequency specialty characteristics of cervical spondylosis, which were then included in this study. The mJOA score is the most commonly used objective index to evaluate the outcome of patients with CSM14. However, previous studies10,25 lacked patient-based assessments such as quality of life. In addition to spinal cord function, improvement in health-related quality of life demonstrates the therapeutic effect more directly from the patient's perspective. Therefore, the results of the SF-36 scale were also included as clustering features in this study.

In this study, long-term follow-up data of the same population were analyzed to preliminarily verify the rationality of the hierarchical clustering results based on short-term follow-up data. A previous 10-year follow-up study26 found that mJOA scores showed similar improvement rates within one year or more for patients after CSM surgery, and this trend continued until 5 years after surgery. In this study, the mJOA RR in each cluster during long-term follow-up was consistent with that in short-term follow-up, indicating that the hierarchical clustering results based on short-term follow-up data were representative to a certain extent. Thus, it is reasonable to use cluster 2 and 4 to represent patients with good prognosis in this study.

From the results of hierarchical clustering, surgical intervention can significantly improve spinal cord function in CSM patients, which is consistent with the results of a systematic review by Rhee et al.27 in 2017. Affected by preoperative factors, the prognosis of the four clusters of patients was different.

In terms of population characteristics, the average age of patients was lower in clusters 2 and 4, which had a good prognosis, and higher in clusters 1 and 3, which had a poor prognosis (Fig. 5). Matsuda et al. conducted a study on 17 CSM patients over 70 years old and reported that their recovery rate was significantly lower than that of the control group28. In a prospective study, Furlan et al.29 reported similar results, and the present study suggests that such age-related prognostic differences are also present in younger age groups. However, Hasegawa et al.30 and Holly et al.31 reported that surgical outcomes in older CSM patients do not differ significantly from those in younger patients, although the incidence of neurological complications is higher. The age-related differences may be attributed to the fact that (1) the spinal cord in the elderly experiences age-related changes, including decreased C-motor neurons, decreased number of anterior horn cells, and decreased number of myelinated fibers in the corticospinal tract and posterior cord32; (2) older patients are more likely to have unrelated comorbidities that may affect quality of life33. Kusin et al.34 reported that smoking is also an important prognostic factor in CSM patients, but no similar trend was found in our cluster analysis. The reasons may be that (1) most patients chose to quit smoking before admission; (2) the personal history was not collected in sufficient detail, resulting in inaccurate baseline data; (3) local air pollution may also have affected non-smokers.

Figure 5

Radar map of prognostic factors of patients in each cluster. The proportion of color on the map represents the prognosis of patients in each cluster. PR positive ratio, PF physical function.

In terms of clinical symptoms and physical examination, a comparison of clusters 2 and 3 shows that a lower preoperative rate of gait abnormality and a lower positive rate of pathological signs also correspond to a better prognosis. This is similar to the specific symptoms and signs mentioned by Badhiwala et al.5, which may also be potential predictors of outcome in patients with CSM, although the biological mechanism remains unclear. Previous studies25,35 reported that gait abnormality is a disturbing symptom in patients with CSM that can result from involvement of the long tracts of the spinal cord.

In terms of preoperative score data, the systematic review by Tetreault et al.33 reported that the duration of symptoms and the severity of preoperative myelopathy are also clear predictors of prognosis in CSM patients. The results of this study suggest that preoperative myelopathy severity is well reflected by the SF-36 physical function (PF) score and the mJOA score, which were significantly higher in cluster 2, with a good prognosis, than in the other groups, while cluster 3, with a poor prognosis, had poor preoperative scores. In addition, a comparison of clusters 3 and 4 suggests that age and the incidence of neck and shoulder pain may influence the prognosis of patients with poor preoperative SF-36 PF and mJOA scores. Whether these are useful clinical factors for identifying CSM patients at risk of poor postoperative outcomes deserves further study.

In the field of CSM, Zhou et al.36 previously showed that machine learning-based clustering can be used to classify a heterogeneous cohort of CSM patients rationally and effectively. In this study, postoperative cluster analysis of CSM patients was combined with baseline data to screen out the preoperative population and clinical characteristics affecting prognosis. In future studies, the amount of patient information will be larger, and the influence of various factors on surgical prognosis will be more complex and diverse. Unlike traditional research methods such as cohort studies, cluster analysis can identify more valuable preoperative factors in a hybrid system closer to the real world and support more accurate preoperative prediction of the treatment effect based on these factors.

Limitations

In this study, patients were divided into 4 categories by unsupervised clustering and some possible prognostic factors for CSM patients were screened out, which may provide useful information for the informed consent of patients before surgery and assist doctors in clinical decision-making18. However, this study also has shortcomings: (1) As a retrospective study, the data were restricted to a single center, and patients with complete data accounted for only 25.1%, reflecting considerable missing data and loss to follow-up. Still, this is a relatively large cohort compared with similar studies of cervical spine disease37. (2) This study did not include preoperative and follow-up imaging data, and did not collect patients' previous medical history (e.g., diabetes), detailed surgical data (e.g., segments involved, blood loss) or postoperative rehabilitation data, which may have affected our patient clustering38. (3) Compared with analyses of multiple perioperative time points, the duration of this study was fairly short, which may limit the description of the duration of patients' symptoms and of the recovery course24. (4) Blind evaluation is recommended for clinical studies; however, mJOA scores were evaluated by surgeons, which may have affected the results18.

In conclusion, the postoperative efficacy of CSM patients related to this study still needs to be verified by multi-center, large sample size and long-term follow-up observation. Despite the above limitations, the results of this study preliminarily verified the feasibility of hierarchical clustering in the study of prognostic factors of CSM patients, laying a foundation for future cluster studies involving surgical information, imaging information and other factors.

Conclusions

In this study, cluster analysis was performed based on postoperative follow-up information, and CSM patients who underwent surgical treatment were divided into four categories representing four different prognostic patterns. From these categories, preoperative factors that could help predict prognosis were identified: (1) lower age among the population characteristics; (2) lower rates of neck and shoulder pain and gait abnormality among the clinical symptoms; (3) a lower positive rate of pathological signs on physical examination; (4) higher SF-36 physical function scores and mJOA scores in the scoring information. All of these were associated with better patient outcomes. This study explored the feasibility of applying cluster analysis to the study of prognostic factors in CSM patients and provides a reference and research basis for further studies, which may include collecting larger samples, extracting more patient characteristics, setting more follow-up timepoints, and improving the clustering algorithm.