Introduction

Mechanical failure (MF) following surgery for adult spinal deformity (ASD) is a severe postoperative complication that often requires planned or unplanned revision surgery1,2,3,4,5,6,7,8,9,10,11,12,13,14,15. Various types of symptomatic and asymptomatic MF can develop following ASD surgery, including proximal junctional kyphosis/failure (PJK/PJF), distal junctional kyphosis (DJK), and rod failure (RF). Postoperative MF rates for ASD have been reported to be as high as 50%1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17. Yilgor et al. reported that the postoperative global alignment and proportion (GAP) score was significantly correlated with MF16. Theologis et al. reported that MF more than doubled the treatment cost of ASD surgery17. Several previous studies have described the poor prognosis of patients who developed neurological deficits following MF6,10,11,12,13,14,15. Various risk factors for MF have been reported, including age, spinal malalignment requiring a large correction, osteoporosis, and the use of pedicle subtraction osteotomy (PSO)2,4,5,10,11,12,13,14,15,16,18. However, it remains difficult to prevent or minimize the risk of complications for individual patients following ASD surgery.

This study aimed to establish an ASD-specific risk-stratification model for predicting the risk of MF using each individual’s demographic, radiographic, and surgical description data.

Materials and Methods

This study was approved by the institutional review board of our institution (Keio University School of Medicine Ethics Review Committee), and all subjects consented to their inclusion. We attest that oral and written informed consent was obtained from all patients. All methods were performed in accordance with the relevant guidelines and regulations.

Patient population

We retrospectively reviewed charts and radiographs for 321 consecutive patients who underwent surgery for ASD in four academic hospitals between 2009 and 2016. For this study, we used the multicentered database established in a previous study and added 86 new patients who reached a 2-year postoperative follow-up duration19.

Inclusion and exclusion criteria

Subjects were at least 20 years old at the index surgery and had a spinal deformity defined by a Cobb angle ≥ 20°, a C7 sagittal vertical axis (C7SVA) ≥ 5 cm, or pelvic tilt (PT) ≥ 25°. We included patients with at least 5 fused vertebrae, segmental instrumentation and fusion from the upper-instrumented vertebra (UIV) to the lower-instrumented vertebra (LIV), and complete 2-year follow-up data. Patients were excluded if they lacked appropriate radiographs or had multi-rod constructs, posterior tethering at the UIV + 1 vertebra, or syndromic, neuromuscular, or other pathological conditions.

Collection of radiographic, health-related quality of life (HRQOL), and other demographic data

We collected demographic and clinical data for each patient, including age, gender, body mass index (BMI), bone mineral density (BMD), smoking status, history of hip joint arthroplasty, and history of spine surgery. Frailty and comorbidities were assessed using the modified frailty index (mFI) and the Charlson comorbidity index (CCI)20,21,22,23. We collected the following surgical data: the SRS-Schwab ASD classification, GAP score, application of three-column osteotomies, lateral lumbar interbody fusion (LIF), UIV and LIV levels, number of fused vertebrae, number of cross-connectors, material of the rods, estimated blood loss, and time of surgery. Radiographic data obtained at baseline and at the 6-week and 2-year follow-ups included the following measurements: Cobb angle, C7SVA, T4-T12, lumbar lordosis (LL), sacral slope, PT, pelvic incidence (PI), T1 pelvic angle (TPA), relative pelvic version (RPV), relative lumbar lordosis (RLL), lordosis distribution index (LDI), and relative spinopelvic alignment (RSA). As a surrogate for HRQOL, we used the Scoliosis Research Society-22r questionnaire (SRS22r) results at baseline and at the 2-year follow-up.

Of 332 candidates, 321 subjects had complete demographic and radiographic data that sufficiently captured any instances of postoperative MF and were thus included in the study cohort. The remaining 11 candidates were lost to follow-up, including 4 who died for reasons unrelated to the surgery, and were excluded from the cohort. We split the included patients into training and testing cohorts at an 8:2 ratio (257 and 64 patients, respectively).

Inclusion of mechanical failures

According to the previous literature, we included all MFs found on the radiographs that developed within 2 years of the operation (PJK/PJF, DJK, RF, and other implant-related complications).

Data preparation

Subjects were categorized into two groups: those who had any MF within 2 years of the operation and those who were free of MF. We investigated the relationships between patient demographics, spinal alignment, surgical factors, and the development of complications by univariate and multivariate logistic regression analysis using the data of the training cohort (age: 53 ± 19 years; follow-up: 4.4 ± 1.7 years). We created categories based on clinical importance and on the results of unpaired t-tests and Tukey’s HSD test or the Wilcoxon rank-sum test where appropriate, as follows: age ≤ 60 years or > 60 years; BMI < 18.5 kg/m², 18.5–25 kg/m², or > 25 kg/m²; BMD T-score ≤ −1.5 or > −1.5; frailty: robust (mFI: 0), prefrail (0.09–0.21), or frail (≥ 0.27); UIV T1–T6 or T9–T11; LIV L5 and above or the pelvis; Cobb angle < 70° or > 70°; C7SVA < 40 mm, 40–95 mm, or > 95 mm; PT < 20°, 20°–30°, or > 30°; PI-LL < 10°, 10°–20°, or > 20°.

Analysis of risks for mechanical failure in the patient cohort

We calculated the overall summary statistics, including the means and standard deviations for continuous variables and frequencies and percentages for categorical variables. After descriptive analysis, we analyzed the associations between potential risk factors and MF by univariate comparisons. We then created a multivariate binary logistic regression model to evaluate the adjusted associations of each potential explanatory variable and to predict the likelihood of developing MF. Clinically relevant variables and variables with a univariate significance level < 0.05 were included in the multivariate logistic regression analysis.

Building a model to predict mechanical failure

Based on the predictors obtained from multivariate analysis, we designed a simplified risk-stratification algorithm (PRISM) that provides a score for predicting the incidence of MF. We included 6 variables in our risk-stratification model. We first established a value for each risk indicator by rounding the β regression coefficient obtained in the univariate analysis to the nearest integer (0–3). Next, the values for all applicable risk indicators were added together with those for age and LIV level to establish the PRISM score (0–12). We evaluated the discriminative ability of the PRISM score based on the area under the receiver operating characteristic curve (AUROC). Linear regression analysis and the Cuzick test were performed to determine whether there was a trend between the PRISM score and the incidence of MF. The PRISM score was used to stratify patient risk into low risk (0–1), moderate risk (2–4), high risk (5–8), and very high risk (9–12).
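
The scoring and grading steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the β coefficients in `EXAMPLE_BETAS` and the indicator names are hypothetical placeholders (the actual point values are those shown in Fig. 1), and rounding half up with clamping to 0–3 is an assumption about how the rounding was done. Only the grade cut-offs (0–1, 2–4, 5–8, 9–12) are taken from the text.

```python
import math

# Hypothetical beta coefficients (log odds ratios) for illustration only.
# These are NOT the values estimated in the study.
EXAMPLE_BETAS = {
    "bmd_t_score_le_minus_1_5": 1.6,
    "pt_gt_30_deg": 1.4,
    "frail": 0.9,
    "bmi_gt_25": 0.6,
    "age_gt_60": 0.8,
    "liv_pelvis": 1.2,
}


def points_from_beta(beta):
    """Round a coefficient to the nearest integer (half up), clamped to 0-3."""
    return max(0, min(3, math.floor(beta + 0.5)))


def prism_score(points_by_indicator, present_indicators):
    """Sum the points of every risk indicator that applies to one patient."""
    return sum(points_by_indicator[name] for name in present_indicators)


def risk_grade(score):
    """Map a PRISM score (0-12) to the risk grade defined in the text."""
    if not 0 <= score <= 12:
        raise ValueError("PRISM score must be between 0 and 12")
    if score <= 1:
        return "low"
    if score <= 4:
        return "moderate"
    if score <= 8:
        return "high"
    return "very high"


# Integer points per indicator derived from the placeholder coefficients.
EXAMPLE_POINTS = {name: points_from_beta(b) for name, b in EXAMPLE_BETAS.items()}
```

With these placeholder coefficients, a patient with low BMD, PT > 30°, and pelvic fixation would score 2 + 1 + 1 = 4 and fall in the moderate-risk grade.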

Validation of PRISM for MF

The PRISM model was applied to the 64 testing samples that were not used for model development. The discriminative ability of PRISM was evaluated using AUROC analysis, linear regression, and the trend test.
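
For readers unfamiliar with the AUROC, the quantity can be computed directly from scores and outcomes as the probability that a randomly chosen patient with MF has a higher score than a randomly chosen MF-free patient, with ties counted as one half. The sketch below illustrates that definition only; it is not the SPSS procedure used in this study, and the function name `auroc` and the 0/1 outcome coding are assumptions made for the example.

```python
def auroc(scores, outcomes):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores   -- risk scores, one per patient
    outcomes -- 1 if the patient developed MF, 0 otherwise
    """
    with_mf = [s for s, y in zip(scores, outcomes) if y == 1]
    without_mf = [s for s, y in zip(scores, outcomes) if y == 0]
    if not with_mf or not without_mf:
        raise ValueError("both outcome groups must be non-empty")

    concordant = 0.0
    for s_mf in with_mf:
        for s_free in without_mf:
            if s_mf > s_free:
                concordant += 1.0   # MF patient ranked higher: concordant pair
            elif s_mf == s_free:
                concordant += 0.5   # tied scores count as half
    return concordant / (len(with_mf) * len(without_mf))
```

An AUROC of 0.5 corresponds to a score with no discriminative ability, and 1.0 to perfect separation of the two groups.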

Statistical analysis

Differences between the MF and MF-free groups were compared by unpaired t-test, chi-square test, Tukey’s HSD test, and Fisher’s exact test where appropriate. Potential risk variables were analyzed by univariate and multivariate logistic regression. The relationship between the PRISM score and the observed MF rate at each score was modeled by linear regression, fitting the line that best fit the data points with a 95% CI. A p value of less than 0.05 was considered statistically significant24,25. All analyses were performed using the Statistical Package for the Social Sciences (SPSS statistics version 26.0, SPSS modeler version 18, IBM Corp., Armonk, NY).

Results

Patient characteristics

Among the 257 training samples, MF developed in 40.5% (n = 104) of patients. The most common MF was PJK (n = 55, 21%), and the second most common was RF (n = 33, 13%). Thirty-nine patients (38% of the MF group) developed two or more MFs, and 33 patients in the MF group (32%) required unplanned additional surgeries to treat the MFs.

Clinical and radiographic outcomes in the mechanical failure and mechanical failure-free groups

Patients who developed MFs experienced significant improvements in HRQOL, as measured by SRS22r at the 2-year follow-up (Supplemental Table 1). However, the 2-year SRS22r scores were worse in the MF group than in the MF-free group except for the mental-health subdomain (Supplemental Table 1).

Risk analysis for mechanical failure

Comparisons of the demographic and radiographic data between the MF and MF-free groups indicated different distributions of age, BMI, BMD, frailty, CCI, baseline and 2-year postoperative sagittal spinal alignment, curve type, and surgical details (Table 1 and Supplemental Table 2). Univariate analyses revealed the following significant risk factors for the development of MF, presented here in order of the odds ratio (OR): Curve type N, type L, LIV, PT, age, frailty, PI-LL, BMD, PSO, UIV, C7SVA, and LIF (Table 2).

Table 1 Comparisons of demographic data and surgical descriptions between the mechanical failure-free and mechanical failure groups in the training cohort.
Table 2 Univariate logistic regression analysis for the risk of mechanical failure following ASD surgery.

Among these risk factors, the multivariate analysis identified BMD, PT, frailty, and BMI as independent risk factors for MF (BMD: OR 3.8 [1.9–7.7], PT: OR 2.6 [1.8–3.9], frailty: OR 1.9 [1.1–3.2], BMI: OR 1.7 [1.0–2.9], Table 3).

Table 3 Multivariate logistic regression analysis for the risk of mechanical failure following ASD surgery.

Building and validating a model to predict mechanical failure

We created a surgical risk grading system with 6 risk variables: 4 variables identified in the multivariate analysis (BMI, BMD, baseline PT, and frailty) and 2 clinically important variables (patient age and LIV level [pelvis]). The PRISM score was determined as the sum of the values of the risk variables (Fig. 1). Thirty-seven percent of the patients were classified as high risk, 27% as moderate risk, 22% as low risk, and 14% as very high risk (Fig. 2). The incidence of MF increased as the PRISM score increased, and the linear regression that best fit the data points with a 95% CI confirmed an excellent correlation between the PRISM score and the observed development of MF, with the following regression model: y = 1.66 + 8.29x (r² = 0.956).

Figure 1

Risk-grading system for mechanical failure following ASD surgery. The risk stratification score was used to stratify the risk into low risk (risk score 0–1), moderate risk (risk score 2–4), high risk (risk score 5–8), and very high risk (risk score 9–12).

Figure 2

The distribution of mechanical failure in the training cohort, stratified by score relative to the observed mechanical failure rate. The mechanical failure rate increased with the score. A statistically significant trend between the mechanical failure rate and the score was observed (p for trend ≤ 0.001, Cuzick test).

The PRISM score also showed excellent accuracy for predicting the incidence of MF, with an AUROC of 0.812 (95% CI 0.763–0.864, Fig. 3 and Supplemental Table 3). In addition, the trend analysis showed an excellent correlation between the PRISM score and the incidence of MF (Cuzick test, p < 0.001). Internal validation with the testing sample showed good model fit for the prediction of MF with excellent discriminating ability (y = 13.9 + 10.5x, r² = 0.866; AUROC 0.855 [95% CI 0.765–0.945]; Figs. 4 and 5 and Supplemental Table 4).
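
As a worked illustration of the two fitted lines reported above, the short sketch below evaluates them at a given PRISM score. It assumes that x is the PRISM score and y is the observed MF rate in percent, which is our reading of the regression model rather than something stated explicitly; the names `TRAINING_FIT`, `TESTING_FIT`, and `fitted_mf_rate` are introduced for this example only.

```python
# (intercept, slope) pairs as reported in the text.
TRAINING_FIT = (1.66, 8.29)   # y = 1.66 + 8.29x, r^2 = 0.956
TESTING_FIT = (13.9, 10.5)    # y = 13.9 + 10.5x, r^2 = 0.866


def fitted_mf_rate(score, fit):
    """Evaluate a fitted line at a PRISM score.

    Assumes the result is an MF rate in percent (an interpretation,
    not a statement from the study).
    """
    intercept, slope = fit
    return intercept + slope * score


# Training-cohort line at the lower bound of the high-risk grade (score 5).
rate_at_5 = fitted_mf_rate(5, TRAINING_FIT)   # 1.66 + 8.29 * 5 = 43.11
```

Because a straight line is unbounded, both fits exceed 100 at the top of the 0–12 range under this interpretation (for example, 1.66 + 8.29 × 12 = 101.14 for the training line), so the lines are best read as descriptions of the trend across observed scores rather than as calibrated probabilities.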

Figure 3

The distribution of the score and receiver operating characteristic (ROC) analysis in the training cohort relative to the observed mechanical failure rate for each score. ROC curve of the mechanical failure rate for the score (red line) in the training cohort. The area under the ROC curve (AUROC) was 0.812, standard error = 0.026, p ≤ 0.001, 95% CI = 0.763–0.864 for the score.

Figure 4

The distribution of the score and receiver operating characteristic (ROC) analysis in the testing cohort relative to the observed mechanical failure rate for each score. ROC curve of the mechanical failure rate for the score (red line) in the testing cohort. The area under the ROC curve (AUROC) was 0.855, standard error = 0.046, p ≤ 0.001, 95% CI = 0.765–0.945 for the score.

Figure 5

Distribution of scores and grades in the testing cohort relative to the observed mechanical failure rate. The incidence of mechanical failure increased with the score. A statistically significant trend between the mechanical failure rate and the score was observed (p for trend ≤ 0.001, Cuzick test).

Discussion

Mechanical failure is the most common surgical complication after ASD surgery1,2,3,4,5,6,7,8,9,10,11,12,13,14,15. In this study, MF developed in 40.5% of patients following ASD surgery. Of the patients in the MF group, 38% developed multiple MFs during follow-up, and 32% required unplanned reoperation. Crawford et al. reported that 24% of patients required an unplanned reoperation following ASD surgery and that the most common indication for reoperation was RF1. Inoue et al. reported that the most frequent type of MF associated with reoperation was PJF25. In the present study, approximately 36% of RFs developed 3 years or longer after surgery, and 70% of them were not associated with either a significant change in HRQOL or progressive deformity. Consistent with the previous literature, the clinical outcomes of patients who developed MF were inferior to those of MF-free patients at 2 years after surgery. Soroceanu et al.26 reported that MF significantly affected HRQOL in 245 consecutive ASD surgeries. Lertudomphonwanit et al. also reported that the MF group had less overall improvement in HRQOL than did the MF-free group2. Recently, several cost-utility analyses of the surgical treatment for ASD have been performed17,27,28. Yagi et al. reported that revision surgery for ASD increased the 2-year total cost by approximately 30% and significantly decreased the cost-effectiveness of the surgery27. Safaee et al. also reported a significant change in the cost-effectiveness of ASD surgery for patients who required revision surgery28. Taken together, these findings indicate that predicting the risk of MF and reducing this common but significant complication are essential to improving the outcomes of surgical treatment for ASD.

Several potential risk factors for developing MF have been reported and confirmed in the previous literature2,4,5,10,11,12,13,14,15,16,18. One can argue that MF includes various implant-related complications and that the risk analysis should therefore be oriented to each type of MF. However, the risk factors for each type of MF overlap. Yagi et al. described BMD, age, a large amount of sagittal alignment correction, and pelvic fusion as independent risk factors for PJF13,14,15. Similarly, Smith et al. described these factors as independent risk factors for RF5,29,30. In the present study, 38% of the MF group patients developed 2 or more types of MF during follow-up. This high incidence of multiple MFs in the same patient strongly supports the presence of common risk factors among the various types of MF.

In the present study, we established a predictive model for mechanical failure based on individual demographics, baseline spinal alignment, and surgical descriptive data. One advantage of the model is the feasibility of the scoring system: it consists only of preoperative values and can therefore be fully assessed before surgery. The model also showed good accuracy for predicting the incidence of MF in both the training and testing cohorts, with an AUROC of 0.81 in the training subjects and 0.86 in the testing subjects.

This study was limited by its retrospective design and the lack of external validation, which precludes drawing firm conclusions about our model’s predictive power for MF. However, we enrolled consecutive patients from a prospective database and analyzed them retrospectively, which is the most common method for investigating how a factor affects outcomes and complications in clinical research when randomized controlled trials are not possible. Moreover, MF is a radiographic complication and can therefore be identified retrospectively on periodic radiographs. In the present study, we enrolled patients from 7 surgeons in 3 academic hospitals in an East Asian country, so the patients were mostly Asian. Therefore, our results cannot necessarily be extrapolated to all other hospital settings. It is widely accepted that BMD differs between races31. Therefore, either an adjustment of the BMD index or the addition of a race index may be necessary to further improve the accuracy of PRISM in other populations. Further analyses including different patient populations are necessary to validate the predictive power of our model for MF following ASD surgery. Despite these limitations, the present study showed that this newly created model, which is based on demographic, radiographic, and surgical description data, has good predictive performance for MF following ASD surgery.

Conclusion

The newly established risk-stratification scoring model can predict MF following ASD surgery using individual demographics and radiographic parameters as well as surgical description data that would normally be collected routinely when considering surgical treatment for a patient with ASD. This model can help surgeons identify patients at high risk of MF and address modifiable risk variables to mitigate the risk of MF following ASD surgery.