Introduction

Spondylodiscitis, the commonest infection of the vertebral bodies and intervertebral discs, has a recently reported incidence as high as 4.4 per 100,000 per year in the Western world and carries a mortality rate of up to 20%1,2,3,4,5,6,7. Chronic infection and subtherapeutic treatment can lead to persistent spinal deformity, neurological deficits, permanent reduction in quality of life, residual pain needing long-term analgesia, and mortality in otherwise healthy individuals2,3,6,7. Granulomatous spondylodiscitis once predominated, but pyogenic spondylodiscitis now prevails, owing to improved diagnostics and a more susceptible population6,8,9,10.

At present, conservative treatment, the most commonly used treatment option, consists of long-term antibiotics, the duration and specifics of which are highly debated11. Indications for surgery include failure of conservative management, mechanical instability, and compression of neurological structures. Early surgery has been hypothesised to accelerate clearance and prevent deformity, but its role remains controversial. Given the significant implications of spondylodiscitis, and the increasing incidence, defining the role of early surgery is critical.

Present literature often cites age and co-morbidities as being vital in deciding optimal treatment strategies12,13. Lesion subtypes may be vital too; patients with spinal epidural abscesses (SEAs) may be preferentially managed with surgical debridement and decompression14. However, there is a clear source of heterogeneity in the findings of current studies. For example, the seminal review by Rutges et al. found that early antibiotics were important in improving outcomes and found an anterior surgical approach to be beneficial15. Nonetheless, they were unable to recommend early surgical or conservative management over the other, due to data heterogeneity. However, the authors did not attempt to explore or mitigate this.

To facilitate decision-making in the management of spondylodiscitis, a robust quantitative and qualitative synthesis is required. This study therefore aimed to define the role of early surgery in spondylodiscitis, in comparison to conservative management.

Methods

Search strategy and selection criteria

This systematic review was conducted using the guidelines outlined by the Cochrane Collaboration and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). The completed PRISMA flowchart is shown in Fig. 1A. The literature search was carried out on the 30th of April 2022 in MEDLINE, Embase, Scopus, PubMed, and JSTOR, covering 1943 to 2022; the complete search strategy can be found in Supplemental Digital Content 1: Supplementary Table S1. The inclusion and exclusion criteria for systematic review and meta-analysis are in Supplemental Digital Content 1: Supplementary Table S2. Only studies that compared outcomes of patients receiving conservative treatment versus early surgery were included in the meta-analysis. At the point of study selection, early surgery was defined as surgical intervention immediately after the diagnosis of spondylodiscitis (as opposed to delayed surgery). In the first abstract screening, conducted by two reviewers (SGT & ASMV), all original articles in the English language that reported on spondylodiscitis were included. Subsequently, only studies reporting on the management of spondylodiscitis which also fulfilled our inclusion criteria were included. All included papers were assessed for eligibility by two independent reviewers (SGT & ASMV). Any disagreements were resolved by consensus after discussion with a third (KV) and subsequently a fourth reviewer (RV).

Figure 1

(A) The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart outlining the study selection process. (B) A world map indicating the origin of the publications included in this study (n = 31)13,14,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52. Countries are coloured according to whether n = 1, 2, 5 or 13 studies from that country were included in this systematic review; the legend at the bottom indicates the colour coding. The following countries are coloured: United States of America (n = 13), United Kingdom (n = 1), France (n = 1), Italy (n = 1), Germany (n = 5), Austria (n = 1), Denmark (n = 2), Iraq (n = 1), India (n = 1), Taiwan (n = 1), Japan (n = 2), South Korea (n = 2). (C) A risk of bias summary plot for non-randomized studies, with a bar chart showing the distribution of risk-of-bias judgements for all included studies (n = 31) across the domains of the ROBINS-I tool, in percentages (%). At the bottom, the overall risk of bias, which represents the collated risk-of-bias judgements for all domains, is depicted.

Data analysis

All relevant data were extracted manually using the Covidence data collection tool16. A list of extracted variables can be found in Supplemental Digital Content 1: Supplementary Table S3. Where data were missing, the corresponding author of the respective study was contacted. All articles were critically appraised, and the risk of bias was determined against all the domains of the ROBINS-I tool by two independent reviewers (SGT & ASMV), with consensus reached by discussion with a third reviewer (KV)17. Results of the ROBINS-I analysis can be found in Supplemental Digital Content 1: Supplementary Table S4. Furthermore, the level of evidence for each included article was scored using the Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence Table (Supplemental Digital Content 1: Supplementary Table S5), as well as GRADE scoring (Supplemental Digital Content 1: Supplementary Table S6). Definitions of early and delayed surgery used by each study are shown in Supplemental Digital Content 1: Supplementary Table S7.

Egger’s regression and asymmetry test were used to assess publication bias (p < 0.05 was considered significant)18. Data preparation, statistical analysis, and forest plot synthesis were carried out using the meta package in R (version 4.0.4)19,20. Firstly, a proportional meta-analysis was performed for mean proportions of mortality and relapse/failure among patients treated with early surgery and conservative treatment. The mortality and relapse data included both in-hospital and follow-up mortality. Where longer or multiple follow-up periods were provided, the most acute short-term postoperative outcome data (30 days or 90 days) were used. All definitions of mortality and relapse/failure can be found in Table 1. Secondly, relative risk meta-analyses were computed for mortality and relapse/failure, and mean difference meta-analyses for length of stay. All outcome variable computations included 95% confidence intervals (CI), as well as heterogeneity measured by the I² statistic21,22. The R code used is available in Supplemental Digital Content 1: Supplementary Table S8. A detailed description of the computation, including subgroup meta-analyses, influence, and sensitivity analyses, is shown in Supplemental Digital Content 1: Supplementary File S1, and a detailed account of correlation analysis findings in Supplemental Digital Content 1: Supplementary File S2.
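To illustrate the relative risk pooling described in this section, the following is a minimal Python sketch. It is not the authors' R code (which is provided in Supplementary Table S8). It assumes a DerSimonian-Laird random-effects estimator on the log relative risk scale, which may differ from the estimator configured in the meta package, and all study counts are hypothetical values chosen for demonstration only.

```python
import math

def log_relative_risk(events_t, n_t, events_c, n_c):
    """Return (log RR, variance) for one study; assumes no zero event counts."""
    rr = (events_t / n_t) / (events_c / n_c)
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return math.log(rr), variance

def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled, se, tau2, i2_percent)."""
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q: weighted dispersion of study effects around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2, i2

# Hypothetical counts per study: (events, total) for early surgery, then conservative.
studies = [(8, 100, 14, 110), (5, 60, 9, 70), (20, 300, 41, 320)]
pairs = [log_relative_risk(*s) for s in studies]
pooled, se, tau2, i2 = pool_random_effects([p[0] for p in pairs], [p[1] for p in pairs])
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"RR = {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f}), I2 = {i2:.0f}%")
```

Pooling is done on the log scale because the sampling distribution of the log relative risk is closer to normal; the pooled estimate and its interval are exponentiated only for reporting.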

Table 1 Study characteristics of the included studies in this systematic review.

This systematic review was registered on PROSPERO (CRD42022312573) under the title “Early surgical intervention vs expectant management in spondylodiscitis: a systematic literature review and meta-analysis” on the 28th of February 2022.

Results

A total of 13,209 studies were screened. From these, 75 full texts were assessed using our inclusion criteria. A total of 31 studies were included in this systematic review. From these, 21 studies were also included in the meta-analysis (Fig. 1A). The total pooled sample size of the systematic review was 48,504 and the overall pooled sample size of the meta-analysis was 10,954 patients. A world map of publication origins is shown in Fig. 1B.

Out of the 31 included studies, 14 were deemed to have a ‘low’ risk of bias23,24,25,26,27,28,29,30,31,32,33,34,35,36; 11 a ‘moderate’ risk of bias13,14,37,38,39,40,41,42,43,44,45; 6 a ‘serious’ risk of bias46,47,48,49,50,51, and 1 study had a ‘critical’ risk of bias52 using the ROBINS-I tool17. A scoring explanation is available in Supplemental Digital Content 1: Supplementary Table S4, and a graphical summary in Fig. 1C. The OCEBM guidance was used to determine the level of evidence of each study: 21 studies were classified as level 2b, three studies as level 3b, and seven studies as level 4 (Supplemental Digital Content 1: Supplementary Table S5). The GRADE scoring (Supplemental Digital Content 1: Supplementary Table S6) showed that, in terms of the probability of the study findings being close to the estimated effect, 17 studies scored as moderate, 6 as high, 7 as low, and 1 as very low. The study characteristics are detailed in Table 1, and the main findings from each study are presented in Table 2 (excluding studies that focused purely on spinal epidural abscesses) and Table 3 (including only studies that focused purely on spinal epidural abscesses). Study characteristics are additionally presented graphically in Fig. 2A–D. Egger’s asymmetry test (Fig. 3A) indicated significant publication bias (p = 0.0082); however, a funnel plot (Fig. 3B) showed that no individual study skewed the publication bias regression analysis.
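For readers unfamiliar with Egger's test, the sketch below shows the underlying regression in plain Python: each study's standardized effect (effect divided by its standard error) is regressed on its precision (the inverse of the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is an illustrative sketch only, not the authors' analysis; the function name and input values are hypothetical. It returns the t statistic and degrees of freedom rather than a p-value, which would be read from a t distribution with k − 2 degrees of freedom.

```python
import math

def egger_regression(effects, standard_errors):
    """Ordinary least squares of standardized effect on precision (Egger's test)."""
    x = [1 / se for se in standard_errors]                    # precision
    z = [y / se for y, se in zip(effects, standard_errors)]   # standardized effect
    k = len(x)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")
    mean_x, mean_z = sum(x) / k, sum(z) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same standard error")
    slope = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z)) / sxx
    intercept = mean_z - slope * mean_x
    residual_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    residual_var = residual_ss / (k - 2)
    se_intercept = math.sqrt(residual_var * (1 / k + mean_x ** 2 / sxx))
    # The test statistic is the intercept divided by its standard error.
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": k - 2}

# Hypothetical log effect sizes and standard errors, for illustration only.
effects = [-0.9, -0.6, -0.5, -0.3, -0.2]
standard_errors = [0.50, 0.35, 0.30, 0.15, 0.10]
print(egger_regression(effects, standard_errors))
```

When every study estimates the same effect, the fitted line passes through the origin and the intercept is zero; small studies with systematically larger effects pull the intercept away from zero.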

Table 2 Detailed summary of the results from each of the included studies in this systematic review (excluding studies that focused only on spinal epidural abscesses).
Table 3 Detailed summary of the results from each of the included studies in this systematic review that only focussed on spinal epidural abscesses.
Figure 2

(A) Bar plot visualizing the number of prospective (n = 3), retrospective (n = 27) and ambispective (n = 1) studies included in the systematic review (n = 31)13,14,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52. (B) Bar plot visualizing the number of included studies (n = 31) that are cohort studies (n = 21), case series (n = 9) and case–control studies (n = 1). (C) Line plot displaying the number of studies for the following years of publication: 1996 (n = 1), 2002 (n = 2), 2004 (n = 1), 2009 (n = 2), 2010 (n = 1), 2013 (n = 1), 2014 (n = 2), 2017 (n = 3), 2017 (n = 1), 2018 (n = 1), 2019 (n = 1), 2020 (n = 4), 2021 (n = 3), 2022 (n = 7). Each year is indicated as a black circle, and the circles are connected by an interrupted line to visualise the trend more clearly. (D) Bar plot showing the sample size of each included study in the systematic review (n = 31). Studies are named alphabetically A–Z; each letter refers to the cited studies in synchronized order, which is also depicted in the legend on the right of the graph. The bar plot is interrupted to allow for adequate visualisation of all data points.

Figure 3

(A) An Egger’s asymmetry plot of all data points included in the meta-analysis (n = 21 studies)13,23,24,25,26,29,32,33,34,35,37,38,41,42,43,47,48,50,52; the x-axis represents the inverse of the standard error, and the y-axis the standardized treatment effect (as z-score). Different parameters of heterogeneity, including I², are shown at the top of the graph. A p-value < 0.05 is deemed significant and indicates publication bias. Egger’s asymmetry test yielded p = 0.0082, calculated by running an Egger’s regression (see Egger’s regression line) on the collated diagnostic odds ratios (DOR) and standard errors of all data used in the meta-analysis (n = 21), indicating significant publication bias. (B) A funnel plot of every study included in the meta-analysis (n = 21). The observed effect sizes (diagnostic odds ratio) are on the x-axis against a measure of their standard error on the y-axis. All studies fall roughly within the parameters of the funnel plot and there are no gross outliers, indicating that no individual study skews the publication bias regression analysis. (C) The effects of early surgery versus conservative treatment for spondylodiscitis in terms of (a) clinical [non-neurological] outcomes, (b) neurological outcomes, and (c) overall outcomes, visualized as a harvest plot. The effects are stratified into three columns: early surgery has better outcomes than conservative treatment (“Early surgery +”), there is no difference between the two treatment modalities (“No difference”) and conservative treatment has better outcomes than early surgery (“Conservative +”). A rectangle represents a single study, unless a number is specified at the bottom of the rectangle, e.g. × 2 (= two studies). The colours of the rectangles correspond to the study design: black (retrospective), grey (ambispective), white (prospective). 
The number on top of the rectangle specifies the overall risk of bias (in line with the risk of bias analysis, with 4 implying low risk of bias, 3 moderate risk, 2 serious risk and 1 critical risk). The height of the rectangle directly correlates to the risk of bias in outcome measurement, and to the aforementioned number on top of the rectangle. Definitions for clinical and neurological outcomes are as follows. Clinical outcomes pool the different definitions used by different studies, including prognosis, recurrence, hospital stay, mortality rates, and lab parameters; further in-depth investigation of these can be seen in the meta-analysis. Neurological outcomes were split into two categories: the first being the presence or absence of neurological deficits, and the second being a graded scale of neurological deficits based on the American Spinal Injury Association Scale (ASIA scale).

Treatment outcomes

Conservative treatment

Conservative treatment mostly consisted of intravenous and/or oral antibiotics. The antibiotic regimen was not specified in most studies13,24,26,28,29,30,31,32,33,34,35,39,41,42,45,46,50,51,52. Several studies mentioned that antibiotic therapy was targeted toward an isolated organism23,37,40,43,47,49. However, blood or tissue culture positivity rates ranged from 24%26 to 93%51, meaning that in several cases, broad-spectrum antibiotics were required. When antibiotic regimes were specified, common treatments included vancomycin38,44,48, beta-lactams37,38,40,44,49, and linezolid44,49 among others. Where antibiotic therapy duration was specified, the average duration ranged between 4 and 12 weeks27,39,40.

Early surgical treatment

The most common operations performed were laminectomies, debridement surgeries, and decompression surgeries. Several different approaches were used for surgery, with a posterior approach being the most referenced (17 studies)13,24,25,28,30,31,37,39,40,43,45,46,47,48,49,51,52, followed by an anterior approach (13 studies)24,25,28,31,37,39,40,45,46,48,50,51,52. The most common indication for surgery was the presence or worsening of a neurological deficit (n = 19 studies)13,23,24,25,26,27,32,35,37,38,39,40,42,44,45,47,48,50,52, followed by failure of conservative management with antibiotics (n = 12 studies)23,24,25,26,27,29,35,35,37,44,50,50. Definitions of early surgery were heterogeneous, and a list of definitions used can be found in Supplemental Digital Content 1: Supplementary Table S7. Twenty studies did not provide information on how much time had elapsed between patient admission or diagnosis and surgery23,24,25,26,27,30,31,34,35,38,39,41,42,43,44,45,46,48,50,51. Five studies reported that patients had surgery ‘immediately’ once the diagnosis was made, but did not define this time frame quantitatively29,37,40,49,52.

Early surgical treatment vs conservative treatment

A graphical summary of qualitative comparative findings is shown in Fig. 3C. Ten studies stated that the clinical outcomes (non-neurological) of early surgical treatment were superior14,27,28,32,39,40,44,48,50,52, while six studies stated that there was no significant difference between the two modalities (Fig. 3C[a])23,25,33,41,42,51. No studies reported that conservative treatment had superior clinical outcomes. It is noted, however, that a range of definitions was used to determine clinical outcomes in patients including prognosis, recurrence, hospital stay, mortality rates, and lab parameters. The definition of neurological outcomes was split in two categories—the first being the presence or absence of neurological deficit14,39,40,44,51 and the second being a graded scale of neurological deficit based on the American Spinal Injury Association Scale (ASIA scale)13,23,28,50. In terms of these neurological outcomes, six studies reported that surgical treatment resulted in superior neurological outcomes13,14,28,39,40,50, one study reported that conservative treatment resulted in superior neurological outcomes44, and three studies reported that there was no significant difference between the two modalities (Fig. 3C[b])23,42,51. Sixteen studies stated that overall, when taking into account both neurological and clinical outcomes, early surgery yielded better outcomes13,14,24,26,28,29,32,34,39,40,43,44,48,50,51,52, while 10 studies stated that there was no difference23,25,27,30,31,33,37,41,42,47. No study stated that conservative treatment was superior (Fig. 3C[c]).

Meta-analysis

Mortality

For mortality, eleven studies13,23,24,26,29,32,35,37,41,42,43 (five scoring moderate risk of bias13,37,41,42,43) with a pooled sample size of n = 8,798 patients were included. The pooled proportion of mortality among patients treated with early surgery was 0.08 (CI 95% 0.04–0.15), or 8% (Fig. 4A), and 0.13 (CI 95% 0.09–0.20), or 13% (Fig. 4B), for patients treated conservatively.

Figure 4

Four forest plots visualizing the proportions of mortality and relapse/failure in spondylodiscitis following early surgical management (treatment arm) versus conservative management (control arm) are shown, pooling the results of all the studies included in the meta-analysis. (A) Pooled proportional mortality after early surgery, (B) pooled proportional mortality after conservative treatment, (C) pooled proportional relapse/failure after early surgery, (D) pooled proportional relapse/failure after conservative treatment. The size of the grey square of the “Proportion” visual correlates to study sample size and the straight line indicates the confidence interval. The diamond at the bottom indicates the overall pooled proportion. Heterogeneity is indicated by the chi-squared statistic (I²) with associated τ² and p-value. The 95% confidence intervals (CI) are shown in square brackets ([ ]). A p-value < 0.05 is deemed significant. Furthermore, for every study the following are displayed: study author with publication date (“Study”), total sample size for each study for the respective treatment arm (“Total”), number of deaths/relapses (“Events”) per respective treatment arm, proportion of deaths/relapses (“Proportion”), test for significance of overall effect size as tn and p-value, and weighting of each study in percentage (%).

Relapse/Failure

For relapse/failure, defined as the need for repeat surgery or admission after initial treatment, eleven studies13,23,24,25,33,34,35,41,42,47,48 (two scoring serious risk of bias47,48 and three scoring moderate risk13,41,42) were included with a pooled overall sample size of n = 2,196 of surgically and conservatively treated patients. The pooled proportion of relapse/failure among patients treated with early surgery was 0.15 (CI 95% 0.09–0.23), or 15% (Fig. 4C), and 0.21 (CI 95% 0.12–0.34), or 21%, for patients treated conservatively, in the random effects model (Fig. 4D).

Relative risk reduction

The pooled relative risk (RR) of mortality for early surgery compared with conservative treatment was 0.61 (CI 95% 0.40–0.82) (p < 0.01) (Fig. 5A), indicating a 39% relative risk reduction with early surgery. The pooled RR of relapse/failure for early surgery compared with conservative treatment was 0.60 (CI 95% 0.39–0.82) (p < 0.01) (Fig. 5B), indicating a 40% relative risk reduction with early surgery over conservative treatment.
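The percentages quoted here follow from the pooled relative risks by simple arithmetic: the relative risk reduction is one minus the relative risk. A minimal sketch, using only the two pooled values reported in this section (the function name is hypothetical):

```python
def relative_risk_reduction(relative_risk):
    """Percentage relative risk reduction implied by a relative risk."""
    return (1 - relative_risk) * 100

# Pooled relative risks as reported in the text for early surgery vs conservative treatment.
for outcome, rr in [("mortality", 0.61), ("relapse/failure", 0.60)]:
    print(f"{outcome}: RR {rr:.2f} -> {relative_risk_reduction(rr):.0f}% relative risk reduction")
```

Note that this is a relative measure; it does not by itself describe the absolute difference in event rates between the two treatment groups.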

Figure 5

(A) A forest plot visualizing the treatment effect (“TE”) size as relative risk, comparing the mortality rate of spondylodiscitis following early surgical management (treatment arm) versus conservative management (control arm), pooling the results of all 11 studies included in the meta-analysis. The size of the grey square of the “Relative Risk” visual correlates to study sample size and the straight line indicates the confidence interval. The diamond at the bottom indicates the overall pooled relative risk. The red bar below it indicates the prediction interval. Heterogeneity is indicated by the chi-squared statistic (I²) with associated τ² and p-value. The 95% confidence intervals (CI) are shown in square brackets ([ ]). A p-value < 0.05 is deemed significant. Furthermore, for every study the following are displayed: study author with publication date (“Study”), total sample size for each study (“Total”), standard error of the treatment effect (“seTE”), test for significance of overall effect size as tn and p-value, and weighting of each study in percentage (%). The weighting of each study is derived from the inverse of the variance of each study's effect estimate. This means that more weight is given to studies with more precise estimates, that is, those with less variability in their outcomes, giving a balanced representation of the available data in the pooled analysis. A significant pooled relative risk was yielded overall (p < 0.01), indicating that early surgical management vs conservative management has a relative risk of 0.61 for overall mortality. Effectively this means that early surgical management of spondylodiscitis achieves a 39% relative risk reduction (overall mortality) when compared to conservative management. 
(B) A forest plot visualizing the treatment effect (“TE”) size as relative risk, comparing the relapse/failure/recurrence rate of spondylodiscitis following early surgical management (treatment arm) versus conservative management (control arm), pooling the results of all the 17 studies included in the meta-analysis. The size of the grey square of the “Relative Risk” visual correlates to study sample size and the straight line indicates the confidence interval. The diamond at the bottom indicates the overall pooled relative risk. The red bar below it indicates the prediction interval. Heterogeneity is indicated by the chi-squared statistic (I²) with associated τ² and p-value. The 95% confidence intervals (CI) are shown in square brackets ([ ]). A p-value < 0.05 is deemed significant. Furthermore, for every study the following are displayed: study author with publication date (“Study”), total sample size for each study (“Total”), standard error of the treatment effect (“seTE”), test for significance of overall effect size as tn and p-value, and weighting of each study in percentage (%). A significant pooled relative risk was yielded overall (p < 0.01), indicating that early surgical management vs conservative management has a relative risk of 0.60 for relapse/failure/recurrence. Effectively this means that early surgical management of spondylodiscitis achieves a 40% relative risk reduction (relapse/failure/recurrence) when compared to conservative management. (C) A forest plot visualizing the treatment effect (“TE”) size as mean difference, comparing the mean length of hospital stay (in days) of spondylodiscitis patients following early surgical management (treatment arm) versus conservative management (control arm), pooling the results of all the studies included in the meta-analysis. 
The size of the grey square of the “Mean Difference” visual correlates to study sample size and the straight line indicates the confidence interval. The diamond at the bottom indicates the overall pooled mean difference. The red bar below it indicates the prediction interval. Heterogeneity is indicated by the chi-squared statistic (I²) with associated τ² and p-value. The 95% confidence intervals (CI) are shown in square brackets ([ ]). A p-value < 0.05 is deemed significant. Furthermore, for every study the following are displayed: study author with publication date (“Study”), total sample size for each study (“Total”), standard error of the treatment effect (“seTE”), test for significance of overall effect size as tn and p-value, and weighting of each study in percentage (%). A significant pooled mean difference was yielded overall (p < 0.01), indicating that early surgical management vs conservative management has a − 7.75 day mean difference in overall length of stay; that is, surgery is associated with a mean 7.75 day reduction in length of stay. 
(D) A correlation matrix visualizing the relationships between the following parameters among all studies included in the systematic review (n = 31): date of publication, lumbar location of infection, proportion of females overall, dropout rate, proportion of intravenous drug users, sample size, cervical location of infection, proportion of epidural abscesses, proportions of diabetics, mean overall relapse/failure rate, proportion of positive cultures (tissues and blood), relapse/failure rate in conservatively treated patients (“Relapse failure [C]”), relapse/failure rate in surgically treated patients (“Relapse failure [S]”), proportion of diabetics in conservatively treated patients, proportion of patients with diabetes, thoracic location of infection, mean age of study population, mortality rate overall, proportion of diabetics in surgically treated patients, combined thoracic and lumbar location of infection, mean overall mortality, mean mortality in surgically treated patients, proportion of nephropathy in surgically managed patients (“Nephropathy [S]”), and mean mortality in conservatively treated patients. The legend bar at the right of the matrix explains the colouring. A red hue indicates a negative association between two parameters, and a blue hue a positive association. One asterisk (*) indicates a statistical significance of p < 0.05, two asterisks (**) indicate p < 0.01, and three asterisks (***) indicate p < 0.001.

Length of stay

For length of stay, eight studies13,24,32,33,34,38,50,52 were included with a pooled overall sample size of n = 8,481: four scoring a low risk of bias24,32,33,34, two scoring a moderate risk13,38, one scoring a serious risk50, and one scoring a critical risk of bias52. The overall mean difference between early surgical management and conservative management was − 7.75 (CI 95% − 11.98 to − 3.51) (p < 0.01) (Fig. 5C), indicating that early surgical management of spondylodiscitis achieves a length of stay reduction of 7.75 days per patient when compared to conservative treatment.
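As a sketch of how a single study contributes to a mean difference meta-analysis such as this one, the following Python function computes the unstandardized mean difference in length of stay and a normal-approximation 95% confidence interval from group means, standard deviations and sample sizes. The function name and the input values are hypothetical and are not taken from any included study.

```python
import math

def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c, z=1.96):
    """Mean difference (treatment minus control), its standard error, and a CI."""
    md = mean_t - mean_c
    # Standard error for two independent groups with unequal variances.
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return md, se, (md - z * se, md + z * se)

# Hypothetical length of stay (days): early surgery arm, then conservative arm.
md, se, (low, high) = mean_difference(30.0, 12.0, 80, 38.0, 15.0, 90)
print(f"MD = {md:.2f} days (95% CI {low:.2f} to {high:.2f})")
```

With treatment minus control as the sign convention, a negative mean difference corresponds to a shorter stay after early surgery, matching the direction of the pooled estimate reported above.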

SEA-only and SEA-excluded analyses

Six additional subgroup meta-analyses were run, two on mortality, two on relapse/failure, and two on length of stay: for each outcome variable, a meta-analysis was computed including only studies that focus solely on patients with spinal epidural abscesses (SEA); and then a meta-analysis was computed excluding the studies that focus solely on patients with SEA (Supplemental Digital Content 1: Supplementary Fig. S1A–F). The meta-analysis on relapse/failure including studies that only focussed on patients with SEA yielded 0.74 RR (CI 95% 0.68–0.80) (p < 0.01), for mortality 0.56 RR (CI 95% 0.22–0.89) (p < 0.01), for length of stay a mean difference of − 6.53 (CI 95% − 13.13 to 0.08) (p = 0.05). The meta-analysis on relapse/failure excluding studies that only focus on patients with SEA yielded 0.46 RR (CI 95% 0.12–0.80) (p = 0.02), for mortality 0.67 RR (CI 95% 0.24–1.10), with t = 6.70 (p = 0.02), for length of stay a mean difference of − 6.53 (CI 95% − 13.13 to 0.08) (p = 0.05).

Influence analysis and linear regression

The exclusion of outlier studies based on a set of three influence analyses (Supplemental Digital Content 1: Supplementary Figs. S2, S3, S4) did not yield a significant change in effect size (Supplemental Digital Content 1: Supplementary Figs. S5, S6, S7). The exclusion of outlier studies based on high levels of risk of bias scoring did not yield any significant changes to the effect size of any of the outcome variables (Supplemental Digital Content 1: Supplementary Figs. S8, S9). The meta-regressions scored the influence of all co-variates on the overall effect size of the relapse/failure meta-analysis, mortality meta-analysis, and length of stay meta-analysis (Table 4). Significant co-variates (p < 0.05) were found only for the relapse/failure meta-analysis: “IVDU” and “diabetes”. None of the exclusion subgroup meta-analyses (excluding studies with high proportions of diabetics, and studies with high proportions of intravenous drug users) yielded strong differences in the meta-analysis effect size (Supplemental Digital Content 1: Supplementary Figs. S10, S11).

Table 4 Mixed-effects single-variate meta-regression.

Multivariate correlation analysis

In Fig. 5D, a multivariate correlation matrix visualises and compares the occurrence of all numerical study characteristics and patient characteristics, extracted from all studies included in the systematic review (n = 31). It confirmed the influence of IVDU (positive prognostic factor in surgically managed patients), and diabetes (negative prognostic factor). An important positive prognostic factor was found to be a cervical localisation of infection (p < 0.01). Important negative prognostic factors were found to be: thoracic and/or lumbar location of infection (p < 0.001), positive cultures (tissues and blood) (p < 0.01), presence of epidural abscesses (p < 0.05), and advanced age (p < 0.05). A list of all correlations can be found in Supplemental Digital Content 1: Supplementary File S2.

Discussion

This is the first meta-analysis to compare early surgical versus conservative management for spondylodiscitis. The meta-analysis included 21 studies, comprising data from 10,954 patients. The findings showed that early surgery had lower mortality rates (8% vs. 13% for conservative treatment) and lower relapse/failure rates (15% vs. 21%). Early surgery was also associated with a hospital stay that was 7.75 days shorter per patient. These results consistently favoured early surgical management for pyogenic spondylodiscitis.

Surgical debridement is a widely accepted therapy for the treatment of infectious diseases, to reduce the infection load and facilitate faster infection control, while also providing tissue samples that may help to optimise adjunct antibiotic therapy53,54,55. Generally, surgery is most effective for infections poorly penetrated by antibiotics, as well as locally contained infections such as abscesses56,57,58. Interestingly, however, our meta-analysis found that while early surgery was more effective than conservative therapy for patients with purely SEA, early surgery was even more effective in spondylodiscitis without SEA (10.06-day versus 6.5-day reduction in length of stay; 44% versus 33% reduction in mortality; 54% versus 26% reduction in relapse rate).

This finding raises a question: could the mechanism by which surgery achieves better outcomes for spondylodiscitis patients involve more than just debridement? One hypothesis suggests that the spinal stabilization achieved by surgical intervention may be a more substantial contributing factor59,60,61,62. Even though antibiotics are essential in treating the infection, they are unable to provide spinal stability59,60,61,62,63,64,65. Infection may lead to spinal macro-instability, predisposing patients to experience more pain, decreased postural control, and a decreased arc of movement. However, we recognize that the existing evidence base may not be robust enough to draw definitive conclusions about the mechanism, and we invite further studies to explore this hypothesis.

So how should this study inform clinical practice? Whilst we undertook an exhaustive search, enabling the largest pooled analysis of its kind, alongside multiple robust approaches to managing data heterogeneity, the source evidence was ultimately largely retrospective and/or cohort by design, suffered from heterogeneity in outcome reporting and definition, and held a moderate risk of bias. Furthermore, the included studies largely did not report on the use of intra-operative, localised antibiotics, which have shown promising results in recent studies; hence it was not possible to perform a sensitivity analysis on this parameter66. Despite the seemingly promising outcomes associated with early surgery, we therefore recognize and emphasize the limitations inherent in our study. It is also crucial to account for the probable selection bias in these studies, where the healthiest patients were more likely to be selected for early surgery. This selection bias may partially explain the observed lower mortality and relapse rates in the early surgery group. Moreover, apart from differences in patient health, disease severity may also influence the choice and timing of treatment, as well as outcomes. However, most studies did not provide data on disease severity. Potentially, the surgical approach may act as a proxy marker of disease severity; however, the data on surgical approaches were too heterogeneous to be compared. Future studies reporting on disease severity, as well as using consensus-based and comparable operative protocols, will be required to allow for robust sensitivity analyses. Furthermore, there was a statistical suggestion of publication bias, although extensive subgroup analysis did not identify specific outlying studies or factors.
Considering these limitations, the absolute changes in outcome thresholds in a population with probable selection bias, where relapse/failure after early surgery is still high (15% versus 21%), remain difficult to interpret. No study considered the health economics of early surgery, and, superficially, saving eight hospital bed days may not be a sufficient trade-off for the costs and risks of routine surgery. Given that the reconfiguration of services needed to enable early surgery would be substantial (as spinal surgery is a tertiary specialty), it is clear that there remain significant knowledge translation gaps. The most striking finding may be the lack of any randomised comparison. This is striking for three reasons: firstly, the strong rationale and current evidence; secondly, the significant and increasing burden of spondylodiscitis; and finally, the evidence of field-wide equipoise, a premise for any randomised comparison.
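
As an aid to reading the absolute figures, the short sketch below converts the pooled event rates quoted in this discussion (mortality 8% vs. 13%; relapse/failure 15% vs. 21%) into a crude absolute risk reduction, a crude relative risk reduction, and a number needed to treat. It is illustrative only: the function name `absolute_effect` is hypothetical, the calculation uses unadjusted pooled proportions rather than the pooled risk ratios of the meta-analysis (so the crude relative reductions will not match the reported estimates), and it makes no correction for the probable selection bias described above.

```python
def absolute_effect(control_pct: float, treated_pct: float) -> dict:
    """Crude absolute and relative effect from two event rates given in percent.

    Assumption: rates are simple pooled proportions; no weighting or adjustment.
    """
    if not (0 < control_pct <= 100 and 0 <= treated_pct <= 100):
        raise ValueError("rates must be percentages, with a non-zero control rate")
    arr_pct = control_pct - treated_pct   # absolute risk reduction, in percentage points
    rrr = arr_pct / control_pct           # crude relative risk reduction, as a fraction
    # Number needed to treat: patients treated per one event avoided.
    nnt = 100.0 / arr_pct if arr_pct > 0 else float("inf")
    return {"arr_pct": arr_pct, "rrr": rrr, "nnt": nnt}


# Rates as quoted in the text: conservative (control) versus early surgery (treated).
mortality = absolute_effect(control_pct=13.0, treated_pct=8.0)
relapse = absolute_effect(control_pct=21.0, treated_pct=15.0)

for name, r in (("mortality", mortality), ("relapse/failure", relapse)):
    print(f"{name}: ARR {r['arr_pct']:.1f} points, "
          f"crude RRR {r['rrr']:.0%}, NNT {r['nnt']:.1f}")
```

Under these assumptions, the absolute differences are 5 and 6 percentage points respectively, which is the scale on which any trade-off against the costs and risks of routine surgery would need to be judged.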

However, it is important to acknowledge the obstacles to enabling a randomised controlled trial on spondylodiscitis management. Firstly, there is no clear consensus on what constitutes early surgery or conservative therapy, and, perhaps most importantly, on what constitutes spondylodiscitis (particularly in the context of SEAs). The principal outcome measures or success criteria also remain undefined. Secondly, whilst there may be clinical equipoise at a field-wide level, this does not necessarily translate into institutional or physician-level equipoise; future efforts must be made to reduce local deviations from field-level recommendations of practice, including increased communication of the latest findings to raise awareness. Finally, the relative infrequency of spondylodiscitis, the population and treatment heterogeneity, coupled with the discrimination of outcome measures for pain or neurological function, suggest any trial would require a large, probably multi-national collaboration. This will be an immense logistical challenge and will require sufficient clinical buy-in and research funding. Despite these challenges, given the uncertainty of the clinical approach for spondylodiscitis, combined with variations in definitions and the lack of a uniform ICD-10 code for spondylodiscitis, the authors believe that these deficiencies demand clinical equipoise to enable randomised comparison, as well as expert consensus on treatment and pathology definitions, in order to provide the best care for spondylodiscitis patients.

Conclusion

This meta-analysis, with an overall pooled sample size of 10,954 patients, suggests that early surgical management may be more effective than conservative therapy for spondylodiscitis, and is associated with a 40% risk reduction in relapse/failure, a 39% risk reduction in mortality, and a reduction of 7.75 days per patient in length of hospital stay (p < 0.01). Excluding SEAs, these benefits were magnified. However, given the modest quality of the source evidence, probable selection bias, and remaining unanswered questions critical for implementation, we recommend treating these findings with cautious optimism. Recognising the increasing burden of the disease and the existing limitations of current research, there is a clear need for a well-designed, multi-national randomised controlled trial.