Introduction

A growing number of studies demonstrate the role of the so-called ‘secondary injury’ in spinal cord injury (SCI). The traumatic event initiates a cascade of secondary injury mechanisms such as vascular changes, electrolyte shifts, accumulation of excitotoxic neurotransmitters, inflammation and loss of energy metabolism.1, 2, 3, 4, 5, 6, 7, 8, 9 Persistent compression of the spinal cord is a potentially reversible cause of secondary injury.10

An understanding of these pathophysiological mechanisms led to the studies of the National Acute Spinal Cord Injury Study (NASCIS) group on the use of high-dose methyl-prednisolone. The NASCIS studies, although of borderline statistical significance, provided some clinical evidence for the concept of secondary injury and also confirmed the role of timing in the treatment of acute SCI.11, 12, 13

The question then arises whether this concept of a ‘time window’ should be applied to surgery as well. Experimental data have demonstrated that neurological recovery is enhanced by early decompressive procedures.14, 15, 16, 17, 18, 19, 20 On the other hand, early surgery can cause deterioration of respiratory, haemodynamic and neurological functions. In acute systemic trauma, there is a greater probability of failure of alignment and fusion, and surgical treatment may be precluded by the lack of specific equipment and experienced personnel.21, 22 Furthermore, definitive and unequivocal evidence to support the practice of early or late surgery is still lacking in clinical studies.21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36 Recently, in an evidence-based review of the literature, Fehlings and Tator10 concluded, on the basis of current knowledge, that early decompression can be considered a practice option.37 Clinically, the neurological examination within 24 h of injury is fraught with difficulty. The clinician can hardly rely on its accuracy for prognosis or for accurate quantification of loss. Owing to sedation, analgesia and often poor cooperation of the patient, the neurological impairment is often overestimated, with implications for the subsequently calculated size of the neurological gain. Heterogeneity in the timing of the first examination makes it still harder to compare results accurately across studies.

For all these reasons, the role, rationale and timing of surgery after SCI continue to be debated. In the present paper, the clinical literature on SCI was reviewed with emphasis on the role of early decompression in improving neurological outcome. The outcome of patients undergoing a decompressive procedure within 24 h was compared with that of patients treated through delayed decompressive procedures and that of nonoperatively managed patients.

The review was performed by means of a meta-analytic procedure defined as ‘a statistical analysis of a collection of analytic results for the purpose of integrating the findings’38 to arrive at an estimate of the advantages of early surgical procedures in acute SCI.

Subjects and methods

Study identification

This review used a MEDLINE search of the English-language medical literature from 1966 to 2000 with the Medical Subject Headings (MeSH) ‘Spinal Cord Injury’ and ‘Therapeutics’. Since computer searches have been shown to miss relevant articles, the review was supplemented with manual searches of original and review articles and monographs. The reference lists of identified publications were also checked for additional trials. The Cochrane library was also reviewed.

Eligible studies

All studies on treatment strategy for acute SCI published in journals or books from 1966 through to December 2000 were eligible for our meta-analysis.

Data extraction

Two readers (AC and SC) independently assessed eligibility of studies and extracted data from the included studies. Observers were blinded to the names of the authors and their institutions, the names of the journals, sources of funding and acknowledgments. In case of disagreement between the two readers, consensus was reached by joint review of the study.

More than 3600 titles were found in the MEDLINE search. After an abstract review, 108 articles were selected for complete review. Articles were divided according to class of evidence: (1) Class I evidence includes data collected in prospective randomized trials; (2) Class II evidence includes data from prospective nonrandomized studies; (3) Class III evidence is based on retrospectively collected data; (4) Class IV evidence consists of case reports, anecdotal reports and expert opinion.

The patients, all harbouring acute SCI, were pooled into three groups according to treatment. Patients who underwent a decompressive procedure within 24 h were considered early treated; patients treated later than 24 h were considered late treated. The third group was composed of conservatively managed patients.

In each group, patients were divided on the basis of their neurological status at admission into two subgroups, ‘complete’ and ‘incomplete’. Patients with incomplete paralysis were further assigned to three subgroups, B, C and D, according to the Frankel's scale, because the degree of recovery may differ with the degree of residual neurological function.
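The grouping described above can be sketched in a few lines of Python. The record layout, the field names and the choice to count exactly 24 h as early are assumptions made for illustration; the text states only that decompression within 24 h was considered early and that Frankel B, C and D patients formed the incomplete subgroups.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only.
    frankel_at_admission: str                 # one of "A", "B", "C", "D"
    hours_to_decompression: Optional[float]   # None when managed without surgery


def treatment_group(patient: Patient) -> str:
    """Assign the treatment group used for pooling."""
    if patient.hours_to_decompression is None:
        return "conservative"
    # Assumption: exactly 24 h still counts as early.
    return "early" if patient.hours_to_decompression <= 24 else "late"


def deficit_subgroup(patient: Patient) -> str:
    """Frankel A is treated as complete; B, C and D as incomplete."""
    grade = patient.frankel_at_admission
    if grade == "A":
        return "complete"
    if grade in ("B", "C", "D"):
        return "incomplete"
    raise ValueError(f"no cord deficit subgroup for Frankel grade {grade!r}")


example = Patient(frankel_at_admission="C", hours_to_decompression=8.0)
print(treatment_group(example), deficit_subgroup(example))
```

A grade outside A to D raises an error rather than being silently pooled, which mirrors the exclusion of patients without a cord deficit.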

Each patient was checked against the inclusion criteria: (1) acute nonpenetrating SCI; (2) neurological deficit attributable to cord damage, thereby excluding patients with nerve root deficit only; (3) at least 6 months of follow-up; (4) recovery of distal cord functions, thereby excluding patients with root recovery only.

Published series often included some patients who did not meet the inclusion criteria. In these cases, only the patients who met the stated criteria were included in the analysis.

Several series contained the same patients, either in duplicate or in updated form. The most complete published report was used in the analysis, and care was taken to avoid inclusion of duplicated data.

Some series contained both eligible and ineligible patients, for instance patients treated early and late, and sometimes data could not be confidently assigned to one group or the other. For meta-analytical purposes, we selected only articles that presented an analytical review of the series, that is, those in which data were furnished for each patient, and excluded those in which only pooled data and results were provided.

Data analysis

The same authors who assessed the eligibility of studies and patients also assessed the results of the three treatment options. The percentage of patients with improvement in spinal cord sensory or motor function was calculated for each study separately and for all studies combined. A patient was considered neurologically improved after gaining at least one grade on the Frankel's scale. In studies in which a different evaluation scale was used, the outcome measures were adapted, wherever possible, to the Frankel's scale. Corresponding exact 95% confidence intervals (CIs) for single proportions were determined.

The analysis compared results for the three treatment options (early or late decompression and nonoperative management) for patients with complete and incomplete neurological deficits. The analysis was accomplished using both 2 × 3 and 2 × 2 χ2 matrices for comparison of all three treatment options together and of each treatment against the other, respectively.

In each subgroup of patients, an analysis of homogeneity of combined data was performed to assess whether pooling was appropriate. Lack of identification and inclusion of trials through publication or location bias was assessed by a qualitative analysis of the asymmetry of the plots.32 A χ2 test for homogeneity across studies was then used to assess quantitatively the homogeneity.
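A χ2 test for homogeneity across studies can be sketched as a Pearson statistic on a 2 × k table of improved and unimproved patients, with k − 1 degrees of freedom. The function name and the per-study counts below are hypothetical and are not taken from any of the pooled series; the exact formulation used in the original analysis is not described, so this is one plausible reading.

```python
def homogeneity_chi2(studies):
    """Pearson chi-square for equality of improvement rates across studies.

    `studies` is a list of (improved, total) pairs, one per study.
    Returns (statistic, degrees_of_freedom).
    """
    total_improved = sum(x for x, _ in studies)
    total_patients = sum(n for _, n in studies)
    pooled_rate = total_improved / total_patients
    if pooled_rate in (0.0, 1.0):
        # Every study has the same rate (0% or 100%): no heterogeneity.
        return 0.0, len(studies) - 1
    statistic = 0.0
    for improved, n in studies:
        expected_improved = n * pooled_rate
        expected_unchanged = n * (1 - pooled_rate)
        statistic += (improved - expected_improved) ** 2 / expected_improved
        statistic += ((n - improved) - expected_unchanged) ** 2 / expected_unchanged
    return statistic, len(studies) - 1


# Hypothetical per-study counts (improved, total).
similar = [(9, 10), (18, 20), (27, 30)]       # identical 90% rates
dissimilar = [(2, 10), (18, 20), (15, 30)]    # widely different rates

CRITICAL_DF2 = 5.991  # tabulated chi-square critical value, df = 2, alpha = 0.05
for label, data in (("similar", similar), ("dissimilar", dissimilar)):
    stat, df = homogeneity_chi2(data)
    print(label, round(stat, 2), df, "heterogeneous" if stat > CRITICAL_DF2 else "homogeneous")
```

A statistic above the critical value indicates that the study-level rates differ by more than chance would explain, in which case pooling deserves caution.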

To perform data analysis, the STATCALC 4.1 (ACASTAT Software, Ashburn, USA) and the INSTAT 3.00 (GRAPHPAD Software, San Diego, California, USA) software applications were used. A P-value of less than 0.05 was considered statistically significant.

Results

Patient population

In our literature review, we found only one prospective randomized study (Class I evidence)39 and eight prospective nonrandomized case series (Class II evidence).32, 34, 40, 41, 42, 43, 44, 45 Most studies reported on retrospective case series with historical controls (Class III evidence).

Early decompression

The literature search yielded 17 articles reporting clinical data of patients who underwent decompression within 24 h after the trauma. The articles were fully reviewed, but clear, analytic data for each patient were available in only 11 of 17 articles.

In these selected articles, the data concerned 409 patients, 287 treated for a cervical fracture and 122 for a thoracic trauma. The mean study size was 37.2 patients, with a minimum of one and a maximum of 70 patients. A total of 226 eligible patients who clearly fulfilled the selected criteria were considered in the meta-analysis: 119 with ‘complete’ neurological deficit and 107 with ‘incomplete’ deficit (Table 1).

Table 1 Case series reporting data of 226 patients who underwent early decompression for acute SCI considered eligible for meta-analysis

Late decompression

The literature search yielded 27 articles reporting clinical data of patients who underwent decompression later than 24 h after the trauma. Many series did not report demographic and clinical data in a manner that allowed for differentiation between eligible and ineligible patients, resulting in missing data. Therefore, only 13 of 27 articles proved suitable for meta-analytic review.

In these selected articles, the data concerned 827 patients, 342 treated for a cervical and 485 for a thoracic trauma. The mean study size was 63.6 patients, with a minimum of 20 and a maximum of 218. In total, 567 eligible patients who clearly fulfilled the selected criteria were considered in the meta-analysis: 242 with ‘complete’ neurological deficit and 325 with ‘incomplete’ deficit (Table 2).

Table 2 Case series reporting data of 567 patients who underwent late decompression for acute SCI considered eligible for meta-analysis

Nonsurgical group

The literature search yielded 13 articles reporting clinical data of patients who were conservatively treated. Here too, many series did not report demographic and clinical data in a manner that allowed for differentiation between eligible and ineligible patients, and only nine of 13 articles proved suitable for meta-analytic review.

In these selected articles, the data concerned 1335 patients, 535 treated for a cervical and 800 for a thoracic or thoraco-lumbar trauma. The mean study size was 102.7 patients, with a minimum of 21 and a maximum of 612. A total of 890 eligible patients who clearly fulfilled the selected criteria were considered in the meta-analysis: 558 with ‘complete’ neurological deficit and 332 with ‘incomplete’ deficit (Table 3).

Table 3 Case series reporting data of 890 patients who underwent conservative management for acute SCI considered eligible for meta-analysis

Meta-analysis outcome

a) In the early group, the rate of patients presenting an improvement of distal cord functions was 42% (95% CI: 33.1, 50.8%) in those presenting a complete neurological deficit at admission and 89.7% (95% CI: 83.9, 95.5%) in those presenting incomplete deficit.

b) In the late group, the rate of improvement was 8.3% (95% CI: 4.8, 11.8%) and 58.5% (95% CI: 53.1, 63.9%) in patients with complete and incomplete deficits, respectively.

c) In the conservative treatment group, an improvement of neurological status was recorded in 24.6% (95% CI: 21, 28.2%) of patients with complete neurological deficits, and 59.3% (95% CI: 54, 64.6%) in patients with incomplete neurological deficits (Figure 1).

Figure 1 Bargraph showing the percentage of patients with neurological improvement in the three groups of patients. The percentage of patients with complete (white bar) and incomplete neurological impairment (grey bar) who underwent early decompression was higher compared with that of late treated and conservatively managed patients. Error bars indicate 95% CI

The neurological outcome of patients in the different groups was compared using a 2 × 3 χ2 matrix (Table 4). A statistically significant difference was found among the three therapeutic options (χ2=55.4, degrees of freedom (df)=2, P<0.001 for complete and χ2=37.6, df=2, P<0.001 for incomplete patients). When the individual treatment modalities were compared against each of the others in 2 × 2 χ2 matrices (Tables 5 and 6), early decompression resulted in significantly better outcome compared with both late (χ2=58.1, df=1, P<0.001 for complete patients, and χ2=35.1, P<0.001 for incomplete) and conservative management (χ2=15, df=1, P=0.002 for complete patients, and χ2=33.7, P<0.001 for incomplete).

Table 4 Improvement rate in the three treatment groups in complete and incomplete patients
Table 5 Comparison of neurological improvement rate in early- versus late-treated patients
Table 6 Comparison of neurological improvement rate in early- versus conservatively-treated patients
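The 2 × 3 and 2 × 2 comparisons can be illustrated with a plain Pearson χ2 computation. The patient counts below are an assumption: they are back-calculated by rounding the reported improvement rates for patients with complete deficits (42% of 119, 8.3% of 242, 24.6% of 558) and are not copied from Tables 4 to 6. No continuity correction is applied, and the function names are illustrative.

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic and df for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def p_value(statistic, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("only df = 1 or 2 handled in this sketch")


# ASSUMED counts for patients with complete deficits (see lead-in).
improved = [50, 20, 137]            # early, late, conservative
totals = [119, 242, 558]
unchanged = [n - x for x, n in zip(improved, totals)]

three_way, df3 = chi2_statistic([improved, unchanged])                # 2 x 3
early_vs_late, df1 = chi2_statistic([improved[:2], unchanged[:2]])   # 2 x 2
print(round(three_way, 1), df3, p_value(three_way, df3))
print(round(early_vs_late, 1), df1, p_value(early_vs_late, df1))
```

With these assumed counts the two statistics come out at about 55.4 and 58.1, in line with the values reported for complete patients.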

Patients with incomplete neurological impairment were assigned to three subgroups, B, C and D, according to the Frankel's scale. Because different scales were used to assess the degree of neurological impairment, and because some articles did not clearly make this differentiation, the procedure was possible in all but five of the selected articles, corresponding to 89.1% of all eligible patients. Early decompression resulted in significantly better outcome compared with both late and conservative management in Frankel B, C and D patients (P<0.05) (Figure 2).

Figure 2 Bargraph showing the rate of patients with incomplete neurological deficit who underwent neurological improvement with regards to the different Frankel grades (B, C and D). The rate of ameliorated patients who underwent early decompression (black bars) was statistically higher compared with that of late decompressed (white bars) and conservatively managed patients (grey bars)

To test homogeneity across studies, a χ2 test was used for each selected subgroup of patients.

The χ2 test for homogeneity demonstrated that the subgroup of patients undergoing early surgery for incomplete deficit was homogeneous (χ2=11.9; df=9; P>0.05); this result confirmed the qualitative analysis of the asymmetry of the plots, which showed complete symmetry with overlapping of all the 95% CIs (Figure 3b). In contrast, all the other subgroups were characterized by asymmetric plots, suggesting heterogeneity of the collected data. This was confirmed by the quantitative test showing:

(1) Early-Complete: χ2=27.7; df=9.

(2) Late-Complete: χ2=118.1; df=9.

(3) Late-Incomplete: χ2=49.3; df=12.

(4) Nonoperative-Complete: χ2=61.9; df=5.

(5) Nonoperative-Incomplete: χ2=43.4; df=7.

In all five subgroups, P<0.05 (Figure 3a,c,d–f).
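The P-value thresholds quoted for the homogeneity tests can be checked from the reported χ2 statistics and degrees of freedom alone. The sketch below uses the standard finite-sum expressions for the upper tail of the χ2 distribution with integer degrees of freedom; the function name is illustrative, and this is not a reconstruction of how the original software computed P.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: 2*Q(chi) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1)),
    # where chi = sqrt(x), Q is the normal upper tail and phi the normal density.
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * density * total


# Statistics and degrees of freedom as reported in the text.
reported = {
    "Early-Incomplete": (11.9, 9),
    "Early-Complete": (27.7, 9),
    "Late-Complete": (118.1, 9),
    "Late-Incomplete": (49.3, 12),
    "Nonoperative-Complete": (61.9, 5),
    "Nonoperative-Incomplete": (43.4, 7),
}
for name, (stat, df) in reported.items():
    p = chi2_sf(stat, df)
    print(f"{name}: P = {p:.2g}", "(P > 0.05)" if p > 0.05 else "(P < 0.05)")
```

Only the early-incomplete subgroup stays above the 0.05 threshold, which agrees with the pattern described in the text.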

Figure 3 Analysis of homogeneity of pooled studies. Selected studies had been pooled on the basis of treatment and neurological status at admission: early-treated patients with complete (a) and incomplete neurological deficit (b); late-treated patients with complete (c) and incomplete neurological deficit (d); conservatively managed patients with complete (e) and incomplete neurological deficit (f). Error bars indicate 95% CI. The analysis of the symmetry of the plots offers a simple way to assess qualitatively the homogeneity of pooled studies. A symmetric plot, namely a complete overlapping of all the 95% CIs, indicates that the combined data are homogeneous. Only the plot displaying data of patients with incomplete neurological deficit who underwent early treatment showed an overlapping of all the error bars (b). In the other groups, the asymmetry of the plots indicates the heterogeneity of the pooled data

Discussion

The increasing understanding of the pathophysiology of acute SCI offers a potential for surgery to influence the natural history of this catastrophic event. Nevertheless, the indication, the timing and even the role of this intervention in improving neurological recovery remain controversial because of the lack of unequivocal evidence.10, 37, 42

The objective difficulties in designing and conducting large prospective randomized studies and, at the same time, the urgent need to clarify the role of surgery have spurred a reanalysis of the enormous quantity of data on SCI produced in the past decades. Fehlings et al37 recently reviewed the literature using the modern criteria of evidence-based medicine and arrived at a qualitative evaluation of the available data, concluding that there are Class III data suggesting a role for urgent decompression in the setting of bilateral facet dislocation, and in incomplete injuries in patients with neurological deterioration.10 The authors considered those observations as options because they were derived from retrospective data. We sought to complement those findings with a quantitative estimate of the effect of early surgery after acute SCI, using a meta-analytic approach.

Literature review

Although the efficacy of decompression after SCI in enhancing neurological recovery in animal models has been widely demonstrated,14, 15, 16, 17, 18, 19, 20, 46, 47, 48, 49 the review of the relevant clinical literature shows that most studies, comparing decompressive surgery with conservative management, actually fail to demonstrate, definitively, an advantage of surgery.21, 27, 32, 33, 34

Both in historical series such as those of Bedbrook50 or Maynard et al,51 and in the more recent ones of Tator et al32 and Wilmot and Hall,52 operative treatment turned out not to be superior. It is worth noting, however, that the time factor was somewhat overlooked in those as well as in other studies. Recently, the Leeds group led by Dickson reported a meta-analytical study to assess whether decompression could affect the neurological outcome of patients with thoracolumbar fractures. Their results demonstrated that surgery does not offer a significant advantage compared with conservative treatment with respect to neurological outcome.53 In that study, however, the timing of surgery was again not considered as a factor affecting outcome.

The question is, therefore, whether the ‘time factor’ may change the prognosis of spinal cord-injured patients.

Basic science investigations have provided considerable evidence regarding the acute pathophysiological events that follow cord compression and affect neurons, axons and blood vessels.1, 2, 3, 4, 5, 6, 7, 8, 9, 14, 15, 16, 17, 18, 19, 20 Some of these events, together constituting the ‘secondary injury’, evolve very early after the trauma.

The studies by the NASCIS group demonstrated that neural damage can be attenuated in the clinical setting by acting directly on the secondary injury mechanisms with high-dose methyl-prednisolone.11, 12, 13 Such studies had poor statistical strength, and a relatively recent meta-analysis refuted the proposition that the NASCIS protocols met the level of evidence required for recommendation.54 However, they remain the only Class I (prospective randomized) studies providing data on the possible existence of a time window in humans. In fact, although experimental studies demonstrated that early relief of compression improves neurological outcome in animal models, as first shown convincingly by Tarlov and Klinger19, 20 in dogs and later by Brodkey et al46 in cats, several clinical studies failed to show that patients treated early recovered better than those undergoing late surgery.21, 33, 39, 51

In the retrospective studies of Wagner and Chehrazi33 and Levi et al,21 comparing early- and late-treated patients, there were no differences in neurological recovery. Even in more recent studies based on prospective series, such as that of Duh et al,41 a better clinical outcome was recorded in patients treated either within 25 h or later than 200 h, without providing unequivocal evidence to support the practice of early or late surgery. Another study by Vaccaro et al,39 based on a prospective randomized series, found no difference between early and late surgery. The study of Vaccaro et al provides robust evidence since it is the only Class I clinical study on the role of timing in surgical treatment after SCI, but it has an important shortcoming: early treatment was defined as decompression performed within 72 h. This is clearly in contrast with the definition of ‘effective time window’ provided by the studies of the NASCIS group.

Accordingly, one of the problems of our study was the definition of early treatment. In general, there is a lack of consensus among authors concerning the concept of early surgery. Early treatment ranged between 6 h and more than a week in the reviewed literature.39, 51, 53, 55, 56 Even when performed within 24 h, which we selected as the time window for early surgery, the operation might be too late to reverse some of the secondary injury mechanisms that follow acute SCI. The only therapeutic window actually established in humans is the 8-h trauma-to-treatment window reported in NASCIS-2, in which methyl-prednisolone was used.11, 12, 13 The limit of 24 h was used in this study for two practical reasons: (1) very few reports provide data on earlier treatment; (2) fewer than 50% of patients can currently be admitted to spinal cord centres within 24 h.57, 58

Analysis of results and limitations of the study

The meta-analytic procedure consisted of extraction, pooling and statistical analysis of data from parent studies. Since only one prospective randomized study (Class I evidence)39 and eight prospective nonrandomized case series (Class II evidence)32, 34, 40, 41, 42, 43, 44, 45 were found in the literature search, the meta-analysis was based largely on studies reporting retrospective case series with historical controls (Class III evidence). The outcome of patients undergoing a decompressive procedure within 24 h was then compared with that of patients treated through delayed decompressive procedures and that of nonoperatively managed patients. Subjects were divided, on the basis of their neurological status at admission, into ‘complete’ and ‘incomplete’ deficit subgroups. Statistically, early decompression resulted in significantly better outcome compared with both conservative management (P<0.001) and late surgery (P<0.001 for complete patients, and P<0.001 for incomplete). Another interesting finding was that the outcome of all decompressed patients, combining the data of early and late treatment and thus overlooking the time factor, was similar to that of conservatively managed patients. This agrees with the observations of other authors32, 50, 51, 52, 53 and suggests that timing may play a role in surgical decompression.

Even if these results are credible and agree with the pathophysiological bases of SCI, their applicability deserves comment.

Patients with different degrees of ‘incomplete’ neurological impairment were pooled to achieve a large cohort of patients. In the light of possible bias deriving from such pooling, those patients were further assigned to three subgroups according to the Frankel's scale. The meta-analytical procedure was then performed for B, C and D patients separately. Again, early treatment turned out to be advantageous compared with late or conservative management. It should be noted, however, that because different scales were used to assess the degree of neurological impairment, and because some articles classified those patients simply as ‘incomplete’, this procedure, and consequently its results, are applicable only to 89.1% of all included patients.

Meta-analysis may be defined as ‘a statistical analysis that combines or integrates the results of several independent clinical trials considered by the analyst to be combinable’.59, 60 The terminology, however, is still debated and expressions used concurrently include ‘overview’, ‘pooling’ and ‘quantitative synthesis’. Although such studies represent one of the few methods that offer a rational, statistical approach to management decisions in the absence of definitive Class I evidence, the pooling of results from a particular set of studies may be inappropriate from a clinical point of view, because it produces a population ‘average’ effect.

This was particularly evident in our study, in which it was not possible to locate studies reporting analytic data for each of the several variables that may influence the outcome of this population of patients. One such variable is the incidence and type of complications occurring in the three groups. The complication rate may have influenced the neurological outcome, just as the treatment modality may have influenced the complication rate. Although the complication rate may be higher in patients undergoing early surgery, one could speculate that this would reinforce rather than weaken our findings. Nevertheless, the absence of homogeneous and comparable data on the rate, type, intensity and duration of such factors prevented us from including them in our analysis. Accordingly, we limited the study to the analysis of the relationship between two variables: the operation, that is, a spinal cord decompression, and the effect, that is, the recovery of distal cord functions.

This procedure, however, carries the risk of producing biased data. In fact, the variability of the selected and pooled data may have affected the interpretation of the results of our meta-analysis by producing an excessive ‘average’ effect. Therefore, we performed an analysis of homogeneity of each subgroup of patients in order to estimate how the interpretation of the results was influenced by all the variables not included in the study because of the lack of sufficient and comparable data. Homogeneity implies that the results of each single study are compatible with those of the others and with the combined data, and therefore represents an index of the applicability of the results. In other words, we can consider pooled data reliable only if the findings of the parent studies are compatible with each other. The comparison of the 95% CIs of parent studies, namely an analysis of the symmetry of the plots, represents a simple way to analyse the homogeneity of the selected studies (Figure 3).

The overlapping of 95% CIs of each study, which was considered an index of homogeneity of pooled data, was present only in the subgroup of patients with incomplete neurological impairment treated within 24 h (Figure 3b). This result was confirmed by the quantitative analysis of homogeneity, which was performed using a χ2 test across the parent studies.
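The qualitative criterion of overlapping 95% CIs can be expressed as a short check. The sketch assumes that "overlapping of all the 95% CIs" means that the intervals share at least one common value; for intervals on a line this is equivalent to every pair of intervals overlapping. The function name and the example intervals are hypothetical and do not correspond to any study plotted in Figure 3.

```python
def intervals_overlap(intervals):
    """True if all (lower, upper) intervals share at least one common point."""
    highest_lower = max(lower for lower, _ in intervals)
    lowest_upper = min(upper for _, upper in intervals)
    return highest_lower <= lowest_upper


# Hypothetical study-level 95% CIs for the improvement rate.
consistent_studies = [(0.70, 0.95), (0.80, 1.00), (0.75, 0.90)]
inconsistent_studies = [(0.10, 0.30), (0.25, 0.50), (0.45, 0.70)]

print(intervals_overlap(consistent_studies))    # all three share 0.80 to 0.90
print(intervals_overlap(inconsistent_studies))  # first and third never meet
```

Such a check is only a screening device; the χ2 test across parent studies remains the quantitative assessment.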

The other groups turned out to be heterogeneous. This result may depend on the procedure of extraction of data from the parent publications. In fact, we had to rely on data as provided by the authors, who were often the surgeons who had performed the operations. This may have led to different estimates of the recovery rate, especially in patients with complete neurological deficit. Patients with severe initial deficit and poor recovery may have been judged differently as improved or not and, consequently, assigned to different classes of the Frankel's or similar scales. It is also likely that the timing of the first neurological examination differed among the three groups, with a tendency to overestimate the degree of neurological loss, because of poor patient cooperation, in those examined and treated early. Equally, if the baseline examination was delayed and some neurological recovery had occurred in the meantime, the calculated percentage recovery would probably be underestimated.

Furthermore, we had to rely on nonhomogeneous descriptions of the neurological status on admission and on nonhomogeneous outcome measures on discharge or follow-up.

Furthermore, the time interval from injury to initial examination and from initial examination to follow-up was not exactly specified in most studies.

Concerning the patients treated late, we included both patients treated just after 24 h and patients treated several days later, which may have led to different outcomes.

The heterogeneity of the collected studies regarding conservatively managed patients can be explained by the fact that we included both series in which all patients were managed conservatively and series in which only patients with contraindications for surgery were treated nonsurgically. Indeed, when only those series in which all patients underwent conservative management are considered, the homogeneity of the subgroup with incomplete deficits increases significantly. Furthermore, patients who underwent nonoperative management were combined with patients conservatively managed. Nonoperative management consisted, in most cases, of early mobilization in a brace, whereas with conservative management recumbency for a long period was the main treatment. This may, therefore, have played a role in determining the heterogeneity of this group. In fact, there is literature documentation of neurological deterioration in association with postural hypotension during mobilization of incomplete SCI patients with biomechanically stable, healed spinal fractures.61

Owing to the aforementioned difficulties, although we found a statistically significant neurological advantage of early surgery in patients with incomplete deficits, we could not determine with any confidence a real neurological advantage of surgery within 24 h of injury in any of the groups.

Therefore, even though it is conceivable that early surgical treatment may offer some neurological advantage similar to that reported in experimental studies, a number of variables may influence the overall neurological outcome and the statistics by which it is measured. We were not able to identify or estimate the exact role of all these variables. This prevented us from drawing definitive conclusions. Early treatment therefore continues to represent a practice option. Prospective, controlled, randomized studies are still needed to clarify the exact timing and role of decompression.