Introduction

Many different types of physical interventions are routinely provided to people with spinal cord injury (SCI) as part of their rehabilitation programmes and ongoing care. They include interventions such as strength, fitness and gait training, splinting, stretching and hand therapy. These interventions are typically provided by physiotherapists, occupational therapists, exercise physiologists, medical practitioners and other health-care providers. Physical interventions often target specific impairments such as poor strength, cardiovascular fitness, skill and joint mobility, or impairments related to muscle extensibility, bone loss, pain or spasticity. Each of these impairments imposes activity limitations that directly or indirectly prevent patients from performing physical activities such as walking, using their hands, mobilizing in a wheelchair and attending to self-care. Physical interventions that address impairments invariably also address activity limitations. By reducing activity limitations, physical interventions address the ultimate aim of rehabilitation, namely to increase participation and thereby improve overall quality of life.

Large numbers of physical interventions are advocated for people with SCI. The challenge for clinicians is to ascertain which interventions are most effective. The best evidence for treatment effectiveness comes from high-quality randomized controlled trials.1, 2 Some of the key strategies important for minimizing bias in randomized controlled trials include concealing allocation, blinding assessors and performing intention-to-treat analyses.

Recent and comprehensive clinical guidelines have synthesized the evidence supporting physical interventions for people with SCI.3, 4 However, these guidelines do not always determine or interpret the size and associated uncertainty of between-group differences. In addition, they do not include some of the commonly administered physical interventions that have been evaluated within randomized controlled trials. The purpose of this systematic review was therefore to provide a quantitative analysis of all randomized controlled trials designed to determine the effectiveness of physical interventions for people with SCI.

Methods

Search strategy

The following databases were searched up until December 2007: Medline (from 1966), CINAHL (from 1982), Embase (from 1980), the Cochrane Central Register of Controlled Trials and the Physiotherapy Evidence Database (PEDro).5 A sensitive search strategy for identifying randomized controlled trials was used6 along with the following MeSH terms: parapleg$, quadripl$, tetrapleg$, wheelchair$ and spinal cord. This search strategy was adjusted for each database. In addition, the bibliographies of relevant systematic reviews and clinical guidelines were hand searched.

Inclusion criteria

The inclusion criteria were as follows:

Type of trials: Randomized controlled trials written in English. Crossover trials were included provided allocation to the treatment schedule was randomized. In trials that randomly allocated subjects to experimental groups but included data from non-randomized control conditions,7 only data from the randomized groups were included.

Type of participants: Trials in which at least 75% of participants had sustained a SCI. There were no restrictions on the basis of time since injury, type of injury or age.

Type of interventions: Trials involving the administration of a physical intervention typically provided by health-care professionals. Only trials that involved a treatment administered over more than one occasion were included. In trials that looked at the effects of one-off treatments as well as the effects of a series of treatments,8, 9 only the data reflecting the response to the series of treatments were included. Trials that examined the effectiveness of providing education or equipment were excluded, as were trials directed at respiratory care or skin management.

Type of comparisons: Trials comparing a physical intervention with control (including sham) or no intervention and trials comparing two or more physical interventions.

Types of outcomes: All physical and non-physical outcome measures were included.

Data collection and analysis

Two reviewers identified potentially eligible trials from the search. Full copies of these trials were obtained and again screened for eligibility. Any disagreement between the two reviewers was resolved by a third independent reviewer. If trials were reported in more than one publication, only data from the key publication were included.

The following data were independently extracted by two reviewers: details about the subjects (including classification according to the International Standards for the Classification of SCI, time since injury and number), intervention/s (including type, dosage and duration), outcomes (including type, number and data collection points) and the statistical significance of between-group differences (as stated by trial authors). In all trials, data collected at the beginning and at the end of the intervention period were extracted. The exception was the study by Crowe et al.,10 where the post-intervention data were collected 6 weeks after the intervention period for all but one outcome. In trials with more than two groups, only data from the two most contrasting groups were extracted.

Between-group mean differences and 95% confidence intervals (CIs) were extracted for each continuous outcome.11 If between-group mean differences and corresponding 95% CIs were not reported, they were derived from standard deviations, standard errors or P-values, provided data were not obviously skewed.12, 13, 14 Data were extracted from figures, and authors were contacted to clarify ambiguities if necessary and feasible.
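The derivation of a between-group mean difference and its 95% CI from summary statistics can be sketched as follows. This is a minimal illustration only, not the procedure used by the reviewers: it assumes two independent groups and a large-sample normal approximation (a t-distribution would give slightly wider intervals in small trials), and the function names and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl, level=0.95):
    """Return (difference, lower, upper) for experimental minus control.

    Assumes independent groups and a normal approximation (illustrative only).
    """
    diff = mean_exp - mean_ctl
    # Standard error of a difference between two independent means
    se = sqrt(sd_exp ** 2 / n_exp + sd_ctl ** 2 / n_ctl)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return diff, diff - z * se, diff + z * se


def se_from_p_value(diff, p_two_sided):
    """Back-calculate the standard error of a difference from a two-sided P-value.

    Assumes the P-value came from a z-test of the difference (an approximation).
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return abs(diff) / z


# Hypothetical summary data: mean 12 (SD 4, n = 20) versus mean 10 (SD 4, n = 20)
diff, lower, upper = mean_difference_ci(12, 4, 20, 10, 4, 20)
print(f"difference {diff:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

In this hypothetical case the interval crosses zero, so the difference would not be statistically significant at the 5% level even though the point estimate favours the experimental group.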

The 95% CI associated with the between-group mean difference for each outcome was used to determine if the effect of the intervention was large enough to be worthwhile. This was done by nominating minimally important between-group differences, that is, the smallest added benefit of an intervention required to justify the time, cost and inconvenience associated with providing the intervention.15, 16 The minimally important between-group difference was set a priori at 10% of the combined values of experimental and control groups at the commencement of the trial unless the authors of trials explicitly nominated otherwise. In trials that did not provide initial values, 10% of the combined values of experimental and control groups at the completion of the trial was used. Results were then categorized in the following way:

  1. Clearly important between-group difference: The 95% CI associated with the between-group mean difference was larger than the minimally important difference (see Figure 1).15

  2. Inconclusive: The 95% CI associated with the between-group mean difference spanned the minimally important difference (see Figure 1).15

  3. Ineffective: The 95% CI associated with the between-group mean difference was smaller than the minimally important difference (see Figure 1).15

Figure 1 Examples of outcomes from trials demonstrating the four types of results, namely, between-group statistical significance and clearly important treatment effects; between-group statistical significance and inconclusive treatment effects; no between-group statistical significance and inconclusive treatment effects; and no between-group statistical significance and ineffective treatments. The shaded vertical lines indicate the minimally important differences. WUSPI refers to Wheelchair User's Shoulder Pain Index.

This categorization takes into account the size and the uncertainty associated with treatment effects and emphasizes clinical significance rather than statistical significance. This methodology appropriately restricts conclusions to the comparisons made in trials. For example, in trials that compare two interventions without a control group, the categorization can only be used to make inferences about the relative effectiveness of the two interventions, not about the effectiveness of one intervention compared to no intervention. Skewed and categorical data were not categorized for clinical significance.
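The three-way categorization described above can be expressed as a short decision rule. The sketch below is illustrative and rests on several assumptions that are not stated in the text: differences are signed so that positive values favour the experimental group, the "combined values" are interpreted as a single pooled baseline value, and a CI limit that exactly equals the minimally important difference is treated as inconclusive. The function names are hypothetical.

```python
def minimally_important_difference(pooled_baseline, fraction=0.10):
    """Default threshold: 10% of the pooled baseline value (an interpretation)."""
    return fraction * abs(pooled_baseline)


def categorize(ci_lower, ci_upper, mid):
    """Categorize a between-group difference from its 95% CI.

    Assumes positive differences favour the experimental intervention.
    """
    if ci_lower > mid:
        # The whole interval lies above the threshold
        return "clearly important"
    if ci_upper < mid:
        # Even the most optimistic plausible effect falls short of the threshold
        return "ineffective"
    # The interval contains the threshold
    return "inconclusive"


# Hypothetical outcome with a pooled baseline of 10 units, so the threshold is 1 unit
mid = minimally_important_difference(10)
print(categorize(1.5, 4.0, mid))   # clearly important
print(categorize(-0.5, 4.5, mid))  # inconclusive
print(categorize(-1.0, 0.5, mid))  # ineffective
```

Note that the rule depends only on where the CI sits relative to the threshold, not on whether the CI excludes zero, which is why a statistically significant result can still be inconclusive.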

The quality of each trial was independently assessed by two reviewers using the PEDro scale.5 Any disagreements were resolved by an independent third person (reviewers did not rate their own trials). The PEDro scale assesses 10 key design features important for minimizing bias and interpreting between-group differences. It rates trials according to whether they did or did not use random allocation; conceal allocation; demonstrate baseline similarity; blind subjects, therapists and assessors; obtain outcome measures from more than 85% of subjects; provide measures of variability; use intention-to-treat analyses; and perform between-group statistical comparisons. Ratings were based on the written text and not on personal communications.

Results

Four thousand five hundred and forty-three titles and abstracts were screened. Of these, 65 full papers were retrieved for further screening. Thirty-one trials met the inclusion criteria (see Table 1). The total number of subjects included in all the trials was 770. Three trials23, 29, 38 had more than 50 subjects. Fifteen trials had 20 or fewer subjects and the remaining 13 trials had between 20 and 50 subjects. Susceptibility to bias was a problem common to most trials. The median (interquartile range) PEDro score for trials was 4 (3–5; see Table 1). Only 12 trials blinded assessors.8, 10, 23, 29, 31, 32, 33, 35, 36, 37, 40, 43 Only six trials concealed allocation10, 30, 31, 32, 33, 35 and only six performed intention-to-treat analyses.23, 31, 32, 33, 35, 37 Dropouts were also a common problem, with only 18 trials reporting outcome data on at least 85% of subjects.8, 9, 18, 19, 20, 22, 23, 24, 27, 28, 31, 32, 33, 34, 35, 37, 40, 41 Not surprisingly, given the nature of the interventions, only two trials blinded subjects37, 43 and no trial blinded therapists.

Table 1 Details of included trials

Fourteen trials included people with recent SCI (that is, less than 1 year), 16 trials included people with chronic SCI (that is, more than 1 year) and one trial did not specify time since injury other than to say subjects were more than 3 months since injury.26 There were approximately equal numbers of trials including people with complete and incomplete SCI, and people with tetraplegia and paraplegia. The most common outcome measures were walking speed (five trials), joint range of motion (eight trials) and exercise capacity (four trials).

The 31 trials were broadly grouped into seven categories after taking into account the key purpose of the trial and the nature of the intervention (see Table 1). Meta-analyses were not performed because of the clinical heterogeneity between trials.

The key findings summarized in Table 1 are:

  1. Fitness and strength training: Seven trials investigated the effectiveness of arm or leg exercise. Six trials involved active exercise of the upper limbs with or without electrical stimulation (ES),7, 18, 19, 20, 21, 22 and one trial involved ES-driven exercise for the paralyzed lower limbs.17 Six of the seven trials reported a statistically significant between-group difference on at least one outcome measure.17, 18, 19, 20, 21, 22 However, only three trials had clearly important between-group differences. The key findings from these three trials were that ES-driven exercise (versus no intervention) of the paralyzed lower limbs increased lower limb lean body mass;17 strength and fitness training (versus education and relaxation) decreased pain;19 and arm-cranking exercise (versus no intervention) increased the proportion of slow twitch muscle fibres in the triceps muscles.22 The size of the between-group differences could not be ascertained in the other three trials with statistically significant between-group differences.18, 20, 21

  2. Gait training: Five trials assessed the effectiveness of gait training either with weight-supported systems23, 24, 27 or with orthoses.25, 26 Two trials included ES.24, 27 Statistically significant and clearly important between-group differences were reported in two trials on at least one outcome measure. One of these trials found that the subjects walked faster with medially linked knee–ankle–foot orthoses (KAFO) than with unlinked knee–ankle–foot orthoses,26 and the other trial found that the subjects walked faster with isocentric reciprocal gait orthoses than with medially linked knee–ankle–foot orthoses.25 The other three trials23, 24, 27 compared the relative effectiveness of two or more of the following interventions: weight-supported treadmill training with or without electrical stimulation, conventional gait training with or without electrical stimulation and robotic walking. All three were inconclusive despite, in one trial, the inclusion of 146 subjects.23

  3. Hand therapy: Three trials compared biofeedback,29 ES30 or somatosensory stimulation28 of the hand with conventional hand therapy. One trial28 reported a statistically significant between-group difference on functional measures of hand function and pinch grip but the data were skewed, so the size of the between-group difference could not be quantified. This trial examined the added benefit of somatosensory stimulation with massed practice of hand activities in people with incomplete tetraplegia. The other two trials did not report statistically significant between-group differences.29, 30

  4. Stretch-based interventions: Four trials examined the effectiveness of different stretch-based interventions on range of motion and shoulder pain.10, 31, 32, 33 Two trials demonstrated that stretch-based interventions for range of motion were ineffective (versus no intervention or conventional care)32, 33 and two trials were inconclusive.10, 31

  5. Hand splinting: Two trials examined the effect of hand splints in people with tetraplegia.34, 35 One trial demonstrated that splinting the thumb was ineffective for decreasing the extensibility of the flexor pollicis longus muscle (versus no splint),35 and the other trial did not provide sufficient data to determine between-group differences.34

  6. Acupuncture: Three trials examined the effect of acupuncture.36, 37, 38 One trial, notable for its large size, demonstrated that the addition of acupuncture to standard care administered soon after injury improved strength and sensation.38 The other two trials investigated the effectiveness of acupuncture for shoulder pain and were inconclusive.

  7. Other therapies: Seven trials examined a range of different therapies including the effect of shoulder exercises or massage for mobility and depression,9 ES or biofeedback for function,40, 41 hippotherapy for spasticity,8 upper limb exercise with graded tilting for postural hypotension,42 ultrasound for bone loss43 and stretches and strengthening exercises for shoulder pain.39 A trial investigating massage (versus head and upper limb range of motion exercises) had mixed results, reporting between-group statistical differences for some outcomes but treatment ineffectiveness for the others.9 Ultrasound (versus sham ultrasound) was ineffective for preventing bone loss.43 There were insufficient data or inconclusive evidence to support ES or biofeedback for function,40, 41 hippotherapy for spasticity8 or stretches and strengthening exercises for shoulder pain.39

Discussion

It is somewhat surprising that only 31 randomized controlled trials have investigated the effectiveness of different physical interventions for people with SCI. Of the 31 identified trials, six reported statistically significant and clearly important between-group differences on at least one outcome measure.17, 19, 22, 25, 26, 38 Six more trials reported between-group statistical differences but either the results were inconclusive9, 31 or the size of the between-group differences could not be determined.18, 20, 21, 28 The majority of trials either did not report or did not find between-group statistical differences and were inconclusive7, 8, 10, 23, 24, 27, 29, 30, 34, 36, 37, 39, 40, 41, 42 or demonstrated treatment ineffectiveness.32, 33, 35, 43

The interpretation of this systematic review is partly dependent on the definition of minimally important between-group differences.16 Ideally, researchers articulate minimally important differences for each outcome before the commencement of the trials. However, this was rarely done. The failure to do this not only potentially introduces bias to the interpretation of results but also leads to a reliance on statistical significance without taking into account the size of treatment effects. We needed a minimally important difference for each study to provide a meaningful interpretation of results and to summarize the large number of outcomes (it was not feasible to provide the between-group differences and 95% CIs of the 200 outcomes reported in this review). We considered using a distribution-based approach where the size of the treatment effect is normalized in relation to sample variance (for example, Cohen's d). However, these approaches are strongly influenced by the heterogeneity of the samples.44 In addition, they do not give a clear indication of the importance of treatment effects.45 For these reasons, we opted a priori to nominate a meaningful between-group difference equivalent to 10% of subjects' initial status unless otherwise articulated by the investigators. The value of 10% was somewhat arbitrary but probably overestimates rather than underestimates treatment effectiveness. It is unlikely that the time, cost and inconvenience associated with most interventions administered over prolonged periods of time can be justified unless the added benefit is equivalent to at least 10% of initial status.

The results of this systematic review provide some support for different strength and fitness training programmes with and without electrical stimulation.17, 19, 22 The results also indicate the relative effectiveness of gait training with orthoses.25, 26 The largest trial demonstrated that acupuncture improved strength and sensation.38 However, the findings of all these trials need to be interpreted with caution. None blinded assessors, concealed allocation or had PEDro scores greater than 5 (our PEDro ratings are lower than those given by other reviewers3 but consistent with the independent ratings of the Centre for Evidence-based Physiotherapy,5 the original authors of the PEDro scale). In addition, some trials had numerous outcome measures and statistical comparisons without adjustments for the increased likelihood of type I statistical errors (that is, finding statistical between-group differences by chance). The number of statistical comparisons per trial can be gauged from Table 1. This table probably understates this potential problem because some analyses may not have been reported and some investigators reported additional results in duplicate publications not included in this review. Interestingly, very few trials reported clearly important treatment effects on activity limitations and participation restrictions. Instead, the treatment effects were predominantly measured at the impairment level. This may be because impairments are often more directly responsive to interventions and less influenced by the array of variables affecting activity limitations and participation restrictions.46, 47
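The concern about unadjusted multiple comparisons can be made concrete with a simple calculation. The sketch below is illustrative only: it assumes that the comparisons are statistically independent, which is rarely true of several outcomes measured on the same subjects, and it shows the Bonferroni adjustment merely as one common correction, not as one used or recommended by any of the included trials. The function names are hypothetical.

```python
def familywise_error_rate(n_comparisons, alpha=0.05):
    """Probability of at least one false-positive result.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_alpha(n_comparisons, alpha=0.05):
    """Per-comparison significance threshold under a Bonferroni adjustment."""
    return alpha / n_comparisons


for k in (1, 5, 10, 20):
    print(f"{k:2d} comparisons: chance of a spurious finding "
          f"{familywise_error_rate(k):.2f}, adjusted alpha {bonferroni_alpha(k):.4f}")
```

Under these assumptions, a trial reporting 10 unadjusted comparisons has roughly a 40% chance of at least one statistically significant difference arising by chance alone.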

Four trials indicated that treatments were ineffective.32, 33, 35, 43 These trials looked at the effectiveness of stretch for the management of contractures,32, 33 splinting for the promotion of muscle shortening35 and ultrasound for the treatment of bone loss.43 The negative results of these trials could reflect design weaknesses such as poor inclusion criteria and inappropriate treatment dosage. However, these trials had a median (interquartile range) PEDro score of 8 (7.5–8). The results of these trials should therefore at least raise questions about the effectiveness of interventions, some of which are still routinely provided to people with SCI.

The majority of trials in this systematic review compared two or more types of interventions, which often differed only in subtle ways. In the absence of between-group differences, some investigators performed statistical analyses on pre- to post-intervention data. Significant change over time in groups was then attributed to the effectiveness of all interventions. This approach is flawed. Change over time can be due to any number of factors and does not provide good evidence for treatment effectiveness (for example, change can be due to natural recovery or exposure to the testing protocol).48 Proper estimates of treatment effectiveness can only come from between-group differences.

The trials identified in this systematic review included an array of people with different attributes and types of SCI. Some trials had very homogeneous subjects whereas others did not. There are clear advantages associated with restricting inclusion criteria to very specific types of subjects, provided it is known which types of subjects are most responsive to the intervention. Subjects with similar attributes are more likely to respond to the intervention in a consistent way, thereby reducing variability and increasing the likelihood of detecting between-group differences (that is, increasing statistical power). However, it is not always known which groups of subjects are most likely to respond to interventions. In this situation, it is not unreasonable to adopt a pragmatic approach and follow clinical practice, in which case patients in whom the treatments are typically administered in the clinical setting are used as subjects for trials.48

Most trials identified by this review were inconclusive. That is, the 95% CIs associated with the between-group differences spanned the minimally important differences (see Figure 1). The majority of these results were also not statistically significant (see Table 1). Findings that are not statistically significant do not provide good evidence for treatment ineffectiveness unless the upper end of the 95% CI falls short of the minimally worthwhile treatment effect. It was therefore not clear from the majority of trials whether the treatments were or were not effective.15

Inconclusive results are a common problem in trials investigating physical interventions. There are a number of explanations for this but it is primarily due to the difficulties associated with demonstrating modest treatment effects in heterogeneous subjects.13 It is also due to the large number of factors influencing outcomes such as completeness of injury, time since injury and level of injury. These factors do not systematically bias results, provided subjects are randomized, but they do generate noise, making it difficult to isolate the effects of interventions. The obvious way to reduce the likelihood of inconclusive results is to increase sample size. However, it is difficult to recruit large numbers of homogeneous subjects without extensive financial support and multi-centred co-operation. Trials can also reduce the likelihood of inconclusive results by limiting the number of experimental groups. This strategy increases the number of subjects in each arm of the trial. Trials that measure at the impairment level also decrease the likelihood of inconclusive results. Needless to say, the relevance of treatment effects on impairments is increased when accompanied by evidence about treatment effects on activity limitations and participation restrictions.
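The relationship between the number of subjects per arm and the width of the 95% CI can be illustrated with a rough calculation. The sketch below assumes two equal-sized arms with a common standard deviation and uses a normal approximation, so it gives only an indication of precision rather than a formal power analysis. The function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sd, n_per_arm, level=0.95):
    """Approximate half-width of the CI for a between-group mean difference.

    Assumes equal arm sizes, a common SD and a normal approximation.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sd * sqrt(2 / n_per_arm)


def subjects_per_arm(total_subjects, n_groups):
    """Subjects available to each arm when a fixed sample is split evenly."""
    return total_subjects // n_groups


# Hypothetical trial: 60 subjects, outcome SD of 10 units
for groups in (2, 3, 4):
    n = subjects_per_arm(60, groups)
    print(f"{groups} groups: {n} per arm, "
          f"95% CI roughly +/- {ci_half_width(10, n):.1f} units")
```

Because the half-width shrinks with the square root of the number of subjects per arm, halving the width of the CI requires about four times as many subjects, which is why dropping an experimental group helps only modestly unless it is combined with a larger overall sample.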

Interestingly, some of the higher quality trials had more problems with inconclusive results than some of the lower quality trials. Perhaps this is partly due to the inherent biases of low-quality trials. Low-quality trials tend to overstate treatment effectiveness.49, 50, 51, 52 High-quality trials minimize bias and therefore provide a more robust and honest reflection of the uncertainty around estimates of treatment effectiveness. Inconclusive results are undesirable; however, they are still valuable provided they come from high-quality trials. Results of this kind can be pooled in rigorous meta-analyses.15 Future meta-analyses may provide our best hope for quantifying the effectiveness of some physical interventions, given the challenges of conducting adequately powered trials in this area.

The results of this systematic review provide initial evidence of the effectiveness of fitness training, strength training, gait training and acupuncture for people with SCI. However, there is still a long way to go to provide an evidence base for the wide range of physical interventions that have become standard practice. As we move forward to explore the effectiveness of new and emerging therapies, it is important that emphasis continues to be placed on high-quality trials. Ideally, these trials will be adequately powered to provide conclusive findings. However, this is not always going to be possible, in which case our best hope for quantifying the effectiveness of some physical interventions may come from future meta-analyses.