Introduction

Physiotherapy (PT) interventions are widely used in an effort to increase the voluntary strength of muscles directly affected by spinal cord injury (SCI), that is, neurologically-weak muscles. Some of the PT interventions commonly used for this purpose include resistance training [1], electrical stimulation [2, 3], hand therapy [4] and various types of gait training interventions [5,6,7]. It is important to better understand the effects of these interventions on strength because strength optimises function after SCI. In addition, researchers investigating drug or cellular therapies for neurological recovery often want to know how routinely administered PT affects the strength of neurologically-weak muscles, particularly the key muscles that are part of the International Standards for the Neurological Classification of SCI [8,9,10].

Surprisingly little work has been directed at summarising what we know from clinical trials about the effectiveness of different PT interventions on voluntary strength of neurologically-weak muscles. Many systematic reviews (including some of our own [11,12,13]) have investigated the effectiveness of different interventions on mobility and function. Some of these have included strength as an outcome. However, no systematic review has solely focused on voluntary strength of neurologically-weak muscles and included all PT interventions. Therefore, the purpose of this systematic review was to provide an overview of the effectiveness of PT interventions for increasing voluntary strength of muscles directly affected by SCI.

The two primary objectives of this systematic review were to determine:

  1. The effectiveness of any PT intervention compared to sham or no intervention for increasing muscle strength in people with SCI. For example, the effectiveness of hand therapy versus no hand therapy on the grip strength of people with tetraplegia.

  2. The relative effectiveness of any PT intervention compared to another PT intervention for increasing muscle strength in people with SCI. For example, the effectiveness of Lokomat gait training versus overground gait training on Lower Extremity Motor Scores (LEMS) of people with incomplete SCI.

Methods

We searched the following databases from inception to Jan 2018: Embase (via the Ovid search engine), Medline (via the Ovid search engine) and the Cochrane Central register of controlled trials. We also searched the Physiotherapy Evidence Database (PEDro). The Embase and Medline databases were searched using the Cochrane highly sensitive search strategy for identifying clinical trials [14] combined with the following MeSH terms: parapleg$, quadripleg$, tetrapleg$, wheelchair$ and spinal cord. We included a line in our search strategy to exclude animal studies. The searches were adjusted for each database. The Cochrane Central register of controlled trials was searched using the MeSH terms.

The searches were originally run in December 2007 [11], again in December 2015 [12], and then updated in Jan 2018. Two authors independently screened the titles and abstracts to generate a list of all randomised controlled trials (RCTs) investigating the effectiveness of any PT interventions involving people with SCI irrespective of outcomes, comparisons or purpose. As a second step, all three authors screened the list of RCTs to identify trials for this review. Full articles of any potentially eligible studies were retrieved and examined to check against the inclusion criteria. Any disagreements were resolved by further discussion.

Inclusion criteria

Types of participants

We included studies involving adults with a traumatic or non-traumatic SCI, and a complete or incomplete SCI regardless of time since injury. Trials involving participants with other conditions were only included if 80% or more of the participants had a SCI.

Types of intervention

We included studies which examined any type of PT intervention or combination of therapies provided the intervention required participants to voluntarily contract neurologically-weak muscles. This included any of the following administered with or without biofeedback, somatosensory stimulation or electrical stimulation:

  1. Gait training such as robotic, overground and body-weight supported treadmill training (BWSTT).

  2. Hand therapy such as mass practice and practice provided using Wii-like devices.

  3. Generic exercises and general fitness training programs such as arm cranking, hydrotherapy and home exercise programs.

  4. Resistance training programs such as those typically involving free weights, theraband, strengthening equipment or any type of gym equipment.

We accepted studies with non-PT co-interventions including surgery, drug therapies or psychological interventions provided they were administered in exactly the same way to both groups. We reasoned that these co-interventions would have a similar effect across both groups thereby enabling us to isolate the effect of the PT treatment.

We excluded studies investigating the effectiveness of drug therapies, stem cell therapies, cranial stimulation, and studies that examined the effect of any intervention on the non-paralysed muscles of people with SCI (for example, studies that examined the effect of training programs on the upper limbs of people with paraplegia). Studies that included a combination of people with paraplegia and tetraplegia and involved training of the upper limbs, were excluded if 20% or more of participants had paraplegia. This was done in order to be confident that the effects of the training on the upper limbs predominantly reflected the effects on muscles directly affected by SCI. We also excluded studies that did not require participants to voluntarily contract neurologically-weak muscles as part of the intervention. Therefore, we excluded studies that examined interventions such as passive movements, passive cycling, vibration, transcutaneous electrical nerve stimulation (TENS), acupuncture or massage. The only exception to this last rule was any study which compared two or more interventions when at least one of the interventions required participants to voluntarily contract neurologically-weak muscles. For example, we excluded studies that compared stretching to no stretching but we included studies which compared stretching to robotic gait training.
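The two percentage thresholds in the inclusion criteria can be written as simple checks. The sketch below is illustrative only: the function names and the participant counts are hypothetical, and it covers just the 80% rule for mixed-condition trials and the 20% rule for upper limb trials, not the other criteria.

```python
def mixed_condition_trial_eligible(n_sci: int, n_total: int) -> bool:
    """Trials with other conditions qualify only if 80% or more had a SCI."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    # Integer form of n_sci / n_total >= 0.80, avoiding float rounding.
    return n_sci * 5 >= n_total * 4


def upper_limb_trial_eligible(n_paraplegia: int, n_total: int) -> bool:
    """Upper limb trials are excluded if 20% or more had paraplegia."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    # Integer form of n_paraplegia / n_total < 0.20.
    return n_paraplegia * 5 < n_total


# Hypothetical trials, for illustration only.
print(mixed_condition_trial_eligible(16, 20))  # 80% with SCI
print(upper_limb_trial_eligible(4, 20))        # 20% with paraplegia
```

A trial sitting exactly on a threshold is included under the 80% rule and excluded under the 20% rule, which matches the "or more" wording of both criteria.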

Types of comparisons

We were interested in two comparisons, namely:

  (i) PT interventions compared to sham or no intervention.

  (ii) PT interventions compared to each other.

If trials included more than two groups, we extracted data from only two groups. We prioritised the two groups which were most likely to demonstrate a between-group difference and were most commonly compared in other trials (namely, BWSTT versus overground gait training [5, 15], and conventional hand therapy plus somatosensory stimulation versus conventional hand therapy [16]).

Types of outcomes

Trials were only included if they assessed voluntary strength of muscles directly affected by SCI. That is, muscles that could be reasonably expected to have less than grade 5 strength (according to a manual muscle test). Strength could be measured in any way provided it reflected either force (kg or Newtons), torque (Nm) or results of a manual muscle test (points). We accepted composite measures of strength such as the LEMS (points). If studies included more than one measure to reflect voluntary strength, we selected the measure that was most often used by similar studies to enable pooling of data. If studies reported the strength of muscles on the left and right side of the body separately (e.g. right and left knee extensor strength), then we always extracted the measures from the right side of the body. If studies measured outcomes at more than one end point (e.g. measured at 6 weeks and 12 weeks from randomisation), then we used the outcomes reported as soon as possible after the last treatment. Two review authors independently identified the most relevant outcome and end-point from each trial according to our criteria. Any differences were resolved by the third author.

Type of studies

We only included RCTs and randomised cross-over trials that were published in English.

Data extraction and synthesis

Two authors independently extracted data from the included studies using a standardised Excel spreadsheet. One author was an investigator on some of the studies which were ultimately included in the review but the other author was not. The following descriptive data were extracted: characteristics of intervention and comparator, dosage, target muscle, study design, sample size, characteristics of participants, and details of strength measures used in the analysis.

Strength data were extracted from each study to determine mean between-group differences and 95% confidence intervals (95% CI). Data were estimated from graphs if necessary. The following rule was used (from first to last preference) when deciding upon which data to extract:

  • between-group difference in post-intervention scores, adjusted for baseline scores.

  • mean and standard deviation (SD) of change scores (post-intervention scores and change scores were not pooled in meta-analyses in which results were expressed as standardised mean differences (SMD)). 

  • mean (SD) post-intervention scores.

If only medians and inter-quartile ranges (IQR) were provided, medians were extracted and used as means, and SDs were estimated by dividing the interquartile range by 1.35 [17]. Cross-over studies were analysed using combined data from all study periods, if available, or using first period data if combined data were not available. RevMan software [18] was used to convert 95% CIs, standard errors, p values and any other appropriate combination of data or statistical results into SDs when necessary. The direction of effect of each outcome was standardised.
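The conversions described above can be sketched as follows. This is a minimal illustration, not the RevMan routine itself: the IQR rule (divide by 1.35) is the one stated in the text, while the confidence-interval conversion assumes a normal approximation for a single group mean (for small samples a t-value would usually replace 1.96). The function names and the example numbers are hypothetical.

```python
import math


def sd_from_iqr(q1: float, q3: float) -> float:
    """Approximate an SD from quartiles: SD ~ (Q3 - Q1) / 1.35."""
    return (q3 - q1) / 1.35


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Approximate the SD of a group mean from its 95% CI.

    SE = (upper - lower) / (2 * z), then SD = SE * sqrt(n).
    Assumes a normal approximation (an assumption of this sketch).
    """
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)


# Hypothetical values, for illustration only.
print(round(sd_from_iqr(10.0, 23.5), 2))       # IQR of 13.5
print(round(sd_from_ci(18.04, 21.96, 25), 2))  # CI width of 3.92, n = 25
```

Using the median as the mean and the IQR-derived SD is reasonable only when the underlying distribution is roughly symmetric, so values obtained this way are best treated as approximations.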

Meta-analyses were conducted across studies that made similar comparisons if there were at least two studies without excessive clinical or statistical heterogeneity. Clinical heterogeneity was assessed by examining the type of participants, type and intensity of the intervention, and other issues related to the design and conduct of the studies. Statistical heterogeneity was quantified using the I² statistic, where an I² > 75% was considered to indicate excessive heterogeneity [17]. A fixed-effects model was used to pool data if the I² was less than 50%, and a random-effects model was used if the I² was between 50 and 75%. If studies in a meta-analysis used the same measure of strength and same units, effects were expressed as mean differences (MD) and 95% CI. If different measures of strength or different units were used within a meta-analysis, effects were expressed as SMD and 95% CI, but post-intervention scores were not pooled with change scores. Data were analysed using RevMan v5.3 [18].
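The pooling rule above can be illustrated with a short sketch. It assumes generic inverse-variance weighting and the usual definition of I² from Cochran's Q; it is not a reproduction of RevMan's implementation, and the function names and example effects are hypothetical. The random-effects calculation itself is not shown.

```python
import math


def pool_fixed(effects, ses):
    """Inverse-variance fixed-effect pooling.

    Returns the pooled effect, its standard error and I² (as a percentage).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, i2


def choose_model(i2: float) -> str:
    """Apply the thresholds stated in the text to an I² percentage."""
    if i2 > 75:
        return "do not pool"
    if i2 >= 50:
        return "random effects"
    return "fixed effects"


# Two hypothetical trials with equal precision, for illustration only.
pooled, pooled_se, i2 = pool_fixed([2.0, 4.0], [1.0, 1.0])
print(pooled, round(pooled_se, 3), i2, choose_model(i2))
```

With only two or three trials per comparison, as in this review, I² is itself estimated imprecisely, so the thresholds act as a guide rather than a sharp test.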

All studies were assessed for risk of bias using the following six items of the Cochrane Risk of Bias Tool: Random sequence generation (selection bias); Concealment of allocation (selection bias); Blinding of participants and personnel (performance bias); Blinding of assessors (detection bias); Incomplete outcome data (attrition bias); and Selective reporting (reporting bias). Each item was rated as high, low or unclear risk of bias as per Cochrane guidelines [17].

Results

The searches from 2007, 2015 and Jan 2018 together retrieved 18,956 papers (see Fig. 1). From these, we identified 217 studies which were potentially RCTs in people with SCI involving a PT intervention. Nine trials which assessed the strength of the upper limbs were excluded because 20% or more of participants had paraplegia. Ultimately, after examining the full text and excluding duplicate publications, we identified 38 trials [1,2,3,4,5,6,7, 15, 16, 19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47] that met our inclusion criteria but only 26 provided useable data and hence were included [1, 2, 4,5,6,7, 15, 16, 19, 20, 22,23,24,25,26,27, 31,32,33, 35,36,37, 40, 43, 44, 46]. The other 12 trials did not provide useable data: they stated that they measured strength but either did not report any strength data [38, 39, 42, 47] or did not provide sufficient data for inclusion in the forest plots [3, 21, 28,29,30, 34, 41, 45] (see Supplementary File 1 for details).

Fig. 1

Flow chart. PT Physiotherapy

The number of participants in each group in the 26 trials ranged from 5 to 38 with a median number (interquartile range) of 14 (10 to 29) participants per group (excluding dropouts; the details are provided in Table 1). Four of the 26 trials used a cross-over design [15, 19, 36, 40] and two trials used a within-participant design where the unit of analysis was the limb, not the participant [1, 2]. Four of the 26 trials included more than two groups [4, 5, 16, 31].

Table 1 Details of the included studies

All 26 trials measured strength in either Newtons (N), Nm or points from a manual muscle test (typically summed across multiple muscles). The Risk of Bias in the trials is provided in Figs. 2–4. In brief, 12 of the 26 trials were at high risk of bias on three or more items of the Cochrane Risk of Bias Tool. Bias was most commonly due to failure to conceal allocation, blind participants and personnel, and blind assessors.

Fig. 2

Forest plot of trials comparing an intervention to no intervention or sham intervention expressed as standardised mean difference or mean difference. The figure includes trials which provided the same intervention to both groups but an additional intervention to the experimental group. Meta-analyses were only performed in the comparisons in (a). The results of each trial in (b) should be interpreted independently. The data from Jones 2014 are change data (no post data were provided), and the data from Bye 2016, Glinsky 2008, Glinsky 2009, Harvey 2010 and Harvey 2017 are back converted from mean between-group differences (95% CI). All other data are post data. The risk of bias on each item of the Cochrane Risk of Bias Tool for the trials is indicated by the coloured circles, where green reflects low risk of bias, yellow reflects unclear risk of bias and red reflects high risk of bias. The Risk of Bias refers to A: Random sequence generation (selection bias); B: Concealment of allocation (selection bias); C: Blinding of participants and personnel (performance bias); D: Blinding of assessors (detection bias); E: Incomplete outcome data (attrition bias); F: Selective reporting (reporting bias). SS somatosensory stimulation, ES electrical stimulation, MMT manual muscle test, LEMS lower extremity motor score, N newton, Nm newton metre, and Kg kilogram

Fig. 3

Forest plot of trials comparing one type of locomotor intervention to another type of locomotor or physiotherapy intervention expressed as mean difference. The data from Hornby 2005 and Field-Fote 2011 are change data, and the data from Labruyere 2014 are back converted from mean between-group differences (95% CI). The data from Senthilvelkumar 2015 were back converted from mean between-group difference data (95% CI) calculated using individual data obtained from the authors (the authors acknowledge in a personal communication that the mean (95% CI) data reported in their paper are incorrect). All other data are post data. The data from Field-Fote 2011 were converted from a 25-point scale (composite measure of lower limb manual muscle tests for right leg) to a 50-point scale; and the data from Alexeeva 2011 were converted from a 100-point scale (composite measure of lower limb manual muscle tests from 10 lower limb muscles) to a 50-point scale. The definition of the experimental group is based on the intervention that is most effective. The risk of bias: see legend for Fig. 2. BWSTT body-weight supported treadmill training, OG over-ground gait training

Fig. 4

Forest plot of trials comparing one type of non-gait related therapy with another type of non-gait related therapy expressed as mean difference. Meta-analyses were not performed in any of the comparisons including the comparison of hand therapies (for this comparison there was a mix of change and post data; see text for details). The data from Gomes-Osman 2016 and Kowalczewski 2011 are change data and the other data are post data. The data from Galea 2017 were derived from a median (95% CI) between-group difference calculated and provided by the authors using a bootstrapped median regression model. The definition of the experimental group is based on the more effective intervention. The risk of bias: see legend for Fig. 2. LEMS Lower Extremity Motor Scores, N newton, Nm newton metre

Trials comparing a PT intervention to a sham or no intervention

Ten trials compared a PT intervention to a sham or no intervention [1, 2, 4, 20, 22,23,24,25,26,27]. The results of the ten trials are split across two figures (see Fig. 2a and Fig. 2b).

Figure 2a includes five trials that examined the effect of two interventions: resistance training (two trials) and hand therapy (three trials). The results of these five trials were pooled in two meta-analyses (fixed effects) with results expressed as SMD. The meta-analysis investigating the effect of resistance training showed a statistically significant between-group difference (SMD = 0.64; 95% CI, 0.22–1.07; p = 0.003).
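As a consistency check, the reported p value can be approximately recovered from the pooled estimate and its 95% CI. The sketch below assumes a normal (z) approximation and a symmetric interval; the function name is hypothetical, and small differences from published p values are expected because the published estimates are rounded.

```python
import math


def p_from_ci(estimate: float, lower: float, upper: float,
              z_crit: float = 1.959964) -> float:
    """Two-sided p value implied by an estimate and its 95% CI.

    SE = (upper - lower) / (2 * z_crit); z = estimate / SE.
    Assumes a normal approximation (an assumption of this sketch).
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability.
    return math.erfc(abs(z) / math.sqrt(2))


# Resistance training versus no intervention: SMD = 0.64; 95% CI, 0.22-1.07.
p = p_from_ci(0.64, 0.22, 1.07)
print(f"{p:.3f}")  # prints 0.003
```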

Figure 2b includes the results of the remaining five trials which compared one intervention to a sham or no intervention. These trials examined the effect of five different interventions: resistance training plus electrical stimulation, electrical stimulation, arm cranking exercise, biofeedback, and a package of PT interventions including locomotor training (see Table 1A). The results of the five trials were not pooled because of clinical heterogeneity, but instead are presented individually (see Fig. 2b). The results are expressed as MD with different units for each comparison. Two of the five trials indicated a statistically significant between-group difference: resistance training combined with electrical stimulation (MD = 14 Nm; 95% CI, 1–27; p = 0.03) and a package of PT interventions including resistance training combined with locomotor training (MD = 4.8/50 points on the LEMS; 95% CI 1.9–7.7; p = 0.01).

Trials comparing any two PT interventions to each other

Sixteen trials compared one type of PT intervention with another type [5,6,7, 15, 16, 19, 31,32,33, 35,36,37, 40, 43, 44, 46]. These 16 trials are split across two figures (see Figs. 3 and 4).

Figure 3 includes 10 trials that compared one type of gait training with another type of gait training (or another type of PT intervention): BWSTT versus overground gait training (two trials), robotic gait training versus overground gait training (three trials), robotic gait training versus strength training (one trial), swing-assisted robotic gait training versus swing-resisted robotic gait training (one trial), high-intensity BWSTT and overground gait training versus low-intensity BWSTT and overground gait training (one trial), body-weight supported overground gait training versus BWSTT (one trial), and robotic gait training versus stretch (one trial). All trials measured lower limb strength with a manual muscle test expressed as LEMS for both legs (/50 points), LEMS for one leg (/25 points) or a composite score of 10 muscles in each leg (/100 points). For the purpose of analyses, all results were converted to a 50-point scale and presented as MD in Fig. 3. The results of the trials examining two comparisons (BWSTT versus overground gait training; robotic gait training versus overground gait training) were pooled in two meta-analyses (fixed effects) with results expressed as MD. One of these two meta-analyses, the comparison of robotic gait training with overground gait training, showed a statistically significant between-group difference (MD = 3.1/50 points on the LEMS; 95% CI, 1.3–5.0; p = 0.0008). The results of the other five trials were not included in meta-analyses and are presented individually. None of these trials showed a statistically significant between-group difference.
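The conversion to a 50-point scale can be sketched as a proportional rescaling. This assumes the conversion was a simple linear one (an assumption of this sketch; the text does not spell out the method), and the function name and example values are hypothetical.

```python
def to_50_point(value: float, scale_max: float) -> float:
    """Rescale a score, mean difference, SD or CI limit to a 50-point scale."""
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    return value * 50.0 / scale_max


# Hypothetical mean differences, for illustration only.
print(to_50_point(6.0, 25))   # one-leg LEMS (/25)
print(to_50_point(6.0, 100))  # 10-muscle composite per leg (/100)
```

Because the rescaling is linear, the same factor applies to the mean difference, the SD and both confidence limits, so statistical significance is unchanged by the conversion.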

Figure 4 has the results of the remaining six trials which compared one type of non-gait related therapy with another type of non-gait related therapy. These trials examined three comparisons: one combination of hand therapies versus another combination of hand therapies (four trials), FES cycling versus passive leg cycling (one trial) and maximal intensity resistance training versus conventional resistance training (one trial). The results of the six trials were not pooled because the comparisons were not similar. Instead the results of each trial are presented individually and expressed as MD with different units for each comparison (see Fig. 4). None of these trials showed a statistically significant between-group difference.

Trials that did not provide useable data

Of the 12 trials that did not provide useable data, seven did not report any results for between-group statistical comparisons [30, 34, 39, 41, 42, 45, 47]; three reported a non-significant between-group difference [3, 21, 38]; and one stated the difference between the groups was significant but the reported p value indicated otherwise [29]. The remaining trial reported a statistically significant between-group difference for neuromuscular stimulation-assisted arm cycling versus arm cycling alone. The outcome for this trial was the number of triceps muscles that improved by more than one grade on the 6-point manual muscle test. However, there was a unit of analysis problem because both arms of some, but not all, participants contributed to the analysis even though participants, not arms, were randomised [28] (see Supplementary File 1 for details).

Discussion

This systematic review is important because strength is central to the independence, mobility and quality of life of people with SCI. The strength of muscles directly affected by SCI is particularly important because this is often used as a measure of recovery [8,9,10]. The main finding from this review is that there is surprisingly little evidence to support the widely held belief that PT increases the strength of muscles directly affected by SCI. We were only able to find a statistically significant between-group difference for four comparisons, namely, resistance training versus no intervention; resistance training combined with electrical stimulation versus no intervention; a package of PT interventions including locomotor training versus no intervention; and robotic gait training versus overground gait training. However, the results from these four comparisons were not compelling because of the small numbers of trials and imprecise estimates. Of course, our failure to find more convincing evidence that PT interventions increase strength is not evidence that PT interventions are not effective. It does however suggest that there is considerable uncertainty around this topic.

Two types of comparisons were included in this review. The first included trials that compared any PT treatment with sham or no intervention. This comparison alone answers the question about the effectiveness of PT interventions. For this comparison we were only able to find evidence to support resistance training (two trials), resistance training combined with electrical stimulation (one trial) and a package of PT interventions including locomotor training (one trial). Interestingly, the package of PT interventions also included “resistance training” and “strengthening exercises” (pg 2246.e1) along with locomotor training, making it impossible to determine the relative effectiveness of resistance and locomotor training for increasing strength [25]. However, taken together, the findings of these four trials point to the importance of resistance training alone or in combination with other interventions. This finding aligns with what is known about the role of resistance training for increasing strength in able-bodied individuals with neurologically-intact muscles [48].

The second comparison included trials that compared any two types of PT interventions to each other. This comparison provides no insights into the effectiveness of interventions per se. It only indicates whether one treatment is, or is not, superior to another. We only found a statistically significant between-group difference for strength (in this case, on the LEMS) from robotic gait training versus overground gait training. There was no other evidence that one type of intervention was superior to another. This may indicate that all interventions are equally effective or equally ineffective; it is not known without a control group.

We excluded trials investigating the effects of different PT interventions on upper limb muscles if 20% or more of participants had paraplegia because it cannot be assumed that neurologically-intact muscles respond in the same way to interventions as muscles directly affected by SCI. We believe that the distinction between the two types of muscles in people with SCI may be important and results should not be pooled within or across trials.

There are limitations to this review. For example, we did not include trials that were not published in English. This is a common limitation of systematic reviews that do not include bilingual authors. There is also the possibility that we introduced bias into our review because two of the authors are also authors on some of the included trials.

Our forest plots and meta-analyses did not include the results of the 12 trials which stated that they measured strength but either did not report any strength data [38, 39, 42, 47] or did not provide sufficient data for inclusion in the forest plots [3, 21, 28,29,30, 34, 41, 45]. We also did not include trials which may have measured strength but did not state that strength was measured. Similarly, there may have been trials which were conducted but never reported. These are all potential sources of publication bias tending to overstate treatment effectiveness because authors are less likely to publish the results of negative trials than positive trials, and because authors commonly omit the details of outcomes which are either negatively affected by the intervention or not changed by the intervention. It is not clear how big these types of publication biases are within the SCI literature. However, they could be minimised by compulsory prospective trial registration. We also did not extract data for all possible comparisons in studies with more than two groups. Instead, we prioritised comparisons that were most likely to demonstrate between-group differences. For this reason our results may be overly optimistic.

In all, the results of this systematic review raise important questions about the effectiveness of PT interventions for increasing strength in muscles directly affected by SCI. The results of a small number of trials indicate that resistance training is effective yet most trials in the review did not show a convincing treatment effect. The results should not be interpreted as evidence that PT interventions are not effective. They should however be interpreted as evidence that we need to be far more cautious before just assuming that PT interventions increase the strength of muscles directly affected by SCI.

Data archiving

The authors will consider all reasonable requests for the data upon which this systematic review is based.