Sir,

In a recent ‘Editors Page’,1 Lisa A Harvey warns against refraining from randomized controlled trials (RCTs) in clinical research. For theoretical reasons, randomization is bound to be superior to any non-randomizing statistical procedure such as ‘historical controls’ obtained from retrospective sampling (see Gehan2 and Lammertse3). The purpose of this letter is to demonstrate, by interpreting a recent RCT (the Spinal Cord Injury Locomotor Trial, SCILT),4, 5 that RCTs, too, are liable to produce rather misleading results. It appears that both methods have limitations and pitfalls, which need to be avoided.

All new knowledge is defined only in relation to ‘historical’ knowledge, as Edmund A Gehan (1984) notes in defending historical controls.2 Clearly, a novel therapeutic approach has to show its superiority over previous (conventional) therapy. Unlike the groups of a randomized trial, ‘historical’ controls cannot be modified retroactively and are the fixed standard for ‘conventional’ therapy. In a randomized trial, on the other hand, two therapies are compared, neither of which, strictly speaking, necessarily relates to historical values; the ‘control’ therapy in particular might not be the same at all as the historical ‘conventional therapy’. A striking example of such a mishap in the field of locomotor training is SCILT, a multicenter randomized clinical trial also discussed in the Guttmann Lecture 2011 (printed in this journal).3 By poorly defining ‘conventional therapy’, SCILT caused considerable confusion, which is still haunting the literature.3, 4, 5, 6, 7 This also shows that an apparently small mistake in the design of an RCT can have devastating consequences.

Retrospective sampling delivers the ‘historical’ control group, which by definition reflects the ‘conventional’ locomotor therapy of its time. In the 1980s, locomotor therapy for severely paralyzed, non-ambulating spinal cord injury (SCI) persons focused on optimal handling of the wheelchair, whereas upright overground walking was not encouraged. This attitude changed with the introduction of motor-driven treadmill systems in which the patients were secured in an upright position by a suspending harness: therapists could comfortably aid in defined limb setting and body-weight shifts, which in spinal animals provided proprioceptive signals facilitating stepping (for a review of the ‘rules’ of spinal locomotion see, for example, Pearson8). These ‘rules’ were found to be valid also for the human spinal cord and led to training protocols such as Laufband (LB) therapy and its later derivative, ‘body weight-supported treadmill therapy’.9, 10, 11

In our initial comparisons with ‘historical’ controls, we had encountered a ‘striking superiority of LB therapy’ over any conventional locomotor training known and performed at that time.10, 11 Consequently, NIH funded a multicenter randomized clinical trial, SCILT,4, 5 that was originally intended to ‘compare conventional….physical therapy with …. body weight-supported treadmill therapy’.

In a meeting preparatory to SCILT,4 this author suggested strictly defining what ‘conventional’ locomotor training was at that time. Instead, the trial's randomized control group was finally prescribed an unprecedented amount of ‘enriched aided walking over firm ground (OG)’, supported by several therapists as needed. This extraordinary effort went far beyond any ‘conventional’ locomotor therapy ever routinely applied to acute non-ambulating SCI persons in modern times; consequently, the (randomized) control group failed by far to represent conventional therapy and instead became another intervention group. Quite unfortunately, this randomized ‘control’ group was referred to in the text as performing OG or ‘conventional walking’; obviously, ‘conventional’ is used here in contrast to the (unconventional) novel ‘treadmill walking’.4, 5 Later on, ‘conventional walking’ and ‘conventional therapy’ were used as synonyms,7, 12, 13 and many readers were led to conclude that LB therapy was no better than ‘conventional overground training’.7, 13, 14 This is incorrect: (historical) conventional therapy did not at all foster regular upright walking for wheelchair-bound patients, whereas ‘conventional OG training’ contained more aided OG walking than ever before, ‘conventional’ here referring to OG walking in contrast to LB training.

This problem of not defining ‘conventional’ therapy was further aggravated by the fact that both the randomized control group and the experimental group were actually trained according to the very same locomotor principles (‘rules’ of spinal locomotion9), similar in intensity and in the kinematics of aided limb settings: on the treadmill, body-weight support (BWS) and upright positioning are conveniently maintained by the adjustable harness, whereas during OG walking this support is delegated to 2–3 therapists and to the patients themselves, who provide BWS via arm-supporting devices (such as walking frames, rollators and parallel bars). Thus, it was not the novel principles (repeated, context-related training of upright locomotion with defined joint and load settings of the limbs) that were tested, but two ways of applying them: aided OG walking or treadmill walking.7, 12, 13 The efficacy of intensive OG training is therefore to be expected; in fact, we had encountered a patient, nearly completely motor-paralyzed below T6, who had learned some independent stepping by training in parallel bars (patient Z in Wernig and Müller9). It is the merit of SCILT to have revealed, on a quantitative basis, that enriched OG walking can be quite effective (though not similarly practicable, see below).

Incidentally, for statisticians dealing with the matter of randomization and non-randomization, SCILT ought to be a rich source of real-life experimental data: it provides both randomized (control and experimental) groups and a not-yet-mentioned, unique ‘historical’ control group of patients ‘conventionally’ trained during the pre-trial period from 1997 to 1999 in the participating clinics.4 Using the same inclusion and exclusion parameters defined for the RCT (‘SCILT-eligible’),5 this historical control revealed that only 45% of the motor-incomplete, non-ambulating patients reached independent walking with the ‘conventional’ therapy (extracted from Table 3 in Dobkin et al.4: of 36 initially ASIA C patients paralyzed below C5-T11 with Functional Independence Measure (FIM) <5, only 14 reached independent walking (FIM5)). The ‘randomized control’ group in the trial itself performed ‘enriched aided OG-walking’ and already scored 92%, which was not different from the randomized experimental group (92%) (see text in Dobkin et al.5). These numbers match perfectly well those from our previous trial,10, 11 with 39% success with conventional therapy but 89% with LB therapy (calculated from the original set of data in Wernig et al.10).

Extrapolating from this observation, one can even envisage superior effects from properly conducted OG walking, especially when BWS in the experimental group is incorrectly maintained at high levels throughout the training period on the treadmill.5, 15

SCILT confirmed that ‘enriched’ aided overground training and treadmill training can produce similar results. It fails, however, to tell us formally whether the novel principles of training, not named but well applied in SCILT (see Methods in Dobkin et al.4, 5), are more effective than the ‘conventional’ (less walking-intensive) physiotherapy used before the LB era. This gap barely hides the fact that the final design of the randomized trial failed to answer the question originally set (by NIH), namely to compare conventional and LB therapy. Walking severely paralyzed acute and chronic SCI patients over ground in an upright position is not trivial (risks of orthostatic failure, falls due to heavy body weight, fading muscle strength, difficulty in passive limb setting, and so on); it consumes more of the patient's energy and demands more effort from patients and from (a larger number of) therapists. In fact, it is quite likely and understandable that the enormous effort necessary to train nearly completely paralyzed acute SCI patients OG, without a treadmill and suspension, hindered and in real life prohibited the proper and early application of locomotor training to those patients.

In this light, the recent suggestion6 not to perform regular or robot-aided body-weight-supported treadmill therapy as long as its superiority over enriched OG walking has not been shown in randomized clinical trials seems to be of merely theoretical value and is incorrect. Taken seriously, such a statement would immediately exclude severely paralyzed patients who are not capable of carrying their body weight, or high tetraplegics with little or no use of their arms, from the only practicable locomotor training available to them at the time.

Harvey1 quite rightly highlights the brilliance of randomizing incoming data. The case described here shows just how vulnerable even the best-designed RCT may be. It also shows that retrospective sampling is quite valid, in particular when complex therapies for complex diseases are to be tested. In SCILT, the poor definition and inconsistent use of the term ‘conventional’ and the missing comparison with a retrospective control group turned out to be highly confusing. The notion that ‘conventional locomotor therapy is as good as LB therapy’ is simply wrong as it stands and is misleading readers and clinical investigators even today. The correct outcome is as follows: aided OG walking can be as effective as LB therapy when performed in sufficient quality and amounts; these, however, might not be available outside well-funded investigations.

Harvey's suggestion to apply RCTs to major fractions of a cohort is interesting, but how does this work in trials in which a novel therapy is tested for the first time and the responding population is not yet known? When does the problem of selection become prominent in RCTs as well? Is there a chance to improve conditions so as to avoid ‘sampling bias’ in retrospective collections, for example, by first considering all patients who entered the clinic during a set period of time, selecting all those who meet the defined inclusion and exclusion criteria and unequivocally making them members of the trial? Keep in mind that there are few if any institutions that would support an RCT without strong indications derived from historical controls.