Introduction

The effectiveness of physical interventions targeting motor performance for people with spinal cord injury (SCI) is often measured in clinical trials through generic outcome measures such as the Functional Independence Measure1 or Spinal Cord Independence Measure.2 However, generic outcome measures commonly focus on one aspect of motor performance such as burden of care or level of independence. Mostly, they do not capture the many aspects of motor performance important to clinicians or patients. A possible solution is to ask clinicians and patients to rate their overall impressions of change over time on a 15-point global impression of change scale anchored with ‘a very great deal better’ at one end (+7) and ‘a very great deal worse’ (−7) at the other.3, 4 Impressions of change are commonly used as part of clinical trials in gerontology5, 6, 7 and psychiatry,6, 8 but only rarely in SCI.9, 10

Although clinicians’ and patients’ impressions of change are subjective, they are nonetheless valuable and reflect the way clinical decisions are made in practice about ceasing, commencing or changing therapy.5 Presumably, when clinicians or patients rate overall impressions of change in motor performance, they take into account neurological status as well as the time and effort devoted to therapy and implications of change on real life. Arguably, clinicians’ impressions of change are particularly valuable because they draw on clinical judgement,6 which involves assessing the quality and speed of movement, and reflecting on the real-life implications of change in performance for people with SCI.

A problem with relying on clinicians’ and patients’ impressions of change for clinical trials is the potential for bias. That is, clinicians’ and patients’ impressions of change may be strongly influenced by their expectations of treatment effectiveness. This is arguably not a problem when capturing patients’ impressions of change, because their expectations are an integral and important part of the construct. However, bias is a problem when relying on clinicians’ impressions of change as part of clinical trials. A possible solution is the use of video clips. If blinded clinicians can rate impressions of change from comparing pairs of video clips taken at the beginning and end of an intervention period, their expertise can be harnessed in an assessment that reflects a global rating of impressions of change.11, 12, 13 Ratings provided to participants in the control group can be compared with ratings provided to experimental participants. However, before advocating the use of this methodology, we need a better understanding of the constructs captured in clinicians’ and patients’ impressions of change. For example, how closely do clinicians’ and patients’ impressions of change mirror traditional objective measures of change, and are clinicians’ and patients’ impressions of change similar? The purpose, therefore, of this study was to clarify the differences in clinicians’ and patients’ impressions of change in motor performance and to compare them with objective measures. The null hypothesis was that clinicians’ and patients’ impressions of change are not different to objective measures of motor performance.

Materials and methods

Patients

Thirty inpatients undertaking initial rehabilitation at one of the three Sydney SCI units were invited to participate in the study. Patients were included regardless of neurological status provided they had a recent SCI and therapy was being directed at improving their abilities to transfer, sit unsupported or walk. Patients were only assessed on motor tasks appropriate for them. For example, patients with motor complete thoracic paraplegia who were not participating in gait programmes were not assessed on their ability to walk, and patients with near-normal trunk control not receiving trunk-related therapy were not assessed on their ability to sit unsupported. This inclusion criterion was used to avoid floor and ceiling effects, and to mimic the way participants are typically selected for clinical trials targeting motor performance. The study received ethical approval from the appropriate institutions and informed consent was obtained from all patients. The authors certify that all applicable institutional and governmental regulations concerning the ethical use of human volunteers were followed during the course of this research.

Data collection

Patients were assessed on two occasions separated by between 1 and 5 months. Patients were assessed on one, two or three of the following motor tasks: transferring (n=23), sitting unsupported (n=25) and walking (n=12). In total, 120 assessments were performed (that is 60 pairs of assessments). The three motor tasks reflected skills commonly targeted in physical rehabilitation. The three tasks were also selected to maximize sample size and to provide a model to explore clinicians’ and patients’ impressions of change.

Three outcome measures for each of the motor tasks were used, namely (i) objective measures of performance, (ii) patients’ impressions of change and (iii) clinicians’ impressions of change. All outcomes were expressed as a 15-point change score between the first and second assessment.

Objective measures of task performance. Patients’ abilities to transfer, sit unsupported and walk were evaluated at the first and second assessment by a research clinician not associated with clinical services, using standardized objective outcome measures of task performance. All three objective measures were expressed relative to a total maximum score of seven. The difference between scores attained at the first and second assessment for all three motor tasks was used to derive change scores, where −7 reflected maximal deterioration (that is, a deterioration from a score of seven on the first assessment to zero on the second assessment), and +7 reflected maximal improvement (that is an improvement from a score of zero on the first assessment to seven on the second assessment).

Ability to sit unsupported was rated by asking the patients to pull a T-shirt over their heads while sitting unsupported on the edge of a physiotherapy bed. Patients’ feet were supported on the ground. A 7-point scale similar to the Functional Independence Measure was used, where a score of one reflected ‘total assistance’ and a score of seven reflected ‘complete independence’. This outcome measure is a modification of a similar one where time to don a T-shirt is used.14, 15 It is well correlated with other measures of unsupported sitting15 and was chosen because it uses a 7-point scale14 and reflects the ability to perform a challenging but purposeful motor task while maintaining an upright seated position. A loose fitting T-shirt that could be easily pulled over the head was used to minimize the need for good hand and upper limb function.

Ability to transfer was assessed using the transfer item of the Clinical Outcome Variables Scale.16 Patients were asked to move from the wheelchair to a physiotherapy assessment bed. They were scored on the following 7-point scale:

  1. total dependence;
  2. assistance of one person and a device;
  3. assistance of one person and no device;
  4. supervision with or without a device;
  5. independence with device;
  6. independence without device but slow, awkward and requires excessive effort;
  7. independence without device and in an effortless and co-ordinated movement within a reasonable time.

The transfer item of the Clinical Outcome Variables Scale was used rather than an equivalent item from the Functional Independence Measure because it is freely available. A similar transfer item of the Spinal Cord Independence Measure was not used because it is based on a 3-point scale.

Ability to walk was assessed using the Walking Index for SCI.17 This uses a 21-point scale based on the need for assistance, orthoses and walking aids. For the purpose of this study, scores were divided by three and expressed relative to a maximal score of seven. The mathematical manipulation of Walking Index for SCI data is not advocated for general use, but was a reasonable compromise given the purpose of this study. It enabled the use of the widely advocated Walking Index for SCI and for the results to be expressed in a uniform way, avoiding the need to express all outcomes as a percentage of total possible scores.
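The rescaling and change-score arithmetic described above can be sketched as follows. This is only an illustrative sketch: the function names and the example scores are hypothetical and are not taken from the study data.

```python
def rescale_wisci(wisci_score: float) -> float:
    """Express a Walking Index for SCI score relative to a maximum of seven
    by dividing by three, as described in the text."""
    return wisci_score / 3


def change_score(first: float, second: float) -> float:
    """Second assessment minus first assessment.

    Positive values indicate improvement, negative values deterioration.
    """
    return second - first


# Hypothetical patient: Walking Index for SCI score of 6 at the first
# assessment and 15 at the second assessment.
first_walk = rescale_wisci(6)     # 2.0 on the rescaled range
second_walk = rescale_wisci(15)   # 5.0 on the rescaled range
print(change_score(first_walk, second_walk))  # prints 3.0
```

Expressing all three objective measures on the same range means their change scores can be set directly against the 15-point impressions of change ratings.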

Patients’ impressions of change. At the second assessments, patients were asked to rate their perceived impressions of change in performance since their first assessments for the three motor tasks (where applicable). This was performed after completing all objective assessments. They rated change on a 15-point scale, where −7 reflected ‘a very great deal worse’, 0 reflected ‘no difference’ and +7 reflected ‘a very great deal better’.4, 18 This concept was intentionally left undefined. Patients did not view their videos.

Clinicians’ impressions of change. Clinicians’ impressions of change were determined from video clips in order to mimic the intended use of this assessment. A short video of each patient attempting or performing each of the three motor tasks (where applicable) was recorded at the time of the first and second assessment. Each video was between 30 s and 2 min duration and the angle of the camera and distance between the camera and patient were standardized. Aids and orthoses were used as required and were not standardized between repeat videos. Videos of unsupported sitting showed patients sitting on the edge of a treatment bed without back support, but with their feet on the ground. A research assistant sat in front of the patients while they attempted to reach in various directions. The research assistant challenged patients with reaching tasks at the limits of their abilities. Videos of transferring depicted patients transferring between a wheelchair and a treatment bed. Patients used slide boards if necessary. Videos of walking captured patients walking 10 m at a comfortable speed with orthoses, assistance and walking aids as required. During all recordings, a research assistant provided the patient with guarding, verbal cueing or physical assistance if necessary.

All videos were collated into pairs corresponding with the first and second assessment of each patient. Sixty pairs of videos were generated (120 videos in total). Two physiotherapists not associated with the involved spinal units and with >6 years SCI experience (and at least another 4 years physiotherapy experience) were asked to separately view and rate the pairs of videos. The video taken at the time of a patient's first assessment always appeared on the left of the screen with the second assessment on the right. The physiotherapists were aware of this ordering. Otherwise, the presentation of videos was random. The physiotherapists were asked to rate their ‘impressions of change’ between the pairs of videos using the same 15-point scale used by patients. They were instructed to take into account all aspects of a patient's clinical presentation apparent on the video and to provide a score reflective of their impressions of change based on clinical judgement. Similar to the patients’ ratings of impressions of change, this concept was intentionally left undefined. The two physiotherapists were not told the patients’ own ratings for impressions of change or the results of the objective outcome measures. They were told that patients had received standard in-patient care, but were neither provided with details about the type or extent of therapy, nor were they provided with patients’ medical or past histories. They were, however, told the time lapse since injury. The two clinicians’ impressions of change scores were averaged for each pair of videos for the analyses.

Data analysis

Non-parametric statistics were used throughout and all data are reported as medians and interquartile ranges. Data from each motor task were analysed separately. For each motor task, Friedman's tests were used to determine statistically significant differences between clinicians’ impressions of change, patients’ impressions of change and objective measures of change. Where differences existed, post hoc Wilcoxon's signed rank tests were used to examine statistically significant differences between:

  1. clinicians’ and patients’ impressions of change;
  2. clinicians’ impressions of change and objective measures of change;
  3. patients’ impressions of change and objective measures of change.

For all analyses, P-values <0.05 were accepted as significant. In addition, per cent close agreements (defined as ratings within two points of each other on the 15-point scale) between the two clinicians’ ratings from the video clips were calculated.
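For readers unfamiliar with the omnibus step, a minimal sketch of Friedman's test for three related measures is given below. It is not the authors' analysis code. It assumes the usual chi-square approximation with a tie correction (the text does not state whether an exact or approximate P-value was used), and the function names and example data are hypothetical. With three measures the statistic has two degrees of freedom, for which the chi-square upper tail probability is exp(-Q/2). The post hoc Wilcoxon's signed rank tests are not covered here.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_three_measures(rows):
    """Friedman's test for rows of (objective, clinician, patient) change scores.

    Returns (statistic, p_value). The p-value uses the chi-square
    approximation with 2 degrees of freedom (an assumption of this sketch).
    """
    k = 3
    n = len(rows)
    if n == 0 or any(len(row) != k for row in rows):
        raise ValueError("need at least one row of exactly three scores")
    rank_sums = [0.0] * k
    tie_term = 0
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
        tie_term += sum(t ** 3 - t for t in Counter(row).values())
    correction = 1 - tie_term / (n * (k ** 3 - k))
    if correction == 0:
        raise ValueError("every row is fully tied; statistic is undefined")
    statistic = (12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
                 - 3 * n * (k + 1)) / correction
    p_value = math.exp(-statistic / 2)  # chi-square upper tail for df = 2
    return statistic, p_value


# Hypothetical change scores for four patients (not study data).
example = [(1, 2, 3), (0, 2, 4), (1, 3, 5), (0, 1, 2)]
stat, p = friedman_three_measures(example)
print(stat, round(p, 4))  # prints 8.0 0.0183
```

With samples as small as this example, the chi-square approximation is rough and exact tables would normally be preferred.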

Results

The median (interquartile range) age and time since injury of the 30 patients were 37 years (25–56) and 3 months (2–4), respectively (see Table 1 for details). Clinicians’ and patients’ impressions of change are shown in Table 2 and Figure 1. There was a statistically significant difference between clinicians’ impressions of change, patients’ impressions of change and objective measures of change for all three motor tasks (P<0.01). Post hoc analyses revealed that both clinicians’ and patients’ impressions of change were significantly higher than objective measures of change for all three motor tasks (P-values <0.05). Patients’ impressions of change were also significantly higher than clinicians’ impressions of change for sitting unsupported (P=0.02), but not for transferring (P=0.08) or walking (P=0.06). There was good agreement between the two clinicians’ ratings of impressions of change taken from the videos. The per cent close agreement was 0.9, indicating that the two clinicians’ scores were within two points of each other 90% of the time.
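The close agreement figure can be illustrated with a short sketch. The ratings below are invented for illustration and are not the study data; the function name and the `tolerance` parameter are likewise hypothetical.

```python
def close_agreement(ratings_a, ratings_b, tolerance=2):
    """Proportion of paired ratings that lie within `tolerance` points of each other."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    close = sum(1 for a, b in zip(ratings_a, ratings_b) if abs(a - b) <= tolerance)
    return close / len(ratings_a)


# Invented ratings from two clinicians for ten video pairs (-7 to +7 scale).
clinician_1 = [3, 5, 0, -1, 6, 2, 4, 1, 7, 3]
clinician_2 = [4, 3, 1, 2, 6, 2, 5, 0, 6, 4]
print(close_agreement(clinician_1, clinician_2))  # prints 0.9
```

Here nine of the ten pairs differ by two points or fewer, giving a proportion of 0.9, which corresponds to 90% close agreement.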

Table 1 The characteristics of the patients
Table 2 Median (interquartile range) change in performance between time of first and second assessment for objective measures, clinicians’ impressions of change and patients’ impressions of change
Figure 1

Vertical box plot indicating 10th (lower error bar), 25th (bottom of box), 50th (horizontal line across box), 75th (top of box) and 90th (top error bar) percentiles and outliers (filled circles) of objective measures, clinicians’ impressions of change and patients’ impressions of change for the three motor tasks, namely sitting unsupported, transferring and walking. All pair-wise comparisons within each of the three motor tasks are statistically significant (P<0.05) with the exception of clinicians’ and patients’ impressions of change for transferring (P=0.08) and walking (P=0.06).

Discussion

The important finding of this study was that clinicians’ and patients’ impressions of change in motor performance were greater than the changes measured on objective outcomes. In addition, patients rated their own change higher than clinicians, although this was only statistically significant for sitting unsupported. The overall purpose of this study was to better understand the constructs captured in clinicians’ and patients’ impressions of change in motor performance with the ultimate aim of exploring the potential usefulness of these outcome measures for clinical trials. The use of videos for rating change may provide a way of capturing clinicians’ impressions of change while minimizing bias and maintaining assessor blinding.19

The two clinicians used in this study provided very similar ratings and scored within two points of each other 90% of the time. The occasional use of negative scores by the two clinicians indicated their willingness to rate observed change rather than expected change, although the clinicians’ awareness of the ordering of videos may have led them to overstate change.7 Future studies could explore this issue by manipulating without disclosure the way in which videos are presented for rating.7 This was not performed as part of this study because it would have invalidated the comparison between clinicians’ and patients’ ratings. It could be performed for clinical trials, although it would not be essential because knowledge about the ordering of videos will not systematically favour one group over another. It might, however, inflate impressions of change for both groups.

Differences between clinicians’ and patients’ impressions of change were not unexpected and could be due to a number of reasons. Presumably, clinicians use their clinical judgement and past experience to mould their expectations and to benchmark the progress of a patient.20 Experienced clinicians, such as those used in this study, have a good understanding of movement upon which to assess the quality, fluency and speed of performance. In contrast, patients have no prior experience with SCI and little exposure to what patients in similar situations typically achieve. Differences between clinicians’ and patients’ ratings may also reflect different value systems. For example, clinicians may place a higher value on quality of movement than ease of movement. Even if both placed the same value on quality and ease of movement, clinicians may not have been able to readily judge the effort associated with movement, and patients may not have been able to accurately assess the quality of their own movement. In the same way, clinicians have no direct experience of the effort or emotions associated with SCI rehabilitation and may, therefore, be less impressed with improvements than the patients themselves. Part of the difference between clinicians’ and patients’ ratings may also reflect the accuracy of patients’ recall. Some patients were asked to rate change over a 5-month period; patients may have had difficulty remembering their initial performance and may have been overly influenced by their performance on the second assessment.12 In contrast, clinicians viewed initial and final performance almost simultaneously. They were not required to remember change over time. Perhaps, therefore, patients should have also rated their change from videos. This might have generated different results; something that could be explored in future studies.

The higher ratings given by clinicians and patients than the values obtained on the objective outcome measures need to be interpreted with caution. The scales are clearly measuring different constructs and using different scoring systems. In addition, clinicians and patients may have been heavily influenced by their pre-conceived beliefs about treatment effectiveness.12 Alternatively, it may be that clinicians and patients intuitively apply a scaling system when asked to rate change. That is, they rate change after taking into account factors such as weight, age, spasticity, neurological status and expected outcome. Initial status may also influence clinicians’ and patients’ impressions of change.12 For example, Liang20 suggested that clinicians rate change lower in patients with poor initial function than in patients with high initial function. This type of scaling may prove to be problematic for trialists, but may also provide more scope for measuring responsiveness to treatment than traditional objective outcome measures.21 Of course the real issue for trialists is whether impressions of change provide a better way of distinguishing between improvements in motor performance between control and experimental participants than the more traditional objective outcome measures (for discussion about the statistical and methodological issues related to the use of impressions of change scales in clinical trials, see references6, 7, 20). The use of impressions of change in clinical trials would also require trialists to define a sufficiently important difference, that is, the minimal between-group difference required to warrant the time, cost and effort associated with an intervention. This concept is important because it shifts the emphasis in clinical trials from P-values to the size of treatment effects.3, 22 Obviously, the sufficiently important difference would depend on the circumstances, participants and intervention.3, 11, 18

Clinicians’ and patients’ impressions of change could also be used in other areas of research. For example, they could be used to define sufficiently important differences for objective outcome measures such as the Walking Index for SCI and Clinical Outcome Variables Scale. Impressions of change could be used as anchors to define worthwhile effects from the perspectives of the clinicians and patients.12, 18, 23 This and similar methodologies are increasingly used in different areas of medicine3, 24, 25 and pain management,11, 26 and are starting to be explored in SCI.27 However, this type of research in SCI requires large numbers of participants, some of whom need to show notable change over time.

Conclusion

Clinicians’ and patients’ impressions of change in motor performance may be a useful outcome measure for clinical trials. Independent clinicians blinded to interventions can rate impressions of change from videos, thereby minimizing bias. The potential usefulness of this type of outcome measure requires further validation before its widespread use in clinical trials can be advocated.