We thank Amiram Catz and colleagues for providing an opportunity to clarify any misunderstandings about our recent paper [1], which addressed how to better measure a treatment effect in a clinical trial after spinal cord injury (SCI). To increase the acceptance of the Spinal Cord Ability Ruler (SCAR) by clinical investigators, the tool evolved through Rasch analysis of current assessment tools that could be modified to track changes in voluntary motor activities during recovery from SCI [1]. Although it incorporates some transformed items from the Spinal Cord Independence Measure (SCIM), as well as upper extremity motor scores (UEMS), it was not intended to replace SCIM [2]. SCIM is a valuable tool for describing many body functions after SCI, but it is a multidimensional ordinal assessment and was not designed to be an objective primary endpoint for an SCI trial.

The ultimate tool for any trial measurement is a unidimensional interval scale in which the differences between scale scores are known and unchanging (i.e., linear). The clearest examples are measurements of time or distance. Understandably, such measures do not lend themselves to comprehensively describing body functions after SCI. Assessment of CNS trial endpoints has relied on subjective multidimensional ordinal scales (e.g., UEMS, SCIM), which were developed as descriptive classification tools and were not intended as clinical outcome measures: the differences between item scores are unequal, and scoring is often a subjective evaluation that depends on the expertise and experience of the evaluator. However, it is possible to transform ordinal data by methods such as maximum likelihood estimation algorithms (e.g., Rasch analysis) to measure and test an underlying unidimensional trait. Our goal was to develop a unidimensional measure for acute and sub-acute SCI trials that would enable investigators to repeatedly and reliably track one dimension of SCI recovery, volitional motor behavior, as a metric for the effectiveness of an interventional treatment.

We examined current assessment tools, recognizing that clinicians wish to use familiar scales. Through Rasch transformation, we were able to combine UEMS and items from SCIM that define voluntary task-specific activities of daily living (ADLs). UEMS can be a useful measure of voluntary motor capacity when SCIM activities cannot be performed (floor effect), such as early after cervical SCI. Because only neurological impairment (e.g., UEMS) can be measured at acute stages, whereas both UEMS and SCIM can be performed at later time points, SCAR was hypothesized to provide a more accurate measure of change in volitional performance along the entire recovery period. In addition, we hypothesized that voluntary ADLs involving muscle groups increasingly caudal to the neurological level of injury are more challenging to re-acquire. To maintain the unidimensionality of the SCAR measurement (i.e., focus on voluntary motor performance), we must exclude other dimensions of patient recovery that are dependent on involuntary autonomic neural inputs; this also avoids any value judgement in the relative weighting of multiple dimensions as to their importance in a person’s recovery from SCI. In short, one sometimes has to sacrifice descriptive breadth to linearly track a primary endpoint that can be repeatedly performed by a person participating in a clinical study.

SCAR is a unidimensional measure, as it tracks performance along a single latent trait (voluntary motor activity). Rasch analysis generates an interval scale that has been repeatedly validated in multiple large samples from ~3000 participants within the EMSCI (European Multicenter study on Spinal Cord Injury) database. The re-scoring of some SCAR items, according to Rasch analysis, is necessary to avoid disordered item responses, in which the item scale might not accurately reflect a clinically meaningful change in volitional performance. Indeed, previous Rasch analysis by Catz et al. [2] also identified SCIM item scoring disorder as a concern. Any single unit (delta) change along the transformed SCAR scale is of the same magnitude (as in a ruler). The probability of correctly or more highly scoring an item is an increasing function of the difference between a person’s ability and the difficulty of an item (i.e., monotonicity). More difficult items are scored by fewer people, those with greater functional recovery. SCAR items are sufficiently different to avoid redundancy (i.e., local independence) and scoring estimates are consistent across clinically meaningful subgroups (i.e., absence of differential item functioning).
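The monotonicity property described above can be illustrated with a minimal sketch. It assumes the simplest dichotomous form of the Rasch model, in which the probability of success depends only on the difference between person ability and item difficulty (both in logits). This is a simplification for illustration, not the analysis reported in [1]: SCAR items have multiple score categories, and the function name and all numerical values below are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch model: P(success) = 1 / (1 + exp(-(ability - difficulty))).

    Both arguments are on the same logit scale (hypothetical values here).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical person abilities (logits), from less to more recovered.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical item difficulties: an easier and a harder task.
easy_item = -1.0
hard_item = 1.5

easy_probs = [rasch_probability(a, easy_item) for a in abilities]
hard_probs = [rasch_probability(a, hard_item) for a in abilities]

for a, p_easy, p_hard in zip(abilities, easy_probs, hard_probs):
    print(f"ability={a:+.1f}  P(easy)={p_easy:.3f}  P(hard)={p_hard:.3f}")
```

Under these assumptions, the probability rises with ability for each item, the harder item is less likely to be achieved at every ability level, and only the difference between ability and difficulty matters, which is what gives the scale its ruler-like behavior.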

Catz et al. point to issues related to differential item functioning (DIF), inflation of Type I error, etc. Our evaluation of these for SCAR, using modern psychometric methods, provided assurance on the reliability and accuracy of SCAR as a unidimensional measure appropriate for use in clinical trials. No other currently used measure of improvement in SCI subjects has been shown to fit the Rasch model, let alone had its properties evaluated. SEM refers to the Standard Error of Measurement and is calculated as SEM = σ√(1 − r), where σ is the Standard Deviation (SD) of the change in SCAR from acute baseline to 6 months (the usual trial duration for many acute and sub-acute studies) and r is a measure of reliability, taken here as the person separation index.
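As a worked illustration of the SEM formula, the sketch below computes σ√(1 − r) for a given SD of change and reliability. The function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from [1].

```python
import math


def standard_error_of_measurement(sd_change: float, reliability: float) -> float:
    """Return SEM = sd_change * sqrt(1 - reliability).

    sd_change:   SD of the change score (e.g., baseline to 6 months).
    reliability: reliability coefficient in [0, 1], such as a person
                 separation index.
    """
    if sd_change < 0:
        raise ValueError("sd_change must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_change * math.sqrt(1.0 - reliability)


# Hypothetical inputs: SD of change = 10 scale units, reliability = 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
print(f"SEM = {sem:.2f} scale units")
```

With these hypothetical inputs the SEM is approximately 3 scale units. The formula also makes clear that the SEM shrinks toward zero as reliability approaches 1 and equals the SD when reliability is 0.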

Our initial SCAR article [1] should be consulted for further justification of our hypotheses. We believe it has addressed the majority of the concerns by Professor Catz and his colleagues who developed the SCIM tool. With an emphasis on providing an objective interval scale, directed to measuring a primary trial endpoint, SCAR serves an entirely different purpose to SCIM. We are working on publications related to details of the Rasch properties of SCAR and, as we have already stated, the value of SCAR, motor scores and/or SCIM requires independent prospective study evaluation.