Introduction

The development of a multidisciplinary approach to functional surgery on upper limbs in tetraplegics has typified the development of rehabilitation medicine for the last 30 years. The Fifth International Conference on Tetraplegia in Melbourne in 1995 underlined the importance of the evaluation of the results. Subsequent conferences and numerous surgeons' publications have confirmed this priority.1,2,3,4,5,6,7

Although many instruments or tests are used to assess outcome after surgery, their reliability, validity and responsiveness have not been adequately proven. Methodology appears to be the major failing of the various scales used to assess these patients. Most of the authors mentioned difficulties of evaluation and the absence of an ideal tool in this domain.6,7,8,9,10,11,12,13,14,15,16,17,18 The existing assessment tools are inadequate both in terms of their validity and in the absence of clear conceptual models forming the basis of their scales. There is limited documentation of the guiding framework or conceptualization. Furthermore, the process of item selection is often unknown. Scales or instruments are also deemed to be too insensitive to document the small but meaningful functional gains made by tetraplegics after functional surgery.

To answer the need for a specific assessment tool for tetraplegics who undergo functional surgery, we have developed a national, multicenter, prospective and longitudinal study based on the International Classification of Functioning, Disability and Health (ICF) recently revised by the World Health Organization.19 Within this model, the latest proposals of the Quebec Committee of the International Classification of Impairments, Disabilities and Handicaps (ICIDH) to improve the understanding of the disablement process have distinguished two concepts.20 The first concept relates to life habits, that is, activities of daily living and social roles recognized by the sociocultural context of a person according to age, sex, and social and personal identity. They include activities that should be accomplished on a daily basis (nutrition, fitness, personal care, communication, mobility, etc). Life habits presenting a significant level of disruption can create handicap situations. The second concept relates to motor capacities, which correspond to the abilities of a patient to perform basic and functional tasks regardless of contextual factors (environmental and personal factors). For this reason, the Motor Capacities Scale (MCS) does not address basic activities of daily living (ADL) such as eating, dressing, bathing, etc. Performance in ADL is largely evaluated in clinical settings by tools such as the Functional Independence Measure (FIM),21 the Quadriplegia Index of Function (QIF)22 or the Spinal Cord Independence Measure (SCIM),23 not specifically designed for tetraplegics. The purpose of the MCS is to focus on the elementary motor abilities required to achieve ADL. It does not take into account contextual factors. In the context of tetraplegia, tasks identified as consistent with this concept were transfers, repositioning, locomotion, spatial exploration, grasping and gripping. They all involve hand and upper-limb motor abilities. They are gathered in the same scale in order to provide the examiner with a total score reflecting the whole contribution of functional surgery.

In accordance with this latter concept, requirements for an appropriate assessment tool of motor capacity were as follows: a scale in conformity with the predefined concept of motor capacity, a scale specifically designed for C4–C7 tetraplegics who undergo functional surgery, items suggested by tetraplegics themselves or by staff competent in the management of tetraplegics, items that are likely to be sensitive to functional surgery, and satisfactory metrological properties.

A steering committee including 12 rehabilitation doctors, 12 occupational therapists, three surgeons, one statistician and two experts was formed. A list of tasks referring to daily living tasks was assembled from various sources: observation of and interviews with patients, review of existing literature, existing scales, and discussions with occupational therapists and other physicians. A total of 300 items were classified into either a group of 80 items of motor capacities or a group of 220 items of life habits. Two experts – a rehabilitation doctor and an occupational therapist – were asked to verify the clarity of description of each item, its proper assignment to the chosen group and category, and the due representation of items in each category. A qualitative agreement was reached by the experts and the steering committee reducing the ‘items of motor capacities’ group to 60 items. The resulting MCS includes six functional categories, each with a different number of tasks: transfers, repositioning on Bobath's couch,24 repositioning on wheelchair seat, locomotion in a manual wheelchair and in an electric wheelchair, motor capacities of spatial exploration and motor capacities for grasping and gripping. Functional categories were defined at the request of both experts.

Methods

Population

Inclusion criteria were the following: adults, complete motor tetraplegia, C5–C7 level – AIS A or B25, at least 3 months post spinal cord injury, at least 3 months post surgery, functional surgery on upper limbs or not. Subjects with psychiatric disorders, cognitive disorders, poor French or an unstable medical status were excluded. Demographic data, such as age, gender, level of education graded 1–7 and professional status, were collected. Neurologic data were obtained at the time of the evaluation, using the standards of the American Spinal Injury Association (ASIA).25 Motor examination consisted of manual muscle testing of 10 paired myotomes, using a six-point scale ranked from 0 (total paralysis) to 5 (normal active movement). By adding the scores of the 20 muscles, an ASIA total motor score ranging from 0 to 100 was obtained. Description of the motor level complied with the International Standards for Neurological and Functional Classification of Spinal Cord Injury.25 Motor level corresponds to the lowest normal myotome. Upper limbs were categorized by the International Classification for Surgery of the Hand in Tetraplegia (ICSHT).26 This classification takes into consideration the remaining active muscles graded at least 4/5 in the forearm and the hand (Table 1). For people who underwent functional surgery, the main procedures were mentioned. The time lapse until the follow-up examination was specified.

Table 1 International classification for surgery of the hand in tetraplegia

Procedures

In a preliminary study, the MCS was submitted to occupational therapists and to 40 tetraplegics for criticism. Based on empirical results, a further reduction in the number of items followed, leading to the selection of the 49 most pertinent items. The following three experimental stages were proposed:

  • An open study, aimed at studying the feasibility and the acceptability of the scale via a formal protocol of evaluation.

  • An intermediate study, carried out to assess inter-rater reproducibility, that is, the extent to which the scale is free from random error.

  • A prefinal study, focused on construct validity relating to grouping and scaling properties.

At each stage, the scale was applied to different patients who had undergone functional surgery and to patients who had not. At the meeting with each patient, demographic information was collected first and then a physical examination was conducted. Assessment was performed by occupational therapists on the basis of an external evaluation and direct observation. A score, ranging from 1 to 5, was assigned for each task in the first four domains – transfers, repositioning on Bobath's couch, repositioning on wheelchair seat and locomotion. For spatial exploration and for grasping and gripping, a two-point and a four-point scale were chosen, respectively. A total score was calculated by summing the subscores of each functional category. Standardized and codified instructions for assessing the MCS were available.
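The scoring rule described above (item scores summed into a subscore per functional category, then subscores summed into a total) can be sketched as follows. This is only an illustration: the category keys, the number of items per category and the example scores are invented, and the numeric coding of the two-point and four-point scales is an assumption, since the text does not specify it.

```python
# Hypothetical category keys; the four domains below are scored 1 to 5 per task.
FIVE_POINT = {
    "transfers",
    "repositioning_couch",
    "repositioning_wheelchair",
    "locomotion",
}


def mcs_total(scores):
    """Return (subscores, total) for a dict mapping category -> list of item scores."""
    subscores = {}
    for category, items in scores.items():
        # Only the 1-5 range stated in the text is checked; other codings are not.
        if category in FIVE_POINT and any(not 1 <= s <= 5 for s in items):
            raise ValueError(f"{category}: scores must lie between 1 and 5")
        subscores[category] = sum(items)
    return subscores, sum(subscores.values())


# Invented scores for one patient, for illustration only.
example = {
    "transfers": [3, 4, 2],
    "repositioning_couch": [5, 5],
    "repositioning_wheelchair": [4, 3],
    "locomotion": [2, 1],
    "spatial_exploration": [1, 0],   # two-point scale; 0/1 coding is an assumption
    "grasping_gripping": [3, 2, 1],  # four-point scale; coding is an assumption
}
subscores, total = mcs_total(example)
```

Keeping the subscores alongside the total makes it possible to report each functional category separately as well as the overall score.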

Data analysis

Results were entered into an Access database by a single person – a rehabilitation doctor – in order to avoid misinterpretations or data-entry errors.

Demographic data

Quantitative variables were summarized by arithmetic mean, extreme values and standard deviation (SD). Qualitative variables were summarized by a description or by frequency histograms.

Open study

A small sample of 33 tetraplegics (23 of whom had undergone surgery) was assessed to check for any misunderstanding. In this pilot study, analysis was performed item by item and was only qualitative, relating to redundancies, unclear items and to the comprehensibility of scoring and instructions.

Intermediate study (Table 2)

Table 2 Intermediate study (motor capacities items)

In all, 30 tetraplegics were included (Table 3). Of these, 10 had undergone functional surgery of the upper limbs. Normality of the distribution of item responses was evaluated by the Kolmogorov–Smirnov test. The limit of significance of the test was 5%. For assessing inter-rater reliability, each subject was assessed twice, by two different raters, with an interval of 24 h between the two evaluations. This interval was chosen to avoid variations in clinical status and to avoid the patients remembering previous answers. Inter-rater reproducibility was calculated for each item using an intraclass correlation coefficient (ICC). It was deemed that an ICC of less than 0.70 should result in the elimination of the item, and that an ICC greater than 0.80 should retain the item. A discussion was opened for items whose ICC was greater than 0.70 and lower than 0.80.
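As an illustration of the per-item reproducibility analysis, the sketch below computes an ICC for one item scored by two raters and applies the decision thresholds given above. The text does not state which form of the ICC was used; the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is assumed here. The function names and the example ratings are invented, and an ICC of exactly 0.80 is treated as "retain", which is also an assumption.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of rows, one row per subject, one score per rater.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def item_decision(icc):
    """Map an ICC to the action described in the text (boundary handling assumed)."""
    if icc < 0.70:
        return "eliminate"
    if icc >= 0.80:
        return "retain"
    return "discuss"


# Invented scores: three subjects, each scored once by rater A and once by rater B.
item_scores = [[1, 2], [3, 3], [5, 4]]
icc = icc_2_1(item_scores)
decision = item_decision(icc)
```

Other ICC forms (for example, consistency rather than absolute agreement) would give different values for the same data, so the choice of form should be reported alongside the coefficient.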

Table 3 Demographic and clinical characteristics

Prefinal study

The complete and administered version of the prefinal scale is presented in Table 4. In total, 52 tetraplegics were included in this study. Demographic and clinical characteristics are displayed in Table 3.

Table 4 Prefinal version of MCS

Normality of the distribution was also evaluated by the Kolmogorov–Smirnov test. Inter-rater reliability was calculated again for each item and for the total score using an ICC. The agreement between both raters was also assessed using Bland and Altman's method. In this method, the difference between two scores is plotted against the average of the same two scores for each patient. It shows both the differences observed and whether the difference is related to the score. ICC and Bland and Altman's method were demonstrated to provide complementary information on reliability.27 Item-to-item correlation, that is, the correlation of a single item score with another item score, was also calculated to identify redundancies, using Spearman rank-correlation analysis. We considered that values above 0.80 suggested a high level of redundancy and should lead to a fusion of both items. Variance analysis consisted of calculating the part of the variance of the total score explained by each of the functional domains.
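The Bland and Altman computation described above can be sketched in a few lines: for each patient, the difference between the two raters' total scores is paired with their average, and the mean and SD of the differences summarize the agreement. The 95% limits of agreement (mean ± 1.96 SD) are the usual companion to this plot but are not mentioned in the text, so they are included here as an assumption. The scores are invented.

```python
from statistics import mean, stdev


def bland_altman(rater_a, rater_b):
    """Return plot coordinates and summary statistics for two paired score lists."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    means = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    bias = mean(diffs)   # mean of the differences
    sd = stdev(diffs)    # sample SD of the differences
    return {
        "points": list(zip(means, diffs)),  # (x, y) pairs for the plot
        "bias": bias,
        "sd": sd,
        # Conventional 95% limits of agreement; not reported in the text.
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
    }


# Invented total scores from two raters for four patients.
result = bland_altman([100, 120, 90, 110], [102, 118, 93, 109])
```

Plotting `points` and checking for a trend in the differences across the range of means is what reveals whether disagreement depends on the score level.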

Since no pre-existing ‘gold standard’ was available, construct validity was studied. This was intended to test two convergent and two divergent hypotheses against other measures involving, respectively, convergent and divergent concepts. Correlations were calculated using the nonparametric Spearman's rank-correlation coefficient (r), because a normal distribution could not be demonstrated for the studied parameters. Hypotheses were confirmed when P was <0.01. Correlation was considered to be excellent if r>0.91, good if 0.71<r<0.90, moderate if 0.51<r<0.70, low if 0.31<r<0.50 and null if r<0.30.28
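A minimal sketch of the rank correlation and its qualitative grading follows. Spearman's coefficient is computed as the Pearson correlation of average ranks, which handles ties. The grading bands come from the text, but they leave small gaps (for example between 0.90 and 0.91), so the sketch rounds to two decimals and applies the bands to the absolute value of r; both choices are assumptions. No P value is computed here.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def grade(r):
    """Qualitative label for r using the bands quoted in the text."""
    value = round(abs(r), 2)  # rounding and abs() are assumptions
    if value >= 0.91:
        return "excellent"
    if value >= 0.71:
        return "good"
    if value >= 0.51:
        return "moderate"
    if value >= 0.31:
        return "low"
    return "null"


# Invented paired scores, for illustration only.
r = spearman([10, 20, 30, 40, 50], [12, 25, 31, 47, 46])
label = grade(r)
```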

Four criteria were identified for the elimination of some items: (1) a variance of the item equal to 0, (2) a low reproducibility – ICC lower than 0.70, (3) a high level of redundancy studied by item to item correlations – higher than or equal to 0.80 and (4) a low level of comprehension.
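The four elimination criteria can be expressed as a simple per-item check, sketched below. The function and argument names are hypothetical, the comprehension criterion is represented by a yes/no flag because the text treats it qualitatively, and the example values are invented.

```python
def elimination_reasons(variance, icc, max_item_correlation, well_understood):
    """Return the list of elimination criteria that an item meets (empty if none)."""
    reasons = []
    if variance == 0:
        reasons.append("zero variance")            # criterion (1)
    if icc < 0.70:
        reasons.append("low reproducibility")      # criterion (2)
    if max_item_correlation >= 0.80:
        reasons.append("redundancy")               # criterion (3)
    if not well_understood:
        reasons.append("low comprehension")        # criterion (4)
    return reasons


# Invented item statistics: a sound item and a redundant one.
sound_item = elimination_reasons(1.2, 0.85, 0.50, True)
redundant_item = elimination_reasons(0.9, 0.92, 0.86, True)
```

Returning the reasons rather than a single yes/no keeps the final decision with the reviewers, who may choose to merge a redundant item with its partner instead of dropping it.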

Results

Open study

At this stage, the number of motor capacities items, initially 49, was adjusted to 53. Clarifications were included in the written description of the items and in the formulation of the instructions. The duration of measurement, about 50 min, was considered too long.

Intermediate study

Distribution of the total score of both groups (operated group and nonoperated group) was found to be normal, with P values of over 0.999 and 0.7201, respectively. A reproducibility analysis showed that 39 items had an ICC higher than or equal to 0.80. All but four were retained. The removed items concerned spatial exploration of upper limbs and were not deemed sensitive enough to functional surgery by the majority of examiners. In all, 10 items had an ICC between 0.70 and 0.79. They were considered of low pertinence, especially items relating to interdigital grasp, and were eliminated. Four items had an ICC lower than 0.70. Three of them were eliminated; item 19 was modified and retained. None of the items had an ICC less than 0.6. The ICC for the total score was 0.99. The result was a further reduction of the items relating to motor capacities (36 items). Many modifications were made in the written description of the items, in scoring and in the instructions (such as by the addition of illustrations). The duration of the evaluation was estimated at between 20 and 50 min.

Prefinal study

Descriptive results for patients' scores are in Table 5. Total scores for motor capacities were found to be normally distributed. When both groups (operated and nonoperated) were considered separately, item responses were also normally distributed (P=0.821). There were no ceiling or floor effects.

Table 5 Descriptive results of patients' scores in the prefinal study

The ICC was greater than 0.75 for all items. For the total score, the ICC of 0.99 was excellent. The 95% confidence interval was 0.974–0.995. Bland and Altman graphic representation was applied to the scores of the 52 patients (Figure 1). The distribution of the differences was homogeneous with no systematic trend. The differences of results between both evaluations did not depend upon the mean score of motor capacities.

Figure 1 Bland and Altman's graphic representation of the reliability of the MCS (means of differences, −1.6±7.35)

In items and categories analysis, six pairs of items were found to be redundant: items 2–3, items 6–7, items 8–9, items 8–10, items 9–10, items 31–32. The correlation matrix between domains is reported in Table 6. Functional domain to functional domain correlations revealed substantial correlations (over 0.8) between the first three domains: transfers, repositioning on Bobath's couch and repositioning on wheelchair seat. The part of the total variance explained by each functional domain was homogeneous, at around 64%.

Table 6 Correlation matrix of all measured variables (prefinal study)

On the basis of the criteria of elimination mentioned above, no item was affected by criteria (1), (2) and (4). Consequently, redundancies revealed by item to item correlation led to the fusion of items 2 and 3 into one item, and of items 8 and 9 into one item. Item 31 was retained. Item 10 was removed. At the request of the steering committee, items 25 and 32 were removed because of a low pertinence and items 6 and 7 were not merged because of the specific role of each. Apparent and content validity were deemed good because the scale was originally developed by an experienced multidisciplinary team and was enhanced by the input of two independent experts. Items were well accepted by the patients. Exploratory factor analysis was not possible since the sample of patients was too small with respect to the number of items. Hypotheses were tested against other measurements. We hypothesized that:

  1. ‘The total score of motor capacities is correlated with the Sollerman score of prehension’.29

  2. ‘The total score of motor capacities is correlated with the global ASIA motor score’.25

  3. ‘The total score of motor capacities is not correlated with the level of education’30 (Table 7).

  4. ‘The total score of motor capacities is not correlated with the interval of time between the assessment and the spinal cord injury’.

Table 7 Level of education

Expected convergent and divergent validities were observed. Correlations are shown in Table 8. The scale was strongly correlated with the Sollerman test and the ASIA motor score. Conversely, very low correlations were observed with the level of education and with the interval of time between accident and assessment. Motor capacities items were reduced to 31 items, which constitute the scale for the next and final study.

Table 8 Convergent and divergent validity (prefinal study)

Discussion

The evaluation of arm and hand function is of utmost importance before and following hand surgery in tetraplegics. A standardized test is needed as a feedback for the surgeon to rectify some procedures or to adjust others. The conceptual models underlying the evaluation are all too often unspecified.

In the evaluation of hand function, Wuolle et al15 stress the difficulties in distinguishing between what relates directly to the hand and what relates to the participation of the whole upper limb and the trunk. They also mention the lack of pertinence of selected tasks for tetraplegics and recommend the use of standardized tools with strict instructions giving less scope for examiners to neglect rigorous methodology in their administration. Lo et al18 agree with Wuolle, stating that tests assessing hand function separately are more sensitive to change than tests that include hand function in the assessment of the whole upper limb and the trunk. The absence of control groups is presented as another methodological failing.6 Harvey et al31 stress the importance of such groups for better classification into good, medium or bad results, since no study has yet quantified the level of hand function attained by a large and representative cohort of tetraplegics. As far as the modalities of administration and the surgical procedures are concerned, some authors draw attention to the difficulties in comparing results, because the modalities of the assessment are either not described or not standardized, or because populations and surgical procedures are different. Moreover, the case series represented in the publications are often small, made up of fewer than 20 subjects. Given these small samples, some reservations should be expressed with regard to the conclusions. Each team has its own habits of assessment. No single test seems to emerge from the numerous tests, and no tool has been established by usage. Evaluation is often qualitative, reduced to the presentation of examples of functional improvements.

Assessment of hand motor function in the literature is mainly represented by dynamometric measures. Instruments, conditions of administration and the position of the upper limb are not often described, preventing any comparison and giving a high level of variability to the results. Lamb et al32 built up a specific assessment of hand function in tetraplegics, which evaluates the manipulation of objects of different size, weight and shape. Scoring and conditions of administration are unclear. Lamb's scale has never been validated and was only used by the author himself and by Filipetti et al33 in an unvalidated French form.

Other examiners favored the evaluation of precision. Van der Linde et al34 assessed the key-grip by asking the patient to draw figures using an electronic pen and a digitizer. Movement velocity and dysfluency (ie the number of velocity changes per centimeter) were measured before and at several months after surgery. On the basis of five operated hands, Van der Linde concludes that this evaluation is sensitive to change and suggests that other factors could participate in the modification of the performances, such as deep sensibility, muscle coordination and others. Appreciation of precision could be helpful if, as suggested by some authors,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36 surgery seeks to provide the patient with a strong grasp in one hand and a sure key-grip in the other, even if the demonstration of a correlation between motor performances and functional results was never made.11,12,13,14,15,16,17,18

The results of restoration of elbow extension are generally expressed in terms of active range of motion,37,38,39,40 muscular strength,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44 isometric muscular force and torque measurements of power.39,40,41,42,43,44,45,46 None of the publications deals with the position of the upper limb. Bottero et al47 and Revol et al40 assessed isokinetic muscle strength of elbow extension through three movements of flexion/extension of the elbow in a sagittal plane, from 0° to 120° of flexion. The measure is considered reliable even though samples are small. The results suggest that the mean flexion torque is very low after biceps-to-triceps transfer, especially at the end of the movement, and remains acceptable after deltoid-to-triceps transfer. All these procedures of assessment are time-consuming and require a high level of technicality.

Besides the evaluation of hand motor function, there are many generic functional tests assessing hand function. In 1995, Sollerman evaluated 59 tetraplegic patients before surgery and found a good correlation with the ICSHT.29 He used a standardized test based on seven of the eight most common hand grip patterns and made up of 20 tasks for daily living. The originality of the Sollerman test relies on the scoring of the time of execution of each task. The duration of administration is too long (60–90 min). The test requires special and configured equipment. Despite its high level of reliability, this test has never been shown to have sufficient discriminative capacity in patients with weak muscular potentialities.15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48 The ceiling effect is important probably because of the modalities of scoring. Patients who cannot complete a subtest before surgery receive the same score (zero) since both times were over 60 s.15

Others10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49 used the Jebsen Taylor test, one of the first standardized tests of hand function, published in 1969. Stroh Wuolle et al15 found that it was not pertinent for tetraplegic patients for different reasons: low sensitivity to important change in hand function, low relevance of the tasks for tetraplegics, inadequate instructions for tetraplegics, too high sensitivity to additional factors other than those directly related to hand function.

The Grasp Release Test (GRT) is the only hand assessment test tailored to tetraplegics. It was initially developed to measure changes in hand function following the implantation of a hand neuroprosthesis in individuals with C5 and C6 tetraplegia.15 Six objects varying in size and weight have to be manipulated, three with lateral prehension (peg, paperweight and fork) and three with palmar prehension (block, can and videotape). In the test, users grasp, move and release each object as many times as possible in five 30 s trials for each object, with or without the neuroprosthesis. Data from a small sample of patients suggested that performance with the neuroprosthesis was above the baseline. Metrological properties of the GRT have never been studied. With a hand neuroprosthesis in view, the designers of the GRT have understandably favored the assessment of the hand individually and independently of the motor control of the shoulder and the elbow.15 The pertinence of this approach is unfortunately low for tetraplegics who benefited from a restoration of elbow extension or from a complete surgical program involving both elbows and both hands. The drawback of such a reduced approach to hand assessment is that it may prevent the assessor from displaying the benefits of each stage of the surgical program as they arise. Our hypothesis is that hand surgery may be helpful for the use of the whole upper limb. Conversely, restoration of elbow extension may improve the use of a tetraplegic hand that was not operated on. Our daily experience has taught us that it is not within the scope of surgery to restore primary ADL capabilities but rather elementary motor abilities that participate in performing ADL capabilities, with the help of compensation and the previous experience of the patient.

As two groups of patients (operated and nonoperated) with a complete tetraplegia were consulted from the beginning of the study from a surgical perspective, the hypothesis is made that items were subjected to the ‘approval’ of people who expressed the wish to improve their upper-limb abilities. In any case, the process of elimination of items should contribute to excluding those items with little relationship to upper-limb function. In our opinion, the contribution of each stage of functional surgery should not lead to the assessment of the operated segment alone, but preferably to that of the whole upper limb. Furthermore, in tetraplegics, hand and upper-limb function are not limited to manual dexterity, but play an important role in some tasks such as transfers and locomotion. All these tests are regularly used, but fail to evaluate the small differences that might exist. Furthermore, the lack of reliable and valid measurement tools is one of the major shortcomings that prevent an objective assessment of the results of functional tests.

The MCS was conceptualized and developed to fill this gap, in accordance with rigorous methodology and metrological specifications. The influence of lower-limb dysfunction is a reality that cannot be ignored and that concerns any evaluation of the upper-limb function. It is the reason why the MCS takes into account all the possible benefits of hand surgery by assessing various aspects of hand and upper-limb performance in basic tasks such as transfers, repositioning, locomotion, spatial exploration, grasping and gripping. The MCS is intended to measure the improvement of basic abilities required to achieve ADL.

The pilot study was of prime importance to check the acceptability of the whole scale and of each item considered individually. Adaptations to the scale were made following the results of the pilot study. In the following stages of the study, the MCS displayed good metric properties. Apparent and content validity were satisfactory. The MCS's overall rating score had no floor or ceiling effects. The MCS, in its prefinal version, has demonstrated statistically high inter-rater repeatability. The 24 h interval between administrations of the scale minimizes the rater's recollection of prior answers and provides a realistic view of the variability of changes in responses that may occur for nonspecific reasons. Furthermore, the number of items was high and would have made their memorization difficult. Construct validity, which is the major criterion of validity, was found to be partly good. We developed a series of hypotheses based on our daily practice. Overall rating scores of the MCS were expected to be significantly correlated with the Sollerman test and the ASIA motor score (AMS). As in the Sollerman test, tasks are not subjected to the ‘pressure’ of contextual factors and are related to basic motor abilities. The strong relationship between the MCS and the Sollerman test emphasizes the external validity of the MCS within the applied theoretical framework.

The scale was well accepted by patients and not unduly difficult for assessors. As the testing procedure was designed for French habits, some or all the tasks and scoring rules may need to be adjusted for other countries and cultures. Metrological properties were good enough to allow the final study.

Conclusion

With a view to ensuring that personal and environmental factors do not interfere with the interpretation of the results, the separation of two concepts as described by Fougeyrollas et al20 has appeared to be helpful. The MCS was designed for tetraplegics and on the basis of their experience before and after surgery on their upper limbs. The process of item reduction has led to a specific scale of motor capacities that definitely excludes any participation of contextual factors. The data showed that the MCS had excellent construct validity and repeatability.

In the next stage, factorial analysis and sensitivity to change will be studied on a greater number of patients, with the hope that the MCS will be a valid means of assessment of the effectiveness of upper-limb functional surgery in tetraplegics. As the validation of a scale relating to handicap situations is in progress, we trust that both scales will provide surgeons and rehabilitation teams with complementary information about the physical and the functional benefits for tetraplegic patients who have undergone functional surgery.