Introduction

The International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI) are at present the gold standard for the neurological evaluation of persons with spinal cord injury (SCI) [1, 2]. This grading system allows the definition of lesion level and severity and a description of special syndromes (anterior spinal cord syndrome, Brown-Sequard syndrome etc.) [1]. The ISNCSCI represent a common language among all SCI professionals and constitute the main prognostic factor after a traumatic SCI. An early evaluation with the ISNCSCI (i.e., within 72 h after an SCI) is related to the neurological and functional status at 1 year after a traumatic lesion and, consequently, can be used to assist in discussing the chances of recovery with the patients and to optimize resource allocation during and after the acute phase of treatment [3,4,5,6,7]. Furthermore, the ISNCSCI are widely used in the research setting both as an evaluation tool and as an outcome measure for clinical trials evaluating the efficacy of new therapeutic interventions for patients with SCI [3, 4].

Since 1982 there have been several versions of the ISNCSCI, and all of them have been shown to be valid, reliable and repeatable in patients with traumatic SCI [1, 3,4,5,6,7,8,9,10,11,12,13,14]. Consequently, the use of the ISNCSCI has been endorsed by the International Spinal Cord Society and the American Spinal Injury Association [15].

Non-traumatic spinal cord lesions represent a heterogeneous group of pathologies with different presentation and evolution and are progressively becoming more frequent and relevant in Western countries. Although the epidemiology of non-traumatic SCIs is not precisely known due to the paucity of dedicated studies, the incidence is estimated to be between 6 and 76 new cases per million per year [16, 17]. In some studies [18, 19], non-traumatic SCIs represent up to 60% of all new admissions for rehabilitation. Furthermore, non-traumatic SCIs include patients with different aetiologies (inflammatory, neoplastic, degenerative, and ischemic) [16] that could show different clinical characteristics and different evolution over time. The ISNCSCI are also widely used for the evaluation and prognosis prediction of persons with non-traumatic spinal cord lesions [18, 20, 21], although there are no studies specifically evaluating their psychometric qualities in this population [22].

The aim of this study is to evaluate the psychometric characteristics of the ISNCSCI in a population of persons with non-traumatic SCI.

Methods

All patients with non-traumatic SCI consecutively admitted to three Italian SCI centers (IRCCS Fondazione Santa Lucia, Montecatone Rehabilitation Hospital and Istituti Clinici Scientifici Maugeri IRCCS of Pavia) between January 1st 2017 and June 30th 2020 were prospectively enrolled in the study.

The study was registered at ClinicalTrials.gov with the identifier NCT04949763.

The study was approved by the ethics committee of IRCCS Fondazione Santa Lucia and all the patients signed an informed consent form for the study.

Inclusion criteria were: having a non-traumatic SCI in the acute/subacute phase with any level and severity (ASIA Impairment Scale (AIS)) of injury and having a cognitive status that allows collaboration in the exam.

Exclusion criteria were: the presence of dementia or cognitive decline; a pathology of the peripheral nervous system that may affect the ISNCSCI evaluation; and multiple sclerosis.

The following data were prospectively recorded:

  • Demographic and clinical history data. Concerning the onset of the lesion, for the ischemic and inflammatory groups reference was made to the appearance of the first symptoms, while for the neoplastic and spondylogenetic myelopathies we referred to the date of surgical intervention, which is usually accompanied by a worsening of the clinical picture.

  • Evaluation of neurological conditions according to the ISNCSCI (Revision 2015) [23] with registration of right and left motor and sensory level and of the Neurological Level of Injury (NLI), of the total motor score (MS), of upper extremities (UEMS) and lower extremities (LEMS) motor scores, light touch and pin prick sensory scores, and AIS. This assessment was carried out by two different experienced examiners (Table 1) in each center, 48–72 h apart. One of the two examiners also assessed the functional status of the patients through the Spinal Cord Independence Measure (SCIM) version 2 or 3 [24].

    Table 1 The table shows how many patients were enrolled in each center and the level of experience of the examiners who performed the ISNCSCI.
  • The patients were evaluated at admission with the possibility of repeating the evaluation also during rehabilitation stay and at discharge.

Statistics

Descriptive statistics: mean and standard deviation (SD) for quantitative data; frequencies and percentages for qualitative ones. Normality of data was assessed by the Shapiro–Wilk test. The NLI and the AIS grade were transformed into numbers and treated as ordinal variables. For the NLI, level C1 corresponds to 1 and level S4–5 to 29. For the AIS, grade A corresponds to 1 and grade E to 5.
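The ordinal coding described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the source gives only the endpoints (C1 = 1, S4–5 = 29; A = 1, E = 5), so the assumption here is that the intermediate segments are ranked in anatomical order. The names `SEGMENTS`, `encode_nli` and `encode_ais` are hypothetical.

```python
# Hypothetical helper names; intermediate ranks are assumed to follow
# anatomical order between the endpoints stated in the text.
SEGMENTS = (
    [f"C{i}" for i in range(1, 9)]      # C1..C8   -> 1..8
    + [f"T{i}" for i in range(1, 13)]   # T1..T12  -> 9..20
    + [f"L{i}" for i in range(1, 6)]    # L1..L5   -> 21..25
    + ["S1", "S2", "S3", "S4-5"]        # S1..S4-5 -> 26..29
)

NLI_TO_ORDINAL = {seg: rank for rank, seg in enumerate(SEGMENTS, start=1)}
AIS_TO_ORDINAL = {grade: rank for rank, grade in enumerate("ABCDE", start=1)}


def encode_nli(level: str) -> int:
    """Map a neurological level label such as 'T4' to its ordinal rank."""
    return NLI_TO_ORDINAL[level.strip().upper()]


def encode_ais(grade: str) -> int:
    """Map an AIS grade 'A'..'E' to 1..5."""
    return AIS_TO_ORDINAL[grade.strip().upper()]
```

With this coding the distance between two levels is simply the absolute difference of their ranks, which is what allows disagreements to be counted in segments.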

Validity and reliability represent the main psychometric properties of measurement instruments. The validity of an instrument means that it measures what it is intended to measure [25], while reliability refers to its stability over time [26].

Different aspects of reliability were assessed with appropriate tests: correlation (Spearman), test-retest reliability (Krippendorff’s Alpha), and internal consistency (Cronbach’s Alpha) [27, 28]. For motor and sensory scores, we also compared the data of the two examiners by means of the Wilcoxon matched-pairs test to evaluate whether there was any significant difference.
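As an illustration of the internal-consistency statistic named above, the textbook formula for Cronbach's alpha is α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The sketch below implements that formula with the Python standard library only. It is not the SPSS procedure used in the study, and the source does not state which scores were entered as items, so the input layout and the function name `cronbach_alpha` are assumptions.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from k items scored on the same n subjects.

    `items` is a list of k lists; items[i][j] is the score of subject j
    on item i. Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("every item needs the same number (>= 2) of subjects")

    totals = [sum(scores) for scores in zip(*items)]  # per-subject total score
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total score has zero variance")

    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

When all items rank the subjects identically, the sum of item variances is small relative to the variance of the total and alpha approaches 1.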

As to the levels of injury (NLI, left and right sensory and motor level of injury) and AIS grade, the agreement between the two examiners was assessed through Krippendorff’s Alpha. Furthermore, for the assessment of the levels, we compared the levels established by the two examiners by counting the difference (1 level, 2 or more levels) in cases where assessments differed.
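The counting of discrepancies described above can be sketched as below, assuming the levels have already been converted to ordinal ranks (C1 = 1 … S4–5 = 29). The function name `level_discrepancies` and the category labels are hypothetical, and rounding percentages to whole numbers is an illustrative choice.

```python
from collections import Counter


def level_discrepancies(rater_a, rater_b):
    """Tabulate paired level assessments as same / 1 level / 2+ levels apart.

    Both arguments are sequences of ordinal level ranks, one entry per
    paired assessment. Returns {category: (count, rounded percentage)}.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must have the same number of assessments")
    if not rater_a:
        raise ValueError("no assessments supplied")

    counts = Counter({"same": 0, "one_level": 0, "two_or_more": 0})
    for a, b in zip(rater_a, rater_b):
        diff = abs(a - b)
        if diff == 0:
            counts["same"] += 1
        elif diff == 1:
            counts["one_level"] += 1
        else:
            counts["two_or_more"] += 1

    total = len(rater_a)
    return {key: (n, round(100 * n / total)) for key, n in counts.items()}
```

For example, `level_discrepancies([1, 5, 10, 20], [1, 6, 12, 20])` reports two identical pairs, one pair differing by one level and one differing by two or more.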

We evaluated the psychometric properties of the AIS in the whole sample and in each pathology subgroup.

As currently there is no gold standard for the neurological evaluation of persons with SCI other than the ISNCSCI, we evaluated the convergent construct validity of the Standards through a Spearman correlation between the total MSs, the upper and lower extremities MSs and the total SCIM score as well as the subscores “Self-care” and “Mobility”.
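For readers unfamiliar with the statistic, Spearman's correlation is the Pearson correlation computed on ranks, with tied values receiving the average of their ranks. The standard-library sketch below illustrates this; it is not the SPSS routine used in the study, it does not compute a p-value, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

In this setting `x` would hold a motor score (for example the LEMS) and `y` the matching SCIM score or subscore for the same assessments.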

According to Landis and Koch [29], we interpreted ICC values and the level of agreement by Kappa-values as follows:

0–0.1: virtually none

0.1–0.4: slight

0.41–0.6: fair

0.61–0.8: moderate

0.81–1: substantial
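The bands above can be applied programmatically as in the sketch below. Two points are assumptions rather than statements from the source: each upper bound is treated as inclusive (the listed ranges touch at 0.1 and leave small gaps such as 0.40–0.41), and values outside 0–1 are rejected because the listed bands do not cover them. The name `interpret_agreement` is hypothetical.

```python
# (upper bound, label) pairs taken from the bands listed in the text;
# treating each upper bound as inclusive is an assumption.
BANDS = [
    (0.1, "virtually none"),
    (0.4, "slight"),
    (0.6, "fair"),
    (0.8, "moderate"),
    (1.0, "substantial"),
]


def interpret_agreement(value: float) -> str:
    """Return the verbal label for a coefficient between 0 and 1."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("the listed bands only cover values from 0 to 1")
    for upper, label in BANDS:
        if value <= upper:
            return label
    raise AssertionError("unreachable: 1.0 is the last upper bound")
```

Under these assumptions a coefficient of 0.919 is labelled "substantial" and a correlation of 0.407 is labelled "fair".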

All analyses were performed with SPSS 22.

Significance was set at p < 0.05.

Data have been reported according to the Guidelines for Reporting Reliability And Agreement Studies (Supplementary material, Table 15).

Results

One hundred and forty patients (92 males, 48 females) were evaluated. Mean age was 60 ± 16 years (range 15–86). The level of lesion was cervical in 30 patients, thoracic in 78 patients and lumbar in 32 patients. As for the AIS grade, 32 patients had an AIS A, 11 patients an AIS B, 33 patients an AIS C and 64 patients an AIS D grade. Fifty-two patients sustained an ischemic lesion, 34 presented a spondylosis of the spine with involvement of the nervous structures, 29 a neoplastic pathology and 25 an inflammatory/infectious pathology (mostly transverse myelitis and bacterial spondylodiscitis) (Table 2).

Table 2 Description of the clinical features (Neurological Level of Injury, NLI, and ASIA Impairment Scale, AIS, Motor and Sensory Scores) of the participants divided by etiology.

The evaluations of the 140 patients included were organized as follows: 103 patients were evaluated with the ISNCSCI by two examiners only at admission, while 37 patients received two or more evaluations (at admission and during the rehabilitation stay or at discharge) with the ISNCSCI by two examiners, leading to a total of 169 pairs of evaluations to assess the inter-rater reliability of the Standards.

All these 169 evaluations were accompanied by a functional evaluation with SCIM. In addition, 13 patients were evaluated once with the ISNCSCI and SCIM, leading to a total of 182 complete sets of assessments with ISNCSCI and SCIM for the evaluation of construct validity (Fig. 1). Inter-rater reliability gave substantial results for MSs (r = 0.965; p < 0.001); the correlation for sensory scores was lower, but still substantial (r = 0.905 for light-touch and 0.902 for pin-prick; p < 0.001) (Table 3). Cronbach’s alpha highlighted a substantial internal consistency of the ISNCSCI (Table 3). The comparison of the data of the two examiners did not show any significant difference (Table 3).

Fig. 1 The figure depicts the number of evaluations for the ISNCSCI and SCIM. Representation of the different patient evaluations.

Table 3 Measurement properties of the motor and sensory scores for the entire cohort.

The agreement between the examiners regarding lesion levels was fair to moderate, although significant for all assessments (Table 4). The NLI was the same in 104/169 assessments (61%) and differed by one level in 35 evaluations (21%) and by two or more levels in 30 evaluations (18%) (Table 5). As for the severity of the injury (AIS grade), the agreement between the two examiners was substantial (Krippendorff’s alpha = 0.919; p < 0.001) (Table 4). The AIS grade was the same in 161/169 evaluations (95%) and differed by one grade in the remaining eight (Table 5). All these patients were assessed as AIS C by one examiner and as D by the other.

Table 4 The table shows the level of agreement between the two examiners as determined with Krippendorff’s Alpha agreement statistic for the entire cohort.
Table 5 Agreement for the levels of injury and AIS.

In the entire set of assessments, the correlation between the SCIM self-care subscore and the upper extremity motor score (UEMS) was fair, although significant (r = 0.407; p < 0.001). The correlations between the lower extremity motor score (LEMS) and the SCIM mobility subscore and between the total MS and the total SCIM score were moderate and significant (r = 0.666 and r = 0.683 respectively; p < 0.001) (Table 4). The correlations improved when persons with tetraplegia and paraplegia were considered separately, when assessments at admission were separated from those at follow-up, and when incomplete and complete lesions were analyzed separately (Table 6).

Table 6 Validity of the ISNCSCI as assessed by the correlations between motor scores and SCIM scores (Spearman correlation).

Comparable results were found for the different pathologies (Supplementary material, Tables 7–14).

Discussion

The purpose of this study was to evaluate the internal consistency, the inter-rater reliability, and the construct validity of the ISNCSCI in the evaluation of patients with non-traumatic SCI.

Non-traumatic SCI includes a heterogeneous group of pathologies with different onset and evolution characteristics [16]. This aspect, together with the relative rarity of these etiologies in comparison with the traumatic form, is responsible for the limited number of studies in these categories of patients [16]. An exception is represented by ischemic SCI, a relatively frequent cause of non-traumatic lesion, which shares some characteristics with traumatic SCI [30]. Indeed, this etiology was relatively well characterized in the context of the European Multicenter Study about SCI (www.emsci.org, ClinicalTrials.gov Identifier: NCT01571531). The traumatic and ischemic etiologies display different characteristics in terms of age, severity and level of injury, but are comparable in terms of single-event onset and neurological and functional evolution [30].

Our study demonstrated for the first time that the ISNCSCI display reliable measurement properties also when applied to patients with non-traumatic SCI. Interestingly, despite the differences in clinical features, the ISNCSCI showed comparable reliability in all four etiology groups.

Both the motor and sensory components of the ISNCSCI showed substantial agreement between different evaluators. The agreement was slightly better for the motor component than for the sensory component. Krippendorff’s alpha between the two examiners showed fair to moderate agreement for the NLI and the sensory and motor levels, and substantial agreement for the AIS grade.

To the best of our knowledge, at present there is no study evaluating the psychometric properties of the ISNCSCI in persons with non-traumatic spinal cord injuries [22]. Therefore, we can compare our results only with previous studies evaluating these characteristics in patients with traumatic SCI. However, overall, the results of our study are in line with those reported for traumatic SCI.

Cohen and Bartko [31] examined the reliability of the standards with 29 examiners from 19 centers and demonstrated very strong agreement for the ASIA scores with ICC values between 0.96 for light-touch and pin-prick scores and 0.98 for the MS. Marino [32] performed a study with 16 evaluators and 16 participants and reported inter-rater ICC values of 0.97 for MSs, 0.96 for light-touch and 0.88 for pin-prick. Jonsson [33] assessed the inter-rater reliability of the standards in 23 patients with incomplete SCI and found Kappa-values between 0 and 0.83 for the pin-prick, between 0 and 1 for the light-touch and from 0 to 0.89 for MSs. Furthermore, they found fair to poor agreement for the neurological levels. Savic [34] examined the inter-rater reliability of the ISNCSCI by having two expert examiners evaluate 45 persons. The results of this study showed that the total scores had a strong correlation between the two examiners with ICC values >0.99 for motor and light-touch scores and 0.97 for pin-prick [34]. Regarding the level of the lesion, our cohort showed lower levels of agreement between the two examiners compared to Savic’s study. As in Savic’s study [34], where lesion levels differ, they do so in most cases by one segment and only in some cases by two or more segments. Finally, Schuld [35] examined the percentage of concordance for the levels between typical SCI physicians and an appropriate calculator (EMSCI calculator) and reported levels of agreement comparable to the present ones, at least with regard to motor levels and AIS grade. Differences from previous studies could be explained by different methodologies (for example the number of examiners and participants), the experience of the examiners and the different composition of the cohorts of participants, as some studies [31, 34] included ~50% of complete lesions (compared to 18% of the present series), which are easier to evaluate [35, 36].

The convergent construct validity was evaluated by comparing the ISNCSCI with the functional evaluation based on the SCIM. Within the entire cohort analyzed and considering all the assessments pooled together, the correlations were moderate to good, although significant. The correlation was weaker for the SCIM self-care subscore and upper extremity MS compared to the other scores. We therefore performed more detailed analyses by dividing the participants according to the level of injury (paraplegia and tetraplegia) and the time of evaluation (first evaluation and follow-up) obtaining slightly better correlation scores (Table 4).

Also in this case, we compared our results with previous studies performed in persons with traumatic SCI. There are numerous articles that evaluate the relationship between ISNCSCI and functional status, using different methodologies and outcome measures. Overall, the results of these studies are comparable with ours, showing a moderate to good correlation between the ISNCSCI and functional status as assessed by the Quadriplegia Index of Function [37, 38], the Modified Barthel Index [39, 40], the Functional Independence Measure [39, 41], and the SCIM [42]. The fact that the correlation between the ISNCSCI and functional status in these studies, including ours, is only moderate could be explained by the fact that the SCIM is influenced not only by biological phenomena (spinal cord integrity and recovery), but also by other factors such as: age, which is accompanied by a reduced vital capacity [43] and by difficulties, for elderly individuals, in translating motor recovery into daily activities [44]; the presence/absence of complications during rehabilitation [45]; the presence/absence of comorbidity and pre-injury physical fitness [46]; the psychological status [47]; and finally the impact of rehabilitation. As demonstrated by Wirth [42], functional improvement partly occurs independently of neurological recovery. Persons with complete motor SCI recover skills in SCIM unrelated to changes in MSs. This improvement is believed to be due to a compensation mechanism (learning new movement strategies, including the use of new aids).

Our study has some limitations. First of all, we have not evaluated the agreement between the two examiners regarding each individual myotome and dermatome as in other studies [31, 32]. The second limitation is that we have not evaluated the prognostic value of the ISNCSCI. This is related to the fact that for most of our patients, in particular those with spondylogenic myelopathies and with spinal cord dysfunction due to tumors, it is impossible to determine the exact onset of the lesion and therefore to perform the first evaluation at the time of injury. The only persons in whom it is possible to know exactly the onset of the pathology are those with ischemic SCI; unfortunately, the number of patients with ischemic SCI in our sample is too low to evaluate the prognostic value. The third limitation is that we have not assessed intra-rater reliability (i.e., the relationship between the two assessments made by the same examiner). The latter test requires a period of at least 7–15 days between the two assessments to avoid the learning effect for both the examiner and the patients. Since all patients in our study had an acute/subacute lesion and underwent intensive rehabilitation, an improvement in their status is expected in 7–15 days, making the relationship between the two assessments unreliable.

Another limitation is related to the heterogeneity of the non-traumatic SCI group, which also includes patients with ischemic SCI. As mentioned above, this etiology is much more similar to the traumatic form than to the other secondary forms with regard to clinical evolution and functional recovery [30]. In consideration of the heterogeneity, we provided in Table 2 a stratification of our sample based on the etiologies.

Furthermore, we should acknowledge a limitation concerning the evaluation of construct validity. As previously discussed, since the ISNCSCI are the gold standard for the evaluation of neurological impairment after SCI, we lack a reference tool for the neurological examination to use as comparison. In line with previous studies for the validation of ISNCSCI in traumatic population, we opted to compare the ISNCSCI with SCIM total score and subscores, but we should highlight that the content of SCIM refers to the evaluation of independence in daily life activity after SCI, which is influenced by other factors beyond the neurological impairment.

Finally, a possible limitation is the lack of training standardization of the examiners.

In conclusion, our work fills a gap in the assessment of SCI, demonstrating that the ISNCSCI are a reliable and valid assessment tool for patients with non-traumatic SCI. When used in a population of persons with non-traumatic SCI, the ISNCSCI have been shown to have roughly the same psychometric characteristics that they have in patients with traumatic injury.

With regard to the usefulness of the ISNCSCI for clinical practice, we believe that they could be safely used to describe the clinical situation of the patients. The main elements of the ISNCSCI (AIS grade and motor and sensory scores) show an excellent reliability and, in our opinion, are more than adequate in the clinical setting. However, nothing can be said about the prognostic value of the Standards, which deserves a dedicated study.

With regard to the use in clinical trials, the ISNCSCI have been used for several different aims (characterization, inclusion/exclusion, subgrouping). Furthermore, elements of the ISNCSCI are used as an outcome measure [48, 49], although for some of them (for example the AIS grade), this use is not recommended [15]. For non-traumatic SCI, caution should be taken in the choice of the elements of the ISNCSCI to be used as outcome measures. While an improvement of the AIS grade and of the motor and sensory scores could be reasonably attributed to the effect of the treatment, the same does not hold true for the NLI and the motor and sensory levels. In fact, based on our results, a change of one level and probably also of 2 levels could be due to the scarce reliability of these elements rather than to the effect of the treatment.