Abstract
Study design:
Psychometric study.
Objectives:
To determine the intra- and inter-rater reliability and content validity of the International Spinal Cord Injury (SCI) Musculoskeletal Basic Data Set (ISCIMSBDS).
Setting:
Four centers, one in each of Australia, England, India and the United States of America.
Methods:
A total of 117 participants with a C2 to S1 neurological level and American Spinal Injury Association Impairment Scale A to D injury were recruited. The median (interquartile range) time since injury was 9 years (2–29). Fifty-seven participants were assessed by the same assessor, and 60 participants were assessed by two different assessors on two different occasions to determine the intra- and inter-rater reliability, respectively. Kappa statistics or crude agreement was used to measure reliability. Content validity was assessed through focus group interviews of people with SCI and health-care professionals.
Results:
The intra-rater reliability ranged from κ=0.62 to 1.00 and crude agreement from 75% to 100% for each of the variables on the ISCIMSBDS. The inter-rater reliability ranged from κ=−0.25 to 1.00, with a diverse crude agreement ranging from 0% to 100%. The inter-rater reliability was unsatisfactory for the following variables: ‘Date of fracture’, ‘Fragility fractures’, ‘Scoliosis, method of assessment’, ‘Other musculoskeletal problems’ and ‘Do any of the above musculoskeletal challenges interfere with your activities of daily living (transfers, walking, dressing, showers, etc.)?’. Results from validity discussions implied no major suggestions for changes.
Conclusion:
Overall, the ISCIMSBDS is reliable and valid, although 5 of the 12 variables may benefit from further refinement.
Introduction
The International Spinal Cord Injury (SCI) Musculoskeletal Basic Data Set (ISCIMSBDS) aims to cover the most important musculoskeletal (MSK) problems that affect people with SCI.1 The ISCIMSBDS is one of several International SCI Data Sets that were developed under the umbrella of the International Spinal Cord Society and the American Spinal Injury Association in order to standardize data collection. This is important for improving the examination, treatment, rehabilitation and prevention of SCI and for facilitating comparison of results across SCI centers and countries for research.2 The ISCIMSBDS form can be found in Appendix A.
MSK problems are common in people with SCI and include problems such as spasticity, fractures, heterotopic ossification (HO) and contractures. For example, 60–70% of people with SCI develop spasticity within a year, and about half of these receive antispastic medication.3, 4, 5 In addition, age-related MSK problems are increased compared with able-bodied persons.1, 6 The incidence of fractures ranges from 1% to 34%.7 The relative risk of fracture is double that of controls, with a particularly high risk of lower extremity fractures and fragility fractures (low-energy fractures).8 Risk of fracture increases with more severe motor impairment.9 There are no accurate data on the incidence of HO, although it is estimated that between 10% and 53% of people with SCI develop HO.10, 11 The incidence of contractures in major joints 1 year after SCI was found to be 11–43%, with the ankle, wrist and shoulder being most commonly affected.12 Contractures are a common and disabling problem for individuals with SCI and a challenge for clinicians to manage.13 Degenerative changes or overuse injuries are most often located in the upper extremities, particularly the shoulders, elbows and wrists, as well as the neck, upper and lower back.1 Nearly all individuals develop scoliosis if they sustain their SCI at a young age.14 It is important to capture all these MSK problems in people with SCI. The ISCIMSBDS was designed for this purpose. However, it is important to determine its reliability and validity. The objective, therefore, of this study was to determine the intra- and inter-rater reliability, as well as discuss the content validity of the ISCIMSBDS.1
Material and methods
Study design
The study was designed as a test–retest reliability study. Two measures of reliability were performed: intra- and inter-rater reliability. Intra-rater reliability describes how well the same rater can reproduce the data twice on the same group, whereas inter-rater reliability describes reproducibility when two different raters perform the data collection. The study was carried out at four SCI centers, one in each of four countries: Australia, India, the United Kingdom and the United States of America. Each center recruited 30 participants with SCI, giving a total of 120. Participants were enrolled from April 2013 to March 2014. Participants were included if they were >18 years of age and had sustained their SCI at least 6 months prior. Participants were included regardless of the level or etiology (traumatic or non-traumatic) of the SCI. Participants could have any number or severity of MSK symptoms provided they were stable and were not expecting changes in physical therapy or medication for pain or spasticity between interviews. Participants were recruited from a sample of convenience and included both inpatients and outpatients. All were recruited by personal contact (none were recruited by letter or phone). The setting was either hospital or SCI clinic. One center also recruited from a local residential home for people with SCI and another center recruited from an SCI summer camp (EmpowerSCI, Inc.). Three participants were excluded post hoc because they did not meet all inclusion or exclusion criteria. Consequently, 57 people participated in the intra-rater reliability and 60 in the inter-rater reliability aspect of the study.
Each study site had two raters. All raters were experienced SCI health-care professionals (physiotherapists and medical doctors). The first rater performed all the intra-rater tests. Inter-rater (inter-observer) reliability was tested by two different raters, one of whom was the rater who performed all the intra-rater tests. The ISCIMSBDS was completed by patient interview and, where necessary, a review of patients’ medical records and a physical examination. The latter was often the case when evaluating contractures, degenerative joint changes and scoliosis. This relatively informal method of data collection reflects the way the ISCIMSBDS will be used in the clinical and community settings.
Content validity was evaluated by focus group interviews including health professionals and consumers, thus using recognized subject matter experts from different domains, to evaluate to what extent the variables of the ISCIMSBDS adequately reflect the content domain15, 16 and whether the wording of the variables was appropriate. The health professionals were those involved in SCI management and would hence be potential users of the ISCIMSBDS in clinical practice. Consumers with SCI were recruited from the Indian Spinal Injuries Centre to form three focus groups, each with four individuals. They were aged between 26 and 50 years and included both females and males and were at least 6 months after injury. Group interviews were performed at the Indian Spinal Injuries Centre in New Delhi. The study was explained to all participants in the four groups. The discussions were facilitated and moderated by one of the investigators. The comments on the relevance of each item in the ISCIMSBDS were compiled from each group separately, and a final consensus was achieved from each group. At the end, the panel of three investigators came to a final consensus on the data set. A total of seven discussions (four with consumer groups and three with the expert group) were conducted. Duration of each discussion was between 1.5 and 2 h. The discussions were conducted in English, and all the experts, consumers and investigators were fluent in English.
Statistical analysis
Cohen’s Kappa (κ) was used to determine reliability because it provides an estimation of agreement corrected for chance. However, Cohen’s Kappa is influenced by the prevalence (frequency) of conditions and systematic bias; hence, crude (percentage) agreement was also determined.17 Data from all four centers were pooled, as all included participants met the same inclusion criteria and all raters were representative of those who will use the ISCIMSBDS in the clinical setting. A κ-value across all centers was calculated for intra- and inter-raters, respectively.
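The relationship between crude agreement and Cohen's Kappa can be illustrated with a minimal Python sketch. It is not the SAS or SPSS code used in the study; the function names and the example ratings are hypothetical and chosen only to show why a rare condition can give high crude agreement together with a low κ.

```python
from collections import Counter


def crude_agreement(ratings_1, ratings_2):
    """Fraction of paired ratings that are identical."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("ratings must be paired and non-empty")
    matches = sum(a == b for a, b in zip(ratings_1, ratings_2))
    return matches / len(ratings_1)


def cohen_kappa(ratings_1, ratings_2):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_1)
    p_o = crude_agreement(ratings_1, ratings_2)
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    # Chance agreement from the product of the marginal proportions.
    p_e = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: a rare condition in 20 participants.
rare_1 = ["no"] * 18 + ["yes", "no"]
rare_2 = ["no"] * 18 + ["no", "yes"]
# Hypothetical data: a condition present in half of 20 participants.
common_1 = ["yes"] * 10 + ["no"] * 10
common_2 = ["yes"] * 9 + ["no"] + ["no"] * 9 + ["yes"]

rare_agreement = crude_agreement(rare_1, rare_2)
rare_kappa = cohen_kappa(rare_1, rare_2)
common_agreement = crude_agreement(common_1, common_2)
common_kappa = cohen_kappa(common_1, common_2)
```

Both hypothetical examples have 90% crude agreement, but κ is 0.8 for the common condition and slightly negative for the rare one, which is the prevalence effect that motivates reporting both measures.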
κ-Values were interpreted based on Landis and Koch, 1977,17 where a score <0 reflected poor agreement, 0.0–0.20 reflected slight agreement, 0.21–0.40 reflected fair agreement, 0.41–0.60 reflected moderate agreement, 0.61–0.80 reflected substantial agreement and 0.81–1.00 reflected almost perfect agreement. κ-Values >0.61 (reflecting at least substantial agreement) with a crude (percentage) agreement of >90% were considered satisfactory.18
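The interpretation bands and the threshold for a satisfactory result can be written out as a short Python sketch. The function names are hypothetical, and the handling of values that fall between the printed bands (for example 0.205) is an assumption; the cut-offs themselves are those stated above.

```python
import math


def landis_koch_label(kappa):
    """Map a kappa value to the agreement band described in the text."""
    if math.isnan(kappa):
        raise ValueError("kappa is undefined")
    if kappa < 0:
        return "poor"
    # Assumption: each band is closed at its upper bound, so a value such as
    # 0.205 is labeled with the next band up ("fair").
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


def is_satisfactory(kappa, crude_agreement):
    """Both criteria from the text: kappa > 0.61 and crude agreement > 90%."""
    return kappa > 0.61 and crude_agreement > 0.90


labels = [landis_koch_label(k) for k in (-0.25, 0.59, 0.68, 1.0)]
```

For the four example values, the labels are poor, moderate, substantial and almost perfect.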
Frequency of the MSK problems was calculated as the mean value of the two raters’ recordings. The data set consists of variables with main questions that are answered by ‘yes’ or ‘no’. If these questions are answered as ‘yes’, then subcategory questions are answered. Agreement was only calculated for subcategory questions if both raters indicated ‘yes’ on the main question. Instances with missing data were excluded from the analysis (N used in analysis is shown in Tables 2A and 2B).
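The rule for subcategory questions can be sketched in Python as a filtering step. The record layout (`main_1`, `main_2`, `sub_1`, `sub_2`) and the example data are hypothetical and serve only to show which pairs enter the analysis.

```python
def subcategory_pairs(records):
    """Keep sub-question pairs only where both ratings were 'yes' on the
    main question and neither sub-question answer is missing."""
    return [
        (r["sub_1"], r["sub_2"])
        for r in records
        if r["main_1"] == "yes" and r["main_2"] == "yes"
        and r["sub_1"] is not None and r["sub_2"] is not None
    ]


# Hypothetical records; None marks missing data.
records = [
    {"main_1": "yes", "main_2": "yes", "sub_1": "left", "sub_2": "left"},
    {"main_1": "yes", "main_2": "no", "sub_1": "left", "sub_2": None},
    {"main_1": "yes", "main_2": "yes", "sub_1": "right", "sub_2": "left"},
    {"main_1": "yes", "main_2": "yes", "sub_1": None, "sub_2": "left"},
    {"main_1": "no", "main_2": "no", "sub_1": None, "sub_2": None},
]

pairs = subcategory_pairs(records)
n_used = len(pairs)
sub_agreement = sum(a == b for a, b in pairs) / n_used
```

Only two of the five hypothetical records contribute to the subcategory analysis, which illustrates why N for the sub-questions can be much smaller than N for the main questions.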
Agreement of the categories titled ‘Fractures’, ‘HO’, ‘Contractures’ and ‘Degenerative changes/overuse’ was first calculated for each possible location existing in the data set,1 and then all locations were summed in a 2 × 2 table for κ-analysis. Both location and side needed to be the same for the two answers in order to be considered as an agreement. ‘Fractures’ and ‘Degenerative changes/overuse’ had 28 locations and ‘HO’ and ‘Contractures’ had 16 locations to choose from. This gave a total N of 1596 (28 × 57) for intra-rater analyses and 1680 (28 × 60) for inter-rater analyses for the variables ‘Fractures’ and ‘Degenerative changes/overuse’. ‘HO’ and ‘Contractures’ had 912 (16 × 57) and 960 (16 × 60) possible locations for intra- and inter-rater analyses, respectively.
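The pooling of locations into a single 2 × 2 table can be sketched in Python as follows. The function names, the three example sites and the ratings are hypothetical; in the data set each variable has 16 or 28 sites, so that, for example, 28 sites × 60 participants gives the 1680 cells of the inter-rater table.

```python
def pooled_2x2(present_1, present_2, sites):
    """Sum per-site agreement over all participants into one 2 x 2 table.

    present_1 and present_2 hold, per participant, the set of sites at which
    each rating recorded the condition. A site is a (location, side) pair, so
    both location and side must match to count as agreement.
    Returns (both_yes, only_1, only_2, both_no).
    """
    both_yes = only_1 = only_2 = both_no = 0
    for sites_1, sites_2 in zip(present_1, present_2):
        for site in sites:
            in_1, in_2 = site in sites_1, site in sites_2
            if in_1 and in_2:
                both_yes += 1
            elif in_1:
                only_1 += 1
            elif in_2:
                only_2 += 1
            else:
                both_no += 1
    return both_yes, only_1, only_2, both_no


def kappa_from_2x2(a, b, c, d):
    """Cohen's kappa from the four cells of a 2 x 2 agreement table."""
    n = a + b + c + d
    p_o = (a + d) / n
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_o - p_e) / (1 - p_e)


# Hypothetical example: three sites, two participants.
sites = [("hip", "left"), ("hip", "right"), ("knee", "left")]
rating_1 = [{("hip", "left")}, {("knee", "left")}]
rating_2 = [{("hip", "left")}, {("hip", "right")}]

table = pooled_2x2(rating_1, rating_2, sites)
pooled_kappa = kappa_from_2x2(*table)
```

In this hypothetical example the table is (1, 1, 1, 3) and κ is 0.25, although crude agreement is 4 of 6 cells. Most of the agreement comes from cells where neither rating recorded the condition.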
Data collection and data management were carried out with OpenClinica,19 which is an open source web-based software platform for managing clinical research.
Statistical analyses were calculated using the SAS statistical software version 9.4 for Windows (SAS Institute Inc., Cary, NC, USA) and IBM SPSS Statistics version 22 for Windows (IBM Corp., Released 2013, Armonk, NY, USA).
Statement of ethics
We certify that all applicable institutional and governmental regulations concerning the ethical use of human volunteers were followed during the course of this research, and the necessary approvals were obtained in each center. OpenClinica used for data collection in this study is designed to support regulatory guidelines such as 21 CFR Part 11.20
Results
Demographics
The characteristics of participants are listed in Table 1.
The mean (s.d.) time between interviews was 8.7 (3.3) days (median 7, interquartile range 7–11 days).
Frequency of symptoms
Frequency of symptoms in the study sample for the intra- and inter-rater groups is shown in Figure 1.
Neuro-musculoskeletal history before spinal cord lesion
Frequency of ‘Neuro-Musculoskeletal history before spinal cord lesion’ was low for all of the three categories (Figure 1).
Two participants (3%) from the intra-rater group had ‘Preexisting congenital deformities of the spine and spinal cord’. Crude agreement was 100% for the subcategory questions for these two participants, and there was 100% agreement for diagnosis, location, surgery and date of surgery. There were no reported ‘Preexisting congenital deformities’ for the inter-rater group.
One participant in the intra-rater group and three participants in the inter-rater group had ‘Preexisting degenerative spine disorders’. Intra-rater crude agreement was 98% (56/57), and inter-rater agreement was 98% (58/59). All raters agreed on the subcategories titled diagnosis, location, previous surgery and date for the few cases where the condition was present.
No participants in either group had ‘Preexisting systemic neuro-degenerative disorders’, and thus for both groups there was 100% agreement on the absence of symptoms (Tables 2A and 2B).
Presence of spasticity and treatment of spasticity
‘Presence of spasticity/spasms’ was reported in 78–81% of the study sample. Half of all participants received ‘Treatment for spasticity/spasms’ within the past 4 weeks. There was almost perfect intra- and inter-rater reliability for the ‘Presence and treatment of spasticity’ (Tables 2A and 2B).
Fractures
Fractures were located in the lower body with most in the ‘Hip/femur’, followed by ‘Tibia/fibula’, ‘Knee’, ‘Foot’ and ‘Ankle’. The only fractures reported for the upper body were in the ‘Hand’. Both intra- and inter-rater reliability were almost perfect (the high crude agreement reflects the many locations where no symptoms were reported and hence agreement on the absence of a fracture).
The intra- and inter-raters agreed on the year of the fracture in 77% and 76% of cases, respectively. Out of these, 50% and 38% also agreed on day and month. Median (interquartile range) time since the fracture was 6 years (2–31) in the intra-rater group and 11 years (5–14) in the inter-rater group.
Intra-raters classified 25% of the fractures as a ‘Fragility fracture’ and inter-raters 63% of the fractures. Intra-rater reliability was satisfactory (Table 2A), but inter-rater reliability was unsatisfactory (Table 2B).
Heterotopic ossifications
‘HO’ was only reported for the ‘Hip/femur’, with one disagreement about HO for the knee. In the intra-rater group, X-ray was used four times to document HO and computed tomography plus magnetic resonance imaging once, with agreement in all cases. In the inter-rater group, raters agreed twice that X-ray was used and disagreed once between X-ray and triple-phase bone scan (it was not possible to determine whether both were performed).
Contractures
‘Contractures’ were reported in all locations with most reported for the ‘Hip’, ‘Knee’ and ‘Ankle’. Reliability was satisfactory for both intra- and inter-rater groups. Intra-rater reliability for each location ranged from substantial to almost perfect (Table 2A), and all locations were reported. Inter-rater reliability ranged from moderate to almost perfect (Table 2B). In this group, the lowest reliability was reported for the ‘Hip/femur’ and ‘Knee’ location.
Degenerative changes or overuse
There was a high number of recordings for ‘Degenerative changes or overuse’ for the upper body and spine from both the inter- and intra-rater groups, with the ‘Shoulder’ and ‘Cervical spine’ being the most common sites. There were very few or no recordings for the lower body. Similar to the situation with ‘Fractures’ and ‘Contractures’, there was high agreement on the absence of ‘Degenerative changes or overuse’ for all locations in both groups, but when raters identified the presence of ‘Degenerative changes or overuse’ there was considerable disagreement about the precise location for the inter-rater group. This led to a summed κ-score below satisfactory level (Table 2B). There were no clear patterns between the locations and reliability in the inter-rater group other than that ‘Lower back/lumbar spine’ had the lowest agreement in both groups.
Scoliosis
‘Scoliosis’ showed almost perfect reliability for both intra- and inter-rater reliability (Tables 2A and 2B). Among the methods of assessment of scoliosis, ‘Plain radiographs in sitting’ had almost perfect inter-rater reliability, whereas ‘Observation in sitting’ and ‘Plain radiographs in standing’ had poor inter-rater reliability. The option ‘Observation in standing’ was not used at all.
There was perfect agreement in both groups for ‘Surgical treatment of scoliosis’. ‘Date of surgery’ of scoliosis was agreed upon in three of the four cases for intra-rater testing. For inter-rater testing, the date was recorded only once, by one rater, corresponding to no agreement.
Other musculoskeletal problems; specify
Intra-rater reliability was satisfactory (Table 2A), but inter-rater reliability was just below satisfactory level (Table 2B). The ‘specify’ answers are listed in Table 3. The most frequently reported problem was related to pain (>30%). The others were tendon injuries, tendonitis, tendon-related surgery, osteomyelitis, osteoporosis, spinal stenosis, herniated discs, amputations and alloplastic surgery.
Do any of the above musculoskeletal challenges interfere with your activities of daily living (transfers, walking, dressing, showers, etc.)?
Intra-rater reliability was κ=0.68 (Table 2A) and inter-rater reliability was κ=0.59 (Table 2B). If the two categories ‘yes, a little’ and ‘yes, a lot’ were merged into one category—‘yes’—this yielded an intra-rater reliability of κ=0.74 and an inter-rater reliability of κ=0.65.
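The effect of merging the two ‘yes’ categories can be sketched in Python. The ratings below are hypothetical and do not reproduce the study data, and the answer label ‘no’ for the third category is an assumption. The sketch only shows that disagreements between ‘yes, a little’ and ‘yes, a lot’ disappear once the categories are collapsed.

```python
from collections import Counter


def cohen_kappa(ratings_1, ratings_2):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_1)
    p_o = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    p_e = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def merge_yes(ratings):
    """Collapse 'yes, a little' and 'yes, a lot' into a single 'yes'."""
    return ["no" if r == "no" else "yes" for r in ratings]


# Hypothetical answers from two ratings of seven participants.
rating_1 = ["no", "yes, a little", "yes, a lot", "yes, a little",
            "no", "yes, a lot", "no"]
rating_2 = ["no", "yes, a lot", "yes, a lot", "yes, a little",
            "no", "yes, a little", "yes, a little"]

kappa_three = cohen_kappa(rating_1, rating_2)
kappa_merged = cohen_kappa(merge_yes(rating_1), merge_yes(rating_2))
```

In this hypothetical example κ rises from about 0.36 with three categories to about 0.70 after merging, because only the single yes/no disagreement remains.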
Content validity
All feedback from the validation group interviews is shown in Table 4. There were no major suggestions for changes.
Discussion
The ISCIMSBDS has satisfactory intra-rater reliability for all variables, except the variables titled ‘Date of fracture’ and ‘Method of documentation of HO’. Not unexpectedly, reliability scores were higher for the intra-rater than the inter-rater group. Inter-rater reliability was satisfactory in 9 out of 12 of the main variables, but the agreement was largely unsatisfactory for the sub-questions. As different clinicians will be using this data set, agreement between raters is important. The following variables had unsatisfactory inter-rater reliability: ‘Date of fracture’, ‘Fragility fractures’, ‘Degenerative changes/overuse’, ‘Scoliosis, method of assessment’, ‘Other musculoskeletal problems’ and ‘Do any of the above musculoskeletal challenges interfere with your activities of daily living (transfers, walking, dressing, showers, etc.)?’. These variables will be discussed in further detail.
Reporting of fractures showed good reliability. ‘Fractures since spinal cord lesion’ could be rephrased to ‘Fractures since spinal cord injury’ to follow the terminology in the data set. Agreement on the date of fracture was below satisfactory level, and the day, month and year of fracture were only fully agreed upon in 50% of instances for intra-raters and 38% for inter-raters when there was agreement on the year. When revising the data set, we suggest that only year of fracture is recorded. Agreement on fragility fractures between raters was unsatisfactory. This may reflect difficulties determining the cause of fractures, which in most cases occurred many years prior to assessment (median time 6 and 11 years for intra- and inter-raters, respectively).
Inter-rater reliability was unsatisfactory for ‘Degenerative changes’ or ‘Changes due to overuse’. This probably reflects a need to better define these variables in the data set. Pain and discomfort, which are common symptoms of degenerative changes or overuse, could cause differences between raters' interpretation of the variable.1 Pain owing to degenerative changes or overuse can be difficult to distinguish from other types of pain such as neuropathic or visceral pain—a more detailed pain evaluation is covered in the International SCI Pain Basic Data Set.21 The individuals in this study could suffer from overuse-induced pain in the upper body with extended wheelchair use, as the majority of the study population had American Spinal Injury Association Impairment Scale A, B and C and a high number of cervical lesions (Table 1). The locations of the degenerative changes or overuse were primarily in the upper body with only a few instances in the lower body.
Scoliosis showed almost perfect agreement, but there was only moderate reliability for the variable relating to the method of assessment. The option ‘Observation in sitting’ had the lowest reliability, and the option ‘Observation in standing’ was not used at all. The validation group suggested removing the sub-questions, which results from this study support. Otherwise the options could be reduced to, for example, ‘Observation’ and ‘Radiography’.
‘Other musculoskeletal problems’ had unsatisfactory inter-rater reliability, suggesting that this variable is a challenge to interpret. Some of the MSK problems reported (Table 3) could belong to the ‘Degenerative/overuse category’. This variable also had moderate reliability, suggesting disagreement between raters regarding which symptoms should be listed in these two categories. The last variable ‘Do any of the above musculoskeletal challenges interfere with your activities of daily living (transfers, walking, dressing, showers, etc.)?’ showed unsatisfactory inter-rater reliability but improved to substantial when ‘yes, a little’ and ‘yes, a lot’ were merged into one category. This adjustment could be considered when revising the data set. Very few raters indicated that participants had any neuro-muscular history prior to SCI. This was captured in responses to the first variable in the data set, ‘Neuro-Musculoskeletal history before spinal cord lesion’. Following this, it is tempting to suggest removing this variable from the data set. However, we believe that this variable is important to retain because prior neuro-muscular problems may become more frequent in the future, as SCI becomes more common in the elderly. These people are likely to have MSK problems, such as spinal canal stenosis or spondylosis.
Contractures had overall satisfactory reliability, but inter-rater reliability was only moderate for the location of contractures in the lower extremities. This result probably reflects the differences between raters in their diligence when measuring range of motion.
Limitations of the study include the low frequency of reported disorders for some variables, meaning that agreement primarily reflected the absence of symptoms. Therefore, it is difficult to make any conclusions about these variables.
The study populations differed across the four centers with regard to their demographics, and there is a risk of selection bias if the populations were not representative. For example, the frequency of HO was lower in this study than reported in the literature.10, 11 Selection bias could also have arisen because of the recruitment procedure. Content validity was not tested statistically or compared with a gold standard because there was no gold standard to use, and the focus group discussions were also performed in a relatively small group of people.
Conclusion
Overall, the data set has acceptable reliability. Intra-rater reliability was satisfactory, and inter-rater reliability was satisfactory in 9 of the 12 variables for the main questions but largely unsatisfactory for many sub-questions of the variables. The variables ‘Date of fracture’, ‘Fragility fractures’, ‘Scoliosis, method of assessment’, ‘Other musculoskeletal problems’ and ‘Do any of the above musculoskeletal challenges interfere with your activities of daily living (transfers, walking, dressing, showers, etc.)?’ may need revising in the next version of the data set. The frequency of reported problems was low for some variables, making final conclusions more difficult as agreement was primarily based on the absence of symptoms. Validity discussions suggested only minor changes to a number of variables.
Data archiving
There were no data to deposit.
References
Biering-Sørensen F, Burns AS, Curt A, Harvey LA, Jane Mulcahey M, Nance PW et al. International spinal cord injury musculoskeletal basic data set. Spinal Cord 2012; 50: 797–802.
Biering-Sørensen F, Charlifue S, DeVivo M, Noonan V, Post M, Stripling T et al. International Spinal Cord Injury Data Sets. Spinal Cord 2006; 44: 530–534.
Sköld C, Levi R, Seiger A. Spasticity after traumatic spinal cord injury: nature, severity, and location. Arch Phys Med Rehabil 1999; 80: 1548–1557.
Kirshblum S. Treatment alternatives for spinal cord injury related spasticity. J Spinal Cord Med 1999; 22: 199–217.
Maynard FM, Karunas RS, Waring WP. Epidemiology of spasticity following traumatic spinal cord injury. Arch Phys Med Rehabil 1990; 71: 566–569.
Hitzig SL, Eng JJ, Miller WC, Sakakibara BM. An evidence-based review of aging of the body systems following spinal cord injury. Spinal Cord 2011; 49: 684–701.
Jiang S-D, Dai L-Y, Jiang L-S. Osteoporosis after spinal cord injury. Osteoporos Int 2006; 17: 180–192.
Vestergaard P, Krogh K, Rejnmark L, Mosekilde L. Fracture rates and risk factors for fractures in patients with spinal cord injury. Spinal Cord 1998; 36: 790–796.
Logan WC, Sloane R, Lyles KW, Goldstein B, Hoenig HM. Incidence of fractures in a cohort of veterans with chronic multiple sclerosis or traumatic spinal cord injury. Arch Phys Med Rehabil 2008; 89: 237–243.
Stover SL, Hataway CJ, Zeiger HE. Heterotopic ossification in spinal cord-injured patients. Arch Phys Med Rehabil 1975; 56: 199–204.
Reznik JE, Biros E, Marshall R, Jelbart M, Milanese S, Gordon S et al. Prevalence and risk-factors of neurogenic heterotopic ossification in traumatic spinal cord and traumatic brain injured patients admitted to specialised units in Australia. J Musculoskelet Neuronal Interact 2014; 14: 19–28.
Diong J, Harvey LA, Kwah LK, Eyles J, Ling MJ, Ben M et al. Incidence and predictors of contracture after spinal cord injury—a prospective cohort study. Spinal Cord 2012; 50: 579–584.
Harvey LA, Glinsky JA, Katalinic OM, Ben M. Contracture management for people with spinal cord injuries. NeuroRehabilitation 2011; 28: 17–20.
Allam AM, Schwabe AL. Neuromuscular scoliosis. PM R 2013; 5: 957–963.
Anastasi A, Urbina S. Psychological Testing. Prentice-Hall: Upper Saddle River, NJ, USA, 1997, pp 114.
Foxcroft C, Paterson H, Le Roux N, Herbst D. Psychological Assessment in South Africa: A Needs Analysis: The Test Use Patterns and Needs Of Psychological Assessment Practitioners: Final Report, July, Human Sciences Research Council, Pretoria, South Africa, 2004, pp 48 and 73.
Sim J, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther 2005; 85: 257–268.
Biering-Sørensen F, Alexander MS, Burns S, Charlifue S, DeVivo M, Dietz V et al. Recommendations for translation and reliability testing of international spinal cord injury data sets. Spinal Cord 2011; 49: 357–360.
OpenClinica, https://www.openclinica.com/ (accessed 27 January 2016).
FDA. Regulatory Information, http://www.fda.gov/RegulatoryInformation/Guidances/ucm125067.htm (accessed 27 January 2016).
Widerström-Noga E, Biering-Sørensen F, Bryce TN, Cardenas DD, Finnerup NB, Jensen MP et al. The International Spinal Cord Injury Pain Basic Data Set (version 2.0). Spinal Cord 2014; 52: 282–286.
Acknowledgements
Thanks to all the individuals with SCI who participated in the study. The database was developed by Knud Mejer Nelausen, IT Coordinator, Clinical Research Unit, Oncology Department, Herlev University Hospital, Copenhagen, Denmark. All authors were supported by their institutions’ internal funds.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Appendix
Cite this article
Baunsgaard, C., Chhabra, H., Harvey, L. et al. Reliability of the International Spinal Cord Injury Musculoskeletal Basic Data Set. Spinal Cord 54, 1105–1113 (2016). https://doi.org/10.1038/sc.2016.42
DOI: https://doi.org/10.1038/sc.2016.42