Introduction

The Spinal Cord Injury Model Systems (SCIMS) National Database (NDB) collects demographic, diagnostic, functional, and long-term outcome information prospectively on individuals who incur traumatic spinal cord injury (TSCI) and start receiving inpatient rehabilitation care at contributing centers within 1 year of injury [1]. The National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR) funds the NDB to study outcomes following the delivery of coordinated care for individuals with TSCI [1]. The NDB is one of the oldest and richest repositories in the United States (US) for TSCI information. Since its inception in 1973, 29 SCIMS centers have enrolled more than 32,000 patients and conducted more than 118,000 follow-up interviews [2]. This database is a key resource for TSCI research in the US, with more than 250 peer-reviewed journal articles and book chapters produced to date [1–5].

The NDB defines TSCI as “the occurrence of an acute traumatic lesion of neural elements in the spinal canal (spinal cord and cauda equina), resulting in temporary or permanent sensory and/or motor deficit” [6]. SCIMS centers recruit patients with TSCI who meet specific criteria [6] (Table 1) and collect data pertaining to acute care and inpatient rehabilitation through chart review, neurological examination, and personal interview; they conduct follow-up interviews at 1 and 5 years post injury, then every 5 years thereafter [1].

Table 1 SCIMS-NDB eligibility criteria

The representativeness of this sample to the entire population of individuals who receive inpatient rehabilitation for new onset TSCI in the US is unknown. Concerns regarding the representativeness of the data extend to the earliest publications utilizing the NDB data [7–9]. The availability of data from two inpatient rehabilitation center administrative data repositories provides a unique opportunity to evaluate the representativeness of the SCIMS-NDB [10–12]. The Uniform Data System for Medical Rehabilitation (UDSMR) [13] began the UDS-PRO service in 1987 and the American Medical Rehabilitation Provider Association (AMRPA) [14] started a competing service, eRehabData, in 2001. Both systems collect data using the inpatient rehabilitation facility-patient assessment instrument (IRF-PAI) [15]. The repositories include data for all patients admitted to the participating inpatient rehabilitation centers (IRCs), regardless of the reason for admission or the payment source. Approximately 74% and 18% of IRCs submitted data to UDS-PRO and eRehabData in 2007, respectively, representing at least 92% of IRCs in the US [16], and an even greater percentage of patients, as UDS-PRO and eRehabData subscribers include the largest IRCs in the nation.

The goal of this study was to compare distributions of patient characteristics from the SCIMS-NDB sample with distributions from the national combined UDS-PRO and eRehabData TSCI IRC population. We hypothesized minimal differences exist between the SCIMS-NDB sample and the UDS-PRO/eRehabData TSCI IRC population.

Methods

Data sources

SCIMS-NDB sample

Table 1 summarizes eligibility criteria for SCIMS-NDB. We limited the SCIMS-NDB sample to patients who were aged 18 years or older at rehabilitation admission and were discharged alive between 2000 and 2010. These criteria yielded a sample of 5,969 cases from 19 centers.

TSCI IRC population

We negotiated agreements with UDSMR and AMRPA to obtain aggregate data for individuals with TSCI discharged from inpatient rehabilitation between 2000 and 2010. Table 2 summarizes inclusion criteria. We limited case selection to the earliest date that an individual was registered in a database, regardless of status of the admission (initial or readmission). All IRCs contributing to NDB also report to either UDS-PRO or eRehabData; therefore, to have a useful comparison between the two data sets, we removed NDB cases from the TSCI IRC population by simple subtraction of aggregate frequencies as described below in the analysis section. These criteria yielded a comparison data set with 99,142 cases after excluding 5,969 SCIMS-NDB records.

Table 2 TSCI IRC Selection Criteria

Variables of interest

Variables selected for comparison included those collected in a similar fashion across both the TSCI IRC population and the SCIMS-NDB. All variables were coded as categorical variables. Age and FIM motor scores were also assessed as continuous variables. In total, we made 48 comparisons across the 9 variables.

Demographic characteristics included age, sex, marital status, race/ethnicity, and preinjury occupational status. We categorized age in 10-year increments (20–29, 30–39, and so on) except for the youngest cohort (18–19). The SCIMS-NDB collects age at injury whereas age at rehabilitation admission is reported to UDS-PRO and eRehabData; however, due to the relatively brief period between injury and rehabilitation admission, we consider the age variables to be comparable. Race/ethnicity categories were white, black, Hispanic, and other (Asian, Hawaiian/Pacific Islander, Native American/Aleut, and Unspecified). Marital status categories were never married, married, and previously married (divorced, separated, and widowed). Preinjury occupational status categories were employed, sheltered workshop, student, homemaker, not working, and retired.

Functional status was measured using the motor score of the FIM instrument to reflect functioning during the first three calendar days of the rehabilitation stay [17]. The motor score is the sum of 13 items, each rated from 1 (total dependence) to 7 (independent), with higher scores representing better motor functioning [17, 18]. With a range of 13–91, we categorized motor scores as 13, 14–23, 24–33, …, 84–91.
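The banding described above can be sketched in a few lines of Python. This is only an illustration of the stated categories (a floor category of 13, 10-point bands from 14 upward, and a final 84–91 band); the function name `fim_motor_category` is hypothetical and is not part of any SCIMS or IRF-PAI software.

```python
def fim_motor_category(score: int) -> str:
    """Map a FIM motor score (13-91) to the category labels used in the text.

    Hypothetical helper for illustration only.
    """
    if not 13 <= score <= 91:
        raise ValueError("FIM motor score must be between 13 and 91")
    if score == 13:
        return "13"        # floor: lowest rating on all 13 items
    if score >= 84:
        return "84-91"     # top band is 8 points wide, not 10
    # Remaining bands are 10 points wide and start at 14, 24, ..., 74.
    lower = 14 + 10 * ((score - 14) // 10)
    return f"{lower}-{lower + 9}"
```

Treating the floor score as its own category keeps patients who are totally dependent on every item separate from the rest of the lowest band.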

Time since injury was the number of months from the date of injury to rehabilitation admission. We defined categories in 1-month (30 days) increments, with the upper category including onset times of 7 months or longer.

Injury characteristics included level and completeness of injury. When available, we ascertained these variables from the impairment group code provided as part of the IRF-PAI; otherwise we determined level and completeness from ICD-9-CM codes (344.0–344.1). IRF-PAI level of injury categories were paraplegia, tetraplegia C1–C4, and tetraplegia C5–C8. Injury completeness categories were complete and incomplete. The SCIMS-NDB collects injury information according to the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI) [19]. We combined ISNCSCI categories to achieve correspondence with the IRF-PAI categories. UDS-PRO and eRehabData subscribers report neurological status at admission to rehabilitation, while the injury information reported in the SCIMS-NDB is collected at discharge. The low likelihood of injury conversion during the relatively brief period of rehabilitation [20] allows us to assume these data are comparable.

Analyses

We compared the distributions of the nine categorical variables and of the two continuous variables (age and FIM motor score) to assess differences between the SCIMS-NDB sample and the TSCI IRC population, with cases from the NDB sample deleted from the latter. Even with the SCIMS-NDB sample constituting only about 6% of the TSCI IRC population, keeping the sample in the population would underestimate differences and make it more likely that “no difference” conclusions would be drawn. For categorical variables, NDB cases were removed from the TSCI IRC population by simple subtraction of aggregate frequencies. For continuous variables, means and standard deviations for the IRC population excluding the NDB cases were estimated from the total mean and standard deviation in the IRC population including NDB cases and the group mean and standard deviation in the NDB, using standard formulas for pooled statistics [21]. We did not use statistical testing for differences between the SCIMS-NDB and the TSCI IRC groups because, with such large sample sizes, even very minor distributional differences would be statistically significant. Instead, we adopted a classification scheme used for a similar analysis applied to data collected by the Traumatic Brain Injury Model Systems [16, 22]. For categorical variables, we regard absolute differences of <5 percentage points as inconsequential, those of 5–10 percentage points as minor, and those greater than 10 percentage points as important. For continuous variables, we computed the relative difference (i.e., effect size) as the difference in means between the two data sets, divided by the standard deviation in the TSCI IRC population. We regard relative differences of <25% as inconsequential, those between 25% and 50% as minor, and those greater than 50% as important.
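The steps in this paragraph can be illustrated with a short Python sketch. It is not the code used for the study: all function names are hypothetical, and the back-calculation of the mean and standard deviation assumes the usual sum-of-squares decomposition with sample (n − 1) standard deviations, which is one common reading of "standard formulas for pooled statistics".

```python
import math

def remove_sample_counts(total_counts, sample_counts):
    """Subtract sample frequencies from aggregate frequencies, per category."""
    return {cat: n - sample_counts.get(cat, 0) for cat, n in total_counts.items()}

def to_percentages(counts):
    """Convert category counts to percentages of the non-missing total."""
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

def classify_percentage_gap(pct_sample, pct_population):
    """Apply the <5 / 5-10 / >10 percentage-point scheme from the text."""
    gap = abs(pct_sample - pct_population)
    if gap < 5:
        return "inconsequential"
    if gap <= 10:
        return "minor"
    return "important"

def remove_sample_moments(n_all, mean_all, sd_all, n_s, mean_s, sd_s):
    """Estimate mean and SD of the population after removing a subgroup.

    Assumes sd_all and sd_s are sample SDs (n - 1 denominator).
    """
    n_rest = n_all - n_s
    mean_rest = (n_all * mean_all - n_s * mean_s) / n_rest
    ss_all = (n_all - 1) * sd_all ** 2          # total sum of squares
    ss_sample = (n_s - 1) * sd_s ** 2           # within-sample sum of squares
    between = (n_s * (mean_s - mean_all) ** 2
               + n_rest * (mean_rest - mean_all) ** 2)
    var_rest = (ss_all - ss_sample - between) / (n_rest - 1)
    return mean_rest, math.sqrt(var_rest)

def classify_relative_difference(mean_sample, mean_population, sd_population):
    """Apply the <25% / 25-50% / >50% scheme to the standardized mean gap."""
    effect = abs(mean_sample - mean_population) / sd_population
    if effect < 0.25:
        return "inconsequential"
    if effect <= 0.50:
        return "minor"
    return "important"
```

As a check against the Results, `classify_percentage_gap(62.7, 41.7)` returns `"important"` for the employed category and `classify_percentage_gap(78.4, 71.1)` returns `"minor"` for the percentage of males.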

The HCA HealthOne Institutional Review Board approved the study. All subjects had given permission for SCIMS data collection; informed consent is not applicable to UDS-PRO and eRehabData, as these are de-identified administrative databases.

Results

Table 3 summarizes the sample characteristics for the TSCI IRC population and SCIMS-NDB. The table also reports absolute and relative differences between the SCIMS-NDB sample and the TSCI IRC population excluding SCIMS-NDB cases.

Table 3 Distributions and Comparisons of the US TSCI IRC Population and SCIMS-NDB Sample

We observed important differences for preinjury occupational status; the proportion of employed persons was greater (62.7 vs. 41.7%) and the percentage of retired persons was smaller (10.2 vs. 36.1%) in the SCIMS-NDB than in the TSCI IRC population.

The SCIMS-NDB sample was younger than the TSCI IRC population, with minor absolute differences in three age categories and an important relative difference in the mean age. The SCIMS-NDB had more individuals aged 20–29 years (27.1 vs. 17.7%) and fewer individuals aged 70–79 years (4.5 vs. 11.0%) and 80–89 years (1.3 vs. 6.8%), and the mean age was substantially lower (40.6 vs. 51.5 years). There were minor sex distributional differences, with the SCIMS-NDB having a greater percentage of males (78.4 vs. 71.1%), and minor differences on marital status and race/ethnicity, such that the SCIMS-NDB had a larger percentage of individuals who were never married (45.3 vs. 37.6%), fewer who were white (62.0 vs. 71.8%), and more individuals who were black (24.7 vs. 16.1%).

The SCIMS-NDB sample tended to have lower admission FIM motor scores with one important and two minor absolute differences in categories. The NDB had a substantially larger percentage of individuals with scores at the floor (26.2 vs. 15.5%), while there were fewer with scores in the 34–43 (11.3 vs. 16.6%) and 44–53 (5.1 vs. 11.8%) range. There was also a minor relative difference in mean admission FIM motor scores, such that the SCIMS-NDB sample had a lower mean score than the TSCI IRC population (24.4 vs. 30.8). Time to rehabilitation admission was similar between the SCIMS-NDB sample and the TSCI IRC population, with the exception of two minor differences: the SCIMS-NDB had a lower percentage receiving less than 1 month of acute care before rehabilitation admission (72.9 vs. 79.1%) and a larger percentage receiving 1–2 months of acute care (17.4 vs. 10.8%). Level of injury and completeness of injury were similar between the SCIMS-NDB and the TSCI IRC; however, any meaningful comparison of these variables is limited by the high rates of missing data in both databases, particularly for the TSCI IRC population.

Discussion

The 2000–2010 SCIMS-NDB sample is comparable in most respects to the TSCI IRC population and thus is largely representative of adults in the US receiving inpatient rehabilitation for a newly acquired TSCI. Researchers using the SCIMS-NDB can be confident that research findings based on these data generalize well to the larger population receiving rehabilitation for TSCI in the US. However, there are differences that limit generalizability.

The SCIMS-NDB sample and the larger TSCI IRC population differed primarily in terms of age and preinjury occupational status. The SCIMS-NDB had a higher percentage of employed persons and a lower percentage of retired persons, perhaps reflecting differences in the age distribution. The TSCI IRC population is more than a decade older than the SCIMS-NDB sample (mean age 51.5 vs. 40.6 years) and therefore likely to have a larger percentage of persons who have retired due to age. The SCIMS-NDB has a larger percentage of individuals aged 20–29 years and a smaller percentage of individuals aged 70–89 years. The SCIMS-NDB also has a slightly larger proportion of males and those reporting black race and never having married. These differences may reflect the urban locations of most SCIMS centers [1, 2]. In fact, the Census has found that urban residents over age 18 are younger (median age of 45 years) compared to rural residents (median age of 51) [23]. The need to receive patient consent for participation in the SCIMS-NDB (but not the TSCI IRC registries) may exacerbate these differences.

The SCIMS-NDB sample has slightly lower FIM motor scores and longer intervals between the injury date and admission to inpatient rehabilitation. NIDILRR selects SCIMS centers using competitive peer review that requires a demonstration of excellence in rehabilitation care, particularly for persons with complicated TSCI. Sample differences may reflect the larger referral networks of SCIMS centers and their capacity to provide tertiary rehabilitation.

Users of the SCIMS-NDB could adjust for the differences noted and more closely estimate characteristics of the US TSCI IRC population by using sample weights. Advanced weighting methodology such as iterative proportional fitting [24] can estimate weights for the SCIMS-NDB sample so that it aligns better with the characteristics of the TSCI IRC population. Such adjustment will only correct for discrepancies in the characteristics weighted and those strongly associated with them, but there is no guarantee that it will achieve equivalence in other clinical and demographic aspects.
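To make the weighting idea concrete, the following is a minimal pure-Python sketch of iterative proportional fitting (raking) on two categorical margins. It is an illustration under stated assumptions, not a recommended production procedure: the function name `rake`, the toy records, and the retired/not-retired split are hypothetical, and only the target percentages for males (71.1%) and retired persons (36.1%) are taken from the TSCI IRC figures reported above.

```python
def rake(records, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting of case weights to target margins.

    records: list of dicts mapping variable name -> category.
    targets: dict mapping variable name -> {category: target proportion}.
    Every category present in the records must have a target proportion.
    Returns one weight per record; the weights sum to len(records).
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for rec, w in zip(records, weights):
                totals[rec[var]] = totals.get(rec[var], 0.0) + w
            grand = sum(totals.values())
            # Rescale each case so this margin matches its target exactly.
            for i, rec in enumerate(records):
                factor = target[rec[var]] * grand / totals[rec[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:   # all margins already match: converged
            break
    return weights

# Hypothetical toy sample: mostly male, mostly not retired.
records = (
    [{"sex": "M", "retired": "no"}] * 14
    + [{"sex": "M", "retired": "yes"}] * 2
    + [{"sex": "F", "retired": "no"}] * 3
    + [{"sex": "F", "retired": "yes"}] * 1
)
targets = {
    "sex": {"M": 0.711, "F": 0.289},          # male share in the TSCI IRC population
    "retired": {"yes": 0.361, "no": 0.639},   # retired share in the TSCI IRC population
}
weights = rake(records, targets)
```

After raking, the weighted sex and retirement margins of the toy sample match the targets, while other characteristics change only to the extent that they are associated with the raked variables, which is the caveat raised in the paragraph above.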

Limitations

These results are subject to several limitations. The cases in the combined UDS-PRO/eRehabData data set do not comprise the entire population, as not all TSCI cases are reported to these two repositories; however, it is estimated that more than 92% of all IRCs report data and these include the largest facilities, hence close to 100% of patients are likely reported. We do not know how well these data represent the TSCI IRC population outcomes after rehabilitation or how representative they are of all individuals who incur TSCI in the US, especially those who do not receive hospital-based inpatient rehabilitation (e.g., home care rehabilitation, nursing home rehabilitation, no rehabilitation, or death). The data used to assess the representativeness of the SCIMS-NDB are not without error. While we tried to exclude cases in the IRC population with diagnoses other than TSCI, we may have included some cases inadvertently. As with all administrative data, the UDS-PRO and eRehabData are subject to error and diagnostic judgment variations. Similarly, the SCIMS-NDB data are not error-free, despite extensive data verification. Both sources contain missing data, in particular for level and completeness of injury, making comparisons on these variables somewhat tenuous. Furthermore, the classification of important and minor differences we used is arbitrary, though the TBI Model Systems set a precedent, which we adopted [16, 22]. Using Cohen’s h to compare percentages resulted in similar conclusions when a threshold of 0.15 was considered meaningful. We have provided sufficient information in Table 3 for readers who wish to apply different criteria for the significance of distribution discrepancies; in so doing they may reach different conclusions. We examined aggregate data over an 11-year period; the distributions of demographic and injury characteristics may have changed after this period in either data set.
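For readers who wish to repeat the Cohen's h check on the percentages in Table 3, a minimal Python sketch follows. It uses the standard arcsine-transformation definition of h; the function names are hypothetical, and treating |h| ≥ 0.15 as meaningful simply mirrors the threshold mentioned above.

```python
import math

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h for two proportions (each between 0 and 1)."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def is_meaningful(p1: float, p2: float, threshold: float = 0.15) -> bool:
    """Flag a difference whose absolute h meets the chosen threshold."""
    return abs(cohens_h(p1, p2)) >= threshold

# Employed: 62.7% in the SCIMS-NDB vs. 41.7% in the TSCI IRC population.
h_employed = cohens_h(0.627, 0.417)
```

Because h is computed on arcsine-transformed proportions, a given percentage-point gap yields a larger h near 0% or 100% than near 50%, so the two classification approaches need not agree for every category.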

Conclusions

The SCIMS-NDB is a unique resource for addressing research questions that are important for people with TSCI, clinicians, researchers, and policymakers. The characteristics of the 2000–2010 SCIMS-NDB sample are similar to those of the larger US TSCI IRC population, with mostly minor or no differences; hence, research findings based on the SCIMS-NDB are largely representative of the US TSCI IRC population. Users of the SCIMS-NDB can apply statistical methodology, such as weighting, to adjust for the relatively younger age, smaller proportion of retired individuals, and larger proportion of employed individuals to increase generalizability.

Data archiving

All relevant data are within this manuscript and raw data are archived by the authors.