Introduction

Spinal cord injury (SCI) is a devastating disease process that results in disability, loss of quality of life and cost to the health-care system. Currently, there are no therapies that robustly improve functional outcomes.1 Magnetic resonance imaging (MRI) has been used in humans to diagnose SCI for three decades and has been proposed as a viable in vivo biomarker.2, 3, 4 High T2 signal intensity within the spinal cord following injury is correlated with identifiable edema, necrosis, cellular infiltrates and hemorrhage.5, 6 The presence of such lesions has been associated with slowed recovery across a number of model systems and in humans.2, 7, 8 More recently, spinal cord lesion volume as measured by MRI has been used to quantitatively characterize SCI following clinical interventions.9

Canine intervertebral disk herniation (IVDH) is a common, naturally occurring SCI that is seen in pet dogs.10, 11 Canine IVDH shares many features with human SCI including lesion histopathology, presence of compressive/contusive facets of injury and use of surgical/rehabilitative treatment modalities.10, 11 As is the case in humans, data from dogs with IVDH suggest that T2-hyperintense lesions within the spinal cord are correlated with behaviorally severe injuries and poorer long-term motor recovery.11

The need for large animal SCI models and similarities between human SCI and canine IVDH have increased the interest in vetting novel therapies in injured dogs. Despite the recent wellspring of canine IVDH-based preclinical trials, data are currently not available concerning the reliability of lesion identification and lesion volume measurement using T2-weighted MRI. Here, we addressed these critical issues in a retrospective cohort of dogs with thoracolumbar IVDH that underwent MRI using a 3-tesla system. We hypothesized that (1) axial images would improve lesion recognition compared with sagittal images; (2) semiautomated lesion segmentation would enhance rater agreement compared with manual segmentation; and (3) segmented lesion volume would correlate with initial injury severity and long-term motor outcome.

Materials and methods

Spinal cord-injured dogs

A medical record database at Texas A&M University was searched in December 2012 to identify dogs with IVDH-associated SCI. The following criteria were required for inclusion: (1) IVDH located between the T3 and L7 vertebrae; (2) ⩽7-day duration between SCI and MRI to capture acute T2 signal changes; and (3) treatment consisting of surgical decompression and physical rehabilitation. Dogs were excluded if (1) spinal cord diseases in addition to IVDH were identified; (2) other systemic diseases that might affect survival beyond 6 months were diagnosed; or (3) IVDH had previously been treated medically or surgically.

Neurological assessments

All dogs were examined immediately before MRI and at 6-week in-hospital re-evaluation (when available) to permit the assignment of two SCI severity scores. The modified Frankel score (MFS; Table 1a) is an ordinal SCI scoring system that broadly parallels the American Spinal Injury Association Impairment Scale and is used to coarsely stratify animals by injury severity. The Texas Spinal Cord Injury Score (TSCIS; Table 1b) is a more refined SCI assessment tool than the MFS, with a gait score that was developed as a simplified version of the Basso, Beattie, Bresnahan Scale used to evaluate locomotion in rats. TSCIS gait scores (0–6 for each limb) were combined, giving each dog a maximum possible TSCIS of 12. Critically, both the MFS and TSCIS have been previously validated in dogs with IVDH, have excellent inter-rater agreement and predict long-term functional outcome.12

Table 1 Validated ordinal spinal cord injury scores used in dogs with IVDH

Magnetic resonance imaging

MRI was performed on a Siemens Verio 3-tesla MRI system (Siemens Medical, Malvern, PA, USA). Turbo spin echo T2-weighted images in the axial and sagittal plane were evaluated. In axial T2-weighted images, TR was 3850–8000 ms, TE was 81–120 ms and slice thickness was 3–4 mm. Pixel dimensions ranged from 0.25 × 0.24 to 0.77 × 0.78 mm (median=0.32 mm × 0.38 mm). In sagittal T2-weighted images, TR was 3440–5000 ms, TE was 86–126 ms, slice thickness was 2–2.5 mm and interslice gap was 0–0.1 mm. Pixel dimensions ranged from 0.42 × 0.47 to 0.94 × 1.19 mm (median=0.65 mm × 0.73 mm).

Image analysis

Axial and sagittal T2-weighted series were randomized, anonymized and evaluated independently by two experienced evaluators using commercially available software (eFilm 3.3, Merge Healthcare, Milwaukee, WI, USA). Evaluators categorized axial and sagittal series based on the presence or absence of a T2-hyperintense spinal cord lesion.

Lesion volume was estimated by each evaluator in axial series using a graphical user interface made with commercially available software (Figure 1, MATLAB 7, R2010b, The MathWorks Inc., Natick, MA, USA). Axial series were utilized for lesion volume estimation as previous data from normal dogs suggested axial images minimize partial volume effects.13 To account for variations in signal intensity (SI), the algorithm used the mean SI of seven manually selected pixels representative of the lesion, five manually selected pixels representative of normal appearing spinal cord (NASC) and five manually selected pixels representative of cerebrospinal fluid (CSF) and epidural fat to determine maximal and minimal intensity thresholds.

Figure 1

Axial image of a dog with spinal cord injury caused by intervertebral disk herniation. In a, there is T2 hyperintensity of the spinal cord gray matter. In b, seed points have been placed within the lesion (red dots), in normal appearing spinal cord (green dots) and in the CSF (blue dots). In c, the lesion has been segmented for volume estimation.

The graphical user interface allowed the user to specify ‘tolerance’ and ‘distance’. ‘Tolerance’ determined the range of SIs to be included in the segmented lesion. The equations used to calculate maximal and minimal intensity threshold were:

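$$\mathrm{Threshold}_{\max} = \mathrm{SI}_{\mathrm{lesion}} \times \left(1 - \frac{\mathrm{tolerance}}{100}\right) + \mathrm{SI}_{\mathrm{CSF/fat}} \times \frac{\mathrm{tolerance}}{100}$$

and

$$\mathrm{Threshold}_{\min} = \mathrm{SI}_{\mathrm{lesion}} \times \left(1 - \frac{\mathrm{tolerance}}{100}\right) + \mathrm{SI}_{\mathrm{NASC}} \times \frac{\mathrm{tolerance}}{100}.$$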
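As a worked illustration of the two threshold equations, the following Python sketch computes both thresholds from the mean seed intensities. It is not the MATLAB interface used in this study: the function name and the example intensities are hypothetical and chosen only for illustration, and only the tolerance of 45 is taken from the study settings.

```python
def intensity_thresholds(si_lesion, si_nasc, si_csf_fat, tolerance):
    """Return (threshold_min, threshold_max) for lesion segmentation.

    si_lesion, si_nasc and si_csf_fat are the mean signal intensities of the
    manually selected seed pixels; tolerance is a percentage (0-100).
    """
    weight = tolerance / 100.0
    # Minimum threshold: lesion SI shifted toward normal appearing spinal cord.
    threshold_min = si_lesion * (1.0 - weight) + si_nasc * weight
    # Maximum threshold: lesion SI shifted toward CSF/epidural fat.
    threshold_max = si_lesion * (1.0 - weight) + si_csf_fat * weight
    return threshold_min, threshold_max


# Hypothetical mean intensities (arbitrary units), tolerance as used in the study.
t_min, t_max = intensity_thresholds(si_lesion=600.0, si_nasc=300.0,
                                    si_csf_fat=1000.0, tolerance=45)
print(f"segment pixels with {t_min:.1f} <= SI <= {t_max:.1f}")
```

With a tolerance of 0 both thresholds collapse onto the mean lesion intensity, and with a tolerance of 100 they widen to the NASC and CSF/fat intensities, which is why intermediate values trade sensitivity against leakage into neighboring tissue.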

‘Distance’ limited the spatial extent of the segmented lesion in-plane by limiting the propagation range from seed points (each segmented pixel automatically became a seed point). The algorithm also allowed through-plane propagation to adjacent slices, which was facilitated by the evaluator manually marking the center of the spinal cord to align adjacent slices. Tolerance was set at 45 and distance at 10 for the current study based on preliminary observations. During a separate session, the images were re-randomized and evaluators used the same interface to manually segment the lesion.
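The in-plane propagation described above can be sketched as a simple region-growing routine. This is a minimal Python illustration under stated assumptions, not the study's MATLAB code: we assume that 'distance' is a radius in pixels around each seed, that seed pixels are always included, and that through-plane propagation is omitted. The function names, the toy image and the per-slice pixel counts are hypothetical.

```python
from collections import deque


def grow_region(image, seeds, t_min, t_max, distance):
    """Grow a 2D region from seed pixels (row, col) on one axial slice.

    A pixel joins the region if it lies within `distance` pixels of a pixel
    already in the region and its intensity is within [t_min, t_max].
    Every newly segmented pixel becomes a seed in turn.
    """
    n_rows, n_cols = len(image), len(image[0])
    segmented = set(seeds)
    queue = deque(segmented)
    reach = int(distance)
    while queue:
        r, c = queue.popleft()
        for dr in range(-reach, reach + 1):
            for dc in range(-reach, reach + 1):
                if dr * dr + dc * dc > distance * distance:
                    continue  # outside the circular propagation range
                rr, cc = r + dr, c + dc
                if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                    continue  # outside the image
                if (rr, cc) in segmented:
                    continue
                if t_min <= image[rr][cc] <= t_max:
                    segmented.add((rr, cc))
                    queue.append((rr, cc))
    return segmented


def lesion_volume_mm3(pixels_per_slice, pixel_w_mm, pixel_h_mm, slice_mm):
    """Volume = segmented pixel count x in-plane pixel area x slice thickness."""
    return sum(pixels_per_slice) * pixel_w_mm * pixel_h_mm * slice_mm


# Toy slice: 300 = normal cord, ~600 = lesion, 1000 = CSF (arbitrary units).
image = [
    [300, 300, 300, 300, 300],
    [300, 600, 620, 300, 300],
    [300, 610, 590, 300, 600],
    [300, 300, 300, 300, 300],
    [1000, 1000, 1000, 1000, 1000],
]
region = grow_region(image, seeds=[(1, 1)], t_min=465, t_max=780, distance=1)
print(sorted(region))
print(lesion_volume_mm3([len(region)], 0.32, 0.38, 3.0), "mm3")
```

In this toy slice, the isolated bright pixel at (2, 4) is excluded when the distance is 1 and included when it is 2, which illustrates how the distance setting controls whether propagation can bridge small gaps of normal-appearing tissue.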

Data analysis

Agreement for binary categorical data was assessed using percent agreement, with its 95% confidence interval (CI) estimated using exact binomial methods, and the kappa statistic with its 95% CI. McNemar’s χ2-test was used to determine whether T2 hyperintensity was significantly more likely to be identified by both evaluators in the axial plane compared with the sagittal plane. Agreement for lesion volume estimation was assessed using the Bland–Altman method. First, we analyzed agreement between evaluators for both manual and semiautomated segmentation. In cases where evaluators did not agree regarding the presence or absence of a lesion, a value of 0 was assigned for the evaluator who did not perceive a lesion. Second, we analyzed the agreement for each evaluator between manual and semiautomated segmentation. For all analyses, plots of the difference versus the mean values were made to graphically depict the data. Least squares linear regression was used to evaluate the statistical significance of any linear relationship observed in Bland–Altman plots. The paired differences of measured values and residuals of estimated values were compared using Wilcoxon signed-rank tests. Association of breed type with evidence of T2-hyperintense lesions in each plane was examined using χ2-tests. Comparisons of body weights by the presence or absence of T2-hyperintense lesions in each plane were made using Wilcoxon rank-sum tests.
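The Bland–Altman quantities described above (mean bias, s.d. of the differences and the least squares slope of the difference on the mean) can be sketched in a few lines of Python. This is an illustrative sketch only: the analyses in this study were run in S-PLUS, the function name is hypothetical, and the volumes below are made-up values rather than study data. A missing measurement (None) is replaced by 0, mirroring the rule used when one evaluator did not perceive a lesion.

```python
from statistics import mean, stdev


def bland_altman(volumes_a, volumes_b):
    """Return bias, s.d. of differences, and slope of difference vs mean."""
    # An evaluator who perceived no lesion contributes a volume of 0.
    a = [0.0 if v is None else float(v) for v in volumes_a]
    b = [0.0 if v is None else float(v) for v in volumes_b]
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2.0 for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    # Least squares regression of the difference on the mean.
    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    sxy = sum((m - m_bar) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx
    return {"bias": bias, "sd": sd, "slope": slope,
            "intercept": bias - slope * m_bar}


# Hypothetical lesion volumes (mm3) from two evaluators for five dogs.
evaluator_1 = [None, 50, 100, 200, 400]
evaluator_2 = [0, 60, 150, 320, 700]
print(bland_altman(evaluator_1, evaluator_2))
```

A negative slope in this output corresponds to the pattern in which the second evaluator's estimates exceed the first evaluator's by a growing margin as the mean volume increases.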

To analyze the outcome associations for binary categorical MRI data, the distribution of MFS and TSCIS was compared between dogs with and without spinal cord lesions using the Wilcoxon rank-sum test for each evaluator and image plane. The significance of Spearman rank correlation coefficients for the ranks of follow-up TSCIS and lesion volumes was tested using the method of Choi.14 Significance for all analyses was set at P<0.05; analyses were performed using S-PLUS (Version 8.2, TIBCO Inc., Seattle, WA, USA).
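For readers unfamiliar with the rank correlation used here, the following Python sketch computes a Spearman coefficient as the Pearson correlation of average ranks. It is an illustration only: the function names and the example values are hypothetical, and the significance test by the method of Choi is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical follow-up TSCIS values and lesion volumes (mm3) for six dogs.
tscis = [12, 10, 12, 4, 8, 0]
volume = [20.0, 150.0, 0.0, 300.0, 90.0, 410.0]
print(round(spearman_rho(tscis, volume), 3))
```

Because TSCIS is ordinal with frequent ties, average ranks are used rather than the shortcut formula based on squared rank differences, which assumes no ties.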

Results

Population characteristics

One hundred and nine dogs were included. The median age was 6 years (range, 2–18 years). There were 67 Dachshunds, 7 mixed breed dogs, 5 Shih Tzus and 19 other breeds with fewer than four individuals represented. There were 51 neutered males, 41 neutered females, 15 intact males and 2 intact females. The median body weight of dogs was 7.1 kg (range, 3.2–51 kg). The median duration of clinical signs was 2 days. The median MFS before MRI was 3 (range, 0–4). The median TSCIS gait score before MRI was 2 (range, 0–10). Six-week post-MRI TSCIS was available for 86 dogs (median TSCIS 10; range, 0–12). Figures 2 and 3 are representative MRI studies.

Figure 2

Axial (a, b) and sagittal (c) T2-weighted MR images of a dog with spinal cord injury caused by intervertebral disk herniation. Panel b is located just cranial to the site of compression shown in panel a. In a, the spinal cord is dorsally displaced and compressed by extruded intervertebral disk material (arrow). There is poorly defined T2 hyperintensity of the spinal cord (arrowhead). In b, there is poorly defined T2 hyperintensity of the ventral and right lateral spinal cord (arrowheads). In c, there is subtle, poorly defined T2 hyperintensity of the spinal cord extending over the length of several vertebrae (arrowheads).

Figure 3

Axial T2-weighted MR images of a dog with spinal cord injury caused by intervertebral disk herniation. Panel a is located just cranial to panels b and c. In a, the spinal cord is displaced to the left by extruded intervertebral disk material (arrow). There is poorly defined T2 hyperintensity of the spinal cord (arrowhead). In b, there is poorly defined T2 hyperintensity of the entire cross-sectional area of the spinal cord (arrowhead). In c, there is focal T2 hyperintensity of the spinal cord gray matter (arrowhead).

Agreement: binary data

Evaluator agreement for presence or absence of a T2-hyperintense spinal cord lesion was 78% (95% CI, 70–86%) in the axial plane and 86% (95% CI, 80–93%) in the sagittal plane. The kappa statistic was 0.550 (95% CI, 0.391–0.709) in the axial plane and 0.581 (95% CI, 0.391–0.771) in the sagittal plane. Evaluators were significantly more likely to agree that there was a lesion in the axial series (34/109) compared with the sagittal series (15/109, P=0.0001; Supplementary Table 1).
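The agreement statistics reported in this section can be computed from a 2×2 table of the two evaluators' calls. The following Python sketch shows one way to obtain percent agreement, an exact (Clopper–Pearson) binomial CI and Cohen's kappa. The function names and the example counts are hypothetical and are not the study data, the CI for kappa is not computed, and the study's own analysis was run in S-PLUS.

```python
from math import comb


def agreement_and_kappa(both_yes, only_a, only_b, both_no):
    """Percent agreement and Cohen's kappa from a 2x2 rater table."""
    n = both_yes + only_a + only_b + both_no
    p_obs = (both_yes + both_no) / n
    a_yes = (both_yes + only_a) / n  # proportion of 'lesion' calls, rater A
    b_yes = (both_yes + only_b) / n  # proportion of 'lesion' calls, rater B
    p_exp = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # chance agreement
    return p_obs, (p_obs - p_exp) / (1 - p_exp)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Find p in [0, 1] with func(p) == target for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Hypothetical counts: both see a lesion, only A, only B, neither.
p_obs, kappa = agreement_and_kappa(20, 5, 10, 15)
print(p_obs, kappa, exact_binomial_ci(35, 50))
```

Kappa is lower than raw percent agreement because it discounts the agreement expected by chance, which is substantial when one category (here, no lesion) dominates.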

Agreement: volume estimation

Using Bland–Altman methods, agreement for T2-hyperintense lesion volume estimation was assessed in the 58 dogs for which at least one evaluator made a measurement. The mean bias (mean difference between evaluators) did not significantly differ between manual and semiautomated segmentation (−142.7 and −91.7 mm3, respectively, P=0.3226). The semiautomated method was more precise (smaller s.d. of the mean bias) than the manual method (151.92 and 255.46 mm3, respectively, P=0.0103). The plots revealed two distinct patterns for both manual and semiautomated methods (Figures 4 and 5). First, differences between Evaluators 1 and 2 became more negative (Evaluator 2>Evaluator 1) as the estimated volume increased. Second, the amount of variation between observers increased as the estimated volume increased (that is, heteroscedasticity indicated by the spread among observations increasing from left to right).

Figure 4

Bland–Altman plot of the difference in manual MRI lesion volume estimations for 58 dogs examined by two evaluators versus the mean manual volume estimations of the two evaluators. The dotted-and-dashed horizontal line represents a difference of 0 between evaluators (complete agreement). The solid horizontal line represents the mean difference (−142.7 mm3) and the dashed lines represent 1 s.d. (255.46 mm3) above or below the mean.

Figure 5

Bland–Altman plot of the difference in semiautomated MRI lesion volume estimations for 58 dogs examined by two evaluators versus the mean semiautomated volume estimations of the two evaluators. The dotted-and-dashed horizontal line represents a difference of 0 between evaluators (complete agreement). The solid horizontal line represents the mean difference (−91.7 mm3) and the dashed lines represent 1 s.d. (151.92 mm3) above or below the mean.

For both manual and semiautomated segmentations, there was a significant (P<0.0001) negative association between the difference between the evaluators’ estimated volumes and the mean estimated volume. In manual lesion volume estimation, for each mm3 increase in the mean, the difference decreased by 0.82 mm3 (slope, −0.82; 95% CI, −1.07 to −0.58). In the semiautomated lesion volume estimation, for each mm3 increase in the mean, the difference decreased by 0.65 mm3 (slope, −0.65; 95% CI, −0.78 to −0.51).

Within each evaluator, semiautomated segmentation estimated lower values compared with manual segmentation (Evaluator 1: mean difference, −39.3 mm3; s.d.=97.11 mm3; Evaluator 2: mean difference, −90.3 mm3; s.d.=281.33 mm3). The agreement between segmentation methods was more variable with increased lesion volume (Figures 6 and 7). There was no significant linear association between the difference and the mean of the techniques (Evaluator 1: P=0.0816 and Evaluator 2: P=0.0697).

Figure 6

Bland–Altman plot of the difference between semiautomated and manual MRI lesion volume estimations made by Evaluator 1 for 58 dogs versus the mean of the two methods. The dotted-and-dashed horizontal line represents a difference of 0 between methods (complete agreement). The solid horizontal line represents the mean difference (−39.3 mm3) and the dashed lines represent 1 s.d. (97.11 mm3) above or below the mean.

Figure 7

Bland–Altman plot of the difference between semiautomated and manual MRI lesion volume estimations made by Evaluator 2 for 58 dogs versus the mean of the two methods. The dotted-and-dashed horizontal line represents a difference of 0 between methods (complete agreement). The solid horizontal line represents the mean difference (−90.3 mm3) and the dashed lines represent 1 s.d. (281.83 mm3) above or below the mean.

Associations between measures of animal size and MRI parameters

We examined relationships between body weight, breed type (Dachshund versus non-Dachshund) and the presence of spinal cord T2 hyperintensity in order to assess the effect of animal size/conformation on lesion detection. There was no significant difference in the body weights of dogs that had T2-hyperintense lesions detected by both evaluators in either the axial (median, 7.1 kg; range, 3.7–45.0 kg; N=34) or sagittal (median, 6.5 kg; range, 4.8–11.9 kg; N=15) planes compared with those that had no observed T2-hyperintense lesions in either the axial (median, 7.1 kg; range, 3.2–51.0 kg; N=75) or the sagittal (median, 7.2 kg; range, 3.2–51.0 kg; N=94) planes (P=0.8471 and P=0.0982, respectively). Similarly, the proportion of Dachshunds with and without spinal cord T2 hyperintensity in the sagittal and axial planes did not differ significantly (P=0.8730 and P=0.8654, respectively). Likewise, there was no significant association of either breed or body weight with semiautomated lesion volume measured by either Evaluator 1 or Evaluator 2 (data not shown).

Outcome associations

For each evaluator, the presence of a T2-hyperintense spinal cord lesion in the axial plane was significantly associated with initial MFS, initial TSCIS and follow-up TSCIS (Table 2). The presence of a T2-hyperintense spinal cord lesion in the sagittal plane was significantly associated with initial MFS, initial TSCIS and follow-up TSCIS for Evaluator 1, but only with initial MFS for Evaluator 2 (Table 3). For semiautomated segmentation, Spearman rank correlation coefficients for the association of follow-up TSCIS values and lesion volumes were −0.194 (Evaluator 1) and −0.147 (Evaluator 2; P=0.2294 and P=0.3608, respectively). For manual segmentation, Spearman rank correlation coefficients were −0.160 (Evaluator 1) and −0.223 (Evaluator 2; P=0.3210 and P=0.1653, respectively).

Table 2 Lesion detection in the axial plane
Table 3 Lesion detection in the sagittal plane

Discussion

Data from this study highlight both strengths and limitations of using T2-weighted MRI to characterize naturally occurring SCIs in dogs. Here, we showed that axial images enhance lesion recognition compared with sagittal images. Further, lesions detected on axial images were more robustly associated with both initial injury severity and long-term motor outcome compared with those visualized via sagittal images. Despite these associations, rater agreement with regard to the presence or absence of T2-hyperintense spinal cord lesions on axial images was deemed moderate. Measurement of lesion volume using a semiautomated technique improved precision compared with manual lesion segmentation. However, we did not detect an association between measured lesion volume and initial injury severity or recovery of motor function. The implications of these data may extend beyond preclinical studies in dogs with naturally occurring SCI as they highlight the superiority of axial T2 images compared with sagittal plane images while simultaneously underscoring the need to carefully assess rater agreement and measurement precision across model systems.

Sagittal MRI has been commonly used in humans with SCI as a means to recognize T2-hyperintense spinal cord lesions, semiquantitatively evaluate lesion size via sagittal ratios and predict long-term functional outcome.2, 15 Data from this study suggest that axial T2-weighted MRI is superior to sagittal images with respect to lesion recognition and associations with measures of initial SCI severity and long-term motor recovery. We believe that this finding is likely generalizable across species as injury characteristics are similar between dogs with IVDH-associated SCI and humans.10, 11 The superior performance of axial images is potentially explained by the fact that voxels are anisotropic, with the longest dimension parallel to the long axis of the spinal cord. This minimizes partial volume effects, compared with sagittal images where the longest dimension of voxels parallels the width of the spinal cord.13

Agreement was moderate for detecting the presence or absence of a T2-hyperintense spinal cord lesion in both axial and sagittal plane images. Imperfect agreement suggests that spinal cord lesions in dogs with IVDH-associated SCI may be indistinct. Reasons for indistinct lesions may include the presence of compressive material in the epidural space, respiratory motion artifact and subtle changes in SI that represent the early phases of cellular infiltration or edema. Edema and cellular infiltration seen shortly after contusion in rodent SCI may only result in transient loss of gray/white matter distinction on T2-weighted images.16 We suggest that imperfect lesion identification is a critical limitation in using MRI-based techniques to assess the spinal cord in dogs with IVDH and may be critical in other species.

We utilized a semiautomated segmentation technique that parallels an algorithm developed for use in multiple sclerosis.17 Thresholds were established on the basis of the intensity of user-defined pixels, and users were allowed to inspect the segmentation and make corrections as deemed appropriate. This added supervisory step increased the amount of time required per study, but eliminated gross errors in segmentation. The semiautomated technique estimated lesion volume more precisely than manual segmentation. This has been recognized by others in patients with multiple sclerosis18, 19 and represents an important benefit of automated or semiautomated segmentation.

Even though the semiautomated method was more precise than manual segmentation, the s.d. of the mean bias was still quite large and statistically significant associations with long-term motor function were not detected. As histologic lesion volume in SCI is well known to correlate with injury severity and outcome,20 it is likely that the lack of relationships between T2-hyperintense lesion volume and behavioral measures resulted from imperfect rater agreement and the tendency for some large lesions to be indistinct, which was evidenced by the heteroscedasticity seen on Bland–Altman analysis.

Critically, several factors could have affected the conclusions of this study. Perhaps most importantly, MRI acquisition protocols varied to some degree between subjects. Protocol variation was necessary in this study population because of the broad range of breeds and sizes of affected dogs. Second, as this was an in vivo study, it was not possible to establish a histologic ground truth to assess the accuracy of manual or semiautomated segmentation. A histologic ground truth study, although challenging to conduct in the context of client-owned animals with naturally occurring disease, would allow optimization of the segmentation algorithm to achieve high overlap with histologically defined regions.

Data archiving

There were no data to deposit.