Introduction

Physical deconditioning in people with spinal cord injury (SCI) is a major contributing risk factor for cardiovascular disease, and the clinical practice guideline for identification and management of cardiometabolic risk after SCI recommends exercise testing of all people with SCI when possible [1]. At the Department for Spinal Cord Injuries, Rigshospitalet, Denmark, VO2peak testing is performed during rehabilitation as part of standard care, by different testers and in patients with heterogeneous levels of physical functioning.

The most common modalities used for VO2peak testing in people with SCI have been arm-crank ergometers (ACE) or hand-bikes and manual wheelchair protocols on rolls or treadmill [2]. Arm ergometry protocols were suitable for people without motor function in the lower extremities, but underestimated aerobic capacity in people with an incomplete SCI and preserved motor function in the lower extremities [3]. This may be a problem when comparing individual test results to reference fitness values. A seated recumbent stepper with simultaneous involvement of the upper and lower extremities was found feasible in people with an incomplete SCI, and may be an alternative testing modality although reliability was not investigated [4]. High test-retest reliability of VO2peak was found in previous studies using ACE and robot-assisted treadmill protocols in people with SCI [3, 5]. However, these studies were not conducted in an inpatient rehabilitation setting with different testers involved.

Studies assessing VO2peak in a clinical inpatient setting during rehabilitation are few, and have included arm ergometry protocols in wheelchair-dependent people only [2, 6, 7, 8, 9]. However, inpatients during rehabilitation are heterogeneous due to differences in the level and severity of SCI, and a high number of people have a motor incomplete SCI at admission with a potential for improving motor functioning during rehabilitation [10, 11]. Therefore, a set-up with individualized test modalities for testing cardiovascular fitness may be more appropriate for achieving VO2peak results presumed to be optimal in accordance with predefined criteria for VO2peak.

The aim of this study was to investigate the test-retest reliability of individualized testing of VO2peak in a rehabilitation setting involving several testers and a heterogeneous sample of participants reflecting clinical practice. We hypothesized that the predefined test modalities used in this context would be reliable for use in standard care based on measures of relative and absolute reliability. We also hypothesized that selecting individualized test modalities, based on the participants’ motor function and the testers’ clinical judgement, would result in more optimal VO2peak values by more often meeting the standard VO2peak criteria.

Methods

Participants and eligibility criteria

This study was part of a prospective cohort study with consecutive enrollment of all patients with a new SCI during a 10-month period [12] at the Department for Spinal Cord Injuries, Rigshospitalet, Denmark. Participants were > 18 years of age and had sustained a SCI within the 12 months before admission to rehabilitation (Table 1). Due to concerns with completing a valid exercise test, individuals with motor-complete injuries (American Spinal Injury Association (ASIA) Impairment Scale (AIS) A and B) [13] at or above C4 requiring artificial ventilation were excluded from the study. Other, previously used exclusion criteria were the presence of decubiti, severe spasticity, or musculoskeletal problems at risk of exacerbation or aggravation during testing or preventing completion of the test [6]. If diagnosed cardiovascular disease was present, a medical doctor decided whether the participant could engage in the exercise test.

Table 1 Individual participant characteristics and results from the VO2peak test for all 23 participants.

Protocols and equipment

Two different test devices and two predefined protocols for each device were used for VO2peak testing. Primarily, one of the test modalities was used for each participant depending on functional level and overall condition. However, in some instances it was decided to switch to one of the other test modalities to reach a VO2peak presumed to be more optimal (Fig. 1).

Fig. 1: Flow chart showing the process of identifying an appropriate test protocol by clinical judgement based on functional level and overall condition of the participant.

Abbreviations: spinal cord injury SCI; lower extremity motor score LEMS; Total Body Recumbent Stepper Exercise Test protocol TBRS-XT; arm crank ergometer ACE (adjusted for individuals with paraplegia (Para) and tetraplegia (Tetra), respectively); test 1a T1a; test 1b T1b.

A recumbent stepper (NuStep T5XR®, Ann Arbor, MI, USA) was primarily used for participants with a motor incomplete SCI and higher lower extremity motor score (LEMS) (Fig. 1). The two stage protocols incorporated were a Total Body Recumbent Stepper Exercise Test protocol (TBRS-XT) starting at 50 watts (W) with 115 steps per minute (SPM) and a 25 W incremental increase every 2 min during the first three stages and 30 W increments thereafter. The other protocol was a modified TBRS-XT (mTBRS-XT) starting at 25 W with 80 SPM and with 15 W increments every 2 min. Both protocols were described by Billinger et al. [14, 15].

Two different ACE protocols (SCI FIT Pro1®, Tulsa, OK, USA) were used for participants with a motor complete SCI, very deconditioned participants, or participants with an incomplete SCI but a low LEMS [2] (Fig. 1). The protocols were designed as incremental stage protocols starting at 5 W and increasing each minute by 5 W in people with tetraplegia (Tetra-ACE) and by 10 W in people with paraplegia (Para-ACE), at 60 revolutions per minute. The protocols were entered into the stress test program of the SCI FIT Pro1® software before testing. The ACE was easily adjustable for optimal positioning of the upper extremities and was set up for synchronous arm-crank movements.
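The four incremental protocols described above can be summarized as workload schedules. The following is a minimal sketch, not the software used in the study: the `Protocol` class, the `PROTOCOLS` table and the `workload_at` function are hypothetical names introduced for illustration. It assumes one reading of the text, namely that the TBRS-XT increases by 25 W after each of the first two stages and by 30 W after the third and every later stage, and that both ACE protocols start at 5 W.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Protocol:
    start_w: int                    # workload of the first stage (W)
    stage_s: int                    # stage duration (s)
    increments_w: Tuple[int, ...]   # increment after stage 1, 2, ...; last value repeats
    cadence: str                    # target cadence as stated in the text


# Assumed encoding of the four protocols (illustrative, see lead-in).
PROTOCOLS = {
    "TBRS-XT": Protocol(50, 120, (25, 25, 30), "115 SPM"),
    "mTBRS-XT": Protocol(25, 120, (15,), "80 SPM"),
    "Tetra-ACE": Protocol(5, 60, (5,), "60 RPM"),
    "Para-ACE": Protocol(5, 60, (10,), "60 RPM"),
}


def workload_at(name: str, elapsed_s: float) -> int:
    """Return the target workload (W) at a given elapsed test time (s)."""
    p = PROTOCOLS[name]
    completed = int(elapsed_s // p.stage_s)  # number of fully completed stages
    load = p.start_w
    for i in range(completed):
        # Use the i-th increment; once the list is exhausted, repeat the last one.
        load += p.increments_w[min(i, len(p.increments_w) - 1)]
    return load
```

Under these assumptions, `workload_at("TBRS-XT", 6 * 60)` gives the workload of the fourth stage, and `workload_at("Para-ACE", 10 * 60)` the workload after ten completed one-minute stages.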

Testing procedure

Three physiotherapists performed the tests, reflecting clinical practice. Participants were allocated by randomization to two VO2peak test sessions of either intra- or interrater reliability within the last two weeks before discharge from rehabilitation. However, independent of randomization, the physiotherapists, who were skilled in VO2peak testing, chose one of the four test modalities depending on the functional level of the participant. The choice of test modality was based on information provided by the participants’ primary physiotherapist and the testers’ clinical judgement. The flowchart in Fig. 1 guided the process of choosing between the recumbent stepper protocols or arm crank protocols. However, the choice between the two recumbent stepper protocols was based on clinical judgement and feedback from the participant after trying the initial resistance and step cadence (Fig. 1). Participants were instructed to continue the test for as long as possible, and the test was terminated due to volitional exhaustion, when it was not possible to maintain cadence despite verbal encouragement, or if an adverse event occurred.

Criteria for VO2peak were defined as the highest mean VO2 (L/min) recorded over a 30 s sample, and a corresponding respiratory exchange ratio (RER) > 1.0 [2, 6, 16, 17]. If predefined criteria for VO2peak were not reached during the first test (T1a), another protocol was chosen in order to yield a VO2peak presumed to be more optimal. The new protocol was tested (T1b) and retested (T2b), resulting in three tests performed in total. VO2peak and RER from T1a and T1b were compared to determine if any improvement occurred. However, this approach was not possible if the Tetra-ACE protocol was too hard and the criteria for VO2peak were not reached. A switch to a more demanding protocol was decided if the test duration exceeded 15 min, in order to reach a VO2peak presumed to be more optimal. Likewise, if the test duration was shorter than 5 min, a less demanding protocol was chosen if possible (Fig. 1).
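The switching rules above can be written as a small decision helper. This is only a sketch under stated assumptions: the function name `suggest_protocol_change`, its return labels and the order in which the checks are applied are hypothetical, and in the study the final choice rested on clinical judgement and on whether an alternative protocol was feasible for the participant.

```python
def suggest_protocol_change(duration_min: float, rer_at_peak: float,
                            short_min: float = 5.0, long_min: float = 15.0,
                            rer_cutoff: float = 1.0) -> str:
    """Suggest whether the T1a protocol should be changed before T1b.

    Returns one of: "more demanding", "less demanding", "reconsider", "keep".
    """
    if duration_min > long_min:
        # Test lasted longer than 15 min: a more demanding protocol is indicated.
        return "more demanding"
    if duration_min < short_min:
        # Test ended before 5 min: a less demanding protocol, if one is possible.
        return "less demanding"
    if rer_at_peak <= rer_cutoff:
        # Duration acceptable but RER criterion not met: direction of the
        # switch is left to clinical judgement (assumption of this sketch).
        return "reconsider"
    return "keep"
```

For instance, a test of acceptable duration that ends with an RER of 0.92 would be flagged as "reconsider", whereas a test exceeding 15 min would be flagged as "more demanding" regardless of RER.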

Test and retest were performed at the same time of day, 48 h to 5 days apart. Participants were instructed to refrain from caffeine, alcohol and intensive physical exercise prior to the test on the day of testing, and from tobacco smoking for two hours before testing. Bladder emptying was carried out immediately before testing. A warm-up of at most 5 min was performed if desired by the participant.

VO2peak was measured as absolute VO2 (L/min) by a portable metabolic system for cardiopulmonary exercise testing (Metamax 3B from Cortex Biophysik GmbH, Leipzig, Germany) and a Hans Rudolph mask, with breath-by-breath measurements averaged every 15 s. The equipment was calibrated prior to every test according to the manufacturer's recommendations, as described previously [17].
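To illustrate how the VO2peak criteria relate to the 15 s averages, the sketch below takes the highest mean of two consecutive 15 s values as the 30 s peak and reports the RER over the same window. It assumes a rolling window over consecutive samples; whether the metabolic system's software uses rolling or fixed 30 s bins is not stated here. The function names are hypothetical, and the values in the usage note are invented for illustration only.

```python
from typing import List, Tuple


def vo2peak_from_samples(vo2: List[float], rer: List[float],
                         window: int = 2) -> Tuple[float, float]:
    """Return (VO2peak, corresponding RER) from consecutive 15 s averages.

    With window=2, each candidate is the mean of two consecutive 15 s values,
    i.e. a 30 s sample. The RER is averaged over the same window.
    """
    if len(vo2) != len(rer):
        raise ValueError("vo2 and rer must have the same length")
    if len(vo2) < window:
        raise ValueError("not enough samples for one window")
    best_vo2, best_rer = None, None
    for i in range(len(vo2) - window + 1):
        mean_vo2 = sum(vo2[i:i + window]) / window
        mean_rer = sum(rer[i:i + window]) / window
        if best_vo2 is None or mean_vo2 > best_vo2:
            best_vo2, best_rer = mean_vo2, mean_rer
    return best_vo2, best_rer


def meets_vo2peak_criteria(rer_at_peak: float, cutoff: float = 1.0) -> bool:
    """RER criterion used in this study: strictly greater than 1.0."""
    return rer_at_peak > cutoff
```

With the made-up series `vo2 = [1.0, 1.5, 1.9, 2.1, 2.0]` and `rer = [0.85, 0.95, 1.02, 1.08, 1.10]`, the peak window is the last pair of samples, and its mean RER exceeds the 1.0 cutoff.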

Statistical analysis

Descriptive statistics were calculated for demographics and expressed as median and interquartile range (IQR). Analysis of VO2peak at discharge and VO2peak in relation to protocol and severity of SCI was based on data from T1 (T1a or T1b) and expressed as mean and 95% confidence interval (95%CI). Participants were divided into three subgroups based on SCI severity and described as C1-C8 AIS A,B,C; T1-S5 AIS A,B,C and ALL AIS D regardless of injury level [18].
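The three severity subgroups can be expressed as a simple lookup. The sketch below assumes that the neurological level is given as a string such as "C5" or "T10" and the AIS grade as a single letter; the function name and input format are hypothetical and not taken from the study's data files.

```python
def severity_group(level: str, ais: str) -> str:
    """Assign a participant to one of the three SCI severity subgroups."""
    grade = ais.strip().upper()
    if grade not in {"A", "B", "C", "D"}:
        raise ValueError(f"unexpected AIS grade: {ais!r}")
    if grade == "D":
        # AIS D forms one group regardless of injury level.
        return "ALL AIS D"
    region = level.strip().upper()[:1]
    if region == "C":
        return "C1-C8 AIS A,B,C"
    if region in {"T", "L", "S"}:
        # Thoracic, lumbar and sacral levels together cover T1-S5.
        return "T1-S5 AIS A,B,C"
    raise ValueError(f"unexpected neurological level: {level!r}")
```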

Based on VO2peak, relative reliability was analyzed for test-retest and intra-rater reliability by intraclass correlation coefficient (ICC) with 95% CI, using a two-way mixed effects model, absolute agreement, single measurement. ICC for inter-rater reliability was analyzed using a two-way random effects model, consistency, single measurement [19]. Absolute reliability was analyzed by standard error of measurement (SEM), calculated as SD × √(1 − ICC). Also, limits of agreement were determined by Bland–Altman plots, and linear regression was performed. An alpha level of 0.05 was used to indicate statistical significance. All statistical analyses were performed using IBM SPSS statistics version 22.
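For readers who wish to reproduce these reliability measures outside SPSS, the sketch below computes single-measurement ICC point estimates from the two-way ANOVA mean squares, in the absolute-agreement and consistency forms, together with the SEM formula given above. It is an illustration and not the SPSS implementation: the function names are hypothetical, confidence intervals are not computed, and which SD enters the SEM (for example the pooled SD of all scores) is an assumption, since the text does not specify it.

```python
import math
from statistics import mean


def _mean_squares(data):
    """data: one row per participant, each row holding k repeated scores."""
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    row_means = [mean(row) for row in data]
    col_means = [mean(row[j] for row in data) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions/raters
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return n, k, msr, msc, mse


def icc_agreement(data):
    """Single-measurement ICC, absolute agreement."""
    n, k, msr, msc, mse = _mean_squares(data)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_consistency(data):
    """Single-measurement ICC, consistency."""
    n, k, msr, msc, mse = _mean_squares(data)
    return (msr - mse) / (msr + (k - 1) * mse)


def sem(sd, icc):
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)
```

The difference between the two ICC forms shows up when retest scores are shifted by a constant: for the invented data `[[1, 2], [2, 3], [3, 4]]` the consistency form is 1, while the absolute-agreement form is lower because the systematic offset between sessions counts as disagreement.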

Results

Descriptives

Of 46 who consented, 23 were able to participate in the VO2peak test with 21 participants completing the test-retest procedures (Table 1). One participant did not complete T1a due to worsening of knee pain and one was discharged before T2a. T1a was performed within the last two weeks before discharge (median 5.5 days, IQR 4.0).

Individuals excluded from participation were classified as AIS A-D (52% AIS D) and had a median age of 63 years (IQR 23.7); 12 were men (52%) and 11 women (48%), with 16 (70%) having a non-traumatic and 7 (30%) a traumatic SCI. The primary reasons for exclusion were limited motor function (N = 8) and pain or aggravation of symptoms (N = 3) (Table 2).

Table 2 Reasons for exclusion from peak oxygen uptake test (VO2peak) at discharge from primary rehabilitation for 23 participants.

VO2peak

All participants exercised until volitional exhaustion except one who terminated testing due to knee pain, and thus VO2peak from this participant was excluded from analysis. Mean VO2peak obtained at T1 (T1a or T1b) was 1.91 L/min (95%CI: 1.31–2.51) with a minimum of 0.61 L/min achieved on the Tetra-ACE protocol and a maximum of 3.38 L/min on the TBRS-XT protocol (Fig. 2c). The highest VO2peak values were obtained among participants classified as AIS D (1.98 L/min, 95%CI: 1.55–2.41) and were significantly higher than VO2peak in participants with a T1-S5 level of injury classified as AIS A, B, or C (1.36 L/min, 95%CI: 0.81–1.92), p < 0.001 (Fig. 2a). The AIS D group also reached the highest workloads at VO2peak (Fig. 2b). VO2peak obtained on the TBRS-XT protocol (2.69 L/min, 95%CI: 2.24–3.14) was significantly higher than VO2peak obtained on the mTBRS-XT protocol (1.26 L/min, 95%CI: 0.91–1.61), p < 0.001 (Fig. 2c). Overall, participants with a low LEMS were tested on one of the ACE protocols, and participants with higher LEMS, ranging from 40 to 50 points, were tested on one of the recumbent stepper protocols, as expected (Table 1).

Fig. 2: Peak oxygen uptake and peak power.

a Peak oxygen uptake (VO2peak) (L/min) according to neurological level of injury and severity of spinal cord injury (SCI) according to American Spinal Injury Association (ASIA) Impairment Scale (AIS) A, B, C, and D. Participant numbers are specified for each column in the same order as shown in Table 1. Dotted line indicates mean VO2peak in each severity group; b Peak power (W) achieved at VO2peak for each of the test modalities. Participant numbers are specified for each column. Dotted line indicates mean peak power for each test modality; c VO2peak (L/min) according to the different test modalities used. Participant numbers are specified for each column. Dotted line indicates mean VO2peak for each test modality. Abbreviations: Tetra ACE test protocol for people with complete tetraplegia on arm-crank ergometer; Para ACE test protocol for people with complete paraplegia on arm-crank ergometer; mTBRS-XT modified Total Body Recumbent Stepper Exercise Test (modified protocol); TBRS-XT Total Body Recumbent Stepper Exercise Test (standard protocol).

A protocol shift from the mTBRS-XT protocol at T1a occurred in seven participants classified as AIS D. Of those, five participants did not reach predefined criteria for VO2peak. Test duration was short in three participants, but a switch to a less demanding ACE protocol was not possible due to upper extremity or neck problems. Two participants reached a higher VO2peak after switching to the TBRS-XT protocol on T1b, improving VO2peak from 2.24 L/min (RER 0.92) at T1a to 3.38 L/min (RER 1.17) and from 1.94 L/min (RER 0.96) at T1a to 2.56 L/min (RER 1.03), respectively (Table 1).

Two participants reached VO2peak at T1a after a prolonged test session exceeding 15 minutes. Switching to the TBRS-XT protocol on T1b resulted in faster volitional exhaustion after 10.00 and 10.30 min, respectively, but the more demanding protocol did not improve VO2peak. Values obtained at T1a and T1b were 2.05 L/min vs. 2.00 L/min, and 1.99 L/min vs. 1.93 L/min for these two participants.

Compared to reference values for able-bodied men and women, VO2peak was lower in the study participants. In young men (20–29 years), mean VO2peak was 1.63 L/min (SD 0.53) compared to 4.30 L/min (SD 0.73), and in women ≥70 years it was 0.72 L/min (SD 0.18) vs. 1.79 L/min (SD 0.07) [20]. VO2peak obtained by 40–49-year-old men with SCI severity AIS D (n = 5) was high compared to the other participants (2.71 L/min, SD 0.52), but still much lower than the reference value (4.01 L/min, SD 0.62) [20]. Relative VO2peak values (mL·kg−1·min−1) were all in the lowest category regardless of gender and age compared to reference values for able-bodied people. Compared to reference fitness values for people with SCI [21], mean absolute VO2peak for our male participants with paraplegia (AIS A-D) was 1.85 L/min (SD 0.21) and for tetraplegia (AIS A-D) 1.73 L/min (SD 0.29). These are categorized as “excellent”. Our male participants with paraplegia (AIS A, B, C) had a mean VO2peak of 1.51 L/min (SD 0.19) and were in the category “good” [21].

Test-retest reliability

The test-retest sessions T1a and T2a were separated by a median of 2 days (IQR 1.5). When a protocol shift was made the test-retest sessions T1b and T2b were separated by a median of 2.5 days (IQR 3.25).

The overall ICC for the test-retest sessions for all four protocols together was 0.994 (95%CI: 0.986–0.998), SEM 0.041 L/min. Twelve participants (all AIS D) were randomized to assessment of the inter-tester reliability using the mTBRS-XT or TBRS-XT protocols, and nine participants were randomized to intra-tester reliability. Overall ICC for intra- and inter-tester reliability was 0.997 (95%CI: 0.986–0.999), SEM 0.02 L/min and 0.994 (95%CI: 0.978–0.998), SEM 0.04 L/min, respectively.

The Bland–Altman plots showed a mean difference of −0.005 L/min (SD 0.12) for overall test-retest (Fig. 3a), 0.02 L/min (SD 0.08) for intra-tester reliability (Fig. 3b), and −0.02 L/min (SD 0.15) for inter-tester reliability (Fig. 3c). Linear regression analysis showed non-significant t-scores ranging from 0.441 for inter-tester test-retest results to 0.581 for intra-tester test-retest results.
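As a worked illustration of how limits of agreement follow from a mean difference and its SD, the sketch below applies the conventional mean ± 1.96 SD rule to the three pairs of values reported above. The resulting limits are derived here for illustration under that assumption and are not values reported by the study; the function name is hypothetical.

```python
def limits_of_agreement(mean_diff: float, sd_diff: float, z: float = 1.96):
    """Return (lower, upper) limits of agreement as mean_diff -/+ z * sd_diff."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Mean difference and SD (L/min) as reported in the text.
reported = {
    "overall test-retest": (-0.005, 0.12),
    "intra-tester": (0.02, 0.08),
    "inter-tester": (-0.02, 0.15),
}

for label, (mean_diff, sd_diff) in reported.items():
    lower, upper = limits_of_agreement(mean_diff, sd_diff)
    print(f"{label}: {lower:.2f} to {upper:.2f} L/min")
```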

Fig. 3: Bland-Altman plots for test-retest reliability.

a Overall test-retest reliability of peak oxygen uptake (VO2peak) in individuals with spinal cord injury (N = 21); b Intra-tester reliability (N = 9); and c Inter-tester reliability (N = 12). Abbreviations: First test (T1) and retest (T2).

For the respective protocols, ICC for test-retest reliability was 0.992 for the Para-ACE, 0.961 for the mTBRS-XT and 0.990 for the TBRS-XT protocol. Only one participant was assigned to the Tetra-ACE protocol, so no ICC was calculated for it.

Discussion

This study described the content and test-retest reliability of an individualized approach for VO2peak testing developed for a clinical context. The outcomes reported reflected the recommendations for reporting by Eerden et al. including VO2peak, RER, workload at the end of the protocol and reason for termination [2]. Criteria for VO2peak were defined as the highest mean VO2 (L/min) recorded over a 30 s sample, and a corresponding respiratory exchange ratio (RER) > 1.0. Although low, RER > 1.0 has been used previously and recommended as criterion in people ≥65 years [2, 16]. Reaching maximal VO2 defined as reaching a plateau in VO2 despite an increased work rate may be challenging for some patients and sedentary people, and therefore this definition was not used [16]. Participants in our study exercised until volitional exhaustion.

The protocols used were based on previous studies and a systematic review in which small increments in workload, resulting in termination of the test between 8 and 12 min, are recommended [2, 14, 15]. Protocol length for the maximal exercise tests described was in general 6–15 min, with one study reporting termination after 4.51 min in people with tetraplegia using arm cycle ergometry [2]. Therefore, we considered a protocol shift if test duration was shorter than 5 min or exceeded 15 min and volitional fatigue was not reached, even if RER exceeded 1.0.

VO2peak is highly dependent on the level and completeness of SCI as well as testing modality. This study included participants with various neurological levels and SCI severity (AIS A-D) regardless of mobility status at discharge from rehabilitation. Mean VO2peak obtained at T1 (T1a or T1b) (1.91 L/min, 95%CI: 1.31–2.51) was higher than previously reported in studies on VO2peak at discharge from inpatient rehabilitation [6, 9, 22]. These studies only included wheelchair-dependent participants (AIS A–D) aged 18–65 years. Mean VO2peak reported in two of the studies was 1.32 L/min (±0.37) and 1.21 L/min (±0.4) in people with paraplegia, who performed a peak wheelchair test on a motor-driven treadmill [6, 9]. However, both studies included more participants. The third study used a handcycle protocol and reported a VO2peak of 1.38 L/min (±0.54) [22]. A further study included people with an incomplete SCI, but all participants were tested on a wheelchair protocol at discharge from rehabilitation, resulting in a mean VO2peak of 1.05 L/min (±0.5) [23].

The higher mean VO2peak in our study was most likely due to the high proportion of participants classified as AIS D with a high LEMS, making it possible for them to use more muscle mass during testing. This was probably the reason why the participants were categorized as “excellent” according to the SCI reference values, because these were primarily based on a sample of untrained people with motor complete SCI [21]. However, the high proportion of participants classified as AIS D in our sample was more representative of individuals with SCI capable of performing a VO2peak test during rehabilitation in the Nordic countries; in Denmark and Norway, 65–80% of individuals with SCI are discharged with AIS D [24]. In individuals not capable of performing a VO2peak test, and thus excluded from participation, limited motor function was the most common reason, as expected. VO2peak increased with more incomplete injuries, supporting the rationale of using different test modalities (Fig. 2a, c) (Table 1). In participants using the Para-ACE protocol, mean VO2peak was 1.51 L/min (95%CI: 1.10–2.04), which was also slightly higher than previously reported at this timepoint [25]. Despite the higher mean VO2peak, the participants were deconditioned compared to reference values for able-bodied people [20].

Eight participants did not reach RER > 1.0 at T1a but in four participants a protocol shift to a less demanding ACE protocol was not possible. One was already tested on the least demanding ACE protocol, and three participants had upper extremity or neck problems. Smaller increments in workload per stage could have enhanced the participants’ ability to reach VO2peak, securing a stronger relationship between workload and oxygen uptake, but the pre-programmed protocols in the recumbent stepper were not possible to adjust manually [2].

Four participants made a protocol shift to the more demanding TBRS-XT protocol. Two participants did not improve VO2peak, while the other two improved VO2peak and reached a RER > 1.0. This illustrated that it was challenging for the testers to decide on the appropriate protocol in some motor incomplete participants. Although the overall effect of a protocol shift was inconclusive, the shifts showed that, even though the two recumbent stepper protocols were quite similar, the choice of protocol made a difference in those who improved VO2peak, which illustrates the importance of individualized protocols.

The relative reliability for VO2peak in this study was high, with ICC ranging from 0.96 for the mTBRS-XT protocol to 0.99 for the Para-ACE protocol, compared to other reliability studies involving ACE and wheelchair protocols (ICC 0.81–0.83) [26, 27]. However, the ICC values in this study must be interpreted with caution due to the heterogeneous sample of participants, which may increase the between-participants variability and thereby inflate the ICC [28].

Compared to wheelchair protocols, where wheelchair propulsion technique may have an impact on performance, the technical and biomechanical demands on the equipment used in our study seemed modest [29]. In addition, the participants in our study were accustomed to the equipment as it was used during rehabilitation, which may have contributed to limiting some of the expected variability.

Physical deconditioning, which may be a major risk factor for cardiometabolic disease, was present in all participants, which supports the recommendation of continual exercise testing of people with SCI throughout the lifespan [1]. In this perspective, selecting an individualized test modality may ensure that the VO2peak presumed to be most optimal is reached. The VO2peak test provided an opportunity to address the evidence-based exercise guidelines for adults with SCI, recommending moderate and hard exercise intensities [30]. A clinical observation was that many participants stated they had not exercised at this intensity during rehabilitation. Some stated that the VO2peak test, including the test result, motivated them to exercise at these intensities and that the VO2peak test provided an experience of what moderate and hard exercise intensities feel like. This observation contributes to the clinical relevance of testing VO2peak during rehabilitation.

Limitations and strengths

This study described the test-retest reliability of VO2peak tests in a heterogeneous sample, where most participants were male, classified as AIS D, and tested using one of the recumbent stepper protocols. This is also a strength, as it is largely in accordance with the current patient population in our SCI clinic. In this study, a low number of participants were tested in each test modality, and reliability needs to be further established in future studies with more participants included. However, testing reliability in a clinical inpatient setting with different clinicians performing the tests was a strength. A limitation of the study was the small sample size due to consecutive enrollment of participants with a new SCI who were able to perform the test at discharge from rehabilitation. However, this aspect also describes the feasibility of performing a VO2peak test among a heterogeneous group of patients in rehabilitation, which is a strength.

Conclusion

Test-retest reliability of the protocols tested was high based on ICC values and narrow limits of agreement, but further investigation of reliability is needed due to the heterogeneous sample used and the low number of participants. An individualized approach to exercise testing seemed to yield more optimal results. Mean VO2peak was higher than previously described at discharge from inpatient rehabilitation. This was probably due to a large proportion of motor incomplete participants. Thus, the sample was representative of people with SCI undergoing rehabilitation in the Nordic countries and capable of performing a VO2peak test. Despite the higher VO2peak, physical deconditioning was present in all participants. While exercise testing has been recommended, future studies performed during rehabilitation should investigate the effect of individualized test modalities on VO2peak depending on motor functioning. Also, establishing the reliability of VO2peak based on the recumbent stepper test modalities, with a larger sample size, is needed.

Data archiving

Data will be available by request from the corresponding author.