Introduction

Patients suffering from spinal cord injury (SCI) experience numerous symptoms affecting their quality of life (QoL). One of these symptoms is neurogenic bowel dysfunction (NBD). The most frequent manifestations of NBD are fecal incontinence (FI) and constipation. Understandably, this bowel dysfunction can affect several areas of a patient’s life. For example, it can lead to reduced participation in social activities [1].

The pathophysiology of NBD is quite well studied in patients with SCI. It is known that the internal anal sphincter has an exaggerated smooth muscle response to rectal distention, which induces large rectal contractions. These contractions are associated with deep anal reflex and will most likely result in defecation without any obvious increase in intra-abdominal pressure. Because of the unpredictable nature of the spinal reflex, this mechanism makes FI a difficult problem for patients with SCI [2]. Furthermore, constipation in patients with SCI is most likely due to prolonged colonic transit time, side effects of medication (e.g., anticholinergics), and immobilization [3]. Other contributing factors include loss of sensory function at the level of the rectum and perineum, inability to actively contract the pelvic floor muscles, and variable loss of abdominal muscle contraction and hence of the ability to generate intra-abdominal pressure. Most patients experience both constipation and FI [4].

The NBD score is a questionnaire-based symptom score/patient-reported outcome (PRO) measure that was developed for clinical assessment of colorectal and anal dysfunction in patients with SCI [4].

Since bowel dysfunction can be a sensitive subject to discuss for patients, physicians should be aware of such dysfunction, and start the conversation about this subject and possible symptoms.

At this moment, there is no validated Dutch instrument available to measure NBD that can also be used as a tool to assess the impact on QoL in patients with SCI. Validation of the NBD score in the Dutch language will help to evaluate NBD objectively in spinal cord-injured patients in the Netherlands. This PRO measure will make follow-up after therapy and clinical research possible and can function as an instrument to open up the conversation about NBD between patients and physicians. Implementation of the NBD score in guidelines for the care of patients with SCI can help standardize these conversations. The aim of this study is to validate the Dutch-language NBD score in spinal cord-injured patients, so that bowel dysfunction can be measured and followed over time or after treatment. Furthermore, such a validated NBD score could be an important addition to value-based healthcare for neurogenic patients, in which all aspects of pelvic health are measured and evaluated with PROs.

Methods

Study design and study population

We conducted this prospective cohort validation study at the urology department of the Erasmus University Medical Center Rotterdam and a general practitioner (GP) office. The institutional Medical Ethics Review Board reviewed the research proposal and gave ethical approval (MEC 2018-1050).

Between August 2018 and February 2020, adult patients with SCI visiting our urology department were invited to participate in this study. Patients were not eligible for this validation study if they had difficulty reading and/or understanding the Dutch language, suffered from cognitive impairment or inflammatory bowel disease, had a recent gastroenterological malignancy or a bowel stoma. After signing the informed consent form, participants were asked to fill in a set of questionnaires at baseline. A second set of questionnaires was handed out on paper and returned by post after 1–2 weeks. Questionnaires were excluded if they were returned ≥35 days after the first set was completed. Patients’ clinical characteristics were retrieved from their medical files.

For this validation process, we used data from a control group that had been collected earlier for a similar trial in patients with multiple sclerosis (MS). This reference group was recruited at the participating GP office in Rotterdam, the Netherlands. Adult patients visiting the GP in September/October 2018 were asked to participate while waiting for their appointment. Patients were eligible if they had no difficulty reading and/or understanding the Dutch language. Exclusion criteria were similar to those of the patient group. All participants signed informed consent, and the references filled in the same set of questionnaires once.

Questionnaires

The NBD score is a questionnaire developed for patients with SCI and consists of symptom-based questions covering both constipation and FI [4]. The 10 multiple-choice questions have weighted answer options. Higher total scores represent more severe bowel dysfunction (0–6 very minor, 7–9 minor, 10–13 moderate, and 14 or more severe). Some minor linguistic adjustments were made after the original publication, and a scale question was added on general satisfaction with the current bowel management [5]. The latter is intended to give the physician insight into possible needs of the patient. Since this question was not included in the original questionnaire, and was therefore not validated, we decided to omit it in the current validation study. See attachment 1 for the Dutch-language version.
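The severity bands quoted above can be written as a small lookup. The following is only an illustrative sketch: the function name `nbd_severity` is hypothetical, and the item weights needed to compute the total score itself are not reproduced here.

```python
def nbd_severity(total_score: int) -> str:
    """Map a total NBD score to the severity band quoted in the text.

    Bands: 0-6 very minor, 7-9 minor, 10-13 moderate, 14 or more severe.
    """
    if total_score < 0:
        raise ValueError("total NBD score cannot be negative")
    if total_score <= 6:
        return "very minor"
    if total_score <= 9:
        return "minor"
    if total_score <= 13:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    # Hypothetical total scores, chosen to hit each band boundary.
    for score in (0, 6, 7, 9, 10, 13, 14):
        print(score, nbd_severity(score))
```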

Three additional questionnaires were included in the set for the validation process. The first questionnaire is the Fecal Incontinence Quality of Life scale (FIQL), which measures the severity of incontinence for gas, mucus, liquid, and solid stool by scoring its frequency [6, 7]. This questionnaire has two different rating options, one for specialists and one for patients. The latter rating option was used for this study. The second questionnaire is the Fecal Incontinence Severity Index (FISI), which scores the severity of FI in patients by asking about the degree of occurrence of four aspects of incontinence: gas, mucus, liquid stool, and solid stool [7, 8]. Similar to the FIQL, this questionnaire has two rating options, a specialist-specific and a patient-specific one. Again, patient-specific ratings were used for this study. The final questionnaire is the European Quality of life 5-Dimension 3-Level questionnaire (EQ-5D-3L), a widely used instrument to measure health-related quality of life (HRQOL). The HRQOL is measured through five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. This questionnaire also includes the EuroQol Visual Analog Scale, on which patients rate their health state from “best imaginable health state” to “worst imaginable health state”.

Linguistic validation

Standardized guidelines on linguistic validation were followed for the cross-cultural adaptation of the original English-language NBD score into the Dutch language [9]. During this process, three professional native Dutch-speaking translators performed the forward translation individually. During a meeting hosted by the researchers, consensus on the Dutch version of the questionnaire was reached by all translators and clinicians. After minor adjustments not affecting the context of the questionnaire, the translated version of the questionnaire was approved by two medical consultants with clinical experience in the treatment of patients with SCI. To finalize the translation procedure, an English native speaker performed the backward translation.

Measurement properties

Content validity

During the process of cross-cultural adaptation, content validity was assessed in face-to-face interviews with 17 patients at the outpatient clinic. These patients were asked to give their opinion on the clarity of the questions and the ease of filling them in. The researchers also assessed the questionnaire items and their correlation with the known clinical symptoms, which was found to be adequate.

Internal consistency

Internal consistency determines if items in a questionnaire are correlated, and if these questions measure the same underlying concept. Cronbach’s alpha was measured for the total score of the NBD score, since no subscales are available. Values between 0.70 and 0.95 were considered sufficient, confirming adequate internal consistency [10].
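As an illustration of the statistic, Cronbach’s alpha can be computed from the item variances and the variance of the total score. The sketch below is not the SPSS procedure used in the study; it assumes complete responses (no unanswered items), and the function names and example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same >= 2 items")
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def is_sufficient(alpha, low=0.70, high=0.95):
    """Apply the 0.70-0.95 range used in the text."""
    return low <= alpha <= high


if __name__ == "__main__":
    # Hypothetical answers of four respondents to three items.
    answers = [[0, 1, 0], [2, 3, 1], [1, 1, 0], [3, 3, 2]]
    alpha = cronbach_alpha(answers)
    print(round(alpha, 2), is_sufficient(alpha))
```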

Reproducibility

Reproducibility of a questionnaire is the degree of similarity of answers measured at different time points in a clinically stable situation. The test–retest period of 1–2 weeks was chosen following the recommendations of Terwee et al. [10], to determine the reproducibility of the NBD score. This time period is thought to be long enough to prevent recall bias, but short enough to prevent clinical change. The intraclass correlation coefficient (ICC) is used to determine the agreement between the test and retest measurements of the total NBD score. An outcome > 0.70 is considered adequate. The limits of agreement (LOA) are calculated as the mean change in scores between the repeated measures ± 1.96 × the standard deviation (SD) of that change [11, 12].
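Both reproducibility measures can be sketched in a few lines. The text does not state which ICC model was used, so the example below assumes a two-way random-effects, absolute-agreement, single-measurement ICC; the LOA follow the conventional form of mean difference ± 1.96 × SD. Function names and scores are hypothetical.

```python
from statistics import mean, stdev


def icc_agreement(test, retest):
    """ICC for absolute agreement (two-way random effects, single measure)."""
    test, retest = list(test), list(retest)
    n, k = len(test), 2  # n patients, k = 2 time points
    if n != len(retest) or n < 2:
        raise ValueError("need paired scores from at least two patients")
    grand = mean(test + retest)
    row_means = [(a + b) / 2 for a, b in zip(test, retest)]
    col_means = [mean(test), mean(retest)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in test + retest)
    ss_error = max(ss_total - ss_rows - ss_cols, 0.0)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def limits_of_agreement(test, retest):
    """Mean retest-minus-test difference +/- 1.96 * SD of the differences."""
    diffs = [b - a for a, b in zip(test, retest)]
    centre = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return centre - spread, centre + spread


if __name__ == "__main__":
    # Hypothetical total NBD scores of five patients at test and retest.
    first = [4, 12, 9, 15, 7]
    second = [5, 10, 9, 16, 6]
    print(round(icc_agreement(first, second), 2))
    print(limits_of_agreement(first, second))
```

A systematic shift between test and retest lowers this agreement-type ICC even when patients keep the same rank order, which is why it is a stricter measure than a plain correlation.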

Criterion validity

Criterion validity is determined by measuring the correlation between the NBD score and a gold standard. Because no true gold standard exists for the NBD score, we chose the FIQL, FISI, and the EQ-5D-3L as a suitable combined gold standard. When a linear association was seen, Pearson’s correlation coefficient (range −1 to 1) was used to measure the correlation of the NBD score with the chosen questionnaires. If no linear association was seen, the Spearman correlation coefficient (range −1 to 1) was used instead.
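The two coefficients differ only in what they are computed on: Pearson uses the raw scores, Spearman uses their ranks. A minimal standard-library sketch follows; the function names and the example scores are hypothetical, and no significance test is included.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical NBD and FISI total scores of six patients.
    nbd = [2, 5, 9, 12, 14, 20]
    fisi = [0, 8, 10, 22, 21, 40]
    print(round(pearson(nbd, fisi), 2), round(spearman(nbd, fisi), 2))
```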

Construct validity

Construct validity was determined by testing the following predefined hypotheses and was considered adequate when ≥75% of the hypotheses were confirmed [10].

  • Scores of references on the NBD score will be lower than scores of patients.

  • Patients who score lower on the FIQL will score higher on the NBD score.

  • Patients who score higher on the FISI will score higher on the NBD score.

  • Patients who score lower on the EQ-5D will score higher on the NBD score.

Floor and ceiling effects

When the lowest or highest score possible on a questionnaire is reached by ≥15% of all respondents, floor and ceiling effects are present [10]. These effects were assessed for the total score of the NBD score at baseline, for both the patients and the reference group.
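The 15% rule can be checked directly from the score distribution. In the sketch below, the function name is hypothetical and the maximum score of 47 is an assumption that is not stated in this text. The example uses the reported share of references with the lowest score (52%, i.e., 26 of 50); the remaining 24 scores are invented for illustration.

```python
def floor_ceiling(scores, min_score, max_score, threshold=0.15):
    """Share of respondents at the lowest/highest score and the >= 15% rule."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores given")
    floor_share = sum(s == min_score for s in scores) / n
    ceiling_share = sum(s == max_score for s in scores) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share >= threshold,
        "ceiling_effect": ceiling_share >= threshold,
    }


if __name__ == "__main__":
    # 26 of 50 references scored 0 (reported); other scores are hypothetical.
    reference_scores = [0] * 26 + [2] * 14 + [5] * 10
    # max_score=47 is an assumed upper bound of the NBD score.
    print(floor_ceiling(reference_scores, min_score=0, max_score=47))
```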

Statistical methods

A sample size of 50 patients and 50 controls was chosen based on guidelines on validation of questionnaires that were followed during this validation study [10]. All statistical analyses were performed using SPSS version 25 (IBM Corp, Armonk, NY). Mean and SD are reported in case of continuous data, and numbers or percentages for categorical data. Differences between patients and references were tested with Chi-square tests for categorical variables and Student’s t test for continuous variables. P values of <0.05 were considered statistically significant.
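For a categorical comparison such as sex by group, the Chi-square test on a 2 × 2 table can be sketched with the standard library alone. This is not the SPSS output of the study: it assumes the Pearson Chi-square without continuity correction, the function name is hypothetical, and the counts are invented rather than taken from Table 1. For one degree of freedom the P value equals erfc(sqrt(chi2 / 2)).

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: men/women among patients and among references.
    chi2, p = chi_square_2x2(35, 20, 20, 30)
    print(round(chi2, 2), p < 0.05)
```

The Student’s t test for continuous variables would additionally need the t distribution to obtain a P value, which is why it is left to a statistics package here.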

Results

Between August 2018 and February 2020, patients were asked to participate in this trial. Fifty-eight patients agreed to participate after being informed and signed the informed consent form. Two patients were excluded because the period between the first and the second sets was longer than 35 days. One patient was excluded because only the informed consent form was signed and further participation was declined. Eight patients were not included in the retest analysis because they did not return the second set of questionnaires.

In September or October 2018, 50 control patients at the GP office agreed to participate in this trial and filled in the set of questionnaires while waiting for their appointment. These references were significantly younger than the patients, and there were significantly more men in the patient group than in the reference group. The baseline total scores of all questionnaires differed significantly between patients and references. Characteristics of the patient and reference groups can be found in Table 1.

Table 1 Characteristics and baseline scores, presented as mean ± standard deviation or numbers (%).

Content validity

After professional medical translators translated the questionnaire into Dutch, 17 patients were interviewed to assess the validity of content. The questions were found to be relevant, clear, and easy to fill in; content validity was confirmed without the necessity of adjustments.

Internal consistency

Because the NBD score yields a single total score, the total cannot be calculated if one or more questions are left unanswered. For this reason, the total scores of six patients are missing. Five of these six patients left the same questions unanswered at test and retest, namely the questions on “digital evacuation of stool” and “frequency of FI”. Regarding the latter question, patients stated that they missed the answer option “never” and left the question unanswered because no suitable option was available. Internal consistency of the total NBD score was found to be moderate to low: Cronbach’s alpha was 0.56 for the test and 0.30 for the retest.

Reproducibility

The mean time between completing the first and second sets of questionnaires was 17.4 (SD ± 7.4) days. The mean change in the total NBD score between test and retest was −0.5 ± 3.46. For the total score of the questionnaire, the ICC was 0.87, indicating adequate reliability. The LOA of the total scores ranged from −7.28 to 6.78.

Criterion validity

Regarding criterion validity, significant correlations were found between total scores of the NBD score and total scores of the FIQL, FISI, and EQ-5D-3L. These correlations were found to be moderate (Table 2).

Table 2 Criterion validity test.

Construct validity

Good construct validity was found; all predefined hypotheses were confirmed:

  • References had significantly lower scores on the NBD score than patients (Table 1).

  • Patients who scored lower on the FIQL had significantly higher scores on the NBD score (Table 2).

  • Patients who scored lower on the EQ-5D-3L had significantly higher scores on the NBD score (Table 2).

  • Patients who scored higher on the FISI had significantly higher scores on the NBD score (Table 2).

Floor and ceiling effects

No floor or ceiling effects were present in the patient group: one patient (1.8%) had the lowest possible score and no patient had the highest possible score. A floor effect was seen in the reference group, in which 52% of the participants had the lowest possible score (0), indicating that they experienced no neurogenic bowel symptoms. No ceiling effect was present in the reference group; no one had the highest possible score.

Discussion

In the present study, the NBD score was translated into the Dutch language and this version was validated in patients with SCI. We demonstrated that the questionnaire is a reliable and valid tool to measure NBD and the impact of such dysfunction on a patient’s QoL. This enables physicians to evaluate NBD in a quick, easy, and objective manner.

The total scores of the NBD score of patients differed significantly from those of the references, showing that the NBD score can discriminate between those who experience bowel problems and those who do not. The reference group was significantly younger than the patient group and there were significantly more men in the patient group (Table 1). This discrepancy probably had no major influence on the total NBD scores, since the scores of the two groups differed significantly, and therefore probably did not affect the validation process.

Content validity was assessed during face-to-face interviews with 17 patients. In contrast to our other validation study in MS patients, no patients needed clarification on any of the questions [13]. A possible explanation is that the questionnaire was developed for SCI patients, who are generally familiar with the terminology used, for example “digital evacuation of stool”. In addition, bowel dysfunction in patients with SCI is more consistent, whereas in MS these symptoms depend on the course of the disease [3].

Internal consistency was measured for the total scores of the NBD score with Cronbach’s alpha and was shown to be moderate to low (0.56 for the test and 0.30 for the retest). These values are in line with the validation study of the Dutch-language NBD score in MS patients and a previous validation study by Erdem et al. [13, 14]. A low internal consistency alone is not problematic for a validation process according to guidelines on measurements in medicine [15]. The validity of a questionnaire is established by more than internal consistency alone; if the construct measured is evident, the questionnaire can still be valid and reliable. A possible explanation for the lower internal consistency is that both FI and constipation are covered by a limited number of questions with weighted answers, and that no subscales are available.

Criterion validity was measured using related questionnaires. Since no gold standard questionnaire for the validation of the NBD score exists, we chose the FISI, the FIQL, and the EQ-5D-3L as proxy gold standards, although the first two questionnaires solely measure FI and not constipation. Since all measurements correlated significantly, as can be seen in Table 2, we consider these questionnaires suitable for this validation study.

Construct validity was determined by measuring the correlation of the total scores of the NBD score and the FIQL, FISI and the EQ-5D-3L. All hypotheses that were predefined were confirmed, showing good validity of the NBD score.

No floor or ceiling effects were seen in the patient group: only one patient (1.8%) had the lowest possible score and no patient had the highest possible score. As expected, a floor effect was seen in the reference group. A total of 52% of the references had the lowest possible score, indicating no bowel problems.

One of the strengths of this study is that the translation and validation process was performed according to standardized and widely used guidelines. An adequate sample size was reached for both the patient and the reference group. Another strength is the homogeneous nature of the patient group with SCI, showing good validity and adequate reliability in this specific patient group.

A limitation of this study is the dropout rate for the retest analysis. Eight patients were excluded from this analysis because they did not return the second set of questionnaires. The retest was used only to measure the reproducibility of the NBD score. The characteristics of the excluded patients did not differ from those of the included patients, except for age. The excluded patients were significantly older, so clear instructions are recommended for the older patient population with SCI.

In conclusion, the NBD score is a valid and reliable tool that enables physicians to open up the conversation on bowel dysfunction in this specific patient population and to evaluate the impact on QoL. In addition, it will make it possible to evaluate different treatment options, and it can be used in future clinical research.

Furthermore, our group has previously validated the SF-Qualiveen on bladder symptoms [16] and the Multiple Sclerosis Intimacy and Sexuality Questionnaire on sexual dysfunction [17]. Together with these PROs, the NBD score forms a complete set for evaluating pelvic health in neurogenic patients.