Introduction

Post-stroke dysphagia (PSD) is common, affecting upwards of 40% of patients in the hours to days after ictus, and is associated with poor outcome, manifest as increased death or dependency, aspiration and pneumonia, and malnutrition1. PSD can be identified by screening and clinical bedside assessments, or diagnosed instrumentally (videofluoroscopy, VFS; fibreoptic endoscopic evaluation of swallowing, FEES); screening devices are also in development2. The severity of aspiration may be quantified using VFS or FEES, and is typically measured using the penetration aspiration scale (PAS)3. Similarly, a number of scales exist for grading the severity of clinical dysphagia based on oral intake, such as the functional oral intake scale (FOIS)4, and the dysphagia severity rating scale (DSRS)5.

The DSRS is a clinician rated scale that was developed from the dysphagia outcome and severity scale (DOSS)6. It grades how severe clinical dysphagia is, by quantifying how much modification is required to fluids and diet, as well as level of supervision, for safe oral intake. The DSRS comprises three subscales that are totalled to give a score ranging from 0 (best) to 12 (worst). The subscales are five-level ordinal assessments of fluid and dietary intake and supervision; each ranges from normal (score 0) to no intake (4) (Table 1).
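
The scoring arithmetic described above can be sketched in a few lines. This is only an illustration of the stated structure (three subscales, each scored 0 to 4, summed to a total of 0 to 12); the class and field names are hypothetical and the sketch does not encode the level definitions in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DSRSScore:
    """Illustrative container for one DSRS rating (names are hypothetical)."""

    fluids: int
    diet: int
    supervision: int

    def __post_init__(self) -> None:
        # Each subscale is a five-level ordinal score from 0 (normal) to 4 (no intake).
        for name in ("fluids", "diet", "supervision"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 0 <= value <= 4:
                raise ValueError(f"{name} must be an integer from 0 to 4, got {value!r}")

    @property
    def total(self) -> int:
        # The total is the plain sum of the three subscales: 0 (best) to 12 (worst).
        return self.fluids + self.diet + self.supervision
```

A rating such as `DSRSScore(fluids=4, diet=4, supervision=4)` yields the maximum total of 12, and any subscale value outside 0 to 4 is rejected.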

Table 1 Dysphagia Severity Rating Scale (DSRS).

As with the DOSS, which ranked independence levels according to the Functional Independence Measures (FIM) model and was linked to severity7, supervision on the DSRS was also divided into independence levels; however, the DSRS does not require a VFS to be performed. To date, the DSRS has been used in several published trials of PSD5,8,9,10. The DSRS is copyright free and open access for research use. The aim of the present study was to test and describe the validity of the DSRS in patients with recent stroke. Consensual, content, concurrent criterion, and predictive criterion validity, and internal consistency, inter- and intra-rater reliability, sensitivity to change, and minimal clinically important difference were each assessed. Additionally, the relationship with the FOIS, a validated dysphagia scale4, was examined.

Methods

Approvals, informed consent and ethical approval

This validation study of the DSRS used a mix of prospectively-collected data from completed clinical studies and newly collected prospective data; in each case, non-attributable anonymised data were analysed. The completed trials each had national ethics approvals and patients (or surrogates) had given written informed consent, which covered subsequent data analyses; an individual patient data meta-analysis has already been published using the three pilot trials. For survey data, the University of Nottingham Faculty of Medicine Research Ethics Committee assessed that a full review by the committee was not indicated as the requests were distributed via professional networks; participation in the surveys was voluntary and anonymous and all data collection was performed in accordance with relevant guidelines and regulations set out by the University. Clinical audit data were collected by members of the clinical team and did not need research ethics approval. The authors will share a subset of anonymised individual patient trial data with the international VISTA Collaboration11.

Validation

Multiple approaches were taken to validate the DSRS including determining consensual, content, concurrent criterion and predictive criterion validity12. Additionally, internal consistency, inter- and intra-rater reliability, sensitivity to change, and minimal clinically important difference (MCID) were determined.

Data sources

Validation assessments used data from all trials and unpublished studies that are currently known to have used the DSRS as an outcome. These included raw data from published trials of pharyngeal electrical stimulation (PES)5,8,9,10, an individual patient data meta-analysis of the first three of these PES trials13, and unpublished studies (NCT03499574, NCT03700853). All studies involved patients with acute and/or sub-acute stroke.

Consensual and content validity were assessed from an anonymous survey sent to 20 UK-based Speech and Language Therapist (SLT) experts with experience of working with adults with acquired dysphagia. Relevant additional information was provided to the respondents regarding the background and purpose of the scale, and what patient group it was designed for. Similarly, to establish the MCID, an anonymous survey was distributed to a number of UK professional networks of SLTs with experience of working with adults with acquired dysphagia.

Consensual validity

This is the validity of a test determined by its general acceptance in the community of users, or by the number of users who judge it to be valid. Consensual validity was assessed by asking respondents to rate 5 scenarios using the DSRS, as recently used in validation of the International Dysphagia Diet Standardisation Initiative (IDDSI) functional diet scale14. Scenarios required respondents to rate recommendations of full amounts of oral intake, minimal and consistent oral trials, liquid only diets and accompanying levels of supervision. Respondents were asked to provide additional comments at the end of the survey. Excellent or good agreement was considered acceptable.

Content validity

This refers to the extent that a test includes all aspects of its construct, including relevance and comprehensiveness. Relevance was assessed using the content validity index (CVI), an indicator of inter-rater agreement that asks experts to appraise how relevant items are;15,16 it is particularly appropriate to use on instruments that have scales with multiple items15,16. Experts considered the relevancy (score 1 for not relevant, to 4 for highly relevant) for each item on each sub-scale. Item (I-CVI) and scale (S-CVI) level indices were calculated according to Polit15,16. In parallel, the subscales were assessed for comprehensiveness and whether their wording was clear.
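
As a minimal sketch of how the item- and scale-level indices can be computed, the example below assumes the usual convention in which a rating of 3 or 4 on the four-point relevance scale counts as "relevant", the I-CVI is the proportion of experts giving such a rating, and the S-CVI/Ave is the mean of the I-CVIs. The threshold is an assumption that is not stated in the text above, and the function names are hypothetical.

```python
from typing import Mapping, Sequence


def item_cvi(ratings: Sequence[int]) -> float:
    """I-CVI: proportion of experts rating the item as relevant.

    Assumption: ratings of 3 or 4 (on the 1-4 relevance scale) count as relevant.
    """
    if not ratings:
        raise ValueError("at least one expert rating is required")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 4")
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


def scale_cvi_ave(items: Mapping[str, Sequence[int]]) -> float:
    """S-CVI/Ave: mean of the I-CVI values across all items of a subscale."""
    if not items:
        raise ValueError("at least one item is required")
    i_cvis = [item_cvi(ratings) for ratings in items.values()]
    return sum(i_cvis) / len(i_cvis)
```

Each key in `items` would be one level of a subscale, and each value the list of relevance ratings given by the experts for that level.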

Concurrent criterion validity

This demonstrates how well DSRS correlates with other stroke-related clinical and radiological measures taken at the same timepoint. These included radiological aspiration (penetration aspiration scale, PAS by VFS3), swallowing (Toronto bedside swallowing screening test [TOR-BSST]17, using the sum of the 14 components rather than just the dichotomous pass/fail score; and FOIS4), neurological impairment (National Institutes of Health stroke scale, NIHSS), disability (Barthel index, BI18), dependency (modified Rankin scale, mRS18), and quality of life (EuroQoL 5-dimension 3-level, EQ-5D-3L; EuroQoL visual analogue scale, EQ-VAS19). Associations were assessed at all available timepoints, typically at baseline, and on and after treatment, using Spearman’s correlation coefficient.

Predictive criterion validity

This demonstrates how well DSRS at baseline correlates with the stroke-related clinical and radiological measures assessed at a later timepoint; the measures are those as identified immediately above for concurrent criterion validity, and analysed using Spearman’s correlation coefficient.
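
For readers unfamiliar with the statistic, Spearman’s coefficient is the Pearson correlation of the ranked values, with tied observations given their average rank. The sketch below is a plain standard-library illustration of that definition, not the software used in the study; the function names are hypothetical. It also makes explicit that the coefficient is undefined when one variable is constant, for example when every participant has the same score.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        # No variation in one variable: the correlation cannot be computed.
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)
```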

Internal consistency

This assesses how well the components of the scale relate to each other and is a measure of scale reliability. The interrelation between scores from the three subscales was assessed using Cronbach’s alpha20. Data sources were the STEPS, Vasant and PHAST-TRAC trials, and anonymised clinical audit data from a stroke ward as determined by a Speech and Language Therapist (JB) and Research Practitioner (AH).
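
Cronbach’s alpha for k subscales is k/(k − 1) multiplied by one minus the ratio of the summed subscale variances to the variance of the total score. A minimal sketch of that textbook formula follows; it is an illustration rather than the study’s analysis code, and the function name and input layout (one list of scores per subscale, patients in the same order) are assumptions.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(subscale_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha from one score list per subscale (same patient order)."""
    k = len(subscale_scores)
    if k < 2:
        raise ValueError("at least two subscales are required")
    n = len(subscale_scores[0])
    if n < 2 or any(len(scores) != n for scores in subscale_scores):
        raise ValueError("each subscale needs the same number (>= 2) of patients")
    # Total score per patient, e.g. fluids + diet + supervision for the DSRS.
    totals = [sum(patient) for patient in zip(*subscale_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are identical")
    summed_item_variance = sum(variance(scores) for scores in subscale_scores)
    return (k / (k - 1)) * (1 - summed_item_variance / total_variance)
```

Note that alpha cannot be computed when every patient has the same total score, because the variance of the total is then zero.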

Inter/intra-rater reliability

These are the degree of agreement among raters, and among repeated measurements by one rater, respectively. Inter-rater and intra-rater reliability were assessed by JB and AH using the same audit data as used for internal consistency. Both measures of reliability were quantified using the intra-class correlation (ICC).
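
The form of ICC used is not specified here. Purely as an illustration, the sketch below computes one common choice, the two-way random-effects, absolute-agreement, single-rating coefficient (often written ICC(2,1)) from the standard analysis-of-variance mean squares; the choice of model, the function name and the input layout are all assumptions.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): one row per patient, one column per rater (or rating occasion)."""
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two patients are required")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("each patient needs the same number (>= 2) of ratings")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for patients (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when there is no variation in the ratings")
    return (ms_rows - ms_error) / denominator
```

For inter-rater reliability the columns would hold the two raters’ scores for each patient; for intra-rater reliability they would hold one rater’s scores on the two occasions.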

Sensitivity to change

This is also known as responsiveness21 and refers to how well an instrument identifies longitudinal changes, in a proportionate manner16. Changes in the DSRS during the rehabilitation phase after stroke, i.e., from study baseline to final follow-up, were assessed using data from the STEPS trial.

Minimal clinically important difference (MCID)

The MCID is the minimum difference in a score that is considered valuable and changes patient management22. MCID was assessed in three different ways through assessment of statistical distribution (both half standard deviation and standard error of mean), anchor, and consensus through a survey23,24,25. Data for analysis of statistical distribution and anchor methods came from the STEPS trial and an individual patient data meta-analysis of three pilot trials of PES9,13. The survey involved UK-based SLTs. The survey was sent to a number of professional networks and it was up to the discretion of the network administrators whether the survey was forwarded.
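
The two distribution-based estimates named above are simple functions of the standard deviation. The sketch below takes "standard error of mean" literally as SD divided by the square root of the sample size, which is an assumption about how it was calculated; the function names are hypothetical.

```python
from math import sqrt


def half_sd(sd: float) -> float:
    """Distribution-based MCID estimate: half of the standard deviation."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    return 0.5 * sd


def standard_error_of_mean(sd: float, n: int) -> float:
    """Distribution-based MCID estimate: SD / sqrt(n) (assumed definition)."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sd / sqrt(n)
```

For instance, with an SD of 3.8 and n = 154 (the STEPS baseline values given in the Figure 1 caption), these functions return 1.9 and roughly 0.31; whether these were the inputs behind Supplementary Table VI is not stated here.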

Relationship between DSRS and FOIS

The DSRS and FOIS measure overlapping aspects of clinical dysphagia although they have an opposite direction of severity. Their relationship and interconversion were determined through mapping equivalent levels and using data from studies that measured both in parallel. Where a range of values was estimated the median of these is given.

Statistical analyses

In addition to the specific analyses detailed above, standard approaches were used to present results as number (%), median [interquartile range, IQR] or mean (standard deviation, SD).

Results

Trial individual patient data

Four trials of pharyngeal electrical stimulation after stroke have been performed where DSRS was recorded: Jayasekeran, Vasant, STEPS and PHAST-TRAC5,8,9,10. Data on DSRS and other clinical and radiological measures were available at baseline and variously at days 2, 14, 30 and 90. The clinical characteristics of patients by baseline DSRS are shown for these studies (Supplementary Table I; for all supplementary tables, please see the online resource). The mean age was 71 (SD 12) years with 109 (38%) female, mean onset to randomisation of 21 (SD 17) days; the most common clinical syndrome was partial-anterior circulation, 92 (43%) and just 3 (1%) patients had a posterior syndrome; 211 (85%) participants had an ischaemic stroke and 38 (15%) an intracerebral haemorrhage. A ceiling effect was noted at baseline with 139 (48%) patients having a maximum DSRS score of 12. Increasing dysphagia impairment, assessed using the DSRS, was significantly associated with time from onset to randomisation, worse neurological deficit (NIHSS), stroke type, dependency (modified Rankin scale), disability (Barthel index), swallow screening (component score on TOR-BSST), radiological aspiration (PAS) and non-oral feeding state (Supplementary Table I).

Consensual validity

As Speech and Language Therapists (SLTs) are the primary clinicians who treat dysphagia in the UK, anonymous surveys were sent to 20 invited UK-based SLT clinicians. Between eight and ten respondents rated each scenario. Seventy percent of respondents had 10+ years’ experience. The areas of expertise of the respondents were: stroke (5), head and neck (2), dementia (1), other (2). Consensus was excellent (100%) for recommendations of full oral intake; moderate (78%) to low (56%) for minimal oral trials of liquids (e.g. 5 sips) and solids (e.g. 5 tsps.) respectively, and high (89%) and moderate (78%) for consistent oral trials of liquids (e.g. 100 ml) and solids (e.g. half portions of diet) respectively (Supplementary Table II). Consensus was excellent for scoring liquid-only fluids (100%) but not the accompanying diet component of this scenario (63%), which means this component, overall, had a moderate consensus. Supervision scores were high (80–100%) for full oral intake, high (89%) for minimal oral trials and moderate (67%) for consistent oral trials (Supplementary Table II). Respondents’ comments requested clarification on how to score consistent amounts of oral trials and liquid diets.

Content validity

Ten of the 20 invited UK-based SLTs responded to the anonymous survey. This is an acceptable number of expert views for undertaking content validation of an instrument15. All but two components of the DSRS sub-scales had “excellent” relevance (I-CVI > 0.90); “pudding consistency” was good and “selected textures” was fair (Table 2). At a scale level, both the fluid and food scale achieved an S-CVI/Ave rating of 0.84 (good) and the supervision scale a rating of 0.96 (excellent). Expert feedback regarding wording and comprehensiveness is given in Supplementary Table III; many of these comments related to the lack of mention of IDDSI26 in the DSRS definitions, a point we address in the Discussion.

Table 2 Content validity of DSRS sub-scales assessed by 10 UK speech and language therapists.

Concurrent criterion validity

Data were available for all four trials5,8,9,10. In the largest (STEPS), DSRS at baseline and weeks 2 and 13 was associated significantly and in appropriate directions with measures, at the same time points, of aspiration (PAS using VFS), swallowing (TOR-BSST), disability (Barthel index) and dependency (modified Rankin scale) (Table 3). At 2 weeks post randomisation, DSRS was also associated with impairment (NIHSS). DSRS was not related to quality of life measures at 13 weeks post randomisation. The three sub-scale components of the DSRS (fluids, diet and supervision) were also each associated significantly with aspiration at all three time points. Similar magnitudes of associations were seen in the smaller studies of Jayasekeran5 and Vasant8 (Table 3) although associations did not always reach significance in these studies8. Overall, associations were stronger between DSRS and measures of swallowing and aspiration than with global measures of impairment (NIHSS), disability (BI) and dependency (mRS).

Table 3 Concurrent criterion validity - Relationships between DSRS and clinical and radiological assessments at a variety of timepoints in trials of pharyngeal electrical stimulation. (Spearman’s rank correlation coefficient).

DSRS was strongly negatively correlated with FOIS at day 2 and week 13 in the PHAST-TRAC trial (Table 3); the association could not be performed at baseline since all participants had a DSRS of 12/FOIS of 1 as part of the trial’s inclusion criteria27.

Predictive criterion validity

Using data from the STEPS trial, baseline DSRS was associated with radiological aspiration (VFS PAS) at 2 and 13 weeks; and swallowing (TOR-BSST), disability (BI) and dependency (mRS) at 2 weeks (Table 4). There was no association with impairment (NIHSS), or quality of life (EQ-5D-3L, EQ-VAS). The three DSRS sub-scale components (fluids, diet and supervision) at baseline were also each associated significantly with radiological aspiration at 2 and 13 weeks.

Table 4 Predictive criterion validity - Relationships between DSRS at baseline with clinical and radiological assessments on or after treatment in trials of pharyngeal electrical stimulation. (Spearman’s rank correlation coefficient).

Associations between baseline DSRS and post-treatment measures in the trials of Jayasekeran and Vasant were not statistically significant. It was not possible to assess the relationship between baseline DSRS and post-treatment FOIS in the PHAST-TRAC trial since, as required by the trial’s inclusion criteria27, all participants had a baseline DSRS score of 12.

Internal consistency

The interrelation between the scores from the three subscales, at various timepoints, was assessed using Cronbach’s alpha20 with data from the STEPS, Vasant and PHAST-TRAC trials8,9,10. Internal consistency was “Good” at baseline, varied between “Good” and “Excellent” over the first two weeks, and “Excellent” at 12 weeks (Supplementary Table IV). Similarly, audit of clinical data by JB and AH revealed “Excellent” consistency between the subscales (Supplementary Table V).

Inter/intra-rater reliability

DSRS was scored in 31–58 hospitalised stroke patients by JB and AH. The inter-rater reliability was “Excellent” for DSRS with intra-class correlation (ICC) = 0.955 (95% confidence interval 0.925, 0.973); similarly, the intra-rater reliability was “Excellent” (Table 5). Assessments within the subscales were mostly excellent with one good and one moderate result.

Table 5 Intra- and inter-rater reliability for DSRS and subscales assessed using the intra-class correlation. Each rater scored data on two occasions separated by a month.

Sensitivity to change

DSRS scores were sensitive to spontaneous recovery for patients with acute/subacute PSD, declining during follow-up in STEPS with modal values of 12, 3 and 0 at weeks 0 (baseline), 2 and 13 respectively (Fig. 1). Similarly, the median (7, 4, 1) and mean (7.6, 4.9, 2.7) values declined at the same timepoints. As with VFS-PAS, DSRS was sensitive to treatment with pharyngeal electrical stimulation in a meta-analysis of three pilot trials being 1.7 points lower (p = 0.040) in the PES group as compared with the control group13. In contrast, the STEPS trial was neutral for the effect of PES on VFS-PAS and there was no difference in DSRS scores between treatment groups9.

Figure 1

Histograms of distributions of Dysphagia Severity Rating Scale from STEPS trial. At baseline (n = 154), mean 7.6 (3.8), median 7.08, mode 12; at week 2 (n = 131), mean 4.9 (3.7), median 4.05, mode 3; at week 13 (n = 106) mean 2.7 (3.9), median 1.03, mode 0.

Minimal clinically important difference (MCID)

The survey was based on 84 responses from UK-based SLTs, the majority of whom had more than 10 years’ experience. It was not possible to estimate the number that received the survey; therefore, the response rate could not be calculated. MCID varied between 0.3 and 2.5 with all three approaches - statistical, anchor and survey - identifying an MCID of 1.0 as being important (Supplementary Table VI).

Relationship between DSRS and FOIS

FOIS could be extrapolated from DSRS scores (Supplementary Table VII); however, some combinations of DSRS subscale scores are incongruent from a clinical perspective (e.g. use of thickened fluids when taking a normal diet) and so these have no equivalent FOIS value. Conversely, DSRS could be estimated from FOIS although in most cases it was not possible to determine subscale results since the subscales of supervision and fluids above level 3 are not scored on the FOIS (Supplementary Table VIII).

The PHAST-TRAC trial recorded both DSRS and FOIS at multiple post-randomisation timepoints (days 2, 4, 6, 8, 10, 30 and 90)10. The frequency of paired scorings is shown in Supplementary Table IX. The inverse nature of DSRS and FOIS is noted and percentages match the estimated equivalents in Supplementary Tables VII and VIII.

Discussion

This comprehensive assessment of the DSRS suggests that it is a valid tool for grading dysphagia severity (based on oral intake and supervision requirements) in patients with post-stroke dysphagia. Using data from four randomised controlled trials and two surveys, the DSRS was found to exhibit consensual validity, content validity, concurrent criterion validity, predictive criterion validity and internal consistency. Once operationalisation of scoring for certain feeding scenarios was undertaken, inter- and intra-rater reliability were “excellent” when used in a clinical audit, and the minimal clinically important difference approximated to 1 unit irrespective of the method of estimation. The DSRS was sensitive to change during the natural resolution of dysphagia seen through the sub-acute and rehabilitation phases after stroke, and in response to treatment with pharyngeal electrical stimulation in some trials. The intrinsic relationship between DSRS and FOIS allowed these two dysphagia scales to be mapped to each other.

The main strength of this study is the large number and variety of detailed validations performed; we also provide data on the minimal clinically important difference and a means for interconverting the DSRS and FOIS. Second, much data came from two phase III trials (STEPS, PHAST-TRAC) rather than just a number of smaller studies. Third, patients with a range of post-stroke severity were included in these validations, with mild-to-moderate patients coming from three trials5,8,9 and more severe ones from one10. Fourth, a large amount of clinical and radiological outcome data were available. This showed that overall, the DSRS was highly correlated with another clinical measure of dysphagia severity (FOIS). Measures of aspiration (VFS-PAS) and swallowing (TOR-BSST) were more strongly correlated with the DSRS than were global measures of impairment, disability, and dependency (although these still showed some significant correlations) and (perhaps surprisingly) the DSRS was not correlated with a generic health status measure of quality of life.

There are a number of caveats to the study. First, although all trial protocols gave some guidance on how to use the DSRS, it was not the primary outcome measure in any study and was largely done according to local practice. Hence, the DSRS scores, whilst prospectively collected, are potentially less accurate than could be achieved with formal training and this was reflected in the consensual validity exercise and respondents’ accompanying comments. In particular, there was less consensus for scoring patients on oral trials and liquid diets, as noted previously14. There was also less consensus on assigning supervision scores for patients on consistent amounts of oral trials, i.e. respondents found it easier to score supervision for patients either on full oral intake or limited trials. It is important that raters routinely using the DSRS clearly specify supervision level when making recommendations following the clinical bedside assessment. In the updated version of the DSRS (in Table 6), we provide rules for scoring supervision, including assigning diet, fluid and supervision scores for oral trials.

Table 6 Updated Dysphagia Severity Rating Scale incorporating International Dysphagia Diet Standardisation Initiative (IDDSI) levels25.

Second, the DSRS was devised and first used in 2010 and so antedates the 2017 IDDSI scale for determining levels of fluid thickness and modified food textures26. Further, DSRS measures different domains from IDDSI. Nevertheless, when asked about wording and comprehensiveness, some respondents in our assessment of content validity commented that the DSRS does not contain IDDSI terminology. Going forward, we have proposed a redefinition of the DSRS to reflect IDDSI descriptors (Table 6) and plan to validate this updated scale in due course. Third, although the associations between DSRS and other radiological and clinical measures in the trials of Jayasekeran and Vasant5,8 were similar in magnitude to those seen in the STEPS trial, most were statistically non-significant due to their much smaller sample size and so reduced statistical power. This emphasises the importance of having large data sets when performing validation studies of clinical scales. Last, the distribution of DSRS will depend on the population of patients being studied and timing after stroke, and ceiling and floor effects are present at different times after stroke; for example, in STEPS, one third of participants had a maximum score of 12 at baseline (reflecting the trial’s inclusion criteria) and a minimum score of zero 13 weeks later after natural resolution of dysphagia; this situation is analogous to that of other scales used in stroke, e.g. the Barthel Index28.

In summary, this study has shown that the 12-level DSRS is robust in terms of consensual, content, concurrent criterion and predictive criterion validity. Further, it shows “good-to-excellent” internal consistency, “excellent” inter- and intra-rater reliability, is sensitive to natural and therapeutic change, and has a minimal clinically important difference of 1 point. However, distribution of scores will depend on patient population and time post-onset. Specific guidance for accurate use of the DSRS is provided in the updated version, which includes incorporation of the new IDDSI descriptors. Overall, our results suggest the DSRS is a valid tool for grading the severity of dysphagia in stroke; its ease of use makes it relevant for use in clinical service delivery and clinical trials to define baseline dysphagia severity and assess the effect of natural history or therapeutic change.