Introduction

Obstructive sleep apnea-hypopnea syndrome (OSAS), manifesting with repeated interruptions of ventilation during sleep in the form of apneas or hypopneas, is a prevalent sleep disorder which remains under-diagnosed and, consequently, under-treated. Its reported prevalence varies widely, from 2% in women and 5% in men to almost 61% in women and 84% in men1,2,3. Several population-based studies have reported that OSAS increases the risk of hypertension, vascular diseases, as well as metabolic and endocrine disorders, underscoring the need for timely diagnosis and treatment4,5,6,7. One of the endotypes recognized among OSAS patients is positional disease, characterized by disordered breathing occurring mainly in the supine sleeping position8,9. Therefore, diagnosing positional OSAS may have some clinical merit, as avoiding sleep in the supine position may be a simple measure to treat this disorder10,11.

Polysomnography (PSG) is the diagnostic gold standard; however, it is not widely available. Therefore, it seems useful to assess the individual risk of OSAS, e.g. by dividing patients into low and high risk groups, to prioritize its use. Accordingly, numerous papers have been published so far on the diagnostic value of single signs and symptoms or their combinations grouped in questionnaires12,13,14,15. One of them is the STOP-BANG questionnaire (SBQ), based on a set of four symptoms and four signs. It was originally developed to stratify post-operative complication risk related to OSAS in a group of adult patients before a major surgery16,17. A recent meta-analysis showed that SBQ is a superior tool for detecting mild, moderate, and severe OSAS compared to other questionnaires, but its clinical merit was affected by the population on which it was tested18. In comparison to a general population, patients referred to a sleep clinic are typically more symptomatic, with a higher prevalence of OSAS, which may affect the validity of SBQ18. Indeed, some investigators reported limited utility of SBQ in this group of patients19,20,21. We have recently reported that a single variable, BMI, can change the pre-test probability of OSAS in a population of patients referred to a sleep center due to a presumptive diagnosis of OSAS. Because of the high sensitivity of BMI in this group of patients, especially observed in the lateral sleeping position (LP), it was possible to exclude with high probability (98%) moderate or severe OSAS (apnea-hypopnea index, AHI ≥ 15) in LP, identifying a group of patients who could be qualified for positional treatment even before making a PSG-based diagnosis22. The exclusion of at least moderate OSAS in LP based on normal BMI may be of some clinical value, as a PSG-based diagnosis is usually delayed due to limited resources.
As BMI is only one of the eight clinical variables used in SBQ, we wondered whether the latter could make a more substantial change in the probability of OSAS occurrence, possibly not restricted to sleep in the lateral position. Thus, we hypothesized that SBQ, because of its potentially high sensitivity, could have a high negative predictive value in identifying a subgroup of patients with low OSAS probability. Moreover, as male gender adds one point to the SBQ score, we wondered whether sex could differentiate SBQ performance. To the best of our knowledge, SBQ has not yet been used to assess the probability of OSAS in LP, or separately by sex, in a population referred to a sleep center due to a presumptive diagnosis of OSAS.

Methods

We performed a retrospective study in the sleep center of the Department of Sleep Medicine and Metabolic Disorders, Medical University of Lodz, Poland. The database included 1,171 patients who underwent polysomnography between 2013 and 2015. All the patients were referred due to a presumptive diagnosis of OSAS, based on typical, not mutually exclusive complaints, including: snoring (78.6%), apneas (72.3%), sleepiness (50.6%) or unrefreshing sleep (32.6%). The SBQ scoring was based on the symptoms and signs that were part of our standard patient chart, filled in at the first ambulatory visit. After exclusion of 48 patients, 1,123 patients (847 males) remained eligible for analysis. Exclusion criteria were the following: sleep time shorter than 3 hours (n = 36) or absence of at least one of the SBQ items (n = 12). For the analysis in the lateral sleeping position, only 977 patients were eligible, because 146 patients slept less than 0.5 h in LP, which was arbitrarily considered too short to obtain a valid AHI for this position. All patients gave their informed consent to the sleep study. The Ethics Committee of Medical University of Lodz approved the study protocol (RNN/23/15/KE).

Polysomnography

Patients were admitted to the sleep center at 21:00 hours (±0.5 hour) and underwent a general physical examination, which involved measuring body weight and height for the purpose of BMI calculation. They were ready to fall asleep between 22:00 and 23:00 hours, as approximately 0.5–1.0 hour was needed to place sensors and electrodes. A standard nocturnal PSG was performed by recording the following channels: electroencephalography (C4\A1, C3\A2), chin muscles and anterior tibialis electromyography, electrooculography, measurements of oro-nasal air flow (a thermistor gauge), snoring, body position (a gravitational gauge placed on the sternum), respiratory movements of the chest and abdomen (piezoelectric strain gauges), unipolar electrocardiogram and hemoglobin oxygen saturation (SaO2) (Sleep Lab, Jaeger – Viasys, Hoechberg, Germany). Sleep stages were scored according to standard criteria, based on 30 s epochs23. Apnea was defined as a reduction of air flow to less than 10% of the baseline for at least 10 seconds. Hypopnea was defined as at least a 30% reduction of air flow accompanied by a 4% or greater decrease in SaO2 or an arousal24. All methods were carried out in accordance with relevant guidelines and regulations; electroencephalography arousals were scored according to American Academy of Sleep Medicine guidelines25.

Study variables

Sleep study evaluation provided variables related to sleep disordered breathing, i.e. a standard AHI calculated for total sleep time (AHI-total) and separately, for sleep in the lateral position (AHI-side). Relevant PSG and clinical variables are presented in Table 1.

Table 1 Summary of clinical and sleep study variables.

STOP BANG questionnaire

SBQ is a tool that combines evaluation of eight items, presumably risk factors for OSAS. It includes a subjective 4-item questionnaire (STOP) and a 4-item clinical assessment (BANG)16. The STOP part includes four yes/no questions: S - “Do you Snore loudly (louder than talking or loud enough to be heard through closed doors)?”, T - “Do you often feel Tired, fatigued, or sleepy during daytime?”, O - “Has anyone Observed you stop breathing during your sleep?” and P - “Do you have or are you being treated for high blood Pressure?”. The BANG part of the questionnaire consists of four additional demographic queries: body mass index (BMI) over 35 kg/m2 (B), Age over 50 years (A), Neck circumference over 40 cm (N), and male Gender (G). For each question, a “yes” response scores 1 and a “no” response scores 0; thus, the total score ranges from 1 to 8 in men (male gender alone contributes one point) and from 0 to 7 in women. We applied the standard cut-off of 3 points as a positive test result. Moreover, we also evaluated all score levels of SBQ as cut-offs for their ability to change the pre-test probability and to find the score with either the highest positive predictive value (PPV) or negative predictive value (NPV) in order to verify OSAS diagnosis. We did it for a standard AHI, calculated for the whole sleep (AHI-total), and separately for AHI calculated for sleep in the lateral sleeping position (AHI-side) (Table 2). As male sex is one of the OSAS risk factors and SBQ items, by definition women scored one point less than men. Therefore, we made a sub-analysis separately for males and females to verify whether this systematic bias influenced the analysis of sensitivity, specificity and predictive values of SBQ in OSAS diagnosis (Tables 3 and 4).
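The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `SBQAnswers`, its field names and the helper functions are hypothetical and are not part of the study's software; the thresholds are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class SBQAnswers:
    """Hypothetical container for the eight STOP-BANG items."""
    snores_loudly: bool        # S
    tired: bool                # T
    observed_apnea: bool       # O
    high_blood_pressure: bool  # P
    bmi: float                 # B, kg/m2
    age: float                 # A, years
    neck_cm: float             # N, neck circumference in cm
    is_male: bool              # G


def sbq_score(a: SBQAnswers) -> int:
    """Each item answered 'yes' (or exceeding its threshold) adds one point."""
    items = [
        a.snores_loudly,
        a.tired,
        a.observed_apnea,
        a.high_blood_pressure,
        a.bmi > 35,
        a.age > 50,
        a.neck_cm > 40,
        a.is_male,
    ]
    return sum(items)


def sbq_positive(a: SBQAnswers, cutoff: int = 3) -> bool:
    """Positive test result at a given cut-off (3 is the standard one)."""
    return sbq_score(a) >= cutoff


# Illustrative patient (invented values, not study data)
patient = SBQAnswers(True, False, True, False, bmi=31.0, age=55,
                     neck_cm=42.0, is_male=True)
print(sbq_score(patient), sbq_positive(patient))  # 5 True
```

Note that a woman with identical symptoms and measurements would score one point less, which is the sex-related shift examined in the sub-analysis.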

Table 2 Sensitivity and specificity with corresponding PPV, NPV, 95% CI and Youden index for SBQ with different cut-off values from 1 to 8 for TST > 3.0 hours, N = 1135; TST > 3.0 and TST(side) > 0.5 hours, N = 977.
Table 3 Sensitivity and specificity with corresponding PPV, NPV, 95% CI and Youden index for females with SBQ with different cut-off values from 1 to 7 for TST > 3.0 hours, N = 276; TST > 3.0 and TST(side) > 0.5 hours, N = 233.
Table 4 Sensitivity and specificity with corresponding PPV, NPV, 95% CI and Youden index for males with SBQ with different cut-off values from 1 to 8 for TST > 3.0 hours, N = 847; TST > 3.0 and TST(side) > 0.5 hours, N = 744.

Statistics

The data were analyzed with Statistica 12 (TIBCO Software Inc, Palo Alto, California, United States) with a medical pack. Data distribution was tested with the Shapiro-Wilk test. The predictive values of the SBQ score for AHI ≥ 5 and ≥15 events/h were calculated separately for total sleep (TS) and sleep in the lateral position (LP). Moreover, the same analysis was conducted separately for males and females. Sensitivity, specificity, PPV, and NPV with 95% confidence intervals were calculated by creating 2 × 2 contingency tables; the chi-square test with Yates correction was used to compare the corresponding values calculated for TS and LP. A value of P < 0.05 was considered significant.
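For readers who wish to reproduce this type of calculation, the quantities derived from a 2 × 2 contingency table can be sketched as follows. This is a minimal illustration and not the Statistica procedure used in the study; the function names and the example counts are hypothetical, and the p-value uses the closed form of the chi-square distribution with one degree of freedom.

```python
import math


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard indices from a 2 x 2 table (test result vs. PSG outcome)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden": sens + spec - 1,
    }


def chi2_yates(a: int, b: int, c: int, d: int) -> tuple:
    """Chi-square with Yates continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    # The correction is not allowed to push the difference below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts, for illustration only (not study data)
m = diagnostic_metrics(tp=90, fp=40, fn=10, tn=60)
chi2, p = chi2_yates(90, 10, 40, 60)
print(m, chi2, p)
```

The Youden index (sensitivity + specificity − 1) is the quantity used in Tables 2 to 4 to indicate the cut-off with the best overall balance.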

Compliance with Ethical Standards

The study was funded by an institutional grant of the Medical University of Lodz 503/0-079-06/503-01. Informed consent was obtained from all individual participants included in the study. The Ethics Committee of Medical University of Lodz approved the study protocol (RNN/23/15/KE). All methods were carried out in accordance with relevant guidelines and regulations; electroencephalography arousals were scored according to American Academy of Sleep Medicine guidelines.

Results

Overall, the prevalence of OSAS in patients referred to a sleep center, defined as AHI ≥ 5 accompanied by typical symptoms, was high (76.6%), while AHI ≥ 15, compatible with a diagnosis of at least moderate OSAS, was observed in 51.6% of subjects. Substantially fewer patients had AHI ≥ 5 in LP, namely 43.2%, and only 28.9% of the cohort presented with AHI ≥ 15 in this sleeping position. Detailed clinical and sleep related variables, also in relation to gender, are presented in Table 1.

Total sleep time analysis

The standard SBQ cut-off of 3 was very sensitive (0.969) but not specific (0.167); therefore, there was a small change in the probability of OSAS diagnosis – PPV for SBQ ≥ 3 was 0.792, while NPV reached 0.620, which was a substantial change from pre-test 0.234 (p < 0.001), but too low to definitely exclude diagnosis. Moreover, only 71 subjects out of 1,123 scored less than 3, which further compromised clinical usefulness of SBQ at this level. Only 29 patients scored 8 and one had AHI < 5 (a false positive); thus, at this level, SBQ was 0.996 specific with PPV of 0.966. The highest Youden index was calculated for SBQ ≥ 5, namely 0.319 which yielded 0.863 and 0.374 PPV and NPV, respectively. Similar results were obtained for at least moderate OSAS (AHI ≥ 15). Three or more points score revealed very high sensitivity (0.983) with low specificity (0.112), which yielded PPV of 0.542 and NPV of 0.859, a substantial change in the probability from 0.484, but still too low to rebut the diagnosis of at least moderate OSAS (p < 0.001). The highest PPV of 0.931 was calculated for SBQ of 8 (n = 29), while the highest Youden index of 0.294 – for SBQ ≥ 5. Results of the Bayesian analysis at different SBQ levels vs. OSAS severity for TS are summarized in Table 2.
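The relation between the reported sensitivity, specificity, prevalence and the resulting predictive values follows from Bayes' theorem and can be checked with a short sketch. The function name is hypothetical; the input values are the rounded figures quoted above, so the output agrees with the reported PPV and NPV only to within rounding.

```python
def predictive_values(sens: float, spec: float, prev: float) -> tuple:
    """PPV and NPV from sensitivity, specificity and pre-test probability."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv


# SBQ >= 3, total sleep time, values as reported in the text
ppv5, npv5 = predictive_values(sens=0.969, spec=0.167, prev=0.766)    # AHI >= 5
ppv15, npv15 = predictive_values(sens=0.983, spec=0.112, prev=0.516)  # AHI >= 15
print(round(ppv5, 3), round(npv5, 3), round(ppv15, 3), round(npv15, 3))
```

With a pre-test probability of 0.766, a test of such low specificity can raise the probability of disease only marginally (to about 0.79), which is the point made in the paragraph above.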

Lateral sleeping position time analysis

In the group of 977 patients eligible for an analysis in the lateral sleeping position, 422 had AHI ≥ 5, which yielded a pre-test probability of 0.432. Similarly to the TS analysis, at the standard cut-off of 3, SBQ was sensitive (0.995) but not specific (0.052) for OSAS diagnosis at AHI ≥ 5; thus, in the same manner as for TS, it affected probabilities: PPV of 0.44 was not different from the pre-test probability, while NPV reached 0.935, a substantial rise from the pre-test value of 0.568 (p < 0.001). Nevertheless, akin to the TS analysis, the significance of this finding was diminished by the fact that only 31 out of 977 analyzed patients scored less than 3. From the SBQ score of 3 upwards, PPV rose with each level to reach the highest value of 0.79 for SBQ of 8. So, unlike for TS, SBQ did not reach high specificity and, as a result, high PPV; only 19 patients scored 8 but 4 of them were false positives, i.e. they had AHI < 5 in this sleeping position. The highest Youden index was observed at SBQ ≥ 6 (0.325) with PPV and NPV of 0.626 and 0.703, respectively. The pre-test probability of not having at least moderate OSAS (AHI ≥ 15) in LP was 0.711 (i.e. 1 – prevalence). All 31 patients that scored less than 3 were true negatives, which yielded NPV of 1.0 (p < 0.001). Interestingly, 105 subjects scored less than 4 (about 11% of 977) and only 6 were false negatives, which yielded NPV of 0.943 (p < 0.001). Similarly to AHI ≥ 5 in LP, the highest Youden index was calculated at SBQ ≥ 6 with PPV and NPV of 0.479 and 0.844, respectively, and PPV at SBQ = 8 was below 0.8. The results of the Bayesian analysis at different SBQ levels vs. OSAS severity for sleep in LP are summarized in Table 2.
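The effect of the small number of low-scoring patients on the certainty of these NPV estimates can be illustrated with a confidence interval for a proportion. The sketch below uses the Wilson score interval, which is an assumption made here for illustration: the text does not state which method was used for the 95% CI in the tables. The counts are those quoted above (31 of 31 and 99 of 105 true negatives for AHI ≥ 15 in LP).

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# NPV for AHI >= 15 in LP, counts taken from the text
lo3, hi3 = wilson_interval(31, 31)    # SBQ < 3: all 31 were true negatives
lo4, hi4 = wilson_interval(99, 105)   # SBQ < 4: 6 false negatives out of 105
print((round(lo3, 3), round(hi3, 3)), (round(lo4, 3), round(hi4, 3)))
```

Under this assumption, an observed NPV of 1.0 based on 31 patients remains compatible with a true value somewhat below 0.9, which is in line with the caution expressed about findings based on small subgroups.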

Total sleep time analysis in relation to gender

As could be expected, the prevalence of OSAS (AHI ≥ 5) and of at least moderate OSAS (AHI ≥ 15) was lower in females. It reached 64.9 and 38.8% (n = 276) vs. 80.4 and 55.8% (n = 847) in males, respectively, p < 0.001. SBQ in females at the level ≥ 2, which corresponds to ≥3 in men, was of similarly high sensitivity but low specificity (Tables 3 and 4). Interestingly, 28 women (10.1%) scored ≥6 and they were all true positives for AHI ≥ 5, which yielded PPV of 1.0 and was similar to the value of 0.957 calculated for men at SBQ ≥ 7. Conversely, 15 women (5.4%) scored less than 2 and they were all true negatives for AHI ≥ 15, yielding NPV of 1.0, while the same value of NPV was obtained only for the 2 men (0.2%) that scored 1.

Lateral sleeping position time analysis in relation to sex

Akin to the TS analysis, the prevalence of OSAS (AHI ≥ 5) and of at least moderate OSAS (AHI ≥ 15) in LP was lower in females. It reached 34.8 and 18.5% (n = 233) vs. 45.8 and 32.1% (n = 744) in men, respectively, p = 0.004 and <0.001. SBQ < 3 in females for AHI ≥ 5 revealed high NPV of 0.957, i.e. only 1 patient out of the 23 that scored less than 3 had AHI ≥ 5 (a false negative), while the NPV for AHI ≥ 15 was 1.0 at this level of SBQ: all 23 patients were true negatives, out of 233 females eligible for the analysis. In males, at SBQ ≥ 3 for AHI ≥ 5, only 8 patients scored less and one was a false negative, which yielded NPV of 0.875; for AHI ≥ 15 at this level, NPV was 1.0, but the value of this finding was diminished by the fact that only 8 males out of 744 (1.1%) scored less than 3 in SBQ. PPV in males never exceeded 0.8, while in females for SBQ of 7, it was 1.0; however, only 4 women scored that high. All detailed results concerning the analysis related to gender are shown in Tables 3 and 4.

Discussion

Originally, SBQ was conceived as a screening tool for surgical patients to identify those with a substantial risk of complications, probably related to OSAS, in order to provide adequate post-surgery care16,26. The aim of our study was to evaluate SBQ as a tool for stratifying patients into groups with high and low risk of OSAS. Consequently, we wondered whether this screening tool can be of any value when applied to a different population, i.e. one that has already been screened by typical signs and symptoms and, in consequence, has a relatively high OSAS prevalence and thus a high pre-test probability.

The pre-test probability of OSAS, as defined by AHI ≥ 5 plus typical symptoms (e.g. daily somnolence, unrefreshing sleep), was substantially higher in our cohort (ca 77%) than that reported for a general population, i.e. from 5 to 60%, depending on diagnostic criteria and the studied population1,2,3. So, it seems quite improbable that SBQ would have such a high specificity as to yield a positive predictive value close to 1.0, thereby confirming the diagnosis without a need for PSG evaluation. Conversely, it can be expected that, as a high sensitivity screening tool, it can yield a high negative predictive value, identifying a subgroup of patients with very low OSAS probability that do not urgently need a PSG examination. And indeed, our analysis showed that SBQ can rule in or rule out the diagnosis but only at extreme values, which makes its clinical usefulness doubtful, due to the low number of subjects scoring so low or so high.
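The dependence of the post-test probability on the pre-test probability can be illustrated with likelihood ratios. The sketch below is only indicative: it uses the sensitivity and specificity reported in the Results for SBQ ≥ 3 and AHI ≥ 5 (0.969 and 0.167) and assumes, for the sake of illustration, that they would remain constant across populations, which may not hold in practice. The function names and the pre-test values of 0.05 and 0.50 are hypothetical.

```python
def likelihood_ratios(sens: float, spec: float) -> tuple:
    """Positive and negative likelihood ratios of a binary test."""
    return sens / (1 - spec), (1 - sens) / spec


def post_test_probability(pre: float, lr: float) -> float:
    """Convert the pre-test probability to odds, apply the LR, convert back."""
    post_odds = pre / (1 - pre) * lr
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(sens=0.969, spec=0.167)

# 0.766 is the prevalence in this cohort; 0.05 and 0.50 are illustrative.
for pre in (0.05, 0.50, 0.766):
    after_pos = post_test_probability(pre, lr_pos)  # probability of OSAS if SBQ >= 3
    after_neg = post_test_probability(pre, lr_neg)  # probability of OSAS if SBQ < 3
    print(f"pre={pre:.3f}  positive={after_pos:.3f}  negative={after_neg:.3f}")
```

A positive likelihood ratio close to 1 means that a positive SBQ result hardly changes the probability of disease at any pre-test level, whereas a negative result lowers it considerably but, at a pre-test probability of about 0.77, not enough to exclude the diagnosis.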

Our data are consistent with those of a recently published meta-analysis18, which reported sensitivity of 0.95 at SBQ ≥ 3 for AHI > 5. We observed sensitivity of 0.97, but due to low specificity, the resulting PPV was only 0.78, which did not change the pre-test probability of 0.76. Another study, reporting relatively high sensitivity at the cost of low specificity, was conducted on a group of patients with atrial fibrillation27. A very similar study regarding the type of population and the number of subjects was conducted by Farney et al.28. However, their results are different from ours, since more patients who completed the SBQ scored less than 3 (222 out of 1,426, i.e. 15.6%). This discrepancy may be related to a higher percentage of women in their cohort, namely 42%, who on average score at least 1 point less due to the SBQ construct. This was supported by our data, e.g. for the TS analysis, 55 females (19.9%) scored less than 3 vs. only 16 males (1.9%). Moreover, the authors included all patients that underwent PSG, not only those suspected of OSAS. Nevertheless, similarly to our results, NPV of SBQ < 3 was affected by a large number of false negatives: 134 (60%) at the AHI level of 5, and 63 (28%) at the AHI level of 15. Another study comparing the value of different questionnaires vs. polygraphy/polysomnography yielded similar results of SBQ sensitivity, specificity and related predictive values, taking into account the higher pre-test probability of OSAS (90%) and at least moderate OSAS (69%)29. In a recently published meta-analysis that summed up 11 studies conducted on sleep clinic populations, the pooled sensitivity and specificity were 0.90 and 0.49, and 0.94 and 0.34 for diagnosis of OSAS or at least moderate OSAS, respectively20. These data differ substantially from those originally published by Chung et al.16, who reported much higher specificity (60 and 53%, at AHI ≥ 5 and AHI ≥ 15, respectively) in a population of in-hospital, pre-surgical patients.
These discrepancies can be plausibly explained by differences in the investigated populations.

We also included AHI level of at least 15 in our analysis. It was done on purpose, as this level is generally accepted for commencement of CPAP treatment30,31. The majority of patients scored ≥3 and the change in probability in favor of diagnosis, or conversely, against it, was too low to change the decisions regarding the need for PSG examination. According to rules of the Bayesian analysis, any diagnostic test has the highest chance to influence the probability of diagnosis both ways if the pre-test probability is around 0.5. In our study group, this requirement was met for the analysis of OSAS in LP, where NPV was above 0.9 and 1.0 for AHI ≥ 5 and AHI ≥ 15, respectively at SBQ < 3. But in our previous report, similar results were revealed for normal BMI; BMI had high sensitivity >90% for AHI > 5 and AHI > 15 in LP and we concluded that normal BMI alone can be a useful diagnostic tool to exclude patients with AHI ≥ 15 in the lateral sleeping position32. As BMI is a part of SBQ, it was reasonable to expect that adding other clinical variables could make this tool more powerful at changing the disease probability. This was also a rationale for adding BANG part to STOP by Chung et al., who expected to increase sensitivity and NPV16. Nevertheless, in the total sleep analysis of our cohort, NPV at SBQ ≥ 3 did not reach probability >0.9 for AHI ≥ 5 or AHI ≥ 15, and only a small fraction of patients scored less than 3, which makes clinical usefulness of SBQ doubtful in this population. It fared better in the lateral sleeping position; NPV of SBQ ≥ 3 for AHI ≥ 15 in LP reached 1.0, but akin to TS analysis, only 31 (3%) patients scored less. When the level of SBQ was raised to ≥4, 105 patients (11%) scored 3 or less and at this level, NPV was 0.943. To compare, in our previous study, normal BMI alone (≤25 kg/m2) yielded similar NPV, i.e. 0.975 when applied to the same population. 
So it seems that in order to exclude OSAS diagnosis in LP, normal BMI is similar to or even better than the combination of symptoms and signs in SBQ. One explanation for the absence of a difference in favor of SBQ versus BMI alone may be the fact that the BMI level that scored positive in SBQ was arbitrarily set at over 35 kg/m2, substantially higher than the normal BMI level used in our previous study. Another issue with the SBQ construct is the sex-related bias, which moves women with similar symptoms, demography and history of hypertension one point down. This definitely influences the Bayesian analysis and, in consequence, must affect PPV and NPV at a given cut-off point. For example, high risk women would score a maximum of 7, which impedes specificity and the related PPV of the SBQ score <8. Conversely, at the level <3, a higher SBQ score in women could have contributed to lower sensitivity and, in effect, lower NPV. And indeed, interestingly, in our gender sub-analysis a higher percentage of women showed extremes of the SBQ score, i.e. ≥6 or <3, and at these levels PPV or NPV usually reached a value of 1.0, which was not noted in men. Therefore, SBQ may be a more reliable tool to exclude OSAS diagnosis in females, but this finding needs to be confirmed in a prospective study.

Our analysis for TS was similar to results of other studies, conducted on similar populations of patients, referred to a sleep lab with a presumptive OSAS diagnosis. Chung et al. suggested that SBQ does not provide sufficient predictive values for individual clinical decision-making, regarding the need or urgency for PSG and therapy32. Overall, despite high sensitivity, low specificity was responsible for low PPV. Conversely, NPV was quite high but at the cut-off level of 3 it was compromised by a low number of patients scoring less in this population. At the SBQ level of 4 or more, NPV was similar to normal BMI and at this level, the number of referred patients, scoring less than 4 or with normal BMI, was comparable, namely ca 10%. Alternatively, a recent meta-analysis proved that SBQ is a superior tool for detecting mild, moderate, and severe OSAS in comparison to other questionnaires, but overall, its clinical value seems limited, especially in the sleep lab setting18.

Advantages and limitations of this study

SBQ as a diagnostic tool is not sensitive or specific enough to yield high NPV or PPV values in a population with high prevalence of OSAS, due to a substantial number of false negatives and false positives. This quite accurately repeats the results of previous studies conducted on similar populations of patients, referred to a sleep lab28,29. Nevertheless, we expanded the study by providing the analysis of OSAS in the lateral sleeping position and sex and obtained interesting results concerning changes in OSAS probabilities.

One of the weaknesses of the present study is a selection bias, as our results apply to a preselected group, i.e. Caucasian patients with a high pre-test probability of disease, who are not really representative of the general adult population. However, it was our intention to test the utility of SBQ in a population referred to a sleep lab with a presumptive diagnosis of OSAS. Another weakness is the retrospective design of the study; our results need to be replicated in a prospective study.

In conclusion, it seems that patients referred to a sleep lab are pre-screened by signs and symptoms, which results in a population with high OSAS prevalence. SBQ applied to this selected population is not specific enough to significantly change the pre-test probability of disease at the most frequent SBQ score levels, from 3 to 5. Conversely, it has some value, as patients scoring less than 4, and especially women, demonstrate a very low probability of at least moderate OSAS in the lateral sleeping position. Positional treatment, although still a subject of an ongoing debate regarding the method, feasibility criteria, long-term efficacy and compliance, can be an option in patients with this OSAS endotype. In clinical practice, it means that positional treatment can be safely tried in this selected group with a low SBQ score while awaiting PSG assessment10,11.