Polysomnographic phenotyping of obstructive sleep apnea and its implications in mortality in Korea

Conventionally, the apnea–hypopnea index (AHI) is used to define and categorize the severity of obstructive sleep apnea. However, routine polysomnography (PSG) includes multiple parameters for assessing the severity of obstructive sleep apnea. The goal of this study was to identify and categorize obstructive sleep apnea phenotypes using unsupervised learning methods from routine PSG data. We identified four clusters from 4,603 patients using 29 PSG variables and arranged them according to their mean AHI: cluster 1, spontaneous arousal (mean AHI = 8.52/h); cluster 2, poor sleep and periodic limb movements (mean AHI = 12.16/h); cluster 3, hypopnea (mean AHI = 38.60/h); and cluster 4, hypoxia (mean AHI = 69.66/h). Conventional obstructive sleep apnea classification based on AHI severity showed no significant difference in cardiovascular or cerebrovascular mortality (log-rank P = 0.331), while the four clusters showed an overall significant difference (log-rank P = 0.009). The risk of cardiovascular or cerebrovascular mortality was significantly increased in cluster 2 (hazard ratio = 6.460, 95% confidence interval 1.734–24.073) and cluster 4 (hazard ratio = 4.844, 95% confidence interval 1.300–18.047) compared to cluster 1, which demonstrated the lowest mortality. After adjustment for age, sex, body mass index, and underlying medical condition, only cluster 4 showed a significantly increased risk of mortality compared to cluster 1 (hazard ratio = 7.580, 95% confidence interval 2.104–34.620). Phenotyping based on numerous PSG parameters gives additional information for patients' risk evaluation. Physicians should be aware of PSG features to further understand the pathophysiology and to personalize treatment.


Variable reduction analysis.
A two-step variable reduction analysis was conducted. Briefly, clinically relevant PSG variables were selected, followed by principal component analysis-based dimension reductions and K-means cluster analysis 18 .
PSG results consisted of 62 variables (Supplementary Table S1). Among them, 29 PSG variables were selected (Table 1). Three practicing sleep clinicians (JW Kim, IY Yoon, and SW Cho) at the Seoul National University Bundang Hospital reviewed the variables and selected those they judged clinically relevant. To identify phenotypes solely using PSG parameters, other measurements, such as age, sex, and anthropometric measurements, including body mass index (BMI) and neck circumference, were excluded. Moreover, sleep questionnaire scores, including the Epworth Sleepiness Scale (ESS) score and Pittsburgh Sleep Quality Index (PSQI) score, were not used as variables. Redundant variables (e.g., AHI in non-supine position) that could be easily derived from the other variables (AHI, supine AHI, and percent sleep time in supine position) were excluded. Because we analyzed patients whose total sleep time was more than 4 h, total sleep time and time in bed (the sum of sleep latency, time of wake after sleep onset, and total sleep time) were not considered. Instead, we selected sleep efficiency, which could represent these time variables. PSG variables were categorized into the domains of sleep architecture, breathing disturbance, desaturation, limb movement, and arousal, based on known mechanisms of cardiovascular consequences of OSA 4,19 , which is similar to a recent study by Zinchuk et al. 13 . Although most of the variables overlap with those from the aforementioned study, several different variables were selected. Durations of respiratory events were included in the analysis since these variables are known to differ among patients with similar AHI and to be associated with different cardiovascular outcomes 20,21 .
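The redundancy argument for non-supine AHI can be made concrete with a short sketch. It assumes that overall AHI is the sleep-time-weighted average of the supine and non-supine AHI, which follows from AHI being events per hour of sleep. The function and parameter names below are illustrative and do not come from the study.

```python
def non_supine_ahi(ahi, supine_ahi, pct_supine):
    """Recover non-supine AHI from overall AHI, supine AHI, and % sleep time supine.

    Uses the time-weighted identity
        ahi = f * supine_ahi + (1 - f) * non_supine_ahi,  with f = pct_supine / 100.
    """
    f = pct_supine / 100.0
    if not 0.0 <= f < 1.0:
        # With 100% supine sleep there is no non-supine time to compute a rate for.
        raise ValueError("pct_supine must be in [0, 100)")
    return (ahi - f * supine_ahi) / (1.0 - f)


# Hypothetical patient: overall AHI 20/h, supine AHI 30/h, 50% of sleep supine.
example = non_supine_ahi(ahi=20.0, supine_ahi=30.0, pct_supine=50.0)
```

Because the value is fully determined by three variables that are already in the model, including it would add no new information to the principal component analysis.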
The central apnea index was not included in variable reduction or cluster analysis, since among the variables in the apnea index category, the distribution of the central apnea index was the most skewed [skewness = 19.4, with D (4,603) = 0.417 and P < 0.001 by Kolmogorov-Smirnov test for normality], and therefore, was not suitable for standard principal component analysis 22 . AHI is a composite variable that can be easily calculated as the sum of the apnea index and hypopnea index. However, as AHI is the most commonly reported statistically significant predictor of cardiovascular mortality and all-cause mortality, AHI was included in our study. Finally, sleep position in OSA was also considered. Therefore, AHI in the supine position and the percentage of sleep time spent in the supine position were included.
After the variables for analysis had been selected, patients with missing values for any of these variables were excluded.
Scientific Reports | (2020) 10:13207 | https://doi.org/10.1038/s41598-020-70039-5 | www.nature.com/scientificreports/
Cluster analysis. Variables were standardized to a mean of zero and a standard deviation of one. Using these variables as input, principal component analysis was used to derive a set of low-dimensional components while preserving as much variance as possible. Components were selected so that they retained at least 75% of the total variance (the sum of the variances of all individual principal components) 23 (Supplementary Table S2). After application of varimax rotation to maximize item variance and improve interpretability, scores of the given principal components representing PSG features were acquired for each subject 24 . The number of clusters in this dataset was then estimated via the gap statistic 25 with 500 bootstraps. The suggested number of clusters was 4 (Supplementary Figure S1). Thereafter, K-means cluster analysis was performed using the factors. Cluster validation was performed by calculating the Dunn index and the average silhouette width 26 (Supplementary Table S3).
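Three of the steps described above (standardization, choosing the number of components from a variance threshold, and K-means) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the eigenvalue decomposition, varimax rotation, and gap statistic are omitted, the use of the sample standard deviation is an assumption, and all function names and example values are hypothetical.

```python
import math
import random


def standardize(rows):
    """Z-score each column to mean 0 and SD 1 (sample SD; an assumption)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def n_components_for(eigenvalues, threshold=0.75):
    """Smallest number of leading components whose variance share reaches threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; points is a list of equal-length lists of floats."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct observations
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new.append(centroids[j])  # keep an empty cluster's centroid
        if new == centroids:
            break
        centroids = new
    return labels, centroids
```

In the study, the input to K-means would be the rotated component scores (nine factors) rather than raw variables, and the number of clusters (four) came from the gap statistic rather than being chosen by hand.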

Mortality status.
Outcomes of different clusters were evaluated using mortality status. Deaths that occurred up to December 31, 2016, were identified by matching each subject's Korean Identification Number and name with death records provided by Statistics Korea (https://kostat.go.kr). The date and primary cause of death were collected from the national death statistics. Causes of death were classified based on the underlying causes described in the deceased's death certificate, as recommended by the World Health Organization 27 . Disease-specific causes of death in the following categories were evaluated: (1) cardiovascular or cerebrovascular diseases, such as stroke, acute myocardial infarction, acute ischemic heart disease, sudden cardiac arrest, and cardiac dysrhythmias; (2) cancer; (3) fatal events, including car accident injuries, falls, and suicide; and (4) others, including pneumonia, gastrointestinal bleeding, end-stage renal disease, amyloidosis, etc. 3 .
Statistical analysis. PSG and other quantitative clinical variables were described separately for each cluster using the mean and standard deviation. Comparisons between clusters were performed using one-way analysis of variance for continuous variables and the chi-squared test for categorical variables. The relationships between clusters and both disease-specific and all-cause mortality rates were evaluated using Kaplan-Meier survival analysis. A Cox proportional hazards regression analysis was used to estimate hazard ratios (HR) and 95% confidence intervals (CI), which were adjusted for age, sex, BMI, and underlying medical condition. Cox-Snell resid-

Results
General characteristics of patients. Final analysis was performed with 4,603 patients (PSGs). The mean AHI was 22.6 ± 22.3 events/h. According to the current classification of OSA severity based on AHI, the number of patients in each severity category was 1,194 (25.9%), 1,081 (23.5%), 975 (21.2%), and 1,353 (29.3%) for AHI < 5 (normal), 5 ≤ AHI < 15 (mild), 15 ≤ AHI < 30 (moderate), and AHI ≥ 30 (severe), respectively. There were significant differences in age, sex, BMI, ESS, and PSQI according to the current classification of OSA severity (P < 0.001). General patient characteristics are shown in Table 2.
Cluster characteristics. Twenty-nine PSG parameters were reduced to nine factors, and based on these factors, K-means cluster analysis resulted in four OSA subgroups. Subsequently, the four clusters were rearranged according to their mean AHI, similar to the conventional AHI-based categorization of OSA severity. A summary of cluster characteristics is given in Table 3. All 29 PSG variables used for analysis differed significantly among the four clusters (P < 0.001). Other clinical parameters that were not used for analysis, such as age, BMI, ESS, and PSQI, were also significantly different among the four clusters (P < 0.001). Cluster labeling was based on AHI severity and differences in PSG features between clusters. The distribution of patients' OSA severity, according to the AHI within each cluster, is shown in Fig. 1.
Cluster 1.
Cluster 1 was the largest among the four clusters (n = 2,563, 55.7%). The mean AHI was 8.52 ± 7.43/h, the lowest among all clusters. This cluster tended to have the lowest proportion of stage 1 non-rapid eye movement (NREM) sleep, 9.27 ± 4.85% of total time analyzed, while the other sleep stages were longer than in the other clusters. This cluster had the highest spontaneous arousal index (5.28 ± 4.13/h) among all clusters. This cluster was labeled as "normal to mild OSA with spontaneous arousal".
Cluster 2.
Cluster 2 was the least common, comprising only 7.6% (352/4,603) of patients. The mean AHI was 12.16 ± 11.61/h, significantly higher than that of cluster 1 (P < 0.001). This cluster had the longest sleep latency (26.19 ± 31.42 min) and REM latency (155.99 ± 96.94 min) and the lowest sleep efficiency (74.65 ± 11.03%). The most distinctive PSG parameters in this cluster were the periodic limb movement (PLM) index (61.96 ± 33.34/h) and the PLM-associated arousal index (10.93 ± 11.58/h). This cluster was the oldest, with a mean age of 63.91 ± 10.36 years. The ESS score was the lowest (8.44 ± 4.98) in this cluster, while the PSQI score was the highest (9.40 ± 7.15). This cluster was labeled as "normal to mild OSA with poor sleep and PLMs".
30,31 . Because the BMI of cluster 4 was the highest, the PSG features were compared in a subgroup analysis according to BMI status (BMI ≥ 25 kg/m2 and BMI < 25 kg/m2) to evaluate the impact of obesity. Among the 410 patients, 68 were non-obese (BMI < 25 kg/m2) while 342 were obese (BMI ≥ 25 kg/m2). The mean AHI was significantly higher in obese patients than in non-obese patients (71.02 ± 14.29 vs 62.80 ± 13.92, P < 0.001). However, there was no significant difference between the two groups in the ratio of REM AHI to NREM AHI, the ratio of supine AHI to lateral AHI, or the fraction of hypopnea index. Interestingly, the mean total apnea-hypopnea duration was significantly higher in non-obese patients ( Table S5.
Kaplan-Meier survival analysis showed a significant difference in cardiovascular or cerebrovascular disease-specific mortality according to clusters (log-rank P = 0.009, Fig. 2a). Clusters 2 and 4 showed 6.4- and 4.8-fold greater risk of cardiovascular or cerebrovascular disease-specific mortality, respectively, compared to cluster 1 (Table 5).
However, after adjustment for age, BMI, sex, and underlying medical condition, only cluster 4 had a significantly higher risk of disease-specific mortality compared to cluster 1 (HR = 7.580; 95% CI 1.852–31.029; P = 0.005). In contrast, according to the conventional OSA classification based on AHI severity, there was no significant difference in disease-specific mortality (log-rank P = 0.331; Fig. 2b). Moreover, the Cox proportional hazards model demonstrated no significantly increased risk in mild to severe OSA compared to normal individuals with AHI less than 5/h after adjustment for age, BMI, sex, and comorbidities (Table 6). For all-cause mortality, both the conventional OSA classification and the clusters showed a significant difference. After adjustment for age, BMI, sex, and underlying medical condition, cluster 4 and severe OSA patients had a significantly higher risk of all-cause mortality compared to cluster 1 and normal individuals, respectively. However, the clusters showed no significant differences in mortality due to other causes (Supplementary Figs. S2, S3; Tables S5, S6).
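The Kaplan-Meier curves behind the log-rank comparisons above rest on the product-limit estimator, which can be sketched in a few lines. This is a generic illustration, not the software used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times:  follow-up duration for each subject
    events: 1 if death was observed at that time, 0 if the subject was censored
    Returns a list of (time, S(time)) at each distinct time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored subjects both leave the risk set after t.
        at_risk -= removed
    return curve


# Toy data: four subjects, the second one censored at t = 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored subjects contribute to the risk set up to their censoring time but never lower the curve. This matters in a cohort like this one, where most patients were still alive at the end of follow-up.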

Discussion
The current classification of OSA is based on AHI severity only, since each OSA severity category is known to have a different outcome, as shown in a longitudinal cohort study 4 . The goal of this study was to identify multiple OSA phenotypes based on various PSG features and to assess whether these phenotypes were distinct from each other. It is prudent to understand the meaning of PSG features, including AHI, to identify and understand the pathophysiology of OSA. Numerous studies have reported on the importance of PSG parameters other than AHI 5–8 ; however, these studies mainly focused on specific PSG parameters. There is a paucity of studies that have simultaneously analyzed and compared all PSG parameters. All-cause mortality was significantly different for both the conventional classification of OSA and the cluster-based classification. The adjusted HR for all-cause mortality in the severe OSA group was significantly higher compared to normal individuals. However, when mortality due to cardiovascular or cerebrovascular disease was considered, mortality differed significantly only among the clusters. After adjustment for age, sex, BMI, and underlying comorbidities, cluster 4 had a significantly higher risk of mortality due to cardiovascular or cerebrovascular diseases compared to cluster 1, while patients with severe OSA did not have a significantly higher risk compared to normal individuals. The current result contradicts our previous report, which stated that patients with severe OSA had a significantly higher risk of mortality due to cardiovascular and cerebrovascular diseases compared to normal individuals. However, this contradiction seems to be related to the recent decreasing trends in mortality due to cardiovascular and cerebrovascular diseases in Korea 32,33 .
Although both cluster 1 and cluster 2 are mostly composed of patients with normal to mild OSA, they showed different clinical outcomes. Cluster 2 is significantly older and is characterized by a high PLM index and low sleep efficiency. It is known that the prevalence of PLM increases with age 34,35 and it is also known to be associated with OSA, such that PLM syndrome (PLMS) is more common in patients with sleep-disordered breathing than in the general population 36 . One study demonstrated decreased PLM after CPAP therapy in mild OSA, suggesting that PLM develops as one of the consequences of OSA 37 . However, PLM sometimes newly appears after the initiation of positive airway pressure therapy 38 , and therefore the direct causal relationship between PLMS and OSA is poorly understood. PLMs are followed by arousal-related nervous system events, which manifest as cortical activity, and heart rate increases significantly in the first 5 s after the PLM 39 . Previous studies have advocated the importance of PLM since these events may be representative of decreased dopaminergic activity or increased activation of the autonomic nervous system, which may be related to increased vascular consequences 7,13,34,40 . In our study, cluster 2 had a significantly higher risk of disease-specific mortality compared to cluster 1. However, after adjustment for age, the increased risk in cluster 2 was no longer significant. Therefore, whether the poor outcomes in cluster 2 are due to age or to other clinical findings possibly related to its characteristic PSG features needs to be further validated.
Another important cluster was cluster 4. This cluster was characterized by the most severe type of OSA, with the highest AHI. In contrast to the other clusters, this cluster had the lowest proportion of hypopnea, ratio of REM AHI to NREM AHI, and ratio of supine AHI to lateral AHI. Although age, BMI, and ESS were not included in the clustering, this cluster had the highest ESS score, the highest BMI, and the youngest age. Severe obesity and high ESS scores seem to be the cause and effect, respectively, of the features of cluster 4. Increased sleep efficiency in this cluster may be the result of excessive daytime sleepiness 41 . In Korea, severe obesity is more prevalent in younger adults 42 , which may explain the relatively young age of this cluster. Patients in cluster 4 had the highest age-adjusted cardiovascular/cerebrovascular mortality. Although a similar trend was observed for patients with severe OSA (AHI ≥ 30), statistical significance was lacking, suggesting that not all patients with severe OSA carry a similar cardiovascular/cerebrovascular risk. PSG measurements representing the cumulative exposure to hypoxemia during sleep (i.e., time of O2 saturation < 90%), or the depth and duration of hypoxia induced by respiratory distress, are known to be strongly associated with cardiovascular mortality 43,44 . However, these measurements are only moderately associated with AHI, and therefore may not be captured by frequency-based metrics alone. Time of O2 saturation < 90% and mean apnea-hypopnea duration were higher in cluster 4 than in the other clusters, suggesting that patients in cluster 4 are grouped by combinations of PSG measurements associated with adverse cardiovascular outcomes. This seems to result in additional cardiovascular/cerebrovascular mortality risk in cluster 4. Cluster 3, which had a mean AHI of 38.6 ± 13.65/h, had a higher fraction of hypopnea, ratio of REM AHI to NREM AHI, and ratio of supine AHI to lateral AHI compared with cluster 4.
This cluster had lower mean apnea-hypopnea durations. The fraction of hypopnea, sleep stage dependency, position dependency, and duration of breathing disturbance are known to be related to other pathophysiological traits. For example, an increased fraction of hypopnea may indicate a decreased arousal threshold, while increased apnea-hypopnea duration with desaturation may indicate an increased arousal threshold 45 . The ratio of REM AHI to NREM AHI may also indicate alterations in muscle responsiveness, with higher values suggesting higher muscle responsiveness 45,46 . Although most of the patients in clusters 3 and 4 belonged to the same category of severe OSA, the PSG characteristics and clinical outcomes were different between the two clusters, which may suggest different underlying pathophysiologies. As the mean BMI was the highest in cluster 4, severe obesity may have resulted in different PSG characteristics and disease outcomes in this cluster. The AHI, ratio of REM AHI to NREM AHI, ratio of supine AHI to lateral AHI, and severity of nocturnal hypoxemia are known to be correlated with BMI 30,31 . However, severe obesity cannot explain all the observed features of cluster 4. In this cluster, 16.6% (68/410) of the patients were non-obese (BMI < 25 kg/m2). Although the mean AHI and mean ODI scores were higher in the obese group, the mean AHI in the non-obese group was still high (62.85 events/h), and there was no significant difference in the ratio of REM AHI to NREM AHI or the ratio of supine AHI to lateral AHI between the obese and non-obese groups.
Cluster analysis with variables limited to breathing disturbance and desaturation (such as AHI, ODI, time of O2 saturation < 90%, and lowest O2 saturation) may be a simpler approach compared to our study, which used 29 PSG variables from 5 different domains. One recent study demonstrated that oximetric parameters were able to describe a distinct phenotype with a high risk of mortality among patients with moderate to severe OSA 47 . However, our study revealed how other PSG features were grouped together, so that we could characterize each cluster from many other perspectives. Either way, these analytic methods may ultimately help us to understand the pathophysiology of OSA by phenotyping patients with PSG features beyond AHI.
The current study has several limitations. First, no adjustment was made for patient treatments. CPAP therapy is the gold standard treatment for OSA; however, the long-term, definitive effects of CPAP in preventing adverse cardiac events, stroke, or death are still controversial 48–50 . Some of the patients in our study population may have been using CPAP therapy. However, until 2018, CPAP therapy was not covered by the insurance system in Korea. Therefore, CPAP treatment was used as an off-label therapy and no systematic data are available on the use of CPAP therapy. Presently, CPAP treatment is covered by insurance; therefore, future prospective studies to validate the effect of CPAP treatment in disease prevention are necessary. Our clustering approach may help improve the efficacy of CPAP therapy for certain patient groups. Second, there were no adjustments for the treatment of underlying comorbid diseases. Although serum cholesterol and smoking history were collected, these were not adjusted for in the survival analysis. These limitations originate from the retrospective study design. A relatively short follow-up period, and therefore a low event rate, is another limitation. A longer follow-up period or prospective multicenter studies with more detailed clinical information may overcome these limitations. In addition, this is a single-center study composed mainly of a Korean population; therefore, generalization should be avoided, as ethnic factors such as diet, craniofacial morphological differences, obesity, and prevalence of comorbidities are known to differ from those in Western countries 51,52 . Finally, K-means clustering may not be the optimal clustering algorithm for the current dataset. Theoretically, K-means clustering assumes that all clusters are spherical with the same radius. When this equal-radius, spherical assumption is violated, as in the case of elliptically distributed data, K-means clustering can behave non-intuitively.
Moreover, the number of clusters K is fixed in advance, and the groupings can therefore be easily distorted by a small number of outliers 53 . Therefore, our results need to be further validated with other clustering algorithms.
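The sensitivity to outliers mentioned above follows from the fact that a K-means centroid is an arithmetic mean. A toy sketch with made-up coordinates (not study data) shows how far a single extreme observation can pull a centroid.

```python
def centroid(points):
    """Arithmetic mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


# Four tightly grouped hypothetical observations in a two-factor space.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# The same cluster after one extreme observation is assigned to it.
with_outlier = cluster + [(21.0, 21.0)]

compact = centroid(cluster)       # lies inside the group, at (0.5, 0.5)
shifted = centroid(with_outlier)  # pulled to roughly (4.6, 4.6), outside the group
```

Because subsequent assignments depend on distances to the shifted centroid, a few such points can change cluster membership for many ordinary observations. This is one reason validation with other clustering algorithms is worthwhile.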

Conclusion
This study identified four distinct clusters of patients with OSA based solely on various PSG features. There was a significant difference in disease outcome among the clusters, and such a difference could not be found with the standard classification of OSA based on AHI severity. Distinct characteristics among clusters imply different underlying pathophysiologies. These phenotypes may support further understanding and personalized treatment of OSA. Our findings suggest that physicians should consider the AHI along with other PSG parameters in their assessment of patients with OSA.