Validation of peripheral arterial tonometry as tool for sleep assessment in chronic obstructive pulmonary disease

Obstructive sleep apnea (OSA) worsens outcomes in chronic obstructive pulmonary disease (COPD), and reduced sleep quality is common in these patients. Thus, objective sleep monitoring is needed, but polysomnography (PSG) is cumbersome and costly. The WatchPAT determines sleep by a pre-programmed algorithm and has demonstrated moderate agreement with PSG in detecting sleep stages in normal subjects and in OSA patients. Here, we validated WatchPAT against PSG in COPD patients, hypothesizing agreement in line with previous OSA studies. Sixteen COPD patients (7 men, mean age 61 years) underwent simultaneous overnight recordings with PSG and WatchPAT. Accuracy in wake and sleep staging, and concordance regarding total sleep time (TST), sleep efficiency (SE), and apnea hypopnea index (AHI) were calculated. Compared to the best fit PSG score, WatchPAT obtained 93% sensitivity (WatchPAT = sleep when PSG = sleep), 52% specificity (WatchPAT = wake when PSG = wake), 86% positive and 71% negative predictive value, and Cohen's Kappa (κ) = 0.496. Overall agreement between WatchPAT and PSG in detecting all sleep stages was 63%, κ = 0.418. The mean (standard deviation) differences in TST, SE and AHI were 25 (61) minutes (p = 0.119), 5 (15) % (p = 0.166), and 1 (5) (p = 0.536), respectively. We conclude that in COPD patients, WatchPAT detects sleep stages in moderate to fair agreement with PSG, and AHI correlates well.

subjects and OSA patients, Hedner and colleagues found no significant differences in total sleep time (TST) and sleep efficiency (SE), and a moderate agreement (Cohen's Kappa (κ) coefficient of 0.48) in detecting wakefulness, light sleep, deep sleep and rapid eye movement (REM) sleep [12][13][14]. COPD and the medication used by these patients often cause increased sympathetic activity 15,16. As the PAT signal, which is based on sympathetic activity, is central in the WatchPAT sleep scoring and breath analysis algorithm, it is plausible that the accuracy of the WatchPAT in COPD patients may differ from that in OSA patients and the normal population.
In this study we present the results from an investigation where WatchPAT was used to estimate sleep quality and AHI in COPD patients. The WatchPAT results were compared to PSG recordings analysed by two technicians, blinded to each other's results and to the automated WatchPAT score.

Material and Methods
Subjects. Patients attending a 4-week, in-patient pulmonary rehabilitation program at the LHL-clinics, Glittre, Norway, were recruited from September 2016 to June 2017. Prior to inclusion, a diagnosis of COPD in a stable state was verified by pulmonary function tests including post-bronchodilator spirometry, diffusion capacity of the lungs, body plethysmography and blood gas assessment. Exclusion criteria were COPD exacerbation within the last 3 weeks, chronic respiratory failure (daytime arterial oxygen pressure (PaO2) ≤ 7.3 kPa or daytime arterial carbon dioxide pressure (PaCO2) ≥ 6.3 kPa), other diseases significantly affecting upper and lower airway function, a prior diagnosis of OSA, non-sinus cardiac arrhythmias, implanted pacemaker, coronary arterial disease with unstable angina pectoris or myocardial infarction within the last 3 months, uncontrolled hypertension, history of cerebral infarction, and use of nitrate medication. All subjects used prescribed medication, but no respiratory depressant drugs were taken from 48 hours prior to the first PSG recording until the end of the study.
Informed consent was obtained from all individual participants included, all procedures performed were in accordance with the 1964 Helsinki declaration and its later amendments, and the study protocol was approved by The Regional Committee for Medical Research Ethics in Mid-Norway (2016/1360/rek midt).
Study design. Double-blind, randomized, crossover, intervention trial in which patients consumed 70 mL of either beetroot juice containing nitrate or placebo immediately before bedtime. Subjects slept three nights semi-unattended in their hospital room, connected to PSG and WatchPAT simultaneously. The first night served for acquaintance with the equipment (no recording was done); the second and third nights were block randomized to nitrate intervention or control (placebo) according to sex and age. The pre-programmed WatchPAT and the PSG recording were started when the patient went to bed, and stopped by the department nurse when the patient wanted to get up in the morning. The nurse reported the time of lights off. Here, data from the control nights are presented (Fig. 1), whereas results from the nitrate intervention are published elsewhere.
As PSG is manually scored, it implies some inter-rater variability, which should be addressed when comparing agreement with other devices. Thus, the PSG's were scored independently by two experienced registered polysomnography technicians (RPSGT's) according to the rules set by the AASM 18. An apnea was scored when the nasal pressure dropped ≥90% for ≥10 seconds. A hypopnea was scored when the nasal pressure dropped ≥30% for ≥10 seconds, causing a ≥4% drop in SpO2 from baseline. Parameters used for analyses include stage score (Wake, N1, N2, N3, REM sleep) for each epoch as well as TST, sleep efficiency (SE, percentage of sleeping time between lights off and lights on) and AHI.
Peripheral arterial tonometry. The WatchPAT (version 200, Itamar Medical, Israel) is a wrist-worn sleep data recorder, connected to a finger probe with a plethysmograph and a pulse oximeter, and a microphone with an actigraph taped below the sternal notch. Recordings were analysed by the software zzzPAT (Itamar Medical, Israel). Stages (wakefulness, light sleep, deep sleep, REM sleep) were scored in 30-second epochs based on spectral components from the PAT-signal. Respiratory events (apneas, hypopneas) were scored based on PAT-signal amplitude, heart rate and SpO2. The algorithm was set to score hypopneas in the presence of a ≥4% drop in SpO2.
First, we analysed WatchPAT's agreement with the RPSGT's in scoring sleep and wake by calculating sensitivity (i.e. WatchPAT = sleep when PSG = sleep) and specificity (i.e. WatchPAT = wake when PSG = wake). Then we calculated WatchPAT's positive and negative predictive values, considering the PSG's as gold standard. Subsequently, we performed an epoch-by-epoch comparison of sleep/wake stages, with the PSG-scored stages N1 and N2 considered equal to WatchPAT's light sleep. Finally, the agreement in terms of TST, SE and AHI was calculated. All the above-mentioned outcomes were also analysed comparing the results from the two RPSGT's.
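The sleep/wake comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis (which was run in SPSS): the label strings, the mapping table `PSG_TO_PAT_STAGE` and the function names are hypothetical, and the eight epochs are toy data. Sleep is treated as the positive class and PSG as the reference, with N1 and N2 collapsed to light sleep as in the text.

```python
from collections import Counter

# Hypothetical label vocabulary; the real PSG and zzzPAT export formats are not given in the text.
PSG_TO_PAT_STAGE = {"Wake": "wake", "N1": "light", "N2": "light", "N3": "deep", "REM": "rem"}


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def sleep_wake_metrics(psg_stages, pat_stages):
    """Compare binary sleep/wake labels epoch by epoch, with PSG as reference.

    'sleep' is the positive class, so sensitivity means
    'WatchPAT = sleep when PSG = sleep'.
    """
    if len(psg_stages) != len(pat_stages):
        raise ValueError("both recordings must contain the same number of epochs")
    counts = Counter()
    for psg, pat in zip(psg_stages, pat_stages):
        psg_binary = "wake" if PSG_TO_PAT_STAGE[psg] == "wake" else "sleep"
        pat_binary = "wake" if pat == "wake" else "sleep"
        counts[(psg_binary, pat_binary)] += 1
    tp = counts[("sleep", "sleep")]  # both score sleep
    fn = counts[("sleep", "wake")]   # PSG sleep, WatchPAT wake
    fp = counts[("wake", "sleep")]   # PSG wake, WatchPAT sleep
    tn = counts[("wake", "wake")]    # both score wake
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data: eight 30-second epochs, not taken from the study.
psg = ["Wake", "Wake", "N1", "N2", "N2", "N3", "REM", "Wake"]
pat = ["wake", "light", "light", "light", "deep", "deep", "rem", "wake"]
metrics = sleep_wake_metrics(psg, pat)
print(metrics)
```

In the toy data one PSG wake epoch is scored as light sleep by the device, which lowers specificity and the positive predictive value while leaving sensitivity untouched. This is the same direction of error the study reports.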

Statistical analysis.
In their OSA study comparing WatchPAT to PSG, Pang et al. found AHI correlating with Pearson's r = 0.929, 95% confidence interval 0.858-0.965 12. Assuming we would find an equivalent correlation in COPD patients, and a 5% level of significance with 80% power, a sample size of 15 subjects was calculated a priori.
Agreement was considered present when an epoch was assigned the same stage by both scorers (WatchPAT or RPSGT), and is reported as percentage of agreement and as Cohen's κ coefficients. Other correlations were calculated as Pearson's r. Intra-class correlation coefficients (ICC) were calculated for TST, SE and AHI, and the mean differences in TST, SE and AHI were compared by one-sample t-tests, inspection of Bland-Altman plots and linear regression of the differences. Two-sided P values of ≤0.05 were considered significant. All analyses were performed using IBM SPSS Statistics version 25.
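For readers unfamiliar with the two agreement measures, the following minimal Python sketch computes percentage agreement and Cohen's κ from two epoch-by-epoch stage sequences. The function name `cohens_kappa` and the toy sequences are illustrative assumptions; the study itself used SPSS. κ is the observed agreement corrected for the agreement expected by chance from each scorer's stage frequencies.

```python
from collections import Counter


def cohens_kappa(scorer_a, scorer_b):
    """Return (observed agreement, Cohen's kappa) for two equally long label sequences."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(scorer_a)
    # Proportion of epochs where both scorers assigned the same stage.
    observed = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n
    # Chance agreement from the marginal stage frequencies of each scorer.
    freq_a, freq_b = Counter(scorer_a), Counter(scorer_b)
    stages = set(freq_a) | set(freq_b)
    expected = sum(freq_a[s] * freq_b[s] for s in stages) / (n * n)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined when only one stage occurs
    return observed, (observed - expected) / (1.0 - expected)


# Toy hypnograms with eight epochs each, not taken from the study.
technician = ["wake", "light", "light", "deep", "rem", "light", "wake", "rem"]
device = ["wake", "light", "deep", "deep", "rem", "light", "light", "rem"]
agreement, kappa = cohens_kappa(technician, device)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

Here six of eight epochs match (75% agreement), but because about 27% agreement would be expected by chance, κ is lower than the raw percentage. The same gap appears between the 63% agreement and κ = 0.418 reported in the abstract.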

Results
As shown in Fig. 1, five subjects withdrew after the first night; the main reason was fear of not being able to sleep with electrodes on head and body. One sleep study was excluded as the WatchPAT did not start recording.
WatchPAT's ability to detect sleep (positive predictive value) was 85.9% and 85.4%, whereas it correctly scored wake (negative predictive value) in 71.2% and 71.9% of the epochs. WatchPAT correctly predicted non rapid eye movement (NREM) sleep in 80.6% and in 79.8% of the epochs. Other positive predictive values are listed in Table 1. (Table 2). However, age, gender, COPD severity (as characterised by FEV1, RV/TLC ratio, BMI, PaO2, PaCO2) and daytime sleepiness (Epworth score) showed no significant correlations with the percentage of sleep score agreement or Cohen's κ. In subjects #1 and #2, a total of 948 epochs were scored as wake by RPSGT#1, but only 274 (28.9%) of these were scored as wake by WatchPAT, the rest as light sleep 526 (55.5%), deep sleep 59 (6.2%) and REM sleep 89 (9.4%). Equivalent numbers were found between WatchPAT and RPSGT#2.
Total sleep time, sleep efficiency and apnea hypopnea index. As shown in Table 3, ICC coefficients for WatchPAT versus each of the RPSGT's regarding TST were low, but Table 4 shows that no significant difference in TST was found. Notably, the 95% confidence intervals for the ICC standard deviations were high, and the 95% confidence intervals for TST difference ranged from −7 to +57 minutes (WatchPAT versus RPSGT#1) and from −3 to 59 minutes (WatchPAT versus RPSGT#2). Likewise, for the difference in sleep efficiency the 95% confidence intervals were −2% to +13% between WatchPAT and each of the RPSGT's.
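As a worked check of how such an interval arises, the sketch below applies the standard one-sample t interval to the rounded TST figures from the abstract (mean difference 25 minutes, standard deviation 61 minutes, 16 subjects). The helper name `mean_difference_ci` is hypothetical, and the critical value 2.131 is the tabulated two-sided 95% value of Student's t with 15 degrees of freedom, entered by hand to keep the example free of dependencies. Because the abstract values are rounded and refer to the best fit PSG score, only approximate agreement with the intervals above should be expected.

```python
import math


def mean_difference_ci(mean_diff, sd_diff, n, t_crit):
    """Return (lower, upper, t statistic) for a one-sample t interval on paired differences."""
    standard_error = sd_diff / math.sqrt(n)
    half_width = t_crit * standard_error
    return mean_diff - half_width, mean_diff + half_width, mean_diff / standard_error


# Tabulated two-sided 95% critical value of Student's t with n - 1 = 15 degrees of freedom.
T_CRIT_DF15 = 2.131

# Rounded TST values from the abstract: mean (SD) difference of 25 (61) minutes, n = 16.
lower, upper, t_stat = mean_difference_ci(25.0, 61.0, 16, T_CRIT_DF15)
print(f"95% CI: {lower:.1f} to {upper:.1f} minutes, t = {t_stat:.2f}")
```

The interval comes out at roughly −7.5 to +57.5 minutes, close to the −7 to +57 minutes reported for WatchPAT versus RPSGT#1. It spans zero, which is why the group-level difference is not significant, yet its width of more than an hour reflects the large individual differences.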
Inspection of Bland-Altman plots showed that subject #2 differed highly between WatchPAT and the two RPSGT's, both in TST (188 and 187 min's) and SE (40.6% and 40.3%). However, the AHI differences in this subject were only 3.5 and 3.1, and neither WatchPAT nor the RPSGT's reported AHI ≥ 5 (i.e. not OSA). Linear regression did not show any proportional bias in the distribution of the TST and SE differences when controlling for subject #2.
The WatchPAT-PSG differences in TST were positively correlated with Cohen's κ for detecting all sleep stages (RPSGT#1: r = 0.601, p = 0.014; and RPSGT#2: r = 0.628, p = 0.009). Positive correlations of similar magnitude were also found for SE differences vs. Cohen's κ, as well as for TST differences and SE differences vs. percentage of wake and sleep stage agreement. In other words, when WatchPAT failed in sleep staging, TST and SE were affected.
Intra-class correlation coefficients for AHI were high as shown in Table 3, and the WatchPAT's AHI did not differ significantly from the RPSGT's on a group level (Table 4). However, as indicated in Fig. 2, WatchPAT failed to detect correct OSA severity in two patients. In patient #19, AHI scored by WatchPAT was 1.0 whereas the RPSGT's scored 14.0 and 13.0 (TST 356 min's, 410 min's and 402 min's, respectively), and in patient #22, WatchPAT scored 14.2 vs 1.9 and 0.3 (TST 401 min's, 381 min's and 382 min's) scored by RPSGT#1 and RPSGT #2, respectively. All subjects were scored likewise according to OSA severity by both RPSGT's (Table 5).
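The misclassification in these two patients can be made explicit by mapping each AHI to a severity class. The sketch below assumes the conventional cut-offs of 5, 15 and 30 events per hour; the text itself only uses the 5 and 15 thresholds, so the remaining boundary and the function name `osa_severity` are assumptions rather than something confirmed by Table 5.

```python
def osa_severity(ahi):
    """Classify an apnea hypopnea index using conventional cut-offs (assumed: 5, 15, 30)."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "none"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# AHI values for the two discordant patients as reported in the text.
reported_ahi = {
    "#19": {"WatchPAT": 1.0, "RPSGT#1": 14.0, "RPSGT#2": 13.0},
    "#22": {"WatchPAT": 14.2, "RPSGT#1": 1.9, "RPSGT#2": 0.3},
}

severity = {
    patient: {scorer: osa_severity(ahi) for scorer, ahi in scores.items()}
    for patient, scores in reported_ahi.items()
}
for patient, classes in severity.items():
    print(patient, classes)
```

Under these cut-offs the device misses mild OSA in patient #19 and reports mild OSA in patient #22 where both technicians found none, so the errors run in opposite directions and largely cancel in the group mean.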

Discussion
This study of patients with moderate to severe COPD demonstrated that the WatchPAT scores sleep with less agreement with PSG compared to previous studies of healthy individuals and OSA patients. TST and SE estimated by WatchPAT do not correlate with PSG; however, despite wide individual variability in TST and SE, the differences are not statistically significant on a group level. The WatchPAT's AHI is reasonably accurate.
Although we found the automated WatchPAT's sensitivity in detecting sleep to be good, its specificity (scoring wake when the RPSGT's scored wake) was only about 51%, compared to the 66-74% found by Hedner et al. in non-COPD subjects 19. We found WatchPAT's abilities in predicting wake, REM and light sleep to be in line with Hedner's findings; however, it correctly scored deep sleep in only 30-37% of the epochs, compared to 69% found by Hedner 14. In two subjects, more than half of the epochs scored as wake by the RPSGT's were scored as light sleep by WatchPAT. As TST and SE were overestimated in these subjects and underestimated in others, the mean values did not differ significantly between the WatchPAT and the PSG recordings, but the individual differences could not be explained by demographic or COPD severity indices. Interestingly, although the TST difference between WatchPAT and the RPSGT's in subject #2 was more than 3 hours, the AHI differed only between 3.1 and 3.5, and the AHI was scored <5 both by WatchPAT and the RPSGT's. Thus, differences in sleep status did not seem to affect the AHI in this case, but in subjects #19 and #22, large AHI differences of >10 were accompanied by TST differences of approx. 50 min's and 20 min's, respectively.
TST and SE were moderately correlated with the wake/sleep agreement, both according to percentage and as Cohen's κ. The WatchPAT's epoch-by-epoch agreement with the RPSGT's in scoring wake, light sleep, deep sleep and REM sleep varied between the study subjects from 36% to 79%, with Cohen's κ varying from 0.09 to 0.66 (Table 2). Jacob Cohen suggested the κ result to be interpreted as fair in the interval 0.21-0.40 and moderate in the interval 0.41-0.60 20. Thus, the overall agreement between WatchPAT and the RPSGT's with a κ of 0.41-0.42 can be described as fair to moderate, in contrast to the κ of 0.48 previously found in normal subjects and OSA-patients published by Hedner's group 14. However, no definite consensus is established on what constitutes adequate agreement; some statisticians describe κ in the range 0.40-0.59 as weak, arguing that little confidence should be placed in results with κ < 0.60 21.
The association between autonomic output and sleep states has previously been extensively studied, showing different autonomic patterns in wake, NREM and REM sleep as reflected in heart rate, heart rate variability, blood pressure and combinations of these [22][23][24][25]. In the transition from wake to sleep, sympathetic activity decreases and parasympathetic activity increases, with further progress as sleep is consolidated. REM sleep, however, is associated with considerable sympathetic variability and smaller changes in parasympathetic activity, resulting in profound changes in vasoconstriction and heart rate variability in the PAT-recordings 25,26. In OSA, peak sympathetic activity is associated with the termination of respiratory events 27. The algorithms in WatchPAT for detecting sleep/wake status and differentiating between light sleep, deep sleep and REM-sleep, as well as identifying respiratory events (e.g. apneas and hypopneas), are based on these associations. In sleeping COPD patients, a marked worsening of breathing and arterial blood gases is often seen, especially during REM-sleep, in contrast to healthy individuals where sleep usually represents a restful period for the pulmonary and cardiovascular systems 7,28. Arterial stiffness is increased in sleeping COPD patients compared to healthy controls, particularly during REM sleep, and their vascular tone was reduced in response to nasal high flow oxygen therapy 29,30. It is unclear if the WatchPAT algorithm can identify and correct for these COPD specific changes in the PAT-signal, blood gases and heart rate variability. Thus, it is possible that the low κ agreement between WatchPAT and the PSG scores (approximately half of the study subjects had κ < 0.40) is due to autonomic dysfunction in some of the COPD patients.
Likewise, the WatchPAT-PSG differences in AHI found in study subjects #19 and #22 can be due to changed sympathetic activity from altered lung mechanics, hypoxia and/or hypercapnia. Finally, Hedner's group found that greater movement intensity in patients with severe OSA was associated with less agreement in sleep/wake status between the two methods, and pointed out that overestimation of sleep time appears to be a general limitation of actigraphy 19. Thus, the low specificity and the wide individual differences we found in sleep staging, TST and SE might not depend on the chronic obstructive lung disease per se, but rather on the sleeping patients' movements. On the other hand, despite the fact that half of the study population had an RPSGT-scored AHI ≥ 5 (Table 5), only one of these had AHI ≥ 15, indicating that the severity of overlap COPD-OSA in this population was low.
To our knowledge, this is the first study comparing WatchPAT with PSG in COPD patients. As such, it has some limitations. First, the sample size was calculated on the assumption that the correlation would be in line with previous studies of OSA-patients. We found the best match ICC coefficient (95% confidence interval) for AHI to be 0.96 (0.88-0.99), indeed comparable to Pang et al.'s Pearson's r of 0.93 (0.86-0.97) 12, but with only sixteen study subjects, identifying possible characteristics in COPD that mislead the WatchPAT algorithm was not possible.
Also, although we synchronized the WatchPAT and PSG epochs by the internal clock, position and movement signals, an off-set of up to 30 seconds is possible. This implies that the first and last epoch in a given sleep stage may not match, resulting in several mismatches in very fragmented sleep. However, with more than 14600 epochs compared in all, we find it unlikely that this would significantly affect the agreement between the two methods at a group level.
COPD patients can have other sleep related breathing disorders than OSA, e.g. sleep related hypoventilation (SH) 7, and as we did not record PCO2 during sleep it is possible that SH can explain some of the differences between WatchPAT and the PSG scorings. Finally, the zzzPAT does not report apneas and hypopneas on a continuous timeline, only as the total number of events, and in this study the RPSGT's did not score hypopneas as obstructive or central. Thus, we were not able to explore the major differences in AHI found in two of the subjects.

Conclusion
We have compared automated sleep scoring by WatchPAT to the gold standard PSG in moderate to severely ill COPD patients in a stable state of their disease, and found a good sensitivity (WatchPAT = sleep when PSG = sleep) but a weak specificity (WatchPAT = wake when PSG = wake). The WatchPAT underestimates deep NREM sleep and shows only fair to moderate agreement with PSG in the overall detection of wake, NREM light sleep, NREM deep sleep and REM sleep. TST and SE do not differ significantly, but wide individual differences make WatchPAT's usefulness in evaluating sleep quality questionable in these patients. However, AHI shows reasonably good correlation.
Autonomic dysfunction in COPD is a possible explanation for the discrepancies found here as compared to previous studies in normal populations and in OSA-patients, but our study was too small to explore such associations.
Until proven otherwise in bigger studies, we conclude that the WatchPAT must be used with caution for sleep evaluation in COPD, but AHI correlates well enough for screening of OSA in these patients.

Data availability
The datasets generated and analysed during this study are available from the corresponding author on reasonable request.