A Comparative Study of ECG-derived Respiration in Ambulatory Monitoring using the Single-lead ECG

Cardiorespiratory monitoring is crucial for the diagnosis and management of multiple conditions such as stress and sleep disorders. Therefore, the development of ambulatory systems providing continuous, comfortable, and inexpensive means for monitoring represents an important research topic. Several techniques have been proposed in the literature to derive respiratory information from the ECG signal. Ten methods to compute single-lead ECG-derived respiration (EDR) were compared under multiple conditions, including different recording systems, baseline wander, normal and abnormal breathing patterns, changes in breathing rate, noise, and artifacts. Respiratory rates, wave morphology, and cardiorespiratory information were derived from the ECG and compared to those extracted from a reference respiratory signal. Three datasets were considered for analysis, involving a total 59 482 one-min, single-lead ECG segments recorded from 156 subjects. The results indicate that the methods based on QRS slopes outperform the other methods. This result is particularly interesting since simplicity is crucial for the development of ECG-based ambulatory systems.

Pre-processing. ECG. The ECG signals were first normalized by subtracting the mean and dividing by the standard deviation (i.e., the standard score) and then segmented into minutes, with the segments indexed by k, with k = 1, …, K and K the total number of signal segments. In total, 1218, 4711, and 53553 min were collected for the Drivers, Fantasia, and Sleep datasets, respectively. Then, a signal quality index (SQI), denoted q(k), was computed to quantify the presence of artifacts and noise, using the algorithm proposed in 45 . The SQI ranges from 0 to 100, where higher values correspond to better signal quality. Segments with q(k) > 80 are considered of high quality. Baseline wander was removed from the ECG using a forward/backward, fourth-order Butterworth highpass filter with cutoff frequency at 0.5 Hz. The reason for using this filter relies on the results presented in 46 , where it was identified as one of the most accurate methods and yet simple to implement.
R-peaks were detected using the algorithm described in 47 . Missing beats and false alarms were corrected using the RR interval adjustment algorithm described in 5 . The corrected RR intervals were then used to construct the tachogram, resampled to 5 Hz using cubic spline interpolation.
Since the performance of the EDR methods depends on accurate QRS detection, a procedure was implemented to automatically identify aberrant QRS morphologies that might have remained after the RR interval correction. These undesired complexes might still be present in the data due to abnormal or aberrant morphology not detected by the RR interval adjustment, which focuses on rhythm abnormalities. First, each QRS complex was segmented using a window of 60 ms before and after the R-peak, see Fig. 1(a). Then, the mean of each QRS complex was subtracted and its variance computed. This resulted in a time series {σ 2 (1), σ 2 (2), …, σ 2 (L)} ( Fig. 1(b)), where L is the number of QRS complexes in the segment. The 25th (Q 1 ) and 75th (Q 3 ) percentiles and the interquartile range (IQR) were then computed and the lower (Q l ) and upper (Q u ) limits of accepted QRS variance were defined as Q l = Q 1 − 2.5 ⋅ IQR and Q u = Q 3 + 2.5 ⋅ IQR. Only QRS complexes fulfilling Q l < σ 2 (i) < Q u were accepted for further analysis. Figures 1(c,d) illustrate QRS complexes accepted and excluded from further analysis, respectively. Note that this procedure is specifically developed for ensembles with beats of the same morphology, where noisy or abnormal beats are in minority. This means that arrhythmias such as bigeminy will not be handled by this procedure.
Respiration. All respiratory signals were segmented into minutes and downsampled to 5 Hz following antialiasing filtering. Then, forward/backward filtering using a fourth-order Butterworth bandpass filter with cutoff frequencies at 0.05 Hz and 1 Hz was applied. The spectral properties of each segment were used to determine whether the estimation of the respiratory signal and derived parameters deteriorates with more complex respiratory patterns. Therefore, the estimation errors obtained for respiratory signals with a single fundamental frequency were compared to those obtained for more irregular and abnormal respiratory patterns. The spectral properties were characterized using two indices. The first index describes the bandwidth of the respiratory signal, calculated as the width of the frequency band containing 90% of the total power. The power spectral density (PSD) was obtained using Welch's method with a Hamming window of 30 s, an overlap of 20 s, and 1024 points. The bandwidth, denoted b(k), is used to differentiate narrow-and broadband respiratory signals. For instance, respiratory signals during periods of relaxation or deep sleep (without the presence of apneas) are characterized by narrow PSDs with a clear respiratory rate, see Fig. 2, where segments during (a) deep sleep, (b) apnea, and (c) driving are exemplified. The segments in Fig. 2(b,c) are characterized by broadband spectra, either due to the presence of artifacts (e.g. during driving) or physiological events like apnea. The second index is given by the number of modes (i.e. local maxima), denoted m(k), in the PSD within b(k). As shown in Fig. 2, b(k) is larger during both apnea and driving than during deep sleep, however, these patterns can be further differentiated by m(k). Therefore, both b(k) and m(k) are proposed as indicators of respiratory patterns, used for splitting the respiratory signals according to their spectral properties. Figure 2(d) shows the distribution of these indices for the different datasets.
Following pre-processing, each 1-min segment of data consisted of: R-wave amplitude ( r i ( ) r ) is simply defined by the R-wave amplitude. R-to-S-wave ( r i ( ) rs ) is defined by the difference between the R-and the S-wave amplitudes. The latter is calculated as the minimum amplitude in a 80-ms window after the R-wave 22 .
Principal component analysis ( r i ( ) p ) accounts for the global variation in amplitude of the QRS samples 11 . The QRS complex is segmented using a symmetric 120-ms window centered around the R-wave, after which all QRSs are organized as rows in a matrix to which principal component analysis (PCA) is applied. The first principal component of the matrix is used as EDR signal.
Kernel principal component analysis ( r i ( ) k ) is the kernel version of PCA (kPCA). This method first maps the QRS matrix, computed as for  r i ( ) p and contained in the input space, into a higher dimensional space by means of a kernel function. The classical PCA is then performed on the transformed dataset, and the first principal component is mapped back to the input space and taken as the EDR signal 12 . Q-R slope ( r i ( ) up ) uses the upward slope of the R-wave as the EDR signal 13 . A straight line is fitted to the samples in an 8-ms window centered around the sample with the steepest upward slope. The slope is then used as the EDR signal.
R-S slope (r i ( ) dw  ) is identical to that of the Q-R slope except that it uses the window centered around the sample with the steepest downward slope 13 . It is important to keep in mind that r i ( ) rs  ,  r i ( ) dw , and r i ( ) up  strongly depend on the ECG lead analyzed, since QRS complexes with prominent R-waves and clear Q-and S-waves are required.
R-wave angle (r i ( ) θ  ) is estimated from the QRS slopes (r i ( ) up  and  r i ( ) dw ), and taken as the EDR signal 13 . QRS slope range ( r i ( ) sr ) is defined by the difference between the maximum and the minimum slopes in the QRS complex 27 , computed from the first derivative in a symmetric window of 100-ms centered around the R-wave.
Central moment (r i ( ) cm  ) is defined by the 4-th order central moment of the bandpass filtered (0.5-45 Hz) ECG signal in the RS interval 14 .
QRS area ( r i ( ) a ) is defined by the area of the QRS complex 10 . The EDR signals, sampled at the R-wave positions, were resampled to 5 Hz using cubic spline interpolation, thereby facilitating the comparison with the reference respiratory signals. The signals were then band-pass filtered in the same way as done for the reference. The resulting EDR signals are denoted r n ( ) edr  , where the subscript edr is given by the signal considered, i.e., edr ∈ {r, rs, p, k, up, dw, θ, sr, cm, a}.
Performance measures. Respiratory rate. The EDR signals are evaluated with respect to respiratory rate, denoted f(k), computed using the method described in 13 . The method involves the following three steps: PSD estimation, peak-conditioned averaging, and respiratory rate estimation. A subscript is added to f(k), either th, ab, or na, depending on the reference respiratory signal considered. The notation  f k ( ) edr is used when estimated from an EDR signal, e.g.,  f k ( ) rs indicates that the respiratory rate is estimated from  r n ( ) rs . The relative error in the estimation of the respiratory rate, denoted e k ( ) f edr , is always calculated using r n ( ) th k ( ) as reference, and is defined by Wave morphology similarity. The similarity between EDR and reference respiratory signals is evaluated using cross-correlation and spectral coherence. In the k-th segment, the absolute maximum cross-correlation, denoted |ρ(k)|, is computed within ± 3 s. The mean time-frequency (TF) coherence, denoted γ k ( ) r , is computed within the respiratory band b(k) using the method described in 48 . The mean TF coherence is used to reduce the effect of nonstationarities in each segment.
In all datasets, r n ( ) Cardiorespiratory interactions. An important parameter when investigating different diseases and conditions such as stress and sleep disorders is the amount of information transferred from respiration to heart rate 1,6,49,50 . For example, studies have shown that during apnea the respiratory modulation of the HRV, known as respiratory sinus arrhythmia, is attenuated 4,6,51 . With this in mind, the goal is to determine if the EDR can be used to identify attenuation in cardiorespiratory interactions.
This information is quantified not only for the reference respiratory signals but also for the EDR signals using two novel approaches; the partial TF method 52 and the transfer entropy obtained using the information decomposition method 49 . The reason for selecting these two approaches relies on the fact that partial TF deals with nonstationarities in the signals and quantifies the coherence between the signals, while transfer entropy explores causality and predictability of the tachogram using the respiratory signal as an independent variable. Partial time-frequency. This method analyzes and interprets systems characterized by two single-inputs and a single-output under nonstationary conditions, using a non-parametric, multivariate quadratic TF representation proposed in 48 . It uses the TF coherence function to decompose the spectrum of the single-output into two spectra reflecting the contributions of the two single, uncorrelated inputs. This method was used in 52 to quantify the influence of heart rate variability (HRV) on QT interval variability, but here to quantify the cardiorespiratory coupling as the contribution of respiration to the TF spectrum of the tachogram. This contribution, denoted k ( ) xy γ , is quantified by the mean TF coherence between the tachogram and the respiratory signal of the k-th segment in the signal. Details on this approach and its implementation can be found in 52 .
The relative error in the estimation of the mean TF coherence is calculated using r n ( ) th k ( ) as reference, and is defined by www.nature.com/scientificreports www.nature.com/scientificreports/ Transfer Entropy. Transfer entropy was computed using information dynamics, being a framework derived from the field of dynamical information theory. Using information dynamics, the amount of information stored in a system and the information transferred from one system to another (i.e. respiration to heart rate) is estimated. In this work, the focus is on the information transferred from x to y, referred to as transfer entropy T x→y ; x is the respiratory signal, and y the tachogram. The larger the amount of information transferred from respiration to heart rate, the larger is the transfer entropy.
A method to quantify T x→y in cardiorespiratory analysis was proposed in 49 . This method links information theory and predictability, resting on the assumption that x and y are jointly Gaussian. With this assumption, it is possible to describe their dynamics using a linear vector, autoregressive model of order p, determined using the Akaike information criterion. In this way, T x→y can be linked to the error probabilities of an autoregressive model, with heart rate and respiration as the dependent and independent variables, respectively.
The relative error in the estimation of the transfer entropy of the k-th segment, is calculated using r n ( ) th k ( ) as reference, and is defined by Statistical analysis. The performance is evaluated at different noise levels, quantified by q(k). This separation shows the effect of noise on EDR signal morphology and cardiorespiratory parameters, being evaluated using the Kruskal-Wallis test with α = 0.05. A multicomparison test with Bonferroni correction was used whenever required.
Using the Sleep dataset, similarity and relative errors are evaluated for normal activity and apnea events. Again, the Kruskal-Wallis test is used with α = 0.05 and a multicomparison test with Bonferroni correction.
The relationships between, on the one hand, the similarity and the relative errors e f (k), e γ (k), and e T (k), and, on the other hand, the spectral characteristics b(k) and m(k) of the respiration, are evaluated using both Pearson's and Spearman's correlation coefficients. Figure 3 shows two examples of respiratory signals of high-quality ECG segments of the Sleep dataset. These examples illustrate the difference in bandwidth of the reference respiratory signal, r th (n) and modes of the respiratory spectrum, quantified by b(k) and m(k).

Results
To evaluate the performance, b(k) was divided into ranges to differentiate between narrow-and broadband respiratory spectra. Table 1 presents the percentage of segments per dataset belonging to each range. Most segments are contained in 0.1 Hz < b(k) ≤ 0.5 Hz. About 36% of the segments in the Sleep dataset have a b(k) ≤ 0.2 Hz, which was expected since more regular respiration is typical during sleep. This is also observed in Fig. 2. Moreover, for the Drivers dataset, 21.7% of the segments are characterized by a bandwidth larger than 0.5 Hz, probably explained by poor-quality recordings or drivers constantly moving and speaking while driving 43 . This observation is supported by the result that the bandwidth was on average 0.26 Hz during the first 15 min when the drivers were relaxed with the car in idle, while the bandwidth increased to 0.45 Hz during driving.
Concerning m(k), there was a significant difference between all datasets, being on average, 4.29 ± 2.10, 3.73 ± 1.71, and 2.93 ± 1.48 for the Drivers, Fantasia, and Sleep datasets, respectively. www.nature.com/scientificreports www.nature.com/scientificreports/ The relationship between b(k) and m(k) of the respiratory signals and the five performance measures will be presented at the end of this section. First, the performance measures will be discussed.
The ECG signal quality was assessed for all three datasets, resulting in that 2.4%, 3.1%, and 6.1% of the segments were identified as low-quality in the Drivers, Fantasia, and Sleep datasets, respectively. The number of aberrant QRS complexes per segment were different between low-and high-quality segments in all datasets. On average, segments with q(k) ≤ 80 had 7 ( ≈ 10% per segment) QRS complexes removed from the analysis, while high-quality segments had on average 2 ( ≈ 3% per segment) aberrant QRS complexes. An important observation is that the distributions of the number of segments removed from low-and high-quality segments were right-skewed, with skewness equal to 2.5 and 4.1, and median values of 3 ( ≈ 7% per segment) and 0, respectively. This suggests that q(k) captures changes in the ECG morphology that could be related to noise and artifacts, and that the excluded QRS complexes in the high quality segments are in minority.
One of the reasons to include the apnea dataset was to evaluate if abrupt changes in the respiratory pattern, like those observed during apneas, affect the estimation of the respiratory parameters. This can be related to either the fact that the morphology of the QRS complexes is affected beyond acceptance, or to the low sensitivity of the , and relative errors were obtained using r th (n) as reference and are indicated in %. * Significantly different from normal events. www.nature.com/scientificreports www.nature.com/scientificreports/ EDR to capture those changes. In order to study this, the number of QRS complexes removed from segments containing apneas was compared to the number removed from normal segments (see Table 2).
Respiratory rate. The estimation errors of the respiratory rate are shown in Fig. 4. The errors associated with the Sleep dataset were significantly lower than those associated with the other two datasets, while the errors associated with the Drivers set were the largest (p < 0.05). When looking at each dataset separately, the lowest errors were obtained using  r n ( ) for both the Fantasia and Sleep datasets. In the Sleep dataset, the estimation errors determined during normal activity were compared to those during apneas and presented in Table 2. Only errors obtained for  r n ( ) sr are indicated, but they are similar to those obtained for all other EDR signals. In general, the errors were significantly higher (p < 0.05) during OSH than during normal activity.
Regardless of the estimation error observed during OSH and normal activity, the estimated respiratory rates could discriminate (p < 0.05) normal activity from OSA, CEN, HPA, and MIX events, see Figure 5 where the results for r th (n) and r n ( ) sr  are shown. Similar results were obtained for all EDR signals but only those of  r n ( ) sr are indicated. The lowest respiratory rates were observed during central events (CEN) for both the reference and estimated respiratory signals.
The agreement between the reference and estimated respiratory rates was also evaluated at different rates. For illustration purposes, only the results obtained for r n ( ) Fig. 6, but the  . The least-squares regression line is indicated for each case. For this comparison, a distinction was made between broad-and narrowband respiratory signals using a threshold of b(k) = 0.3 Hz. Note that for lower rates, the EDR signals tend to overestimate the rate since The opposite is observed for higher rates. The estimation error, however, is larger at wider bandwidths. The relative errors in the estimation of the respiratory rate were evaluated for different ranges of f th (k), and the tendency towards larger errors when f th (k) < 0.1 Hz and f th (k) > 0.4 Hz was present in all EDRs. These errors were 1.93 ± 0.15% for f th (k) < 0.1 Hz, 0.07 ± 0.04% for 0.1 ≤ f th (k) < 0.2 Hz, 0.04 ± 0.02% for 0.2 ≤ f th (k) < 0.3 Hz, 0.06 ± 0.03% for 0.3 ≤ f th (k) < 0.4 Hz, and 0.42 ± 0.07% for f th (k) > 0.4 Hz.
Wave morphology similarity. The distribution of |ρ| and γ r for each dataset is shown in Fig. 7, together with those of r ab (n) and r na (n) for the Sleep dataset. No significant difference between |ρ| and γ r was found in the Drivers dataset, while |ρ| was significantly higher than r γ for all EDR signals in the Fantasia and Sleep datasets. In the Sleep dataset, the correlation was highest between r ab (n) and r th (n) since they both correspond to respiratory effort, while the correlation between r na (n) and r th (n) was comparable to that of the EDR signals. On the other hand, the coherence between the reference respiratory signals was significantly larger than those obtained for the EDR signals. Note that the similarity in the low-quality ECG segments was low also for the reference respiratory signals. This suggests that the SQI actually identifies movement artifacts not only affecting the ECG signals but also the respiratory signals.
A comparison of high-and low-quality segments with q(k) ≤ 80 was performed for each EDR signal independently. As expected, similarity was lower for low-quality segments, however, the lower similarity was significant (p < 0.05) for both the Fantasia and the Sleep datasets, while it was only significant for r n ( ) dw  ,  r n ( ) θ , and  r n ( ) sr for the Drivers dataset. When comparing datasets, |ρ| and γ r were lowest (p < 0.05) for the Drivers dataset for all EDR signals. In contrast, similarity was highest in the Fantasia dataset for all EDR signals, except for r n ( ) r  and r n ( ) a  . The EDR signal most sensitive to noise was r n ( ) . These EDR signals achieved the largest Spearman correlation (0.25) between both |ρ(k)| and k ( ) r γ , and q(k). The similarity measures were also analyzed for different ranges of breathing rate. As with respiratory rate estimation, worse wave morphology approximation was obtained for segments with f th (k) closer to 0.1 Hz or 0.4 Hz. In these ranges, |ρ| and γ r were lower than 0.3, while for 0.1 ≤ f th (k) < 0.4 Hz mean values were always larger than 0.6.
The relationship between the similarity measures and the cutoff frequencies of the high-pass filter used for baseline removal was investigated. These frequencies were 0.1 Hz, 0.3 Hz, 0.5 Hz, and 1 Hz. The original ECG signal and the filtered versions were used to approximate the respiratory wave morphology. The similarity measures were lowest when no baseline wander was removed and highest when the cutoff at 1 Hz was used. It is important to mention that no difference was found between the similarity measures obtained using 0.5 Hz and 1 Hz as cutoff frequencies. For example, |ρ| and r γ values for segments in the Sleep dataset, without baseline removal, were 0.71 ± 0.13 and 0.67 ± 0.16, respectively. After removing the baseline wander using cutoff frequencies of 0.1 A separate analysis was performed to evaluate if the EDR signals capture ECG baseline information. To this end, the baseline was computed using a cutoff frequency of 0.5 Hz and the EDR signals were computed using the original ECG signals (i.e., before applying the baseline removal step). The similarity measures |ρ| and r γ were then computed between the baseline and the different EDR signals, and values of 0.32 ± 0.15 and 0.37 ± 0.17 were obtained for all EDR signals, respectively. Similar |ρ| and γ r values were obtained between the baseline wander and the reference respiratory signal, namely, 0.34 ± 0.13 and 0.38 ± 0.12, respectively. These results suggest a weak relationship between the baseline wander and the respiratory modulation of the ECG. In other words, the effect of respiration on the ECG signal goes beyond the baseline wander.
In the Sleep dataset, there was a distinction between the similarity during normal activity and apnea events. Correlation and coherence were significantly lower for all EDR signals during OSA and MIX and significantly larger during normal activity ( Table 2).
In order to determine if the similarity was affected negatively by the number of excluded QRS complexes per segment, the correlation between this number and the wave morphology similarity was computed for the high-quality segments. Correlation values for all datasets together and all EDR signals were − 0.04 ± 0.01, confirming that the estimation of wave morphology is not affected by the few ( ≃ 2 for q(k) > 80) excluded QRS complexes.
After comparing |ρ| and γ r for high-quality segments, the EDR signals which best captured the respiratory changes in time and frequency were r n ( ) Cardiorespiratory interactions. The estimation errors related to cardiorespiratory interaction, quantified using γ k ( ) xy and T x→y (k), were evaluated at high signal quality, i.e., q(k) > 80. The average estimation error of T x→y (k) was 50% for all EDR signals and datasets, while the average error of γ k ( )  and T x→y (k) were able to identify respiratory events (Fig. 5). In other words, their absolute values were significantly larger (p < 0.05) for normal activity than for apnea events, capturing "weaker" cardiorespiratory interactions during apnea events. This effect was observed for all EDR signals except for r n ( ) a  . Table 3 presents the EDR signals that achieved the highest similarity and the lowest estimation errors in the different datasets. The marked signals produced results that were not significantly different (p > 0.05) among each other but were significantly different (p < 0.05) than those produced by the signals not indicated in the table. Results are indicated for the high-and low-quality segments. In 6 out of the 9 experiments, there was no significant difference between the estimated errors produced by the different EDR signals in the low-quality segments.
Relationship with the spectral properties of the respiration. The relationship between the five performance measures and the spectral characteristics (b(k) and m(k)) of the respiratory signal was evaluated using Pearson's and Spearman's correlation coefficients. All results were comparable for both coefficients and for all EDR signals, except for  r n ( ) a , where correlation values close to 0.1 were achieved for all parameters. For the other 9 EDR signals, the average correlation coefficients between the similarity measures and b(k) and m(k) were −0.5. These results suggest a weak inverse linear relationship between the spectral complexity of the reference signal and its morphologic approximation. Concerning the relationship between the respiratory rate, cardiorespiratory interactions, and b(k) and m(k), a weak one was observed with correlation coefficients lower than 0.4.

Discussion
In total, 10 EDR algorithms were compared in 3 datasets and 3 different tasks (i.e., 9 experiments). The EDR signals that performed best in at least 5 experiments and in high quality segments were  r n ( ) Even though those 4 signals were the most consistent with respect to the approximation of wave morphology, respiratory rate, and cardiorespiratory interactions, there are some considerations that need to be taken into account when selecting an EDR signal for a particular task, namely the expected level of noise and the type of respiratory dynamics.
Concerning the estimation of respiratory rate, r n ( ) dw  ,  θ r n ( ), and  r n ( ) sr consistently produced the lowest errors for all 3 datasets. The estimated errors were smaller when the respiratory rate was between 0.1 Hz and 0.4 Hz. This is in agreement with previous studies that reported that the estimation of respiratory rate works better within this range 53,54 . The errors also increased with bandwidth and modes of the power spectrum of the reference respiratory signal (Fig. 6), but were the lowest when compared against the other EDR signals. For example,  r n ( ) rs and r n ( ) sr  were found to be more robust than r n ( ) r  , as shown in 55 , and than r n ( ) p  and  r n ( ) k when broadening the respiratory bandwidth, as shown in this study. This represents a disadvantage of PCA methods since they are not only sensitive to outliers but also to the complexity of the reference respiratory signal. The errors were largest in the Drivers dataset and lowest in the Sleep dataset, with intermediate values for the Fantasia dataset. Although subjects in the Fantasia dataset were "at rest" in a supine position, they were still awake. This means that their respiratory patterns were more complex due to various respiratory and cardiac modulators that act during wake periods 4 . This was confirmed by the spectral parameters, namely b(k) and m(k), which were larger in the Fantasia dataset than during sleep (see Fig. 2(d)). Furthermore, a weak inverse relationship was found between these parameters and the estimation errors, which could explain the tendency to larger errors in the Fantasia dataset.
The lowest correlation and coherence were observed in the Drivers dataset, containing generally higher noise levels. Even though subjects had some minutes of relaxation before driving, these periods were not completely stress-free. Subjects were still moving around and some discomfort was observed during the recording of the signals 43 . As a result, the morphology of the ECG is affected not only by respiratory modulation but also by movement, thereby reducing the performance of the EDR signals.
The SQI of the Sleep dataset was significantly lower than that of the Fantasia dataset although larger than for the Drivers dataset. This could be explained by the effect apneas have on the morphology of both ECG and respiratory signals 6 . Despite these morphological ECG changes, the proposed procedure to identify aberrant QRS complexes was able to detect these changes as normal since no difference between the number of excluded complexes was found between normal and apnea segments.
When combining all datasets, the strongest relationship between similarity and signal quality was observed for , suggesting that these algorithms are the most sensitive to poor signal quality. The results obtained in the Fantasia dataset were compared to those reported in 32 , where two models were used to extract the EDR on the same dataset. One model was based on Gaussian processes (GP) and another on phase space reconstruction (PSR). The same parameter |ρ| was used in 32 to evaluate performance and mean values of 0.703 and 0.643 were obtained for all 40 subjects using the GP and PSR, respectively. In this study, similar values of |ρ| were obtained for r n ( ) dw  , θ  r n ( ), and r n ( ) sr  , namely, 0.717, 0.708, and 0.708, respectively. The advantage of the methods evaluated in this work is that they do not assume independence as in GP and do not require the definition of an embedding as in PSR.
The similarity measures were also computed between the respiratory effort measured around the thorax and the other two respiratory signals available in the Sleep dataset, namely, the respiratory effort around the abdomen and the nasal airflow. As expected, the correlation between the two effort signals were highest, while the correlation between r th (n) and r na (n) was comparable to the ones obtained between r th (n) and most of the EDR signals. This can be explained by the fact that the effort might still be present during obstructive events, whereas the airflow is completely interrupted. Spectral coherence demonstrated that the different EDR signals achieved results comparable to those obtained with the reference respiratory signals, suggesting that spectral information related Scientific RepoRtS | (2020) 10:5704 | https://doi.org/10.1038/s41598-020-62624-5 www.nature.com/scientificreports www.nature.com/scientificreports/ to respiratory effort can be accurately extracted from the morphological changes of the ECG 10 . If the airflow needs to be analyzed, as typically done in sleep diagnosis, the use of EDR signals may not necessarily be the best option.
After analyzing the correlation and coherence between the reference respiration and the different EDR signals, it was found that the best results were obtained for  r n ( ) . These signals are simple as they are based on amplitudes and slopes around the R-waves, offering an advantage over the other techniques. Signals based on PCA, i.e.,  r n ( ) p and r n ( ) k  , on the other hand, tend to outperform other simpler techniques during quasi-stationary conditions 12 . However, these methods require eigendecomposition of a (kernel) matrix which is computationally demanding. Moreover, PCA-based methods are extremely sensitive to outliers 56,57 as demonstrated by the poor performance in the Drivers dataset (Table 3). It is well-known that the classical PCA solution is optimal from a least-squares perspective, but also known to be very sensitive to outliers of low signal-to-noise ratio. Therefore, future work could focus on the improvement of PCA-based algorithms by including weighting schemes 58 or probabilistic approaches 59 .
When comparing the performance of PCA and kPCA, it is clear that there is no benefit in using the latter. The potential advantage of kPCA was not observed on any of the 3 datasets since its performance was very similar to its linear counterpart. This raises the question whether the effect of respiration on ECG morphology is linear and nonlinearities are negligible. Nevertheless, this result could be also caused by the wrong model estimation in kPCA. Future studies should address this problem.
In addition to analyzing respiratory rate and wave morphology, the quantification of cardiorespiratory interactions, using the tachogram and the EDR signals, was evaluated. These interactions were quantified by the coherence between the signals, denoted γ k ( ) xy , and the predictability of the tachogram from the EDR, denoted T x→y (k). The errors obtained in both cases indicate that the coherence between the signals can be estimated with, on average, errors lower than 0.2 times the reference value. On the other hand, the errors were much larger when predictability was estimated. This can be related either to possible delays in the estimation of an EDR signal that might affect the construction of the autoregressive model used in the calculation of T x→y (k), or to poor morphology approximation.
In general, the errors in the estimation of cardiorespiratory parameters were significantly larger during apnea events than during normal activity for all EDR signals. This can be explained by the fact that during an apnea event, in particular during obstructive events, the respiratory effort might still be present while no air is entering the lungs. As a result, the information captured by the EDR might be unrelated to respiration, thereby causing an overestimation of the cardiorespiratory interactions. In other words, any EDR signal captures the chest movements while the cardiorespiratory information may capture the actual filling of the lungs and its effect on the heart rate.
It is worth noting that the information of different EDR signals can be fused in order to improve the estimation of the respiratory information. Several fusion methodologies have been proposed in the literature, especially for estimating the respiratory rate 60 . Such fusion has been shown to be more effective when combining EDR signals containing complementary (i.e., non-redundant) respiratory information. Future work could focus on evaluating whether the fusion of the best-performing EDR signals in this study could result in an increased performance or if an improvement could be reached by fusing the worst-performing ones.
Finally, this study evaluated performance on three datasets, recorded with different equipment. The Drivers and the Fantasia datasets were collected using "ambulatory" systems, while the Sleep dataset was recorded using polysomnography. This means that  r n ( ) dw , r n ( )  θ , r n ( ) sr  , and  r n ( ) cm produced the best results not only during different physiological conditions but also for different recording systems. However, before concluding which signals can be used interchangeably the following considerations need to be taken into account. On the one hand, r n ( ) cm  requires R-and S-wave delineation, and additional computations to quantify the changes in the morphology of the interval between these two fiducial points by means of the 4th order central moment. On the other hand,  r n ( ) dw ,  θ r n ( ), and r n ( ) sr  only require the detection of the R-wave and a definition of a fixed window around it, which makes them computationally simpler than  r n ( ) cm . Therefore, since QRS detection is much simpler that QRS delineation,  r n ( ) dw , θ  r n ( ), and  r n ( ) sr are selected as the best performing and simplest to compute EDR signals.

Conclusions
This study showed that the simplest methods for derivation of respiratory information, namely methods exploring either morphological changes in the segment between the R-wave and the S-wave or the slope range of the QRS complex, can be used to accurately estimate the respiratory wave morphology, respiratory rate, and cardiorespiratory information from the ECG. This result is concluded from analyzing different physiological conditions such as periods of relaxation, stress, and both normal and distorted sleep patterns. Furthermore, real life conditions like changes in baseline, transients, artifacts and noise were also considered for evaluation. These findings are crucial for the development of ambulatory systems that can monitor cardiorespiratory parameters using cheap and easy-to-use technology.