Features from the photoplethysmogram and the electrocardiogram for estimating changes in blood pressure

There is a growing emphasis being placed on the potential for cuffless blood pressure (BP) estimation through modelling of morphological features from the photoplethysmogram (PPG) and electrocardiogram (ECG). However, the appropriate features and models to use remain unclear. We investigated the best features available from the PPG and ECG for BP estimation using both linear and non-linear machine learning models. We conducted a clinical study in which changes in BP (ΔBP) were induced by an infusion of phenylephrine in 30 healthy volunteers (53.8% female, 28.0 (9.0) years old). We extracted a large and diverse set of features from both the PPG and the ECG and assessed their individual importance for estimating ΔBP through Shapley additive explanation values and a ranking coefficient. We trained, tuned, and evaluated linear (ordinary least squares, OLS) and non-linear (random forest, RF) machine learning models to estimate ΔBP in a nested leave-one-subject-out cross-validation framework. 
We reported the results as correlation coefficient ($\rho_p$), root mean squared error (RMSE), and mean absolute error (MAE). The non-linear RF model significantly ($p < 0.05$) outperformed the linear OLS model using both the PPG and the ECG signals across all performance metrics. Estimating ΔSBP using the PPG alone ($\rho_p$ = 0.86 (0.23), RMSE = 5.66 (4.76) mmHg, MAE = 4.86 (4.29) mmHg) performed significantly better than using the ECG alone ($\rho_p$ = 0.69 (0.45), RMSE = 6.79 (4.76) mmHg, MAE = 5.28 (4.57) mmHg), all $p < 0.001$. The highest ranking features from the PPG largely modelled increasing reflected wave interference driven by changes in arterial stiffness. This finding was supported by changes observed in the PPG waveform in response to the phenylephrine infusion. However, a large number of features were required for accurate BP estimation, highlighting the high complexity of the problem. We conclude that the PPG alone may be further explored as a potential single source, cuffless, blood pressure estimator. The use of the ECG alone is not justified. Non-linear models may perform better as they are able to incorporate interactions between feature values and demographics. However, demographics may not adequately account for the unique and individualised relationship between the extracted features and BP.


SI: 2 Additional details of ECG feature extraction
Complexity and entropy features were extracted from each segmented ECG signal within the window $w$. For a given ECG segment, $x$, of length $N_x$, the following features were extracted:
Hjorth parameters. The Hjorth mobility and complexity parameters indicate activity variations in a signal [1]. The mobility parameter represents the signal's mean frequency. The signal complexity is an estimate of the signal bandwidth. Hjorth mobility and complexity are computed as:
$$\mathrm{mobility}(x) = \sqrt{\frac{\mathrm{var}(x')}{\mathrm{var}(x)}}, \qquad \mathrm{complexity}(x) = \frac{\mathrm{mobility}(x')}{\mathrm{mobility}(x)}$$
where $\mathrm{var}(\cdot)$ denotes the variance and $x'$ is the first derivative of $x$.
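The Hjorth parameters can be illustrated with a minimal sketch, assuming the standard definitions (mobility as the square root of the ratio of the variance of the derivative to the variance of the signal, and complexity as the ratio of the mobility of the derivative to the mobility of the signal). The first difference used as the derivative, the function names, and the sine-wave test segment are assumptions made for illustration and are not taken from the study.

```python
import math
from statistics import pvariance


def first_difference(x):
    """Discrete approximation of the derivative: x[n+1] - x[n]."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_mobility(x):
    """Square root of var(derivative) / var(signal)."""
    return math.sqrt(pvariance(first_difference(x)) / pvariance(x))


def hjorth_complexity(x):
    """Mobility of the derivative divided by mobility of the signal."""
    return hjorth_mobility(first_difference(x)) / hjorth_mobility(x)


# Hypothetical segment: 5 Hz sine sampled at 500 Hz for one second.
segment = [math.sin(2 * math.pi * 5 * n / 500) for n in range(500)]
mobility = hjorth_mobility(segment)
complexity = hjorth_complexity(segment)
```

For a pure sinusoid the complexity is close to 1, and it grows as the waveform departs from a sine shape.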
Fractal dimension. The fractal dimension of an ECG signal segment provides a measure of self-similarity. The fractal dimension of a time series is a number between 1 (a straight line) and 2 (a curve that fills a surface), where a higher number represents a signal with higher fluctuations and complexity. To calculate the fractal dimension, we used the Higuchi algorithm [2], in which the signal length is estimated at increasing length scales $k$ (up to $k_{max}$). The fractal dimension is then defined as the slope of the relationship between the logarithm of the signal length and the logarithm of the inverse length scale. Brezinski [3] showed that increasing $k_{max}$ beyond a value of 17 resulted in only a marginal increase in the fractal dimension for regular ECG beats (defined as an ECG without arrhythmia episodes). Therefore, we set $k_{max}$ to 17 in this work.
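As a rough illustration of the Higuchi procedure, the sketch below estimates the curve length $L(k)$ at each length scale and fits the slope of $\log L(k)$ against $\log(1/k)$ by least squares. It follows the commonly published form of the algorithm. The function name, the least-squares fit, and the test signals are assumptions, and the authors' implementation may differ in detail.

```python
import math


def higuchi_fd(x, k_max=17):
    """Estimate the Higuchi fractal dimension of the sequence x."""
    n = len(x)
    if n <= k_max:
        raise ValueError("signal must be longer than k_max")
    log_inv_k, log_length = [], []
    for k in range(1, k_max + 1):
        curve_lengths = []
        for m in range(k):  # 0-based starting offset of the subsampled curve
            n_steps = (n - 1 - m) // k
            if n_steps < 1:
                continue
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, n_steps + 1))
            # Normalise for the number of samples used at this offset.
            curve_lengths.append(dist * (n - 1) / (n_steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        log_length.append(math.log(sum(curve_lengths) / len(curve_lengths)))
    # Least-squares slope of log L(k) against log(1/k).
    mean_u = sum(log_inv_k) / len(log_inv_k)
    mean_v = sum(log_length) / len(log_length)
    num = sum((u - mean_u) * (v - mean_v)
              for u, v in zip(log_inv_k, log_length))
    den = sum((u - mean_u) ** 2 for u in log_inv_k)
    return num / den
```

A straight line gives a value of 1, while white noise gives a value near 2, matching the range described above. A constant signal has zero curve length and is not handled by this sketch.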
Shannon's entropy (SE) was used to measure the uncertainty of the information content in a time series based on its probability distribution. The probability distribution of the time series was estimated by a normalised histogram. SE was then computed as:
$$\mathrm{SE} = -\sum_{k} p_k \log\left(\frac{p_k}{w_k}\right)$$
where $p_k$ and $w_k$ are the probability and width of the $k$th bin of the histogram.
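A histogram-based entropy estimate of this kind can be sketched as follows, assuming the form $-\sum_k p_k \log(p_k / w_k)$ with equal-width bins and the natural logarithm. The number of bins, the logarithm base, and the function name are assumptions for illustration, since they are not stated in the text.

```python
import math


def shannon_entropy_hist(x, n_bins=10):
    """Entropy of x estimated from a normalised equal-width histogram."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("signal is constant; histogram width is zero")
    counts = [0] * n_bins
    for value in x:
        # The maximum value falls on the right edge; keep it in the last bin.
        idx = min(int((value - lo) / width), n_bins - 1)
        counts[idx] += 1
    entropy = 0.0
    for count in counts:
        if count:  # empty bins contribute nothing
            p = count / len(x)
            entropy -= p * math.log(p / width)
    return entropy
```

Dividing each probability by the bin width makes the estimate depend on the amplitude scale of the signal: doubling the amplitude raises the result by $\log 2$.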
Approximate entropy (approxEnt) quantifies the regularity of a time series as the likelihood that similar patterns of observations will not be followed by additional similar observations. For example, a time series containing many repetitive patterns has a small approxEnt. approxEnt was computed using the following steps:
1. Divide the signal, $x$, into consecutive segments of length $m$. Following the work of Li [4], $m = 2$.
2. For each segment, $i$, compute the Chebyshev distance to all other segments and calculate $C_i^m$ as the number of segments with a Chebyshev distance less than $r$ to the $i$th segment. Following the work of Li [4], $r = 0.2$.
3. Define $\phi^m(r)$, which summarises the average (log) number of segments of length $m$ that are suitably similar to each other within a tolerance of $r$:
$$\phi^m(r) = \frac{1}{N_x - m + 1} \sum_{i=1}^{N_x - m + 1} \log C_i^m$$
4. To compare $\phi^m(r)$ to the subsequent data point, increase the dimension to $m+1$ and compute $\phi^{m+1}(r)$.
5. The approximate entropy is computed as:
$$\mathrm{approxEnt} = \phi^m(r) - \phi^{m+1}(r)$$
Sample entropy (sampEnt) is a modification of approxEnt in which a segment is never compared to itself. In approxEnt, the comparison between each segment and the rest of the segments also includes a comparison with itself; as a result, signals are interpreted as more regular than they actually are. These self-matches are excluded in sampEnt, which reduces bias and gives a more stable estimate of entropy. We computed sampEnt using the same hyperparameters ($m$ and $r$) as approxEnt.
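The difference between the two entropies can be seen in a short sketch, assuming the commonly used formulations: approxEnt as $\phi^m(r) - \phi^{m+1}(r)$ with self-matches counted, and sampEnt as $-\log(A/B)$, where $B$ and $A$ count matching pairs of distinct segments of length $m$ and $m+1$. The function names are hypothetical, and the tolerance is applied here as an absolute value of 0.2, which is an assumption because the text does not say whether $r$ is scaled by the signal's standard deviation.

```python
import math


def _chebyshev(a, b):
    """Largest absolute coordinate-wise difference between two segments."""
    return max(abs(u - v) for u, v in zip(a, b))


def _phi(x, m, r):
    """Mean log count of segments within r of each segment (self included)."""
    segs = [x[i:i + m] for i in range(len(x) - m + 1)]
    total = 0.0
    for a in segs:
        count = sum(1 for b in segs if _chebyshev(a, b) < r)
        total += math.log(count)
    return total / len(segs)


def approx_entropy(x, m=2, r=0.2):
    return _phi(x, m, r) - _phi(x, m + 1, r)


def sample_entropy(x, m=2, r=0.2):
    def matches(length):
        # Same number of templates for both lengths; self-matches excluded.
        segs = [x[i:i + length] for i in range(len(x) - m)]
        return sum(1 for i, a in enumerate(segs) for j, b in enumerate(segs)
                   if i != j and _chebyshev(a, b) < r)

    b_count, a_count = matches(m), matches(m + 1)
    if a_count == 0 or b_count == 0:
        return float("inf")  # no matches found: entropy is undefined
    return -math.log(a_count / b_count)
```

Both functions compare every pair of segments, so the cost grows quadratically with segment length; this is acceptable for short ECG windows but slow for long recordings.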
Multi-scale entropy (MSE) is an extension of sample entropy in which sampEnt is applied to the signal at increasingly coarse scales. For the $s$th scale, the original signal samples are grouped into non-overlapping windows of length $s$ and the samples in each window are averaged. sampEnt is then applied to this averaged signal. We computed MSE at scales 2, 4, 6 and 8.
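The coarse-graining step can be sketched as below. The entropy estimator is passed in as a function so that the example stays self-contained; in practice it would be a sample entropy implementation. The function names are hypothetical, and discarding trailing samples that do not fill a complete window is an assumption.

```python
def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n_windows = len(x) // scale  # trailing samples that do not fill a window are dropped
    return [sum(x[i * scale:(i + 1) * scale]) / scale
            for i in range(n_windows)]


def multiscale_entropy(x, entropy_fn, scales=(2, 4, 6, 8)):
    """Apply `entropy_fn` to the coarse-grained signal at each scale."""
    return {s: entropy_fn(coarse_grain(x, s)) for s in scales}
```

Each coarse-grained signal is shorter than the original by a factor of $s$, so entropy estimates at the largest scales rest on fewer samples.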

SI: 3 Results for mean arterial pressure and diastolic blood pressure
PWC: participant-wise correlation; q1 and q3 are the first and third quartile PWV values across the cohort.

Figure SI 4 shows the median SBP, MAP, and DBP ranking coefficients for feature importance based on SHAP values for both (a) RF and (b) LASSO+OLS. We quantified the agreement between the feature ranks for each pair of SBP, MAP, and DBP using the Kendall rank correlation coefficient, $\rho_k$ [5]. The feature importance rankings for SBP, MAP, and DBP estimation showed strong agreement with each other ($\rho_k > 0.6$ for all).
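Agreement between two feature rankings can be illustrated with a minimal sketch of the Kendall rank correlation. The tau-b variant, which corrects for ties, is used here as an assumption, since the text does not state which variant was applied. The function name and the example ranks are hypothetical.

```python
import math


def kendall_tau_b(a, b):
    """Kendall rank correlation (tau-b) between two equal-length sequences."""
    n = len(a)
    concordant = discordant = tied_a = tied_b = 0
    for i in range(n):
        for j in range(i + 1, n):
            da, db = a[i] - a[j], b[i] - b[j]
            if da == 0:
                tied_a += 1
            if db == 0:
                tied_b += 1
            if da != 0 and db != 0:
                if da * db > 0:
                    concordant += 1
                else:
                    discordant += 1
    n_pairs = n * (n - 1) // 2
    denom = math.sqrt((n_pairs - tied_a) * (n_pairs - tied_b))
    return (concordant - discordant) / denom


# Hypothetical ranks of four features under two BP targets.
ranks_sbp = [1, 2, 3, 4]
ranks_dbp = [1, 3, 2, 4]
tau = kendall_tau_b(ranks_sbp, ranks_dbp)
```

In this example one of the six feature pairs is ordered differently between the two rankings, giving a coefficient of about 0.67.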