Machine learning based algorithms to impute PaO2 from SpO2 values and development of an online calculator

Ren, Shuangxia; Zupetic, Jill A.; Tabary, Mohammadreza; DeSensi, Rebecca; Nouraie, Mehdi; Lu, Xinghua; Boyce, Richard D.; Lee, Janet S.

doi:10.1038/s41598-022-12419-7

Article
Open access
Published: 17 May 2022

Scientific Reports volume 12, Article number: 8235 (2022)

Abstract

We created an online calculator using machine learning (ML) algorithms to impute the partial pressure of oxygen (PaO₂)/fraction of delivered oxygen (FiO₂) ratio using the non-invasive peripheral saturation of oxygen (SpO₂) and compared the accuracy of the ML models we developed to published equations. We generated three ML algorithms (neural network, regression, and kernel-based methods) using seven clinical variable features (N = 9900 ICU events) and subsequently three features (N = 20,198 ICU events) as input into the models. Data from mechanically ventilated ICU patients were obtained from the publicly available Medical Information Mart for Intensive Care (MIMIC III) database and used for analysis. Compared to seven features, three features (SpO₂, FiO₂ and PEEP) were sufficient to impute PaO₂ from the SpO₂. Any of the ML models enabled imputation of PaO₂ from the SpO₂ with lower error and showed greater accuracy in predicting PaO₂/FiO₂ ≤ 150 compared to the previously published log-linear and non-linear equations. To address potential hidden hypoxemia that occurs more frequently in Black patients, we conducted a sensitivity analysis and showed that ML models outperformed published equations in both Black and White patients. Imputation using data from an independent validation cohort of ICU patients (N = 133) showed greater accuracy with ML models.

Introduction

The ratio of the partial pressure of oxygen (PaO₂) to the fraction of oxygen (FiO₂) delivered, or the PaO₂/FiO₂, is the reference standard measurement for the assessment of low blood oxygen levels, or hypoxemia, in mechanically ventilated patients with respiratory failure. The PaO₂/FiO₂ ratio (PF ratio) has predictive value for mortality in patients with acute respiratory distress syndrome (ARDS)¹ and is also part of a severity index scoring system called the Sequential Organ Failure Assessment (SOFA) score that is used to predict severity of illness in patients with critical illness^2,3,4. Additionally, the PF ratio is relevant to clinical decision-making, including the decision to initiate prone positioning in ARDS patients with PF ratios ≤ 150⁵. Currently, measurement of the PF ratio requires invasive arterial blood gas (ABG) sampling and does not provide a continuous measure of the patient’s oxygenation. Increasingly, non-invasive monitoring with pulse oximetry is utilized instead of ABGs^6,7, particularly in low-resource settings where ABG monitoring may not be readily available. In contrast to invasive blood gas sampling, the SpO₂ (peripheral saturation of oxygen)/FiO₂ ratio (SF ratio) can be calculated without blood collection, arterial puncture, or blood gas analyzers and may serve as a surrogate for the PaO₂/FiO₂ ratio. Notably, several studies have evaluated the SF ratio in children, where non-invasive measurements are increasingly favored^8,9,10.

A few studies have examined non-linear imputation of PaO₂/FiO₂ from SpO₂/FiO₂ measurements recorded at the same time^11,12. These studies have reported that the accuracy of non-linear imputation is superior to log-linear or linear imputation, especially for moderate to severe hypoxemic respiratory failure with ARDS where the PF ratio is < 200^11,13. However, in patients with respiratory failure requiring mechanical ventilation, the optimal equation for imputation of PaO₂/FiO₂ from the SpO₂/FiO₂ remains unclear. An algorithm to accurately impute the PaO₂ from the SpO₂ in mechanically ventilated patients would be beneficial for predictive modeling and clinical research to facilitate recruitment of patients for clinical trials if an ABG is not available. Ideally, this approach would include only variables that contribute to the relationship between SpO₂ and PaO₂ but would not require the same invasive ABG measurement as the PaO₂. From the clinical perspective, the SF ratio can serve as a less invasive surrogate for the PF ratio in diagnosing ARDS or acute lung injury (ALI), with comparable reliability¹⁴.

The objective of this study is to develop a calculator utilizing machine learning algorithms to impute PaO₂ using non-invasive SpO₂ measurements from mechanically ventilated patients in the Medical Information Mart for Intensive Care (MIMIC) III database¹⁵ and to compare the accuracy of the machine learning models to the previously published non-linear and log-linear equations^11,13. In this study, three common machine learning approaches (neural network¹⁶, regression¹⁷, and kernel-based methods^18,19) were tested for regression and classification tasks using data available in MIMIC III²⁰ with 7 clinical variable features and a subsequent 3-feature model. We created models to perform a regression task to impute PaO₂ from SpO₂ values and a classification task to predict patients with moderate to severe hypoxemic respiratory failure based on a cut-off of a predicted PF ratio ≤ 150¹¹. Our overall hypothesis is that a machine learning algorithm would perform better in predicting the PaO₂ from SpO₂ across the entire span of SpO₂ values when compared to the previously published equations.

Methods

The MIMIC-III database v1.4 (https://mimic.physionet.org) is an openly available dataset developed by the Massachusetts Institute of Technology Lab for Computational Physiology¹⁵. It contains de-identified health data associated with approximately 40,000 intensive care unit admissions for patients admitted to critical care units in the Beth Israel Deaconess Medical Center between 2001 and 2012. MIMIC-III is a relational database that contains information on demographics, vital signs, mechanical ventilation status, laboratory tests, medications, and mortality. We also utilized a validation cohort obtained from an existing database of de-identified clinical information from intensive care unit patients with Pseudomonas aeruginosa respiratory isolates from 2 hospitals within the University of Pittsburgh Medical Center (UPMC). This dataset similarly contains information on demographics, mechanical ventilation status, ventilator parameters and laboratory tests. Our study utilizing the MIMIC-III database was determined to be exempt by the University of Pittsburgh Institutional Review Board (STUDY19100068). The University of Pittsburgh Institutional Review Board approved the Pseudomonas aeruginosa ICU respiratory isolates database with a waiver of informed consent (STUDY21030010) and also approved the use of this database as an independent validation cohort (STUDY21090073). All methods were carried out in accordance with relevant guidelines and regulations.

Data processing

For the MIMIC-III database, we identified unique ICU encounters (icustay_id) with mechanical ventilation status. We next identified the lab event PaO₂ and chart event SpO₂ occurring at the same time as the mechanical ventilation status. In order to minimize error between matched PaO₂ and SpO₂, we constrained the time gap between the lab event PaO₂ and the chart event SpO₂ to be no more than 30 min. To minimize repeated sampling from the same subjects, we restricted the search of PaO₂ measurements to the first 24 h of mechanical ventilation and obtained the first PaO₂ recorded within this time frame. For chart events including tidal volume (TV), positive end-expiratory pressure (PEEP), FiO₂, temperature, and mean arterial pressure (MAP), we constrained the time gap to within 2 h of the selected SpO₂ measurement. Treatment with vasoactive infusions was recorded as a categorical variable. Data extraction and processing methods are available at https://github.com/renshuangxia/PaO2PredictorDjango²¹. The online calculator is available at https://dikb.org/pa02-predictor.
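The 30-min matching constraint described above can be sketched as follows. This is a minimal illustration, not the authors' extraction code (which is in the linked repository): the function name `match_spo2`, the event tuples, and the choice to keep the nearest SpO₂ reading when several fall inside the window are all assumptions made for the example.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=30)  # time-gap limit stated in the text


def match_spo2(pao2_time, spo2_events, max_gap=MAX_GAP):
    """Return the (time, value) SpO2 event closest to pao2_time.

    Returns None when no SpO2 event lies within max_gap.
    Picking the *nearest* event is an assumption of this sketch.
    """
    best = None
    for event_time, value in spo2_events:
        gap = abs(event_time - pao2_time)
        if gap <= max_gap and (best is None or gap < best[0]):
            best = (gap, event_time, value)
    return None if best is None else (best[1], best[2])


# Hypothetical timestamps and values, for illustration only.
pao2_time = datetime(2020, 1, 1, 8, 0)
spo2_events = [
    (datetime(2020, 1, 1, 7, 15), 94),  # 45 min away: outside the window
    (datetime(2020, 1, 1, 7, 50), 96),  # 10 min away
    (datetime(2020, 1, 1, 8, 20), 97),  # 20 min away
]
matched = match_spo2(pao2_time, spo2_events)
```

The same helper could be reused with a 2-h `max_gap` for the other chart events (TV, PEEP, FiO₂, temperature, MAP), anchored on the selected SpO₂ time instead of the PaO₂ time.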

For the 3-feature model in the UPMC validation cohort, the database was queried for unique ICU patients requiring mechanical ventilation. The validation set cases include 133 discrete individuals with ABGs obtained within 30 min of an SpO₂ reading, similar to the constraints defined in the MIMIC-III derivation cohort.

Machine learning methods for regression task

For the regression task we implemented 3 different models: a neural network model, a linear regression model, and support vector regression (SVR), a type of kernel-based modeling. For each model, we applied a tenfold cross-validation²².
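As a rough illustration of tenfold cross-validation, the sketch below splits sample indices into ten folds and yields one train/test pair per fold. The helper name `kfold_indices`, the shuffling seed, and the round-robin fold assignment are assumptions of this example; the text does not specify how the folds were drawn.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round-robin assignment keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Small hypothetical sample count, for illustration only.
splits = list(kfold_indices(25, k=10))
```

Each sample appears in exactly one test fold, so every observation contributes once to the held-out error estimate.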

For the neural network model, we tested different network structures and various numbers of features to arrive at two models used for comparison with the linear and support vector regression models. One model used seven input features and three hidden layers (16, 8, 5 neurons for layers 1–3). The other model used only three input features and two hidden layers (6, 3 neurons for layers 1 and 2). Both final models used a tangent activation function for all layers except the output layer which used a linear function in both models. Also, both models were trained for 200 epochs with Adam optimizer using gradient descent. The learning rate was 0.001 and the batch sizes were 50 for both models.
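To make the two topologies concrete, the sketch below builds fully connected networks with the stated hidden-layer sizes and runs a forward pass. It is not the trained model: the weights are random, a single output unit is assumed (one imputed PaO₂ value), "tangent" is read here as the hyperbolic tangent, and the helper names are hypothetical. Training with Adam is omitted.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """tanh on hidden layers, linear (identity) activation on the output layer."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        x = z if is_output else [math.tanh(v) for v in z]
    return x


def count_parameters(layers):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


# Layer sizes from the text; the single output unit is an assumption.
three_feature_net = init_layers([3, 6, 3, 1])
seven_feature_net = init_layers([7, 16, 8, 5, 1])
output = forward(three_feature_net, [0.96, 0.5, 5.0])  # hypothetical SpO2, FiO2, PEEP
```

Under these assumptions the three-feature network has 49 trainable parameters and the seven-feature network has 315, which gives a sense of how small both models are.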

For the linear regression model, the output variable can be computed by a linear combination of the input variables. We trained the linear regression equation by the Ordinary Least Squares approach. We used the linear_model.LinearRegression method from scikit-learn 0.22 (https://scikit-learn.org/stable/) with default hyperparameters for predicting PaO₂ values.

For the SVR model, we tested multiple kernels, including a linear kernel, a polynomial kernel, and a radial basis function (RBF) kernel. Based on the performance in the training data, the RBF kernel was selected.

Machine learning methods for classification task

We utilized PaO₂/FiO₂ ≤ 150, an accepted threshold previously utilized to capture patients with moderate to severe disease meeting the criteria for ARDS^11,13. We utilized this cut-off to test machine learning methods to predict this diagnostic threshold PaO₂/FiO₂ ≤ 150 for the different imputation techniques. We implemented three classification models including neural network, logistic regression, and a kernel-based model, SVM.

For each machine learning model, we applied a tenfold cross-validation and calculated the sensitivity, specificity, likelihood ratios, diagnostic Odds Ratio (OR), Area Under Receiver Operating Characteristic curve (AUROC), F1 score and Bayesian Information Criterion (BIC) to compare across models. The two neural network models for classification were similar to the neural networks used in regression, except the output layer used the sigmoid function. As with the regression models, various topologies were tested to arrive at the final two multi-layer perceptron (MLP) classifiers, one with an input size of seven features and the other with an input size of three features. The hidden layer sizes were (12, 8, 6, 4, 4) for the model with seven input features. For the other model, which utilizes only three input features, we used two hidden layers of size 6 and 3. All hidden layers used the tangent activation function. We trained both models for 200 iterations with the Adam optimizer, setting the momentum value to 0.8 for the seven-feature classifier and 0.6 for the three-feature classifier. The learning rate was 0.001 and the batch size was 200 for both models.
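The threshold-based metrics listed above all follow from a 2 × 2 confusion matrix. The sketch below uses the standard textbook definitions; treating PaO₂/FiO₂ ≤ 150 as the positive class is an assumption of this example, and the counts are hypothetical. AUROC and BIC are not covered, since they need predicted scores and a likelihood rather than counts.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic metrics from confusion-matrix counts.

    Assumes no cell combination makes a denominator zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "diagnostic_or": (tp * tn) / (fp * fn),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
```

Note that the diagnostic odds ratio equals the positive likelihood ratio divided by the negative likelihood ratio, so the three quantities are not independent.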

In addition, we implemented a basic logistic regression model for classification purposes as well as the SVM model, which classifies examples with an optimal hyperplane. Logistic regression uses the logistic function to model a binary dependent variable. We utilized the linear_model.LogisticRegression method provided in the scikit-learn library without regularization, and other arguments were set as default. For the SVM model, we compared the results obtained with different kernels, and the RBF kernel outperformed the other kernels. Methods were similar to those used in the regression task.

Comparison of machine-learning based algorithm to published non-linear and log-linear equations

We compared the performance of our machine learning algorithms to the previously published equations. For the non-linear equation from Brown et al.¹¹, the PaO₂ was imputed from the SpO₂ as shown in Eq. (1), where PO₂ = PaO₂, S = SpO₂ and F = FiO₂. For situations where the recorded SpO₂ was 100% (or, 1.0), the SpO₂ was substituted with 0.996, given that the equation cannot be evaluated at S = 1.0.

Non-linear equation to impute PaO₂ from the SpO₂ (Reprinted with permission - see Acknowledgment section).

$$\begin{aligned} PO_{2} & = \left\{ \frac{11{,}700}{\frac{1}{S} - 1} + \left[ 50^{3} + \left( \frac{11{,}700}{\frac{1}{S} - 1} \right)^{2} \right]^{1/2} \right\}^{1/3} \\ & \quad + \left\{ \frac{11{,}700}{\frac{1}{S} - 1} - \left[ 50^{3} + \left( \frac{11{,}700}{\frac{1}{S} - 1} \right)^{2} \right]^{1/2} \right\}^{1/3} . \end{aligned}$$

(1)
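Equation (1) can be evaluated directly, as in the minimal sketch below. The function name is hypothetical, and S is taken as a fraction between 0 and 1, consistent with the 0.996 substitution described in the text. The second braced term in Eq. (1) is the cube root of a negative number, so the sketch computes it as the negated real cube root of the positive difference; this is an implementation choice made here, not something stated in the source.

```python
import math


def impute_pao2_nonlinear(spo2_fraction):
    """Impute PaO2 (mmHg) from SpO2 expressed as a fraction, following Eq. (1)."""
    if not 0.0 < spo2_fraction <= 1.0:
        raise ValueError("SpO2 must be a fraction in (0, 1]")
    # Substitution described in the text: S = 1.0 cannot be evaluated.
    s = 0.996 if spo2_fraction == 1.0 else spo2_fraction
    a = 11700.0 / (1.0 / s - 1.0)
    b = math.sqrt(50.0 ** 3 + a ** 2)
    # b > a always, so (a - b) is negative; take the real cube root explicitly.
    return (a + b) ** (1.0 / 3.0) - (b - a) ** (1.0 / 3.0)


pao2_at_half_saturation = impute_pao2_nonlinear(0.5)
```

At S = 0.5 the function returns roughly 26.9 mmHg, close to the textbook P50 of the oxyhemoglobin dissociation curve, which is a useful sanity check on the constants.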

For the log-linear equation from Pandharipande et al.^11,13, the PaO₂/FiO₂ was imputed from SpO₂/FiO₂ utilizing the Eq. (2):

Log-linear equation to impute PaO₂ from the SpO₂ (Reprinted with permission - see Acknowledgment section).

$$PO_{2} = F \cdot 10^{\left( 0.48 + 0.78 \cdot \log_{10} \left( \frac{S}{F} \right) \right)} .$$

(2)
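A minimal sketch of Eq. (2) follows. The function name is hypothetical. The units are an assumption of this example: S is passed in percent and F as a fraction, so that S/F is on the usual SF-ratio scale; the equation as printed does not state the units of S.

```python
import math


def impute_pao2_loglinear(spo2_percent, fio2_fraction):
    """Impute PaO2 (mmHg) from SpO2 and FiO2 using the log-linear Eq. (2)."""
    sf_ratio = spo2_percent / fio2_fraction
    pf_ratio = 10 ** (0.48 + 0.78 * math.log10(sf_ratio))
    return fio2_fraction * pf_ratio


# Hypothetical inputs: SpO2 = 95%, FiO2 = 0.5.
pao2 = impute_pao2_loglinear(95.0, 0.5)
pf_at_or_below_150 = pao2 / 0.5 <= 150
```

Dividing the result by FiO₂ recovers the imputed PF ratio, which is the quantity compared against the ≤ 150 threshold in the classification task.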

Sensitivity analysis

To compare the performance of our machine learning algorithms to previously published equations, a sensitivity analysis was performed by restricting the data to patients of either self-reported White or Black race. For each machine learning model, we implemented a tenfold cross-validation and calculated the sensitivity, specificity, likelihood ratios, diagnostic OR, AUROC, F1 score, RMSE (root-mean-square error), and BIC to compare across models.

Results

A parsimonious three-feature model is sufficient to impute PaO₂/FiO₂ ratio using a large dataset

An overview of the machine learning tasks is outlined in Fig. 1. We initially chose seven relevant features from the chart events (SpO₂, FiO₂, TV, MAP, temperature, PEEP and vasopressor administration) representing recorded bedside measurements that were independent from an invasive arterial blood gas measurement. When applying the seven features to impute the PaO₂, the final data set contained 9900 unique ICU encounters from 9302 mechanically ventilated patients (Supplementary Table e1). The relationship between SpO₂/FiO₂ (S/F) and the PaO₂/FiO₂ (P/F) was examined in dataset 1 containing 9900 unique ICU events from the MIMIC-III database and was best described by a log-linear relationship between the transformed logarithmic value of the SF and PF ratios as previously described by Pandharipande et al.¹³ (Supplementary Fig. e1). The relationship between S/F and P/F ratios showed high variance across the distribution of mechanically ventilated subjects (R² = 0.21).

For the regression task, we derived the RMSE and BIC for each of the different seven feature machine learning models (neural network, linear regression, support vector regression) to assess the performance of the imputation techniques. The RMSE and BIC of the three machine learning methods are shown in Supplementary Table e2. All the machine learning models outperformed the previously published non-linear and log-linear equations as shown by lower RMSE score; the same was observed for subset 1 (SpO₂ < 97%). For the classification task, the three machine learning methods achieved similar classification performance according to F1 scores, as shown in Supplementary Table e3; the same pattern was observed for subset 1 (SpO₂ < 97%).
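For reference, RMSE and one common form of BIC can be computed as in the sketch below. The RMSE definition is standard. The BIC expression shown, n·ln(RSS/n) + k·ln(n), is the usual form for a least-squares model with Gaussian errors and is an assumption of this example; the text does not state which BIC formula or parameter count was used. The data values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def bic_gaussian(observed, predicted, n_params):
    """BIC under a Gaussian error assumption: n*ln(RSS/n) + k*ln(n)."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + n_params * math.log(n)


# Hypothetical observed and imputed PaO2 values (mmHg).
observed = [80.0, 100.0, 120.0, 150.0]
predicted = [90.0, 95.0, 110.0, 160.0]
error = rmse(observed, predicted)
bic = bic_gaussian(observed, predicted, n_params=4)
```

For a fixed fit, BIC grows with the number of parameters, which is why it complements RMSE when comparing a small linear model against a neural network.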

To improve practicality of the method at the bedside, we attempted to use the smallest number of features possible to predict the PaO₂ or PaO₂/FiO₂ ratio from the regression and classification tasks, respectively. Compared to the other measured variables, PEEP had the strongest correlation with PaO₂/FiO₂ (r = − 0.31) outside of the SF ratio (SpO₂/FiO₂) (Table 1). Using this information, we created a 3-feature model using SpO₂, FiO₂ and PEEP. As compared to seven features, three features were sufficient to impute PaO₂/FiO₂ ratio with a similar degree of accuracy. The 3-feature model was therefore utilized in the remainder of the analysis for the machine learning algorithms. The final 3-feature data set (dataset 2) contained 20,198 ICU encounters from 17,818 unique patients (Table 2). Forty percent of subjects were of female sex and the mean age was 64 years. The degree of hypoxemic respiratory failure, as measured by the PaO₂/FiO₂ ratio¹, showed a distribution in which 26% had mild respiratory failure (PaO₂/FiO₂ = 201–300), 22% had moderate respiratory failure (PaO₂/FiO₂ = 101–200), and 8% had severe respiratory failure (PaO₂/FiO₂ ≤ 100).

Table 1 Correlation coefficients between PF ratios and variables.

Table 2 Subject characteristics based on three features.

Machine learning models show improved performance when compared to the prior published equations for regression

We quantitatively derived the RMSE for all of the machine learning and previously published models and the BIC for each of the three machine learning models to assess the performance of the different imputation techniques (Table 3). The RMSE of the neural network, linear regression and SVR machine learning models were 84.7, 88.8 and 85.9, respectively, compared to 117.7 and 91.8 for the log-linear and non-linear equations. The lower RMSE values indicate that the three machine learning models outperformed the previously published equations. Of the machine learning models, the neural network method showed the lowest RMSE as well as the lowest BIC in both the whole dataset (dataset 2) and for SpO₂ < 97% (subset 2). A Bland–Altman Plot suggests that the neural network model is comparable to the published equations (Supplementary Fig. e2). There was decreasing accuracy at higher PaO₂/FiO₂ ratios for all the methods examined.

Table 3 RMSE and BIC of the 3-feature machine learning models regression tasks compared to published methods.

Machine learning models show improved performance for the classification task

We compared the performance of the machine learning models with the log-linear and non-linear equations using F1 scores. Similar to the findings for the regression task, all three machine learning models performed better in the whole dataset than the log-linear and non-linear equations (Table 4). When the dataset was limited to SpO₂ < 97% (subset 2), the machine-learning methods performed slightly better than the log-linear equation and better than the non-linear equation (Table 4). The F1 scores for all three machine learning methods were similar when using the whole dataset (dataset 2) and for subset 2 where SpO₂ < 97%. As shown in Fig. 2, when comparing the 3 machine learning models to one another, the neural network performed slightly better in the whole dataset (area under the precision recall curve = 0.94 for the neural network compared to 0.93 and 0.91 for the logistic regression and support vector machine model, respectively). The three models had similar performance in subset 2.

Table 4 Prediction performance of machine learning classification models based on three features.

Sensitivity analysis

Hidden hypoxemia, or the discrepancy between peripheral oxygen saturation (SpO₂) measurements and the arterial oxygen saturation (SaO₂) measured by ABG, was recently identified to occur in 5.3–5.5% of patients in the ICU setting^23,24. Hidden hypoxemia, defined as SpO₂ ≥ 88% despite an SaO₂ ≤ 88%, was observed in all races and ethnic groups but occurs with higher prevalence in Black patients^23,24. We conducted a sensitivity analysis to compare the performance of the machine learning models between self-reported Black and White race in dataset 2. For the regression task, machine learning algorithms outperformed both non-linear and log-linear equations among Black patients (RMSE: 88.7, 91.1, 90.1, 117.4, and 95.8 for neural network, linear regression, SVR, log-linear, and non-linear models, respectively). Among machine learning algorithms, the neural network showed the best performance in Black patients (Supplementary Table e4). Focusing on Black patients with SpO₂ < 97% (subset 2), machine learning models showed superior performance over previously published equations (RMSE: 72.1, 74.4, 71.5, 85.0, and 95.6 for neural network, linear regression, SVR, log-linear, and non-linear models, respectively). The same pattern was observed for White patients in both the whole population and patients with SpO₂ < 97% (subset 2) (RMSE in White patients: 84.6, 88.3, 85.9, 117.7, and 91.8; RMSE in White patients with SpO₂ < 97%: 67.8, 68.3, 70.5, 72.2, and 81.2 for neural network, linear regression, SVR, log-linear, and non-linear models, respectively).

Considering the classification task, all machine learning algorithms performed better than or comparably to previously published equations in Black patients (F1: 0.93, 0.92, 0.93, 0.89, 0.92 for neural network, logistic regression, SVM, log-linear, and non-linear models, respectively). Of note, the neural network model performed slightly better than the other two machine learning algorithms in Black patients (AUC: 0.78, 0.77, 0.68 for neural network, logistic regression, and SVM model, respectively). Considering Black patients with SpO₂ < 97% (subset 2), machine learning models outperformed conventional equations (F1: 0.82, 0.82, 0.84, 0.81, 0.73 for neural network, logistic regression, SVM, log-linear, and non-linear models, respectively). Among White patients, machine learning models outperformed conventional equations in both the whole population and patients with SpO₂ < 97% (subset 2) (F1 in White patients: 0.92, 0.92, 0.92, 0.87, and 0.91; F1 in White patients with SpO₂ < 97%: 0.81, 0.80, 0.81, 0.80, and 0.70 for neural network, logistic regression, SVM, log-linear, and non-linear models, respectively), and the neural network was the preferable model. These findings are summarized in Supplementary Table e5.

Machine learning algorithms show a better accuracy in the validation cohort

We developed an online calculator using the three machine learning algorithms requiring three inputs (SpO₂, FiO₂, and PEEP): https://dikb.org/pa02-predictor. The calculator was then utilized in an independent validation cohort of 133 mechanically ventilated ICU patients to impute the PaO₂ in a regression task. The imputed PaO₂ was compared to the actual PaO₂ obtained by ABG. The accuracy of the machine learning algorithms was compared to the non-linear equation and was reported as the RMSE and adjusted R-squared (Table 5). The neural network and SVM models had lower RMSE than the previously published non-linear equation, demonstrating improved performance in the imputation of PaO₂. Adjusted R-squared was also higher in the neural network and SVM models. To illustrate the models proposed in this study, consider the following example: with the assumption of SpO₂ = 100%, FiO₂ = 0.6, and PEEP = 5 cmH₂O (observed PaO₂/FiO₂ = 190), the predicted PaO₂ is estimated as 203.0, 186.2, 188.4 using neural network, SVM, and regression models, respectively, while the estimate of the conventional non-linear model is 167 (Table 6).

Table 5 RMSE of the 3-feature machine learning models regression task compared to the published non-linear equation.

Table 6 Examples of comparing four models applied to four cases from different categories of PaO₂ (< 150, 150–200, 200–300, > 300).

Discussion

We used the publicly available MIMIC-III database as a derivation cohort to develop and evaluate machine-learning algorithms to impute PaO₂ utilizing non-invasive SpO₂ in patients who are mechanically ventilated. We tested three machine learning models (neural network, linear regression and SVR) first using seven available clinical variables (SpO₂, FiO₂, PEEP, TV, MAP, temperature, and vasopressor administration) to impute the PaO₂. We subsequently used a parsimonious model with three clinical variables (SpO₂, FiO₂ and PEEP) to non-invasively impute PaO₂ in both a derivation and validation cohort. The imputation of PaO₂ from the regression tasks enabled us to derive the PaO₂/FiO₂, a clinically meaningful ratio with predictive value^1,25. Additionally, we performed a classification task to predict PaO₂/FiO₂ ≤ 150, a cut off that has been used to capture those patients with moderate to severe respiratory failure in ARDS cohorts^11,13 and to guide patient management⁵. To increase the clinical applicability of our work, we also developed an open-access online calculator to impute the PaO₂ using the 3-feature model requiring only non-invasive bedside parameters in mechanically ventilated patients. Our calculator showed improved accuracy in the imputation of the PaO₂ when compared to the previously published non-linear equation in both our initial cohort and the validation cohort.

To develop the machine learning algorithms, we initially evaluated clinical variables such as PEEP, TV, MAP, temperature, and vasopressor administration that are easily obtained at the bedside. TV, MAP, temperature and vasopressor use demonstrated a stochastic distribution and did not significantly alter the accuracy of the machine-learning based algorithms; they were therefore removed to create the 3-feature model (SpO₂, FiO₂, PEEP). This 3-feature model provides a framework for generalizability using large datasets of mechanically ventilated patients.

We considered other clinical variables such as skin pigmentation, pulse oximeter location, oximeter manufacturer, and vasopressor infusion, as well as laboratory variables such as serum bicarbonate, serum chloride, serum creatinine, and serum sodium. However, a prior prospective study¹¹ showed that these variables provided negligible improvement in the accuracy of imputation, and they were therefore not included. It is worth mentioning that recent studies showed a discrepancy between peripheral oxygen saturation (SpO₂) measurements and the arterial oxygen saturation (SaO₂) measured by ABG. This discrepancy, defined as SpO₂ ≥ 88% despite an SaO₂ ≤ 88% and referred to as hidden hypoxemia, was present in all racial and ethnic groups but showed higher prevalence in Black patients^23,24. Because this discrepancy between SpO₂ and arterial oxygen saturation occurs more frequently in Black patients²⁴, we performed a sensitivity analysis showing that our machine learning algorithms outperform previously published equations in both Black and White patients.

Our study shows that a machine learning based method for both the regression and classification task, when applied to the MIMIC-III critical care database, improved the accuracy compared to the previously published non-linear and log-linear imputation methods. As is evidenced by comparing the F1 and discrimination measures in Table 4, the performance improvement was more modest for the classification task in subset 2 where SpO₂ < 97%. A possible explanation is that there were fewer ICU events (smaller N) per group in the subset.

Prior studies have examined the relationship between SF and PF ratios for patients with ARDS to determine whether the non-invasive SF ratio can be substituted for the invasively obtained PF ratio^11,13,26. Pandharipande et al. studied matched measurements of SpO₂ and PaO₂ in a heterogeneous population (i.e., patients undergoing general anesthesia and patients with ARDS) to determine the association between SF and PF ratios in order to calculate the respiratory parameter of the SOFA score¹³. In their study, matched SpO₂ and PaO₂ values were obtained from two groups of patients: Group 1 comprised the derivation set and was obtained from patients undergoing general anesthesia at a single center, and Group 2 comprised a validation set utilizing data from patients enrolled in a multi-center randomized clinical trial examining low versus high tidal volume for acute respiratory management of ARDS (ARMA)²⁷. All SpO₂ values > 97% were also excluded from analysis in order to restrict the matched data to those values likely to be within the linear range of the oxyhemoglobin dissociation curve. Data from 4728 matched SpO₂ and PaO₂ measurements showed that the relationship was best described by a log-linear equation with slight variation based upon the level of PEEP. In the setting of a more heterogeneous population, a poorer correlation was noted between SF and PF ratios. The regression equation of Log(PF) = 0.48 + 0.78 × Log(SF) yielded an R-square of 0.31¹³.

Additionally, a retrospective analysis of arterial blood gas measurements from three ARDS Network studies compared the performance of non-linear, log-linear and linear imputation methods to derive PaO₂ from the SpO₂¹². In all patients (N = 1184), the nonlinear imputation was equivalent to log-linear imputation. However, in those patients with SpO₂ < 97% (N = 707), the nonlinear imputation showed lower error than either linear or log-linear equations. A prospective study was subsequently conducted in patients enrolled in the Prevention and Early Treatment of Acute Lung Injury network¹¹ to assess the performance of the non-linear equation to impute PaO₂ from the SpO₂ and compare it to the prior log-linear and linear equations^11,13,26. This study included 1034 arterial blood gases from 703 patients, of which 650 arterial blood gases had matched SpO₂ < 97%. The non-linear equation showed lower error and better identified moderate to severe ARDS patients (defined in the study as PaO₂/FiO₂ ≤ 150) when compared to log-linear or linear imputation methods.

In our study, we similarly found a high degree of variance across SpO₂ values and corresponding measured PaO₂ values which was noted when we formally examined the relationship between SF and PF. This may be attributed to the retrospective nature of the data collection and the numerous variables that may confound the reliability of a recorded SpO₂ measured non-invasively to reflect the arterial SaO₂^8,10,12. Despite this limitation, the machine learning algorithms performed better on both regression and classification tasks when compared to the log-linear and non-linear published equations.

We used a validation cohort to show improved accuracy for the neural network and kernel-based machine learning algorithms when compared to the previously published non-linear equation. Another strength of our study is the development of an online calculator that can be used to impute the PaO₂ from three noninvasive parameters (SpO₂, FiO₂ and PEEP) and may serve as a tool for future studies in large electronic health record datasets. Additionally, our machine learning models allow for the evaluation of all mechanically ventilated patients with available data rather than narrowing the analysis to a specific population such as those with ARDS. Given the inclusion of all mechanically ventilated patients, a significant number of SpO₂ values were > 97% (N = 8510 for seven features and N = 16,918 for three features). While this reduced the accuracy of the imputed PF ratio, particularly above a certain threshold, the machine learning models were applied to the data without a pre-defined restriction placed upon the range of SpO₂ values and showed better performance than both the log-linear and non-linear equations on both the regression and classification tasks.

Imputation of PaO₂ from SpO₂ using previously published equations has been increasingly implemented in clinical and research settings for subjects who do not have invasive ABG measurements readily available. This underscores the need to improve upon the existing published equations and the clinical importance of the machine learning models proposed here. Machine learning models are currently being used to answer numerous clinical questions and have substantially impacted many areas of medicine, from early-warning systems for sepsis to imaging diagnostics²⁴. Herein, we proposed three machine learning algorithms that can provide a framework for future investigations. The online calculator, in turn, can provide a feasible prediction of the PF ratio from the SF ratio at the bedside for clinicians working in critical care settings.

We showed that machine learning models outperformed previously published equations in imputing PaO₂ from SpO₂ in the mechanically ventilated adult population. Consistent with our findings, Sauthier et al. used neural network models to validate a continuous and non-invasive method of hypoxemia estimation in a pediatric population²⁸. They used a convolutional neural network (CNN), a long short-term memory network (LSTM), and a multilayer perceptron (MLP) to impute PaO₂. Notably, they concluded that bias was lower with neural network models than with mathematical equations.

In summary, any of the tested machine learning models applied to the MIMIC-III dataset enabled imputation of PaO₂ from SpO₂ with lower error and provided greater accuracy in predicting PaO₂/FiO₂ ≤ 150 across the entire range of SpO₂ examined when compared to the published equations in two independent cohorts. All machine learning models proposed in this paper outperformed the log-linear and non-linear equations. Future work will be required to prospectively test ML algorithms for use in clinical practice. Additionally, our study provides a clinically relevant online calculator for the imputation of PaO₂ based on the three-feature machine learning models. The calculator requires the input of SpO₂, FiO₂, and PEEP, all of which are non-invasive and readily available at the bedside of mechanically ventilated patients.

Abbreviations

PaO₂:: Partial pressure of oxygen
FiO₂:: Fraction of inspired oxygen
SpO₂:: Peripheral saturation of oxygen
PF ratio:: PaO₂/FiO₂
SF ratio:: SpO₂/FiO₂
ARDS:: Acute respiratory distress syndrome
SOFA:: Sequential organ failure assessment
ABG:: Arterial blood gas
TV:: Tidal volume
PEEP:: Positive end-expiratory pressure
MAP:: Mean arterial pressure
SVR:: Support vector regression
RBF:: Radial basis function kernel
AUROC:: Area under receiver operating characteristic curve
BIC:: Bayesian information criterion
RMSE:: Root-mean-square error

References

1. Force, A. D. T. et al. Acute respiratory distress syndrome: The Berlin Definition. JAMA 307, 2526–2533. https://doi.org/10.1001/jama.2012.5669 (2012).
2. Vincent, J. L. et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the working group on sepsis-related problems of the European Society of Intensive Care Medicine. Intensive Care Med. 22, 707–710. https://doi.org/10.1007/bf01709751 (1996).
3. Ferreira, F. L., Bota, D. P., Bross, A., Melot, C. & Vincent, J. L. Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA 286, 1754–1758. https://doi.org/10.1001/jama.286.14.1754 (2001).
4. Vincent, J. L. et al. Use of the SOFA score to assess the incidence of organ dysfunction/failure in intensive care units: Results of a multicenter, prospective study. Working group on “sepsis-related problems” of the European Society of Intensive Care Medicine. Crit. Care Med. 26, 1793–1800. https://doi.org/10.1097/00003246-199811000-00016 (1998).
5. Guerin, C. et al. Prone positioning in severe acute respiratory distress syndrome. N. Engl. J. Med. 368, 2159–2168. https://doi.org/10.1056/NEJMoa1214103 (2013).
6. Garland, A. & Connors, A. F. Jr. Indwelling arterial catheters in the intensive care unit: Necessary and beneficial, or a harmful crutch? Am. J. Respir. Crit. Care Med. 182, 133–134. https://doi.org/10.1164/rccm.201003-0410ED (2010).
7. Garland, A. Arterial lines in the ICU: A call for rigorous controlled trials. Chest 146, 1155–1158. https://doi.org/10.1378/chest.14-1212 (2014).
8. Khemani, R. G., Patel, N. R., Bart, R. D. 3rd. & Newth, C. J. L. Comparison of the pulse oximetric saturation/fraction of inspired oxygen ratio and the PaO₂/fraction of inspired oxygen ratio in children. Chest 135, 662–668. https://doi.org/10.1378/chest.08-2239 (2009).
9. Khemani, R. G. et al. Comparison of SpO₂ to PaO₂ based markers of lung disease severity for children with acute lung injury. Crit. Care Med. 40, 1309–1316. https://doi.org/10.1097/CCM.0b013e31823bc61b (2012).
10. Lobete, C. et al. Correlation of oxygen saturation as measured by pulse oximetry/fraction of inspired oxygen ratio with PaO₂/fraction of inspired oxygen ratio in a heterogeneous sample of critically ill children. J. Crit. Care 28(538), e531-537. https://doi.org/10.1016/j.jcrc.2012.12.006 (2013).
11. Brown, S. M. et al. Nonlinear imputation of PaO₂/FIO₂ from SpO₂/FIO₂ among mechanically ventilated patients in the ICU: A prospective, observational study. Crit. Care Med. 45, 1317–1324. https://doi.org/10.1097/CCM.0000000000002514 (2017).
12. Brown, S. M. et al. Nonlinear imputation of PaO₂/FiO₂ from SpO₂/FiO₂ among patients with acute respiratory distress syndrome. Chest 150, 307–313. https://doi.org/10.1016/j.chest.2016.01.003 (2016).
13. Pandharipande, P. P. et al. Derivation and validation of SpO₂/FiO₂ ratio to impute for PaO₂/FiO₂ ratio in the respiratory component of the Sequential Organ Failure Assessment score. Crit. Care Med. 37, 1317–1321. https://doi.org/10.1097/CCM.0b013e31819cefa9 (2009).
14. Bilan, N., Dastranji, A. & Ghalehgolab Behbahani, A. Comparison of the SpO₂/FIO₂ ratio and the PaO₂/FIO₂ ratio in patients with acute lung injury or acute respiratory distress syndrome. J. Cardiovasc. Thorac. Res. 7, 28–31. https://doi.org/10.15171/jcvtr.2014.06 (2015).
15. Johnson, A. E. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035. https://doi.org/10.1038/sdata.2016.35 (2016).
16. Cheng, B. & Titterington, D. M. Neural network: A review from a statistical perspective. Stat. Sci. 9, 2–30 (1994).
17. Friedman, J. H. & Popescu, B. E. Gradient directed regularization for linear regression and classification. In Technical Report, Department of Statistics and Stanford Linear Accelerator Center, Stanford University (2004).
18. Drucker, H., Burges, C. J., Kaufman, L., Smola, A. J. & Vapnik, V. Support vector regression machines. In NIPS'96: Proceedings of the 9th International Conference on Neural Information Processing Systems, 155–161 (1996).
19. Suykens, J. & Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 9, 293–300. https://doi.org/10.1023/A:1018628609742 (1999).
20. Peng, C. Y., Lee, K. L. & Ingersoll, G. M. An introduction to logistic regression analysis and reporting. J. Educ. Res. 96(2002), 3–14. https://doi.org/10.1080/00220670209598786 (2010).
21. Code for the Imputation of PaO₂/FIO₂ from SpO₂ Values from the MIMIC-III Critical Care Database Using Machine-Learning Based Algorithms (Github.com, 2020).
22. Krogh, A. & Vedelsby, J. Neural network ensembles, cross validation and active learning. In NIPS'94: Proceedings of the 7th International Conference on Neural Information Processing Systems, 231–238 (1994).
23. Sjoding, M. W., Dickson, R. P., Iwashyna, T. J., Gay, S. E. & Valley, T. S. Racial bias in pulse oximetry measurement. N. Engl. J. Med. 383, 2477–2478. https://doi.org/10.1056/NEJMc2029240 (2020).
24. Wong, A. I. et al. Analysis of discrepancies between pulse oximetry and arterial oxygen saturation measurements by race and ethnicity and association with organ dysfunction and mortality. JAMA Netw. Open 4, e2131674. https://doi.org/10.1001/jamanetworkopen.2021.31674 (2021).
25. Bellani, G. et al. Epidemiology, patterns of care, and mortality for patients with acute respiratory distress syndrome in intensive care units in 50 countries. JAMA 315, 788–800. https://doi.org/10.1001/jama.2016.0291 (2016).
26. Rice, T. W. et al. Comparison of the SpO₂/FIO₂ ratio and the PaO₂/FIO₂ ratio in patients with acute lung injury or ARDS. Chest 132, 410–417. https://doi.org/10.1378/chest.07-0617 (2007).
27. Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome. The Acute Respiratory Distress Syndrome Network. N. Engl. J. Med. 342, 1301–1308 (2000).
28. Sauthier, M., Tuli, G., Jouvet, P. A., Brownstein, J. S. & Randolph, A. G. Estimated PaO(2): A continuous and noninvasive method to estimate PaO(2) and oxygenation index. Crit. Care Explor. 3, e0546. https://doi.org/10.1097/cce.0000000000000546 (2021).

Acknowledgements

We thank Dr. William G. Bain for providing thoughtful edits. Equations 1 and 2 were reprinted from Brown, S. M. et al., Nonlinear Imputation of PaO₂/FiO₂ From SpO₂/FiO₂ Among Patients With Acute Respiratory Distress Syndrome. Chest 150, 307–313 (2016)¹², with permission from Elsevier.

Funding

This work was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under Award Numbers F32 HL152504 (J.Z.); P01 HL114453, R01 HL136143, R01 HL142084, K24 HL143285 (J.S.L.), and R01 LM012011 (X.L. and S.R.). The University of Pittsburgh holds a Physician-Scientist Institutional Award from the Burroughs Wellcome Fund (J.Z.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or any other sponsoring agency.

Author information

These authors contributed equally: Shuangxia Ren, Jill A. Zupetic and Mohammadreza Tabary.

Authors and Affiliations

Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
Shuangxia Ren, Xinghua Lu & Richard D. Boyce
Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
Jill A. Zupetic, Mohammadreza Tabary, Rebecca DeSensi, Mehdi Nouraie & Janet S. Lee
Department of Medicine, Acute Lung Injury Center of Excellence, University of Pittsburgh, NW628 Montefiore University Hospital, 3459 Fifth Avenue, Pittsburgh, PA, 15213, USA
Jill A. Zupetic, Mohammadreza Tabary, Rebecca DeSensi, Mehdi Nouraie & Janet S. Lee
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Xinghua Lu & Richard D. Boyce

Contributions

S.R. performed the data extraction, processing and analysis, and interpreted the data. J.Z. and M.T. performed data analysis, interpreted the data and wrote the manuscript. R.B. and X.L. interpreted the data and revised the work for important intellectual content. R.D. performed the data extraction for the validation cohort. M.N. provided critical statistical expertise, designed, analyzed, and interpreted the data, and wrote the manuscript. J.S.L. conceived, designed, analyzed, and interpreted the data, and wrote the manuscript. J.S.L., S.R., and M.T. performed critical revision. S.R. and M.T. are the guarantors of the paper.

Corresponding author

Correspondence to Janet S. Lee.

Ethics declarations

Competing interests

J.S. Lee discloses a paid consultantship with Janssen Pharmaceuticals, Inc. unrelated to this study. The authors have no other relevant conflicts of interest to disclose.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ren, S., Zupetic, J.A., Tabary, M. et al. Machine learning based algorithms to impute PaO₂ from SpO₂ values and development of an online calculator. Sci Rep 12, 8235 (2022). https://doi.org/10.1038/s41598-022-12419-7

Received: 05 November 2021
Accepted: 15 February 2022
Published: 17 May 2022
DOI: https://doi.org/10.1038/s41598-022-12419-7
