Introduction

Predictor transformation (e.g., using the squared values of a predictor) is a well-understood method for improving the performance of a predictor known to be non-linearly related to clinical outcomes.1 Transformation has been shown to enhance the empirical performance of predictors that have a U-shaped non-linear association with clinical outcomes, such as body temperature, for which high and low values are both associated with increased risk of adverse clinical outcomes.2 The pulse oximeter oxygen saturation (SpO2) is widely used as a clinical indicator of the degree of impairment in gas exchange.3 However, SpO2 has a sigmoidal non-linear association with clinical outcomes because of the well-known sigmoid shape of the oxygen dissociation curve. This non-linear association is a major limitation when SpO2 is used as a predictor in linear prediction models.

To circumvent this limitation, we propose a simple transformation of SpO2, the saturation virtual shunt (VS). The key element of our transformation is the concept of physiological VS.4 The concept of physiological VS was initially used to describe the non-linear relationship between FiO2 and the arterial partial pressure of O2 (PaO2) using iso-shunt curves.4 The physiological VS describes the overall loss of oxygen content between the inspired gas and the arterial blood5 and is linearly related to the degree of impairment in oxygen exchange. The physiological VS can be defined as the proportion of blood that would need to bypass the lungs to produce the difference between the calculated end capillary venous oxygen content and arterial blood oxygen content and is also known as venous admixture.5 The physiological VS quantitatively describes the efficiency of the gas exchange and the severity of a disease process that may lead to hypoxemia.3 The physiological VS can also be adjusted for the fraction of inspired oxygen (FiO2).

The physiological VS is typically calculated based on assumed values of the arterial/mixed venous oxygen content difference, hemoglobin level, pH, and temperature unless these values can be measured.5,6 A more common formula, the difference between the oxygen partial pressure in the alveoli (PAO2) and systemic arteries (PaO2) (P[A–a]O2), has been used to represent the shunt calculation. This formula does adjust for changes in FiO2 (due to changes in altitude or oxygen administration) but the reliance on an invasive measurement to obtain PaO2 makes it impractical and the partial pressure of oxygen is used as a surrogate for oxygen content. The partial pressure of oxygen, however, is not linearly related to the degree of gas exchange abnormality and hence this approach is sub-optimal.5,6,7

This article develops a physiologically based transformation of SpO2 called the saturation VS (for clinical interpretation and prognostic research). As an illustrative application, we use the saturation VS as one candidate predictor of hospital admission and compare it with its previously used counterparts (the dichotomized SpO2 and the untransformed SpO2) in a cohort of children visiting the emergency department at the Kumudini Women’s Medical College Hospital in Bangladesh.8

Methods

Calculation of physiological VS

Following Karlen et al.,5,6 we derive physiological VS from inspired oxygen FiO2 and arterial oxygen saturation (SaO2) for a full range of theoretical subjects that satisfy the following assumptions.

  • The loss due to capillary diffusion is negligible, which allows alveolar oxygen content (PAO2) to be approximated by end-capillary oxygen content.

  • SaO2 is estimated without error from SpO2 obtained from pulse oximetry.

  • Patients are on room air at the time of SpO2 measurements. The barometric pressure is at sea level (101 kPa) and inspired oxygen (FiO2) is at 21%.

  • Values for water vapor pressure, alveolar CO2 partial pressure, respiratory quotient, incomplete capillary diffusion, arteriovenous oxygen difference, oxygen-binding capacity of hemoglobin, blood concentration of hemoglobin, and solubility of O2 in hemoglobin are assumed to be normal and constant.

With the above assumptions, we can theoretically calculate the physiological VS using the previously established Alveolar Gas equation and the Severinghaus and Severinghaus–Ellis equations.9,10,11 PAO2 was estimated using the Alveolar Gas equation.9 Alveolar oxygen saturation (SAO2) was then estimated from PAO2 using the Severinghaus equation.10 To calculate arterial oxygen content, SaO2 was transformed to PaO2 using the Severinghaus–Ellis equation.10,11 A detailed mathematical description of these calculations can be found in Supplementary Text S1 or in Karlen et al.5,6 Supplementary Fig. S1 provides a graphical illustration of the flow of calculations.
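The flow of calculations can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (which is given in Supplementary Text S1): the constants below are common textbook defaults rather than values taken from this article, the alveolar gas equation is used in its simplified form, and the shunt is computed with the classic shunt equation. The function names are hypothetical. Because of these assumptions, the numbers it produces should only be expected to approximate the physiological VS reported here.

```python
import math

# Textbook defaults, assumed for illustration (not taken from the article).
P_BARO = 760.0         # barometric pressure at sea level, mmHg (about 101 kPa)
P_H2O = 47.0           # water vapor pressure at body temperature, mmHg
PA_CO2 = 40.0          # alveolar CO2 partial pressure, mmHg
RQ = 0.8               # respiratory quotient
HB = 15.0              # hemoglobin concentration, g/dL
HB_CAPACITY = 1.34     # oxygen-binding capacity, mL O2 per g hemoglobin
O2_SOLUBILITY = 0.003  # dissolved O2, mL per dL blood per mmHg
AV_DIFF = 5.0          # arteriovenous O2 content difference, mL/dL


def alveolar_po2(fio2=0.21):
    """Simplified alveolar gas equation: alveolar PO2 in mmHg."""
    return fio2 * (P_BARO - P_H2O) - PA_CO2 / RQ


def severinghaus_so2(po2):
    """Severinghaus equation: saturation (fraction) from PO2 (mmHg)."""
    return 1.0 / (23400.0 / (po2 ** 3 + 150.0 * po2) + 1.0)


def ellis_po2(so2):
    """Ellis inversion of the Severinghaus equation: PO2 (mmHg) from saturation."""
    a = 11700.0 / (1.0 / so2 - 1.0)
    b = math.sqrt(50.0 ** 3 + a ** 2)
    # (b - a) is rewritten as 50^3 / (b + a) to avoid cancellation error.
    return (b + a) ** (1.0 / 3.0) - (50.0 ** 3 / (b + a)) ** (1.0 / 3.0)


def o2_content(so2, po2):
    """Oxygen content (mL/dL): hemoglobin-bound plus dissolved oxygen."""
    return HB_CAPACITY * HB * so2 + O2_SOLUBILITY * po2


def physiological_vs(sao2, fio2=0.21):
    """Virtual shunt (%) from the classic shunt equation, sao2 as a fraction."""
    alv_po2 = alveolar_po2(fio2)
    end_capillary = o2_content(severinghaus_so2(alv_po2), alv_po2)
    arterial = o2_content(sao2, ellis_po2(sao2))
    mixed_venous = arterial - AV_DIFF
    return 100.0 * (end_capillary - arterial) / (end_capillary - mixed_venous)


for sat in (0.98, 0.95, 0.90, 0.80):
    print(f"SaO2 {sat:.2f} -> virtual shunt {physiological_vs(sat):6.2f}%")
```

Under these assumed constants the shunt at SaO2 = 98% comes out slightly negative, which is the same qualitative behavior that motivates the offset applied in the Results.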

Saturation VS

To produce a simple and more clinically useful method to describe the non-linear relationship between the physiological VS and SpO2, we fitted several common non-linear functions, such as polynomials and logarithmic functions. The unknown parameters of these functions were estimated using the non-linear least squares method.12 For this fitting process, we selected SpO2 at 1% intervals in the range from 50% to 98%. We chose this range of values because the previously described empiric formulae are not valid for SpO2 values >98%.5,6 We also excluded SpO2 values <50% as they are rare and typically associated with severe clinical cyanosis.

The saturation VS was then defined based on the fitted relationship between the physiological VS and SpO2.
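As an illustration of the fitting step, the sketch below fits the three-parameter form y = a × log10(b − x) + c by least squares using only the Python standard library. It is not the authors' fitting procedure: instead of a general non-linear least squares routine, it profiles over b on a repeatedly refined grid and solves for a and c in closed form (for fixed b the model is linear in a and c), which minimizes the same sum of squared errors. The physiological VS values are not reproduced in this text, so the target values here are generated from the published formula purely as a stand-in; the function name and search bounds are hypothetical.

```python
import math


def fit_log_curve(xs, ys, b_lo=98.5, b_hi=130.0, rounds=6, grid=200):
    """Fit y = a*log10(b - x) + c; returns (a, b, c, sse)."""

    def solve_linear(b):
        # For a fixed b, ordinary least squares gives a and c directly.
        z = [math.log10(b - x) for x in xs]
        n = len(xs)
        z_bar = sum(z) / n
        y_bar = sum(ys) / n
        szz = sum((zi - z_bar) ** 2 for zi in z)
        szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, ys))
        a = szy / szz
        c = y_bar - a * z_bar
        sse = sum((a * zi + c - yi) ** 2 for zi, yi in zip(z, ys))
        return sse, a, c

    lo, hi = b_lo, b_hi
    best_b = lo
    for _ in range(rounds):
        step = (hi - lo) / grid
        candidates = [lo + i * step for i in range(grid + 1)]
        best_b = min(candidates, key=lambda b: solve_linear(b)[0])
        # Narrow the search window around the current best value of b.
        lo, hi = max(b_lo, best_b - step), min(b_hi, best_b + step)
    sse, a, c = solve_linear(best_b)
    return a, best_b, c, sse


# SpO2 at 1% intervals from 50% to 98%, as in the fitting process described above.
spo2_grid = list(range(50, 99))
# Stand-in targets generated from the published formula (assumption, see lead-in).
target_vs = [68.864 * math.log10(103.711 - x) - 52.110 for x in spo2_grid]

a, b, c, sse = fit_log_curve(spo2_grid, target_vs)
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}, SSE = {sse:.2e}")
```

The lower bound on b must stay above the largest SpO2 value in the grid so that the logarithm is defined for every point.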

Evaluation of the empirical performance of the proposed transformation

We evaluated the use of the saturation VS compared to the dichotomized SpO2 and the untransformed SpO2 in a recently completed prospective observational study at the Kumudini Women’s Medical College Hospital in Bangladesh, a rural tertiary care hospital.8 Ethics approval and informed consent were obtained prior to data collection.

The study aimed to develop a simple model to predict the need for facility admission that could be used in a community setting. Children aged <5 years presenting at the outpatient or emergency department were enrolled. Study physicians collected clinical signs and symptoms from the facility records and performed recordings of SpO2, heart rate, and respiratory rate. Facility physicians made the decisions about the need for hospital admission on clinical grounds without knowledge of the oxygen saturation measurements. The SpO2 value was taken as the median over a minute at the time of initial assessment. SpO2 readings >98% were set equal to 98%, because 98% is the theoretical maximum reading possible on room air at sea level. Readings >98% occurred owing to the tolerance level or bias of the pulse oximeters.5,6 Children who showed high SpO2 variability (range >6%) in combination with low perfusion were excluded. Motion artifact, ambient light, and poor positioning of the sensor typically resulted in high variability and low perfusion, leading to a high likelihood of erroneous SpO2 readings. Low perfusion was assessed post hoc based on the amplitude of the photoplethysmogram and the pulse oximeter device perfusion index (low/medium/high). Children with SpO2 <75% (a danger sign) were also excluded from predictive modeling because they were considered critically ill and in need of direct admission to higher-level facilities regardless of any model-based predictions.13 The data for this study are publicly available (https://doi.org/10.1371/journal.pone.0143213.s003).
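The recording rules described above can be summarized as a small preprocessing routine. The sketch below is only an illustration of those rules: the function name, argument names, and status labels are hypothetical, and the perfusion assessment is reduced to a single boolean flag that is assumed to have been determined separately.

```python
from statistics import median


def preprocess_spo2(samples, low_perfusion):
    """Apply the recording rules to one child's SpO2 samples (in %).

    samples: SpO2 values recorded over about a minute (hypothetical input format).
    low_perfusion: True if the recording was judged to have low perfusion.
    Returns (spo2, status); spo2 is None for unreliable recordings.
    """
    # Exclude recordings with a range above 6% combined with low perfusion.
    if max(samples) - min(samples) > 6 and low_perfusion:
        return None, "excluded: unreliable recording"

    # Use the median over the recording and cap readings above 98% at 98%.
    spo2 = min(median(samples), 98)

    # SpO2 below 75% is a danger sign and is kept out of predictive modeling.
    if spo2 < 75:
        return spo2, "excluded from modeling: danger sign"
    return spo2, "included"


print(preprocess_spo2([99, 100, 99], low_perfusion=False))
print(preprocess_spo2([90, 97, 92], low_perfusion=True))
print(preprocess_spo2([70, 72, 71], low_perfusion=False))
```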

Since the objective of this subsection is to illustrate the usefulness of the proposed transformation of SpO2, we fitted three univariate logistic regression models for more straightforward demonstration. We are not proposing that SpO2 be used on its own as a predictor of severity. Far from it, we aim to illustrate that a transformation of SpO2 could enhance the usefulness of SpO2 as one candidate predictor for illness severity. A multivariable prediction model using a transformed version of SpO2 and vital signs was previously described.8

The three univariate models used different predictors to predict the need for facility admission: (1) hypoxemia, defined as SpO2 <90% by the World Health Organization (WHO; dichotomized SpO2 model), (2) the observed SpO2 (untransformed SpO2 model), and (3) the saturation VS (saturation VS model). We compared these models in terms of overall accuracy, calibration, and clinical interpretation. Overall accuracy was assessed using the area under the receiver operating characteristic curve (AUC ROC) and its 95% confidence interval (CI). AUC ROC measured the probability that a randomly selected admitted child would receive a higher predicted probability of requiring admission than a randomly selected child who was not admitted.14 For the untransformed SpO2 model and the saturation VS model, calibration was assessed by plotting the observed admission rate against the group average of the predicted probability of requiring admission for each of the three groups determined a priori: SpO2 <90%, SpO2 from 90% to 97.5% and SpO2 equal to 98%. A chi-square goodness-of-fit test was then applied.15 In addition, the observed admission rates were plotted against ten equally spaced SpO2 or ten equally spaced saturation VS categories sharing similar ranges. The plots were fitted to linear and non-linear trends using the method of least squares,12 which aimed to minimize the total squared difference between the observed admission rates and the risks of admission directly interpreted from the category-average SpO2 or saturation VS levels. The accuracy of the fitted relationship was quantified by the standard deviations of the difference between the observed admission rates and the interpreted risks of admission based on the SpO2 or saturation VS category labels. The 95% CIs for these standard deviations were calculated based on chi-square distributions.12
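The two main evaluation quantities can be made concrete with a short sketch. The first function computes the AUC ROC directly from its probabilistic definition (ties counted as one half). The second tabulates observed versus mean predicted admission rates per group and accumulates a Hosmer–Lemeshow-style chi-square statistic. The exact goodness-of-fit test used in the study is not spelled out in this text, so that statistic is an assumption; under it, three groups would give the one degree of freedom reported in the Results. The example scores are made up and carry no clinical meaning.

```python
def auc_roc(admitted_scores, not_admitted_scores):
    """Probability that a random admitted case has a higher predicted
    probability than a random non-admitted case (ties count one half)."""
    wins = 0.0
    for a in admitted_scores:
        for n in not_admitted_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(admitted_scores) * len(not_admitted_scores))


def grouped_calibration(groups):
    """groups: list of (predicted_probabilities, outcomes) per a priori group,
    with outcomes coded 1 = admitted and 0 = not admitted.
    Returns ([(observed_rate, mean_predicted), ...], chi_square)."""
    rows = []
    chi2 = 0.0
    for predicted, outcomes in groups:
        n = len(predicted)
        observed = sum(outcomes)   # observed number of admissions
        expected = sum(predicted)  # model-expected number of admissions
        mean_p = expected / n
        chi2 += (observed - expected) ** 2 / (n * mean_p * (1.0 - mean_p))
        rows.append((observed / n, mean_p))
    return rows, chi2


# Made-up predicted probabilities, for illustration only.
print(auc_roc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
print(grouped_calibration([([0.5, 0.5], [1, 0]), ([0.25] * 4, [1, 0, 0, 0])]))
```

Each point in a calibration plot of this kind is one (mean predicted, observed rate) pair; points on the line of equality indicate good calibration for that group.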

Results

Results for calculation of physiological VS

Owing to the empirical nature of the physiological equations,9,10,11 the physiological VS corresponding to SpO2 98% was a small negative value (−0.78). We thus added 0.78 to all physiological VS values so that a normal SpO2 98% corresponded to exactly zero physiological VS.

The transformation formula for the saturation VS

The functions of the form y = a × log10(b − x) + c were sufficient to capture the non-linearity in the relationship between physiological VS and SpO2. The relationship was statistically best fitted by an equation VS = 68.864 × log10(103.711 − SpO2) − 52.110 (Fig. 1). The saturation VS was therefore defined as saturation VS = 68.864 × log10(103.711 − SpO2) − 52.110.
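For readers who want to compute the saturation VS directly, the formula translates into a one-line function. The sketch below is an illustration with a hypothetical function name; capping readings above 98% mirrors how such readings were handled in the study cohort, and the formula was fitted only for SpO2 between 50% and 98%, so values outside that range should be treated with caution.

```python
import math


def saturation_vs(spo2):
    """Saturation virtual shunt (%) from SpO2 (%), using the fitted formula."""
    spo2 = min(spo2, 98.0)  # readings above 98% are treated as 98%
    return 68.864 * math.log10(103.711 - spo2) - 52.110


for value in (98, 95, 90, 85, 75):
    print(f"SpO2 {value}% -> saturation VS {saturation_vs(value):.1f}%")
```

With this formula an SpO2 of 98% maps to a saturation VS of approximately zero, and the WHO hypoxemia threshold of 90% maps to roughly 26%.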

Fig. 1

Scatterplot of the physiological virtual shunt (VS) (%) and the saturation VS (%) against SpO2 (%). “Physiological VS” is estimated by solving simultaneous equations using multiple physiological variables, “Saturation VS” is computed by “Saturation VS = 68.864 × log10(103.711 − SpO2) − 52.110”. The standard deviation of the differences between “Physiological VS” and “Saturation VS” is 0.37%, indicating that 95% of the differences between “Saturation VS” and “Physiological VS” are within 0.74%

Prediction performance and model calibration and interpretation

In total, 2943 of the 3374 recruited cases had adequate SpO2 recordings, of whom 831 were admitted and 2112 were not admitted. We excluded 5 cases showing SpO2 variability >6% in combination with low perfusion. We adjusted the 868 SpO2 readings >98% to be equal to 98%. The 12 cases with SpO2 <75% were all admitted and they were excluded from predictive modeling. Table 1 summarizes the information about the clinical diagnosis.

Table 1 Summary of clinical diagnosis: frequency (prevalence in %)

The distribution of the untransformed SpO2 and that of the saturation VS revealed more informative discrimination of the outcome group than that of the dichotomized SpO2 (Fig. 2). In addition, the distribution of the saturation VS was less skewed (skewness 2.26 vs −3.54 for the admitted and 0.98 vs −2.84 for the not admitted cases) than that of the untransformed SpO2 (Fig. 2).
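The change in skewness can be illustrated with a small sketch. The SpO2 sample below is hypothetical and chosen only to mimic a distribution concentrated near 98% with a tail of low values; it is not study data. The skewness estimator used here is the simple moment-based one, which may differ from the estimator used for the figures reported above.

```python
import math


def saturation_vs(spo2):
    """Saturation virtual shunt (%) from SpO2 (%), using the fitted formula."""
    return 68.864 * math.log10(103.711 - min(spo2, 98.0)) - 52.110


def skewness(values):
    """Moment-based sample skewness: m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical SpO2 readings (%), for illustration only.
spo2_sample = [98, 98, 98, 97, 97, 96, 95, 93, 90, 85, 78]
vs_sample = [saturation_vs(s) for s in spo2_sample]

skew_spo2 = skewness(spo2_sample)
skew_vs = skewness(vs_sample)
print(f"skewness of SpO2: {skew_spo2:.2f}")
print(f"skewness of saturation VS: {skew_vs:.2f}")
```

In this toy sample the sign of the skewness flips (low SpO2 corresponds to high saturation VS) and its magnitude shrinks, which is the same pattern as in the reported values.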

Fig. 2

Distribution of dichotomized SpO2, untransformed SpO2 (%), and the saturation virtual shunt (VS) (%) by the outcome group. By having a less skewed distribution, the saturation VS is superior to the untransformed SpO2 in terms of the ability to more evenly stratify patients by sickness severity

The dichotomized SpO2, the untransformed SpO2 model, and the saturation VS model all demonstrated that a SpO2 <90% was associated with increased risk of admission and the latter two unsurprisingly had a much higher AUC ROC (Table 2). Despite the identical AUC ROC, the untransformed SpO2 model demonstrated a statistically significant lack of fit (p value < 0.0001, χ2 = 19.973, df = 1), whereas the saturation VS model did not (p value = 0.098, χ2 = 2.744, df = 1). A closer look at the data revealed that the untransformed SpO2 model significantly underestimated the risk of admission among the 1017 children with SpO2 between 90% and 97.5% and significantly overestimated the risk of admission among the 1750 children with SpO2 ≥ 98% (Fig. 3). Therefore, the saturation VS model was better calibrated than the untransformed SpO2 model. In terms of clinical interpretation, a 5% decrease in SpO2 and a 5% increase in the saturation VS were respectively predicted to be associated with a 286% and 55% increase in the odds of requiring admission (Table 2). The magnified odds ratio obtained from the untransformed SpO2 model was due to the dense distribution of SpO2 data between 85% and 98% (Fig. 2) and that in this region a 5% decrease in SpO2 corresponded to a >5% increase in the saturation VS (Fig. 1). In addition, the contrast between an odds ratio of 1.55 for only 5% increase in the saturation VS and an odds ratio of 11.5 for a much larger increase of the saturation VS from between 0% and 26% to between 26% and 67% is consistent with Fig. 2. This marked difference is also not surprising in view of the sigmoid shape of the oxygen dissociation curve, in which switching from hypoxemia absent to present signals a major deterioration in the efficiency of gas exchange in the lung, whereas 5% increase in the saturation VS indicates a much smaller gradual loss in the efficiency of gas exchange in the lung.
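The arithmetic behind this interpretation can be checked with a short sketch. It converts the reported percentage increases in odds into odds ratios and per-unit log-odds coefficients, and then shows how much the saturation VS changes for a 5% drop in SpO2 at different starting values. The coefficients are back-calculated from the rounded figures quoted above, so the results are approximate, and the helper names are hypothetical.

```python
import math


def saturation_vs(spo2):
    """Saturation virtual shunt (%) from SpO2 (%), using the fitted formula."""
    return 68.864 * math.log10(103.711 - min(spo2, 98.0)) - 52.110


def odds_ratio_from_percent(percent_increase):
    """A 55% increase in odds corresponds to an odds ratio of 1.55."""
    return 1.0 + percent_increase / 100.0


# Per-unit log-odds implied by the reported 5-unit effects (approximate).
BETA_SPO2_DECREASE = math.log(odds_ratio_from_percent(286)) / 5.0
BETA_VS_INCREASE = math.log(odds_ratio_from_percent(55)) / 5.0


def vs_change(spo2_from, spo2_to):
    """Change in saturation VS when SpO2 moves from spo2_from to spo2_to."""
    return saturation_vs(spo2_to) - saturation_vs(spo2_from)


def implied_odds_ratio(spo2_from, spo2_to):
    """Odds ratio implied by the saturation VS coefficient for an SpO2 change."""
    return math.exp(BETA_VS_INCREASE * vs_change(spo2_from, spo2_to))


for start in (98, 95, 90):
    end = start - 5
    print(f"SpO2 {start}% -> {end}%: saturation VS rises by "
          f"{vs_change(start, end):.1f}, implied odds ratio "
          f"{implied_odds_ratio(start, end):.2f}")
```

The same 5% drop in SpO2 maps to a larger change in saturation VS near 98% than near 90%, which is why a single odds ratio per 5% of SpO2 cannot describe the whole range.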

Table 2 Summary of the three prediction models for the need of facility admission
Fig. 3

Calibration plot of the untransformed SpO2 model and the saturation virtual shunt model applied to the 2926 cases with SpO2 ≥75%. The dotted line is the line of equality on which the model-predicted admission probabilities perfectly coincide with the observed admission rates

The observed admission rates exhibited a non-linear relationship with SpO2 but an approximately linear relationship with saturation VS (Fig. 4). More specifically, each 4% increase in the saturation VS was on average associated with an approximately 8.2% increase in the admission rate (e.g., 286 out of 1757 children were admitted with the saturation VS from 0% to 4%, whereas 84 out of 288 children were admitted with the saturation VS from 12% to 16%). In contrast, each 2% decrease in SpO2 would be associated with varying increases in the admission rate due to the nature of the non-linear trend in Fig. 4.

Fig. 4

Observed admission rates compared to equally spaced SpO2 and saturation virtual shunt categories. Each category includes the right endpoint but excludes the left endpoint (e.g., 80–82 includes SpO2 = 82% but excludes SpO2 = 80%). For the curve-fitting, the categories are coded from 1 to 10. For SpO2, the best-fitted sigmoid curve (a common type of non-linear curve for S-shaped relationships) corresponds to the equation y = 0.980 − 2.138/(1 + e^(−0.266 × (x − 11.073)))

Discussion

The transformation of the SpO2 to the saturation VS improves clinical interpretation, accuracy, and calibration of prediction models. For instance, in our study in children aged <5 years in Bangladesh the saturation VS improved clinical prediction, calibration, and interpretation of the need for hospital admissions. This is not surprising because dichotomizing the SpO2 is ill-suited to decision-making in clinical medicine and resulted in a significant degradation of prediction performance. The saturation VS may also have additional importance in estimating severity of disease and response to treatment (such as oxygen administration) since it incorporates the non-linearity in hemoglobin–oxygen dissociation curves. Small changes in SpO2 on the flat portion of the oxygen saturation curve (near 100%) reflect a much greater change in physiology than the same change at a lower SpO2. In contrast, the saturation VS is linearly related to the changes in the physiological and clinical state and provides more granular information of adverse changes in physiology. For example, on the scale of SpO2, a decrease from 95% to 90% would reflect more impairment in gas exchange, and therefore may indicate more significant change (deterioration in clinical condition), than a decrease from 90% to 85%. This non-linear interpretability of SpO2 is particularly undesirable when it is used as a predictor (e.g., in Amatet et al.16), whether in univariate or multivariate analysis, for interpretations of regression models often involve a description of the average amount of outcome change that will be associated with a given amount of predictor change. Such description is only meaningful if the amount of predictor change is clinically comparable for different baseline values. This has typically been resolved by dichotomizing the SpO2 values. 
However, the use of continuous predictors has been recommended to prevent the information loss and decrease in predictive capability that result from dichotomization.17 Using hypoxemia as a surrogate predictor to improve model interpretability would be unwise in view of the loss in accuracy. Instead, using the saturation VS in lieu of the observed SpO2 as a predictor would not only maintain prediction accuracy but also improve the clinical interpretability and calibration of prediction models. Such benefits are especially valuable in resource-limited settings, where staff training may also be inadequate. The use of equally spaced saturation VS categories may also give clinicians a more intuitive, linear interpretation of sickness severity (e.g., hospital admission rate) that is not achievable with the direct use of the untransformed SpO2.

The major limitation of the proposed transformation is that the derivation makes many assumptions about normal clinical conditions. A change in the saturation VS may be a result of changes in other unmeasured variables in the model and may not be a result of abnormal gas exchange. A further limitation is that this study modeled the outcome of admission, which was not necessarily linked to issues of respiratory compromise. This would be artificially associated with lower AUC values than if modeled using a cohort of children being assessed with a presumed respiratory illness. However, since our predictive variables were all based on oxygen saturation, the comparative differences remain internally valid. Despite these limitations, this approach has the potential to fill an important gap in the utilization of oxygen saturation data in both clinical and research settings. Further validation in clinical settings is therefore required to better define its utility in this context.

In conclusion, the saturation VS, a transformation of SpO2, provides an intuitive measure of hypoxemia. It may prove to be a useful aid in clinical practice when measuring SpO2, and a useful component of clinical prediction models when implemented on electronic devices (such as mobile phones) that can easily perform the required calculation. Further validation is necessary prior to adoption into clinical practice because the derivation of the physiological and saturation VS relies on many assumptions about normal clinical conditions.