Introduction

Avoiding both hypoxemia and hyperoxemia is vital in critically ill patients1,2,3,4. Because hypoxemia poses a risk of tissue hypoxia, adequate oxygen should be prescribed during ventilation. In contrast, hyperoxemia has been associated with increased mortality5 and fewer ventilator-free days6. Pulse oximetry, which measures oxygen saturation (SpO2), is widely used in clinical settings to monitor for hypoxemia. However, hyperoxemia is frequently under-recognized when the arterial oxygen tension (PaO2) exceeds 100 mmHg, because the level of PaO2 is difficult to determine from SpO2 in this range. With regard to oxygen delivery, it is recommended that arterial oxygen saturation (SaO2) be kept within a certain range in mechanically ventilated patients. Recently, a multicentre randomised trial suggested that maintaining SpO2 at 97% might be ideal for managing critically ill patients7. According to the British Thoracic Society guidelines, the target saturation range for acutely ill patients who are not at risk of hypercapnic respiratory failure is 94–98%8. Despite the setting and maintenance of adequate oxygen saturation targets, hyperoxemia has been reported in some critically ill patients9; approximately one in three patients was managed with SpO2 > 97%. One reason for unresolved hyperoxemia is fluctuation in the oxygenation status. In mechanically ventilated patients, even when an adequate SpO2 range is set, SpO2 can easily change with posture, respiratory pattern, and the presence of airway secretions. Thus, it might be difficult to keep the oxygenation status strictly within the ideal SpO2 range manually.

INTELLiVENT-ASV is a ventilation mode available only on Hamilton Medical ventilators. It monitors the patient’s SpO2 and end-tidal carbon dioxide (ETCO2) and automatically controls the delivered fraction of inspired oxygen (FIO2) and minute ventilation10,11. The automatic FIO2 control function can be used with an SpO2 monitoring probe attached to the ventilator. Generally, SpO2 is kept between 93 and 97% in the normal lung setting by adjusting FIO2 between 0.21 and 1.0, but the SpO2 target range can easily be altered depending on each patient’s clinical situation. Because INTELLiVENT-ASV monitors and adjusts oxygen continuously, breath by breath, it can maintain an appropriate SpO2 range more easily than physician-driven control. However, it is unclear how accurately the SpO2 measured during INTELLiVENT-ASV reflects the actual SaO2. Several studies have suggested that SpO2 measurements tend to overestimate the oxygenation status in critically ill patients12,13,14. In addition, few studies have evaluated the relationship between actual SpO2 and SaO2 using either the Nihon Kohden or the Masimo pulse oximeter, which are the only two available for INTELLiVENT-ASV15.

In this study, we aimed to compare the bias and precision of the two pulse oximeters in mechanically ventilated patients in an intensive care unit (ICU) setting. In addition, we evaluated the accuracy of SpO2 measurements in various SaO2 range categories.

Results

Enrolment and baseline characteristics

In total, the results of 1854 blood gas analyses performed in 100 patients ventilated with a HAMILTON G5 ventilator (Hamilton Medical AG, Rhäzüns, Switzerland) were considered for evaluation (Fig. 1). Of these results, the following were excluded: those with missing data on the Nihon Kohden and Masimo SpO2 measurements (n = 268), those in which the Nihon Kohden SpO2 measurements were ≤ 60% (n = 77), those in which the difference between the Nihon Kohden SpO2 measurement and the actual SaO2 was > 10% (n = 5), and those in which the difference between the Masimo SpO2 measurement and the actual SaO2 was > 10% (n = 8).

Figure 1

Patients’ flow chart.

Table 1 shows the characteristics of patients ventilated with a HAMILTON G5 ventilator. The mean number of blood gas analyses per patient was 15.0 ± 15.4, the mean Acute Physiology and Chronic Health Evaluation II score was 24.6 ± 8.3, and the mean duration of mechanical ventilation was 9.0 ± 7.4 days. The arterial oxygen tension (PaO2)/FIO2 ratio was 249.6, categorised into PaO2/FIO2 > 300 (26%), 200 < PaO2/FIO2 ≤ 300 (36%), 100 < PaO2/FIO2 ≤ 200 (22%), and PaO2/FIO2 ≤ 100 (16%). ICU and hospital mortality rates were 5.0% and 16.0%, respectively.

Table 1 Patient characteristics.

Bland–Altman plot comparing SpO2 and SaO2

A Bland–Altman plot was generated to evaluate bias and precision (Fig. 2). Mean SpO2 was 96.4% using the Nihon Kohden and 96.9% using the Masimo monitor. The bias was lower in the Nihon Kohden SpO2 measurements than in the Masimo measurements, although the precision was not significantly different (mean ± SD, 0.72 ± 1.86% vs 1.17 ± 1.68%, respectively: Table 2).

Figure 2

Bland–Altman plot comparing SaO2 and SpO2. The horizontal axis represents (SpO2 + SaO2)/2. The vertical axis represents SpO2 – SaO2. The middle horizontal red line represents bias. The upper horizontal red dotted line represents the upper limits of agreement, and the lower horizontal red dotted line represents the lower limits of agreement. Black dots represent each test result. The left figure portrays Nihon Kohden SpO2, and the right figure portrays Masimo SpO2.

Table 2 Bias and precision for each pulse oximeter (n = 10 000).

We also analysed all raw data, including outliers. The bias improved slightly in the Masimo measurements; however, their precision deteriorated compared to that of the Nihon Kohden measurements (Nihon Kohden vs Masimo, 0.72 ± 2.07% vs 1.08 ± 2.42%) (see Supplementary Table S1 and Supplementary Fig. S1).

Differences between SaO2 and pulse oximeters’ SpO2 among SaO2 categories

Among the three SaO2 range categories (SaO2 < 94%, 94% ≤ SaO2 < 98%, and SaO2 ≥ 98%), we evaluated the differences in SpO2 measurements by both monitors within each group (Fig. 3 and Table 3). In the “SaO2 < 94%” group, there was no significant difference between the Nihon Kohden SpO2 measurements and the Masimo measurements (1.41 ± 2.28% vs 1.74 ± 2.15%, P = 0.083). However, in the “94% ≤ SaO2 < 98%” and “SaO2 ≥ 98%” groups, there were significant differences between the Nihon Kohden SpO2 measurements and the Masimo measurements (0.57 ± 1.77% vs 1.10 ± 1.60%, P < 0.0001; 0.51 ± 1.49% vs 0.84 ± 1.28%, P = 0.006; respectively). In the “SaO2 ≥ 98%” group, however, some Nihon Kohden SpO2 measurements differed from the actual SaO2 by -3% or more, and this difference was larger than that for Masimo SpO2 (Table 3).

Figure 3

Differences in the pulse oximeters’ SpO2 measurements among SaO2 categories. Three SaO2 categories were established (SaO2 < 94%, 94% ≤ SaO2 < 98%, SaO2 ≥ 98%). The vertical axis represents SpO2 – SaO2, the horizontal axis represents Nihon Kohden SpO2 and Masimo SpO2 for each of the three groups. There were significant differences in the groups with SaO2 ≥ 94%. However, there was no significant difference in the SaO2 < 94% group. *: P < 0.005.

Table 3 Differences in SpO2 measurements between pulse oximeters.

Area under the receiver operating characteristic curve to detect SaO2 < 90% and SaO2 ≥ 98%

We evaluated each pulse oximeter’s ability to detect SaO2 < 90%. The areas under the receiver operating characteristic curves (AUCs) were 0.966 using the Nihon Kohden and 0.971 using the Masimo monitor. When the cut-off point was set at an SpO2 of 90% for both monitors, the sensitivity and specificity were 62.8% and 99.0% for Nihon Kohden SpO2 and 56.9% and 99.3% for Masimo SpO2, respectively. There was no statistically significant difference in the AUCs between the two monitors (P = 0.530). Regarding SaO2 ≥ 98%, the AUCs were 0.837 using the Nihon Kohden and 0.835 using the Masimo monitor. When the cut-off point was set at an SpO2 of 98% for both monitors, the sensitivity and specificity were 87.1% and 66.2% for Nihon Kohden SpO2 and 90.8% and 58.7% for Masimo SpO2, respectively. Again, there was no statistically significant difference in the AUCs between the two monitors (P = 0.841; Fig. 4).

Figure 4

AUC comparison between Nihon Kohden SpO2 and Masimo SpO2 for patients with SaO2 < 90% and SaO2 ≥ 98%. The blue line represents Masimo SpO2 measurements, and the red line represents Nihon Kohden SpO2. The vertical axis represents the true positive rate (sensitivity), and the horizontal axis represents the false positive rate (1 − specificity). To detect SaO2 < 90%, the areas under the receiver operating characteristic curves (AUCs) were 0.966 using the Nihon Kohden and 0.971 using the Masimo monitor. When the cut-off point was set at an SpO2 of 90% for both monitors, the sensitivity and specificity were 62.8% and 99.0% for Nihon Kohden SpO2 and 56.9% and 99.3% for Masimo SpO2, respectively. There was no statistically significant difference in the AUCs between the two monitors (P = 0.530). Regarding SaO2 ≥ 98%, the AUCs were 0.837 using the Nihon Kohden and 0.835 using the Masimo monitor. When the cut-off point was set at an SpO2 of 98% for both monitors, the sensitivity and specificity were 87.1% and 66.2% for Nihon Kohden SpO2 and 90.8% and 58.7% for Masimo SpO2, respectively. Again, there was no statistically significant difference in the AUCs between the two monitors (P = 0.841). AUC, area under the receiver operating characteristic curve.

Discussion

In this study, we found that Nihon Kohden SpO2 measurements presented a lower bias than did Masimo SpO2 measurements compared with the actual SaO2. The precision and correlation coefficients of SpO2 measurements were similar in both devices; specifically, these trends were seen in the “SaO2 ≥ 94%” group. In the low SpO2 group, there were no significant differences between SpO2 values measured by both devices.

Pulse oximeters calculate oxygen saturation from the light of an LED transmitted through the body. No theoretical formula for determining SpO2 from transmitted light signals has been established, and each manufacturer uses its own formula. Therefore, the accuracy of SpO2 depends on two factors: the validity of the formula and deviation from the formula. In addition, pulse oximeters should detect the pulsation of arterial blood, but when other pulsatile noise, such as body movement, is detected, it might cause erroneous calculations. To reduce such noise, each manufacturer uses its own technology. The results of this study might be influenced by these technological differences between Nihon Kohden and Masimo SpO2.

To safely perform automatic control in a closed-loop ventilation system, it is essential to collect high-quality data continuously. With the current INTELLiVENT-ASV, two types of SpO2 measuring devices are available: the Masimo and the Nihon Kohden. A previous study reported that Masimo SpO2 measurements tended to overestimate SaO2 compared to Nihon Kohden measurements15. In this study, although the precision of both devices was similar, Masimo SpO2 measurements were higher than Nihon Kohden SpO2 measurements. Specifically, Masimo SpO2 measurements differed significantly from Nihon Kohden SpO2 measurements at SaO2 ≥ 94%. These results suggest that we should be more cautious regarding overestimation of SaO2 when using Masimo SpO2 readings. In this regard, using the Nihon Kohden SpO2 monitor may be preferable with INTELLiVENT-ASV. On the other hand, the SpO2 measured by Nihon Kohden often showed a difference of -3% or more compared to the actual SaO2, and this difference was larger than that of Masimo. This finding should be interpreted cautiously, because it implies that the Nihon Kohden monitor was more likely than the Masimo monitor to misclassify hyperoxemia as normal.

At the low saturation range (SaO2 < 94%), the difference between devices was not statistically significant; concerning hypoxaemia detection, each pulse oximeter presented relatively high AUC values without a significant difference. According to a previous study, there was no significant difference between Nihon Kohden and Masimo SpO2 measurements in the range of 85% < SpO2 ≤ 90%15. Our result is similar, although the relatively small number of analyses in our study might have influenced it. In this study, both devices presented high SpO2 compared with the actual SaO2. In clinical settings, it is important to check the actual SaO2 when SpO2 is higher than expected. The devices also differed in that the Masimo achieved a true positive rate of 90% or more at a false positive rate of 10%, whereas the Nihon Kohden required a false positive rate of 15% to reach a true positive rate of 90%. The small size of this group (n = 51) might partly have affected this result. Because we usually keep SpO2 above 90% in critically ill patients, the number of blood gas analyses in this group was relatively small compared with the other groups. Further examination is needed to evaluate these results in critically ill patients.

It is important to mount SpO2 sensors correctly, especially in an ICU setting. A previous study evaluated four types of pulse oximeters that could accurately detect low perfusion during motion16. Motion impaired the performance of all four oximeters at all ranges, with lesser accuracies observed at the lower SaO2 range. In contrast, at a lower perfusion, only the Nihon Kohden SpO2 measurements remained accurate. In addition, when the probe was not symmetrically placed, SpO2 measurements were inaccurate without an abnormal SpO2 waveform; this phenomenon is called the “penumbra effect”17. To avoid this problem, we selected a seal-type sensor for all patients in this study.

This study has several limitations. First, it was a single-center observational study. All patients evaluated were Asian, except for one American. Because the accuracy of a pulse oximeter is affected by race18, further studies are warranted. Second, the mean SaO2 values were in the optimal range (95.7 ± 2.9%); SaO2 values were distributed at higher levels with respect to hypoxaemia. Thus, it might be difficult to evaluate the accuracy of these monitors in the low SaO2 range group. Furthermore, we did not measure the quality and perfusion indexes of the Masimo SpO2 measurements, which might have affected these results. However, we recorded these data when the SpO2 value was stable. Third, although we defined an outlier as a difference of > 10% between SpO2 and SaO2 values, the validity of this definition is unclear. However, we re-analysed the set of raw data, including these outliers, and found that the results were not affected by the exclusion of outliers. Fourth, we did not evaluate Masimo SpO2 measurements with the sensor mounted on the G5, because it was difficult to mount both SpO2 sensors on the G5 simultaneously. However, the mechanics of each SpO2 sensor were almost the same, with or without mounting on the G5. Finally, we could not continuously evaluate whether the SpO2 sensors were mounted correctly. Specifically, we did not consider the perfusion index of the Masimo. Also, we did not consider covariates such as age, gender, and/or diseases. Further studies are needed to evaluate accuracy based on probe attachment.

Despite these limitations, this study also has several strengths. First, this is the first prospective observational study evaluating the bias and precision of pulse oximeters in critically ill, mechanically ventilated patients on a Hamilton ventilator. In addition, we evaluated the oxygenation status with respect to SpO2 ≥ 98% and SpO2 < 90%. To keep SaO2 strictly in the recommended range, it might be better to use Nihon Kohden SpO2 measurements because of their smaller bias compared to the Masimo SpO2 measurements.

Conclusions

We found that Nihon Kohden SpO2 measurements presented lower bias than Masimo SpO2 measurements compared with the actual SaO2. The precision of SpO2 measurements was relatively similar with both devices. This study suggests that when using INTELLiVENT-ASV and selecting automatic control of oxygenation in mechanically ventilated patients, it is preferable to use Nihon Kohden SpO2 monitoring.

Methods

Study design and setting

This was a single-center, prospective, observational study conducted in the general ICU of a university hospital (Tochigi, Japan) from June 2017 to November 2018 (UMIN000027671). Patients who were ventilated with a G5 ventilator (Hamilton Medical AG, Switzerland) in the ICU were included in this study. Clinical decisions, including changing ventilation mode, were made at the discretion of the attending ICU physicians. The study protocol was approved by the Institutional Research Ethics Committee of Jichi Medical University Hospital (A18-110). Written informed consent was obtained from each participant or nearest relative when the patient was incapacitated or unconscious. All methods were performed in accordance with the relevant guidelines and regulations.

Participants

Patients were eligible for enrolment if they were ≥ 20 years old and ventilated with a G5 during their ICU stay. The exclusion criteria were age < 20 years, quality index of Nihon Kohden SpO2 ≤ 60%, an unstable value of SpO2 due to change in FIO2 just before the blood sample was drawn, lack of SpO2 data, and a history of management with veno-arterial extracorporeal membrane oxygenation. In addition, patients with a difference of > 10% in SpO2 and SaO2 values were excluded due to the possibility of outliers. For validation purposes, we added the analysis of the raw data, including data of the possible outliers.
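The measurement-level exclusions described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the record keys (`nk_spo2`, `masimo_spo2`, `nk_quality_index`, `sao2`) are hypothetical, and the > 10% criterion is assumed to refer to the absolute difference in percentage points. Patient-level criteria (age, veno-arterial extracorporeal membrane oxygenation) and the check for a stable SpO2 before sampling are not modelled.

```python
def is_eligible(record):
    """Return True if one paired measurement passes the measurement-level exclusions.

    `record` is a dict with hypothetical keys: nk_spo2, masimo_spo2,
    nk_quality_index, and sao2 (all in percent; SpO2 may be None if missing).
    """
    nk = record.get("nk_spo2")
    masimo = record.get("masimo_spo2")
    if nk is None or masimo is None:
        return False  # lack of SpO2 data
    if record["nk_quality_index"] <= 60:
        return False  # Nihon Kohden quality index of 60% or lower
    sao2 = record["sao2"]
    # Assumption: "difference of > 10%" means an absolute difference in points.
    if abs(nk - sao2) > 10 or abs(masimo - sao2) > 10:
        return False  # treated as a possible outlier
    return True
```

Applying the same function with the outlier check removed would correspond to the supplementary analysis of the raw data.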

Patient baseline characteristics, including age, sex, height, body weight, body mass index, disease classification, and the use of veno-venous extracorporeal membrane oxygenation, were collected from electronic medical records. In addition, the results of blood gas analyses, including pH, PaO2, arterial carbon dioxide tension, haemoglobin, and SaO2 values, were recorded. Furthermore, the PaO2/FIO2 ratio and severity of hypoxaemia were determined. Underlying medical histories were obtained, including information on hypertension, ischaemic heart disease, chronic heart failure, chronic obstructive pulmonary disease, cerebrovascular accidents, diabetes mellitus, and chronic kidney disease requiring haemodialysis. The Acute Physiology and Chronic Health Evaluation II19 was used to assess organ dysfunction. ICU stays and mechanical ventilation durations were evaluated. In addition, ICU and hospital mortality rates were assessed.

Pulse oximeter and blood gas analysis measurements

SpO2 was simultaneously measured by the TL-271T3 (Nihon Kohden, Tokyo, Japan) attached to the Hamilton G5 ventilator and by the RD SET™ NEO (Masimo, Irvine, CA, USA). Seal-type sensor probes were used and attached to the same hand of the patient. We replaced the sensors at least three times a day, and both sensors were replaced at the same time during the study period. The selection of fingers at sensor replacement was left to the bedside nurse’s decision. The number of blood draws for blood gas analysis was recorded, and the SaO2 and SpO2 were compared. The timing of blood gas analyses was at the physician’s discretion. Blood samples were immediately transferred and analysed by the RAPIDLAB1265 (Siemens Healthcare Diagnostics Inc., Tarrytown, NY, USA) or the ABL800 FLEX device (Radiometer Medical ApS, Denmark).

Statistical analysis

The relationship between measured SaO2 and SpO2 was evaluated using a Bland–Altman plot. Bias was calculated as the mean difference between the SpO2 from each pulse oximeter and the actual SaO2. Precision was determined as the standard deviation of these differences. The bias and precision of the differences between SpO2 and SaO2, upper and lower limits of agreement, and mean values of SpO2 were calculated using bootstrapping methods. Random resampling with replacement was repeated up to 10,000 times.
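The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the JMP procedure used in the study. Two points are assumptions: the limits of agreement are taken as bias ± 1.96 SD, the conventional Bland–Altman definition, and the bootstrap estimate is taken as the average of each statistic over the resamples. The function names and the `seed` parameter are hypothetical.

```python
import random
import statistics


def bland_altman(spo2, sao2):
    """Return (bias, precision, lower LoA, upper LoA) for paired readings."""
    diffs = [p - a for p, a in zip(spo2, sao2)]  # SpO2 - SaO2 for each pair
    bias = statistics.mean(diffs)                # mean difference
    precision = statistics.stdev(diffs)          # sample SD of the differences
    # Assumed conventional 95% limits of agreement: bias +/- 1.96 SD.
    return bias, precision, bias - 1.96 * precision, bias + 1.96 * precision


def bootstrap_bland_altman(spo2, sao2, n_boot=10_000, seed=0):
    """Average the four statistics over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(spo2)
    totals = [0.0, 0.0, 0.0, 0.0]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pair indices
        stats = bland_altman([spo2[i] for i in idx], [sao2[i] for i in idx])
        totals = [t + s for t, s in zip(totals, stats)]
    return tuple(t / n_boot for t in totals)
```

Resampling the index rather than the two lists separately keeps each SpO2 reading paired with its own SaO2 value, which the difference-based statistics require.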

The ability to detect SaO2 ≥ 98% was assessed using an AUC because an SpO2 of 97% is the default upper limit of the SpO2 target range on INTELLiVENT-ASV20. The ability to detect hypoxemia (SaO2 < 90%) was also assessed using an AUC.
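As an illustration of these detection analyses, the sketch below computes an AUC and the sensitivity and specificity at a fixed cut-off in Python. It is not the JMP procedure used in the study. The AUC is computed through its rank-based interpretation (the probability that a positive case scores higher than a negative case, with ties counted as one half), and the function names are hypothetical. How the cut-off is applied (for example, SpO2 < 90% versus SpO2 ≤ 90%) is left to the caller because the exact rule is not stated here.

```python
def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(predicted, labels):
    """Sensitivity and specificity of boolean predictions against boolean labels."""
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    tn = sum(1 for p, y in zip(predicted, labels) if not p and not y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    return tp / (tp + fn), tn / (tn + fp)
```

For hypoxemia detection, a lower SpO2 should indicate a positive case, so the negated SpO2 would be passed as the score (labels: SaO2 < 90%). For SaO2 ≥ 98%, the SpO2 itself serves as the score.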

To evaluate the differences in oxygenation, we established three categories of SaO2 ranges (SaO2 < 94%, 94% ≤ SaO2 < 98%, and SaO2 ≥ 98%). Among these groups, differences between SpO2 and SaO2 were evaluated using Student’s t-test. All analyses were performed using JMP Pro 15 (SAS Institute Inc., Cary, NC, USA). Data are presented as means ± standard deviations (SD) or as percentages where appropriate. P-values < 0.05 were considered statistically significant.
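The grouping and comparison described above can be sketched in Python as follows. This is a minimal illustration, not the JMP analysis used in the study. It assumes the two-sample (pooled-variance) form of Student’s t-test applied to the SpO2 − SaO2 differences of the two monitors within one category; whether a paired or unpaired test was used is not stated. The function names are hypothetical, and the P-value, which would be read from a t distribution with the returned degrees of freedom, is not computed.

```python
import math
import statistics


def sao2_category(sao2):
    """Assign an SaO2 value (percent) to one of the three study categories."""
    if sao2 < 94:
        return "SaO2 < 94%"
    if sao2 < 98:
        return "94% <= SaO2 < 98%"
    return "SaO2 >= 98%"


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees of freedom) for two lists of differences,
    e.g. Nihon Kohden and Masimo SpO2 - SaO2 within one category.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2
```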

Prior presentations

We presented preliminary data from this article at the European Society of Intensive Care Medicine 31st Annual Congress in Paris, France, 17 August 2018.