Accuracy of zero-heat-flux thermometry and bladder temperature measurement in critically ill patients

Core temperature (TCore) monitoring is essential in intensive care medicine. Bladder temperature is the standard of care in many institutions, but it cannot be measured in all patients. We therefore compared core temperature measured with a zero-heat-flux thermometer (TZHF) and with a bladder catheter (TBladder) against blood temperature (TBlood) as a gold standard in 50 critically ill patients in a prospective, observational study. Every 30 min, TBlood, TBladder and TZHF were documented simultaneously. Bland–Altman statistics were used for interpretation. 7018 pairs of measurements for the comparison of TBlood with TZHF and 7265 pairs of measurements for the comparison of TBlood with TBladder could be used. TBladder represented TBlood more accurately than TZHF. In the Bland–Altman analyses the bias was smaller (0.05 °C vs. −0.12 °C) and the limits of agreement were narrower (0.64 °C to −0.54 °C vs. 0.51 °C to −0.76 °C), but not by clinically meaningful amounts. In conclusion, the results for zero-heat-flux and bladder temperatures were virtually identical within about a tenth of a degree, although TZHF tended to underestimate TBlood. Therefore, either is suitable for clinical use. German Clinical Trials Register, DRKS00015482, Registered on 20th September 2018, http://apps.who.int/trialsearch/Trial2.aspx?TrialID=DRKS00015482.


Scientific Reports | (2020) 10:21746 | https://doi.org/10.1038/s41598-020-78753-w | www.nature.com/scientificreports/

fever. However, to date no large comparison in critically ill patients has been performed with an accepted gold standard like blood temperature. There is only one study available that has included a small number of patients with blood temperature as a reference method 26. The aim of this study was to compare T Core measured with a ZHF-thermometer (T ZHF) and with T Bladder against a gold standard T Blood measured in the iliac artery or pulmonary artery to determine if the new ZHF-thermometer is more accurate than T Bladder.

Methods
The current prospective clinical study was conducted in accordance with the Declaration of Helsinki at the University Hospital of Göttingen, Germany, after obtaining local ethics committee approval (Ethics committee of the University Medicine Göttingen, Application number: 13/05/18) for the experimental protocol and registration in the German Clinical Trials Register (DRKS00015482). In accordance with the approval of the local ethics committee, we used deferred (proxy) consent in emergency critical care research 17, as the study was entirely non-invasive and observational. If patients were able to give informed written consent, this consent was used. If informed proxy consent was necessary, it was given in writing by the proxy. We did not exclude patients who did not recover and died during their hospital stay. The local ethics committee had approved this procedure. The article adheres to the STROBE guidelines 18.
Critically ill adult patients who already had a bladder catheter with a temperature probe and either an arterial catheter with a temperature probe placed in the iliac artery (Pulsiocath Arterial Thermodilution Catheter 5F; Pulsion Medical Systems AG, Munich, Germany) or a pulmonary artery catheter (Arrow Hands-Off Thermodilution catheter 7F; Arrow International, Athlone, Ireland) in place were included in this study. The only exclusion criteria were pregnancy and refusal to take part in the study.
In all patients, core temperature was measured additionally with a single-use, continuous, non-invasive ZHF-thermometer (3M SpotOn Temperature Monitoring System, 3M, St. Paul, MN, USA) attached to the lateral forehead of the patients.
Every 30 min, T Core measured as T Blood, T Bladder and T ZHF was documented at the same time points until the patient lost the T Blood sensor or T Bladder sensor, left the intensive care unit (ICU), or, at the latest, after 5 days. If data from one temperature source were missing, the data pair was not used for comparison. In addition to the temperature data, age, weight, height, sex and medical diagnosis at admission to the ICU were documented.
As the primary statistical method, Bland-Altman statistics were used for interpretation of accuracy (bias = mean difference between methods) and precision (limits of agreement = bias ± 1.96 standard deviations of the differences), using the Bland and Altman random effects method for repeated measures data adjusted for unequal numbers of measurements per patient 19. Additionally, we calculated the proportion of all differences that were within ± 0.5 °C or ± 1 °C of T Blood.
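The calculation described above can be illustrated with a minimal sketch. It assumes the differences (device minus T Blood) are grouped per patient and estimates the within-patient and between-patient variance components from a one-way analysis of variance, which is one common reading of the Bland and Altman approach for repeated measures with unequal numbers of measurements. The function names, the data layout and the patient identifiers are hypothetical, and this is not necessarily the exact computation used in the study.

```python
from math import sqrt


def bland_altman_repeated(diffs_by_patient):
    """Bias and 95% limits of agreement for repeated measures.

    diffs_by_patient maps a (hypothetical) patient id to that patient's list
    of differences, device minus reference, in degrees Celsius.
    """
    groups = [g for g in diffs_by_patient.values() if g]
    n = len(groups)                      # number of patients
    counts = [len(g) for g in groups]    # measurements per patient
    total = sum(counts)
    if n < 2 or total <= n:
        raise ValueError("need >= 2 patients and some repeated measurements")

    means = [sum(g) / len(g) for g in groups]
    bias = sum(sum(g) for g in groups) / total  # overall mean difference

    # One-way ANOVA of the differences with patient as the factor.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ss_between = sum(c * (m - bias) ** 2 for c, m in zip(counts, means))
    ms_within = ss_within / (total - n)
    ms_between = ss_between / (n - 1)

    # Divisor that accounts for unequal numbers of measurements per patient.
    lam = (total ** 2 - sum(c * c for c in counts)) / ((n - 1) * total)
    var_between = max(0.0, (ms_between - ms_within) / lam)

    sd_total = sqrt(var_between + ms_within)
    return bias, bias - 1.96 * sd_total, bias + 1.96 * sd_total


def proportion_within(diffs, tolerance):
    """Fraction of differences whose absolute value is <= tolerance."""
    return sum(abs(d) <= tolerance for d in diffs) / len(diffs)
```

Called with a dictionary such as `{"p1": [0.1, 0.3], "p2": [-0.3, -0.1]}`, the first function returns the bias together with the lower and upper limit of agreement; `proportion_within(diffs, 0.5)` gives the share of differences within ± 0.5 °C.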
For each of the two measurement modalities sensitivity, specificity, positive and negative predictive values for the detection of hypothermia and fever were calculated. Hypothermia was defined as a T Blood < 36 °C and fever was defined as T Blood > 38.3 °C 10 .
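As an illustration of these definitions, the following sketch counts true and false positives and negatives for paired readings. It assumes that the same threshold is applied to the device reading as to T Blood; the function names and the example values in the usage note are hypothetical and do not come from the study data.

```python
def is_hypothermia(temp_c):
    """Hypothermia as defined in the text: temperature below 36 degrees C."""
    return temp_c < 36.0


def is_fever(temp_c):
    """Fever as defined in the text: temperature above 38.3 degrees C."""
    return temp_c > 38.3


def diagnostic_metrics(reference, device, is_positive):
    """Sensitivity, specificity, PPV and NPV of device against reference.

    reference and device are equally long sequences of paired temperatures;
    is_positive is the condition test applied to both (an assumption).
    """
    tp = fp = tn = fn = 0
    for ref, dev in zip(reference, device):
        truth, test = is_positive(ref), is_positive(dev)
        if truth and test:
            tp += 1
        elif truth and not test:
            fn += 1
        elif not truth and test:
            fp += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # None marks a metric that is undefined for this data set.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

For example, `diagnostic_metrics(t_blood, t_zhf, is_hypothermia)` evaluates the detection of hypothermia, and passing `is_fever` instead evaluates the detection of fever.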
Additionally, we performed an error grid analysis 20 to determine if some measurement differences would lead to wrong clinical decisions. The zones were defined as follows: Zone A begins with an area of a ± 0.5 °C error on either side of a perfectly accurate measurement between T Blood and the temperature measured by T ZHF or T Bladder. Measurement errors smaller than ± 0.5 °C are considered by most authors as clinically not relevant. In addition, if both measurements indicate hypothermia < 36 °C or fever > 38.3 °C, the absolute error is considered to be clinically irrelevant because the same treatment or diagnostic workup will be initiated.
Zone B describes the zone where measurement errors are > 0.5 °C but will not result in a wrong clinical decision. For example, if T Blood is 36.5 °C and T ZHF shows a temperature of 37.4 °C, neither temperature will lead to active warming therapy or a diagnostic workup for infection.
In contrast, Zone C indicates errors larger than 0.5 °C that will lead to wrong clinical decisions and may harm the patient. For example, if T Blood is 34 °C and T ZHF shows 37 °C, the patient will not receive active warming although this would be indicated.
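The zone definitions above can be written as a small classification rule. The sketch below is one possible reading: it assumes that a "wrong clinical decision" means that T Blood and the device reading fall into different categories (hypothermia, normal, fever), and that an error of exactly 0.5 °C still counts as Zone A. The function and constant names are hypothetical.

```python
HYPOTHERMIA_BELOW = 36.0     # degrees C, threshold from the text
FEVER_ABOVE = 38.3           # degrees C, threshold from the text
MAX_IRRELEVANT_ERROR = 0.5   # degrees C, error regarded as not relevant


def clinical_state(temp_c):
    """Map a temperature to the category that drives the clinical decision."""
    if temp_c < HYPOTHERMIA_BELOW:
        return "hypothermia"
    if temp_c > FEVER_ABOVE:
        return "fever"
    return "normal"


def error_grid_zone(t_blood, t_device):
    """Return 'A', 'B' or 'C' for one pair of measurements."""
    # Zone A: small error, whatever the categories.
    if abs(t_device - t_blood) <= MAX_IRRELEVANT_ERROR:
        return "A"
    state_blood = clinical_state(t_blood)
    if state_blood == clinical_state(t_device):
        # Larger error but same category: Zone A if both readings indicate
        # hypothermia or both indicate fever, Zone B if both are normal.
        return "B" if state_blood == "normal" else "A"
    # Larger error and different categories (assumed wrong decision).
    return "C"
```

With these rules, the two examples from the text come out as expected: `error_grid_zone(36.5, 37.4)` falls into Zone B and `error_grid_zone(34.0, 37.0)` into Zone C.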

Results
55 potentially eligible patients were screened. Three patients could not be enrolled because we could not obtain proxy consent and two patients were not included due to technical problems. The remaining 50 patients were enrolled. 36 patients (72%) were male, 14 (28%) were female. Mean age was 61.9 (± 16.8) years, mean height was 1.75 (± 0.07) m, mean weight was 86.4 (± 36.3) kg resulting in a mean body mass index of 28.2 (± 11.3) kg/m 2 . Of these patients 16 were suffering from sepsis, 18 patients had neurologic injury (subarachnoid hemorrhage, intracerebral hemorrhage), 6 patients had trauma, 4 patients had respiratory failure, 2 patients had accidental hypothermia, 3 patients had cardiac surgery, and 1 patient had visceral surgery. Of all 50 patients 49 had an arterial catheter with a temperature probe placed in the iliac artery and one patient had a pulmonary artery catheter with temperature probe. No patient was excluded from the study after enrolment.
In total, 3970.5 h were recorded. 7665 T Blood values, 7086 values of T ZHF and 7358 T Bladder values were documented. 276 T Blood values, 855 values of the ZHF-thermometer and 583 T Bladder values were missing. The major reason for missing values was a disconnection of the temperature probes for transportation of the patient to the CT, OR, neuroradiology suite, or to the cardiac catheter lab. After these procedures the devices were often not reconnected immediately. Only 17 temperature values of T Bladder and 16 values of T ZHF were missing due to technical problems. 12 values below 30 °C could not be recorded by the ZHF-thermometer because the device did not give a reading at these low temperatures.
Bland-Altman analysis. Bias between T ZHF and T Blood was −0.12 °C with an upper limit of agreement of 0.51 °C and a lower limit of agreement of −0.76 °C (Fig. 1). Bias between T Bladder and T Blood was 0.05 °C with an upper limit of agreement of 0.64 °C and a lower limit of agreement of −0.54 °C (Fig. 2).
Proportion of differences within ± 0.5 °C and ± 1 °C. The proportion of differences within ± 0.5 °C of T Blood was 90.98% for T ZHF and 95.99% for T Bladder and the proportion of differences within ± 1.0 °C of T Blood was 98.99% for T ZHF and 99.01% for T Bladder .
Sensitivity, specificity, positive and negative predictive values. The calculated sensitivity, specificity, positive and negative predictive values for the detection of hypothermia and fever are shown in Table 1.
Error grid analysis. Error grid analysis showed that 91.6% of all T ZHF measurements were clinically not different from T Blood, or would still lead to the same treatment or diagnostic workup. In 6.2%, measurement errors were > 0.5 °C, but the result would not lead to a wrong clinical decision. Only 2.2% of the measurements would lead to wrong clinical decisions (Fig. 3). Error grid analysis of T Bladder showed that 96.3% of all measurements were clinically not different from T Blood or would still lead to the same treatment or diagnostic workup. In 2.4%, measurement errors were > 0.5 °C, but this would not result in a wrong clinical decision. Only 1.3% of the measurements would lead to wrong clinical decisions (Fig. 4).
Adverse events. The ZHF-thermometer sensors were well tolerated in all patients and no burn or skin reaction was observed during the study period.

Discussion
In this study with critically ill patients, T Bladder represented T Blood more accurately than T ZHF. In the Bland-Altman analyses the bias was smaller and the limits of agreement were narrower. The proportion of differences within ± 0.5 °C of T Blood was higher, and there were fewer values in Zones B and C of the error grid analysis. In addition, the ZHF-thermometer failed to record core temperatures below 30 °C. However, compared to the published results for other non-invasive thermometers like infrared tympanic membrane thermometers, temporal artery thermometers, or axillary thermometers 12, the ZHF-thermometer is more accurate.
Interpretation of our results. The results of the Bland-Altman analysis of T Bladder were comparable to the results that were obtained by Nierman 21 and slightly different from the results of Lefrant et al. 22, who observed a bias of −0.21 °C and narrower limits of agreement of ± 0.20 °C. In general, the high level of accuracy of T Bladder is remarkable because oliguria, which is very frequent in ICU patients, reduces the accuracy of T Bladder measurements in operative patients 23,24. On the other hand, in critically ill patients, oliguria does not seem to influence the accuracy of bladder temperature very much 10.
The results of the Bland-Altman analysis of the ZHF-thermometer were slightly better than the results that were obtained by Eshraghi et al. 13 before and after cardiopulmonary bypass, and in the same range as found by Mäkinen et al. 15 during cardiac surgery when the patients were off cardiopulmonary bypass. During surgery with slow temperature changes, Boisson et al. 14 obtained better results, with a bias to T Blood of −0.1 °C and limits of agreement of ± 0.4 °C.
The proportion of differences within ± 0.5 °C of T Blood was 84% in the study of Eshraghi 13 and 94% in the study of Boisson 14. Our result of 91% is also in that range. Other studies that have evaluated the ZHF-thermometer in critically ill patients did not compare it to a gold standard and are therefore of limited value for the comparison with our results 25,26.
The question is whether the accuracy of the ZHF-thermometer is still good enough for use in the ICU. Many studies that compare new temperature monitoring devices with a gold standard use the definition that the combined inaccuracy (bias and limits of agreement) should be smaller than 0.5 °C 27 for a device to be accurate enough. In our opinion this requirement is very demanding, and most of the studies that have investigated new non-invasive thermometers 13,28,29 did not find an accuracy that met this criterion. Still, they came to the conclusion that the new devices agree sufficiently for clinical practice 13,28,29.
Another possibility is to look at the proportion of differences within the range of ± 0.5 °C of the T Blood . In our study 91% of all measurement values of the ZHF-thermometer were within the range of ± 0.5 °C of T Blood and 99% were within the range of ± 1 °C. That seems to be acceptable.
Another interesting way of interpreting the results is the error grid analysis 20. In this analysis 91.6% of the values of the ZHF-thermometer led to the right clinical decision and only 2.2% of the measured values would lead to wrong clinical decisions. This seems to be sufficient, especially because T Core changes do not require an immediate change in therapy within the next minutes. However, it has been argued that no single measurement value should be in Zone C, as this will lead to wrong clinical decisions 20. This seems to be very demanding. If we accepted this, methods like non-invasive blood pressure measurement or pulse oximetry would have to be abandoned immediately.

Limitations of the study
In most studies comparing temperature measurement devices there are many data pairs per subject, and the number of data pairs per patient is not equal. This can induce random effects because there are independent influences of the different patients and there are influences of time within each individual patient. The repeated measurements of one patient are therefore not totally independent. To account for this effect, we used the Bland and Altman random effects method for repeated measures data adjusted for unequal numbers of measurements per patient 19.
A potential limitation of the methods used is the use of error grid analysis. This method has not been used for the comparison of temperature measurement devices before. Error grid analysis is highly dependent on the zones, which are by definition arbitrarily defined. In this study the zones were defined by the authors a priori using published and well accepted definitions. Zone A was defined as an area of a ± 0.5 °C error on either side of a perfectly accurate measurement between T Blood and T ZHF or T Bladder, because measurement errors smaller than ± 0.5 °C are considered by most authors as clinically not relevant. In addition, if both measurements indicate hypothermia < 36 °C or fever > 38.3 °C, the absolute error is considered to be clinically irrelevant because the same treatment or diagnostic workup will be initiated. It can be argued that there is a clinically relevant difference between 35.0 °C and 26 °C, although this would still lead to a data point in Zone A. However, it is extremely difficult to define thresholds for this situation. In addition, we did not observe this.
Another potential limitation of our study is that we studied a relatively small population of only 50 patients. However, on average every patient was monitored for about 3.3 days, resulting in an average of about 140 measurement points per patient.
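These averages can be retraced from the totals reported in the Results and the abstract. The short check below assumes that one set of readings was due for every 30 min of recording time; the variable names are illustrative only.

```python
hours_recorded = 3970.5
patients = 50

# Assumption: one time point per 30 min of recorded time.
time_points = hours_recorded * 2

documented = {"T Blood": 7665, "T ZHF": 7086, "T Bladder": 7358}
missing = {"T Blood": 276, "T ZHF": 855, "T Bladder": 583}

# Documented plus missing values should add up to the number of time points.
totals = {name: documented[name] + missing[name] for name in documented}

days_per_patient = hours_recorded / patients / 24
zhf_pairs_per_patient = 7018 / patients       # usable T Blood / T ZHF pairs
bladder_pairs_per_patient = 7265 / patients   # usable T Blood / T Bladder pairs

print(time_points, totals)
print(round(days_per_patient, 2), round(zhf_pairs_per_patient, 1),
      round(bladder_pairs_per_patient, 1))
```

Under this assumption the three totals agree with the number of time points, and the averages come out at roughly 3.3 days and 140 to 145 usable data pairs per patient.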
Another potential limitation is that we studied a mixed ICU patient population. This may also be seen as an advantage because we measured different patients in different states of critical illness and with different influences like renal replacement therapy (RRT) or extracorporeal membrane oxygenation (ECMO). Patients undergoing targeted temperature management after cardiopulmonary resuscitation, who might be an interesting and challenging cohort in which T Core measurement is of utmost importance, are missing from our population. This may limit the generalizability of the study results.
In some of our patients the gold standard blood temperature may have been distorted by the rapid infusion of unwarmed fluids or by extracorporeal devices like RRT or ECMO. It is well known that a rapid infusion of unwarmed or cold fluids can lower blood temperature temporarily. This effect is typically used for the measurement of cardiac output with a pulmonary artery catheter. It varies depending on the temperature, amount, and rate of the fluid given. Initiation of RRT also temporarily changes blood temperature by a small amount, but a stable running RRT does not lead to changes in blood temperature. The same is probably true for ECMO. Infusion of intravenous fluids and RRT are typical measures in the ICU, and it is not possible to exclude patients who need intravenous fluids. In our patient group 17 patients had RRT and 2 patients had ECMO. This may have contributed to the observed inaccuracy of the ZHF-thermometer and T Bladder. Another potential problem is that the analogue data transfer from the ZHF-thermometer to the general ICU monitoring may have introduced an additional error.
We also did not observe many measurements of temperatures above 39 °C. Therefore, it is not possible to draw any conclusions about the accuracy of the devices in this extremely high temperature range.
The use of vasopressor therapy and especially the use of high dose vasopressor therapy may also influence the accuracy of the ZHF-thermometer. Unfortunately, we did not look at this potential source of inaccuracy. This might be investigated in another study.
We also did not measure the urine output of our patients; therefore, a correlation between urine output and the accuracy of T Bladder cannot be assessed.
Some studies have used more complex statistical methods 29 like population analysis 30 . However, very often these complex analyses do not add very much new information about the accuracy of the studied devices. We included sensitivity, specificity, positive and negative predictive values for the detection of hypothermia and fever for both methods because this has not been done yet. We also included an error grid analysis because this kind of analysis may be clinically useful although the definition of the three zones in that error grid analysis can be discussed.

Conclusion
In conclusion, the results for zero-heat-flux and bladder temperatures were virtually identical within about a tenth of a degree, although T ZHF tended to underestimate T Blood. Therefore, either is suitable for clinical use, and the ZHF-thermometer can be used if bladder temperature is not available.

Data availability
The datasets used for the analysis in the current study are available from the corresponding author on reasonable request.