Introduction

Intraoperative hypothermia is associated with significant morbidity and mortality. Its effects on coagulation and blood loss, surgical wound healing, and recovery time are well documented1,2,3,4,5,6. Even moderate inadvertent hypothermia has been shown to impair coagulation mechanisms2,7. Furthermore, hypothermia is causally associated with postoperative shivering. Hypothermia is not only commonly described as one of the most uncomfortable immediate postoperative experiences but also increases oxygen consumption, thereby increasing the risk of cardiovascular complications8,9,10.

Numerous potential causes of inadvertent perioperative hypothermia have been discussed. They can be subdivided into surgical factors, anesthesiologic factors, environmental factors (e.g., operating room temperature), and patient characteristics11,12. Surgical factors include the magnitude of the procedure, laparoscopic vs. open surgery, blood loss, and the cooling effects of evaporating wound disinfectants13,14. Anesthetic factors include the effects of intravenous and inhaled anesthetics and neuraxial anesthesia on thermoregulatory homeostasis and the shivering threshold, as well as the redistribution of cold blood from the body’s periphery to the patient’s core during general anesthesia induction15,16,17,18,19,20,21,22. Furthermore, several patient characteristics have been associated with perioperative hypothermia in smaller-scale studies, including sex, age, body mass index (BMI), diabetes, and hypothyroidism12,23,24,25, and a mixture of various pre- and intraoperative factors was identified in a larger-scale study26.

Consequently, several expert panels have published recommendations for temperature measurements and perioperative thermal management. For example, according to the National Institute for Health and Care Excellence (NICE) guidelines and the Surgical Care Improvement Project, patients’ core temperatures should be measured and active body-surface warming systems (ABWS) should be applied in general anesthesia with a duration > 30 min27,28.

However, Sun et al. demonstrated in a large cohort that hypothermia occurred frequently during the first hour of anesthesia even in actively warmed patients. Nearly 50% of all patients in their study had a continuous core temperature below 36 °C, and 20% were below 35.5 °C for more than 1 hour29.

Given the high incidence of mild to moderate perioperative hypothermia despite the use of warming devices30,31,32, only the optimally allocated use of a combination of thermoregulatory interventions for patients at risk is likely to further increase the rate of perioperative normothermia.

In the present study, we developed three multivariable prediction models for hypothermia based on different levels of information that are available, since healthcare professionals have different degrees of knowledge about their patients at different timepoints. The first model contains only basic demographic data, the second adds vital signs data acquired immediately before induction of anesthesia, and the third incorporates additional data from the preanesthetic clinic (PAC). All three models enable the prediction of the minimum temperature during surgery and, consequently, the prediction of a decrease in core temperature below a certain threshold and do not rely on any intraoperative data unknown before start of anesthesia. The prediction models were validated by assessing the discrimination and calibration within a test set. Additionally, since the developed algorithm is complex, a web application was developed that can be easily accessed by care providers, to deliver the thermoregulatory risk estimation based on preoperative parameters.

Methods

Study population

With the approval of the ethics committee, intraoperative temperature data as well as baseline characteristics and potential predictors of all surgery cases at the Vienna General Hospital between September 26, 2013, and May 24, 2019 were extracted from the IntelliSpace Critical Care and Anesthesia (ICCA; Philips GmbH Healthcare, Vienna, Austria) database and the Vienna General Hospital information management system (AKIM; Allgemeines Krankenhaus Informationsmanagement) (Siemens AG Österreich, Vienna, Austria). After acquisition, patient data were anonymized, cleaned, and stored in a database.

Cases with invalid temperature measurements, surgeries with an anesthesia duration shorter than 60 min, and patients undergoing therapeutic hypothermia (e.g., cardiac surgery) or therapeutic hyperthermia (e.g., hyperthermic intraoperative peritoneal chemotherapy) were excluded. In addition, cases were excluded when temperature monitoring started later than 45 min after entering the operating room, when it was interrupted for more than 30 min, or when the temperature monitoring duration was less than 30 min. Patients were also excluded if the surgery lasted less than 60 min or more than 1000 min or if no active intraoperative warming therapy was used. Only patients aged 18 years or older were included in the study. The standard operating room temperature at the Vienna General Hospital was set to 21 °C. Patients did not receive prewarming due to infrastructural limitations at our hospital and were typically transferred directly from the ward to the operating rooms with only a short stop in the holding area.

Ethics

The study protocol complies with the Declaration of Helsinki and was approved by the Ethics Committee of the Medical University of Vienna, Austria (EK 1062/2019; Chairperson: Prof. Jürgen Zezula; on February 19, 2019) with waiver of informed consent.

Predictors

Preoperative data, initial vital signs measured before induction of anesthesia, and known comorbidities recorded in the PAC were available and pre-selected based on discussions with anesthesiologists. Supplementary Table S1 lists all potential predictors that were considered in the analysis. To limit the influence of outliers, patient weights below the 0.5th percentile or above the 99.5th percentile were set to the respective percentile. We created the binary predictor “laboratory values available” indicating whether any blood parameters were measured before surgery. Expectation of “High intravenous fluid turnover/bleeding” was derived from nursing procedures before the start of surgery, such as the preparation of fluid warming systems and blood salvage systems. For vital signs (systolic blood pressure, oxygen saturation, and heart rate), the first valid measurements were taken.
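The weight truncation described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' code: the function names are hypothetical, and the percentile definition (linear interpolation between order statistics) is one common convention that may differ from the one used in the original analysis.

```python
from typing import List, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Percentile q (0-100) by linear interpolation between order statistics.

    Assumption: this interpolation rule is one common convention; the
    study does not state which percentile definition was used.
    """
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def truncate_weights(weights: Sequence[float],
                     lower_q: float = 0.5,
                     upper_q: float = 99.5) -> List[float]:
    """Set values outside [lower_q, upper_q] percentiles to those percentiles."""
    ordered = sorted(weights)
    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    return [min(max(w, low), high) for w in weights]


# Hypothetical data: 201 weights from 1 to 201 (not real patient values).
example = truncate_weights(list(range(1, 202)))
```

In this toy example the smallest and largest values are pulled in to the 0.5th and 99.5th percentiles, while all values in between are unchanged.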

Outcome definition

The primary outcome was defined as the minimum core temperature measured during surgery. Temperature measurements were automatically collected every 2 min via ICCA. The method of temperature measurement depended on the operation. For urinary bladder temperature measurements, a Sensor series 400 balloon catheter (RÜSCH Austria Gesellschaft m.b.H., Vienna, Austria) was used. For esophageal or nasopharyngeal measurements, a Medical Level 1 disposable General Purpose Temperature Probe (Smiths Medical Österreich GmbH, Brunn am Gebirge, Austria) was used. In patients with short surgeries and without the need for a urinary catheter, and when esophageal placement was not feasible, measurement was performed rectally or, on very rare occasions, inguinally. If a temperature measurement differed by more than 0.5 °C from the directly preceding measurement, it was declared invalid. This was done because a temperature change of more than 0.5 °C in such a short time was considered unrealistic and most likely due to an artifact. Consecutive measurements were considered invalid until a value within ± 0.5 °C of the last valid measurement was reached (e.g., after the probe was put back in place). If invalid measurements occurred for more than 20 consecutive minutes, the entire case was excluded. In addition, temperature measurements were considered invalid if temperatures fell below 30 °C or exceeded 40 °C.
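The artifact rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the first in-range reading is assumed to be valid, and "more than 20 consecutive minutes" is assumed to mean more than ten consecutive invalid readings at the 2-min sampling interval.

```python
from typing import List, Optional, Sequence

SAMPLE_INTERVAL_MIN = 2        # measurements were collected every 2 min
MAX_JUMP_C = 0.5               # larger jumps are treated as artifacts
MAX_INVALID_MIN = 20           # longer invalid runs exclude the case
VALID_RANGE_C = (30.0, 40.0)   # readings outside this range are invalid


def flag_valid(temps_c: Sequence[float]) -> Optional[List[bool]]:
    """Return one validity flag per reading, or None if the case is excluded.

    Assumption: the first reading inside VALID_RANGE_C is accepted as valid.
    """
    flags: List[bool] = []
    last_valid: Optional[float] = None
    invalid_run = 0
    for temp in temps_c:
        in_range = VALID_RANGE_C[0] <= temp <= VALID_RANGE_C[1]
        near_last = last_valid is None or abs(temp - last_valid) <= MAX_JUMP_C
        is_valid = in_range and near_last
        flags.append(is_valid)
        if is_valid:
            last_valid = temp
            invalid_run = 0
        else:
            invalid_run += 1
            if invalid_run * SAMPLE_INTERVAL_MIN > MAX_INVALID_MIN:
                return None  # invalid for more than 20 consecutive minutes
    return flags


def minimum_valid_temperature(temps_c: Sequence[float]) -> Optional[float]:
    """Minimum over valid readings (the primary outcome), or None if excluded."""
    flags = flag_valid(temps_c)
    if flags is None or not any(flags):
        return None
    return min(t for t, ok in zip(temps_c, flags) if ok)


# Hypothetical trace: the probe slips out for two readings, then is replaced.
trace = [36.2, 36.1, 34.9, 35.0, 36.0, 36.1]
```

In the hypothetical trace, the two low readings are flagged as artifacts because they differ by more than 0.5 °C from the last valid value, so they do not contribute to the minimum temperature.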

Statistical analysis

Continuous variables were summarized as means with standard deviations, and categorical variables were presented as absolute frequencies and percentages.

For model building, the data were temporally split at April 1, 2018 into a training set and a test set of approximately 80% and 20% of all cases, respectively; that is, surgeries between September 26, 2013 and March 31, 2018 constituted the training set, whereas surgeries from April 1, 2018 to May 24, 2019 constituted the test set. This procedure corresponds to temporal validation and is preferred over a single random split33. Prediction models were developed using the first 80% (training set), and validation was performed on the later 20% (test set). The minimum temperature measured during surgery was modelled using linear regression. Predicted probabilities of the minimum temperature falling below 35 °C, 35.5 °C, and 36 °C at any time during surgery (hypothermia) were obtained based on the assumed normal distribution of the continuous outcome. In total, three linear prediction models based on different levels of information were developed.
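To illustrate how a threshold probability can be derived from a linear model of the minimum temperature, the following Python sketch evaluates the normal cumulative distribution function at each threshold. It assumes that the outcome is normally distributed around the model prediction with a constant residual standard deviation; the function name and the numeric inputs are hypothetical and are not taken from the fitted models.

```python
from statistics import NormalDist


def hypothermia_probability(predicted_min_temp_c: float,
                            residual_sd_c: float,
                            threshold_c: float) -> float:
    """P(minimum temperature < threshold) under a normal outcome model.

    Assumption: the spread around the prediction is described by a single
    residual standard deviation; uncertainty in the coefficients is ignored.
    """
    outcome = NormalDist(mu=predicted_min_temp_c, sigma=residual_sd_c)
    return outcome.cdf(threshold_c)


# Hypothetical patient: predicted minimum 35.9 °C, residual SD 0.5 °C.
risks = {
    threshold: hypothermia_probability(35.9, 0.5, threshold)
    for threshold in (36.0, 35.5, 35.0)
}
```

Because all three probabilities come from the same predicted mean, a single linear model yields mutually consistent risks across thresholds, and the risk always decreases as the threshold is lowered.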

First, the “basic model” using simple, preclinically available data (see Supplementary Table S1) was created using backward-elimination variable selection with the Akaike Information Criterion (AIC) as the stopping criterion, which is preferred for predictive purposes34. In the second, more complex model, the initial vital signs measured upon entering the operating room (“vital signs model”) were added. The vital signs model was fitted by incorporating the linear predictor of the basic model as an offset and by selecting vital signs using backward elimination with AIC. For the third model (“clinic model”), the data set was restricted to patients who visited the PAC. In addition to predictors in the vital signs model, comorbidities recorded in the PAC were selected based on forward selection with AIC. All continuous predictors were incorporated into the models as restricted cubic splines with three degrees of freedom to gain flexibility and to cover potential non-linear relations. By estimating one linear model instead of three logistic models (one for each threshold) at each level of information, we could ensure that the same predictors were relevant for prediction regardless of the threshold defining hypothermia at each level of information.
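As an illustration of how a continuous predictor can be expanded into a restricted cubic spline, the sketch below builds the widely used truncated-power basis, which is linear beyond the outermost knots. Several points are assumptions: in this parameterization k knots yield k − 1 columns, so four knots would correspond to three degrees of freedom, but the parameterization, knot count, and knot locations used in the study are not stated in this section (the knots are reported in Supplementary Table S5). The function name and the example knots are hypothetical.

```python
from typing import List, Sequence


def rcs_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Restricted cubic spline basis for one value x.

    Returns the linear term followed by len(knots) - 2 nonlinear terms,
    scaled by the squared distance between the outer knots.
    """
    t = list(knots)
    k = len(t)
    if k < 3:
        raise ValueError("at least three knots are required")
    if any(a >= b for a, b in zip(t, t[1:])):
        raise ValueError("knots must be strictly increasing")

    def cube_plus(u: float) -> float:
        # Truncated cubic: u^3 for positive u, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2
    gap = t[-1] - t[-2]
    columns = [x]
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / gap
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / gap)
        columns.append(term / scale)
    return columns


# Hypothetical knots for heart rate in beats per minute (not from the paper).
example_knots = [50.0, 65.0, 80.0, 100.0]
row = rcs_basis(70.0, example_knots)  # one design-matrix row with 3 columns
```

The columns of this basis would enter the linear regression like ordinary predictors; the restriction to linear tails keeps predictions stable for unusually low or high values.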

The predictive performance of all three models was assessed in the test set, which was temporally independent of the training set. The estimated predicted probabilities for hypothermia below 36 °C, 35.5 °C, and 35 °C were evaluated by (a) the scaled Brier score as a measure of overall performance, (b) discrimination by the concordance statistic (i.e., area under the receiver operating characteristic curve), the discrimination slope, and boxplots thereof, and (c) calibration by means of a calibration plot, the calibration slope, and calibration-in-the-large. Confidence intervals for all measures were calculated based on 2000 bootstrap samples. The importance of each predictor was assessed by partial explained variation based on the minimum temperature35. A complete case analysis was conducted for each prediction model.
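The overall-performance and discrimination measures named above can be computed as in the following Python sketch. It uses common textbook definitions, which are assumed here to match those of the original analysis: the Brier score is scaled against a reference model that predicts the observed incidence for every patient, and the concordance statistic counts tied predictions as one half. Function names and the toy data are hypothetical; calibration measures and bootstrap confidence intervals are omitted.

```python
from typing import Sequence


def scaled_brier(y: Sequence[int], p: Sequence[float]) -> float:
    """1 - Brier / Brier_ref, where Brier_ref predicts the incidence for all."""
    n = len(y)
    brier = sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / n
    incidence = sum(y) / n
    brier_ref = incidence * (1 - incidence)
    return 1 - brier / brier_ref


def c_statistic(y: Sequence[int], p: Sequence[float]) -> float:
    """Share of event/non-event pairs ranked correctly (ties count 0.5).

    Note: this pairwise loop is quadratic and meant for illustration only.
    """
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(non_events))


def discrimination_slope(y: Sequence[int], p: Sequence[float]) -> float:
    """Mean predicted risk in events minus mean predicted risk in non-events."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(events) / len(events) - sum(non_events) / len(non_events)


# Toy data (hypothetical): 1 = hypothermic, 0 = non-hypothermic.
observed = [1, 1, 0, 0]
predicted = [0.8, 0.6, 0.4, 0.2]
```

On the toy data, every hypothermic patient receives a higher predicted risk than every non-hypothermic patient, so the concordance statistic equals 1.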

R (4.0.0, The R Foundation for Statistical Computing, Vienna, Austria) was used for all statistical calculations and modelling36.

Results

In total, temperature measurements were available for 105,413 surgical cases. After applying the exclusion criteria (Fig. 1), the final dataset included 36,371 cases. Because of missing data, 21,119 cases and 21,193 cases were included in the training set of the basic model and the vital signs model, respectively. For the clinic model, only patients who had a check-up in our PAC were eligible, resulting in a sample size of 8598 for the training set.

Figure 1

Patient flow diagram.

Baseline characteristics of the training and test sets

In the training set, the mean patient age was 53.1 years (SD ± 17.5) and 51.8% of the patients were female (Table 1). Most surgical procedures were performed in general surgery (23.8%) and orthopedic surgery (22.0%). The demographic and morphometric characteristics of the patients in the test set were similar to those in the training set. The test set had fewer missing values in information on medical history, since a mandatory visit to the PAC before surgery has been enforced more strictly since the end of 2017 (Supplementary Table S2).

Table 1 Baseline characteristics, surgery information, medical history, anesthesia methods and first vital signs measured in the operation room of patients.

Outcome: minimum temperature during surgery

The minimum temperature was approximately normally distributed with a mean of 35.9 °C in the training and test set (SD 0.525 and 0.51, respectively), which enabled the estimation of a linear regression model. With a hypothermia threshold of 36 °C, the rates of hypothermic patients in the training and test set were 51.9% and 49.7%, respectively. Only 18.5% and 4.3% of patients fell below 35.5 °C and 35 °C in the training set (respectively), whereas there were slightly fewer hypothermic patients as per these thresholds in the test set (17.0% and 3.7%, respectively).

Performance

The performance of the prediction models was evaluated in the test set. As expected, the clinic model obtained the highest scaled Brier score for both the 36 °C and 35.5 °C thresholds (0.134 and 0.079) and thus had the highest overall performance, followed by the vital signs model (0.122 and 0.075) and the basic model (0.098 and 0.063) (Table 2).

Table 2 Scaled Brier score, concordance statistic, discrimination slope, calibration in the large and calibration slope for hypothermia below 36 °C and 35.5 °C based on the test set.

Discrimination

The concordance statistic is the probability that a hypothermic patient has a higher predicted probability for the occurrence of hypothermia than a non-hypothermic patient. The concordance statistics were 0.680 for the basic, 0.703 for the vital signs, and 0.713 for the clinic model predicting hypothermia below 36 °C (Table 2). When predicting hypothermia below 35.5 °C, the concordance statistics were 0.676, 0.703, and 0.713, respectively. The corresponding receiver operating characteristic curves are shown in Fig. 2A,C. The discrimination slope measures the difference between the mean predicted risk in hypothermic patients and the mean predicted risk in non-hypothermic patients (Table 2). The discrimination slopes were similar in the basic, vital signs, and clinic models (0.007, 0.008, and 0.004, respectively) for hypothermia below 36 °C, whereas they were nearly 0 for all models at a threshold of 35.5 °C. Predicted risks for hypothermia in hypothermic and non-hypothermic patients are depicted in Fig. 2B,D, showing that the range of predicted probabilities was largest in the vital signs and clinic models for a threshold of 36 °C and that, as expected, predicted risks for hypothermia below 35.5 °C were generally lower than those for hypothermia below 36 °C.

Figure 2

ROC curves and discrimination plots for the basic, the vital signs and the clinic model. Temperature threshold of 36 °C defining hypothermia: (A) ROC curves for the basic model (red), the vital signs model (violet) and the clinic model (blue) and (B) discrimination plots. Temperature threshold of 35.5 °C defining hypothermia: (C) ROC curves for the basic model (red), the vital signs model (violet) and the clinic model (blue) and (D) discrimination plots. Light grey boxplots in (B) and (D) represent the predictions for non-hypothermic patients and dark grey boxplots represent predictions for hypothermic patients.

Calibration

Figure 3A,B shows the agreement between predicted probabilities and observed risk, that is, whether low- and high-risk individuals were correctly identified by the models. For both temperature thresholds, the models seemed reliable, as the calibration curves were close to the diagonal. These findings are also reflected in the calibration-in-the-large and calibration slope (Table 2), which were close to 0 and 1, respectively.

Figure 3

Calibration plots for the basic model (red), the vital signs model (violet) and the clinic model (blue) for temperature thresholds of (A) 36 °C and (B) 35.5 °C defining hypothermia. The shaded areas represent the 95% confidence intervals.

Performance for temperature threshold of 35.0 °C defining hypothermia

The performance of the basic, vital signs, and clinic models decreased with decreasing temperature thresholds defining hypothermia (Table 2, Supplementary Table S3 and Supplementary Fig. S1) due to decreasing incidence rates. Although only 4.3% of patients in the training set fell below 35 °C, discrimination and calibration for a threshold of 35 °C were still moderate. In general, the vital signs model seems to be the best-calibrated model across all temperature thresholds (Supplementary Fig. S1).

Selected predictors in the basic, vital signs, and clinic models

In Supplementary Table S4, the final predictors selected by backward elimination (basic and vital signs models) and by forward selection (clinic model) are listed and ranked by their importance, i.e., their partial explained variation. The partial explained variation is the proportion of variation explained by one predictor on top of all the others in the model. In the basic model, patient weight and urgent surgery were the most important predictors. In the vital signs model, heart rate achieved a higher partial explained variation than urgent surgery; a lower heart rate before induction of anesthesia was associated with a higher risk of intraoperative hypothermia. This influence on the prediction of the minimum temperature during surgery is shown in Supplementary Fig. S2. The ranking of predictors may change across the three models, which are based on slightly different datasets. Sex, age, and orthopedic and trauma surgery were also moderately important predictors in the basic and vital signs models (partial explained variation between 0.65 and 1.07). In the vital signs model, sex was less important, but expected high i.v. fluid turnover and otolaryngologic surgery achieved an explained variation over 0.75, which could also be considered moderately important for predicting hypothermia. The influence of additional variables in the clinic model on the minimum temperature in terms of the explained variation was negligible.

Subanalysis of high-risk patients

Additionally, we defined high-risk patients by a predicted probability for hypothermia (below 35.5 °C) of 36% or higher, which is twice the incidence of hypothermia in the training set. In the test set, we evaluated the performance of the model when using a cut-off of 36% for the risk of hypothermia. Between 10.8% and 14.1% of the patients in the test sets of the respective models were assigned predicted risks above 36% and were thus classified into the high-risk group. This group of high-risk patients also had lower observed minimum temperatures (on average 0.3 to 0.4 °C lower, depending on the applied model).

The models obtained moderate accuracies ranging from 0.76 to 0.80 with high specificities of 0.89 to 0.92, meaning that the models correctly classified up to 80% of the patients and correctly identified around 90% of the non-hypothermic patients.
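The classification step of this subanalysis can be sketched in Python as follows. The 36% cut-off is taken from the text; the function name and the toy data are hypothetical, and patients with a predicted risk of exactly 36% are assigned to the high-risk group, following the definition "36% or higher".

```python
from typing import Dict, Sequence

HIGH_RISK_CUTOFF = 0.36  # twice the training-set incidence below 35.5 °C


def classification_summary(y: Sequence[int],
                           p: Sequence[float],
                           cutoff: float = HIGH_RISK_CUTOFF) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity of the high-risk classification."""
    flagged = [pi >= cutoff for pi in p]
    tp = sum(1 for yi, f in zip(y, flagged) if yi == 1 and f)
    tn = sum(1 for yi, f in zip(y, flagged) if yi == 0 and not f)
    fp = sum(1 for yi, f in zip(y, flagged) if yi == 0 and f)
    fn = sum(1 for yi, f in zip(y, flagged) if yi == 1 and not f)
    return {
        "high_risk_share": sum(flagged) / len(flagged),
        "accuracy": (tp + tn) / len(y),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data (hypothetical): 1 = minimum temperature below 35.5 °C.
observed = [1, 0, 0, 0, 1]
predicted = [0.50, 0.10, 0.40, 0.20, 0.30]
summary = classification_summary(observed, predicted)
```

With a cut-off well above the incidence, only a small share of patients is flagged, which favors specificity over sensitivity, as reflected in the reported results.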

Calculation of predictions and web-based prediction tool: TempSage

As the model is quite complex and predictions are not easily calculable by hand, a web-based implementation of the algorithm was built for the prediction of intraoperative hypothermia below 36 °C, which is compatible with most common mobile browsers (https://sny.cemsiis.meduniwien.ac.at/~cw45u2/tempsage/). Depending on the available information at hand, it is possible to either use the basic, vital signs, or clinic model for prediction. For example, a healthy 22-year-old male with urgent appendectomy has a risk of 2.25% for intraoperative hypothermia below 35.5 °C. In comparison, a 90-year-old female weighing 55 kg with non-insulin-dependent diabetes mellitus, Alzheimer’s disease, and atrial fibrillation receiving a dynamic hip screw under general anesthesia and using a supraglottic airway device, has a predicted risk of intraoperative hypothermia of 70.96%. Supplementary Fig. S3 shows the calculations for these two hypothetical patients.

For the sake of completeness, Supplementary Table S5 provides coefficients, the knots for spline bases, and formulas to predict the minimum temperature with the basic, vital signs, and clinic models.

Discussion

In this study, we developed and tested three prediction models for intraoperative hypothermia and evaluated their predictive capabilities. All models achieved good discriminatory ability and demonstrated proper calibration for temperature drops below 36 °C and 35.5 °C. The models neither overestimated nor underestimated the risk of hypothermia in the test set, making them useful in clinical settings by giving anesthesiologists the opportunity to intensify their temperature management efforts when identifying patients at risk preoperatively. Because the developed algorithms are too complex for paper-based calculations, a web-based implementation of the algorithm was built to provide a convenient way to use the model.

To the best of our knowledge, there are currently only two published prediction models for intraoperative hypothermia. Both take different approaches compared with our models26,37. Kasai and colleagues developed a logistic model based on 400 cases and achieved a sensitivity of 81.5% and a specificity of 83% for intraoperative hypothermia. Unfortunately, neither discrimination nor calibration was reported. Furthermore, their model was developed and tested on a very specific patient group, namely American Society of Anesthesiologists (ASA) score I and II patients without diabetes, hypertonia, thyroid conditions, dysautonomia, or Raynaud’s syndrome undergoing major abdominal surgery with epidural anesthesia, and patients were excluded if they received blood transfusions or catecholamines. Therefore, the model by Kasai et al. is only applicable to a relatively small subgroup of patients.

The second prediction model was developed by Yi and colleagues. The concordance statistics of their prediction model were 0.789 and 0.771 for the derivation and test sets, respectively, which are better than ours of 0.713 (0.696, 0.730) in the test set. However, part of the information needed to apply the prediction model proposed by Yi and colleagues is not available preoperatively; for example, the length of anesthesia and the amount of intravenous fluid administered can only be roughly estimated before induction of anesthesia37.

In our model, the predictors for inadvertent intraoperative hypothermia with the highest explained variation were patient weight, urgency, and preoperative heart rate, followed by different surgery types (see Supplementary Table S4). According to the NICE guidelines, data concerning the influence of patient weight on the incidence of perioperative hypothermia are inconclusive. A study by Poveda et al., on the other hand, showed a positive correlation between BMI and mean intraoperative body temperature, with more obese patients having a lower incidence of inadvertent intraoperative hypothermia28,38. In a study by Kongsayreepong et al., the influence of the urgency of surgery on the incidence of postoperative hypothermia was investigated; however, no significant difference between elective and emergency surgery was discovered13.

Another interesting finding is the comparatively high partial explained variation of the preoperative heart rate in our models. To the best of our knowledge, this has only been described in the prediction study by Kasai et al., which also found a significant association between lower preoperative heart rate and intraoperative hypothermia26. Although no statement concerning a causal relation can be made based on these findings, it is possible that preoperative heart rate is a surrogate for the general health condition as well as the catecholamine levels of the patient. For example, a low heart rate right before surgery could be due to arrhythmia (e.g., sinus node dysfunction, atrial fibrillation with bradycardia), medication (e.g., beta blocker, antiarrhythmic medication), or failure of the patient to recognize a situation that is normally perceived as stressful39,40. All three would be indicative of poor general health and consequently associated with intraoperative hypothermia. Additionally, a low endogenous catecholamine level associated with low heart rate would go hand in hand with higher peripheral perfusion and faster heat loss. On the other hand, a higher heart rate would most likely be associated with vasoconstriction and therefore less heat loss as well as higher heat production. The fact that this positive effect has a cut-off at about 100/min (see Supplementary Fig. S2) could also be explained by an association of extremely high heart rate with poor health or hypovolemia, both associated with intraoperative hypothermia.

Concerning the different surgery types, most major prior publications tended to divide surgeries either by the magnitude of surgery (major, intermediate, minor) or only differentiated between laparoscopic and open surgery13,14.

The high incidence of hypothermia (51.9% below 36 °C in the training set) is consistent with previous findings by Sun et al., who reported that 64.4% of patients reached a core temperature below the threshold of 36 °C and again emphasized the need for awareness and for resolute, pre-emptive action to avoid intraoperative hypothermia29. The high incidence of intraoperative hypothermia occurred despite the standard use of forced-air warming (FAW) devices at the Vienna General Hospital. This high incidence can partly be explained by the decrease in core temperature during the first hour of anesthesia, regardless of the type of warming device used, as described in previous publications29,41,42. Without prewarming, cold blood from the periphery flows to the patient’s core after induction of anesthesia due to the vasodilating action of the anesthetic drugs, leading to an initial drop in core temperature17,18,21.

A visit to the PAC is an important tool for risk assessment as well as an opportunity to obtain timely informed consent and assess the possibility of perioperative optimization and preparation43,44,45. To date, major anesthesiology societies do not mention perioperative hypothermia specifically in their guidelines for PACs46,47. Nevertheless, the information gathered in PACs can also help clinicians in their decision-making concerning perioperative temperature management. Although each single additional predictor in the clinic model did not add much in terms of explained variation, we were able to show an improved predictive performance for the combined information recorded in the PAC in terms of discrimination and calibration.

In principle, when a patient has been identified as having an increased risk of inadvertent intraoperative hypothermia, there are several options for prophylactic thermoregulatory interventions. For example, prewarming with forced air or self-warming blankets, which are used in the holding area or during patient transfers, has been shown to be beneficial and to prevent redistribution hypothermia32,48,49,50,51. However, depending on the infrastructure of the holding area, even a relatively short prewarming of 30 min can be difficult to implement. Both this lack of infrastructure and short stays in the holding area typically prohibit adequate prewarming in many institutions52,53. The presented prediction tool can help to target the patients who will potentially benefit the most from prewarming. In addition, other thermoregulatory interventions such as conductive heating with optimized surface contact between the patient’s skin and the warming device may be used in patients at increased risk for more severe inadvertent intraoperative hypothermia. These interventions may also be synergistically combined with forced-air warming, and conductive heating has also been shown to be effective54,55,56,57. In selected patients with a particularly high risk of perioperative hypothermia (e.g., extensive burn surgery), even intravenous patient warming may be used to reduce the incidence of intraoperative hypothermia31,58. However, increasing the operating room temperature to decrease the risk of hypothermia has been shown to be not very efficient. A recent prospective study demonstrated that the effect of ambient temperature, especially when forced-air warming devices are used, is negligible59 and that lower ambient temperatures do not influence core temperature once active warming is established13,60.

This study has several limitations. First, it is important to note that the algorithm was validated only internally, using temporally independent data from the Vienna General Hospital. Second, more detailed surgical information or patient information could help to further improve the prediction accuracy of the model. Another limitation of our study is the lack of accurate, non-invasive preoperative temperature measurements, since these are not taken on a regular basis in the Vienna General Hospital. Additionally, the information concerning anesthesia type was taken from the actual cases; therefore, rare conversions from failed spinal to general anesthesia were not accounted for.

Another limitation is the fact that our model merely predicts which patients are at higher risk for hypothermia without suggesting particular interventions for each patient. Although a prescriptive analytic model would hypothetically be ideal, additional evidence beyond the scope of this analysis is needed to know which patients would benefit from which additional temperature management methods. In addition, most features of the model (e.g., sex, age, history of disease) cannot be altered before the start of surgery and cannot be linked to specific warming interventions. However, it is reasonable to assume that patients with higher risk for inadvertent intraoperative hypothermia would likely benefit from additional efforts as specified above.

Finally, as with most retrospective study designs, cause-and-effect relationships can only be hypothesized61.

In the present study, we demonstrated that intraoperative hypothermia still occurs frequently and developed an accurate prediction model that identifies, at different preoperative timepoints, patients at risk for mild and moderate inadvertent intraoperative hypothermia, to whom additional prophylactic thermoregulatory interventions may be preferentially allocated.