Development and internal validation of an algorithm to predict intraoperative risk of inadvertent hypothermia based on preoperative data

Intraoperative hypothermia increases perioperative morbidity and identifying patients at risk preoperatively is challenging. The aim of this study was to develop and internally validate prediction models for intraoperative hypothermia occurring despite active warming and to implement the algorithm in an online risk estimation tool. The final dataset included 36,371 surgery cases between September 2013 and May 2019 at the Vienna General Hospital. The primary outcome was minimum temperature measured during surgery. Preoperative data, initial vital signs measured before induction of anesthesia, and known comorbidities recorded in the preanesthetic clinic (PAC) were available, and the final predictors were selected by forward selection and backward elimination. Three models with different levels of information were developed and their predictive performance for minimum temperature below 36 °C and 35.5 °C was assessed using discrimination and calibration. Moderate hypothermia (below 35.5 °C) was observed in 18.2% of cases. The algorithm to predict inadvertent intraoperative hypothermia performed well with concordance statistics of 0.71 (36 °C) and 0.70 (35.5 °C) for the model including data from the preanesthetic clinic. All models were well-calibrated for 36 °C and 35.5 °C. Finally, a web-based implementation of the algorithm was programmed to facilitate the calculation of the probabilistic prediction of a patient’s core temperature to fall below 35.5 °C during surgery. The results indicate that inadvertent intraoperative hypothermia still occurs frequently despite active warming. Additional thermoregulatory measures may be needed to increase the rate of perioperative normothermia. The developed prediction models can support clinical decision-makers in identifying the patients at risk for intraoperative hypothermia and help optimize allocation of additional thermoregulatory interventions.

www.nature.com/scientificreports/ measurements were considered invalid until the last valid measurement ± 0.5 °C was reached (e.g., the probe was put back in place, etc.). If invalid measurements occurred for more than 20 consecutive minutes, the entire case was excluded. In addition, temperature measurements were considered invalid if temperatures fell below 30 °C or exceeded 40 °C.
Statistical analysis. Continuous variables were summarized as means with standard deviations, and categorical variables were presented as absolute frequencies and percentages. For model building, the data were temporally split at April 1, 2018 into a training set and a test set of approximately 80% and 20% of all cases, respectively; that is, surgeries between September 26, 2013 and March 31, 2018 constituted the training set, whereas surgeries from April 1, 2018 to May 24, 2019 were considered as the test set. This procedure corresponds to temporal validation and is preferred over one random split 33 . Prediction models were developed using the first 80% (training set), and validation was performed on the later 20% (test set). The minimum temperature measured during surgery was modelled using linear regression. Probabilistic predictions falling below 35 °C, 35.5 °C, and 36 °C at any time during surgery (hypothermia) were obtained based on the assumed normal distribution of the continuous outcome. In total, three linear prediction models based on different levels of information were developed.
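The step from a predicted minimum temperature to a hypothermia probability can be illustrated with a minimal sketch. It assumes, as stated above, normally distributed errors around the linear prediction; the function name, the predicted value, and the residual standard deviation are hypothetical and are not taken from the fitted models.

```python
from statistics import NormalDist


def hypothermia_probability(predicted_min_temp, residual_sd, threshold):
    """P(minimum temperature < threshold) under a normal error assumption."""
    return NormalDist(mu=predicted_min_temp, sigma=residual_sd).cdf(threshold)


# Hypothetical patient: predicted minimum of 36.0 °C, residual SD of 0.5 °C.
risks = {
    threshold: hypothermia_probability(36.0, 0.5, threshold)
    for threshold in (36.0, 35.5, 35.0)
}
```

Because one continuous prediction feeds all three thresholds, the resulting probabilities are automatically ordered: the risk of falling below 35 °C can never exceed the risk of falling below 35.5 °C or 36 °C.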
First, the "basic model" using simple, preclinically available data (see Supplementary Table S1) was created using backward-elimination variable selection with the Akaike Information Criterion (AIC) as the stopping criterion, which is preferred for predictive purposes 34 . In the second, more complex model, the initial vital signs measured upon entering the operating room ("vital signs model") were added. The vital signs model was fitted by incorporating the linear predictor of the basic model as an offset and by selecting vital signs using backward elimination with AIC. For the third model ("clinic model"), the data set was restricted to patients who visited the PAC. In addition to predictors in the vital signs model, comorbidities recorded in the PAC were selected based on forward selection with AIC. All continuous predictors were incorporated into the models as restricted cubic splines with three degrees of freedom to gain flexibility and to cover potential non-linear relations. By estimating one linear model instead of three logistic models (one for each threshold) at each level of information, we could ensure that the same predictors were relevant for prediction regardless of the threshold defining hypothermia at each level of information.
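For readers unfamiliar with restricted cubic splines, the following sketch shows one common construction (the unnormalized truncated power basis). It assumes that three degrees of freedom correspond to four knots, which is a frequent convention but is not confirmed by the text; the knot positions below are hypothetical, and the knots actually used are those given in Supplementary Table S5.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis (truncated power form, unnormalized).

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots, i.e. k - 1 columns.
    The spline is linear below the first and above the last knot.
    """
    k = len(knots)
    if k < 3 or list(knots) != sorted(set(knots)):
        raise ValueError("need at least 3 strictly increasing knots")

    def cube_plus(u):
        # Truncated cubic: u^3 for u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    t_last, t_prev = knots[-1], knots[-2]
    gap = t_last - t_prev
    basis = [float(x)]
    for t_j in knots[:-2]:
        basis.append(
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / gap
            + cube_plus(x - t_last) * (t_prev - t_j) / gap
        )
    return basis


# Hypothetical knots for preoperative heart rate (beats per minute).
heart_rate_knots = (55, 70, 85, 105)
columns = rcs_basis(90, heart_rate_knots)  # 3 columns = 3 degrees of freedom
```

Each continuous predictor then enters the linear model through these columns instead of a single linear term, which allows a non-linear but smooth relation with the minimum temperature.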
The predictive performance of all three models was assessed in the test set, which was temporally independent of the training set. The estimated predicted probabilities for hypothermia below 36 °C, 35.5 °C, and 35 °C were evaluated by (a) the scaled Brier score, (b) discrimination by the concordance statistic (i.e., area under the receiver operating characteristic curve), the discrimination slope, and boxplots thereof, and (c) calibration by means of a calibration plot, the calibration slope, and calibration-in-the-large. Confidence intervals for all measures were calculated based on 2000 bootstrap samples. The importance of each predictor was assessed by partial explained variation based on the minimum temperature 35 .
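As an illustration of the overall performance measure, the sketch below computes a scaled Brier score and a bootstrap confidence interval. The scaling by the Brier score of a model that always predicts the observed incidence is the usual definition; the percentile interval is an assumption, since the type of bootstrap interval is not specified above. All function names and the toy data are hypothetical.

```python
import random


def scaled_brier(y, p):
    """1 - Brier / Brier_max, where Brier_max is the score of a model that
    always predicts the observed incidence."""
    n = len(y)
    brier = sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / n
    incidence = sum(y) / n
    brier_max = incidence * (1 - incidence)
    return 1 - brier / brier_max


def bootstrap_ci(y, p, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed variant) for a metric."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(metric(ys, [p[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = hypothermic, 0 = not hypothermic, with predicted risks.
y_toy = [1, 0, 1, 0, 1, 0, 0, 1]
p_toy = [0.8, 0.3, 0.6, 0.4, 0.7, 0.2, 0.5, 0.6]
score = scaled_brier(y_toy, p_toy)
interval = bootstrap_ci(y_toy, p_toy, scaled_brier, n_boot=200)
```

A scaled Brier score of 0 corresponds to predicting the incidence for every patient, and 1 corresponds to perfect predictions.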

Results
In total, temperature measurements were available for 105,413 surgical cases. After applying the exclusion criteria (Fig. 1), the final dataset included 36,371 cases. Because of missing data, 21,119 cases and 21,193 cases were included in the training set of the basic model and the vital signs model, respectively. For the clinic model, only patients who had a check-up in our PAC were eligible, resulting in a sample size of 8598 for the training set.
Baseline characteristics of the training and test sets. In the training set, the mean patient age was 53.1 years (SD ± 17.5) and 51.8% of the patients were female (Table 1). Most surgical procedures were performed in general surgery (23.8%) and orthopedic surgery (22.0%). The demographic and morphometric characteristics of the patients in the test set were similar to those in the training set. The test set had fewer missing values in information on medical history, since a mandatory visit to the PAC before surgery has been enforced more strictly since the end of 2017 (Supplementary Table S2).
Outcome: minimum temperature during surgery. The minimum temperature was approximately normally distributed with a mean of 35.9 °C in the training and test set (SD 0.525 and 0.51, respectively), which enabled the estimation of a linear regression model. With a hypothermia threshold of 36 °C, the rates of hypothermic patients in the training and test set were 51.9% and 49.7%, respectively. Only 18.5% and 4.3% of patients fell below 35.5 °C and 35 °C in the training set (respectively), whereas there were slightly fewer hypothermic patients as per these thresholds in the test set (17.0% and 3.7%, respectively).
Performance. The performance of the prediction models was evaluated in the test set. As expected, the clinic model obtained the highest scaled Brier score for both the 36 °C and the 35.5 °C thresholds (0.134 and 0.079) and thus had the highest overall performance, followed by the vital signs model (0.122 and 0.075) and the basic model (0.098 and 0.063) (Table 2).

Discrimination.
The concordance statistic is the probability that a hypothermic patient has a higher predicted probability for the occurrence of hypothermia than a non-hypothermic patient. The concordance statistics were 0.680 for the basic, 0.703 for the vital signs, and 0.713 for the clinic model when predicting hypothermia below 36 °C (Table 2). When predicting hypothermia below 35.5 °C, the concordance statistics were 0.676, 0.703, and 0.713, respectively. The corresponding receiver operating characteristic curves are shown in Fig. 2A,C. The discrimination slope measures the difference between the mean predicted risk in hypothermic patients and the mean predicted risk in non-hypothermic patients (Table 2). The discrimination slopes were similar in the basic, vital signs, and clinic models (0.007, 0.008, and 0.004, respectively) for hypothermia below 36 °C, whereas they were nearly 0 for all models for a threshold of 35.5 °C. Predicted risks for hypothermia in hypothermic and non-hypothermic patients are depicted in Fig. 2B,D, showing that the range of predicted probabilities was largest in the vital signs and clinic models for a threshold of 36 °C and that, as expected, predicted risks for hypothermia below 35.5 °C were generally lower than for hypothermia below 36 °C.
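The two discrimination measures defined above can be written out directly. The following is a minimal sketch on toy data, not the evaluation code used in the study; ties between a hypothermic and a non-hypothermic patient are counted as one half, which is the usual convention and is assumed here.

```python
from statistics import mean


def concordance(y, p):
    """Share of (hypothermic, non-hypothermic) pairs in which the hypothermic
    patient has the higher predicted risk; ties count as one half."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def discrimination_slope(y, p):
    """Mean predicted risk in hypothermic minus non-hypothermic patients."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    return mean(cases) - mean(controls)


# Toy data: 1 = hypothermic, 0 = not hypothermic, with predicted risks.
y_toy = [1, 0, 1, 0, 1, 0, 0, 1]
p_toy = [0.8, 0.3, 0.6, 0.4, 0.7, 0.2, 0.65, 0.6]
c_stat = concordance(y_toy, p_toy)          # 14 of 16 pairs are concordant
slope = discrimination_slope(y_toy, p_toy)
```

The pairwise loop is quadratic in the number of patients; for a test set of several thousand cases a rank-based computation would be the more practical choice.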
Calibration. Figure 3A,B shows the agreement between predicted probabilities and observed risk, that is, whether low- and high-risk individuals were correctly identified by the models. For both temperature thresholds, the models seemed reliable, as the calibration curves were close to the diagonal. These findings are also reflected in the calibration-in-the-large and calibration slope (Table 2), which were close to 0 and 1, respectively.
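A minimal sketch of how the two calibration summaries are commonly obtained is given below. It assumes the standard logistic recalibration framework: the calibration slope is the coefficient of the logit of the predicted risk, and calibration-in-the-large is the intercept when that logit is used as an offset. Whether the study used exactly this formulation is not stated; the function names and toy data are hypothetical.

```python
import math


def _logit(p):
    return math.log(p / (1 - p))


def _expit(z):
    return 1 / (1 + math.exp(-z))


def calibration_slope(y, p, iterations=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson; return (a, b)."""
    x = [_logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        mu = [_expit(a + b * xi) for xi in x]
        w = [m * (1 - m) for m in mu]
        g0 = sum(yi - m for yi, m in zip(y, mu))
        g1 = sum((yi - m) * xi for yi, m, xi in zip(y, mu, x))
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def calibration_in_the_large(y, p, iterations=25):
    """Intercept a of logit(P(y=1)) = a + logit(p), slope fixed at 1."""
    x = [_logit(pi) for pi in p]
    a = 0.0
    for _ in range(iterations):
        mu = [_expit(a + xi) for xi in x]
        a += sum(yi - m for yi, m in zip(y, mu)) / sum(m * (1 - m) for m in mu)
    return a


# Toy data, perfectly calibrated: observed event rates equal predicted risks.
p_good = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
y_good = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
# Toy data, overconfident: predicted 0.1 and 0.9, observed 0.2 and 0.8.
p_over = [0.1] * 10 + [0.9] * 10
y_over = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
```

In the overconfident toy example the slope is below 1 because the predictions are more extreme than the observed rates, whereas the perfectly calibrated example returns an intercept of 0 and a slope of 1.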
Performance for temperature threshold of 35.0 °C defining hypothermia. The performance of the basic, vital signs, and clinic models decreased with decreasing temperature thresholds defining hypothermia (Table 2, Supplementary Table S3 and Supplementary Fig. S1) due to decreasing incidence rates. Although only 4.3% of patients in the training set fell below 35 °C, discrimination and calibration for a threshold of 35 °C were still moderate. In general, the vital signs model seems to be the best calibrated model across all temperature thresholds (Supplementary Fig. S1).
Selected predictors in the basic, vital signs, and clinic models. In Supplementary Table S4, the final predictors selected by backward elimination (basic and vital signs models) and by forward selection (clinic model) are listed and ranked by their importance, that is, their partial explained variation. The partial explained variation is the proportion of variation explained by one predictor on top of all the others in the model. In the basic model, patient weight and urgent surgery were the most important predictors. In the vital signs model, heart rate achieved a higher partial explained variation than urgent surgery; a lower heart rate before induction of anesthesia was associated with a higher risk of intraoperative hypothermia. This influence on the prediction of the minimum temperature during surgery is shown in Supplementary Fig. S2. The ranking of predictors may change across the three models, which are based on slightly different datasets. Sex, age, and orthopedic and trauma surgery were also moderately important predictors in the basic and vital signs models (partial explained variation between 0.65 and 1.07). In the vital signs model, sex was less important, but expected high i.v. fluid turnover and otolaryngologic surgery achieved an explained variation over 0.75, which could also be considered moderately important for predicting hypothermia. The influence of additional variables in the clinic model on the minimum temperature in terms of the explained variation was negligible.
Subanalysis of high-risk patients. Additionally, we defined high-risk patients by a predicted probability for hypothermia (below 35.5 °C) of 36% or higher, which is twice the incidence of hypothermia in the training set. In the test set, we evaluated the performance of the model when using a cut-off of 36% for the risk of hypothermia. Between 10.8 and 14.1% of the patients in the test sets of the respective models were assigned predicted risks above 36% and were thus classified into the high-risk group. This group of high-risk patients also had lower observed minimum temperatures (on average 0.3 to 0.4 °C lower, depending on the applied model).
The models obtained moderate accuracies ranging from 0.76 to 0.80 with high specificities of 0.89 to 0.92, meaning that the models correctly classified up to 80% of the patients and correctly identified around 90% of the non-hypothermic patients.
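The classification measures reported for the high-risk cut-off follow from a simple two-by-two table. The sketch below uses the 36% cut-off described above on hypothetical toy data; the function name and the returned dictionary keys are illustrative only.

```python
def classify_at_cutoff(y, p, cutoff=0.36):
    """Accuracy, sensitivity, specificity, and share of patients flagged as
    high risk when predicted risk >= cutoff counts as high risk."""
    tp = sum(1 for yi, pi in zip(y, p) if pi >= cutoff and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= cutoff and yi == 0)
    fn = sum(1 for yi, pi in zip(y, p) if pi < cutoff and yi == 1)
    tn = sum(1 for yi, pi in zip(y, p) if pi < cutoff and yi == 0)
    n = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "flagged": (tp + fp) / n,
    }


# Toy data: 1 = minimum temperature below 35.5 °C, with predicted risks.
y_toy = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]
p_toy = [0.50, 0.10, 0.30, 0.40, 0.45, 0.05, 0.20, 0.36, 0.15, 0.25]
metrics = classify_at_cutoff(y_toy, p_toy)
```

With a cut-off at twice the incidence, only a minority of patients is flagged, so specificity tends to be high while sensitivity depends on how strongly the predicted risks separate the two groups.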
Calculation of predictions and web-based prediction tool: TempSage. As the model is quite complex and predictions are not easily calculable by hand, a web-based implementation of the algorithm was built for the prediction of intraoperative hypothermia below 36 °C, which is compatible with most common mobile browsers (https://sny.cemsiis.meduniwien.ac.at/~cw45u2/tempsage/). Depending on the available information at hand, it is possible to either use the basic, vital signs, or clinic model for prediction. For example, a healthy 22-year-old male with urgent appendectomy has a risk of 2.25% for intraoperative hypothermia below 35.5 °C. In comparison, a 90-year-old female weighing 55 kg with non-insulin-dependent diabetes mellitus, Alzheimer's disease, and atrial fibrillation receiving a dynamic hip screw under general anesthesia and using a supraglottic airway device, has a predicted risk of intraoperative hypothermia of 70.96%. Supplementary Fig. S3 shows the calculations for these two hypothetical patients.
For the sake of completeness, Supplementary Table S5 provides coefficients, the knots for spline bases, and formulas to predict the minimum temperature with the basic, vital signs, and clinic models.

Discussion
In this study, we developed and tested three prediction models for intraoperative hypothermia and evaluated their predictive capabilities. All models achieved good discriminatory ability and demonstrated proper calibration for temperature drops below 36 °C and 35.5 °C. The models neither overestimated nor underestimated the risk of hypothermia in the test set, making them useful in clinical settings by giving anesthesiologists the opportunity to intensify their temperature management efforts when identifying patients at risk preoperatively. Because the developed algorithms are too complex for paper-based calculations, a web-based implementation of the algorithm was built to provide a convenient way to use the model. To the best of our knowledge, there are currently only two published prediction models for intraoperative hypothermia. Both take different approaches from our models 26,37 . Kasai and colleagues developed a logistic model based on 400 cases and achieved a sensitivity of 81.5% and a specificity of 83% for intraoperative hypothermia. Unfortunately, neither discrimination nor calibration was reported. Furthermore, their model was developed and tested on a very specific patient group, namely American Society of Anesthesiologists (ASA) score I and II patients without diabetes, hypertonia, thyroid conditions, dysautonomia, or Raynaud's syndrome undergoing major abdominal surgery with epidural anesthesia, and patients were excluded if they received blood transfusions or catecholamines. Therefore, the model by Kasai et al. is only applicable to a relatively small subgroup of patients. The second prediction model was developed by Yi and colleagues. The concordance statistics of their prediction model were 0.789 and 0.771 for the derivation and test sets, respectively, which are better than ours of 0.713 (0.696, 0.730) in the test set.
However, the information needed to apply the prediction model proposed by Yi and colleagues was in part not available preoperatively; e.g., length of anesthesia and amount of intravenous fluid administered can only be roughly estimated before induction of anesthesia 37 .
In our model, the predictors for inadvertent intraoperative hypothermia with the highest explained variation were patient weight, urgency, and preoperative heart rate, followed by different surgery types (see Supplementary Table S4). According to the NICE guidelines, data concerning the influence of patient weight on the incidence of perioperative hypothermia are inconclusive. A study by Poveda et al., on the other hand, showed a positive correlation between greater BMI and mean intraoperative body temperature, with more obese patients having a lower incidence of inadvertent intraoperative hypothermia 28,38 . In a study by Kongsayreepong et al., the influence of the urgency of surgery on the incidence of postoperative hypothermia was investigated; however, no significant difference between elective and emergency surgery was discovered 13 .
Another interesting finding is the comparably high partial explained variation of the preoperative heart rate in our models. To the best of our knowledge, this has only been described in the prediction study by Kasai et al., which also found a significant association between lower preoperative heart rate and intraoperative hypothermia 26 . Although no statement concerning a causal relation can be made based on these findings, it is possible that preoperative heart rate is a surrogate for the general health condition as well as the catecholamine levels of the patient. For example, a low heart rate right before surgery could be due to arrhythmia (e.g., sinus node dysfunction, atrial fibrillation with bradycardia), medication (e.g., beta blocker, antiarrhythmic medication), or failure of the patient to recognize a situation that is normally perceived as stressful 39,40 . All three would be indicative of poor general health and consequently associated with intraoperative hypothermia. Additionally, a low endogenous catecholamine level associated with low heart rate would go hand in hand with higher peripheral perfusion and faster heat loss. On the other hand, higher heart rate would most likely be associated with vasoconstriction and therefore less heat loss as well as higher heat production. The fact that this positive effect has a cut-off at about 100/min (see Supplementary Fig. S2) could also be explained by an association of extremely high heart rate with poor health or hypovolemia, both associated with intraoperative hypothermia.
Concerning the different surgery types, most major prior publications tended to divide surgeries either by the magnitude of surgery (major, intermediate, minor) or only differentiated between laparoscopic and open surgery 13,14 .
The high incidence of hypothermia (51.9% below 36 °C in the training set) is consistent with previous findings by Sun et al., who reported that 64.4% of patients reached a core temperature below the threshold of 36 °C and again emphasized the need for awareness and for resolute, pre-emptive action to avoid intraoperative hypothermia 29 . The high incidence of intraoperative hypothermia occurred despite the standard use of forced-air warming devices (FAWs) at the Vienna General Hospital. This high incidence can partly be explained by the decrease in core temperature during the first hour of anesthesia, regardless of the type of warming device used, as described in previous publications 29,41,42 . Without prewarming, cold blood from the periphery flows to the patient's core after induction of anesthesia due to the vasodilating action of the anesthetic drugs, leading to an initial drop in core temperature 17,18,21 .
A visit to the PAC is an important tool for risk assessment as well as an opportunity to obtain timely informed consent and assess the possibility of perioperative optimization and preparation 43–45 . To date, major anesthesiology societies do not mention perioperative hypothermia specifically in their guidelines for PACs 46,47 . Nevertheless, the information gathered in PACs can also help clinicians in their decision-making concerning perioperative temperature management. Although each single additional predictor in the clinic model did not add much in terms of explained variation, we were able to show an improved predictive performance for the combined information recorded in the PAC in terms of discrimination and calibration.
In principle, when a patient has been identified to have an increased risk of inadvertent intraoperative hypothermia, there are several options for prophylactic thermoregulatory interventions. For example, prewarming with forced air or self-warming blankets, which are used in the holding area or during patient transfers, has been shown to be beneficial and to prevent redistribution hypothermia 32,48–51 . However, depending on the infrastructure of the holding area, even a relatively short prewarming of 30 min can be difficult to implement. Both this lack of infrastructure and short stays in the holding area typically prohibit adequate prewarming in many institutions 52,53 . The presented prediction tool can help effectively target patients who will potentially benefit the most from prewarming. In addition, other thermoregulatory interventions, such as conductive heating with optimized surface contact between the patient's skin and the warming device, may be used in patients at increased risk for more severe inadvertent intraoperative hypothermia. Conductive heating has been shown to be effective, and these interventions may also be synergistically combined with FAWs 54–57 .
In selected patients with a particularly high risk of perioperative hypothermia (e.g., extensive burn surgery), even intravenous patient warming may be used to reduce the incidence of intraoperative hypothermia 31,58 . However, an increase of OR temperature to decrease the risk of hypothermia has been shown to not be very efficient. A recent prospective study demonstrated that the effect of ambient temperature, especially when FAW devices are used, is negligible 59 and that lower ambient temperatures do not influence core temperature once active warming is established 13,60 .
This study has several limitations. First, it is important to note that the algorithm was validated internally with independent data from the Vienna General Hospital. Secondly, more detailed surgical information or patient information could help to further improve the prediction accuracy of the model. Another limitation of our study is the lack of preoperative, accurate non-invasive temperature measurements, since they are not measured on a regular basis in the Vienna General Hospital. Additionally, the information concerning anesthesia type was taken from the actual cases; therefore, rare conversions from failed spinal to general anesthesia were not accounted for.
Another limitation is the fact that our model merely predicts which patients are at higher risk for hypothermia without suggesting particular interventions for each patient. Although a prescriptive analytic model would hypothetically be ideal, additional evidence beyond the scope of this analysis is needed to know which patients would benefit from which additional temperature management methods. In addition, most features of the model (e.g., sex, age, history of disease) cannot be altered before the start of surgery and cannot be linked to specific warming interventions. However, it is reasonable to assume that patients with higher risk for inadvertent intraoperative hypothermia would likely benefit from additional efforts as specified above.
Finally, as with most retrospective study designs, cause-and-effect relationships can only be hypothesized 61 .
In the present study, we demonstrated that intraoperative hypothermia still occurs frequently and developed an accurate prediction model to identify, at different preoperative timepoints, patients at risk for mild and moderate inadvertent intraoperative hypothermia to whom additional prophylactic thermoregulatory interventions may be preferentially allocated.