Individualized fluid administration for critically ill patients with sepsis with an interpretable dynamic treatment regimen model

Fluid strategy is key to the successful management of patients with sepsis. However, previous studies did not consider individualized treatment strategies, and clinical trials typically treated patients with sepsis as a homogeneous study population. We aimed to develop sequential decision rules for managing fluid intake in patients with sepsis by using the dynamic treatment regimen (DTR) model. We performed a retrospective analysis of the eICU Collaborative Research Database, which comprises highly granular data collected from 335 units at 208 hospitals. The DTR model used a backward induction algorithm to estimate the sequence of optimal rules. A total of 22,868 patients who had sepsis according to the Acute Physiology and Chronic Health Evaluation (APACHE) IV diagnosis group were included. Optimal fluid management strategies (liberal [≥ 40 ml/kg/d] versus restricted [< 40 ml/kg/d]) were developed for days 1, 3 and 5 after ICU admission according to current states and treatment history. Important determinants of the optimal fluid strategy included mean blood pressure, heart rate, previous urine output, previous fluid strategy, ICU type and mechanical ventilation. Different functional forms, such as quadratic functions and interaction terms, were used at different stages. The proportion of subjects inappropriately treated with a liberal fluid strategy (i.e. those who actually received a liberal fluid strategy but could have had a longer survival time had they received a restricted fluid strategy) increased from day 1 to day 5 (19.3% to 29.5%). The survival time could have been significantly prolonged had all patients been treated with the optimal fluid strategy (5.7 [2.0, 5.9] vs. 4.1 [2.0, 5.0] days; p < 0.001). With a large volume of sepsis data, we computed a sequence of dynamic fluid management strategies for sepsis patients over the first 5 days after ICU admission. The decision rules generated by the DTR model predicted a longer survival time than the observed strategy, which suggests that patient outcomes could be improved with the aid of computer-assisted algorithms.


Scientific Reports | (2020) 10:17874 | https://doi.org/10.1038/s41598-020-74906-z

Management of critically ill patients with sepsis is imperative because sepsis can affect all organs and lead to death within a short period of time. Among the many aspects of sepsis care, we believe that fluid management strategy remains essential to successful treatment because cardiac and renal functions, which play important roles in the physiological regulation of fluid balance, are usually impaired in sepsis. Numerous studies have been conducted to explore an optimal fluid strategy. A trio of multinational trials named Protocolized Care for Early Septic Shock (ProCESS), Australasian Resuscitation in Sepsis Evaluation (ARISE), and Protocolized Management in Sepsis (ProMISe) investigated whether early goal-directed therapy (EGDT) could reduce mortality [1][2][3]. The results showed that EGDT did not significantly reduce the mortality rate compared with usual care. Thus, the optimal fluid strategy remains largely unknown. One important limitation of these trials is that they included all sepsis patients without considering between-subject heterogeneity. In fact, a large body of evidence shows that sepsis is highly heterogeneous and that different subphenotypes can respond differently to fluid administration [4]. Latent profile analysis and hierarchical clustering are two major techniques for identifying subphenotypes of sepsis [5]; however, they are limited by two factors. First, their capacity to establish a causal relationship between treatment and subphenotypes is insufficient. Second, most studies used cross-sectional data when modeling the subphenotypes. We should be aware that fluid management in a patient with sepsis is a dynamic process over the entire treatment course; we should not focus solely on the initial fluid strategy and ignore its importance at later stages.
There is evidence that de-resuscitation (negative fluid balance) in the later phase of sepsis is beneficial [6]. Therefore, the fluid management strategy needs to be adjusted according to the patient's changing state and treatment history. Reinforcement learning (RL) is a well-developed branch of machine learning that focuses on how a computing system responds to environmental inputs so as to maximize cumulative reward [7]. RL has been used to determine the optimal fluid treatment strategy in sepsis [8][9][10]. However, these results are difficult to interpret because of the black-box algorithms involved, which has substantially limited their use in clinical practice. The dynamic treatment regimen (DTR) method borrows the idea of RL but simplifies the functional forms of the multi-dimensional feature space, which can help clinicians to better understand the decision rules [11]. This study aimed to optimize the fluid treatment strategy (liberal versus restricted) during the course of sepsis by using DTR. Instead of using cross-sectional data from a single phase, we applied the DTR model on three different days (Day 1, Day 3, and Day 5) over the first week of treatment in order to examine the dynamic nature of fluid management in sepsis treatment. We hypothesized that the optimal fluid treatment strategy recommended by DTR would be better than the treatment actually received in terms of survival outcome.

Methods
Database and study population. The present study used the eICU Collaborative Research Database, a multi-center intensive care unit (ICU) database with high-granularity data for over 200,000 admissions to ICUs monitored by eICU Programs across the United States [12]. Patients with sepsis were included in our analysis. Sepsis was defined according to the Sepsis-2.0 definition [13]: a patient had suspected or documented infection plus 2 of the systemic inflammatory response syndrome (SIRS) criteria, namely temperature ≤ 36 °C or ≥ 38 °C, heart rate ≥ 90 bpm, respiratory rate ≥ 20 breaths/min or PaCO2 < 32 mmHg, and white blood cell count > 12,000 or < 4000 cells/mm³ or > 10% bands. Sepsis was further categorized by infection site, including cutaneous, gastrointestinal, pulmonary, urinary tract, other location and unknown location [14].
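The SIRS-based screening described above can be illustrated with a minimal sketch. The field names are hypothetical (they are not eICU column names), and "plus 2 of the SIRS criteria" is read here as "at least 2", which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vitals:
    """One patient's measurements; field names are illustrative only."""
    temperature_c: float
    heart_rate_bpm: float
    resp_rate_per_min: float
    paco2_mmhg: Optional[float]      # None if no blood gas is available
    wbc_cells_per_mm3: float
    band_fraction: Optional[float]   # immature neutrophils as a fraction (0.10 = 10%)


def sirs_count(v: Vitals) -> int:
    """Count how many of the four SIRS criteria are met, using the thresholds quoted in the text."""
    temperature = v.temperature_c <= 36.0 or v.temperature_c >= 38.0
    heart_rate = v.heart_rate_bpm >= 90
    respiration = v.resp_rate_per_min >= 20 or (
        v.paco2_mmhg is not None and v.paco2_mmhg < 32
    )
    white_cells = (
        v.wbc_cells_per_mm3 > 12_000
        or v.wbc_cells_per_mm3 < 4_000
        or (v.band_fraction is not None and v.band_fraction > 0.10)
    )
    return sum([temperature, heart_rate, respiration, white_cells])


def meets_sepsis2(v: Vitals, infection_suspected: bool) -> bool:
    """Assumed reading of the definition: infection plus at least two SIRS criteria."""
    return infection_suspected and sirs_count(v) >= 2
```

For instance, a febrile, tachycardic patient with suspected infection would satisfy this reading of the definition, whereas the same patient without suspected infection would not.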
Some definitions. The primary outcome was patient status (alive versus expired) at hospital discharge, which was treated as time-to-event survival data. The survival time was defined as the time from ICU admission to death, and patients who were discharged alive were considered censored. The state (feature space) of patients at each stage (Day 1, 3 and 5 after ICU admission) was constructed from variables including heart rate (HR), mean blood pressure (mBP), respiratory rate (RR), Glasgow coma scale (GCS), body temperature, creatinine, lactate, hemoglobin, bilirubin, use of vasopressor, platelet count, PaO2/FiO2 (P/F) ratio, daily urine output and the use of mechanical ventilation (MV).
Total fluid intake was calculated as the sum of fluid intake over a 24-h interval. We assumed that the fluid strategy was determined at the beginning of each interval. Fluid intake was also normalized by body weight. We defined liberal and restricted fluid administration as ≥ 40 ml/kg/day and < 40 ml/kg/day, respectively. This cutoff point was chosen according to the distribution of fluid intake in the study population so that both categories would have a balanced number of observations.
Statistical analysis. Descriptive statistics were analyzed conventionally using the CBCgrps package in R [15].
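As a small illustration of the weight-normalized cutoff, the classification could be written as below. The function and parameter names are hypothetical; only the 40 ml/kg/day threshold and the direction of the inequality come from the text.

```python
def fluid_strategy(total_intake_ml: float, weight_kg: float,
                   cutoff_ml_per_kg: float = 40.0) -> str:
    """Classify a 24-h fluid intake as 'liberal' (>= cutoff) or 'restricted' (< cutoff)."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    intake_per_kg = total_intake_ml / weight_kg  # ml/kg/day
    return "liberal" if intake_per_kg >= cutoff_ml_per_kg else "restricted"
```

Under this sketch, a 70-kg patient who received 2800 ml in 24 h sits exactly at the cutoff and is classified as liberal.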
The DTR model estimated a sequence of treatment strategies that maximized the survival time across stages of clinical intervention [16]. We defined three stages on days 1, 3 and 5 after ICU entry and recorded the survival time within each stage. For the first stage, the survival time $T_1$ corresponded to the time (in days) from the beginning of day 1 to day 3, or from day 1 to death if the patient died before day 3. The survival times $T_2$ and $T_3$ were defined similarly.
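One way to read the stage-wise survival times is sketched below. It assumes that time is measured in days from ICU admission, that day 3 and day 5 begin at 2 and 4 days respectively, and that the last stage runs until death or censoring; these conventions are assumptions for illustration and are not stated in the text.

```python
from typing import List, Optional

# Assumed stage start times in days since ICU admission: day 1 -> 0, day 3 -> 2, day 5 -> 4.
STAGE_STARTS = [0.0, 2.0, 4.0]


def stage_survival_times(followup_days: float) -> List[Optional[float]]:
    """Split total follow-up into per-stage times (T1, T2, T3).

    A stage the patient never entered is reported as None.
    """
    times: List[Optional[float]] = []
    for k, start in enumerate(STAGE_STARTS):
        if followup_days <= start:
            times.append(None)  # died or was censored before this stage began
            continue
        # The last stage has no upper bound; earlier stages end where the next one starts.
        end = STAGE_STARTS[k + 1] if k + 1 < len(STAGE_STARTS) else float("inf")
        times.append(min(followup_days, end) - start)
    return times
```

With these conventions, a patient followed for 4.5 days contributes 2 days to each of the first two stages and 0.5 days to the third.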
The DTR model used a backward induction algorithm to estimate the sequence of optimal rules. In the first step, the optimal stage 3 decision rule (day 5) was estimated by modeling the counterfactual survival time in stage 3 ($T_3^{a_1,a_2,a_3}$) as a function of the treatment received on day 5 (restricted [$a_3 = 0$] or liberal [$a_3 = 1$] fluid administration) and of the feature variables measured on day 5 or before ($h_{3\beta}$ and $h_{3\psi}$). The term $a_3 \psi_3^T h_{3\psi}$ is the stage 3 blip function. It represents the effect of receiving liberal instead of restricted fluid administration and its interaction with feature variables. The term $\psi_3$ is a vector of coefficients for feature variables, and $h_{3\psi}$ represents the information (previous treatment, covariates and survival times) available before the stage 3 treatment decision. We included age, mBP, unit type, HR, RR, use of vasopressor, MV, urine output, temperature, P/F ratio, and treatment strategies on days 3 and 4 as potential features to determine the optimal treatment. Variables were excluded from the final model if their p value was greater than 0.05, although some important variables such as age and mBP were retained on the basis of clinical expertise. The optimal stage 3 treatment was identified for each subject who entered stage 3 by $a_3^{opt} = I(\psi_3^T h_{3\psi} > 0)$. That is, the optimal rule recommends liberal fluid administration on day 5 if the condition $\psi_3^T h_{3\psi} > 0$ is satisfied, and restricted fluid administration otherwise.
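The display equation for the stage 3 model did not survive in this copy of the text. A form consistent with the symbols defined above is sketched below. The treatment-free term $\beta_3^{T} h_{3\beta}$ and the blip term $a_3 \psi_3^{T} h_{3\psi}$ follow from the surrounding definitions; the conditional expectation and the log link (an accelerated failure time form, a common choice for DTR estimation with survival outcomes) are assumptions.

```latex
\begin{align}
E\left[\log T_3^{a_1,a_2,a_3} \mid H_3 = h_3\right]
  &= \beta_3^{T} h_{3\beta} + a_3\,\psi_3^{T} h_{3\psi}, \\
a_3^{\mathrm{opt}} &= I\left(\psi_3^{T} h_{3\psi} > 0\right).
\end{align}
```

Under this form, the first term describes expected survival under restricted fluid administration ($a_3 = 0$), and the sign of the blip alone decides which treatment is recommended.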
In the second step, we estimated the optimal stage 2 treatment strategy by modeling the counterfactual survival time from day 3 (stage 2) onwards had the patient received the optimal stage 3 treatment. This quantity is equal to the observed survival time $T_2 + T_3$ if the patient received the optimal stage 3 treatment and is larger than $T_2 + T_3$ otherwise. A strategy similar to that of the third stage was adopted to find the optimal stage 2 treatment rule $a_2^{opt} = I(\psi_2^T h_{2\psi} > 0)$. In the third step, we proceeded to the optimization of the first stage treatment by modeling the counterfactual overall survival time had both the stage 2 and stage 3 treatments been optimal. The optimal stage 1 treatment rule was also of the form $a_1^{opt} = I(\psi_1^T h_{1\psi} > 0)$.
Ethics approval and consent to participate. Data were available on request. This study was an analysis of third-party anonymized databases with pre-existing IRB approval.
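The second step of the backward induction described in the statistical analysis, in which the stage 2 outcome is the survival time the patient would have had under the optimal stage 3 treatment, can be sketched as follows. The sketch assumes a log-linear model, so that switching to the optimal treatment multiplies the stage 3 survival time by the exponential of the blip magnitude; this scaling and the function names are assumptions, not the authors' implementation.

```python
import math


def optimal_treatment(blip: float) -> int:
    """Decision rule a_opt = I(blip > 0): 1 = liberal, 0 = restricted."""
    return 1 if blip > 0 else 0


def stage2_pseudo_outcome(t2: float, t3: float, a3: int, blip3: float) -> float:
    """Survival time from day 3 onwards had the stage 3 treatment been optimal.

    blip3 is the value of psi_3^T h_3psi for this patient. Under the assumed
    log-linear model, a non-optimal stage 3 treatment is corrected by scaling
    t3 with exp(|blip3|); an already optimal treatment leaves t3 unchanged.
    """
    a3_opt = optimal_treatment(blip3)
    gain = math.exp((a3_opt - a3) * blip3)  # equals 1 when a3 == a3_opt
    return t2 + t3 * gain
```

This reproduces the property stated in the text: the result equals $T_2 + T_3$ when the received stage 3 treatment was optimal and exceeds it otherwise (for a nonzero blip).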

Results
Baseline characteristics of included subjects. A total of 22,868 patients with sepsis were identified from the database (Fig. 1; Tables 2 and 3). However, it was not straightforward to improve survival outcome by simply reducing fluid intake, because association does not imply causation. Thus, we needed to employ DTR to prescribe the optimal amount of fluid for each individual based on current status and treatment history.
The DTR and its interpretation. The DTR model estimated a sequence of treatment rules recommending liberal or restricted fluid intake across stages in order to obtain a better survival outcome. On day 1 (stage 1), subjects should receive liberal fluid administration if they satisfied the stage 1 decision rule; otherwise, they should receive restricted fluid administration in order to achieve a better survival outcome. Subscripts of variables in the rule denote the ICU days. A positive coefficient means that the presence of, or an increase in, the variable favors liberal fluid administration. The rule indicated that liberal fluid administration was more likely to be beneficial to patients from the cardiac surgery ICU (CSICU) (coefficient: 0.413; 95% CI 0.370-0.455; p < 0.001), NeuroICU (coefficient: 1.485; 95% CI 1.438-1.532; p < 0.001) and surgical ICU (SICU) (coefficient: 0.44; 95% CI 0.357-0.522; p < 0.001), but was harmful to patients from medical […]
[Table fragment, admission source, n (%), column headers not recoverable: Other Hospital 481 (2), 373 (2), 108 (3); Direct Admit 1291 (6), 1058 (6), 233 (6)]
The day 3 (stage 2) rule indicated that greater urine output in previous days (coefficient: 0.175; 95% CI 0.128-0.223; p < 0.001 for day 2) favored liberal fluid administration on day 3. The effect of mBP on day 3 did not follow a parabolic function (the quadratic term was not statistically significant). In this case, a higher mBP favored restricted fluid administration.
In the condition for liberal fluid administration on day 5 (stage 3), there was a significant interaction (coefficient: −0.457; 95% CI −0.873 to −0.041; p = 0.039) between the day 3 and day 4 treatment strategies in determining the day 5 treatment strategy. The interaction indicated that if liberal fluid administration had been given on both day 3 and day 4, restricted fluid administration was more likely to be beneficial on day 5 (Table 4).
Table 5 compares the optimal and observed treatments for each subject. The proportion of patients who actually received liberal fluid administration but would have had a better survival outcome had they received restricted fluid administration increased from 19.3% on day 1 to 29.5% on day 5. This result indicated that patients were more likely to receive too much fluid in the later phase of sepsis than in the early phase. If all patients had received the optimal treatment strategy at all stages as recommended by the DTR, the survival time could have been significantly prolonged (5.7 [2.0, 5.9] vs. 4.1 [2.0, 5.0] days; p < 0.001).

Discussion
This study developed a simple and interpretable algorithm for calculating the fluid management strategy in critically ill patients with sepsis. The DTR model was adapted from more complex ML algorithms and took patients' current characteristics and treatment history into account. Fluid management recommendations were reported for Day 1, 3, and 5 after ICU admission. We also showed that following the optimal treatment strategy at each stage significantly improved the survival time. In other words, our hypothesis was supported. Inspecting the different stages, we observed a discrepancy between calculated and actual fluid administration. Specifically, we found that sepsis patients were more likely to receive inappropriate liberal fluid administration at the later stage than at the early stage (Table 4). Here, we discuss why the fluid management strategy calculated by the DTR model would predict a better clinical outcome, and why clinicians tended to employ inappropriate (liberal) fluid administration at the later stage of sepsis.
Conventionally, fluid administration was guided by a variety of biomarkers reflecting the circulatory status, such as serum lactate, ScvO2, and capillary refill time. The 2016 version of the Surviving Sepsis Campaign (SSC) bundle recommended maintaining MAP > 65 mmHg and normalization of lactate [18]. However, most studies investigating the effectiveness of fluid resuscitation targeting these parameters showed a neutral effect on mortality [19][20][21]. For example, the ANDROMEDA-SHOCK trial compared fluid strategies targeting lactate clearance versus normalization of capillary refill time and showed no difference between the two groups [22]. One important reason for the failure of these trials lay in the heterogeneity of the sepsis population. It has been shown that the sepsis population is highly heterogeneous and that it can be further categorized into subphenotypes based on routinely measured clinical characteristics [4,5,23]. It is important to give different fluid strategies to different patients at different stages [24]; however, it is almost impossible to achieve this goal with a conventional model based on the physician's judgement over a few clinical variables. Aid from computing technology is needed for this task. RL is a novel technique that helps an agent select the appropriate treatment to maximize the final outcome based on current states. By feeding in highly granular electronic healthcare data, RL is capable of adopting the clinical reasoning of experienced physicians and yielding the optimal fluid management strategy based on the current condition and treatment history of each individual patient [8,9]. However, RL methods based on deep learning algorithms are difficult for ICU physicians to understand and cannot be easily implemented in clinical practice. Thus, we used a simplified RL algorithm by considering a binary treatment space (liberal vs. restricted fluid administration) and modeling the decision rule (blip function) with a generalized linear model. The resultant DTR model was clinically interpretable and could easily guide clinical practice, which was an improvement over other, less accessible RL algorithms.
The appropriate selection of feature variables for the blip function was essential for applying the DTR model. Although the SSC guideline recommended targeting lactate clearance as the resuscitation guide, lactate was not statistically significant in the blip function, and thus we excluded this variable. Central venous pressure (CVP) was not included in the DTR model in our study because it has been documented to be less useful for assessing fluid status [25]. The mBP was an important determinant of fluid strategy and was statistically significant in our blip function. However, the functional form of the mapping from mBP to fluid strategy differed between the early and later stages. On the first day, a parabolic function was fitted with the turning point at 62.1 mmHg, which was very close to the 65 mmHg recommended in the SSC guideline. At the later stages (Day 3 and 5), we found that previous urine output and treatment strategy were important determinants of the current fluid strategy. This novel finding indicated that the optimal treatment strategy must take previous responses (e.g. urine output in response to fluid administration) into consideration.
Table 5. Cross table showing the difference between optimal and actually received treatment. For example, 10,728 patients received restricted fluid administration on day 1, which was consistent with the optimal treatment strategy. However, 4,410 patients received liberal fluid administration but were expected to have a better clinical outcome (survival to discharge) had they been treated with restricted fluid administration. In contrast, 5,248 patients received restricted fluid administration but would have had a better outcome had they received liberal fluid administration. Data on days 3 and 5 are interpreted in the same way.
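The day 1 figures quoted for Table 5 can be tied back to the reported 19.3% with a short calculation. The sketch assumes that all 22,868 patients appear in the day 1 cross table, so the fourth cell (liberal received, liberal optimal) is taken as the remainder; that cell is derived here and is not quoted in the text.

```python
# Day 1 counts quoted in the Table 5 caption, keyed by (received, optimal).
day1_counts = {
    ("restricted", "restricted"): 10_728,
    ("liberal", "restricted"): 4_410,
    ("restricted", "liberal"): 5_248,
}
TOTAL_PATIENTS = 22_868

# Assumption: every patient is in the day 1 table, so the last cell is the remainder.
day1_counts[("liberal", "liberal")] = TOTAL_PATIENTS - sum(day1_counts.values())


def percent(received: str, optimal: str) -> float:
    """Share of all patients in one cell of the cross table, in percent."""
    return 100.0 * day1_counts[(received, optimal)] / TOTAL_PATIENTS


inappropriately_liberal = percent("liberal", "restricted")
print(f"{inappropriately_liberal:.1f}%")  # prints 19.3%
```

The result agrees with the day 1 proportion of patients described as inappropriately treated with a liberal fluid strategy.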
Another strength of this study was that we calculated the fluid strategy over 5 days after ICU admission. The majority of previous trials focused on the first 6 or 12 h when investigating the effect of the fluid resuscitation strategy [1,21,22]. This may have contributed to the unsatisfactory results of previous trials. We argue that fluid management should be carried out during the entire disease course. Our DTR model captured the clinical variables over the dynamic process of sepsis and provided sequential decision rules to maximize the survival time. Our results showed that the proportion of subjects inappropriately treated with a liberal fluid strategy increased from Day 1 to Day 5 (i.e. these patients could have had a longer survival time had they been treated with restricted fluid administration). More recently, the concept of de-resuscitation (active removal of fluid using diuretics or renal replacement therapy) after hemodynamic stabilization has received increasing attention [26]. There is evidence that a negative fluid balance achieved with de-resuscitative measures resulted in lower mortality [6]. These studies also highlighted the importance of careful fluid management in the later phase after hemodynamic stabilization.

There are limitations in the current study. For example, the study population was identified using the Sepsis-2.0 definition, which may identify a different population from that identified by the most recent Sepsis-3.0 criteria [27]. However, this was not a prospective study in which screening criteria could be collected prospectively, and the dataset had missing data on items required for the Sepsis-3.0 definition. For example, the Sepsis-3.0 definition requires an acute increase in the SOFA score, which means that the baseline SOFA score must be known in order to apply it. The database did not contain sufficiently complete information to implement the Sepsis-3.0 criteria.

Conclusions
In conclusion, using a large volume of electronic healthcare data, the study computed a sequence of dynamic fluid management strategies for sepsis patients over the first 5 days after ICU admission. The decision rules on days 1, 3 and 5 adopted different functions of covariates and treatment histories. The optimal treatment strategy generated by the DTR model could significantly improve the survival outcome compared with the actual fluid strategy. The decision rules developed in this study require further validation in prospective cohorts.