Machine learning for early discrimination between transient and persistent acute kidney injury in critically ill patients with sepsis

Acute kidney injury (AKI) is commonly present in critically ill patients with sepsis. Early prediction of the short-term reversibility of AKI can inform risk stratification and treatment decisions. This study sought to use machine learning methods to discriminate between transient and persistent sepsis-associated AKI. Septic patients who developed AKI within the first 48 h after ICU admission were identified from the Medical Information Mart for Intensive Care III database. AKI was classified as transient or persistent according to the Acute Disease Quality Initiative workgroup consensus. Five prediction models using logistic regression, random forest, support vector machine, artificial neural network and extreme gradient boosting were constructed, and their performance was evaluated by out-of-sample testing. A simplified risk prediction model was also derived based on logistic regression and features selected by machine learning algorithms. A total of 5984 septic patients with AKI were included, 3805 (63.6%) of whom developed persistent AKI. The artificial neural network and logistic regression models achieved the highest area under the receiver operating characteristic curve (AUC) among the five machine learning models (0.76, 95% confidence interval [CI] 0.74–0.78). The simplified 14-variable model showed adequate discrimination, with an AUC of 0.76 (95% CI 0.73–0.78). At the optimal cutoff of 0.63, the sensitivity and specificity of the simplified model were 63% and 76%, respectively. In conclusion, a machine learning-based simplified prediction model including routine clinical variables could be used to differentiate between transient and persistent AKI in critically ill septic patients. An easy-to-use risk calculator can promote its widespread application in daily clinical practice.


Study population.
This study included adult patients who were admitted to the ICU with sepsis and developed AKI within the first 48 h of the ICU stay. Sepsis was defined based on the updated Sepsis-3 criteria as suspected infection (the concomitant administration of antibiotics and sampling of body fluid culture) with a Sequential Organ Failure Assessment (SOFA) score ≥ 2 points 23,24. Patients with suspicion of infection more than 24 h before or after ICU admission were excluded. The microbiology information was extracted to verify the locations and pathogens of positive cultures taken during the suspected infection time. The SOFA score was calculated using data within the first 24 h after ICU admission. AKI was diagnosed and staged according to the Kidney Disease: Improving Global Outcomes (KDIGO) guideline using both serum creatinine (SCr) and urine output (UO) criteria 25. Baseline SCr was defined as the lowest SCr value during the 7 days before ICU admission 26,27. For patients without available pre-admission SCr, we used the first SCr measurement after ICU admission as the baseline SCr 26. UO rate was calculated from the volume of UO over 6-h, 12-h and 24-h time periods. We analyzed only the first ICU stay for patients who were admitted to the ICU more than once. We also excluded patients aged < 18 years, those with end-stage renal disease or an ICU stay < 48 h, those without AKI, and those with missing data for AKI during the first 48 h.
Outcomes. The primary outcome was the persistence of AKI, which was determined in accordance with the ADQI 16 workgroup consensus 6 . Transient AKI was defined as reversal of AKI within 48 h after AKI diagnosis and for at least 48 h. In contrast, AKI was considered persistent if AKI criteria or RRT use remained present beyond 48 h after AKI diagnosis, or if the condition reversed within 48 h but relapsed within the next 48 h 6,7 . Patients with follow-up time < 48 h or missing data for the persistence of AKI were excluded from the analysis. Secondary outcomes included 28-day mortality, 90-day mortality and use of RRT within 28 days after ICU admission.
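The outcome definition above can be written as a small decision rule. The following Python sketch is only an illustration: it assumes that AKI status (AKI criteria or RRT use present) is available on an hourly grid starting at AKI diagnosis, which is a simplification introduced here, and the function name classify_persistence is hypothetical rather than taken from the study.

```python
from typing import Optional, Sequence

WINDOW_H = 48  # length (hours) of both the reversal window and the sustained-recovery window


def classify_persistence(aki_active: Sequence[bool]) -> Optional[str]:
    """Label one patient as 'transient' or 'persistent'.

    aki_active[h] is True when AKI criteria (or RRT use) are present h hours
    after AKI diagnosis. The hourly grid is an assumption of this sketch.
    Returns None when follow-up is too short to decide.
    """
    # First hour within 48 h of diagnosis at which AKI is no longer present.
    reversal = next(
        (h for h, active in enumerate(aki_active[: WINDOW_H + 1]) if not active),
        None,
    )
    if reversal is None:
        # No reversal within 48 h: persistent, provided follow-up exceeds 48 h.
        return "persistent" if len(aki_active) > WINDOW_H else None

    # Reversal must be sustained for the next 48 h; any relapse means persistent.
    recovery = aki_active[reversal : reversal + WINDOW_H]
    if any(recovery):
        return "persistent"
    return "transient" if len(recovery) == WINDOW_H else None
```

For example, a patient whose AKI resolves 10 h after diagnosis and stays resolved for 48 h is labeled transient, whereas a relapse 20 h after resolution yields a persistent label.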
Data extraction. We obtained demographic and clinical data within the first 48 h after ICU admission using PostgreSQL tools (version 9.6.20) and Navicat Premium (version 15.0.12). Comorbidities and diagnoses were identified based on the recorded International Classification of Diseases 9th Edition code. Vital signs including temperature, heart rate, respiratory rate and mean arterial pressure were extracted from the electronic charted data. Laboratory data including hemoglobin, white blood cell count, platelet count, bilirubin, albumin, arterial pH, partial pressure of oxygen, partial pressure of carbon dioxide, anion gap, serum electrolytes (sodium, potassium, chloride and bicarbonate), lactate, international normalized ratio and partial thromboplastin time were also recorded. We used the values related to the greatest disease severity for variables measured more than once during the first 48 h. Accordingly, both the maximum and minimum values of some variables were included. In addition, the use of mechanical ventilation, vasopressors, diuretics and RRT and the volume of mean daily intravenous infusion within the first 48 h were collected. We left out RRT initiation when determining the AKI stage, as we chose to record it as another variable.
Statistical analysis. Baseline characteristics and outcomes were compared between patients with transient and persistent AKI. Continuous variables were presented as medians (with interquartile ranges) and compared using the Mann-Whitney U test. Categorical variables were presented as numbers (with percentages) and compared using chi-square tests. To ensure the validity and reliability of the prediction model, we removed two variables with > 30% missing data from model construction, namely maximum bilirubin and minimum albumin (see Supplementary Table S1).
www.nature.com/scientificreports/
The sample was randomly divided into training and testing sets at a ratio of 7 to 3. Five machine learning algorithms were used to develop prediction models for persistent AKI in the training set, including logistic regression, RF, support vector machine (SVM), artificial neural network (ANN) and extreme gradient boosting (XGB). RF is a tree-based algorithm, which integrates multiple decision trees through majority voting to determine the results of classification 28. The Gini index was used as the criterion for impurity measurement during the training process. SVM is a supervised classifier, the purpose of which is to establish the optimal maximum-margin hyperplane as the decision boundary 29. We chose the Gaussian kernel function as the kernel when developing the SVM model. ANN is a mathematical model simulating the structure and function of biological neural networks, which contains connected nodes named artificial neurons and multiple layers (typically input layer, hidden layer and output layer) 30. XGB is also a tree-based ensemble classifier, which obtains the final output by weighting multiple weak learners (decision trees) and uses a gradient descent algorithm to minimize the loss function 31. Before model construction, categorical variables were preprocessed by one-hot encoding and the prediction variables were standardized.
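The preprocessing steps named above (a 7:3 random split, one-hot encoding and standardization) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the study's code: the helper names and the seed are hypothetical, and scaling the testing set with training-set statistics is a common choice that the text does not specify.

```python
import random
from statistics import mean, pstdev


def split_indices(n, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns, one per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def standardize(train_col, test_col):
    """Z-score both columns using the training mean and standard deviation
    (assumption of this sketch: test data never influence the scaling)."""
    mu = mean(train_col)
    sd = pstdev(train_col) or 1.0  # guard against a constant column
    return ([(x - mu) / sd for x in train_col],
            [(x - mu) / sd for x in test_col])


# Usage with the cohort size reported in this study (5984 patients).
train_idx, test_idx = split_indices(5984)
```

With 5984 patients, this split yields 4189 training and 1795 testing indices.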
For each machine learning algorithm, we first set default hyper-parameters to establish an initial model. After that, parameter tuning was performed by manual grid search. We used five-fold cross-validation to identify optimal hyper-parameters and avoid over-fitting. Briefly, the training set was randomly divided into 5 roughly equal-sized subsets; the model was fit on 4 of them while the remaining subset was used for model validation. This process was repeated 5 times so that every subset could serve as a validation set. Subsequently, the performance of the final model was assessed on the testing set. We calculated several evaluation indexes of each model, including the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall and F1 score. AUC was selected as the primary performance metric, as it is an evaluation metric for classifiers that is independent of threshold setting.
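The cross-validation scheme and evaluation indexes described above can be illustrated as follows. This is a minimal standard-library sketch, not the study's implementation: the function names are hypothetical, and the AUC is computed by the simple pairwise definition, which is slow on large samples but makes the threshold independence explicit.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized subsets
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count one half); no classification threshold is involved."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 score for binary 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }
```

In a grid search, each candidate hyper-parameter setting would be scored by its mean validation AUC over the five folds, and the best setting would then be refit on the whole training set.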
To further extend the clinical applicability of machine learning methods, we also developed a risk prediction model by simplifying the input variables. Firstly, all features were sorted by XGB according to their contribution to each tree in the learning process, and the top 20 important features were selected 31 . Then we used least absolute shrinkage and selection operator (LASSO) method for further feature selection 32 . During the process, cross-validation was performed and the value of λ was identified according to the most regularized model, in which the cross-validated error is within one standard error of the minimum. Fourteen variables were selected as predictors of persistent AKI. Finally, logistic regression was used to construct the simplified prediction model. Model performance was evaluated in the testing set, with the optimal cutoff identified by the maximum Youden index in the training set.
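Two selection rules in this paragraph lend themselves to a short illustration: choosing λ by the one-standard-error rule and choosing the cutoff by the maximum Youden index. The sketch below assumes that the cross-validated errors and their standard errors have already been computed for a grid of λ values; the function names and the inputs used in any example are hypothetical.

```python
def lambda_one_se(lambdas, cv_errors, cv_ses):
    """Most regularized (largest) lambda whose cross-validated error lies
    within one standard error of the minimum cross-validated error."""
    best = min(range(len(lambdas)), key=lambda i: cv_errors[i])
    limit = cv_errors[best] + cv_ses[best]
    return max(lam for lam, err in zip(lambdas, cv_errors) if err <= limit)


def youden_cutoff(y_true, scores):
    """Cutoff maximizing J = sensitivity + specificity - 1.

    A patient is predicted positive when the score is >= the cutoff.
    Returns (cutoff, J); ties are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cut)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```

Choosing the largest eligible λ trades a small, statistically negligible loss in cross-validated error for a sparser model, which is consistent with the aim of a simplified predictor set.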

Patient characteristics.
A total of 5984 SA-AKI patients were enrolled in our study from 24,225 septic patients admitted to ICU during the study period. Among them, 2179 (36.4%) patients had an early complete reversal and 3805 (63.6%) developed persistent AKI (Fig. 1).
Baseline characteristics and outcomes of patients stratified by the persistence of AKI are shown in Table 1. Compared to patients with transient AKI, patients with persistent AKI had a higher proportion of emergency admission and medical ICU stay. The prevalence of diabetes mellitus, congestive heart failure, liver disease and chronic kidney disease (CKD) was higher in the persistent AKI patients. Most of the vital signs and laboratory data differed significantly between the two groups, and the measurements were mainly associated with higher disease severity in the persistent AKI group. Furthermore, a larger percentage of the persistent AKI patients received mechanical ventilation, vasopressors and RRT during the first 48 h. Renal dysfunction was more severe in the persistent AKI group, as reflected by a higher AKI stage according to SCr or UO criteria. The locations and pathogens of microbiology cultures in SA-AKI patients are shown in Supplementary Tables S3, S4 online, and the 20 most common diagnoses in SA-AKI patients are shown in Supplementary Table S5 online.
Prediction models using machine learning algorithms. We randomly allocated 70% of SA-AKI patients to the training set and the remaining 30% to the testing set. Baseline characteristics were not significantly different between the training and testing set (see Supplementary Table S6 Fig. 3. Ultimately, fourteen variables were selected and entered into the logistic regression model ( Table 3). The simplified model showed adequate discrimination, with an AUC of 0.76 (95% CI 0.74-0.77) in the training set and 0.76 (95% CI 0.73-0.78) in the testing set (Fig. 4). The calibration of the model was overall good, except that it underestimated the risk of persistent AKI when the observed frequency was relatively low (Fig. 5). At the optimal cutoff of 0.63, the simplified model achieved a sensitivity of 63%, specificity of 76%, positive predictive value of 83% and negative predictive value of 53% in the testing set (Table 4).
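The reported predictive values can be checked approximately against the sensitivity, specificity and outcome prevalence. The sketch below assumes that the prevalence of persistent AKI in the testing set equals the overall cohort prevalence (3805 of 5984) and uses the rounded sensitivity and specificity, so only approximate agreement should be expected; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Assumption: testing-set prevalence equals the cohort prevalence.
ppv, npv = predictive_values(0.63, 0.76, 3805 / 5984)
```

Under these assumptions the sketch gives a PPV of about 0.82 and an NPV of about 0.54, close to the reported 83% and 53%.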
We used Matlab software (version 9.2) to establish a risk calculator, which could be applied to automatically compute the risk of persistent AKI for SA-AKI patients in clinical settings (see Supplementary Fig. S1).
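A risk calculator built on a logistic regression model applies the logistic function to a weighted sum of the predictors. The Python sketch below only illustrates that computation and is not the Matlab calculator described here: the intercept, the coefficients and the three-predictor subset are hypothetical placeholders, whereas the fitted values for all fourteen variables are those reported in Table 3. Only the cutoff of 0.63 is taken from the text.

```python
import math

# Hypothetical values for illustration only; not the fitted model.
INTERCEPT = -1.0
COEFFICIENTS = {
    "age_per_10_years": 0.10,
    "ckd": 0.50,
    "diabetes_mellitus": 0.30,
}
CUTOFF = 0.63  # optimal cutoff reported for the simplified model


def persistent_aki_risk(patient):
    """Predicted probability of persistent AKI from a logistic model."""
    z = INTERCEPT + sum(coef * patient[name] for name, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def is_high_risk(patient):
    """Flag patients whose predicted risk reaches the cutoff."""
    return persistent_aki_risk(patient) >= CUTOFF
```

Because each coefficient enters the linear predictor directly, the effect of any single variable on the predicted risk can be read from its sign and magnitude.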

Discussion
In the present study, we explored the applicability of machine learning methods to differentiate between transient and persistent AKI in a large population of SA-AKI patients. The ANN and logistic regression models exhibited the highest AUC among the five machine learning models. Additionally, a simplified risk prediction model was proposed, based on the combination of machine learning algorithms and logistic regression, and could be easily implemented using the risk calculator in daily routines.
A growing body of evidence suggests that duration of AKI or renal recovery is associated with outcomes in critically ill septic patients 2,7,8,33,34. Several clinical tools, including urinary indices 10-12, imaging techniques 13,17, prediction models 35,36, and biomarkers 14-17, were investigated in previous studies to predict renal recovery or its surrogate, namely progression to severe AKI. Nevertheless, they were found to be poorly effective or have not been validated in patients with sepsis 9. A recent study enrolling 184 septic shock patients with AKI found a poor performance of urine cell cycle arrest biomarkers for predicting persistent AKI, with an AUC of 0.67 (95% CI 0.59-0.73). Of note, they also proposed a prediction model combining SCr, UO, norepinephrine dose and extrarenal SOFA at baseline, which performed well with an AUC of 0.81 (95% CI 0.74-0.86) 16. Due to the complexity of SA-AKI, the clinical model integrating routine parameters may be more effective for predicting short-term reversibility of AKI than any parameter considered alone. A possible way to achieve this is to utilize advanced machine learning approaches, which have been applied in the prevention and management of AKI, such as predicting the development of AKI 37-41, volume responsiveness in patients with oliguria 42 and mortality in critically ill AKI patients 43-45. Our study corroborated the promise indicated by these previous studies and extended them by demonstrating the applicability of machine learning methods for predicting persistent AKI in a large cohort of SA-AKI patients.
In the current study, ANN and logistic regression achieved the highest AUC among the five machine learning methods. Compared with traditional modeling methods, ANN has the advantages of strong nonlinear mapping ability, great adaptability and high fault tolerance. Several recent studies have shown the effectiveness of neural network-based models in predicting the development of AKI. Le et al. proposed a convolutional neural network prediction system, which outperformed the XGB model and the SOFA score in predicting AKI 48 h before onset in ICU patients 40. Similarly, Kim et al. used a recurrent neural network to assess future AKI occurrence and 46 . However, due to its "black box" characteristic, ANN is also hard to calculate and interpret. It is difficult to exhibit the complex association between different layers and nodes intuitively and to explain the exact impact of each input variable on the final result, which may limit its rapid clinical application. In this study, the conventional logistic regression showed a higher AUC than several novel machine learning algorithms. The results were mainly determined by the nature of the dataset, as no single modeling approach can be the optimal method for all tasks 47. In the logistic regression model, each variable's influence on the outcome can be directly reflected by the regression coefficient. Hence, we further utilized it to propose a simplified prediction model with features selected by XGB and LASSO algorithms. The high interpretability and promising performance of the simplified model make it suitable for clinical application. Since the present study is an initial attempt, future studies will investigate the extensibility of advanced approaches from other domains 48 and improvement of the existing algorithms 49,50 in predicting the persistence of AKI.
Our study has important clinical significance. The prediction model for persistent AKI can assist risk stratification and therapeutic strategies of SA-AKI patients at an early stage 9. For high-risk patients, large-volume fluid infusion should be used cautiously to avert detrimental fluid overload. The requirement and optimal timing of RRT can be evaluated for patients without the indication of urgent hemodialysis. Constant monitoring is necessary, especially for high-risk patients, to assess the hemodynamic and fluid status, kidney function, complications of AKI and the risk of long-term adverse sequelae. Additionally, high-risk patients may be the ideal population for AKI clinical trials because they are unlikely to experience spontaneous and rapid reversal of AKI.
Many factors, including demographics, comorbidities and disease severity, can affect short-term renal recovery 51 . In this study, fourteen predictors of persistent AKI were identified by XGB and LASSO algorithms. The SCr and UO criteria of AKI stage were both strong predictors of persistent AKI. The results further supported that patients who meet both the SCr and UO criteria for AKI are at higher risk of death or RRT 52 . Among patient-related variables, age, CKD, diabetes mellitus and congestive heart failure were identified as predictors of persistent AKI, as they may cause reduced glomerular reserve and delayed or incomplete renal recovery 51 .

Variables; Transient AKI (n = 2179); Persistent AKI (n = 3805); P value
54,55 . A close relationship between mechanical ventilation and worsening of renal function was observed in a large cohort of ICU patients 56 . Metabolic acidosis is common in SA-AKI patients and can directly influence cardiac contractility and sensitivity of adrenergic receptors 57 . Coagulopathy, mainly caused by the activation or injury of endothelial cells, plays an important role in the pathogenesis of SA-AKI through microcirculatory dysfunction 58 . Our results further demonstrated that sepsis-related factors, including those relevant to respiratory failure, metabolic acidosis and coagulation disorder, could contribute to the prediction of persistent AKI. Further studies are required to investigate the exact pathophysiological mechanisms of reversibility of SA-AKI and determine whether modification of these factors can facilitate renal recovery and improve prognosis.
There are some strengths of our study. Firstly, with the combination of logistic regression and feature selection by machine learning algorithms, we established a simplified risk prediction model with high practicability and interpretability. Secondly, fourteen predictors of persistent AKI were selected by state-of-the-art algorithms. The unbiased machine learning methods can help identify important features, which are clinically significant but may be ignored by clinicians according to their traditional experience. Thirdly, an easy-to-use risk calculator was developed to allow automatic quantified assessment of the risk of persistent AKI, which is a useful tool for clinicians to identify high-risk patients and improve clinical decision-making abilities.
However, this study is also subject to some limitations. Firstly, it was a single-center retrospective study based on a publicly accessible database, which may limit the generalizability of the prediction model in patients with differently distributed features. External validation is still necessary, and clinical impact studies should be conducted to assess the model's effectiveness before its clinical implementation. Secondly, although we only included variables with ≤ 30% missing values, there were still 2.2% of all observations missing. Some candidate variables were excluded owing to a large percentage of missing values. Finally, similar to other machine learning models, the performance of our model was not perfect 38,45,47 . Possible reasons include the limited set of predictors, retrospective study design and heterogeneity of SA-AKI patients. Novel biomarkers, which were potential predictors of persistent AKI but not routinely measured in clinical settings, were not included in the prediction model. Based on this study, there is a continuing need for future studies to combine the clinical prediction model and biomarkers to predict persistent AKI.
In conclusion, machine learning algorithms are helpful to distinguish between transient and persistent AKI and identify the predictors of persistent AKI in critically ill septic patients. A simplified 14-variable risk prediction model was developed and validated with high practicability and interpretability. A risk calculator was established to facilitate its widespread application in daily clinical practice, which may help identify high-risk patients, guide treatment decisions and improve prognosis. Future prospective studies are needed to demonstrate the model's generalizability and effectiveness and determine whether the addition of novel biomarkers could improve the predictive ability.

Data availability
The datasets analyzed during the current study are available in the MIMIC-III database (https://mimic.physionet.org/).
Table 4. Performance of the simplified risk prediction model in the training and testing set. AUC, area under the receiver operating characteristic curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.