This study aimed to establish and validate a machine learning (ML) model for predicting in-hospital mortality in patients with sepsis-associated acute kidney injury (SA-AKI). This study collected data on SA-AKI patients from 2008 to 2019 using the Medical Information Mart for Intensive Care IV. After employing Lasso regression for feature selection, six ML approaches were used to build the model. The optimal model was chosen based on precision and area under curve (AUC). In addition, the best model was interpreted using SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) algorithms. There were 8129 sepsis patients eligible for participation; the median age was 68.7 (interquartile range: 57.2–79.6) years, and 57.9% (4708/8129) were male. After selection, 24 of the 44 clinical characteristics gathered after intensive care unit admission remained linked with prognosis and were utilized developing ML models. Among the six models developed, the eXtreme Gradient Boosting (XGBoost) model had the highest AUC, at 0.794. According to the SHAP values, the sequential organ failure assessment score, respiration, simplified acute physiology score II, and age were the four most influential variables in the XGBoost model. Individualized forecasts were clarified using the LIME algorithm. We built and verified ML models that excel in early mortality risk prediction in SA-AKI and the XGBoost model performed best.
Sepsis is a complicated medical condition caused by an infection that triggers a systemic inflammatory response1. It is the most common and dangerous cause of illness and death in critically ill people2. It is well known that sepsis frequently leads to acute kidney injury (AKI). AKI occurs in around 40% of people with severe sepsis, increasing the difficulty, cost, and likelihood of death during treatment3,4,5,6. Sepsis-associated acute kidney injury (SA-AKI) is a complicated condition, including multiple contributing factors associated with a worse prognosis, a longer length of hospital stay, and a more significant number of co-morbidities than in sepsis patients with no AKI5,6,7. It is crucial to accurately predict the prognosis for SA-AKI patients in the intensive care unit (ICU) due to their critical condition.
In critical care medicine, the prognosis of SA-AKI patients is a hot topic. Several scoring systems have been developed to predict outcomes in patients with SA-AKI; however, their performances have been disappointing due to low specificity and sensitivity. These scoring systems include the sequential organ failure assessment (SOFA) score, the simplified acute physiology score II (SAPS II), and the acute physiology and chronic health evaluation II8,9. Moreover, some multivariate prediction models for predicting the outcome of patients with SA-AKI have been developed. These models are based on standard statistical techniques like logistic regression and the Cox proportional risk model. Hu et al. used the Cox proportional risk model to construct a mortality prediction model for 2066 patients with SA-AKI, showing a preferable forecast performance9. However, the links between variables are intricate, including both linear and non-linear relationships; the Cox proportional risk model is, by default, calibrated to handle linear associations between dependent and independent variables, which may oversimplify more complex non-linear relationships. In addition, the Cox proportional risk model is susceptible to multicollinearity between variables, which might lower the model's performance. Consequently, it is crucial to investigate more effective and precise prediction techniques in the care of SA-AKI patients.
As statistical theory and computer technology have advanced, so has an interest in and acceptance of machine learning (ML) among medical professionals. Predictive models for various diseases have significantly benefited from cutting-edge ML approaches, outperforming their more conventional logistic and Cox regression-based counterparts10,11. The clinical applications of ML have ranged from diagnosis to prediction and have been utilized in various clinical domains12,13,14. ML methods have also been used to forecast the prognoses of critically ill patients, with results that are superior to those obtained using the more conventional methods of logistic regression and Cox regression analysis15,16,17,18. However, the advantage of ML algorithms in predicting mortality in SA-AKI patients has not yet been demonstrated. This research tried to create and verify ML models for early predicting in-hospital mortality in SA-AKI patients.
The Medical Information Mart for Intensive Care IV (MIMIC IV) database is an integrated, de-identified, and full clinical dataset that covers all patients who were hospitalized in the ICUs at Beth Israel Deaconess Medical Center in Boston, Massachusetts, between the years 2008 and 201919. We acquired the certificate necessary to apply for database access after passing the exam to ensure the safety of human study participants (No. 35970146). Patient permission and an ethical approval statement were unnecessary because the experiment would not have affected clinical care, and all patient data had already been de-identified20. This research was carried out in conformity with the principles of the 2013 Helsinki Declaration.
This study enrolled adults with sepsis who developed AKI within 48 h of ICU admission. Patients with sepsis were identified within 24 h of ICU admission using the Sepsis-3 criteria, which required the presence of both a probable infectious cause and a SOFA score ≥ 221. AKI was diagnosed using the Kidney Disease: Improving Global Outcomes Clinical Practice Guideline’s (2012) suggested criteria of serum creatinine (Scr) and urine output22. The first available Scr following ICU admission was used as the baseline value if no Scr was available prior to admission23. Due to the importance of maintaining data independence, only the initial ICU hospitalization was included in the study if the patient had multiple admissions. Patients under the age of 18 or with ICU stays shorter than 48 h were excluded.
We gathered data on the patient’s demographic features, chronic disease history, vital signs, laboratory results, Treatments, illness severity scores, and outcomes.
Demographic features obtained for the research consisted of age, sex, and weight. Chronic disease history included chronic pulmonary disease, peptic ulcer disease, peripheral vascular disease, myocardial infarction, cerebrovascular disease, diabetes, acquired immune deficiency syndrome, renal disease, dementia, rheumatic disease, paraplegia, liver disease, cancer, and congestive heart failure. We gathered mean values for vital information, such as heart rate, mean arterial pressure, respiration rate, body temperature, and SpO2, in the first 24 h after ICU admission. The highest values for a variety of laboratory results were collected in the first 20 h following ICU admission. These tests included the Scr, serum glucose, serum chloride, serum calcium, hematocrit, hemoglobin, platelets, anion gap, white blood cell, international normalized ratio, bicarbonate, serum sodium, blood urea nitrogen, serum potassium, prothrombin time, and partial thromboplastin time. We also recorded the quantity of urine passed in the first 24 h after ICU admission. Treatments included using renal replacement therapy, vasopressors, and mechanical ventilation during the first 24 h after ICU admission. In the first 24 h following ICU admission, we analyzed the initial SOFA score and SAPS II to determine the severity of the patient’s conditions.
Preprocessing of data
In this study, missing values for all variables were fewer than 20% (See Supplementary Table S1). When dealing with missing data, we employed the multiple imputation method implemented in the Python ‘miceforest’ package, widely acknowledged as a superior strategy for missing variables. We identified potential mortality-related variables using the least absolute shrinkage and selection operator (LASSO) analysis to reduce overfitting.
The median and interquartile range (IQR) were used to describe the normal distribution of continuous variables, whereas numbers and percentages were used to describe categorical variables. If applicable, The Mann–Whitney or Student’s t-test. Comparing continuous variables between groups using the U test. Apply either the Pearson chi-squared or Fisher's exact test to evaluate the significance of group differences in categorical variables.
R version 4.2.1 and Python version 3.9.12 were used for all statistical analyses. A two-tailed P value below 0.05 was considered statistically significant.
In this study, all participants were randomly divided into two sets a training set (consisting of 80% of patients) and a validation set (consisting of 20% of patients) (See Supplementary Table S2). The dimension of the features was reduced using the LASSO technique. Six different ML methods—logistic regression, support vector machine (SVM), k-nearest neighbor (KNN), decision tree, random forest (RF), and extreme gradient boosting (XGBoost)—were used to create and test models for predicting risks of in-hospital mortality. The ML model prediction ability was measured using accuracy, area under curve (AUC), sensitivity, specificity, and average precision. After considering accuracy and AUC, we settled on our final candidate model. To compare the predictive power between the models, we performed decision curve analysis (DCA) and plotted calibration curves. Using SHapley Additive exPlanations (SHAP) values, we characterized the crucial characteristics that affect mortality risk in the best ML model to study further each characteristic’s significance to the optimal model’s output. At last, the Local Interpretable Model-Agnostic Explanations (LIME) technique is used to fit the model's expected behavior. Finally, a sensitivity analysis of the results was performed.
Ethics approval and consent to participate
MIMIC IV was set up with the approval of the Institutional Review Board at the Massachusetts Institute of Technology. All participant data were anonymized to safeguard their privacy. Due to the use of anonymized health records, ethical approval and informed consent were not required. This study adheres to the ethical criteria outlined in the Helsinki Declaration of 1964.
The total number of people with SA-AKI who were considered for inclusion was 16,473. However, 2269 were ruled out because they had more than one ICU hospitalization, and 6592 excluded for less than 48 h in ICU. In the end, 8129 patients qualified for the study (Fig. 1). The prevalence of death in hospitals was 20% (1629/8129). The median age of these patients was 68.7 (IQR: 57.2–79.6) years, and 57.9% (4708/8129) were male. The top three comorbidities were congestive heart failure (2831/8129, 34.8%), diabetes (2566/8129, 31.6%), and chronic pulmonary illness (2358/8129, 29.0%). Table 1 provides a summary of the base characteristics of the dataset.
Multiple imputations were employed to fill in missing values for each variable. On the first day after being admitted to the ICU, 44 variables were gathered and analyzed using LASSO regression. Twenty-four variables were found to be statistically significant predictors of mortality after feature selection with LASSO analysis (See Supplementary Fig. S1), and these are listed in detail in Supplementary Table S3.
Model development and validation
The total number of patients was 8129, and they were divided randomly between a training group of 6503 (80%) and a validation cohort of 1626 (20%). There were no significant differences in the baseline features between the training and validation sets. Using LASSO regression’s selected 24 variables, we built six ML models: logistic regression, SVM, KNN, decision tree, RF, and XGBoost. The XGBoost model achieved the highest AUC (0.794) in the validation cohort, outperforming logistic regression (0.730), SVM (0.680), KNN (0.601), decision tree (0.585), and RF (0.778) (Fig. 2A). In order to dig even deeper into the performance of the six models, we also measured their accuracy, sensitivity, specificity, and average precision, and the outcomes are tabulated in Table 2. Other clinical disease severity ratings [SOFA score (AUC: 0.701); SAPS II (AUC: 0.706)] did not perform as well as the XGBoost model (Fig. 2B). The DCA curves and calibration curves show that the XGBoost model performs best among the six models (See Supplementary Figs. S2 and S3).
With SHAP values, we hoped to provide more insight into how the XGBoost model predicts deaths. For the XGBoost model, the SHAP summary graphic reveals that the SOFA score, respiratory rate, SAPS II, and age are the four most significant parameters (Fig. 3). In addition, we used SHAP dependence analysis to illustrate the impact of a single input variable had on the XGBoost prediction model’s final results (Fig. 4). Figure 5 displays the findings of a more in-depth analysis of the four most influential clinical features on the XGBoost prediction model's output.
We then took two random samples from the validation set and ran them through the LIME algorithm to shed light on the individual mortality forecast. Figure 6A depicts the case of death reported by the LIME algorithm. 76% was the expected probability of death according to the XGBoost model. The XGBoost model found a SOFA score of 17, a SAPS II of 77, a temperature of 36.27 °C, a history of malignancy, and a hemoglobin level of 9.6 g/dL were all associated with an elevated risk of mortality. Scr levels below 3.2 mg/dL and the absence of a prior history of cerebrovascular illness or paraplegia were found to reduce mortality risk. Both the XGBoost model and the actual outcome for this patient were death. Similarly, Fig. 6B illustrates a survival example utilizing the LIME technique. 10% was the expected probability of death according to the XGBoost model. The patient's age of 86.42 years, the history of cerebrovascular illness, and the hemoglobin level of 9 g/dL increase the risk of mortality, whereas a SOFA score of 4, the absence of a history of cerebrovascular disease, a respiratory rate of 14.98 beats per minute, the absence of a history of paraplegia, and the absence of a history of liver disease decrease the risk of death. Both the actual and expected outcomes confirmed the XGBoost model's prediction of the patient’s survival.
For patients without renal disease (N = 1232), the XGBoost model remained robust in predicting mortality in these patients (AUC: 0.808). Detailed results are shown in Supplementary Fig. S4.
This study developed and verified six ML methods for estimating the risk of in-hospital death among patients with SA-AKI. In predicting mortality in patients with SA-AKI, the XGBoost models outperformed other ML models (such as logistic regression, SVM, KNN, decision tree, and RF models) and conventional risk scores (such as the SOFA score and SAPS II). After doing a feature importance analysis, we found that the SOFA score, respiratory rate, SAPS II, and age were the top 4 features of the XGBoost model in terms of their ability to predict mortality. Also, we have documented how these factors influenced the XGBoost model. Finally, we use the LIME algorithm for personalized forecasting.
SA-AKI is frequent in critically ill patients and is characterized by rapid clinical deterioration and a considerably greater mortality rate than patients without AKI or those with other variables producing AKI24. Clinicians require accurate prediction models to evaluate the risk of dying and make appropriate therapy choices for severely ill patients with SA-AKI. For outcome prediction in intensive care settings, generic metrics such as the SOFA score and SAPS II are widely applied. The SOFA and SAPS II scoring systems have a number of shortcomings compared to ML models, including unsatisfactory predictive performance, poor specificity and sensitivity, a wide range of variability, and a laborious procedure10. According to the results of this research, the conventional severity scoring methods performed poorly compared to the ML model. This could be due to the two reasons listed below. First, the risk of bad outcomes in critically sick patients was assessed using the SOFA and SAPS II scoring systems, which relied heavily on the practitioner’s prior expertise25. Second, unlike multivariate models, these scoring systems cannot analyze a large number of potentially valuable variables, reducing their predictive power26.
Consistent with previous studies, our results show that the XGBoost model outperforms the other ML models in predicting death in SA-AKI patients. Liu et al. found that the XGBoost model predicted mortality in AKI patients better than logistic regression, SVM, and RF27. Zhu et al. evaluated the prediction of hospital mortality for patients on mechanical ventilation and discovered that the XGBoost model performed better than the RF, logistic regression, decision tree, and KNN models28. It is possible that multiple factors contributed to the boost in prediction abilities seen in XGBoost models. Firstly, the XGBoost method, derived from the gradient tree boosting framework, is highly skilled at fitting high-order interactions, discontinuities, and non-linearities. Second, the XGBoost method is resistant to outliers in the predictor variables and multicollinearity among those variables.
To enhance the interpretability of the model, we employ the SHAP value to explain and show the most influential elements of the prediction results. This study's analysis of the XGBoost model’s summary of feature importance found that the SOFA score was the most important predictor of mortality in patients with SA-AKI. The SOFA score quantifies organ impairment by measuring the burden of organ malfunction. The SOFA score was found to have a significant correlation with clinical outcomes, with a high SOFA score typically suggesting a critical condition and poor prognosis. Despite this, none of the prior models predicted the probability of mortality for SA-AKI patients using this crucial factor9,28. The SOFA score was the most heavily weighted variable in the XGBoost model, showing its significance in predicting the mortality of patients with SA-AKI in the present study. Additionally, we found that the rate of breathing is significantly correlated with the risk of dying from SA-AKI. Some research has demonstrated a correlation between breathing rate and inferior outcomes until the present day29. The SAPS II was an additional influential predictor of SA-AKI outcomes. SAPS II is a measure of disease severity, and greater SAPS II scores are linked to higher in-hospital death in critically sick patients30. Our results also showed that age was a significant risk factor for death among critically ill patients identified with SA-AKI. In the absence of co-morbidities and advanced age, the mortality rate from sepsis is less than 5%, according to a study conducted in Australia and New Zealand31.
However, this study also had some shortcomings. First, our study may be subject to selection bias because of its retrospective and observational design. Second, the current study was limited in its ability to conclude cause and effect because it was a retrospective modeling study conducted at a single location utilizing the MIMIC IV database. That being said, more prospective randomized clinical trials are needed to verify our model's efficacy. Third, we estimated specific missing data using the filling method, which may result in divergence from the genuine value. Finally, this study only validated the models internally; therefore, multicenter external validation is still needed to assess the models’ predictive ability. Finally, this study only validated the models internally; therefore, multicenter external validation is still required to assess the predictive potential of the models.
We built and verified ML models that excel in early mortality risk prediction in SA-AKI. The XGBoost model is the most effective of all the algorithms. Further SHAP values and LIME method indicated that SOFA score, respiratory rate, SAPS II, and age were played as the marked contributors for the prediction of death in SA-AKI patients. These findings would be helpful for clinical prediction. However, multi-center studies are still necessary to ensure that if this ML model are broadly applicable and generalizable to various settings and associated with improved clinical decision-making and outcomes.
The datasets presented in the current study are available in the MIMIC IV database (https://physionet.org/content/mimiciv/1.0/).
Bone, R. C., Sprung, C. L. & Sibbald, W. J. Definitions for sepsis and organ failure. Crit. Care Med. 20(6), 724–726 (1992).
Ricci, Z., Polito, A., Polito, A. & Ronco, C. The implications and management of septic acute kidney injury. Nat. Rev. Nephrol. 7(4), 218–225 (2011).
Chertow, G. M., Burdick, E., Honour, M., Bonventre, J. V. & Bates, D. W. Acute kidney injury, mortality, length of stay, and costs in hospitalized patients. J. Am. Soc. Nephrol. 16(11), 3365–3370 (2005).
Uchino, S. et al. Acute renal failure in critically ill patients: A multinational, multicenter study. JAMA 294(7), 813–818 (2005).
Schrier, R. W. & Wang, W. Acute renal failure and sepsis. N. Engl. J. Med. 351(2), 159–169 (2004).
Oppert, M. et al. Acute renal failure in patients with severe sepsis and septic shock: A significant independent risk factor for mortality—results from the German Prevalence Study. Nephrol. Dial. Transplant.: Off. Publ. Eur. Dial. Transplant Assoc: Eur. Renal Assoc. 23(3), 904–909 (2008).
Bagshaw, S. M. et al. Septic acute kidney injury in critically ill patients: Clinical characteristics and outcomes. Clin. J. Am. Soc. Nephrol.: CJASN 2(3), 431–439 (2007).
da Hora, P. R. et al. A clinical score to predict mortality in septic acute kidney injury patients requiring continuous renal replacement therapy: The HELENICC score. BMC Anesthesiol. 17(1), 21 (2017).
Hu, H. et al. A prediction model for assessing prognosis in critically ill patients with sepsis-associated acute kidney injury. Shock (Augusta, Ga) 56(4), 564–572 (2021).
Hou, N. et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: A machine learning approach using XGboost. J. Transl. Med. 18(1), 462 (2020).
Du, M., Haag, D. G., Lynch, J. W. & Mittinty, M. N. Comparison of the tree-based machine learning algorithms to cox regression in predicting the survival of oral and pharyngeal cancers: Analyses based on SEER database. Cancers 12(10), 2802 (2020).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521(7553), 436–444 (2015).
Obermeyer, Z. & Emanuel, E. J. Predicting the future: Big data, machine learning, and clinical medicine. N. Engl. J. Med. 375(13), 1216–1219 (2016).
Yang, F., Wang, H. Z., Mi, H., Lin, C. D. & Cai, W. W. Using random forest for reliable classification and cost-sensitive learning for medical diagnosis. BMC Bioinform 10(Suppl 1 (Suppl 1)), S22 (2009).
Hsieh, M. H., Hsieh, M. J. & Chen, C. M. An artificial neural network model for predicting successful extubation in intensive care units. J. Clin. Med. 7(9), 240 (2018).
Flechet, M. et al. Machine learning versus physicians’ prediction of acute kidney injury in critically ill adults: A prospective evaluation of the AKIpredictor. Crit. Care 23(1), 282 (2019).
Shillan, D., Sterne, J. A. C., Champneys, A. & Gibbison, B. Use of machine learning to analyse routinely collected intensive care unit data: A systematic review. Crit. care 23(1), 284 (2019).
Zhang, Z., Ho, K. M. & Hong, Y. Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit. Care (London, England) 23(1), 112 (2019).
Zhou, S., Zeng, Z., Wei, H., Sha, T. & An, S. Early combination of albumin with crystalloids administration might be beneficial for the survival of septic patients: A retrospective analysis from MIMIC-IV database. Ann. Intensive Care 11(1), 42 (2021).
Johnson, A. E. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).
Singer, M. et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 315(8), 801–810 (2016).
Andrassy, K. M. Comments on ‘KDIGO 2012 clinical practice guideline for the evaluation and management of chronic kidney disease’. Kidney Int. 84(3), 622–623 (2013).
Zhao, G. J. et al. Association between furosemide administration and outcomes in critically ill patients with acute kidney injury. Crit. Care (London, England) 24(1), 75 (2020).
Peters, E. et al. A worldwide multicentre evaluation of the influence of deterioration or improvement of acute kidney injury on clinical outcome in critically ill patients with and without sepsis at ICU admission: Results from The Intensive Care Over Nations audit. Crit. Care 22(1), 188 (2018).
Majdan, M., Brazinova, A., Rusnak, M. & Leitgeb, J. Outcome prediction after traumatic brain injury: Comparison of the performance of routinely used severity scores and multivariable prognostic models. J. Neurosci. Rural Pract. 8(1), 20–29 (2017).
Wu, J. et al. red cell distribution width to platelet ratio is associated with increasing in-hospital mortality in critically ill patients with acute kidney injury. Dis. Mark. 2022, 4802702 (2022).
Liu, J. et al. Predicting mortality of patients with acute kidney injury in the ICU using XGBoost model. PLoS ONE 16(2), e0246306 (2021).
Zhu, Y. et al. Machine learning prediction models for mechanically ventilated patients: Analyses of the MIMIC-III database. Front. Med. 8, 662340 (2021).
Barthel, P. et al. Respiratory rate predicts outcome after acute myocardial infarction: A prospective cohort study. Eur. Heart J. 34(22), 1644–1650 (2013).
Vaara, S. T., Pettilä, V., Reinikainen, M. & Kaukonen, K. M. Population-based incidence, mortality and quality of life in critically ill patients treated with renal replacement therapy: A nationwide retrospective cohort study in Finnish intensive care units. Crit. Care (London, England) 16(1), R13 (2012).
Kaukonen, K. M., Bailey, M., Suzuki, S., Pilcher, D. & Bellomo, R. Mortality related to severe sepsis and septic shock among critically ill patients in Australia and New Zealand, 2000–2012. JAMA 311(13), 1308–1316 (2014).
The authors have declared that no competing interests exist. This work was supported by grants from the Natural Science Foundation of Anhui Province(2008085MH244), Incubation Program of National Natural Science Foundation of China of The Second Hospital of Anhui Medical University(2020GMFY04), Clinical Research Incubation Program of The Second Hospital of Anhui Medical University(2020LCZD01) Co-construction project of clinical and preliminary disciplines of Anhui Medical University in 2020(2020lcxk02) and Co-construction project of clinical and preliminary disciplines of Anhui Medical University in 2021(2021lcxk032). No funding bodies had any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, X., Wu, R., Zhao, W. et al. Machine learning algorithm to predict mortality in critically ill patients with sepsis-associated acute kidney injury. Sci Rep 13, 5223 (2023). https://doi.org/10.1038/s41598-023-32160-z
This article is cited by
Construction and validation of prognostic models in critically Ill patients with sepsis-associated acute kidney injury: interpretable machine learning approach
Journal of Translational Medicine (2023)