Deep learning-based prediction of in-hospital mortality for sepsis

Yong, Li; Zhenzhou, Liu

doi:10.1038/s41598-023-49890-9

Article
Open access
Published: 03 January 2024

Li Yong¹ &
Liu Zhenzhou¹

Scientific Reports volume 14, Article number: 372 (2024)

Abstract

Sepsis is a serious blood infection characterized by a high mortality risk and many complications. Accurately assessing the mortality risk of patients with sepsis helps physicians in the intensive care unit (ICU) make optimal clinical decisions, which in turn can save patients' lives. However, most clinical models currently used to assess mortality risk in sepsis patients are based on conventional indicators, some of which have been shown to be unsuitable for accurate clinical diagnosis. Moreover, traditional evaluation models rely on only a small amount of each patient's personal data, which can lead to misdiagnosis. We use machine learning to refine the core indicators for sepsis mortality risk assessment from a large volume of clinical electronic medical records, and we propose DGFSD, a new deep learning-based mortality risk assessment model for sepsis patients. DGFSD learns not only the individual clinical information of patients to be assessed but also the structure of the similarity graph between diagnosed patients and patients to be assessed. Experiments show that DGFSD is more accurate than the baseline methods and can improve the efficiency of clinical auxiliary diagnosis.

Introduction

Sepsis is an acute systemic infection that occurs when pathogenic bacteria invade the circulation and produce toxins. As a serious blood infection, sepsis is characterized by a high mortality risk and many complications^1,2. According to the World Health Organization (WHO), about 11 million patients with sepsis worldwide were at risk of death in 2017. Severe sepsis can lead to multiple organ failure, with a mortality rate of 9.7%^3,4. In patients who develop septic shock, the mortality rate can exceed 40%⁵. In 2018, sepsis was responsible for 15% of all neonatal deaths worldwide. In addition, according to the WHO's Executive Board, sepsis imposes an economic burden of more than $24 billion per year, representing 6.2% of total hospital costs. Despite recent advances in management and treatment, the diagnosis and treatment of sepsis remain a focal area of global health research^6,7,8. Early identification of septic patients at high risk of death has been shown to improve patient outcomes^9,10,11.

However, current methods for assessing mortality risk in sepsis patients still face several challenges. First, the indicators used in widespread clinical scoring methods are based on the empirical findings of medical experts, and some of these conventional indicators have been shown to be unsuitable for accurate clinical diagnosis¹⁰. Second, because sepsis has a rapid onset, septic patients do not accumulate as many individual clinical records as patients with chronic illnesses, which makes it difficult to predict their risk of death. Finally, existing clinical scoring methods such as SOFA and SAPS II are less accurate than machine learning models, so they assess mortality risk in sepsis patients less effectively¹².

In recent years, many researchers have worked on mortality risk assessment for sepsis. Some studies used statistical methods to identify core indicators of mortality risk in patients with sepsis. Using a Cox regression model and subgroup analysis, Wang et al.¹³ identified the ratio of blood urea nitrogen to serum albumin as an important predictor of death in sepsis patients. Using multivariate analysis, Dias et al.¹⁴ found that afebrile patients with sepsis admitted to the ICU from the ward had higher mortality than febrile patients. Hu et al.¹⁵ used a Cox risk model and multifactorial regression analysis to show that, although albumin level is one of the indicators of disease severity in patients with sepsis, hypoproteinemia has no significant effect on their risk of death. Other studies assess mortality risk in septic patients by applying machine learning models to medical datasets^16,17. Hou et al.¹⁸ used the XGBoost model to predict 30-day mortality risk in septic patients in the ICU and demonstrated its clinically significant predictive value. Kong et al.¹⁰ compared the predictive performance of four machine learning methods with SAPS II and reported that the GBM, LASSO, and linear regression (LNR) models had excellent scalability, whereas the random forest (RF) model underestimated the risk of high-risk septic patients and SAPS II compared slightly unfavorably. Perng et al.¹⁹ showed through extensive experiments that a convolutional neural network (CNN) with a softmax classifier outperforms an autoencoder, principal component analysis (PCA), and machine learning methods such as K-nearest neighbor (KNN), support vector machine (SVM), and RF.

Although existing studies have produced useful results, they have drawbacks. Statistical methods can identify only a small number of core indicators for sepsis mortality risk assessment. In addition, although statistical methods can establish that certain indicators have a significant impact on the risk of death, they cannot rank several highly significant indicators by their impact on mortality and therefore cannot determine which indicators matter most. As for mortality risk assessment itself, machine learning models still focus only on the individual clinical records of septic patients, which are limited in number, and this restricts assessment accuracy.

In this paper, we extract core indicators for sepsis mortality risk assessment with machine learning and propose DGFSD, a new deep learning-based mortality risk assessment model for sepsis patients. First, we use the XGBoost machine learning model, together with the recommendations of clinical experts, to select the core indicators required for risk assessment from a large volume of internationally available electronic medical records (EMRs) of sepsis patients. Then, we construct a patient similarity graph that connects patients with similar indicators. After that, we build the DGFSD mortality risk prediction model from a deep neural network (DNN) and a graph convolutional network (GCN). DGFSD learns not only the individual clinical information of patients to be assessed but also the structure of the similarity graph between diagnosed patients and patients to be assessed. Finally, we run multiple experiments with DGFSD on MIMIC-III, an internationally recognized open medical dataset, and compare its performance with that of classical machine learning models. The experimental results show that DGFSD predicts the risk of death from sepsis more accurately than these models and can meet the criteria for clinical auxiliary diagnosis of sepsis.

Method

Dataset

MIMIC-III²⁰ is a large-scale public dataset jointly released by the Computational Physiology Laboratory of the Massachusetts Institute of Technology, the Beth Israel Deaconess Medical Center, and Philips Medical. It contains records of ICU patients treated at Beth Israel Deaconess Medical Center from 2001 to 2012 and covers many types of ICUs, such as the Surgical Care Unit, Medical Care Unit, and Trauma Surgical Care Unit. MIMIC-III contains patients' vital sign trend data and clinical data, divided by the relevance of the record content into four major categories: basic patient and transfer information, hospital outpatient information, ICU information, and auxiliary information. The four categories comprise 26 data tables, including the hospitalization table, discharge table, date-type schedule, medical staff table, and monitoring situation table. Researchers must pass a test to gain approval from the data custodian to use the dataset. We were approved to extract data from MIMIC-III for research purposes after passing the test through the CITI Program.

Clinical Characteristics

We selected patients according to the following criteria: (1) patients older than 18 years; (2) patients diagnosed with sepsis according to the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3); (3) each admission of a sepsis patient was analyzed as an independent sample.

A total of 9432 patients with sepsis were included in the study, of whom 1926 (about 20.4%) died. Following Hu et al.¹² and referring to established scoring tools such as SAPS II and APACHE III, we used the international normalized ratio (INR) and other indicators to construct the mortality risk prediction model. The indicators comprised both laboratory indicators and vital signs. Laboratory indicators included the maximum values of serum creatinine, anion gap, lactate, blood urea nitrogen (BUN), pH, white blood cell count, bicarbonate, ionized calcium, serum calcium, serum chloride, serum sodium, serum potassium, blood glucose, INR, prothrombin time (PT), partial thromboplastin time (PTT), alanine aminotransferase (ALT), alkaline phosphatase (ALP), aspartate aminotransferase (AST), total bilirubin, creatine kinase MB, and lactate dehydrogenase, as well as the minimum values of hematocrit and albumin. Vital signs included age; the mean values of heart rate, respiratory rate, and body temperature; and the minimum values of oxygen saturation and the Glasgow Coma Scale (GCS) score (Supplementary Table S1).
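The per-admission aggregation described above (maxima for most laboratory values, minima for a few indicators, means for several vital signs) can be sketched as follows. This is a minimal illustration, not the authors' extraction code: the indicator keys, the subset of indicators shown, and the example measurements are all hypothetical.

```python
from statistics import mean

# Hypothetical subset of indicators with the aggregation rule named in the text:
# maxima for most laboratory values, minima for albumin, hematocrit, oxygen
# saturation and GCS, and means for heart rate and similar vital signs.
AGGREGATORS = {
    "lactate": max,
    "creatinine": max,
    "albumin": min,
    "hematocrit": min,
    "heart_rate": mean,
    "spo2": min,
    "gcs": min,
}


def summarize_admission(measurements):
    """Collapse the repeated measurements of one admission into one value per indicator.

    `measurements` maps an indicator name to the list of values recorded during
    the stay. Indicators without any recorded value are reported as None.
    """
    summary = {}
    for name, aggregate in AGGREGATORS.items():
        values = measurements.get(name, [])
        summary[name] = aggregate(values) if values else None
    return summary


# Illustrative values only; they are not taken from MIMIC-III.
example = summarize_admission({
    "lactate": [1.8, 4.2, 2.9],
    "albumin": [3.1, 2.6],
    "heart_rate": [88, 102, 95],
})
```

Here `example["lactate"]` is the maximum of the three readings, `example["albumin"]` is the minimum, and indicators with no readings (such as `creatinine`) stay `None` so that the missing-value handling can deal with them later.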

To generate a usable dataset, we excluded indicators with more than 30% missing values (Supplementary Figure S1) and divided the data into training and test sets in a 7:3 ratio. We then imputed the remaining missing values using the reference values of the indicators²¹.
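The preprocessing steps in this paragraph can be sketched in plain Python as below. The function names, the use of `None` for a missing value, the random shuffle before the split, and the order of the steps are assumptions made for illustration; the paper does not specify them.

```python
import random


def drop_sparse_indicators(rows, max_missing=0.30):
    """Keep only indicators whose share of missing values (None) is at most max_missing."""
    names = list(rows[0].keys())
    kept = [
        name for name in names
        if sum(row[name] is None for row in rows) / len(rows) <= max_missing
    ]
    return [{name: row[name] for name in kept} for row in rows]


def impute_with_reference(rows, reference):
    """Replace each missing value with the reference value of its indicator."""
    return [
        {name: reference[name] if value is None else value
         for name, value in row.items()}
        for row in rows
    ]


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and test sets (7:3 by default)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

A typical call order under these assumptions would be `drop_sparse_indicators`, then `impute_with_reference` with a dictionary of clinical reference values, then `split_train_test`.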

Baseline

The goal of mortality risk assessment in septic patients is to categorize patient outcomes accurately, so it is essentially a binary classification task. We therefore choose accuracy (ACC) as the evaluation metric for model performance. Because the dataset is imbalanced, we use the SMOTE²² algorithm to oversample it so that accuracy reflects the actual utility of the models. ACC is defined as follows:

$$ACC=\frac{{S}_{r}+{D}_{r}}{{S}_{r}+{D}_{r}+{S}_{f}+{D}_{f}}$$

(1)

where ${S}_{r}$ denotes the number of surviving sepsis patients whom the model judged to be alive, and ${D}_{r}$ denotes the number of deceased sepsis patients whom the model judged to be dead. ${S}_{f}$ denotes the number of surviving patients whom the model judged to be dead, and ${D}_{f}$ denotes the number of deceased patients whom the model judged to be alive.
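As a small worked illustration of Eq. (1), the four counts can be tallied from true and predicted labels and combined into ACC. The label encoding (1 for in-hospital death, 0 for survival) and the function names are assumptions made for this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Count the four outcomes of Eq. (1), with 1 = died in hospital and 0 = survived."""
    pairs = list(zip(y_true, y_pred))
    s_r = sum(t == 0 and p == 0 for t, p in pairs)  # survivors judged alive
    d_r = sum(t == 1 and p == 1 for t, p in pairs)  # deaths judged dead
    s_f = sum(t == 0 and p == 1 for t, p in pairs)  # survivors judged dead
    d_f = sum(t == 1 and p == 0 for t, p in pairs)  # deaths judged alive
    return s_r, d_r, s_f, d_f


def accuracy(s_r, d_r, s_f, d_f):
    """Eq. (1): share of patients whose outcome was classified correctly."""
    total = s_r + d_r + s_f + d_f
    if total == 0:
        raise ValueError("at least one patient is required")
    return (s_r + d_r) / total


# Five patients: three are classified correctly, so ACC = 3 / 5.
acc = accuracy(*confusion_counts([0, 0, 1, 1, 0], [0, 1, 1, 0, 0]))
```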

To verify the effectiveness of the DGFSD model, we compared the DGFSD model with the following baseline models:

Decision tree classification (DT): DT uses a tree structure and hierarchical reasoning to reach the final classification. A decision tree consists of a root node, internal nodes, and leaf nodes. Here the root node is the full sample of septic patients, each internal node is a feature attribute of the patients, and each leaf node is the final decision for a patient. For prediction, the tree tests feature values at each node and follows the corresponding branch until it reaches a leaf node, which gives the classification result.
KNN: KNN predicts the output for a new data point by finding the K most similar instances in the dataset and summarizing their outputs. We used KNN to predict the in-hospital outcome of a septic patient from the records of similar patients.
Logistic regression (LR): LR is mainly used for binary classification. It takes the features of a sepsis patient as input and computes the probability of the patient's outcome. Notably, LR has been reported to outperform clinical scoring methods¹².

Moreover, we perform ablation experiments by eliminating some of the modules in the DGFSD model, and the comparison methods involved include:

DGFSD-D-LR: The structure of the similarity graph between patients is not considered; the model learns only the individual clinical information of the patients.
DGFSD-G: The individual clinical information of the septic patients is not considered; the model learns only the structure of the similarity graph between patients.

Problem definition

The clinical records of patients with sepsis can be represented as $S\in {R}^{n\times d}$, where $S=\{{S}_{1},{S}_{2},{S}_{3},\dots ,{S}_{n}\}$, ${S}_{i}$ denotes the $i$th sample, $n$ is the number of samples, and $d$ is the number of physical indicators (albumin, ALP, ALT, ..., age). To compensate for the limited individual clinical data of septic patients, we use a similarity calculation to identify, for each patient, the patients with similar indicators, and we record these relations in the matrix $A\in {R}^{n\times n}$.

The problem of assessing the risk of death in septic patients can then be formalized as follows: given the individual clinical data $S$ of septic patients and the information on similar patients $A$, the objective is to compute the probability of death ${P}_{h}$ with the DGFSD model, ${P}_{h}=DGFSD\{S,A\}$.

Model description

To assess the risk of mortality in patients with sepsis more accurately, we first construct a patient similarity graph from the original sepsis patient records. We then feed the patients' data into an autoencoder and the patient similarity graph into a GCN. The DGFSD model connects each layer of the autoencoder to the corresponding GCN layer, so that the representation learned by the autoencoder is integrated into the structure-aware representation of the GCN, and the GCN finally outputs the prediction (Fig. 1).

Patients similarity graph

For each septic patient, we locate the top-k similar patients and set up edges to connect them. The formula for calculating the similarity between patients $i$ and $j$ can be described as:

$${X}_{i,j}={e}^{-\frac{{\left\|{S}_{i}-{S}_{j}\right\|}^{2}}{2}}$$

(2)

After computing the similarity matrix $X$, we select the top-k most similar patients for each patient and construct an undirected patient similarity graph. In this way, the adjacency matrix $A$ is obtained from the non-graph data of the septic patients.
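The graph construction can be sketched as follows, using Eq. (2) for the pairwise similarity and a top-k rule for the edges. Several details are assumptions, since the text does not state them: edges are binary, the graph is made undirected by keeping an edge if either endpoint selects the other, self-loops are excluded at this stage, and the indicator vectors are taken to be on comparable scales.

```python
import math


def similarity(s_i, s_j):
    """Eq. (2): Gaussian similarity between two patient indicator vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(s_i, s_j))
    return math.exp(-squared_distance / 2.0)


def build_adjacency(S, k):
    """Connect every patient to its k most similar patients (undirected, no self-loops)."""
    n = len(S)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: similarity(S[i], S[j]), reverse=True)
        for j in others[:k]:
            A[i][j] = 1
            A[j][i] = 1  # symmetrize so that the graph is undirected
    return A


# Four toy patients with one indicator each: two tight pairs far from each other.
toy_adjacency = build_adjacency([[0.0], [0.1], [5.0], [5.2]], k=1)
```

With `k=1`, each toy patient is linked only to its nearest neighbour, so the result contains exactly the two edges 0-1 and 2-3.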

DNN module

Variants of the basic autoencoder include masked autoencoders, convolutional autoencoders, LSTM encoder-decoders, adversarial autoencoders, and deep autoencoders^{23,24,25,26,27}. We opted for the basic autoencoder to learn the clinical data representation of septic patients. Suppose the autoencoder has $L$ layers and $l$ denotes the layer index. Then the clinical data representation ${H}^{(l)}$ learned by the $l$th encoder layer is:

$${H}^{(l)}=\varnothing [{W}_{e}^{(l)}{H}^{\left(l-1\right)}+{b}_{e}^{\left(l\right)}]$$

(3)

where $\varnothing$ is the activation function of the fully connected layer, and ${W}_{e}^{(l)}$ and ${b}_{e}^{(l)}$ denote the weight matrix and bias of the $l$th encoder layer, respectively. The input to the first encoder layer, ${H}^{(0)}$, is the clinical records $S$ of the septic patients.

The decoder reconstructs the input data as follows:

$${H}^{(l)}=\varnothing [{W}_{d}^{\left(l\right)}{H}^{\left(l-1\right)}+{b}_{d}^{(l)}]$$

(4)

where ${W}_{d}^{(l)}$ and ${b}_{d}^{(l)}$ are the weight matrix and bias of the $l$th decoder layer, respectively.
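A forward pass through the encoder and decoder layers of Eqs. (3) and (4) can be sketched for a single patient vector as below. The choice of ReLU as the activation, the layer sizes, and the toy weights are assumptions for illustration; training (minimizing the reconstruction error) is not shown.

```python
def relu(x):
    """Rectified linear unit, used here as an assumed activation function."""
    return x if x > 0.0 else 0.0


def dense(W, b, h, activation=relu):
    """One fully connected layer, activation(W h + b), for a single patient vector h."""
    return [
        activation(sum(w * x for w, x in zip(row, h)) + bias)
        for row, bias in zip(W, b)
    ]


def autoencoder_forward(s, encoder, decoder):
    """Run a patient vector through the encoder layers and then the decoder layers.

    `encoder` and `decoder` are lists of (W, b) pairs. Returns the list of
    encoder representations H^(1), ..., H^(L) and the reconstruction of s.
    """
    h = s
    hidden = []
    for W, b in encoder:
        h = dense(W, b, h)
        hidden.append(h)
    for W, b in decoder:
        h = dense(W, b, h)
    return hidden, h


# Toy weights: a 2 -> 1 encoder layer and a 1 -> 2 decoder layer.
toy_encoder = [([[1.0, 1.0]], [0.0])]
toy_decoder = [([[0.5], [0.5]], [0.0, 0.0])]
hidden, reconstruction = autoencoder_forward([2.0, 4.0], toy_encoder, toy_decoder)
```

The encoder representations collected in `hidden` are the quantities ${H}^{(l)}$ that the GCN module later combines with its own representations.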

GCN module

We enable the GCN to learn both kinds of information by integrating the clinical data representation learned by the DNN into the GCN. The representation ${G}^{(l)}$, learned by the GCN module at $l$ th layer, can be described as follows:

$${G}^{(l)}=\varnothing [{\widetilde{D}}^{-\frac{1}{2}}\widetilde{A}{\widetilde{D}}^{-\frac{1}{2}}{G}^{(l-1)}{W}^{(l-1)}]$$

(5)

$$\widetilde{A}=A+I$$

(6)

$${\widetilde{D}}_{ii}={\sum }_{j}{\widetilde{A}}_{ij}$$

(7)

where $W$ is the weight matrix, $I$ is the identity matrix that adds self-loops to the adjacency matrix $A$, and $\widetilde{D}$ is the degree matrix of $\widetilde{A}$. To combine the individual information of septic patients learned by the autoencoder with the GCN representation, we merge ${H}^{(l-1)}$ with ${G}^{(l-1)}$ as follows:

$${\widetilde{G}}^{(l-1)}=\left(1-\epsilon \right){G}^{\left(l-1\right)}+\epsilon {H}^{(l-1)}$$

(8)

where ϵ is a balance coefficient. To reduce the hyperparameter search in DGFSD, ϵ is set to 0.5, which makes the representations of the GCN module and the DNN module equally important²⁸. We combine the autoencoder and the GCN layer by layer through Eq. (8) and use ${\widetilde{G}}^{(l-1)}$ as the input to the $l$th GCN layer. The new data representation is then:

$${G}^{(l)}=\varnothing [{\widetilde{D}}^{-\frac{1}{2}}\widetilde{A}{\widetilde{D}}^{-\frac{1}{2}}{\widetilde{G}}^{(l-1)}{W}^{(l-1)}]$$

(9)

From Eq. (9), it can be seen that the individual clinical information of septic patients learned by the autoencoder ${H}^{\left(l-1\right)}$ will be propagated in the GCN through the normalized adjacency matrix ${\widetilde{D}}^{-\frac{1}{2}}\widetilde{A}{\widetilde{D}}^{-\frac{1}{2}}$.
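The propagation rule in Eqs. (5) to (9) can be sketched with plain Python lists, which keeps the example dependency-free; a practical implementation would use a numerical library. The ReLU activation, the helper names, and the toy matrices are assumptions for illustration.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def normalized_adjacency(A):
    """Eqs. (6)-(7): add self-loops, then scale by D^{-1/2} on both sides."""
    n = len(A)
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in A_tilde]
    return [
        [A_tilde[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def fuse(G, H, epsilon=0.5):
    """Eq. (8): weighted mix of the GCN and autoencoder representations."""
    return [
        [(1 - epsilon) * g + epsilon * h for g, h in zip(g_row, h_row)]
        for g_row, h_row in zip(G, H)
    ]


def gcn_layer(A_hat, G_prev, H_prev, W, epsilon=0.5):
    """Eq. (9): propagate the fused representation over the graph, then apply ReLU."""
    mixed = fuse(G_prev, H_prev, epsilon)
    out = matmul(matmul(A_hat, mixed), W)
    return [[max(0.0, v) for v in row] for row in out]


# Two connected patients with one-dimensional representations.
A_hat = normalized_adjacency([[0, 1], [1, 0]])
G_next = gcn_layer(A_hat, [[1.0], [3.0]], [[3.0], [1.0]], [[1.0, -1.0]])
```

In this toy case the fused representation is 2.0 for both patients, the normalized adjacency averages over the two connected nodes, and ReLU zeroes the negative output column.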

The first GCN layer takes the individual data $S$ of the septic patients as input and is computed as follows:

$${G}^{(1)}=\varnothing [{\widetilde{D}}^{-\frac{1}{2}}\widetilde{A}{\widetilde{D}}^{-\frac{1}{2}}S{W}^{\left(1\right)}]$$

(10)

The final layer of the GCN module is a binary classification layer with a ReLU activation, and its output is:

$$G=Relu[{\widetilde{D}}^{-\frac{1}{2}}\widetilde{A}{\widetilde{D}}^{-\frac{1}{2}}{Z}^{(l)}{W}^{(l)}]$$

(11)

Here ${Z}^{(l)}$ denotes the representation fed into the final layer. The entry ${g}_{i,j}\in G$ is the score assigned to outcome $j$ for septic patient $i$, and each row of $G$ is treated as a probability distribution over the outcomes.

Results

We evaluate the DGFSD model based on records from a large number of septic patients in MIMIC-III and compare the results of the DGFSD model with baseline models. Through extensive experimental comparison and analysis, we have obtained the following three main conclusions.

RC1. Our experiments indicate that serum sodium, serum potassium, and BUN are not core indicators for assessing mortality risk in patients with sepsis.
RC2. DGFSD outperforms the baseline models, which indicates that it can be applied effectively in clinical auxiliary diagnosis.
RC3. DGFSD learns not only the individual clinical information of sepsis patients to be assessed but also the similarity graph structure between diagnosed patients and patients to be assessed, which improves the accuracy of the model.

Core indicators analysis (RC1)

The baseline indicators we used are shown in Table 1. We first balance the dataset with the SMOTE algorithm and then use XGBoost to rank the baseline indicators by importance (Fig. 2). Taking the recommendations of clinical experts into account, we select the 12 highest-ranked indicators as the core indicators for assessing the risk of death in patients with sepsis.
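The core idea of SMOTE oversampling, placing synthetic minority samples on the line segment between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a simplified illustration rather than the reference algorithm or the authors' code: the function names, the default `k`, and the toy feature vectors are assumptions, and a maintained library implementation would normally be used in practice.

```python
import random


def smote_sample(x, neighbor, rng):
    """Create one synthetic sample on the segment between x and a neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


def oversample_minority(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from minority-class feature vectors.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # The k nearest minority neighbours of x by squared Euclidean distance.
        others = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )
        neighbor = rng.choice(others[:k])
        synthetic.append(smote_sample(x, neighbor, rng))
    return synthetic


# Toy minority class with two features per patient (illustrative values only).
toy_minority = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
new_samples = oversample_minority(toy_minority, n_new=4, k=2)
```

Because every synthetic point is an interpolation between two existing minority samples, all coordinates of `new_samples` stay within the range spanned by `toy_minority`.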

Table 1 Baseline indicators.

The core indicators for assessing the risk of death in patients with sepsis are shown in Table 2. Surviving patients have higher albumin, bicarbonate, and pH. In contrast, ALP, total bilirubin, serum chloride, creatinine, lactate, serum calcium, white blood cell count, blood glucose, and age are higher in deceased patients.

Table 2 Core indicators.

Zhang et al.²⁹ report that serum sodium is strongly related to in-hospital mortality in patients with sepsis; SAPS II treats serum potassium, serum sodium, and BUN as core indicators of the risk of death; and Hu et al.¹² consider BUN to be a core indicator. However, Fig. 2 shows that serum sodium, serum potassium, and BUN are less important than previous research suggests. Therefore, we do not treat these three indicators as core indicators for mortality risk assessment in septic patients.

Model comparison (RC2)

The results of the comparison between the DGFSD model and the baseline models are shown in Fig. 3. DGFSD outperforms the baselines: as a deep learning model, it achieves an accuracy of 82.78%, compared with 78.80% for LR, 75.78% for DT, and 76.07% for KNN. This suggests that the DGFSD model can support accurate clinical diagnosis.

Ablation study (RC3)

The results of the ablation experiments are shown in Fig. 4. The full DGFSD model, which learns both the individual clinical information of patients to be assessed and the structure of the similarity graph between diagnosed patients and patients to be assessed, performs best. DGFSD-G learns only the similarity graph structure between septic patients and is less accurate than the full model. DGFSD-D-LR learns only the individual clinical information of sepsis patients and performs worst.

The ablation experiment shows that the multi-representation learning in DGFSD does improve the model's performance in assessing the risk of death in sepsis patients.

Conclusions and discussions

We refine the core indicators for sepsis mortality risk assessment so that they are more relevant to accurate clinical diagnosis. We also incorporate graph neural networks into the task of mortality risk assessment in septic patients and propose DGFSD, a deep learning-based mortality risk assessment model.

Specifically, we use the XGBoost model to rank indicators by their importance for mortality risk assessment in septic patients and then, taking the recommendations of clinicians into account, select the core indicators. We construct a patient similarity graph and combine two deep learning modules, a DNN and a GCN, to build the DGFSD model. DGFSD learns not only the individual clinical information of patients to be assessed but also the structure of the similarity graph between diagnosed patients and patients to be assessed. Experiments show that DGFSD is more accurate than the baseline methods and can improve the efficiency of clinical auxiliary diagnosis.

Compared with existing studies, our study has two strengths. First, using the XGBoost machine learning model together with the recommendations of clinical professionals, we identify core indicators for assessing the risk of death from sepsis that are more consistent with clinical practice. Second, we improve prediction accuracy by constructing the DGFSD model, which learns both the individual clinical information of patients to be assessed and the structure of the similarity graph between diagnosed patients and patients to be assessed.

However, our study still has limitations. First, MIMIC-III contains only EMRs of patients in the United States, so the validity of the DGFSD model for patients in other countries needs further investigation. Second, the effectiveness of the core indicators we refined for mortality risk assessment in septic patients needs to be tested empirically in clinical settings. Third, many unmeasured confounding factors, such as treatment strategies, may affect the mortality of sepsis patients. Finally, DGFSD is a black-box model, and its interpretability requires further research.

Future research will extend the DGFSD model to heterogeneous information learning, enhance its interpretability, and conduct clinical validation.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

References

1. Cecconi, M., Evans, L., Levy, M. & Rhodes, A. Sepsis and septic shock. Lancet 392(10141), 75–87 (2018).
2. Singer, M. et al. The Third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 315(8), 801–810 (2016).
3. Perman, S. M., Goyal, M. & Gaieski, D. F. Initial emergency department diagnosis and management of adult patients with severe sepsis and septic shock. Scand. J. Trauma Resuscit. Emerg. Med. 20, 41 (2012).
4. Rudd, K. E. et al. Global, regional, and national sepsis incidence and mortality, 1990–2017: Analysis for the Global Burden of Disease Study. Lancet 395(10219), 200–211 (2020).
5. Rangel-Frausto, M. S. et al. The natural history of the systemic inflammatory response syndrome (SIRS). A prospective study. JAMA 273(2), 117–123 (1995).
6. Reinhart, K. et al. Recognizing sepsis as a Global Health priority - A WHO resolution. N. Engl. J. Med. 377(5), 414–417 (2017).
7. Jarczak, D., Kluge, S. & Nierhaus, A. Sepsis-pathophysiology and therapeutic concepts. Front. Med. 8, 628302 (2021).
8. Wang, D. et al. A machine learning model for accurate prediction of sepsis in ICU patients. Front. Public Health 9, 754348 (2021).
9. Andaluz-Ojeda, D. et al. Early natural killer cell counts in blood predict mortality in severe sepsis. Crit. Care 15(5), R243 (2011).
10. Kong, G., Lin, K. & Hu, Y. Using machine learning methods to predict in-hospital mortality of sepsis patients in the ICU. BMC Med. Inform. Decis. Mak. 20(1), 251 (2020).
11. Bao, C., Deng, F. & Zhao, S. Machine-learning models for prediction of sepsis patients mortality. Med. Int. 47(6), 315–325 (2023).
12. Hu, C. et al. Interpretable machine learning for early prediction of prognosis in sepsis: A discovery and validation study. Infect. Dis. Therapy 11(3), 1117–1132 (2022).
13. Wang, Y. et al. Prognostic impact of blood urea nitrogen to albumin ratio on patients with sepsis: A retrospective cohort study. Sci. Rep. 13(1), 10013 (2023).
14. Dias, A. et al. Fever is associated with earlier antibiotic onset and reduced mortality in patients with sepsis admitted to the ICU. Sci. Rep. 11(1), 23949 (2021).
15. Hu, J., Lv, C., Hu, X. & Liu, J. Effect of hypoproteinemia on the mortality of sepsis patients in the ICU: A retrospective cohort study. Sci. Rep. 11(1), 24379 (2021).
16. Yao, R. et al. A machine learning-based prediction of hospital mortality in patients with postoperative sepsis. Front. Med. 7, 445 (2020).
17. Van Doorn, W. P. T. M. et al. A comparison of machine learning models versus clinical evaluation for mortality prediction in patients with sepsis. PloS One 16(1), e0245157 (2021).
18. Hou, N. et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: A machine learning approach using XGboost. J. Transl. Med. 18(1), 462 (2020).
19. Perng, J. W. et al. Mortality prediction of septic patients in the emergency department based on machine learning. J. Clin. Med. 8(11), 1906 (2019).
20. Johnson, A. et al. MIMIC-III, a freely accessible critical care database. Sci Data 3, 160035 (2016).
21. Silva, D. B., Schmidt, D., Costa, C. A., Righi, R. D. & Eskofier, B. DeepSigns: A predictive model based on deep learning for the early detection of patient health deterioration. Expert Syst. 165, 113905 (2021).
22. Chawla, N. V. et al. Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002).
23. He, K. M. et al. Masked Autoencoders Are Scalable Vision Learners. Preprint at https://arxiv.org/abs/2111.06377 (2021).
24. Antonopoulos, I. et al. Artificial intelligence and machine learning approaches to energy demand-side response: A systematic review. Renew. Sustain. Energy Rev. 130, 109899 (2020).
25. Pang, G. S., Shen, C. H., Cao, L. B. & Den Hengel, A. V. Deep learning for anomaly detection: A review. ACM Comput. Surv. 54(2), 38 (2021).
26. Ruff, L. et al. A unifying review of deep and shallow anomaly detection. Proc. IEEE 109(5), 756–795 (2021).
27. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
28. Bo, D. et al. Structural Deep Clustering Network. in Proceedings of The Web Conference 2020, 1400–1410 (2020).
29. Zhang, K. et al. STAPLAg: A convenient early warning score for use in infected patients in the intensive care unit. Medicine 99(22), e20274 (2020).

Acknowledgements

We acknowledge the financial support for this work from the Natural Science Foundation of China (No. 72364033); the Gansu Provincial Science and Technology Plan Project (No. 23JRZA397); and the Northwest Normal University Major Research Project Incubation Program, China (No. NWNU-LKZD2021-06).

Author information

Authors and Affiliations

College of Computer Science and Engineering, Northwest Normal University, Lanzhou, 730070, People’s Republic of China
Li Yong & Liu Zhenzhou

Contributions

L.Y. and Z.L. are co-first authors who jointly designed the research; Z.L. performed experiments and wrote the manuscript. L.Y. supervised the project and revised the manuscript.

Corresponding author

Correspondence to Liu Zhenzhou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yong, L., Zhenzhou, L. Deep learning-based prediction of in-hospital mortality for sepsis. Sci Rep 14, 372 (2024). https://doi.org/10.1038/s41598-023-49890-9

Received: 27 September 2023
Accepted: 13 December 2023
Published: 03 January 2024
DOI: https://doi.org/10.1038/s41598-023-49890-9
