A case-based reasoning system for neonatal survival and LOS prediction in neonatal intensive care units: a development and validation study

Kermani, Farzaneh; Zarkesh, Mohammad Reza; Vaziri, Mostafa; Sheikhtaheri, Abbas

doi:10.1038/s41598-023-35333-y

Published: 24 May 2023

Scientific Reports volume 13, Article number: 8421 (2023)

Abstract

Early prediction of neonates' survival and Length of Stay (LOS) in Neonatal Intensive Care Units (NICU) is effective in decision-making. We developed an intelligent system to predict neonatal survival and LOS using the "Case-Based Reasoning" (CBR) method. We developed a web-based CBR system based on K-Nearest Neighborhood (KNN) on 1682 neonates, with 17 variables for mortality and 13 variables for LOS, and evaluated the system with 336 retrospectively collected records. We implemented the system in a NICU to validate it externally and to evaluate the acceptability of its predictions and its usability. Our internal validation on the balanced case base showed high accuracy (97.02%) and F-score (0.984) for survival prediction. The Root Mean Square Error (RMSE) for LOS was 4.78 days. External validation on the balanced case base indicated high accuracy (98.91%) and F-score (0.993) for survival prediction. The RMSE for LOS was 3.27 days. Usability evaluation showed that more than half of the issues identified were related to appearance and rated as a low priority to be fixed. Acceptability assessment showed high acceptance of, and confidence in, the system responses. The usability score (80.71) indicated high system usability for neonatologists. This system is available at http://neonatalcdss.ir/. The positive results of our system in terms of performance, acceptability, and usability indicate that it can be used to improve neonatal care.

Introduction

Neonatal mortality is considered one of the main health indicators. Mortality within the first month of life is 18 cases per 1000 live births. In 2019, 2.4 million newborns died within the first month of life, accounting for 47% of all deaths among children under five¹. A systematic review indicates that more than half of neonatal deaths occur within the first three days of life, and deaths within the first day account for about two-thirds of this share². Many of these deaths can be prevented through high-quality care and specialized post-delivery care for mothers and newborns, as well as care for sick or premature newborns¹. Thus, neonatal mortality is considered very important and is used as an indicator for comparing the quality of care provided in Neonatal Intensive Care Units (NICUs)³.

Premature or sick newborns receive specialized care in NICUs, and the duration of that care is another important challenge. The Length of Stay (LOS) of these newborns is six times that of normal newborns, and in most cases they require extensive and advanced care^4,5. A longer LOS in NICUs increases the burden on healthcare services⁴, the level of morbidity⁶, and hospitalization costs⁷.

Accurate estimation of the risk of in-hospital mortality of newborns is necessary for managing healthcare quality and using resources rationally⁸. Further, precise estimation of the number of days requiring intensive care, and thus the costs of hospitalization, is important to inform parents, healthcare specialists, insurance companies, and governmental organizations for planning and policymaking. Indeed, the physician taking care of the newborn should be able to make decisions within a short time after childbirth^5,9. Nevertheless, decision-making on this issue is very complex, because various diagnostic and clinical factors complicate physicians' prediction of neonatal mortality and LOS in NICUs. The physician should consider many factors simultaneously, including the newborn's status at birth, such as Birth Weight (BW), Gestational Age (GA), Apgar score, and type of delivery, as well as the maternal health condition at the time of delivery, and so on^4,10,11. Moreover, regarding the use of resources in NICUs, information about neonates who died should be combined with information about those eventually discharged alive, because considering only mortality or only LOS does not give a general view of the care provided in NICUs¹². Nevertheless, many studies have considered only one of these challenges^7,13.

Clinical decision-making requires combining observations, and the experience of physicians with new diagnostic and specialized methods, the available clinical guidelines, and novel therapeutic strategies¹⁴. This can be achieved through Artificial Intelligence (AI) methods such as Case-Based Reasoning (CBR). This method has high flexibility and offers suitable performance in solving new problems based on the solutions for similar problems, and more importantly, these systems are well accepted by physicians¹⁵.

Many studies have been performed to predict neonatal mortality (or survival) or LOS using statistical or AI methods; for example, predicting neonates' LOS in NICUs using multivariate regression models¹⁶, a fuzzy expert system for predicting the risk of neonatal death¹³, K-Nearest Neighborhood (KNN), Random Forest (RF), and Bayesian network to classify the causes of fetal death¹⁷, and decision tree (C5.0) and Artificial Neural Network (ANN) to predict neonatal mortality and preterm births in twin pregnancies¹⁸. However, studies on the use of CBR to predict both neonatal survival and LOS are rare. Therefore, this study was conducted to develop and evaluate a web-based CBR system that simultaneously predicts neonatal survival and LOS in NICUs.

Related work

Case-based reasoning

CBR is a problem-solving approach, which can employ specialized knowledge of previously-solved problems (cases). The new problem is solved by finding similar cases in the past and reusing their solutions to solve a new situation^19,20.

The cycle of CBR is summarized in four steps:

1. Retrieval: the aim of “retrieval” is to find the case that is most similar (most suitable) to the new problem²⁰. Retrieval applies a similarity index between the new case and all available cases in the case library. The resulting set, called the retrieved cases, is ranked by the similarity index^21,22.
2. Reuse: at this step, the case or cases at the top of the list from the previous step are reused and their solutions are adapted²¹. If the new problem exactly matches the retrieved case, the retrieved solution is regarded as the most successful solution^20,23. Otherwise, the solution must be adapted to the new problem, which can be done manually or automatically²⁰. The outcome of the reuse step is a solution for the new case, which is called the "solved case"²¹.
3. Revise: this step involves testing the solution in a real setting or having it assessed by a supervisor, a specialist, or a modeler/simulator. At the end of this step, the solved cases are considered tested or revised cases, since the system should remember only valid cases with a proper solution²⁴.
4. Retain: the important feature of CBR is its learning. When a problem is solved successfully, its experience is retained in the system to solve similar problems in the future²⁰; when the "revise" step creates a new case, the case base is updated with the new (learned) case to solve future problems^20,21.
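The four steps above can be sketched as a small loop. This is only an illustration of the general CBR cycle, not the implementation of any cited system: the `Case` type, the similarity function, and all function names are hypothetical, and the reuse step simply copies the best match without adaptation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Case:
    problem: List[float]            # feature vector describing the problem
    solution: Optional[str] = None  # known outcome, or None for a new case


def similarity(a: List[float], b: List[float]) -> float:
    # Hypothetical similarity index: 1 / (1 + Euclidean distance).
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)


def retrieve(case_base: List[Case], new: Case) -> List[Case]:
    # Retrieval: rank all stored cases by similarity to the new problem.
    return sorted(case_base, key=lambda c: similarity(c.problem, new.problem), reverse=True)


def reuse(ranked: List[Case], new: Case) -> Case:
    # Reuse: copy the solution of the top-ranked case (no adaptation here).
    return Case(problem=new.problem, solution=ranked[0].solution)


def revise(solved: Case, expert: Callable[[Case], str]) -> Case:
    # Revise: an expert (or the observed outcome) confirms or corrects the solution.
    return Case(problem=solved.problem, solution=expert(solved))


def retain(case_base: List[Case], revised: Case) -> None:
    # Retain: store the validated case so it can help with future problems.
    case_base.append(revised)


case_base = [Case([0.1, 0.2], "A"), Case([0.9, 0.8], "B")]
new_case = Case([0.85, 0.75])
solved = reuse(retrieve(case_base, new_case), new_case)
revised = revise(solved, expert=lambda c: c.solution)  # expert accepts the proposal
retain(case_base, revised)
```

Keeping each step as a separate function mirrors the cycle: only the `revise` step needs a human or a real-world outcome, so it is the natural place to plug in an expert.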

Previous research

A nationwide cohort study in the Netherlands used a multiple logistic regression model to estimate the risk of neonatal mortality within 28 days after birth and reported an Area Under Curve (AUC) of 0.83²⁵. Cooper et al.²⁶ developed (6499 cases) and validated (3552 cases) a superlearning algorithm to predict 30-day neonatal postoperative mortality. The superlearning algorithm (14 machine learning and regression algorithms) was applied to demographic and preoperative clinical data. According to the results, the superlearning algorithm outperformed all individual algorithms with regard to AUC, which was 0.91 for the development and 0.87 for the validation.

In another study, researchers applied KNN, RF, and Bayesian network algorithms to classify the causes of fetal death using 49 features. The results showed that KNN and RF had the best performance (accuracy of 81.38% and 81.84%, respectively)¹⁷. Performance evaluation of a fuzzy expert system designed to predict neonatal mortality risk showed an accuracy (90%), sensitivity (83%), and specificity (97%)¹³.

Another study predicted the mortality and LOS for neonatal admissions to a private hospital NICU in Southern Africa using a logistic regression model. The proposed model for predicting neonatal mortality had a good fit (AUC: 0.85 and accuracy: 86.4%), but the low positive predictive value of this model reduced its performance. Furthermore, the Poisson log-linear model had a good fit (R² = 0.70) for predicting LOS⁹. Coimbra et al.²⁷ introduced a decision support system to predict the LOS for preterm infants using 284 cases by the CBR approach. The proposed model improved the "retrieval step" using optimization and a logic programming method, reduced the computational time by about 21.3%, and achieved an accuracy of 84.9%.

Rodriguez et al.²⁸ proposed a prediction model for pediatric mortality risk by combining CBR, fuzzy set theory, and ANN models. After problem-solving by using the fuzzy ANN model, the CBR made justification of the solutions and stored experiences in the case base by using the KNN method (K = 3). The model was developed with 1079 cases and 33 features. Eventually, 99 cases were evaluated by seven pediatricians, which resulted in 89.89% classification accuracy.

Adawiyah et al.²⁹ suggested a prediction system to detect preterm births based on CBR. For system development, 18 variables were selected, and local and global similarity was determined based on the Minkowski distance. Twenty cases were used to evaluate the system; in 18 of them, the system responses matched the recorded results. Hence, the system was able to detect premature infants with 90% accuracy.

Our contribution

Generally, a variety of studies have dealt with providing a decision support system in the neonatal care domain by benefiting from artificial intelligence methods^13,17,18. Studies on mortality, as well as LOS, have mostly focused on statistical methods and providing models^{3,9,16,26,30,31}. On the other hand, we proposed a system to predict the survival and LOS of newborns in NICUs, which can be updated by adding new data (new cases) to its case base. Then, based on all data (whether at the model development or its usage) it can perform the prediction. Furthermore, the system design in this study deals with survival and LOS simultaneously. In other words, considering the importance of simultaneous prediction of survival and LOS for physicians and families of newborns, this system deals with predicting both outcomes concurrently, while previous studies have dealt with only one of these issues alone^{3,13,16,26,27,31}.

Meanwhile, most previous studies have only focused on the evaluation using retrospective data without any implementation in a real setting or external validation and consideration of system acceptability and usability^{3,9,13,16,17,18,26,27,31,32,33}. On the other hand, in this study in addition to the evaluation of the system with retrospective data, the system was implemented in a hospital setting and its external validation was evaluated prospectively. In addition, the usability and acceptability of proposed responses were also evaluated in a clinical setting.

Results

Development phase

The dataset for developing the case base contained 1346 records, which included 1225 alive (91.01%), and 121 dead (8.99%) neonates with an average LOS of 15 days (0–191 days). The distribution of the selected qualitative and quantitative features in the dataset is presented in Supplementary Table S1.

We developed the case base using MySQL-V5.2. The CBR process was implemented according to the proposed architecture, using a weighted Euclidean distance function and the KNN algorithm, in the PHP programming language. We also normalized the collected data using min-max normalization.
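To make the retrieval machinery concrete, the following is a minimal Python sketch of min-max normalization, a weighted Euclidean distance, and KNN retrieval with a majority vote. It is not the authors' PHP code: the two feature names, their values, the weights, the outcomes, and k = 3 are all hypothetical, and the distance is written in one common form, sqrt(Σ wᵢ (xᵢ − yᵢ)²), which may differ from the exact formula used in the system.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Features = Dict[str, float]
Bounds = Dict[str, Tuple[float, float]]


def fit_bounds(rows: List[Features]) -> Bounds:
    """Record the minimum and maximum of every feature in the case base."""
    return {k: (min(r[k] for r in rows), max(r[k] for r in rows)) for k in rows[0]}


def min_max(row: Features, bounds: Bounds) -> Features:
    """Scale each feature with the case-base min and max; constant features map to 0."""
    return {
        k: (row[k] - lo) / (hi - lo) if hi > lo else 0.0
        for k, (lo, hi) in bounds.items()
    }


def weighted_euclidean(a: Features, b: Features, weights: Dict[str, float]) -> float:
    """Weighted Euclidean distance (assumed form): sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(w * (a[k] - b[k]) ** 2 for k, w in weights.items()))


def k_nearest(query: Features, cases: List[Features], weights: Dict[str, float], k: int) -> List[int]:
    """Return the indices of the k stored cases closest to the query."""
    ranked = sorted(range(len(cases)), key=lambda i: weighted_euclidean(query, cases[i], weights))
    return ranked[:k]


# Hypothetical case base: two features and an outcome per neonate.
raw_cases = [
    {"birth_weight_g": 900, "gestational_age_wk": 26},
    {"birth_weight_g": 3200, "gestational_age_wk": 39},
    {"birth_weight_g": 1500, "gestational_age_wk": 31},
    {"birth_weight_g": 2800, "gestational_age_wk": 37},
]
outcomes = ["dead", "alive", "alive", "alive"]
weights = {"birth_weight_g": 0.6, "gestational_age_wk": 0.4}  # hypothetical weights

bounds = fit_bounds(raw_cases)
cases = [min_max(r, bounds) for r in raw_cases]
query = min_max({"birth_weight_g": 1400, "gestational_age_wk": 30}, bounds)

neighbors = k_nearest(query, cases, weights, k=3)
prediction = Counter(outcomes[i] for i in neighbors).most_common(1)[0][0]
```

Normalizing the query with the bounds fitted on the case base, rather than refitting, keeps new cases on the same scale as the stored ones.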

In the proposed CBR system, the problem-solving cycle consists of four steps: retrieving the case(s) similar to the new problem (retrieval), using the retrieved solution to answer the new problem (reuse), reviewing the newly suggested solution (revise), and storing the new case for use with future problems (retain).

Retrieval: To find a similar case(s), the weighted Euclidean distance similarity function and the KNN algorithm were applied.
Reuse: During the CBR cycle, the suggested solutions are presented as "Approved" or "Unapproved". In this step, the user can apply the system suggestions as a solution to the new problem ("Approved" case).
Revise: When the proposed solution is not approved by the neonatologist, the new case is considered "Unapproved" and is resolved by the neonatologist.
Retain: Since the important part of the CBR cycle is learning from the previous cases, the solved problems are maintained in temporary tables. For this purpose, after finding and displaying the system response to the user, this response is temporarily stored in a temporary table, and after the determination of the final neonate's status in the real environment (alive/dead and LOS), the outcome is recorded in the system and transferred from the temporary table to the permanent case base.
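The retain step described above, with a temporary table that feeds the permanent case base once the real outcome is known, could be sketched as follows. This is an assumption-laden Python illustration rather than the authors' MySQL/PHP design: the `PendingCase` and `CaseStore` names, their fields, and the in-memory dictionaries standing in for database tables are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PendingCase:
    features: Dict[str, float]
    predicted_survival: str          # system suggestion, e.g. "alive" or "dead"
    predicted_los_days: float
    approved: Optional[bool] = None  # neonatologist's verdict on the suggestion


@dataclass
class CaseStore:
    pending: Dict[int, PendingCase] = field(default_factory=dict)  # temporary table
    case_base: List[dict] = field(default_factory=list)            # permanent case base
    _next_id: int = 1

    def stage(self, case: PendingCase) -> int:
        """Store the system response temporarily and return its id."""
        case_id = self._next_id
        self.pending[case_id] = case
        self._next_id += 1
        return case_id

    def review(self, case_id: int, approved: bool) -> None:
        """Record whether the neonatologist approved the suggestion."""
        self.pending[case_id].approved = approved

    def finalize(self, case_id: int, survival: str, los_days: float) -> None:
        """Move the case to the permanent case base with its observed outcome."""
        case = self.pending.pop(case_id)
        self.case_base.append(
            {"features": case.features, "survival": survival, "los_days": los_days}
        )


store = CaseStore()
cid = store.stage(PendingCase({"birth_weight_g": 1400.0}, "alive", 14.0))
store.review(cid, approved=True)
still_pending = cid in store.pending          # True until the real outcome is known
store.finalize(cid, survival="alive", los_days=12.0)
```

Only the observed outcome, not the prediction, is written to the permanent case base in this sketch, so later retrievals learn from what actually happened.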

We finally developed the web-based CBR prediction system for neonatal survival and LOS, which is available at www.neonatalcdss.ir. Figures 1 and 2 show the data entry and output views for survival and the LOS prediction system. The details related to this system are presented in sections "Evaluation phase" and "Acceptability and confidence evaluation".

Evaluation phase

Retrospective evaluation

The original dataset for the evaluation included 336 records which contained 323 alive (96.13%) and 13 dead (3.87%) neonates with an average LOS of 8.5 days (0–86 days). The distribution of qualitative and quantitative variables of these neonates is presented in Supplementary Table S2.

The performance evaluation on the unbalanced case base for neonatal survival showed an accuracy of 97.02%, precision of 98.15%, specificity of 53.84%, sensitivity of 98.76%, F-score of 0.984, Matthews Correlation Coefficient (MCC) of 0.57, and Kappa coefficient of 0.624. In addition, the RMSE for neonatal LOS was 4.79 days (Table 1). The results for the balanced case base showed improved performance. Supplementary Tables S3–S7 show the confusion matrices.
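For readers who want to see how such figures follow from a confusion matrix, here is a short Python sketch covering a subset of the measures. The counts are an assumption: they were back-calculated from the reported percentages with "alive" as the positive class (323 alive, 13 dead) and are not taken from the supplementary tables. Under that assumption they closely match the reported accuracy, precision, sensitivity, specificity, F-score, and MCC. The LOS values passed to `rmse` are purely hypothetical.

```python
import math
from typing import Dict, Sequence


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard binary-classification measures from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Assumed counts (positive class = alive), inferred from the reported percentages.
metrics = classification_metrics(tp=319, fn=4, tn=7, fp=6)

# Hypothetical observed vs. predicted LOS in days.
los_error = rmse([10, 20], [13, 16])
```

With so few deaths in the evaluation set, specificity rests on only 13 cases, which is why it can be low while accuracy stays high.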

Table 1 System performance on retrospective data.

External validation

During the implementation period for the external validation, 92 neonates were admitted and included in the analysis. 74 (80.43%) neonates were finally alive, and 18 (19.57%) were dead. The average LOS was 11.39 days (1–90 days). The characteristics of these neonates regarding the selected qualitative and quantitative features are shown in Supplementary Table S8.

External validation on the unbalanced case base showed that the accuracy and specificity were 97.82% and 88.88%, respectively. Furthermore, the kappa coefficient was 0.928, which indicated very good agreement between the system predictions and the real outcomes. In addition, the external validation for neonatal LOS showed an RMSE of 3.49 days (Table 2). The system performance for the balanced case base was better than for the original case base (Table 2). Given that the external validation was performed in another hospital using prospectively collected data, the improved results suggest that the system can be used accurately in other healthcare centers. The confusion matrices are presented in Supplementary Tables S3–S7.
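The kappa coefficient corrects observed agreement for the agreement expected by chance. The Python sketch below shows the calculation under an assumption: the counts were back-calculated from the reported figures (92 neonates, 74 alive, 18 dead, accuracy 97.82%, specificity 88.88%) with "alive" as the positive class, and are not taken from the supplementary tables. With these assumed counts the result is approximately 0.928.

```python
def cohen_kappa(tp: int, fn: int, tn: int, fp: int) -> float:
    """Cohen's kappa for a 2x2 table of predicted vs. actual classes."""
    n = tp + fn + tn + fp
    observed = (tp + tn) / n
    # Chance agreement: product of marginal totals for each class, summed.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected)


# Assumed counts inferred from the reported accuracy and specificity.
kappa = cohen_kappa(tp=74, fn=0, tn=16, fp=2)
```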

Table 2 System performance on prospective data.

Acceptability and confidence evaluation

The acceptance and confidence levels were evaluated using a questionnaire with a 5-point Likert scale ranging from one to five (more details in “Acceptability and confidence evaluation”). The physicians’ acceptance of the survival prediction system outputs was higher than that of the LOS prediction system. For the survival prediction system, the mean scores for acceptability and confidence were 4.88 and 4.25, respectively. Furthermore, the physicians’ acceptance of and confidence in the LOS prediction system responses were 4.96 and 3.96, respectively.

Usability evaluation

Table 3 shows the completed tasks and the mean and standard deviation of the completion time for each task in seconds. All users performed all tasks successfully.

Table 3 The amount of performed tasks in each scenario.

As presented in Table 3, the longest time to perform a task was related to “registration in the system” (202.6 s). It was followed by “inputting the data related to a new case in the neonatal survival system and retrieving similar cases” (111.6 s).

Analysis of think-aloud data indicated that a set of 17 problems were identified. We categorized these problems into groups of "interface design problems", "notifications and guides", as well as "editing the elements". The interface design problems were related to adjustments to the interface and the customized profile. The problem with notifications was related to displaying the notifications pending to be verified by the system administrator for confirming the new users and the new cases. Another suggestion was adding a guideline for the data entry into the system. Further, the “editing the elements” was related to the modification of some terminologies, clarifying the measurement scales of variables, and using a single protocol at the time of data entry.

We categorized these problems and suggestions according to the Nielsen severity scale. A score of 0 means this is not a usability problem at all (2 issues), a score of 1 means cosmetic problems that do not need to be fixed unless extra time is available on the project (8 issues), a score of 2 indicates minor and low priority usability problems but are important to be fixed (4 issues), score 3 means major usability problems which are important to be fixed, so should be given high priority (3 issues), and score 4 reflects usability catastrophic problems that are imperative to be fixed before the product can be released (0 issues)³⁴. In other words, more than half of the usability problems (12 issues out of 17) were due to appearance problems with low usability priority. Further, ten positive comments were provided by the users concerning the system functionalities. Furthermore, the final score of the participants for the SUS was 80.71, suggesting the high usability of the system.
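As a reminder of how a SUS value such as 80.71 is obtained, the sketch below applies the standard scoring rule for the ten-item System Usability Scale: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5. It assumes the standard questionnaire was used; the responses shown are hypothetical and are not the study's data.

```python
from statistics import mean
from typing import List, Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's ten SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs ten responses between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


# Hypothetical answers from three participants.
participants: List[List[int]] = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
scores = [sus_score(p) for p in participants]
overall = mean(scores)  # the study-level SUS is the mean of individual scores
```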

Discussion

We developed, implemented, and evaluated a CBR system to predict neonatal deaths and LOS in NICUs. Our evaluation on the retrospective data with the balanced case base indicated 0.986 and 4.78 days for F-score and RMSE, respectively. The results of external validation on the balanced case base indicated 0.993 for the F-score and 3.27 days for the RMSE. In another study conducted on the same dataset, different feature selection methods, such as neonatologists' opinions and statistical and machine learning methods, were compared, and the results showed that the neonatologists' opinions resulted in better performance³⁵. Therefore, we implemented the CBR system based on features selected by neonatologists.

In another study, the accuracy of a CBR system to detect preterm labor and premature births was 90%²⁹. Jaskari’s study³⁶ for predicting neonatal mortality had a good AUC (0.922) and an F-score (0.477) for the RF classifier. Cooper et al.²⁶ introduced a postoperative mortality risk prediction; their model AUC for the development and validation phases was 0.91 and 0.87, respectively. Beluzos et al.³⁷ developed a new decision-support method for classifying neonates based on neonatal mortality risk and obtained accuracy and AUC of 93% and 0.965, respectively. Other researchers developed a fuzzy expert system to predict neonatal mortality risk with 90% accuracy¹³. A decision tree-based decision support system (DSS) was reported with 63.24% sensitivity and 99.95% specificity for predicting mortality after 10 min, and 63.24% sensitivity and 91.97% specificity for twin pregnancy deaths¹⁸. A pediatric death risk prediction was introduced by combining CBR, ANN, and fuzzy methods with 89% accuracy²⁸. The accuracy, F-score, and sensitivity measures for predicting mortality in these studies were lower than in our study. In Sheikhtaheri’s study³⁵, which was conducted on the same data as the current study, ANN outperformed other machine learning models; however, their ANN has a lower performance compared to our suggested CBR system.

The accuracy of the LOS prediction system provided by Coimbra et al. was 84.9%²⁷. Pepler’s suggested model had an AUC and accuracy of 0.85 and 86.4%, respectively, for death prediction and R² = 0.70 for LOS prediction⁹. Among the reviewed studies, only one²⁸ implemented the system in a real environment and performed external validation. Table 4 summarizes the related studies.

Table 4 Comparison of our results with the related literature.

One of the factors affecting the approval of DSSs is the users’ acceptance and a usable user interface. In this regard, cognitive methods have gained popularity for identifying system usability problems. Meanwhile, an important prerequisite in designing an effective user interface is minimizing cognitive demands³⁸. In this study, according to the Nielsen scale, more than half of the usability problems (12 out of 17 cases) were due to the appearance of the system with low priority. Further, the SUS score (80.71) indicates that the system is considered usable and user-friendly by neonatologists.

In addition, the evaluation of the system acceptability suggested that neonatologists were satisfied and confident in the outputs of the system for most cases. Other researchers evaluated the usability of a DSS for antibiotic prescription in NICUs³⁹, mobile applications for perinatal period health⁴⁰, and mobile applications for pregnant women⁴¹ and highlighted the importance of system usability in this field. For example, in line with our results, physicians indicated the importance of the appearance and design of the user interface of a DSS for prescribing antibiotics⁴². Other usability evaluations of CBR systems indicated the importance of learnability, memorability⁴³, ease of use, and confidence in these systems⁴⁴. These studies suggest that CBR systems are accepted by physicians if they are designed properly.

The main audience of this study is neonatologists, who can use the system to anticipate the outcomes of neonates at the time of admission to NICUs. In addition, using the system to identify neonates at risk of death allows a specialized team to be formed for better decision-making and advanced care for neonates at higher risk. Moreover, healthcare providers and hospital managers would be able to plan and allocate hospital resources to manage NICU beds and workload. The system can also help inform parents about the expected duration of their babies’ hospitalization in the NICU, the healthcare costs, and the neonate’s likely outcome, and thereby support their psychological preparation.

Some limitations should be considered. The number of dead neonates was far lower than that of live ones. Despite creating artificial data for the mortality class, “specificity” was low for the retrospective data. However, external validation suggested an improved performance of the system for predicting neonatal survival and LOS. Further, the small number of samples for the external validation was due to the implementation of this system in only one hospital over 3 months, which is another limitation of this study. It is suggested that the system be implemented in more hospitals and evaluated with more samples. In addition, we applied simple methods (mean, median, and mode replacement) to impute the missing values; other researchers could apply more advanced imputation methods in future studies.

In conclusion, we introduced a web-based CBR system for predicting neonatal death and LOS in NICUs. The evaluation showed that the system has a good performance in predicting neonatal survival and LOS based on similar cases. Moreover, the system outputs are mostly acceptable and trustable.

Methods

Study design and settings

This study was performed in Tehran, Iran on NICU-admitted neonates. To design the system, we used the data available in the "Maternal, Fetal, and Neonatal Research Center" as an academic center. External validation of the system was performed prospectively with the admitted neonates to the NICU of “Yas” hospital affiliated with Tehran University of Medical Sciences (TUMS).